Job Description
As a Senior Cloud Operations Engineer, you will own the end-to-end operability of OCI compute services, working closely with product and development teams to ensure reliable, secure, and highly scalable cloud infrastructure. This role combines deep systems knowledge, incident management, automation, and continuous improvement to support Oracle’s rapidly growing cloud platforms.
Date Posted: January 14, 2026
Expiration Date: NA
Experience: 3 to 5+ Years (6–10+ years overall IT experience preferred)
Job ID: 317367
Role: Individual Contributor (IC3)
Responsibilities
Cloud Operations & Service Management
- Take shared full-stack ownership of the OCI compute services, understanding their end-to-end configurations, dependencies, and behavior.
- Be accountable for service performance, availability, scalability, and operability in production environments.
Incident, Problem & Change Management
- Manage the response to and resolution of critical incidents in staging and production environments within defined SLAs.
- Serve as the final escalation point for complex problems and perform root cause analysis to drive permanent fixes.
Production Support & Reliability
- Run and support a large-scale production environment that is secure, performant, and available at all times.
- Install, monitor, maintain, and optimize server hardware and software with no downtime during the process.
Automation, DevOps & Release Management
- Design, test, deploy, and automate processes that replace manual background operations.
- Create and manage CI/CD pipelines, deployment tools, and processes that support highly scalable cloud services.
Collaboration & Continuous Enhancement
- Partner closely with development and service teams to identify complex issues that can only be resolved through code-level troubleshooting.
- Keep documentation, SOPs, and monitoring instrumentation up to date, both to prevent incidents and to reduce their resolution time.
Essential Qualifications
- 6-10 years of overall IT experience, including at least 4 years in cloud environments.
- Excellent hands-on skills with Oracle Cloud Infrastructure (OCI) Compute or equivalent cloud platforms.
- Expertise in Unix/Linux administration across various hardware types.
- Good grasp of distributed systems, service topology, and system architecture.
- Experience handling incidents and on-call duties in SLA-driven environments.
Preferred Qualifications
- Experience with very large-scale, high-throughput, IO-intensive cloud services.
- Strong knowledge of automation and orchestration principles.
- Experience working in a fast-paced cloud production environment that provides 24×7 support.
- Professional curiosity and a willingness to help others understand cloud services and technologies in depth.
Required Skills
- Oracle Cloud Infrastructure (OCI) – Compute
- Cloud Operations & Incident Management
- Linux Administration
- Virtualization Technologies
- Scripting (Python / Java / Go)
- Monitoring & Instrumentation (Prometheus, Grafana, New Relic, Elastic)
- Terraform, Chef, Git, Jenkins/Hudson, Artifactory
- Docker, Kubernetes & CI/CD
- Networking, Load Balancers & Autoscaling