Reliability Engineer, Data Platforms, Apple

Job Description

The Reliability Engineer, Data Platforms at Apple is responsible for delivering and maintaining cloud data platforms that are performant, cost-efficient, and trustworthy. The engineer works across the entire life cycle of the systems that support analytics and AI/ML workloads, collaborating closely with other teams while drawing on strong Unix/Linux skills and adhering to Apple's engineering and privacy standards.

Date Posted: Jan 13, 2026

Role Number: 200639748-1052

Primary Responsibilities

  • Design and operate large-scale big data platforms built on both open-source and commercial technologies.
  • Support analytics, reporting, and AI/ML applications across the company-wide Apple ecosystem.
  • Improve data platform performance, capacity, and cost efficiency.
  • Own the processes that keep big data systems reliable and operational.
  • Monitor, triage, and resolve production issues to maintain high availability and reliability.
  • Collaborate with teams across departments to improve platform stability and user experience.

Key Qualifications

  • 3+ years of professional software engineering experience with large-scale big data platforms.
  • Expert-level programming skills in Java, Scala, Python, or Go.
  • Strong expertise with Apache Spark and distributed data processing systems.
  • Hands-on experience with data lake and table formats such as Apache Iceberg.
  • Strong skills in incident management, root cause analysis, and performance tuning.
  • Proficiency with Unix/Linux environments and command-line debugging tools.

Preferred Qualifications

  • Experience building and maintaining low-latency, fault-tolerant, highly available distributed systems.
  • Contributions to open-source projects are a plus.
  • Understanding of public cloud infrastructure and experience operating Kubernetes clusters at scale.
  • Familiarity with workflow automation tools such as Airflow or dbt.
  • Familiarity with data modeling, data warehousing, and AI/ML stacks (GPUs, MLflow, LLMs).
  • Solid understanding of software engineering practices and the full development life cycle.
  • Eagerness to learn continuously and strong collaboration skills.