LMTS | Data Scientist | AI for Platform Reliability & Operations, Salesforce

Job Description

Salesforce is looking to hire an LMTS Data Scientist to work with its Falcon Kubernetes Platform team on AI systems that improve reliability and performance and automate operations across its extensive cloud infrastructure. The role involves using platform telemetry data to develop predictive analytics, decision-making systems, and automated processes that improve operational efficiency in Kubernetes and multi-cloud environments.

Date Posted: February 2026

Expiration Date: NA

Qualification: Bachelor’s/Master’s Degree

Main Duties

  • Develop AI solutions that improve system reliability, availability, and operational performance across all infrastructure components.
  • Build AI systems that detect anomalies, predict failures, automate alerts, and optimize resources in Kubernetes environments.
  • Build data pipelines, telemetry ingestion systems, and analytics frameworks for large-scale observability and intelligence.
  • Work with SRE, platform engineering, and product teams to develop AI solutions that solve operational problems.
  • Create automated workflows, including AI-assisted runbooks and decision-support tools, for platform operations.

Essential Qualifications

  • Experience with generative AI, automation systems, and operational analytics platforms.
  • Experience using observability tools such as Prometheus, Grafana, Elasticsearch, and similar platforms.
  • Experience collaborating with SRE and infrastructure engineering teams to address platform reliability issues.
  • Knowledge of multi-cloud architectures and container workloads.
  • Ability to guide teams and foster collaboration among team members.

Preferred Qualifications

  • Experience with generative AI, automation technologies, and operational analytics platforms.
  • Expertise with observability tools such as Prometheus, Grafana, Elasticsearch, and similar software.
  • Ability to apply site reliability engineering and infrastructure engineering practices to fix platform reliability problems.
  • Knowledge of container-based workload management and multi-cloud environment operations.
  • Ability to guide team development while fostering teamwork within the group.