Position Summary:

We are seeking a skilled MLOps Engineer to support and enhance our machine learning operations infrastructure. In this role, you will be responsible for monitoring production services, troubleshooting issues, and collaborating with teams to improve automation and system reliability. You will play a critical role in ensuring seamless model deployment, performance, and integration within our ML platform.

Key Responsibilities:

  • Monitor support channels and incident queues to proactively identify and address operational issues.
  • Investigate and resolve issues reported by automation systems, alerts, or customer feedback.
  • Maintain and support online production services for serving ML models, ensuring high availability and performance.
  • Collaborate with engineering teams to automate processes and improve operational efficiency.
  • Gain a deep understanding of ML platform capabilities and integrations, providing technical insights to enhance system reliability.
  • Identify recurring issues and provide feedback to ML platform engineers for continuous improvements.
  • Contribute to documentation efforts, ensuring clarity and accuracy for internal teams and stakeholders.

Required Skills & Qualifications:

  • Bachelor's or master's degree in computer science or a related field.
  • 3 years of relevant experience in Python programming.
  • Programming: Proficiency in Python (Mandatory).
  • ML Infrastructure: Hands-on experience with Databricks, Tecton, and ML Concepts (Model Deployment, Feature Engineering, Monitoring).
  • DevOps & Automation: Strong knowledge of Kubernetes, Jenkins, and GitHub for CI/CD pipelines and infrastructure automation.
  • Cloud Computing: Expertise in AWS services related to MLOps.
  • Version Control & Monitoring: Experience with GitHub Actions, observability tools, and system monitoring frameworks.
  • Problem-Solving & Communication: Strong analytical skills, the ability to debug production issues, and effective communication with cross-functional teams.

Preferred Qualifications:

  • Experience working with large-scale distributed ML systems.
  • Knowledge of Terraform for infrastructure as code.
  • Experience with logging and observability best practices for ML models in production.

If you are excited about building scalable ML infrastructure and driving automation, we’d love to hear from you! Apply now to be part of our team.

What We Offer:

  • Bootstrapped and financially stable with high pre-money valuation.
  • Above-industry remuneration.
  • Additional compensation tied to Renewal and Pilot Project Execution.
  • Additional lucrative business development compensation.
  • Firm-building opportunities that offer a stage for holistic professional development, growth, and branding.
  • An empathetic, excellence- and results-driven organization that believes in mentoring and growing a team, with constant emphasis on learning.

About Saarthee:

Saarthee is a global analytics consulting firm unlike any other, where our passion for helping others fuels our approach and our products and solutions. We are a one-stop shop for all things data and analytics. Unlike other analytics consulting firms that are technology- or platform-specific, Saarthee’s holistic and tool-agnostic approach, along with its ability to deliver strategic, actionable insights, is unique in the marketplace. Our Analytics Value Chain framework meets our customers where they are in their data journey. Our diverse and global team of skilled data engineers, data analysts, and data scientists works with one objective in mind: Our Customers’ Success. At Saarthee, we are passionate about guiding organizations towards insights-fueled success. That’s why we call ourselves Saarthee, inspired by the Sanskrit word ‘Saarthi’, which means charioteer, trusted guide, or companion. Co-founded in 2015 by Mrinal Prasad and Shikha Miglani, Saarthee already encompasses all the components of Data Analytics consulting. Saarthee is based out of Philadelphia, USA, with offices in the UK and India.