MLOps Engineer
Foundry
7.8 - 10K USD
Full-time
Remote
Engineer, AWS, Docker, Kubernetes
Job Description:
Role Overview: We are looking for an MLOps Engineer who excels in managing the deployment, monitoring, and lifecycle of machine learning models while also having strong capabilities in DevOps practices and infrastructure management. The ideal candidate will have hands-on experience with MLOps tools, DevOps methodologies, and infrastructure technologies to ensure seamless and scalable integration of machine learning models into production environments.
Key Responsibilities:
- MLOps Implementation: Design and implement MLOps pipelines to automate and streamline the deployment, monitoring, and management of machine learning models. Ensure the models are scalable, reproducible, and maintainable.
- Infrastructure Management: Manage and optimize infrastructure using Terraform and cloud platforms (e.g., AWS, Azure, GCP) to support machine learning workloads. Automate infrastructure provisioning and configuration using tools like
- CI/CD Pipelines: Develop and maintain CI/CD pipelines tailored for machine learning
- Containerization and Orchestration: Use Docker for containerizing machine learning applications and Kubernetes for orchestrating and managing containerized
- Monitoring and Performance: Implement monitoring solutions for machine learning models and infrastructure. Track model performance, detect anomalies, and ensure system
- Collaboration: Work closely with data scientists to understand model requirements and deployment needs. Provide guidance on best practices for model integration and
- Documentation: Maintain thorough documentation for MLOps workflows, infrastructure setups, and deployment
Requirements:
- Experience: 3-6 years of experience in MLOps, DevOps, or a related role with a strong focus on machine learning model deployment and infrastructure
- MLOps Tools: Experience with MLOps frameworks and tools such as MLflow, Kubeflow, or similar.
- DevOps: Experience with DevOps practices and tools, including Terraform for infrastructure as code (IaC), Ansible for configuration management, Docker for containerization, and Kubernetes for orchestration.
- Infrastructure Knowledge: Strong understanding of infrastructure components and architecture, including networking, storage, and compute resources. Experience with designing and maintaining infrastructure for high availability, scalability, and
- Cloud Platforms: Experience with cloud services (AWS, Azure, GCP) and cloud-native tools for managing infrastructure and applications.
- CI/CD Tools: Proficiency with CI/CD tools such as Jenkins, GitLab CI, or
- Python: Proficiency in Python for scripting, automation, and integrating machine learning models into production environments.
- Monitoring Tools: Experience with monitoring tools such as Prometheus, Grafana, ELK Stack, or similar for tracking system health and performance.
- Version Control: Experience with version control systems, particularly
- Problem-Solving: Excellent troubleshooting skills with a proactive approach to resolving
- Communication: Strong communication skills and the ability to work effectively in a collaborative team environment.
Preferred Qualifications:
- Bachelor’s Degree in Computer Science, Engineering, Data Science, or a related field.
- Certifications: Relevant certifications in cloud platforms (e.g., AWS Certified DevOps Engineer) and Kubernetes.
- Experience with Machine Learning Frameworks: Proficiency with machine learning frameworks and deployments