
Mistral AI
Research Engineer, Machine Learning
Hybrid | Palo Alto | Applications closed
As posted by Mistral AI
Role Summary
About the Research Engineering team
The team spans Platform (shared infra & clean code) and Embedded (inside research squads). Engineers can move along the research↔production spectrum as needs or interests evolve.
As a Research Engineer – ML track, you’ll build and optimise the large-scale learning systems that power our open-weight models. Working hand-in-hand with Research Scientists, you’ll either join:
- Platform RE Team: Enhance the shared training framework, data pipelines and cluster tooling used by every team; or
- Embedded RE Team: Sit inside a research squad (Alignment, Pre-training, Multimodal, …) and turn fresh ideas into repeatable, scalable code.
What you will do
• Accelerate researchers by taking on the heavy parts of large-scale ML pipelines and building robust tools.
• Interface cutting-edge research with production: integrate checkpoints, streamline evaluation, and expose APIs.
• Conduct experiments on the latest deep-learning techniques (sparsified 70B+ runs, distributed training on thousands of GPUs).
• Design, implement and benchmark ML algorithms; write clear, efficient code in Python.
• Deliver prototypes that become production-grade components for Le Chat and our enterprise API.
About you
• Master’s or PhD in Computer Science (or equivalent proven track record).
• 4+ years working on large-scale ML codebases.
• Hands-on with PyTorch, JAX or TensorFlow; comfortable with distributed training (DeepSpeed / FSDP / SLURM / K8s).
• Experience in deep learning, NLP or LLMs; bonus for CUDA or data-pipeline chops.
• Strong software-design instincts: testing, code review, CI/CD.
• Self-starter, low-ego, collaborative.
