Senior Research Engineer (Code World Models)

Related skills

NLP, Python, machine learning, data pipelines, distributed training

Description

  • Design pre-training, continued pre-training, and mid-training experiments for code models.
  • Build data pipelines for large-scale model training, including filtering and quality checks.
  • Work with code corpora, repositories, tests, execution traces, and synthetic data.
  • Develop evaluations for complex repository-level code reasoning tasks.
  • Collaborate with researchers and engineers working on ML for code and AI developer tools.

Requirements

  • Hands-on experience with model pre-training, continued training, or mid-training.
  • Strong engineering skills in Python and experience with modern ML frameworks.
  • Solid understanding of large-scale ML training workflows: data processing, distributed training, evaluation, and debugging.
  • Experience with large datasets and data quality: contamination, sampling, and reproducibility.
  • Background in NLP, ML for software engineering, or a similar domain.
  • Enthusiasm for researching problems with high uncertainty and turning ideas into experiments.

Benefits

  • Equal opportunity employer
  • Open and inclusive workplace