Senior Research Engineer (Agentic Behavior)

Related skills

sql python pytorch kotlin evaluation

📋 Description

  • Build tools to capture and analyze errors in AI Kotlin code generation.
  • Create observability pipelines over agent traces in JetBrains IDEs and other agents.
  • Design evaluation pipelines that measure Kotlin code generation quality across multiple metrics.
  • Build simulation environments for Kotlin dev tasks (Gradle, Spring, etc.).
  • Own evaluation infrastructure: metrics, experiments, regression checks.
  • Collaborate with model providers to improve Kotlin models.

🎯 Requirements

  • Hands-on experience building eval/analysis pipelines for LLMs/AI agents in research or production.
  • Strong Python engineering skills (≥3 years), with experience in clean, data/ML-adjacent codebases.
  • Experience with data analysis at scale: SQL/Athena, data pipelines, statistical analysis of experiments.
  • Own projects end to end: identify problems, design eval, run experiments, ship fixes.
  • Product-aware mindset: translate real failure modes into eval/training work.
  • Familiarity with Kotlin or willingness to learn deeply (daily Kotlin code).

๐ŸŽ Benefits

  • Strong base salary
  • Flexible work location
  • Remote work up to 30 days per year abroad
  • Relocation support
  • Learning and development opportunities
  • Mental health support
