Sr. Production Engineer

Added
4 days ago
Type
Full time

Related skills

ansible terraform aws prometheus python

📋 Description

  • Hybrid (3 days per week in San Jose, CA) or remote
  • Improve reliability of a global, multi-cloud platform processing more than 200 billion transactions daily
  • Drive an automation-first culture, using Python/Go to reduce toil
  • Improve observability; define SLIs, SLOs, and error budgets
  • Lead on-call incidents; develop playbooks and post-incident analyses
  • Partner with Engineering for operability reviews

🎯 Requirements

  • 3-5+ years of experience in reliability and scalability for large-scale production services
  • Deep programming experience in Python, Go, or C/C++
  • Strong background in networking protocols, Linux/RHEL, and distributed architecture
  • Experience in high-stakes incident management and 24/7 on-call rotation
  • Proficiency with ITIL frameworks and data-driven incident maturity

๐ŸŽ Benefits

  • Various health plans
  • Time off plans for vacation and sick time
  • Parental leave options
  • Retirement options
  • Education reimbursement
  • In-office perks