Site Reliability Engineer

Added: 11 days ago
Type: Full time
Salary: Not provided

Related skills

Azure, AWS, Grafana, Prometheus, Python

📋 Description

  • Support reliability, availability, performance, and scalability of JFrog SaaS across multi-cloud Kubernetes.
  • Investigate and troubleshoot production issues across distributed systems and cloud environments.
  • Design and develop backend services, internal platforms, and tooling with Python or Go.
  • Improve reliability and observability via SRE practices, monitoring, postmortems, and safer CI/CD.
  • Evaluate AI-assisted automation to improve operational efficiency and workflows.
  • Support resilience initiatives: disaster recovery validation, service readiness, health checks.

🎯 Requirements

  • 2-4 years of experience in an SRE, Production Engineering, DevOps, or similar role.
  • Strong troubleshooting and analytical skills for structured issue investigation.
  • Hands-on experience with Kubernetes-based container workloads.
  • Experience with AWS, GCP, or Azure.
  • Experience developing backend services, platforms, automation, or tooling with Python/Go.
  • Understanding of Linux fundamentals, networking, HTTP, DNS, and production troubleshooting.