DevOps Engineer - AI, Data & Platforms

Added: 12 days ago
Type: Full time
Salary: Not provided

Related skills

Datadog, Terraform, GitHub Actions, Prometheus, Python

📋 Description

  • Own CI/CD, IaC (Terraform), and release pipelines for AI/data services.
  • Manage cloud infra, networking, secrets, and access controls per security standards.
  • Build observability, alerting, and incident response into every platform service.
  • Partner with AI Infra and AI Eng to move code from commit to production.
  • Drive cost visibility and optimization across cloud and GPU spend with Finance.
  • Establish runbooks, on-call practices, and post-incident reviews for the AI footprint.

🎯 Requirements

  • 6+ years in DevOps, SRE, or cloud platform engineering with AI/data workloads.
  • Strong Python (or Go) skills for automation and tooling.
  • Hands-on cloud experience across AWS, GCP, or Azure.
  • Kubernetes, Terraform, GitHub Actions/ArgoCD, and observability stacks (Datadog/Prometheus).
  • Experience with AI/ML workloads: GPUs, large model artifacts, long-running jobs.
  • Advanced degree in CS/Engineering or equivalent.