Senior DevOps Engineer

Terminal

Software Engineering

Posted on Jun 23, 2026

About Atlas

Atlas understands how people work. Our team of trusted experts supports you with the ins and outs of global expansion, local compliance, and employee experience across the globe.

You find the talent — we handle everything else.


About The Role

We're looking for a Senior DevOps / SRE Engineer to own the reliability, security, and delivery of the Atlas platform. You will be a core member of the infrastructure team, responsible for the full lifecycle of our cloud infrastructure, from pipeline design to production promotion.

Day-to-day, you will manage our four-environment pipeline (dev → qa → training → prod) and act as the engineering gate between a deployment and production: you hold the SRE approval that allows a release through. You'll work closely with backend, frontend, and security engineers, and you will own the on-call process end-to-end, from alert configuration to postmortem closure.

This isn't a role for someone who waits for direction. We're looking for engineers who are genuinely curious, hold themselves to a high bar, and figure things out: people who take ownership without being asked. Our stack runs on Azure, but if you've been building on AWS or GCP, that experience translates; we'll invest in getting you up to speed on the specifics.


What You’ll Do

• Design, operate, and continuously improve AKS clusters: managing node pools, autoscaling policies (Karpenter), Helm chart versioning, and Kong ingress routing across multiple namespaces
• Build and maintain GitHub Actions CI/CD pipelines with Kubernetes-hosted self-hosted runners, covering build, SAST, software bill of materials (SBOM) generation, dependency scanning, container publishing to Azure Container Registry, and AKS deployment
• Manage all Azure infrastructure as code with Terraform: VNets, Private Endpoints, Application Gateway + WAF, Azure SQL, MySQL Flexible Server, Redis Cache, Cosmos DB, Blob Storage, Azure DNS, Key Vault, and DDoS Protection across all environments
• Own the observability stack: Application Insights dashboards, Log Analytics queries, synthetic uptime monitors, and on-call scheduling, defining SLOs and closing alert gaps before customers notice them
• Drive security posture improvements in partnership with InfoSec: triaging findings from cloud security posture management (CSPM) tooling, a vulnerability management platform, and endpoint detection and response (EDR) alerts; coordinating with the managed security service provider (MSSP) on escalations
• Manage identity and access: Azure AD / Entra ID, Workload Identity Federation for secretless pod-level authentication, and Keycloak for SSO across JWT and SAML flows
• Own the incident response lifecycle: triage P1–P4 events, coordinate resolution across engineering teams, drive blameless postmortems, and close action items with a 48-hour RCA target for critical incidents
• Support ISO 27001, ISO 27017, and ISO 27018 audit cycles: producing evidence, reviewing controls, and closing findings in collaboration with engineering and compliance teams


What You’ll Bring

• 5+ years of cloud infrastructure experience on Azure, AWS, or GCP. Our stack is Azure-native; what matters is that you've operated cloud infrastructure at scale and know how to own it. If you're coming from another cloud, the concepts carry over and we'll close the gaps together.
• Proven Kubernetes operations depth: Helm, autoscaling (HPA/VPA/node-level), ingress controller management, and multi-environment cluster operations. Kubernetes is Kubernetes regardless of the cloud it runs on.
• Strong infrastructure-as-code discipline, with Terraform at scale: reusable modules, remote state management, consistent environments across dev/qa/training/prod.
• Solid CI/CD pipeline experience with GitHub Actions, CircleCI, Jenkins, or equivalent. You've built and maintained pipelines that cover build, test, security scanning, container publishing, and deployment end-to-end.
• Security-first mindset, with direct experience across at least two of: SAST tooling, SBOM generation, dependency scanning, CSPM platforms, or vulnerability management workflows.
• Experience supporting compliance audits (ISO 27001 or SOC 2): not just awareness, but producing evidence artifacts and engaging directly with auditors.


Nice to Have

• Hands-on Keycloak administration: realm configuration, SAML/OIDC client setup, and Workload Identity Federation integration
• Experience operating Azure Cosmos DB or managed MongoDB at scale: index tuning, failover testing, backup validation
• Familiarity with Azure Purview data classification and information protection policies
• Exposure to product analytics event pipelines and data residency controls at the infrastructure layer



*This job posting exists to fill a vacancy.