<p><strong>Role Overview</strong></p><p>We’re seeking an experienced Site Reliability Engineer (SRE) to strengthen and operate our Azure workloads, automate infrastructure with Terraform, streamline deployments via ArgoCD or Nomad, and ensure environments meet rigorous security and compliance standards. This role is hands-on across infrastructure, CI/CD, and observability, focusing on reliability, performance, and cost efficiency.</p><p><br></p><p><strong>Key Outcomes (First 90 Days)</strong></p><ul><li>IaC baselined: Standardized Terraform modules for core Azure resources (AKS, networking, storage, Key Vault, databases).</li><li>Deployment flow repeatable: ArgoCD applications and GitHub workflows with clear promotion paths and automated guardrails.</li><li>Reliability metrics live: SLIs/SLOs published; dashboards and alerts tuned to actionable thresholds.</li><li>Security posture improved: Secrets managed via Key Vault, RBAC enforced, and network controls hardened.</li><li>Operational runbooks: Documented incident response, backup/restore, and DR procedures for MySQL and SQL Server.</li></ul><p><strong>Day-to-Day Responsibilities</strong></p><ul><li>Design and operate Azure infrastructure using Terraform and Git-based workflows.</li><li>Manage GitOps deployments with ArgoCD integrated with GitHub.</li><li>Provision and maintain Azure MySQL and Azure SQL Server databases.</li><li>Define SLIs/SLOs, implement telemetry, and optimize performance.</li><li>Enforce security best practices for identity, access, and network segmentation.</li><li>Automate pipelines and maintain high-quality runbooks and architecture diagrams.</li></ul>