Cloud Engineer
<p>Senior Cloud Engineer – Observability &amp; Performance Engineering</p><p>Location: Washington, DC 20549</p><p>Work Arrangement: Fully Onsite</p><p>Clearance Requirement: Ability to obtain and maintain Public Trust</p><p><br></p><p>Position Overview</p><p>We are seeking a highly experienced Cloud Engineer (Observability) to lead the engineering, optimization, and operational maturity of enterprise observability platforms across hybrid cloud and containerized environments.</p><p>This role is ideal for a hands-on engineer with deep expertise in Datadog, distributed tracing, APM, cloud monitoring, performance engineering, and site reliability practices. The successful candidate will partner with infrastructure, cloud, platform, and application teams to improve operational visibility, reduce alert fatigue, accelerate incident resolution, and drive data-informed operational decisions.</p><p><br></p><p>Key Responsibilities</p><p>Observability Platform Engineering</p><ul><li>Engineer and operate enterprise observability solutions including:<ul><li>Metrics</li><li>Logs</li><li>Distributed tracing</li><li>APM</li><li>Real User Monitoring (RUM)</li><li>Synthetic monitoring</li><li>Network monitoring</li></ul></li><li>Build and optimize dashboards, alerts, SLOs, and SLIs</li><li>Implement OpenTelemetry and language-specific instrumentation</li><li>Integrate observability tooling with ServiceNow, CI/CD pipelines, and incident management workflows</li><li>Establish and maintain telemetry tagging standards and governance</li></ul><p>Cloud &amp; Container Monitoring</p><ul><li>Design monitoring solutions for Azure and AWS workloads</li><li>Implement observability for:<ul><li>Serverless services</li><li>Managed databases</li><li>Networking</li><li>Identity services</li><li>Cloud-native platforms</li></ul></li><li>Support Kubernetes and OpenShift monitoring including clusters, nodes, workloads, and service mesh environments</li><li>Develop reusable observability modules using Infrastructure-as-Code</li></ul>
<p>Performance Engineering &amp; Reliability</p><ul><li>Lead investigation and remediation of performance, latency, reliability, and capacity issues</li><li>Utilize APM, profiling, distributed tracing, and database analytics to identify bottlenecks</li><li>Define trace-based alerting and deployment correlation strategies</li><li>Support major incident response activities and root cause analysis efforts</li></ul><p>Capacity Planning &amp; Operational Excellence</p><ul><li>Analyze telemetry and capacity trends to identify risks and opportunities</li><li>Develop reporting and dashboards for leadership and engineering teams</li><li>Improve alert quality, monitoring coverage, and operational maturity</li><li>Support enterprise SLA, KPI, and availability objectives</li></ul>
<p>Required Qualifications</p><p>Bachelor&#39;s degree in Information Technology, Computer Science, Engineering, or a related field</p><p>8+ years of experience in infrastructure, platform, cloud, or operations engineering</p><p>5+ years of experience focused on:</p><ul><li>Observability</li><li>Site Reliability Engineering (SRE)</li><li>Performance Engineering</li><li>Application Performance Monitoring (APM)</li></ul><p>Experience administering and optimizing observability platforms such as:</p><ul><li>Datadog</li><li>Dynatrace</li><li>New Relic</li><li>Splunk Observability</li><li>Grafana/Prometheus</li></ul><p>Strong experience with:</p><ul><li>OpenTelemetry</li><li>Distributed tracing</li><li>Performance tuning</li><li>APM engineering</li><li>Cloud-native monitoring</li></ul><p>Experience supporting Azure, AWS, and containerized platforms</p><p>Proven ability to troubleshoot complex performance and reliability issues</p><p>Ability to obtain and maintain Public Trust clearance</p><p><br></p><p>Preferred Qualifications</p><p>Experience supporting federal or regulated environments</p><p>Experience with:</p><ul><li>Kubernetes</li><li>OpenShift</li><li>Terraform</li><li>ARM</li><li>Bicep</li></ul><p>Strong understanding of:</p><ul><li>SLO/SLI engineering</li><li>Incident management</li><li>Capacity planning</li><li>Operational analytics</li></ul><p>Experience integrating observability platforms with ServiceNow and CI/CD tooling</p>
<h3 class="rh-display-3--rich-text">Technology Doesn't Change the World, People Do.<sup>®</sup></h3> <p>Robert Half is the world’s first and largest specialized talent solutions firm that connects highly qualified job seekers to opportunities at great companies. We offer contract, temporary and permanent placement solutions for finance and accounting, technology, marketing and creative, legal, and administrative and customer support roles.</p> <p>Robert Half works to put you in the best position to succeed. We provide access to top jobs, competitive compensation and benefits, and free online training. Stay on top of every opportunity - whenever you choose - even on the go. <a href="https://www.roberthalf.com/us/en/mobile-app" target="_blank">Download the Robert Half app</a> and get 1-tap apply, notifications of AI-matched jobs, and much more.</p> <p>All applicants applying for U.S. job openings must be legally authorized to work in the United States. Benefits are available to contract/temporary professionals, including medical, vision, dental, and life and disability insurance. Hired contract/temporary professionals are also eligible to enroll in our company 401(k) plan. Visit <a href="https://roberthalf.gobenefits.net/" target="_blank">roberthalf.gobenefits.net</a> for more information.</p> <p>© 2025 Robert Half. An Equal Opportunity Employer. M/F/Disability/Veterans. By clicking “Apply Now,” you’re agreeing to Robert Half’s <a href="https://www.roberthalf.com/us/en/terms">Terms of Use</a> and <a href="https://www.roberthalf.com/us/en/privacy">Privacy Notice</a>.</p>
  • Washington, DC
  • onsite
  • Temporary / Contract
  • 55 - 60 USD / Hourly
  • 2026-07-01T00:00:00Z

Cloud Engineer Job in Washington, DC | Robert Half