Lead Software Engineer (Observability & Telemetry)

🇺🇸 Salesforce

Type
Full Time
Level
Lead
Location
Washington - Bellevue
Posted 4d ago

Job Description

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category
Software Engineering

Job Details

About Salesforce

Salesforce is the #1 AI CRM, where humans with agents drive customer success together. Here, ambition meets action. Tech meets trust. And innovation isn’t a buzzword; it’s a way of life. The world of work as we know it is changing and we're looking for Trailblazers who are passionate about bettering business and the world through AI, driving innovation, and keeping Salesforce's core values at the heart of it all.

Ready to level-up your career at the company leading workforce transformation in the agentic era? You’re in the right place! Agentforce is the future of AI, and you are the future of Salesforce.

Join the team responsible for innovating and maintaining the massive-scale, distributed systems that monitor Salesforce’s infrastructure. This position is located in the Bellevue office and requires onsite presence.

The Network Visibility and Telemetry team is responsible for designing, building, and operating a set of systems and services which deliver metrics, telemetry and alerting for data center infrastructure (network, storage, etc.). We are part of the Infrastructure Strategy Datacenter Operations organization, which is a dynamic, global team delivering and supporting technology infrastructure to meet the substantial growth needs of the business.

In this role, you will leverage your experience in building and deploying large-scale systems to automate systems services across all types of infrastructure (storage, network, server), enable the collection of infrastructure telemetry, make the infrastructure visible and accessible, and ensure that alerts are generated where action is needed.

Responsibilities:

Design, build, and operate large-scale observability systems that deliver metrics, telemetry, and alerting across data center infrastructure including network and storage environments

Develop and maintain distributed services in Java and/or Python to enable automated collection of infrastructure telemetry at scale, ensuring full visibility into critical systems

Build and deploy automation solutions using tools such as Ansible, Puppet, or Chef to streamline infrastructure services across storage, network, and server environments

Publish and consume REST APIs to integrate telemetry pipelines and expose infrastructure data to downstream systems and stakeholders

Drive alerting frameworks that surface actionable signals from infrastructure telemetry, reducing noise and ensuring the right teams are notified when intervention is needed

Partner with a global, cross-functional Infrastructure Strategy Datacenter Operations team to support rapid growth, leveraging CI/CD practices (Jenkins), source control (Git), and Linux (RedHat) expertise to deliver reliable, scalable observability tooling

Build and ship high-quality, production-grade softwar

Required Skills

Observability