Senior Site Reliability Engineering

🇨🇦 RBC

Toronto, Ontario, Canada

Posted 1d ago · Apr 30, 2026, 12:00 AM

Apply by Fri, May 29, 2026

Full Time · Senior

Job Description

WHAT IS THE OPPORTUNITY?

This role will be responsible for leading the design, development, implementation and support of Site Reliability Engineering (SRE) solutions for applications supported by the Commercial Payments Technology (CPT) SRE organization. The incumbent will need advanced knowledge and experience working in an application development and/or technology operations organization. The role also performs production support and partners with the SRE Delivery team on incident management and problem management.

WHAT WILL YOU DO?

Technical Leadership:

- Lead code and non-functional (performance, security, maintainability, compliance, change management) reviews of all production-bound SRE solutions
- Drive transformation by continuously looking for ways to automate existing processes
- Run engineering mindset meetups, accelerating breadth and depth of knowledge in the community
- Manage SRE application assets (virtual machines, cloud instances, mainframe, source code repositories, etc.)
- Publish technical designs for SRE solutions
- Publish and/or review implementation plans for SRE solutions bound for production
- Explore new capabilities and technologies to drive innovation (including coding and publishing how-to documentation)
- Track, audit, monitor and implement technical work streams
- Act as portfolio SME (Subject Matter Expert): understand and document the common components, core functionalities and infrastructure of supported applications

Production Support:

- Serve as an escalation point in the on-call rotation, and support maintenance, scheduled work, support and release deployment requirements
- Lead incident management and problem management for applications in scope
- Own incident management and problem management for applications in scope, including fulfillment/ownership of RCA action items
- Focus on continuous improvement and technical standards: drive improvements in productivity, monitoring, tooling and best practices
- Manage technology currency (server patching, certificate renewal, compliance, etc.) with a keen eye on automation opportunities
- Ensure availability and uptime of applications in scope, as per service level objectives
- Manage PagerDuty rules/tuning/tagging, Moogsoft Situation management and Dynatrace tuning (RUM, Problem Card reduction)
- Provide expertise, direction, coaching and development to build the SRE team's capability
- Provide assistance with selecting and building a high-performing, diverse team that leverages individual capabilities and strengths

WHAT DO YOU NEED TO SUCCEED?

Must have:

- Advanced knowledge of industry practices, with a focus on SRE
- Advanced experience in a variety of environments (Cloud, distributed and mainframe, business workflows and services/APIs, databases)
- Excellent communication skills and a direct style (e.g. "I did or did not do something", "it does or does not work", as opposed to "I believe" or "I understand it to be")
- Effective negotiation skills and stakeholder management
- Ability to influence at the Director level (unit and other partner units)
- Mainframe kn
