Senior Full-Stack Software Engineer - Global AI Platform
🇨🇦 Manulife
Job Description
This role designs and delivers a scalable, secure, cloud-native AI platform that enables enterprise-grade AI and agent-powered solutions. The position focuses on building event-driven, distributed services using Akka and modern MLOps practices to support continuous learning, experimentation, and governance. Working closely with architects, data scientists, and business leaders, this role ensures reliable, high-performance AI infrastructure that accelerates innovation while meeting compliance and regulatory requirements.

Position Responsibilities:
- Build and maintain high-performance, fault-tolerant, secure, and scalable AI platform services and abstractions that support diverse AI solutions with automation-first delivery.
- Design, build, and maintain the technology platform's features and infrastructure, including hardware, software, and network components.
- Integrate Akka and AdaptiveML workflows for continuous/online learning, feature stores, model registries, and A/B experimentation.
- Implement AI Foundry components for orchestration, feature engineering, model deployment, and governance.
- Develop reusable reference patterns and inner-source components that meet reliability, security, and compliance standards.
- Implement shared runtimes for multi-agent coordination, state management, memory persistence, and messaging.
- Design interoperable APIs/SDKs used by data scientists and developers to build agent-powered applications.
- Maintain and improve CI/CD pipelines and developer toolchains for AI services to enable rapid, compliant delivery.
- Evaluate emerging AI/ML infrastructure capabilities; prototype and introduce tools that improve developer productivity and reliability.
- Develop and operate scalable backend services supporting high-traffic agent interactions, retrieval operations, and real-time execution flows.
- Use cloud-native technologies (containers, orchestration, IaC, CI/CD) to deliver reliable, cost-efficient services.
- Optimize runtime performance across CPU/GPU/accelerator workloads.
- Monitor and resolve persistent platform issues surfaced by technical support teams, such as bottlenecks, connectivity problems, and system failures.
- Consider compliance and regulatory requirements throughout the platform lifecycle.
- Implement security measures, such as access controls, encryption, and vulnerability assessments, when applicable.
- Partner with architects and business leaders to design and build robust platforms across all Global AI Platform capability layers.
- Form a holistic understanding of tools, key business concepts, and the data and cross-team dependencies.
- Investigate new platform solutions to enhance the service delivery experience.
- Perform peer reviews of code, deliverables, and analysis for continuous learning and continuous improvement.

Required Qualifications:
- 5 years in software engineering; 3 years leading teams/projects in AI/ML or distributed systems.
- Strong expertise in Akka and event-driven microservices at scale.
- Hands-on exper