Principal MLOps Engineer, Azure
🇨🇦RBC
Job Description
What's the opportunity?

We’re looking for a Principal MLOps Engineer, Azure who will bring focus and subject-matter expertise to designing and implementing machine learning infrastructure and automation tools (MLOps and DevOps). This is a unique opportunity to grow in the world of machine learning infrastructure and work with a team of passionate individuals committed to the mission of bringing ML to the enterprise.

At RBC Borealis, you’ll be joining a team that works directly with leading researchers in machine learning, has access to rich and massive datasets, and offers the computational resources to support ongoing development in areas such as reinforcement learning, unsupervised learning and computer vision. You can find out more about our research areas at rbcborealis.com.

Your responsibilities include:
Designing, building, and optimizing machine learning deployment tools and automation systems that operate the business’s data and ML applications;
Designing, building, and optimizing cloud (Azure) infrastructure using Infrastructure as Code (Terraform);
Designing and implementing best practices and standards for data and machine learning pipelines across the organization;
Collaborating with engineers and machine learning researchers to automate code analysis, build, integration and deployment of ML applications;
Supporting applications and projects with infrastructure design decisions and monitoring solutions;
Building highly scalable, resilient cloud and on-premise systems for hosting machine learning systems using state-of-the-art technologies.
You're our ideal candidate if you have:
Strong and relevant experience designing and implementing distributed systems and machine learning systems;
Hands-on experience building and deploying hybrid environments spanning on-prem and major cloud platforms such as AWS and Azure;
Strong and relevant experience with programming languages such as Python, Bash, or JavaScript;
Previous experience with MLOps orchestration tools such as Airflow, Kubeflow, Dagster, Flyte, or Metaflow;
Experience building and maintaining DevOps pipelines with tools such as Jenkins and GitHub Actions;
In-depth knowledge of the various stages of the machine learning application deployment process;
Experience building tools and applications to automate infrastructure and DevOps tasks;
Experience implementing monitoring solutions to identify system bottlenecks and production issues;
Knowledge of professional software engineering best practices for the full software development life cycle, including testing methods, coding standards, code reviews and source control management;
Familiarity with machine learning frameworks such as PyTorch, TensorFlow and/or similar.

What's in it for you?
Become part of a team that thinks progressively and works collaboratively. We care about seeing each other reach full potential;
A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stoc