Machine Learning Infrastructure Specialist

Vector Institute · Toronto, ON, Canada

Location
Toronto
Job Type
Full-time
Posted
May 22, 2026

Job Description

Position Summary

As an ML Infrastructure Specialist focused on systems and scalable AI infrastructure, you will build and improve efficient, reusable systems to train, deploy, monitor, and serve large-scale machine learning models, including large language models (LLMs). Working at the intersection of applied research and production systems, you will collaborate with Vector's AI Engineering team members, researchers, and industry partners to bring advanced AI capabilities into real-world use. You will contribute to initiatives that strengthen the software and systems supporting state-of-the-art AI development and deployment, and you will own well-scoped projects end to end.

Key Responsibilities

  • Design and implement distributed systems for scalable ML training, inference, and serving on multi‑GPU/multi‑node environments, with a focus on large foundation models.
  • Configure...
