Job Description
About the Role
Running ML/AI workloads in containers and serverless environments looks simple on a slide and is anything but in production. Between the object store and the GPU sit a dozen layers — image pulls, network filesystems, page cache, model loaders, runtime initialization — and each one quietly contributes to how long a workload takes to become useful and how fast it runs once it is.
We're hiring a Performance Engineer to own that surface for Verda's container and serverless GPU platforms. You'll characterize where time and throughput actually go across the stack, and turn that understanding into concrete platform improvements. Cold-start latency is one of the more visible expressions of the problem — a 70B model that takes ninety seconds to load is a ninety-second outage from the user's perspective — but the same fundamentals shape steady-state inference throughput, training step time, and checkpoint behavior. We want someone who understands how all of t...
Ready to Apply?
Submit your application for Performance Engineer, Containers/Serverless at Verda Innovations Inc.