AI Inference & Compression Engineer

PERSOL APAC · Singapore

Location: Singapore
Job Type: Full-time
Posted: June 05, 2026

Job Description

About the company:
We have partnered with a renowned global leader in information and communications technology (ICT) infrastructure and smart devices. They provide full-stack, all-scenario solutions, products, and services to carriers, enterprises, governments, and individual consumers worldwide.
Our client is looking for an AI Inference & Compression Engineer to join the team.
Job Overview: This role focuses on developing high-performance compression and inference techniques across both classical video/media codecs and modern Large Language Model (LLM) inference systems. You will design intelligent pipelines that deliver higher visual quality at lower bitrates, while simultaneously developing algorithms to reduce memory footprint and computational bottlenecks in generative AI serving.
Key Responsibilities:
LLM Inference Acceleration: Research and develop advanced compression algorithms to accelerate LLM serving. Focus on KV cache optimization, model quantization,...
