
Machine Learning Engineering Manager – LLM Serving, Infrastructure

Remote, USA Full-time Posted 2025-11-24
• Lead a high-performing engineering team to develop, build, and deploy a high-scale, low-latency LLM Serving Infrastructure.
• Drive the implementation of a unified serving layer to support multiple LLMs and inference types (batch, offline eval flows, and real-time/streaming).
• Lead all aspects of the development of the Model Registry for deploying, versioning, and running LLMs across production environments.
• Ensure successful integration with the core Personalization and Recommendation systems to deliver LLM-powered features.
• Define and champion standardized technical interfaces and protocols for efficient model deployment and scaling.
• Establish and monitor the serving infrastructure's performance, cost, and reliability, including load balancing, autoscaling, and failure recovery.
• Collaborate closely with data science, machine learning research, and feature teams (Autoplay, Home, Search, etc.) to drive the active adoption of the serving infrastructure.
• Scale up the serving architecture to handle hundreds of millions of users and high-volume inference requests for internal domain-specific LLMs.
• Drive Latency and Cost Optimization: partner with SRE and ML teams to implement techniques like quantization, pruning, and efficient batching to minimize serving latency and cloud compute costs.
• Develop Observability and Monitoring: build dashboards and alerting for service health, tracing, A/B test traffic, and latency trends to ensure adherence to defined SLAs.
• Contribute to Core LPM Serving: focus on the technical strategy for deploying and maintaining the core Large Personalization Model (LPM).
