[Remote] Software Engineer Intern (Inference Infrastructure) - 2026 Start (PhD)
Note: This is a remote position open to candidates in the USA. ByteDance is a rapidly growing technology company known for products such as TikTok. It is seeking a Software Engineer Intern to join the Inference Infrastructure team, building large-scale, cloud-native systems for AI workloads alongside world-class engineers.
Responsibilities
- Design and build large-scale, container-based cluster management and orchestration systems with extreme performance, scalability, and resilience
- Architect next-generation cloud-native GPU and AI accelerator infrastructure to deliver cost-efficient and secure ML platforms
- Collaborate across teams to deliver world-class inference solutions using vLLM, SGLang, TensorRT-LLM, and other LLM engines
- Stay current with the latest advances in open source (Kubernetes, Ray, etc.), AI/ML and LLM infrastructure, and systems research; integrate best practices into production systems
- Write high-quality, production-ready code that is maintainable, testable, and scalable
Skills
- B.S./M.S. in Computer Science, Computer Engineering, or related fields with 2+ years of relevant experience (Ph.D. with strong systems/ML publications also considered)
- Strong understanding of large model inference, distributed and parallel systems, and/or high-performance networking systems
- Hands-on experience building cloud or ML infrastructure in areas such as resource management, scheduling, request routing, monitoring, or orchestration
- Solid knowledge of container and orchestration technologies (Docker, Kubernetes)
- Proficiency in at least one major programming language (Go, Rust, Python, or C++)
- Experience contributing to or operating large-scale cluster management systems (e.g., Kubernetes, Ray)
- Experience with workload scheduling, GPU orchestration, scaling, and isolation in production environments
- Hands-on experience with GPU programming (CUDA) or inference engines (vLLM, SGLang, TensorRT-LLM)
- Familiarity with public cloud providers (AWS, Azure, GCP) and their ML platforms (SageMaker, Azure ML, Vertex AI)
- Strong knowledge of ML systems (Ray, DeepSpeed, PyTorch) and distributed training/inference platforms
- Excellent communication skills and ability to collaborate across global, cross-functional teams
- Passion for system efficiency, performance optimization, and open-source innovation
Benefits
- Interns have day-one access to health insurance
- Life insurance
- Wellbeing benefits and more
- Interns also receive 10 paid holidays per year
- Paid sick time (56 hours if hired in the first half of the year, 40 hours if hired in the second half)
- Interns who are not working 100% remotely may also be eligible for a housing allowance