Redmond, WA
Internship
Salary: $6,550 - $13,920 per month
Research Internships at Microsoft provide a dynamic environment for research careers, with a network of world-class research labs led by globally recognized scientists and engineers. These researchers pursue innovation across a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.

If you are excited about investigating and implementing cutting-edge large language model (LLM) inference techniques and optimizations on graphics processing units (GPUs), such as quantized KV caches, flash/paged/radix attention, speculative decoding, and advanced collective communication, come join the AIFX team at Microsoft Azure. You will contribute to a production-focused, planetary-scale LLM serving stack built on top of excellent open-source efforts like vLLM, SGLang, and Hugging Face. The work includes investigating state-of-the-art approaches such as "You Only Cache Once" (YOCO) and leveraging them to save memory and compute when serving LLMs at scale. You will have the chance to explore, implement, optimize, and publish your research ideas in collaboration with Microsoft teams working on real-world production workloads at an unprecedented scale.
© 2024 H1BConnect. All rights reserved.