Research Internships at Microsoft provide a dynamic environment for research careers, with a network of world-class research labs led by globally recognized scientists and engineers. These teams pursue innovation across a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.
Our team works on performance analysis and optimization of large language models, spanning the stack from GPU kernel implementation through to changes in model architecture. A key challenge is that quantizing models to smaller data types is only effective if the quantized formats can be dequantized and used efficiently during computation. In this Research Internship, we will tackle this problem by exploring the co-design of quantization techniques (e.g., fewer bits per weight) and kernel design for efficient decode (e.g., expanding weights to the 4-bit, 6-bit, and 8-bit floating-point formats supported by modern GPUs).
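To make the decode step concrete, the sketch below shows the kind of dequantization kernel this co-design involves: unpacking 4-bit integer-quantized weights (two per byte, with a per-group scale) into fp16 values that a standard GEMM could consume. All names, the packing layout, and the symmetric quantization scheme are illustrative assumptions rather than an existing codebase, and in practice the expanded output might instead be one of the FP8/FP6/FP4 formats consumed directly by tensor cores.

```cuda
// Minimal illustrative sketch (assumed layout): two 4-bit weights packed per
// byte, one fp16 scale per group, symmetric quantization around a zero point
// of 8. Not a production kernel.
#include <cuda_fp16.h>
#include <cstdint>

__global__ void dequant_int4_to_fp16(const uint8_t* packed,  // n/2 bytes of packed weights
                                     const half* scales,     // one scale per group
                                     half* out,              // n dequantized values
                                     int n, int group_size) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // output element index
    if (i >= n) return;

    // Unpack the 4-bit nibble for element i (low nibble holds the even index).
    uint8_t byte = packed[i >> 1];
    int q = (i & 1) ? (byte >> 4) : (byte & 0x0F);

    // Apply the per-group scale: value = (q - 8) * scale.
    float scale = __half2float(scales[i / group_size]);
    out[i] = __float2half((q - 8) * scale);
}
```

The research question is how to fuse this kind of expansion into the surrounding matrix-multiply kernels, and how to choose the quantization format so that the unpacking stays cheap relative to the memory-bandwidth savings.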