Research Scientist—Responsible Technologies Intern: 2025

Hursley, GBR

1 day ago

IBM

**Introduction**

The Responsible Tech team's research focuses on the intersection of technology and society. The team studies and devises approaches to mitigate technology risks across research and development processes, and fosters innovations that expand the societal benefits of technology.

**Your role and responsibilities**

We are seeking a motivated intern with a background in computer science, artificial intelligence, applied mathematics, computational linguistics, or a related field to apply state-of-the-art reinforcement learning techniques to language model alignment. In this role, you will have the freedom to explore multiple research directions: developing sophisticated reward models that capture human values by effectively decomposing complex value alignment into learnable components; creating synthetic training data through policy-guided rejection sampling; and implementing RL for alignment. We're particularly interested in approaches that can leverage human feedback efficiently, scale to large language models, and provide verifiable alignment guarantees. Candidates currently enrolled in graduate programs are encouraged to apply.

**Required technical and professional expertise**

* Strong experience in deep RL
* Familiarity with language models and alignment challenges
* Machine Learning
* Advanced experience with Python, PyTorch, TensorFlow
* Experience with reward modeling and synthetic data generation is highly valued

**Preferred technical and professional experience**

* Cloud-based computation
* Hands-on experience with fine-tuning of large language models
* Experience working with Hugging Face models and data
