2025 Summer Intern - Biology Research | Artificial Intelligence Development - Multimodal representation learning and inference for perturbation screens
Department Summary
Welcome to the Biology Research | AI Development (BRAID) department, a part of Genentech Computational Sciences (gCS). Our mission is to push the boundaries of biological discovery by employing and developing cutting-edge AI/ML methodologies. The perturbation team at BRAID has a specific focus on target discovery from diverse data modalities generated within Genentech's Research Division. Our department offers a highly dynamic and cooperative environment at the intersection of computational methods and scientific research.
Intern Project Overview
Perturbation screening is a technique to systematically investigate the effects of various perturbations (such as drugs, genetic modifications, or environmental changes) on biological systems. Combined with machine learning, multidimensional readouts from single-cell RNA sequencing (scRNA-seq) and microscopy images provide a robust approach that can be used to (i) identify disease-associated phenotypes, (ii) understand disease mechanisms, and (iii) predict a drug’s activity, toxicity or mechanism of action.
Our team is looking for a highly motivated Research Intern to develop machine learning algorithms for the analysis of high-content perturbation screens. Potential projects include (1) multimodal representation learning for screens with morphological and sequencing readouts; and (2) disease mechanism/pathway inference using perturbation screening data.
This internship position is located on-site in South San Francisco, CA.
The Opportunity
Explore machine learning approaches to representation learning on multimodal data.
Evaluate the quality of learned representations for biological discovery and their ability to reduce the effect of unwanted confounders.
Develop methods for interpreting biological phenotypes across modalities.
Develop algorithms for disease mechanism/pathway inference on perturbation maps.
Present scientific findings to the BRAID department and the Computational Sciences organization.
Program Highlights
Intensive 12-week, full-time (40 hours per week) paid internship.
Program start dates are in May/June (Summer).
A stipend, based on location, will be provided to help alleviate costs associated with the internship.
Ownership of challenging and impactful projects.
Work with some of the most talented people in the biotechnology industry.
Who You Are
Required Education
Must be pursuing a PhD (enrolled student).
Required Majors: A quantitative/computational field such as Computational Biology, Computer Science, Statistics, Mathematics, or similar.
Required Skills:
Programming (Python), machine learning (PyTorch), computer vision and/or scRNA-seq data analysis, multimodal representation learning, and familiarity with high-content screening data.
Preferred qualifications:
Familiarity with machine learning topics such as multimodal representation learning, graph theory, interpretability, and feature disentanglement.
Familiarity with high-content imaging data such as Cell Painting and optical pooled screening.
Familiarity with scRNA-seq data, GWAS data, and human genetics.
Track record of tackling challenging biological problems with advanced computational methods.
Relocation benefits are not available for this job posting.
The expected pay for this position, based on the primary location of California, is $50 per hour. Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law. This position also qualifies for paid holiday time off benefits.
#GNE-R&D-Interns-2025
Genentech is an equal opportunity employer, and we embrace the increasingly diverse world around us. Genentech prohibits unlawful discrimination based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin or ancestry, age, disability, marital status and veteran status.