Bangalore, India
4 days ago
Data Scientist, Advanced

Remote Work: Hybrid


Overview: At Zebra, we are a community of innovators who come together to create new ways of working to make everyday life better. United by curiosity and care, we develop dynamic solutions that anticipate our customers’ and partners’ needs and solve their challenges.
Being a part of Zebra Nation means being seen, heard, valued, and respected. Drawing from our diverse perspectives, we collaborate to deliver on our purpose. Here you are a part of a team pushing boundaries to redefine the work of tomorrow for organizations, their employees, and those they serve.
You have opportunities to learn and lead at a forward-thinking company, defining your path to a fulfilling career while channeling your skills toward causes that you care about – locally and globally. We’ve only begun reimagining the future – for our people, our customers, and the world.
Let’s create tomorrow together.

We are seeking a highly skilled and motivated Data Scientist (LLM Specialist) to join our AI/ML team. This role is ideal for an individual passionate about Large Language Models (LLMs), workflow automation, and customer-centric AI solutions. You will be responsible for building robust ML pipelines, designing scalable workflows, interfacing with customers, and independently driving research and innovation in the evolving agentic AI space.

Key Responsibilities:

· LLM Development & Optimization: Train, fine-tune, evaluate, and deploy Large Language Models (LLMs) for various customer-facing applications.

· Pipeline & Workflow Development: Build scalable machine learning workflows and pipelines that facilitate efficient data ingestion, model training, and deployment.

· Model Evaluation & Performance Tuning: Implement best-in-class evaluation metrics to assess model performance, optimize for efficiency, and mitigate biases in LLM applications.

· Customer Engagement: Collaborate closely with customers to understand their needs, design AI-driven solutions, and iterate on models to enhance user experiences.

· Research & Innovation: Stay updated on the latest developments in LLMs, agentic AI, reinforcement learning from human feedback (RLHF), and generative AI applications. Recommend novel approaches to improve AI-based solutions.

· Infrastructure & Deployment: Work with MLOps tools to streamline deployment and serve models efficiently using cloud-based or on-premises architectures, including Google Vertex AI for model training, deployment, and inference.

· Foundational Model Training: Work with open-weight foundational models, leveraging pre-trained architectures, fine-tuning on domain-specific datasets, and optimizing models for performance and cost-efficiency.

· Cross-Functional Collaboration: Partner with engineering, product, and design teams to integrate LLM-based solutions into customer products seamlessly.

· Ethical AI Practices: Ensure responsible AI development by addressing concerns related to bias, safety, security, and interpretability in LLMs.


Requirements:

· Education: Bachelor’s/Master’s/Ph.D. in Computer Science, Machine Learning, AI, Data Science, or a related field.

· Experience: Experience in ML, NLP, or AI-related roles, with a focus on LLMs and generative AI.

· Programming Skills: Proficiency in Python and experience with ML frameworks such as TensorFlow and PyTorch.

· LLM Expertise: Hands-on experience in training, fine-tuning, and deploying LLMs (e.g., OpenAI’s GPT, Meta’s LLaMA, Mistral, or other transformer-based architectures).

· Foundational Model Knowledge: Strong understanding of open-weight LLM architectures, including training methodologies, fine-tuning techniques, hyperparameter optimization, and model distillation.

· Data Pipeline Development: Strong understanding of data engineering concepts, feature engineering, and workflow automation using Airflow or Kubeflow.

· Cloud & MLOps: Experience deploying ML models in cloud environments like AWS, GCP (Google Vertex AI), or Azure using Docker and Kubernetes.

· Model Serving & Optimization: Proficiency in model quantization, pruning, and knowledge distillation to improve deployment efficiency and scalability.

· Research & Problem-Solving: Ability to conduct independent research, explore novel solutions, and implement state-of-the-art ML techniques.

· Strong Communication Skills: Ability to translate technical concepts into actionable insights for non-technical stakeholders.

· Version Control & Collaboration: Proficiency in Git, CI/CD pipelines, and working in cross-functional teams.

Nice-to-Have:

· Experience with Reinforcement Learning from Human Feedback (RLHF) for LLMs.

· Knowledge of vector databases and retrieval-augmented generation (RAG) architectures.

· Familiarity with multi-modal AI models (vision-language models, speech-to-text, etc.).

· Understanding of agentic AI frameworks (e.g., AutoGPT, LangChain, LlamaIndex).

· Hands-on experience with Google Vertex AI Pipelines, AutoML, and model monitoring.

We are seeking an LLM enthusiast with a knack for research, customer interaction, and building impactful AI solutions.


Qualifications:
· B.Tech/M.Tech/PhD in CS/ML/Statistics.

· Preferably 8+ years’ experience as a Data Scientist, OR in a related field (Statistics / Operations Research), OR as a System Architect / Enterprise Architect.

· Design and conduct analysis/experiments: plan analysis and address competing explanations, determine the best way to evaluate results, explore structured and unstructured data appropriately, apply appropriate algorithms, and clearly document and articulate findings.

· Incorporate analysis into pipelines and pipelines into the business/engineering stack: read/write data to/from any format or location, incorporate complex matching and filtering, make work compatible with and suitable for the engineering stack, discover the needs of the business, navigate the business organization structure, and package technical work for diverse audiences.

· Build the profession: mentor/coach junior data scientists and enable/upskill citizen data scientists.

· Skills: Problem Solving / Critical Thinking, Good Business Intuition, Programming and Coding (Python / R), Mathematics and Statistics (equivalent to graduate-level Stat 101 and Math 101), Machine Learning / Deep Learning / AI, Communication / Appropriate Articulation, Data Architecture, Risk Analysis / Systems Engineering.

To protect candidates from falling victim to online fraudulent activity involving fake job postings and employment offers, please be aware that our recruiters will always connect with you via @zebra.com email accounts. Applications are only accepted through our applicant tracking system, and we only accept personal identifying information through that system. Our Talent Acquisition team will not ask you to provide personal identifying information via e-mail or outside of the system. If you are a victim of identity theft, contact your local police department.