Bangalore
DBT with PySpark - Lead

We are seeking an experienced Data Engineer with expertise in DBT (Data Build Tool) to join our dynamic and fast-growing team. In this role, you will be responsible for transforming critical data, specifically focusing on the creation and management of silver and gold data tiers within our data pipeline. Working closely with data architects and engineers, you will help design, develop, and optimize data transformation processes, ensuring that data is clean, reliable, and ready for business intelligence and analytics teams.

Key Responsibilities:

DBT-Based Data Transformations: Lead the design, development, and implementation of data transformations using DBT, with a focus on creating and managing silver and gold data tiers within the pipeline.
Data Workflow Management: Oversee DBT workflows from data ingestion to transformation and final storage in optimized data models. Ensure seamless integration between DBT models and source systems.
Integration with Data Adapters: Work with data adapters like AWS Glue, Amazon Athena, and Amazon Redshift to ensure smooth data flow and transformation across platforms.
Data Quality & Optimization: Implement best practices to ensure data transformations are efficient, scalable, and maintainable. Optimize data models for query performance and reduced processing time.
Cross-Functional Collaboration: Collaborate with data analysts, business intelligence teams, and data architects to understand data needs and deliver high-quality datasets for analytics and reporting.
Documentation & Best Practices: Develop and maintain comprehensive documentation for DBT models, workflows, and configurations. Establish and enforce best practices in data engineering.
Data Warehousing Concepts: Apply core data warehousing principles, including star schema, dimensional modeling, ETL processes, and data governance, to build efficient data pipelines and structures.

Required Skills & Qualifications:

DBT Expertise: Strong hands-on experience with DBT for data transformations and managing data models, including advanced DBT concepts like incremental models, snapshots, and macros.
ETL and Cloud Integration: Proven experience with cloud data platforms, particularly AWS, and tools like AWS Glue, Amazon Athena, and Amazon Redshift for data extraction, transformation, and loading (ETL).
Data Modeling Knowledge: Solid understanding of data warehousing principles, including dimensional modeling, star schemas, fact tables, and data governance.
SQL Expertise: Proficient in writing and optimizing complex SQL queries for data manipulation, transformation, and reporting.
Version Control: Experience with Git or similar version control systems for code management and collaboration.
Data Orchestration: Familiarity with orchestration tools like Apache Airflow for managing ETL workflows.
Data Pipeline Monitoring: Experience with monitoring and alerting tools for data pipelines.
Additional Tools: Familiarity with other data transformation tools or languages such as Apache Spark, Python, or Pandas is a plus.