Role Definition:
Data engineers build and test scalable Big Data ecosystems for businesses and help keep data systems stable and highly optimized. They also upgrade existing systems to newer versions of current technologies to improve the efficiency of databases, data storage layers, and data processing workflows. Data engineers are expected to:
• Analyse and understand data sources and APIs
• Design and develop methods to connect and collect data from different data sources
• Work closely with data scientists to ensure the source data is aggregated and cleansed
• Work with product managers to understand the business objectives
• Work with cloud and data architects to define a robust cloud architecture and set up pipelines and workflows
• Work with DevOps to build automated data pipelines
Required Capabilities:
Expertise in all phases of data engineering, from initial data analysis through data extraction, cleansing, and transformation
Experience delivering diverse data engineering workloads on the Azure data platform
Build and deliver data pipelines connecting various enterprise data sources, including RDBMS, NoSQL, and APIs
Design and development of data extraction, data ingestion, and data quality rule implementation
Proficiency in scripting with a programming language
Clean and process data for machine learning consumption
Strong working experience with Azure services: Azure Data Lake Storage Gen2, Data Factory (pipelines and Data Flows), and Azure Synapse Analytics (formerly Azure SQL DW)
Must have experience with Azure Databricks and good hands-on skills in Python, PySpark, and Spark SQL
Experience with Power BI is a plus.
Must have worked on loading and processing data from various sources
Experience working with the Scrum methodology
Expert-level SQL and experience with relational and NoSQL databases