Data Engineer
Cummins Inc.
**DESCRIPTION**
GPP Database Link (https://cummins365.sharepoint.com/sites/CS38534/)
**Job Summary:**
Supports, develops, and maintains a data and analytics platform that efficiently processes, stores, and makes data available to analysts and other consumers. Works with business and IT teams to understand requirements and make the best use of technologies for agile data delivery at scale. Though the role category is listed as Remote, this position is designated as Hybrid.
**Key Responsibilities:**
+ **Product & Business Alignment** – Collaborate with the Product Owner to align data solutions with business objectives and product vision.
+ **Data Pipeline Development** – Design, develop, and implement efficient data pipelines for ingesting, transforming, and transporting data into Cummins Digital Core (Azure Data Lake, Snowflake) from various sources, including transactional systems (ERP, CRM); a minimal sketch follows this list.
+ **Architecture & Standards Compliance** – Ensure alignment with AAI Digital Core and AAI Solutions Architecture standards for data pipeline design, storage architectures, and governance processes.
+ **Automation & Optimization** – Implement and automate distributed data systems, ensuring reliability, scalability, and efficiency through monitoring, alerting, and performance tuning.
+ **Data Quality & Governance** – Develop and enforce data governance policies, including metadata management, access control, and retention policies, while actively monitoring and troubleshooting data quality issues.
+ **Modeling & Storage** – Design and implement conceptual, logical, and physical data models, optimizing storage architectures using distributed and cloud-based platforms (e.g., Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB).
+ **Documentation & Best Practices** – Create and maintain data engineering documentation, including standard operating procedures (SOPs) and best practices, with guidance from senior engineers.
+ **Tool Evaluation & Innovation** – Support proof-of-concept (POC) initiatives and evaluate emerging data tools and technologies to enhance efficiency and effectiveness.
+ **Testing & Troubleshooting** – Participate in the testing, troubleshooting, and continuous improvement of data pipelines to ensure data integrity and usability.
+ **Agile & DevOps Practices** – Utilize agile development methodologies, including DevOps, Scrum, and Kanban, to drive iterative improvements in data-driven applications.
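To make the pipeline responsibility concrete, here is a minimal ingest-transform-load sketch in PySpark. The source path, column names, and lake destination are hypothetical placeholders, not Cummins systems; a production pipeline targeting Azure Data Lake or Snowflake would add authentication, schema enforcement, and monitoring.

```python
# Illustrative only: paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("erp-orders-ingest").getOrCreate()

# Ingest: read a daily extract from a transactional source (e.g., ERP).
orders = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("/landing/erp/orders/2024-01-01/")  # hypothetical landing path
)

# Transform: standardize types, derive load metadata, drop bad records.
cleaned = (
    orders
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .withColumn("load_date", F.current_date())
    .dropna(subset=["order_id"])
    .dropDuplicates(["order_id"])
)

# Load: write partitioned Parquet to a staging zone in the data lake,
# from which a warehouse such as Snowflake could ingest it.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("load_date")
    .parquet("/lake/staging/erp_orders/")  # hypothetical lake path
)
```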
**RESPONSIBILITIES**
**Core Competencies:**
+ **System Requirements Engineering:** Translate stakeholder needs into verifiable requirements, establish acceptance criteria, track requirement status, and assess the impact of changes.
+ **Collaborates:** Build partnerships and work collaboratively with others to meet shared objectives.
+ **Communicates Effectively:** Deliver multi-mode communications tailored to different audiences.
+ **Customer Focus:** Build strong customer relationships and provide customer-centric solutions.
+ **Decision Quality:** Make good and timely decisions that drive the organization forward.
+ **Data Extraction:** Perform ETL activities on data from various sources using appropriate tools and technologies.
+ **Programming:** Develop, test, and maintain code using industry standards, version control, and automation tools.
+ **Quality Assurance Metrics:** Measure and assess solution effectiveness using IT Operating Model (ITOM) standards.
+ **Solution Documentation:** Document knowledge gained and communicate solutions for improved productivity.
+ **Solution Validation Testing:** Validate configurations and solutions to meet customer requirements using SDLC best practices.
+ **Data Quality:** Identify, correct, and manage data flaws to support effective governance and decision-making.
+ **Problem Solving:** Use systematic analysis to determine root causes and implement robust solutions.
+ **Values Differences:** Recognize and leverage the value of diverse perspectives and cultures.
**Education, Licenses, and Certifications:**
+ Bachelor's degree in a relevant technical discipline, or equivalent experience required.
+ This position may require licensing for compliance with export controls or sanctions regulations.
**QUALIFICATIONS**
**Preferred Experience:**
+ Hands-on experience gained through internships, co-ops, student employment, or team-based extracurricular projects.
+ Proficiency in SQL and experience developing analytical solutions (see the sketch after this list).
+ Exposure to open-source Big Data technologies such as Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka.
+ Familiarity with cloud-based, clustered computing environments and large-scale data movement applications.
+ Understanding of Agile software development methodologies.
+ Exposure to IoT technology and data-driven solutions.
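As a small, self-contained illustration of the SQL proficiency listed above, the sketch below uses Python's built-in sqlite3 module and a window function to compute a running total per region; the table and figures are invented for the example.

```python
# Illustrative only: the table and data are made up for the example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, month TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('NA', '2024-01', 120.0), ('NA', '2024-02', 150.0),
        ('EU', '2024-01',  90.0), ('EU', '2024-02', 110.0);
""")

# Analytical query: running total per region via a window function.
rows = conn.execute("""
    SELECT region,
           month,
           amount,
           SUM(amount) OVER (
               PARTITION BY region ORDER BY month
           ) AS running_total
    FROM sales
    ORDER BY region, month
""").fetchall()

for region, month, amount, running_total in rows:
    print(region, month, amount, running_total)
```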
**Technical Skills:**
1. **Programming Languages:** Proficiency in Python, Java, and/or Scala.
2. **Database Management:** Expertise in SQL and NoSQL databases.
3. **Big Data Technologies:** Hands-on experience with Hadoop, Spark, Kafka, and similar frameworks.
4. **Cloud Services:** Experience with Azure, Databricks, and AWS platforms.
5. **ETL Processes:** Strong understanding of Extract, Transform, Load (ETL) processes.
6. **Data Replication:** Working knowledge of replication technologies like Qlik Replicate is a plus.
7. **API Integration:** Experience working with APIs to consume data from ERP and CRM systems; a paging sketch follows this list.
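As an illustration of the API-integration item above, the sketch below pages through a hypothetical REST endpoint with the requests library. The URL, query parameters, and response shape are assumptions; a real ERP or CRM API would add authentication, retries, and rate-limit handling.

```python
# Illustrative only: the endpoint, parameters, and response shape are
# hypothetical; real ERP/CRM APIs (auth, paging) will differ.
import requests

BASE_URL = "https://api.example.com/v1/customers"  # hypothetical endpoint

def fetch_all(page_size=100):
    """Page through a REST API and yield individual records."""
    page = 1
    while True:
        resp = requests.get(
            BASE_URL,
            params={"page": page, "per_page": page_size},
            timeout=30,
        )
        resp.raise_for_status()
        records = resp.json().get("data", [])
        if not records:
            break
        yield from records
        page += 1

if __name__ == "__main__":
    for record in fetch_all():
        print(record)
```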
**Job** Systems/Information Technology
**Organization** Cummins Inc.
**Role Category** Remote
**Job Type** Exempt - Experienced
**ReqID** 2410682
**Relocation Package** No