Our client in the Bedminster area of NJ has an excellent opportunity for a Databricks Data Engineer! The role pays up to $135K, depending on experience and technical skills, and requires being in the office two days a week.

Responsibilities:

  • Help lead the team’s transition to the Databricks platform, utilizing newer features such as Delta Live Tables and Workflows.
  • Design and develop data pipelines that extract data from Oracle, load it into the data lake, transform it into the desired format, and load it into the Databricks data lakehouse.
  • Optimize data pipelines and data processing workflows for performance, scalability, and efficiency.
  • Implement data quality checks and validations within data pipelines to ensure the accuracy, consistency, and completeness of data.
  • Help create and maintain documentation for data mappings, data definitions, architecture, and data flow diagrams.
  • Build proof-of-concepts to determine viability of possible new processes and technologies.
  • Deploy and manage code in non-prod and prod environments.
  • Investigate and troubleshoot data-related issues, and fix defects or propose solutions for fixing them.
  • Identify and resolve performance bottlenecks, including suggesting ways to optimize and tune databases and queries.

Requirements:

  • Bachelor’s Degree in Computer Science, Data Science, Software Engineering, Information Systems, or related quantitative field.
  • 4+ years of experience working as a Data Engineer, ETL Engineer, Data/ETL Architect, or in a similar role.
  • Current/active Databricks Data Engineer or Data Analyst certification is a big plus.
  • 3+ years working with Databricks, with knowledge and expertise in data structures, data storage, and change data capture gained from prior production implementations of data pipelines, optimizations, and best practices.
  • Solid, continuous experience with Python.
  • 3+ years of experience in Kimball dimensional modeling (star schemas comprising facts, Type 1 and Type 2 dimensions, aggregates, etc.) with a solid understanding of ELT/ETL.
  • 3+ years of solid experience writing SQL and PL/SQL code.
  • 2+ years of experience with Airflow.
  • 3+ years of experience working with relational databases (Oracle preferred).
  • 2+ years of experience working with NoSQL databases such as MongoDB, Cosmos DB, or DocumentDB.
  • 2+ years of cloud experience (Azure preferred).
  • Experience with CI/CD using Git/Azure DevOps.
  • Experience with storage formats including Parquet, Arrow, and Avro.

To apply for this job, email your details to haveagreatday@seanryaninc.com