As a Data Engineer at GoodHabitz, you’ll be part of an exciting journey as we continue our migration to AWS, enhancing and modernizing our data infrastructure to support a rapidly growing business. In our scale-up environment, adaptability and problem-solving are key. This role is instrumental in designing, building, and optimizing scalable, cloud-native data architectures that meet evolving business and analytical needs.
You’ll collaborate with cross-functional teams to implement robust, efficient data pipelines and help shape a high-performing data platform centered around Databricks on AWS, Unity Catalog, Delta Live Tables, and other related services. If you're eager to make a real impact and contribute to a forward-thinking data engineering culture, we’d love to hear from you.
Key Responsibilities:
- Data Architecture & Design: Design scalable, high-performance data architectures leveraging AWS-native services, Databricks (Unity Catalog, Delta Live Tables, Delta Lake), and modern best practices.
- Pipeline Development: Build and maintain efficient, production-grade ETL/ELT pipelines using Databricks (PySpark, SQL), with a strong focus on data quality, lineage, and maintainability (see the sketch after this list for a flavor of this work).
- Cloud Infrastructure: Combine open-source technologies with AWS and Databricks services to create optimized and automated data workflows.
- Data Governance & Quality: Implement best practices for data governance, including lineage tracking, quality checks, and access control using Unity Catalog and related tools.
- Automation & CI/CD: Implement deployment automation and pipeline monitoring using Terraform for infrastructure-as-code and GitLab for version control and CI/CD workflows.
- Performance Optimization: Optimize data processing performance using techniques such as data partitioning, indexing, caching, and efficient storage formats.
- Collaboration & Mentoring: Work closely with data analysts, software engineers, and other stakeholders to gather requirements and deliver solutions. Mentor junior team members and promote engineering best practices.
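To give a concrete flavor of the pipeline work described above, here is a minimal sketch of a bronze-to-silver refinement step on Databricks. The table names (`bronze.events`, `silver.events`) and columns (`event_id`, `event_date`) are illustrative assumptions, not a description of our actual platform.

```python
# Minimal illustrative sketch: refine a hypothetical bronze Delta table into a
# partitioned silver table with a basic quality gate. All names are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events-silver-refresh").getOrCreate()

# Read raw events from a hypothetical bronze-layer Delta table.
bronze = spark.read.table("bronze.events")

# Basic quality gate: drop rows missing required keys, then deduplicate.
silver = (
    bronze
    .filter(F.col("event_id").isNotNull() & F.col("event_date").isNotNull())
    .dropDuplicates(["event_id"])
)

# Write as Delta, partitioned by date so downstream scans stay cheap.
(
    silver.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .saveAsTable("silver.events")
)
```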
Requirements
- 5+ years of experience in data engineering, including strong exposure to cloud-native architectures and modern data platforms.
- Deep understanding of ETL processes, data modeling, data warehousing, and the Medallion architecture.
- Hands-on experience with Databricks on AWS, including Unity Catalog, Delta Live Tables, and related tools.
- Proficiency in Python, PySpark, and SQL for large-scale data processing and transformation.
- Experience with infrastructure-as-code tools, especially Terraform (bonus if you've worked with AWS-native provisioning and configuration tools).
- Experience working with GitLab for version control, repository management, and CI/CD workflows.
- Strong knowledge of SQL performance tuning, query optimization, and handling large datasets efficiently.
- Comfortable working with orchestration tools such as Apache Airflow, AWS Step Functions, or similar (a minimal sketch follows the Nice to Have list below).

Nice to Have:
- Experience with containerization and serverless frameworks (e.g., Docker, Kubernetes, AWS Lambda).
- Familiarity with monitoring/observability tools such as Prometheus, Grafana, AWS CloudWatch, or Databricks-native logging/monitoring.
- Exposure to data security best practices and compliance in cloud environments.
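If you're wondering what the orchestration side can look like in practice, here is a minimal, hedged Apache Airflow sketch. The DAG id, schedule, and task body are hypothetical; in a real setup the task would more likely trigger a Databricks job or a Delta Live Tables update than print a message.

```python
# Minimal illustrative Airflow DAG (uses the Airflow 2.4+ "schedule" parameter).
# All names here are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def refresh_silver_events():
    # Placeholder: in practice, trigger a Databricks job or DLT pipeline here.
    print("refreshing silver.events")


with DAG(
    dag_id="daily_silver_refresh",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="refresh_silver_events",
        python_callable=refresh_silver_events,
    )
```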
Benefits

Here's a glimpse of what's waiting for you:
- A competitive salary package that rewards your hard work.
- 25 paid vacation days. And if that's not enough, you can purchase up to 10 more.
- A world of growth and development opportunities to enhance your skills. You'll have unlimited access to our treasure trove of GoodHabitz resources and MyAcademy.
- Access to mental coaching through our partner, OpenUp, to keep your mind in top shape.
- An annual do-good-day, fully paid, so you can contribute to a cause you're passionate about.
- Travel and expense reimbursement because we've got your journey covered.
- Pension and disability insurance, securing your financial well-being in the long run.
- A hybrid way of working.
- Working in a company that welcomes artificial intelligence and uses it to improve internal processes and ship AI-powered features quickly.
- A company laptop.