About the Role
We’re looking for a Data Engineer to join our growing data team. In this role, you’ll help design, build, and maintain scalable data pipelines and platforms. You’ll collaborate with data scientists, analysts, and product teams to support reliable data flows, build high-quality datasets, and contribute to the evolution of our data infrastructure.
This is a hands-on role for someone who enjoys working with modern data tools, stream and batch processing, and scalable data systems.
What You’ll Do
- Contribute to the design and development of real-time and batch data pipelines using Kafka, Spark, and Airflow.
- Support schema design, query optimization, and performance tuning for ClickHouse.
- Help build and maintain ETL/ELT workflows to bring data from MySQL and other sources into ClickHouse.
- Assist in developing CDC pipelines using Kafka Connect, Debezium, and Schema Registry.
- Work with MinIO for object storage and distributed data access.
- Write clean and maintainable code in Python or Java.
- Build and support reusable datasets for analytics and self-service dashboards via Metabase.
- Monitor data pipeline health and performance using Grafana and Prometheus.
- Deploy and manage data services and applications on Kubernetes (containerization, manifests, Helm charts, etc.).
- Contribute to CI/CD workflows using Argo (Argo Workflows or Argo CD).
- Collaborate with BI and data science teams to improve data accessibility, quality, and documentation.
Required Qualifications
- 2–3 years of experience as a data engineer or in a similar data-focused role.
- Proficiency in Python and/or Java.
- Hands-on experience with ClickHouse (schema design, query optimization, or production usage).
- Working knowledge of Kafka, Kafka Connect, and Schema Registry.
- Experience building ETL/ELT workflows; familiarity with CDC concepts.
- Practical experience with Apache Airflow.
- Strong SQL skills and experience with MySQL or similar RDBMS.
- Experience with or exposure to Metabase for BI and self-service analytics.
- Experience working with MinIO or similar S3-compatible object storage.
- Experience with Spark or PySpark.
- Working knowledge of Docker and deploying applications/services on Kubernetes.
- Familiarity with Argo Workflows or Argo CD for automation and deployment.
- Familiarity with monitoring and observability tools such as Grafana and Prometheus.