About the Role
We are looking for a Senior Data Platform Engineer to help design and build our Cloud Data Platform at scale.
You will work on the core infrastructure that powers real-time data streaming, batch processing, and data lakehouse analytics.
This role sits at the intersection of Big Data, Backend Engineering, and DevOps.
Responsibilities
- Design and maintain scalable, fault-tolerant data platform architectures
- Build and operate real-time and batch data processing systems using Apache Spark and Apache Flink
- Develop backend services and APIs using Python or Golang
- Implement and operate a Data Lakehouse using open table formats such as Apache Iceberg, Delta Lake, or Apache Hudi
- Design data ingestion and orchestration pipelines with Apache Airflow
- Implement change data capture (CDC) and event-driven architectures using Debezium and Apache Kafka (see the sketch after this list)
- Own production systems: containerization, deployment, and operations using Docker and Kubernetes
- Ensure platform reliability through monitoring, logging, and alerting (Prometheus, Grafana, ELK)
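To give a flavor of the day-to-day work, below is a minimal sketch of one such pipeline: a PySpark Structured Streaming job that consumes Debezium-style CDC events from a Kafka topic and appends them to an Iceberg table. Every name in it (broker address, topic, schema fields, checkpoint path, catalog, and table) is a hypothetical placeholder, not a description of our actual infrastructure.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = (
    SparkSession.builder
    .appName("orders-cdc-to-lakehouse")  # hypothetical job name
    .getOrCreate()
)

# Hypothetical schema for a flattened CDC payload on the topic.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("status", StringType()),
    StructField("updated_at", TimestampType()),
])

# Read raw change events from Kafka and parse the JSON value column.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")  # placeholder broker
    .option("subscribe", "orders.cdc")                # placeholder topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Append each micro-batch to an Iceberg table; assumes an Iceberg
# catalog named `lake` is already configured on the cluster.
query = (
    events.writeStream
    .format("iceberg")
    .outputMode("append")
    .option("checkpointLocation", "s3://checkpoints/orders-cdc")  # placeholder
    .toTable("lake.analytics.orders")
)
query.awaitTermination()
```

The same pattern applies equally in Flink; Spark is shown here only because it fits in a few lines.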
Requirements
- Strong software engineering fundamentals (data structures, algorithms, system design)
- Proficiency in Java/Scala and Python (Golang is a plus)
- Hands-on experience with Kafka and at least one of Spark or Flink
- Strong SQL skills and experience with PostgreSQL
- Experience with Data Lakehouse architectures and open table formats
- Experience building and operating production systems on Kubernetes
Nice to Have
- Experience with cloud and managed data platforms (AWS, Databricks, Snowflake)
- Familiarity with dbt, MLflow, or Kubeflow
- Experience with data ingestion and routing tools such as Vector, Benthos, or Apache NiFi
- Experience working with data science or machine learning teams