Our Journey So Far
At Snapp, we’re redefining how cities move. Our ride-hailing and mobility platform connects millions of riders and drivers every day, delivering safe, reliable, and efficient transport solutions. Powered by real-time data and robust infrastructure, we make urban travel faster, simpler, and more sustainable.
We operate with the mindset of a global tech leader and the agility of a startup, building services that scale across markets while staying responsive to local needs.
Your Impact
As a Senior Data Engineer, you will design, build, and maintain scalable data infrastructure and pipelines that handle billions of records each day. You will ensure fast, reliable, and high-quality data flows across our lakehouse platform, supporting both streaming and batch processing. Your work will be essential in enabling dependable data access, powering analytics, and accelerating AI-driven initiatives across the organization.
What You’ll Drive Forward
- Design and maintain large-scale ETL/ELT pipelines using Apache Flink, Airflow, and Spark for both streaming and batch workloads.
- Build and optimize real-time streaming systems using Kafka.
- Develop scalable ingestion frameworks for Delta Lake, Iceberg, and Hudi.
- Manage and optimize Ceph-based object storage within our data lakehouse.
- Oversee ClickHouse operations to ensure high-performance analytical querying.
- Drive reliability, scalability, and cost efficiency across systems handling billions of daily records.
- Deliver production-grade code in Python, Go, or Java.
- Implement data quality, monitoring, and observability frameworks.
- Collaborate with ML/AI teams to support model training, feature pipelines, and inference workflows.
- Reduce data pipeline latency by implementing efficient streaming architectures.
- Optimize storage costs while maintaining query performance across lakehouse layers.
What Powers Your Drive
- 6+ years of experience in data engineering roles.
- Strong proficiency in at least two of the following programming languages: Python, Go, and Java.
- Hands-on experience with Kafka and stream processing (Flink or Spark Streaming).
- Solid understanding of Spark and distributed computing.
- Experience with at least one lakehouse table format (Delta Lake, Iceberg, or Hudi).
- Strong SQL skills and experience with analytical databases (ClickHouse or similar columnar databases).
- Experience with DataOps practices for managing production environments, including Infrastructure as Code (e.g., Terraform, Ansible) and GitOps-based deployment strategies (e.g., Kubernetes, ArgoCD).
- Strong understanding of data modeling, data warehousing concepts, and ETL best practices.
- Experience with version control (Git) and CI/CD practices.
- Strong problem-solving abilities and analytical thinking.
- Excellent collaboration and communication skills.
- Adaptability to a rapidly evolving technology landscape.
Ready to Get on Board?
Help us shape the future of ride-hailing and urban mobility. Submit your CV and let’s build smarter cities together.