About the Role
As a Senior Data Engineer at Telewebion, you will be a senior individual contributor responsible for driving the technical execution of core components within our high-scale data infrastructure. Working closely with the Data Team Lead, you will transform high-level product goals into concrete, scalable engineering solutions. We are looking for an engineer who proactively explores the "why" behind every challenge, collaborates on architectural improvements, and takes ownership of implementing pipelines that serve millions of concurrent users with high reliability.
Responsibilities
* Execute & Implement: Build and maintain high-performance data pipelines capable of processing high-velocity logs (1M+ RPS) from geographically distributed CDN nodes.
* Solution Engineering: Translate product needs into detailed engineering designs. You will proactively propose and iterate on solutions for complex problems such as real-time QoE (Quality of Experience) monitoring and intelligent routing.
* Scalability & Optimization: Collaborate with the team to optimize data partitioning, storage formats, and resource management to maintain system performance under heavy loads.
* Technical Ownership: Drive the implementation of data ingestion strategies (Pull vs. Push), ensuring maximum visibility into service quality with minimal latency.
* Quality & Reliability: Take ownership of the production health of your assigned pipelines. Implement monitoring, alerting, and automated recovery while ensuring data integrity.
* Analysis & Improvement: Perform root cause analysis on system bottlenecks and failures, contributing structural improvements to code and architecture.
Requirements
* 5+ Years of Experience: Solid track record in Data Engineering or Backend Infrastructure, specifically dealing with distributed systems and high-throughput data processing.
* Technical Autonomy: Proven ability to navigate ambiguity and drive the technical implementation of a component from ideation to production with minimal supervision.
* Engineering Depth: Strong understanding of software engineering principles (Clean Code, Modularity) and a solid grasp of system internals (JVM/Python memory management and how databases and message brokers handle high concurrency).
* Data Modeling: Experience in modeling business domains using appropriate data structures and choosing the right storage engine for specific access patterns.
* Distributed Systems Knowledge: Familiarity with the CAP theorem and the practical challenges of maintaining data consistency and idempotency in distributed environments.
* Production-Grade Engineering: Mastery of Python or Java/Scala and SQL, with a focus on writing testable, modular code and optimizing for cost-efficiency.
* Operational Awareness: Experience in monitoring critical data services and understanding the importance of SLAs/SLOs.
Bonus / Familiarity with our Stack
* Analytical Databases: Experience with ClickHouse, BigQuery, or Druid.
* Big Data Storage: Knowledge of HDFS, Ceph, or Iceberg.
* Streaming & Messaging: Hands-on experience with Kafka or similar event-streaming platforms.
* Orchestration: Experience with Airflow, ArgoCD, or Dagster.
* Big Data Tools: Familiarity with Spark, Trino, or Flink (Internals, UI, and optimization).
* Low-Level Knowledge: Understanding of Linux I/O stack and consensus protocols (e.g., Raft/Paxos).
* Infrastructure: Experience with Docker and Kubernetes.