Design, develop, and optimize scalable ETL/ELT data pipelines using distributed processing frameworks like Apache Spark.
Manage and enhance data storage solutions to support large-scale data ingestion and retrieval.
Implement real-time data streaming pipelines with Apache Kafka or similar technologies to process event-level data.
Build and maintain fault-tolerant, highly available data processing systems on distributed clusters.
Collaborate with data scientists, analysts, and product teams to design datasets optimized for DMP (Data Management Platform) use cases such as user segmentation and targeting.
Perform data modeling, indexing, and tuning to improve query performance.
Develop monitoring and alerting mechanisms for data pipelines and cluster health.
Explore and integrate new open-source technologies to improve system scalability and reduce processing latency.
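The streaming responsibility above (ingesting event-level data and aggregating it per user) can be sketched without a running Kafka or Spark cluster. The following is a minimal pure-Python illustration of the consume-transform-aggregate pattern such a pipeline implements; the `Event` record and `aggregate_by_user` function are hypothetical stand-ins, not part of any framework API.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical event record, standing in for a message consumed from Kafka.
@dataclass
class Event:
    user_id: str
    event_type: str
    value: float

def aggregate_by_user(events):
    """Sum event values per user -- the same shape of keyed aggregation a
    Spark or Flink job would run over a partitioned event stream."""
    totals = defaultdict(float)
    for e in events:  # in production: a consumer loop or micro-batch, not an in-memory list
        totals[e.user_id] += e.value
    return dict(totals)

events = [
    Event("u1", "click", 1.0),
    Event("u2", "view", 0.5),
    Event("u1", "click", 2.0),
]
print(aggregate_by_user(events))  # {'u1': 3.0, 'u2': 0.5}
```

In a real pipeline the keyed aggregation would be distributed across partitions, but the per-key reduce logic is the same.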
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
2+ years of experience in data engineering or big data development, preferably in an AdTech or MarTech environment.
Strong proficiency in programming languages such as Python, Scala, or Java.
Hands-on experience with distributed data processing systems like Apache Spark or Apache Flink.
Experience with distributed messaging systems like Apache Kafka.
Deep knowledge of distributed storage systems such as HDFS and of NoSQL databases such as Cassandra, HBase, or MongoDB.
Solid SQL skills, including experience optimizing complex queries on large datasets.
Familiarity with cluster management and resource schedulers.
Experience working in on-premises or hybrid data infrastructure environments.
Strong problem-solving skills and the ability to troubleshoot distributed systems at scale.
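As an illustration of the query-optimization skill listed above, this sketch uses Python's stdlib `sqlite3` to show how adding an index on a filter column changes the execution plan from a full table scan to an index search. The table and column names are hypothetical; the same diagnostic habit (inspect the plan, add or adjust indexes, re-check) applies to any SQL engine at scale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical events table with a commonly filtered column.
cur.execute("CREATE TABLE events (user_id TEXT, ts INTEGER, value REAL)")
cur.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(f"u{i % 100}", i, float(i)) for i in range(1000)],
)

# Without an index, the planner must scan the whole table.
plan_before = cur.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(value) FROM events WHERE user_id = 'u7'"
).fetchall()

# Add an index on the filter column and re-check the plan.
cur.execute("CREATE INDEX idx_events_user ON events(user_id)")
plan_after = cur.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(value) FROM events WHERE user_id = 'u7'"
).fetchall()

print(plan_before)  # plan detail mentions a SCAN of events
print(plan_after)   # plan detail mentions a SEARCH using idx_events_user
```

The exact plan wording varies by SQLite version, but the scan-to-search shift is the signal that the index is being used.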