As a data engineer, you will design and build systems for collecting, storing, and analyzing data at scale. We use modern and lesser-known tools, software, and libraries, so you need to be open to learning and flexible enough to get up to speed quickly.
Responsibilities:
- Design and build systems for collecting, storing, and analyzing data at scale
- Acquire datasets that align with business needs
- Collect data from different data sources, including XML files, JSON files, REST APIs, and NBI
- Develop algorithms to transform data into useful, actionable information
- Build, test, and maintain ETL/ELT pipeline architectures
- Collaborate with management to understand company objectives
- Create new data validation methods and data analysis tools
- Ensure compliance with data governance and security policies
General Skills:
- Python
- Debian-based Linux distributions
- Git
- Docker
- Docker Compose
- Coding best practices
- Experience with code reviews and pull requests
- English language proficiency
Special Skills:
- Python data wrangling libraries (pandas, numpy, etc.)
- Python multitasking (concurrency and parallelism) paradigms and libraries
- Python ORMs, especially SQLAlchemy
- PostgreSQL
- Apache Airflow
- NoSQL databases (experience with at least one column-oriented database is required)
Additional Skills:
- Experience managing and working with ClickHouse is a big plus.
- Experience with at least one graph database is a plus.