We are looking for a skilled and proactive Senior DevOps Engineer to help expand and strengthen Abantether’s platform. You will work on high-scale infrastructure, modern DevOps practices, and cloud-native architectures while supporting a growing ecosystem of services and distributed systems.
Key Responsibilities:
- Design and maintain scalable, reliable, and secure infrastructure for distributed applications.
- Build and optimize CI/CD pipelines (GitLab CI/CD, Argo CD) to improve delivery speed, quality, and stability.
- Implement advanced deployment strategies using Helm, Argo CD, and GitOps workflows.
- Improve and automate deployment workflows using container-based patterns and industry best practices.
- Automate infrastructure and operational processes using Ansible, Bash/Python, and cloud-native tools.
- Manage and optimize distributed clusters (Kafka, Redis, Postgres, MongoDB) and other core infrastructure components, ensuring proper partitioning, replication, scaling, and high-availability/failover strategies.
- Operate and optimize observability systems such as Zabbix, the Prometheus/Grafana stack, and OpenTelemetry, focusing on scalability, retention, and query performance.
- Improve system resilience, performance, and debugging capabilities across services and infrastructure layers.
- Troubleshoot performance bottlenecks across networking, storage, application, and cluster components.
- Configure and optimize web servers and load balancers, including NGINX, HAProxy, Traefik, and Kubernetes ingress controllers.
- Enhance system reliability, fault tolerance, and incident response processes.
- Contribute to architectural decisions and lead infrastructure initiatives.
Required:
- 5+ years of experience as a DevOps Engineer or in a similar role.
- 3+ years of hands-on Kubernetes experience (deployment, troubleshooting, scaling).
- Strong Linux knowledge and proficiency in automation/scripting (Bash/Python).
- Solid understanding of distributed systems, networking fundamentals, and infrastructure design.
- Experience running CI/CD pipelines and managing production deployments.
- Understanding of event-driven and microservice architectures.
- Strong communication and collaboration skills; ability to work cross-functionally with development teams.
- Willingness to participate in on-call rotations and incident response.
Advantageous Skills (Nice to Have):
- Container security, compliance scanning, and image hardening.
- Cloud-native security tools (OPA Gatekeeper, Vault, Trivy).
- Cilium CNI and eBPF for network policy enforcement.
- Kubernetes networking, service mesh, and security hardening.
- Cloud-native observability (OpenTelemetry, Prometheus, Grafana stack).
- Distributed storage solutions (Longhorn, Ceph).
- Object storage systems (MinIO, S3-compatible APIs).
What We Value:
- Ownership and a proactive mindset to drive reliability and system improvements.
- Curiosity, continuous learning, and willingness to share knowledge and mentor others.
- Strong communication skills to collaborate effectively with engineering and product teams.
- Empathy for developers and end users, with a collaborative and supportive approach.