The Principal DevOps Engineer is a senior technical leader responsible for defining, evolving, and governing the organization’s DevOps, cloud, and infrastructure architecture. This role combines deep hands-on expertise with strategic ownership, driving reliability, scalability, security, and operational excellence across software delivery and infrastructure platforms.
Responsibilities
- Define and own the DevOps and platform architecture across the organization, ensuring scalable, secure, and highly available systems that support current and future business needs.
- Lead the design, implementation, and continuous improvement of CI/CD platforms, release strategies, and deployment workflows across multiple teams and environments.
- Act as a technical authority and advisor for DevOps, SRE, cloud-native, and infrastructure practices, influencing engineering standards and long-term technical direction.
- Architect and oversee hybrid infrastructure environments, including virtualized platforms (VMware, OpenStack), container orchestration (Kubernetes), and supporting services.
- Drive the development, maintenance, and evolution of internal platforms, automation frameworks, and tooling, contributing hands-on in modern programming languages (Golang preferred, but not required).
- Establish and enforce best practices for infrastructure-as-code (IaC), configuration management, and platform automation across cloud and on-prem environments.
- Lead performance, scalability, reliability, and capacity planning initiatives across virtual machines, containers, and underlying infrastructure layers.
- Champion observability, monitoring, logging, and alerting standards, ensuring actionable insights and measurable service-level indicators and objectives (SLIs/SLOs).
- Guide and mentor senior and mid-level engineers, conducting architectural reviews, design discussions, and post-incident analyses.
- Identify systemic risks and technical debt; design and drive long-term remediation strategies.
- Partner with security teams to embed DevSecOps principles, including vulnerability management, compliance automation, and secure software supply chains.
- Evaluate, select, and govern third-party platforms, vendors, and technologies related to DevOps, cloud, and infrastructure.
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field, or equivalent industry experience.
- 4+ years of experience in DevOps, SRE, Platform Engineering, or Infrastructure Engineering roles, with demonstrated technical leadership at scale.
- Deep expertise in DevOps and cloud-native technologies, including:
  - Version control systems (Git)
  - CI/CD platforms and build automation
  - Configuration management (e.g., Ansible)
  - Containerization and orchestration (Docker, Kubernetes)
- Strong hands-on experience with virtualization technologies, including:
  - VMware (vSphere, ESXi, vCenter)
  - OpenStack (Nova, Neutron, Cinder, Glance)
- Proven experience designing and operating hybrid and on-prem infrastructure integrated with modern container platforms.
- Advanced proficiency in software development for infrastructure and platform tooling; Golang is strongly preferred, with experience in additional languages considered a plus.
- Strong experience with Infrastructure as Code (IaC) tools such as Terraform, including large-scale and multi-environment deployments.
- Deep understanding of Linux systems, networking, storage, and distributed systems fundamentals.
- Demonstrated ability to lead complex technical initiatives, influence engineering direction, and drive adoption across multiple teams.
- Exceptional troubleshooting and root-cause analysis skills, particularly in high-availability and large-scale systems.
- Excellent communication, documentation, and mentorship skills.
- Experience with DevSecOps, compliance frameworks, and secure CI/CD pipelines is a strong plus.