Job Description:
As a DevOps Engineer, you will lead the design, implementation, and maintenance of our DevOps infrastructure. Using your expertise in continuous integration, deployment, and automation, you’ll work with cross-functional teams to streamline development processes and improve system reliability. Responsibilities include managing cloud infrastructure, implementing configuration management best practices, and ensuring application security and scalability.
Key Responsibilities:
- Design, implement, and manage CI/CD pipelines to automate and optimize software development and deployment processes.
- Deploy, monitor, and maintain AI/ML models in production, ensuring scalability, security, and high availability.
- Manage and optimize cloud infrastructure (AWS, Azure, or GCP) for AI-driven applications, ensuring cost-effectiveness and performance.
- Implement infrastructure as code (IaC) using Terraform, CloudFormation, or similar tools.
- Monitor system performance, troubleshoot issues, and ensure 99.9% uptime for AI services.
- Enhance containerization and orchestration strategies using Docker and Kubernetes.
- Implement security best practices, ensuring compliance with industry standards and regulations.
- Automate and streamline deployment, configuration management, and system monitoring.
- Collaborate with AI/ML engineers, data scientists, and software developers to ensure seamless integration of AI models.
- Set up logging, monitoring, and alerting solutions using Prometheus, Grafana, ELK stack, or similar tools.
Required Skills & Experience:
- Solid experience in DevOps, Cloud Engineering, or Infrastructure Automation.
- Expertise in cloud platforms – AWS, Azure, or Google Cloud.
- Strong hands-on experience with CI/CD tools (Jenkins, GitHub Actions, GitLab CI/CD, CircleCI).
- Proficiency in containerization (Docker) and orchestration (Kubernetes, Helm).
- Experience with Infrastructure as Code (IaC) – Terraform, Ansible, CloudFormation.
- Strong scripting skills in Python, Bash, or PowerShell for automation.
- Knowledge of security best practices – IAM, firewalls, encryption, vulnerability scanning.
- Experience with monitoring/logging tools – Prometheus, Grafana, ELK stack, Datadog.
- Familiarity with networking concepts, load balancing, and DNS management.
- Certifications in AWS (AWS Certified DevOps Engineer), Azure (Azure DevOps Engineer), or Kubernetes (CKA, CKS) are preferred.