Actively Hiring

DevOps Automation Engineer

Uplers
Bhubaneswar, Khurda
Posted 2 April 2026

Salary

Not Disclosed

Job Type

full time

Location

Khurda

Job Description

Experience: 3.00 years

Placement Type: Full Time, Indefinite Contract (40 hrs a week/160 hrs a month)

Note: This is a requirement for one of Uplers' clients - Strategic Transformation Through Digital Physical Innovation

What do you need for this opportunity?

Must have skills required: Grafana, Kubernetes tools, Monitoring tools, Prometheus, Scripting, CI/CD, CockroachDB, Terraform, AWS, Docker, GCP, GitHub, Kubernetes

Strategic Transformation Through Digital Physical Innovation is looking for: Dev

This role spans:
- Dual-cloud infrastructure (AWS and GCP)
- Developer workstation management
- Security automation
- Incident response
- CI/CD pipeline operations

The platform is scaling to serve millions of consumers across 20,000 veterinary clinics in 100 countries, with tens of millions of projected concurrent sessions. You must have real-world experience operating infrastructure at this scale, including:
- Massive server loads
- Database replication and failover at scale
- Disaster recovery
- Performance optimization under heavy concurrent traffic

The current production stack runs on ECS Fargate with RDS PostgreSQL, with CockroachDB planned for distributed workloads. You will manage:
- Tailscale VPN infrastructure
- 20 EC2 dev boxes
- Cloudflare for 23 zones
- PagerDuty alerting
- A robust security automation layer

This is a hands-on role with real operational ownership.

What You'll Do

Cloud Infrastructure (AWS and GCP)
- Design, build, and manage dual-cloud infrastructure across AWS and Google Cloud Platform
- Manage ECS Fargate deployments (task definitions, service discovery, ALB target groups, and blue/green deployments)
- Automate infrastructure provisioning using Terraform with modular, reusable configurations
- Build and maintain CI/CD pipelines using GitHub Actions
- Manage containerized applications using Docker, ECS, and Kubernetes (EKS/GKE for planned workloads)
- Support multi-tenant and multi-region application architectures across 6 global regions
- Implement and maintain CockroachDB clusters for distributed, geo-partitioned data (planned migration from RDS PostgreSQL)
- Implement infrastructure cost optimization through:
  - Auto-scaling
  - Reserved capacity
  - Right-sizing
  - Spot instances
  - Savings Plans
- Continuously monitor and reduce cloud spend across AWS and GCP
- Optimize database costs through:
  - Right-sizing instances
  - Storage tiering
  - Reserved capacity
  - Query performance tuning

Developer Workstation Infrastructure
- Provision and manage 20 EC2 dev boxes across 3 AWS regions
- Build custom AMIs using Packer for dev boxes and DERP relays
- Deploy and maintain:
  - Memory watchdog
  - noVNC
  - CloudWatch agent configurations
- Run fleet management commands across dev boxes via AWS Systems Manager (SSM)
- Monitor dev box health and performance

Tailscale VPN Administration
- Manage Tailscale ACL policies and user access
- Operate custom DERP relays in 3 regions
- Configure app connectors for SaaS IP lockdown
- Maintain Mullvad VPN integration for egress control

Security Automation
- Own Guard

Hub Enterprise security, including:
  - Org management
  - IP allowlists
  - Secret scanning policies
  - Runner management

Scale, Performance and Disaster Recovery
- Design and operate infrastructure capable of handling millions of concurrent users and tens of millions of sessions across global regions
- Implement and manage auto-scaling policies, including:
  - ECS service auto-scaling
  - EC2 ASGs
  - RDS read replicas
- Conduct load testing and capacity planning
- Design and maintain database scaling strategies:
  - Read replicas
  - Connection pooling
  - Query optimization
  - Sharding for high-throughput workloads
- Own disaster recovery (DR) planning and execution:
  - Multi-region failover
  - RTO/RPO targets
  - Automated recovery runbooks
  - Regular DR drills
- Implement and manage database backup strategies:
  - Point-in-time recovery
  - Cross-region replication
  - Automated restore testing
- Optimize CDN and edge caching (Cloudflare) for global traffic at scale
- Monitor and resolve performance bottlenecks across:
  - Application servers
  - Databases
  - Caches
  - Network layers
- Build runbooks for incident response during:
  - High-traffic events
  - Database failovers
  - Regional outages

Monitoring, Alerting and Incident Response
- Configure and maintain Pager

Hub
- Maintain 45 cron jobs on the admin box
- Manage Cloudflare across 23 zones, including:
  - CDN
  - DNS
  - WAF configuration
- Collaborate with developers to improve deployment workflows and reduce lead time

AI/ML Infrastructure Tooling
- Use Claude Code / Cursor for:
  - Terraform authoring
  - Script generation
  - Infrastructure debugging
- Support AI/ML infrastructure, including:
  - GPU instance management
  - Model deployment pipelines
- Maintain and improve AI-assisted monitoring and alerting
- Support infrastructure requirements for AI-enabled platform capabilities

Must-Have Skills
- 3-5 years of experience in Dev

Critical: must have operated infrastructure serving millions of users with high concurrency
- Experience with:
  - Server load management
  - Database scaling (read replicas, connection pooling, sharding)
  - Auto-scaling policies
  - Performance optimization under heavy traffic
- Strong hands-on experience with AWS, including:
  - ECS Fargate
  - EKS
  - Lambda
  - S3
  - RDS
  - Cloud

  - Query
  - Cloud Functions
  - IAM
- ECS Fargate production experience
- Terraform (Infrastructure as Code) with multi-environment, modular patterns
- Tailscale VPN administration (ACLs, DERP relays, app connectors)
- Packer for AMI builds
- Docker and container orchestration in production
- Experience with Git

Duty or equivalent incident response configuration
- Production experience with CockroachDB or distributed SQL databases (or strong willingness to learn)
- Disaster recovery planning and execution, including:
  - Multi-region failover
  - Backup automation
  - RTO/RPO targets
  - Recovery runbooks
- Database performance optimization at scale, including:
  - Replication
  - Connection pooling
  - Query tuning
  - Capacity planning
- Cost optimization (critical): proven track record of reducing cloud infrastructure costs through:
  - Right-sizing
  - Reserved capacity
  - Spot instances
  - Storage tiering
  - Waste reduction
- Good understanding of Linux systems, networking, and security fundamentals
- Strong communication skills and ability to work in a remote, globally distributed team

Nice-to-Have Skills
- Experience with Kubernetes tools:
  - Helm
  - ArgoCD
  - Flux
- Experience with monitoring stacks:
  - Prometheus
  - Grafana
  - ELK
  - Loki
- AWS Systems Manager fleet management at scale
- Experience working in startup or fast-paced product environments
- Scripting

Ops certification or formal cloud cost management framework experience

What We're Looking For (Mindset)
- Strong ownership and problem-solving mindset
- Comfort working in a fast-growing, evolving environment
- Ability to balance speed with stability and security
- Willingness to learn and adapt to new tools and technologies
- Clear, proactive communicator who surfaces issues early

How to apply for this opportunity
- Step 1: Click on Apply and register or log in on our portal.
- Step 2: Complete the Screening Form and upload your updated resume.
- Step 3: Increase your chances of getting shortlisted and meet the client for the interview.

About Uplers: Our goal is to make hiring reliable, simple, and fast. Our role is to help all our talents find and apply for relevant contractual onsite opportunities and progress in their careers. We will support you with any grievances or challenges you may face during the engagement.

Note: There are many more opportunities apart from this one on the portal. Depending on the assessments you clear, you can apply for them as well. So, if you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, don't hesitate to apply today. We are waiting for you!
