Senior Cloud Applications Engineer
OpsWerks is a technical consulting company specializing in operational services for the high-tech industry. We help platform and infrastructure teams operate multi-cloud environments, execute complex migrations, and enable seamless app deployments.
Your Role
Serve as Subject Matter Expert (SME) for distributed applications on hybrid cloud platforms, documenting best practices and providing guidance to peers.
Champion continuous operational improvements informed by metrics analysis and customer feedback.
Lead incident management, including troubleshooting and response coordination, and conduct comprehensive post-incident reviews.
Clearly communicate complex technical issues to development teams, document root causes, and collaborate internally to create robust solutions.
Manage, deploy, and maintain enterprise applications and cloud-based systems using secure, scalable, and reliable frameworks.
Proactively monitor, troubleshoot, and optimize the health, performance, and reliability of applications and platforms.
Perform detailed log analysis and utilize stack traces to debug and resolve issues reported by partners and end-users.
Develop comprehensive documentation covering operational procedures, system configurations, and environment setups.
Continuously identify and implement automation opportunities to reduce manual tasks and operational overhead.
Train junior engineers in your areas of expertise.
Participate in a 24x7 shift rotation.
Your Qualifications
Bachelor’s degree in Information Technology, Engineering, or a related technical field.
Minimum 5 years of experience supporting critical, high-availability production systems with a focus on automation, reliability, and operational excellence.
At least 5 years of hands-on experience with one or two tools in each of the following domains:
Linux Administration & Troubleshooting: RHEL, CentOS, Ubuntu, or a similar Unix-like OS.
Distributed Applications: Microservices architecture and distributed application support.
Logging & Monitoring: Splunk, Grafana, Prometheus.
Incident Management: PagerDuty, ServiceNow.
Version Control: Git, GitHub, GitLab.
Plus points if you have:
Certifications such as CKA, CKAD, or cloud certifications (AWS, Azure, GCP).
Experience supporting and maintaining PaaS environments, CDNs, message queues, API gateways, and proxies in scalable, resilient architectures.
Proven success in cross-functional collaboration within modern DevOps environments.
Ability to drive operational efficiency through automation, using Bash, Python, or similar scripting languages.
Ready to start your awesome journey and be part of OpsWerks?