Posted: 1 month ago
Work from Office
Full Time
L2 Support Engineer (SRE Chaos Engineering)

Area: Private cloud, VMware, OpenStack, Kubernetes, Linux, Monitoring, Reliability Engineering

Define and implement practices in Resiliency Engineering, Automation, Observability, and Chaos Testing, while ingraining a proactive chaos culture that puts reliability first in design.

Scope of work
• Supervise a team of SREs, ensuring that the production applications the team supports are stable, reliable, and well documented.
• Own the end-to-end availability and performance of mission-critical services.
• Contribute to the design and architecture of the system.
• Analyze system architectures to identify single points of failure and other areas that may present a resiliency deficiency.
• Develop software to automate chaos and resiliency test cases that simulate failures in a system that performs financial data processing.
• Integrate chaos engineering with the CI/CD process.
• Establish a process to define a hypothesis around a steady state and to simulate real-world events.
• Execute Game Days on mission-critical applications.
• Identify top errors and reliability issues, and drive root cause analysis to prevent repeat incidents.
• Analyze and debug complex issues across tiers, from frontend to mid-tier to infrastructure.
• Hands-on experience with any chaos tool (Harness, Litmus, Gremlin, Chaos Monkey, ChaosBlade).
• A mindset to identify and explore chaotic situations and conduct formalized experiments.
• Experience with monitoring and logging tools (e.g. Datadog, ELK, Prometheus, Grafana).
• Experience with Kubernetes and Docker.
• Deep understanding of SRE concepts such as SLAs, SLOs, SLIs, and error budgets.
• Experience working on cross-department efforts, communicating and negotiating with multiple teams to accomplish goals.
• Expertise in troubleshooting issues and bugs.
• Programming experience (Python/Go/shell).
• Experience in the financial domain (desirable).
• Prior SRE/DevOps experience (desirable).
Skill Set
• Experience with OS platforms (Windows, Linux, CentOS, Ubuntu, etc.).
• A highly skilled Site Reliability Engineer who will join our Technology team and work as part of a cross-functional product team to create elegant solutions to highly complex and intricate business challenges.
• Ability to prioritize and multitask.
• Excellent communication and interpersonal skills.