SRE / Application Production Support

5 - 10 years

15.0 - 30.0 Lacs P.A.

Bengaluru

Posted: 2 months ago | Platform: Naukri

Skills Required

Application Production Support, Shell Scripting, AutoSys, Splunk, Grafana, SLA/SLO

Work Mode

Work from Office

Job Type

Full Time

Job Description

Responsibilities

Accountable for driving and directing efforts to mitigate incidents and/or restore normal business operations.
Review Temporary Access requests / Get Access entitlements and grant or revoke as appropriate.
Follow documented procedures, and update documentation for resolved issues.
Determine customer experience and assess risk (core vs. minor functions, regulatory or reputational).
Determine incident severity, priority, impact, and proper escalation protocols.
Ensure the proper technology teams are engaged and supporting restoral of services.
Research and remediate customer/client data issues.
If required, request external support from software and/or hardware manufacturers.
Use tools and data to assist in research and in determining root cause and restoral steps.
Ensure proper measures are taken, such as batch restarts, routing, splashing, contingency, or escalation, to achieve restoral.
Accountable for High Impact Incident Communications (Status, Restoral Efforts, Customer/Service Impact) to the appropriate technology/business groups.
Drive requirements for Task Automation and tooling.
Provide requirements and oversight of Resiliency Exercise Chaos Engineering.
Responsible for the data content, accuracy, and completeness of the Incident ticket before a Problem ticket is created.
Drive "reconvene" efforts to identify Granular Root Cause for Sev1 to Sev4 Incidents.
Provide critical incident details and restoration efforts to the Problem Management team (Incident Description, Chronological timeframes, Scope of Impact, Triage Participants, etc.).
Ensure "known errors" and "repeatable errors" are logged in the "Known Error" database.
Document standardized processes and playbooks for Incident Management (Enhanced Identification and Restoral).
Conduct Weekly Incident Reviews.
Self-identified audit and compliance issues: identification and remediation.
Permit-to-Operate: participation in and awareness of complex, highly intrusive, high-risk change activities.
Provide personnel whose training and skills, including personal skills, are adequate to think outside the box.
Provide personnel who are experienced professionals, or professionals with specific knowledge of the client's implementation or business.
Coordinate events applicable to the functional area of responsibility.
Level 2 should be able to resolve 95% of the incidents that come to them.

Human Resources Consulting
San Francisco
