• Proactively monitors the work queues.
• Performs operational tasks to resolve all incidents/requests in a timely manner and within the agreed SLA.
• Updates tickets with resolution tasks performed.
• Identifies, investigates, and analyses issues and errors before or as they occur, and logs all such incidents in a timely manner.
• Captures all required and relevant information for immediate resolution.
• Provides second-level support for all incidents and requests, and identifies the root cause of incidents and problems.
• Communicates with other teams and clients to extend support.
• Executes changes with risks and mitigation plans clearly identified and captured in the change record.
• Follows the shift handover process, highlighting any key tickets to be focussed on and handing over upcoming critical tasks to be carried out in the next shift.
• Escalates tickets to seek the right focus from CoE and other teams and, if needed, continues the escalation to management.
• Works with automation teams to optimize effort and automate routine tasks.
• Works across various other resolver groups (internal and external), such as Service Provider, TAC, etc.
• Identifies problems and errors before they impact a client’s service.
• Provides assistance to L1 Security Engineers for better initial triage or troubleshooting.
• Leads and manages all initial client escalations for operational issues.
• Contributes to the change management process by logging all change requests with complete details, for both standard and non-standard changes, including patching and any other changes to Configuration Items.
• Ensures all changes are carried out with proper change approvals.
• Plans and executes approved maintenance activities.
• Audits and analyses incident and request tickets for quality and recommends improvements with updates to knowledge articles.
• Produces trend analysis reports for identifying tasks for automation, leading to a reduction in tickets and optimization of effort.
• May also contribute to or support project work as and when required.
• May work on implementing and delivering Disaster Recovery functions and tests.
• Performs any other related task as required.
Required Experience:
• Versed in Windows technologies such as Domain Services and SQL, among others.
• SRE-oriented and focused
• Familiar with AWS administration, Nutanix, Apache (Web server administration), Linux System administration, BigFix
• Familiar with the ELK Stack (Elasticsearch, Logstash, Kibana) with AI/ML integration
• Familiar with SNMP (Simple Network Management Protocol), F5 (Load Balancers and related technologies)
• Familiar with JSON (Data formatting and processing)
• Familiar with API, Automation, Ansible, CI/CD, etc.
• Scripting languages: PowerShell, Bash, Python, etc.
• Moderate level of relevant managed services experience handling cross-technology infrastructure.
• Moderate-level knowledge of ticketing tools, preferably ServiceNow.
• Moderate-level working knowledge of ITIL processes.
• Moderate-level experience working with vendors and/or third parties.