Site Reliability Engineer in India


Summary

Andersen is hiring a Site Reliability Engineer in India to drive reliability and performance for large-scale digital insurance platforms, enhancing integrations, optimizing cloud systems, and ensuring stable, high-quality service delivery.

The customer is a well-established global organization providing financial protection and risk-management services across various markets. With a diverse portfolio and teams operating in multiple regions, the company supports businesses and individuals through reliable, scalable solutions.

The project focuses on enhancing large-scale digital platforms, improving cloud performance, optimizing integrations, and modernizing systems to support efficient service delivery and ongoing expansion.

Responsibilities

  • Ensuring high availability, performance, scalability, and overall reliability of application infrastructure through proactive monitoring, automation, and continuous improvement.
  • Developing and implementing performance optimization strategies, including code optimization, memory management, load testing, and capacity planning.
  • Implementing and maintaining end-to-end observability, including real-time telemetry, CUJ-level metrics, dashboards, alerts, and actionable reporting.
  • Monitoring Critical User Journeys (CUJs) with product and business teams to improve end-to-end user experience and service reliability.
  • Managing SLIs, SLOs, SLAs, and error budgets across critical services while ensuring uptime and availability targets are consistently met.
  • Implementing next-generation architectural patterns and SRE recommendations to enhance fault tolerance, resilience, and disaster recovery capabilities.
  • Identifying and mitigating reliability risks, proactively addressing issues that may impact availability and minimizing service disruptions.
  • Automating key operational tasks such as deployments, scaling, failover, and remediation, and reducing manual toil through tools and process improvements.
  • Leading incident response efforts, participating in on-call rotations, and driving automated remediation for common failure scenarios.
  • Performing root-cause analysis, conducting blameless post-mortems, and implementing corrective actions to prevent recurring incidents.
  • Creating and maintaining comprehensive runbooks, operational documentation, and guidelines for incident response and system reliability.
  • Collaborating with global and regional digital teams on reliability best practices, mentoring junior SREs, and contributing to the hiring and onboarding of new SRE candidates.

Requirements

  • 10+ years of experience in application support and reliability engineering environments.
  • Strong technical background with proficiency in software development principles, application production support, SDLC best practices, and Agile methodology.
  • Hands-on SRE skills, including familiarity with SLOs, SLIs, error budgets, incident management, and conducting blameless post-mortems.
  • Solid understanding of application architectures with the ability to analyze systems and identify areas for improvement.
  • Experience working with monitoring, logging, and observability tools to track and optimize application performance.
  • Proficiency in scripting and automation tools (e.g., Python, Bash, Terraform) to reduce toil and improve operational efficiency.
  • Strong incident response and troubleshooting skills with the ability to perform effective root cause analysis.
  • Excellent collaboration and communication skills for working with cross-functional teams and clearly explaining technical concepts.
  • Ability to coach and mentor team members in SRE practices and foster a culture of reliability.
  • Proactive mindset with a focus on continuous improvement to enhance application reliability and performance.
  • English level: Intermediate+ or higher.

Reasons to join us

  • Experience working with leaders in FinTech, Healthcare, Retail, Telecom, and other industries. Andersen cooperates with businesses such as Samsung, Siemens, Johnson & Johnson, BNP Paribas, Ryanair, Mercedes, TUI, Verivox, Allianz, and T-Systems.
  • The opportunity to switch projects and/or develop expertise in an interesting business domain.
  • Flexible working conditions: you can work fully remotely, from the office, or in a hybrid format.
  • Guaranteed professional, financial, and career growth. The company runs mentoring and onboarding programs for each new employee.
  • The opportunity to earn up to an additional 1,000 USD per month by participating in the company's activities; the amount depends on your level of expertise and is included in the annual bonus.
  • Access to the corporate training portal, which holds the company's entire knowledge base and is constantly updated.
  • Vibrant corporate life (parties, pizza days, PlayStation, fruit, coffee, snacks, movies).
  • Certification compensation (AWS, PMP, etc.).
  • Referral program.
  • English courses.
  • Private health insurance and compensation for sports activities.

Join us!
