Site Reliability Engineer Job Vacancy at Bizquad Consultants (Remote)

Are you looking for a new job or better opportunities?
We have a new job opening for a Site Reliability Engineer.

Full Details :
Company Name : Bizquad Consultants
Location : Remote
Position : Site Reliability Engineer

Job Description : As a Site Reliability Engineer, Observability & AIOps, your mission is to protect and advance the software and systems behind Finastra’s cloud-hosted services running on FusionOperate for the biggest financial institutions in the world. Finastra believes in a blameless culture where the primary objective is continuous improvement. You’ll treat operations as a software engineering problem, aiming to build reactive systems that self-heal, so that revenue-critical systems stay up and running despite natural disasters, unexpected surges in traffic, and configuration errors.

Your day will vary from the fine-grained details of optimizing disk performance and authoring operational code for our applications to the big picture of reliability modelling. You will operate as part of a global, scaled agile SRE team, applying your experience in Continuous Delivery.

Location: Work from Home (India)

Experience & Qualifications:
· 15+ years of experience in Computer Science
· Application development using Continuous Delivery for a SaaS or managed hosted application, with operations experience
· Authoring and consuming OpenAPI and gRPC-based APIs
· Designing and implementing multi-cloud, self-service-oriented observability, AIOps and self-healing platforms
· Instrumenting metrics, logs and traces for applications and infrastructure you have worked on
· Implementing alert/log correlation and de-duplication for noise reduction
· Designing and implementing self-healing scenarios
· Integration with monitoring tools, AIOps and incident management ecosystems
· Implementing and delivering robust Infrastructure as Code (IaC)
· Designing, deploying and orchestrating microservices using Kubernetes
· Appropriate RHEL, Kubernetes and cloud certifications are a plus

Responsibilities:
· Proactively identifying and eliminating excess operational work and poorly performing services
· Authoring observability for applications and infrastructure using the RED and USE methods
· Defining the required reliability of your service through service-level indicators (SLIs) and service-level objectives (SLOs), and using an error budget to balance the pace of innovation with reliability
· Participating in continuous operations reviews and feedback loops to improve the effectiveness of monitoring
· Implementing resiliency tests, self-healing and circuit breakers to handle chaotic conditions and ensure your service behaves reasonably even in the face of unexpected demand
· Practicing Chaos Engineering in the pipeline, helping us implement and mature the practice
· Capacity planning to determine the resource requirements of your service so that it is scalable, efficient, and reliable
· Leading blameless postmortem analysis for incidents

Technology Stack Experience Required (must have at least 5 years of relevant experience):
· Multi-cloud: Azure, AWS, GCP
· Prometheus, Grafana, Loki and Tempo (OpenTelemetry), Cortex
· Programming (Python, Node.js, NestJS, Golang, Java, JavaScript)
· Kubernetes, Helm and ArgoCD, Serverless (OpenShift a plus)
· Terraform, Ansible and/or Puppet
· Moogsoft or similar AIOps tools are a plus
· Data services (Delta Lake, Knative, MongoDB, PostgreSQL/CockroachDB, Kafka, Spark, Camel) are a plus

Job Types: Full-time, Regular / Permanent
Pay: ₹2,500,000.00 – ₹3,000,000.00 per year
Benefits: Work from home
Schedule: Monday to Friday
Experience: SRE: 10 years (Preferred)

This post is listed under Software Development.
Disclaimer : Hugeshout publishes the latest job information only and is not responsible for any errors. Users must do their own research before joining any company.
