Senior Software Engineer I, Site Reliability Engineering (SRE) (SSE-1) Job Vacancy in Yatra Online Delhi, Delhi – Updated today

Full Details :
Company Name : Yatra Online
Location : Delhi, Delhi
Position : Senior Software Engineer I, Site Reliability Engineering (SRE) (SSE-1)

Job Description :

What you will be responsible for:

Yatra is one of the world’s rapidly growing online travel booking platforms. We constantly strive to make the travelling experience more convenient, safer and easier than ever. With a diverse portfolio under our belt, we at Yatra have an enormous client base across different areas of travel technology. To help us stay at the top of our game, we actively look for performance-driven, innovative and collaborative people to join us, help us grow, and offer a stable and reliable travel platform to our consumers.

Your work output will enable other development teams to deliver, deploy and monitor the software systems that power online bookings, payment transactions and personalized messaging for millions of customers who book their travel with Yatra.com.

This career-establishing role exposes you to complex programming skills, design patterns, and SRE and DevOps practices. The role therefore requires you to demonstrate the ability to quickly learn a programming language and an automation framework that you can use to build an A-class service. In general, you will be part of an SRE team that is responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.

Team:

We believe that “travel makes a change to people’s lives”. As a travel tech company, Yatra provides a unique and amazing travel booking experience. At Yatra, SRE is a capability that ensures the stability and reliability of products built and run on large-scale distributed systems, which in turn provide an exceptional, uninterrupted user experience on our web and mobile platforms. It is a place of creative, collaborative and confident individuals. As an agile team, we are passionate about running a robust and reliable platform for our consumers.

Profile :

What you need to succeed in the role:

- Be responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.
- Create a bridge between development and operations by applying a software engineering mindset to system administration topics.
- Split time between operations/on-call duties and developing systems and software that help increase site reliability and performance.
- Build self-service tools for user groups that rely on SRE, for example automatic provisioning of test environments, log and statistics visualization, monitoring dashboards, developer portals, repository access management, etc.
- Collaborate closely with product developers to ensure that the designed solution meets non-functional requirements such as availability, performance, security, and maintainability.
- Contribute to SLI, SLO and SLA definition, monitoring, alerting and reporting efforts.
- Stay abreast of new trends in application and infrastructure monitoring, provisioning, maintenance and uptime.
- Learn, prototype and apply the newest tools and best practices in real life to meet the goals of the SRE practice.
- Minimum 4 years of experience with infrastructure monitoring and APM tools such as Centreon, Nagios, Zabbix, AppDynamics, AppOptics, New Relic, etc.
- Minimum 4 years of experience in log/event aggregation and monitoring systems such as Splunk, Elasticsearch (ELK), Prometheus, Grafana, Graylog or similar.
- Familiarity with one or more log and event aggregation and monitoring systems such as Splunk, Elasticsearch (ELK), Prometheus, Grafana, Graylog or similar.
- Sound experience troubleshooting performance issues in a distributed environment (preferably Kubernetes), related to pods, network, application servers, load balancers, etc.
- Strong expertise in managing production incidents, with experience driving resolution and stakeholder communication during incidents.

Additional Skills:

- Proficiency in one or more scripting languages or automation tools for automating systems, e.g. Bash, Python, Ansible, Puppet, would be an asset.
- Must have skills in investigating and troubleshooting complicated systems/platforms and identifying key points of failure.
- Knowledge of the fundamental principles of distributed systems (architectures, microservices, high availability, elections) will be an added advantage.

How to apply:

Email your latest resume to jobs@yatra.com. Mention the job title “Senior Software Engineer I, Site Reliability Engineering (SRE) (SSE-1)” in the subject field for quick consideration.
