Specialist Site Reliability Engineer Job Vacancy at NICE CXone, Pune, Maharashtra
Full Details :
Company Name : NICE CXone
Location : Pune, Maharashtra
Position :
Job Description :
Being an effective SRE is as much about how you think as it is about your technical skills. The SRE role requires a mix of development and operations skills. Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems.
First and foremost, an SRE is a software developer who builds things. Much of our software development focuses on optimizing existing systems, building infrastructure, and eliminating work through automation. On the SRE team, you are expected to manage the complex challenges of scale that are unique to Nice InContact, using your expertise in coding, systems, the complexity of operating systems, and large-scale system design. SRE’s culture of diversity, intellectual curiosity, problem solving, and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences, and perspectives.
Generalists thrive in the SRE role. SREs ensure that the Nice InContact services, both our internally critical and our externally visible systems, have the reliability and uptime appropriate to users’ needs. Additionally, SREs keep an ever-watchful eye on our systems’ capacity and performance.
The role comes down to answering a few questions: How does something work? How can I make it run better? How do I know it’s working? How do I measure the performance? And once you have answered those questions: How do I work with other departments in the organization to do this?
Education/Experience:
Bachelor’s degree in a related field, or an equivalent relevant work history, including 1 – 3 years of provisioning automation in AWS and Azure cloud environments. Demonstrable software development experience is also required.
Experience managing services in the cloud (AWS/Azure) and their connections to enterprise infrastructure
Prior experience with Microsoft and Linux troubleshooting, coding/scripting, and higher-level languages (C#, Java, etc.)
Can demonstrate the SOLID principles when writing code, follow code flow, and use version control (Git/GitHub). Familiar with building CI/CD pipelines with Jenkins
Understanding of common scripting tools: PowerShell, Bash, AWS CLI
Experience managing a full application stack with high availability requirements is preferred
Experience managing both Microsoft and Linux servers and services
Experience leveraging monitoring and alerting tools such as Grafana and Prometheus, InSpec testing for auditing, and Chef scripts for reliable builds
Understanding of containerization and orchestration with Docker/Kubernetes
Strong written and verbal communication skills
Duties
Writing is our primary means of communication, from pull requests and team chat to knowledge sharing and communicating changes. Excellent writing skills are crucial to success.
What is toil? Understanding of toil and its characteristics, including a drive to measure and eliminate it.
What is an error budget?
What are SLIs and SLOs?
Continued improvement of tech skills is a requirement. You should be learning a new tech skill each quarter and seeking industry certifications to establish your level of knowledge.
Required: A self-motivated individual with a track record of having the internal drive and motivation to begin and continue tasks without external prodding or extra rewards.
Maintain attainable goals with your manager
Within the duties are three main areas of focus: Reliability, Monitoring/Alerting, and Service
Reliability/Availability
Collaborate and contribute with other enterprise teams
Communicate availability to the team and manager
Monitoring – site reliability and health of our systems. Learn to classify problem areas as critical, major, or minor.
Alerting – on the critical problems and errors of systems and their processes.
Metrics – gathering data for troubleshooting of all kinds. Exposing application metrics for management and monitoring.
Building new features and services is a big part of this role. We are continually developing and implementing new ways to support our teams, understand our customers’ needs, and become experts in site reliability.
Monitoring and Alerting
Development and deployment of new tools to support our systems and services in an automated fashion
Hardening our systems where applicable
Supporting the deployment of new product services
Use software development approaches to operations. You should have a breadth of experience in software development and operations, and be actively practicing site reliability principles.