Are you looking for a new job or for better opportunities?
We have a new job opening.
Full Details :
Company Name : Larsen & Toubro Infotech Limited
Location : Mumbai, Maharashtra
Job Description : Ability to design, build, and unit test applications on the Spark framework using Python.
Build PySpark-based applications for both batch and streaming requirements, which requires in-depth knowledge of most Hadoop components and Hive (HQL) databases as well.
Develop and execute data pipeline testing processes and validate business rules and policies
Optimize performance of the built Spark applications in Hadoop using configurations around SparkContext, Spark SQL, DataFrames, and pair RDDs.
Optimize performance for data access requirements by choosing the appropriate native Hadoop file formats (Avro, Parquet, ORC, etc.) and compression codecs.
Ability to design & build real-time applications using Apache Kafka & Spark Streaming
Build integrated solutions leveraging Unix shell scripting, RDBMS, Hue, Hive, HDFS File System, HDFS File Types, HDFS compression codec.
Build data tokenization libraries and integrate them with Hive & Spark for column-level obfuscation.
Experience in processing large amounts of structured and unstructured data, including integrating data from multiple sources.
Create and maintain an integration and regression testing framework on Jenkins, integrated with Bitbucket and/or Git repositories.
Participate in the agile development process, and document and communicate issues and bugs relating to data standards in scrum meetings.
Basic working knowledge of Alteryx is good to have.
Work collaboratively with onsite and offshore teams.
Develop & review technical documentation for artifacts delivered.
Ability to solve complex data-driven scenarios and triage defects and production issues
Ability to learn-unlearn-relearn concepts with an open and analytical mindset
Participate in code release and production deployment.
Challenge and inspire team members to achieve business results in a fast-paced and quickly changing environment
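The format-and-codec guidance above (Avro, Parquet, ORC) can be sketched as a simple rule-of-thumb helper. This is an illustrative sketch only: the function name, the workload labels, and the specific rules are assumptions for illustration, not requirements from the posting.

```python
def choose_format(workload: str) -> tuple[str, str]:
    """Pick a Hadoop file format and compression codec for a workload.

    workload: "batch_analytics", "streaming_ingest", or "row_lookup"
    (hypothetical labels). Returns (file_format, codec).
    """
    if workload == "batch_analytics":
        # Columnar formats (Parquet/ORC) suit column-pruned analytical
        # scans; Snappy trades compression ratio for speed and keeps
        # Parquet files splittable at the row-group level.
        return ("parquet", "snappy")
    if workload == "streaming_ingest":
        # Row-oriented Avro appends records cheaply and handles schema
        # evolution well, which matters for Kafka-fed pipelines.
        return ("avro", "snappy")
    if workload == "row_lookup":
        # ORC's built-in indexes support predicate pushdown in Hive.
        return ("orc", "zlib")
    raise ValueError(f"unknown workload: {workload}")
```

In practice the choice also depends on the query engine and downstream consumers; the point of the sketch is only that format and codec are chosen per access pattern, not globally.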
Desired Candidate Profile
Experience : 8–12 years
BE/B.Tech/B.Sc. in Computer Science, Statistics, or Econometrics from an accredited college or university.
Minimum 3 years of extensive experience in design, build and deployment of PySpark-based applications.
Expertise in handling complex large-scale Big Data environments (preferably 20 TB+).
Minimum 3 years of experience with Hive, YARN, and HDFS, preferably on a cloud data platform.
Job Segment: Database, Computer Science, Unix, SQL, Technology
This post is listed Under Technology
Disclaimer : Hugeshout works to publish the latest job info only and is in no way responsible for any errors. Users must do their own research before joining any company.