Senior Data Engineer PySpark/Hadoop

0 - 2 years

2.0 - 4.0 Lacs P.A.

Kolkata

Posted: 2 months ago | Platform: Naukri

Skills Required

Redis, sem, RDBMS, NoSQL, database creation, continuous integration, CI/CD, PySpark, Docker, SQL, database design, PostgreSQL, Spark, MySQL, Hadoop, ETL, MongoDB, data lake, cache, Python, GitHub, data processing, data engineering, query optimization, AWS, ETL process

Work Mode

Work from Office

Job Type

Full Time

Job Description

Role: Senior Data Engineer
Experience: 4+ years
Work Mode: Remote
Joining Time Frame: Immediate to 15 days

Key Responsibilities

Design and architect robust data pipelines for structured, semi-structured, and unstructured data.
Develop, manage, and optimize databases, including RDBMS (MySQL, PostgreSQL), NoSQL (MongoDB), and data lakes (S3).
Implement efficient ETL processes using tools like PySpark and Hadoop to transform and prepare data for analytics and AI use cases.
Optimize database performance through query tuning, indexing, and caching strategies, using tools like Redis and AWS-specific caching databases.
Build and maintain CI/CD pipelines, manage YAML files, and use GitHub for version control and collaboration.
Use Docker for containerized deployment, including hands-on work running Docker commands for database and pipeline management.
Ensure solutions follow best practices in system design, with attention to trade-offs, security, performance, and efficiency.
Monitor, maintain, and troubleshoot database infrastructure to ensure high availability and performance.
Collaborate with engineering teams to design scalable solutions for large-scale data processing.
Stay current with database technologies and apply best practices for database design and management.

Qualifications

4+ years of experience in database architecture and optimization.
Expertise in RDBMS, NoSQL, and semi-structured databases (MySQL, PostgreSQL, MongoDB).
Proficiency in a programming language for database integration and optimization (Python preferred).
Strong knowledge of distributed data processing tools like PySpark and Hadoop.
Hands-on experience with AWS services for data storage and processing, including S3.
Strong familiarity with Redis for caching and query optimization.
Proven experience with Docker for containerized deployments and with writing CI/CD pipelines using YAML files.

(ref:hirist.tech)

Outsourcing and Offshoring Consulting
