Manoranjan · Senior Spark Data Engineer · 6+ yrs

Mid-Level

India · 6+ years experience · Remote

Available within 48 hrs

About Manoranjan

Manoranjan is a seasoned Sr. Data Engineer with over 6 years of experience in building and optimizing ETL pipelines on both Azure and AWS cloud platforms. He specializes in leveraging technologies such as Apache Spark and PySpark for scalable data processing and complex data wrangling. His expertise extends to CI/CD setups using Jenkins, ensuring efficient and reliable deployments. Manoranjan is adept at transforming raw data into actionable insights, contributing significantly to business objectives in marketing and anomaly detection.

Core expertise

Python · language · 10/10

PySpark · backend · 10/10

Apache Spark · backend · 9/10

Azure · cloud · 9/10

AWS · cloud · 9/10

Jenkins · devops · 9/10

SQL · database · 8/10

Pandas · tooling · 8/10

Apache Airflow · tooling · 8/10

Additional skills (13)

Python, Apache Spark, Databricks, Azure, PySpark, AWS, Pandas, SQL, AWS Lambda, RDS

Why hire Manoranjan?

Production deploy authority · Designed scalable ETL pipelines · Optimized data processing performance

Designed and implemented scalable event processing systems on AWS, ensuring data reliability.

Orchestrated complex data workflows using Airflow and PySpark for marketing purposes.

Contributed to CI/CD pipeline setups using Jenkins, streamlining development lifecycles.

Improved data processing performance by 30% across multiple ETL pipelines through various optimization techniques.

Successfully built an ETL pipeline for data synchronization between two applications for marketing purposes.

Enabled targeted marketing campaigns by developing a robust ETL pipeline for user preferences.

Identified distributor anomalies, leading to improved decision-making and reduced potential losses.

Project highlights (4)

Event Integration System ETL Pipeline – Data engineer

Overview: The project aimed to fetch event data (attendees, events, sessions, speakers) from an application called Rain Focus.

Responsibilities: Interacted with clients to gather requirements, provide work updates, and manage development/testing with team members. Prepared Technical Solution Documents outlining project implementation. Improved data processing performance using various optimization techniques.

Python, Apache Spark, Databricks, Azure, PySpark, SQL

Key outcomes:

Successfully built an ETL pipeline for data synchronization between two applications for marketing purposes.
Improved data processing performance through optimization techniques.

User Preferences ETL Pipeline – Data engineer

Overview: This project involved capturing product users' communication preferences.

Responsibilities: Provided work updates to clients and coordinated development and testing with other team members. Processed events published by EventBridge using AWS Lambda and stored them in RDS. Enriched the data further with PySpark scripts orchestrated as Airflow jobs.

AWS, AWS Lambda, RDS, Airflow, PySpark, SQL

Key outcomes:

Developed a robust ETL pipeline for user preferences, enabling targeted marketing campaigns.
Ensured data reliability and scalability using SQS for large event processing.

Shipment Data Analysis – Data engineer

Overview: The project aimed to store and analyze shipment data in ADLS.

Responsibilities: Utilized Azure Databricks, PySpark, a Blob storage account, and Data Factory v2.0 as the Big Data Platform. Developed business logic for data wrangling using PySpark.

Python, Apache Spark, Databricks, ADLS, Azure, PySpark, SQL, Pandas, Data Factory v2.0

Key outcomes:

Successfully analyzed shipment data to identify and report anomalies, mitigating potential losses.

Red Flag - Distributor anomalies – Data engineer

The project focused on identifying distributor anomalies for a retail company across various scenarios.
The goal was to flag these anomalies as 'red flags' to support better decision-making at a top level.

Understood all technical requirements for the project.
Utilized Azure Databricks, PySpark, Blob storage account, and Data Factory v2.0 as the Big Data Platform.
Used Pandas to read Excel files and transformed them into PySpark data frames in Databricks notebooks.
Played a key role in data wrangling: developed the business logic in PySpark and wrote the wrangled data to Blob storage.
Implemented Data Factory pipelines to transfer data from blob storage into ADLS.

Python, Apache Spark, Databricks, ADLS, Azure, PySpark, SQL, Pandas, Data Factory v2.0

Key outcomes:

Enabled a retail company to identify distributor anomalies, leading to improved decision-making.
Successfully implemented data processing and transfer workflows for anomaly detection.

6+ years of industry experience

Logistics & Supply Chain · Reported in resume

Ready to work with Manoranjan?

Onboard within 48 hours. No long hiring cycles, no recruiter middleman.

At a Glance

Location: India

Experience: 6+ years

Work mode: Remote

Direct hire: Possible

Start within: 48 hours

From: $2,443/month

Single contract. Billed in USD.

Typically responds within 4 business hours.

5-day replacement guarantee

48-hour onboarding, single invoice

Direct chat — no recruiter middleman

Seniority signals

Owns production deploys · System owner

Vetted by Witarist

Technical skills assessed & verified

Background & identity checked

English communication verified

Ready to onboard in 48 hours
