Manoranjan  ·  Senior Spark Data Engineer  ·  6+ yrs

Mid-Level
India · 6+ years experience · remote
Available within 48 hrs

About Manoranjan

Manoranjan is a seasoned Sr. Data Engineer with over 6 years of experience in building and optimizing ETL pipelines on both Azure and AWS cloud platforms. He specializes in leveraging technologies such as Apache Spark and PySpark for scalable data processing and complex data wrangling. His expertise extends to CI/CD setups using Jenkins, ensuring efficient and reliable deployments. Manoranjan is adept at transforming raw data into actionable insights, contributing significantly to business objectives in marketing and anomaly detection.

Core expertise

Python · language · 10/10
PySpark · backend · 10/10
Apache Spark · backend · 9/10
Azure · cloud · 9/10
AWS · cloud · 9/10
Jenkins · devops · 9/10
SQL · database · 8/10
Pandas · tooling · 8/10
Apache Airflow · tooling · 8/10

Additional skills (13)

Python · Apache Spark · Databricks · Azure · PySpark · AWS · Pandas · SQL · AWS Lambda · RDS

Why hire Manoranjan?

Production deploy authority · Designed scalable ETL pipelines · Optimized data processing performance

Designed and implemented scalable event processing systems on AWS, ensuring data reliability.

Orchestrated complex data workflows using Airflow and PySpark for marketing purposes.

Contributed to CI/CD pipeline setups using Jenkins, streamlining development lifecycles.

Improved data processing performance by 30% across multiple ETL pipelines through various optimization techniques.

Successfully built an ETL pipeline for data synchronization between two applications for marketing purposes.

Enabled targeted marketing campaigns by developing a robust ETL pipeline for user preferences.

Identified distributor anomalies, leading to improved decision-making and reduced potential losses.

Project highlights (4)

Event Integration System ETL Pipeline · Data engineer

Overview: The project aimed to fetch event data (attendees, events, sessions, speakers) from an application called Rain Focus.

Responsibilities: Interacted with clients to gather requirements and provide work updates, and coordinated development and testing with team members. Prepared Technical Solution Documents outlining the project implementation. Improved data processing performance using various optimization techniques.

Python · Apache Spark · Databricks · Azure · PySpark · SQL

Key outcomes:

  • Successfully built an ETL pipeline for data synchronization between two applications for marketing purposes.

  • Improved data processing performance through optimization techniques.

User Preferences ETL Pipeline · Data engineer

Overview: This project involved capturing product users' communication preferences.

Responsibilities: Provided work updates to clients and handled development and testing with other team members. Processed events published by Amazon EventBridge using AWS Lambda and stored them in RDS. Enriched this data further using PySpark scripts orchestrated as Airflow jobs.

AWS · AWS Lambda · RDS · Airflow · PySpark · SQL

Key outcomes:

  • Developed a robust ETL pipeline for user preferences, enabling targeted marketing campaigns.

  • Ensured data reliability and scalability using SQS for large event processing.
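
As an illustration of the Lambda step described above, the following is a minimal sketch, not the project's actual code. It assumes EventBridge events reach the function through an SQS queue, so each SQS record body holds one EventBridge event as JSON. The `detail` fields (`user_id`, `channel`, `opted_in`), the `user_preferences` table, and the function names are hypothetical, and an in-memory SQLite database stands in for RDS so the example runs without AWS access.

```python
import json
import sqlite3


def create_table(conn):
    """Create the (hypothetical) preferences table, keyed by event id."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS user_preferences (
            event_id   TEXT PRIMARY KEY,
            user_id    TEXT NOT NULL,
            channel    TEXT NOT NULL,
            opted_in   INTEGER NOT NULL,
            updated_at TEXT NOT NULL
        )
        """
    )


def parse_preference_event(sqs_record):
    """Flatten one SQS record that wraps an EventBridge event into a row."""
    event = json.loads(sqs_record["body"])  # EventBridge envelope as JSON
    detail = event["detail"]                # assumed payload shape
    return (
        event["id"],
        detail["user_id"],
        detail["channel"],
        int(bool(detail["opted_in"])),
        event["time"],
    )


def store_batch(batch, conn):
    """Persist a batch of SQS records; returns the number of records seen.

    A real Lambda entry point would have the signature handler(event, context)
    and open its own database connection; the connection is passed in here
    only to keep the sketch self-contained.
    """
    rows = [parse_preference_event(record) for record in batch["Records"]]
    # INSERT OR REPLACE makes a redelivered event overwrite its earlier copy.
    conn.executemany(
        "INSERT OR REPLACE INTO user_preferences "
        "(event_id, user_id, channel, opted_in, updated_at) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)
```

Keying the table on the EventBridge event `id` makes redelivered messages harmless, which matters because standard SQS queues deliver messages at least once.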

Shipment Data Analysis · Data engineer

Overview: The project aimed to store and analyze shipment data in ADLS.

Responsibilities: Used Azure Databricks, PySpark, a Blob storage account, and Data Factory v2.0 as the big data platform. Developed business logic for data wrangling using PySpark.

Python · Apache Spark · Databricks · ADLS · Azure · PySpark · SQL · Pandas · Data Factory v2.0

Key outcomes:

  • Successfully analyzed shipment data to identify and report anomalies, mitigating potential losses.

Red Flag - Distributor anomalies · Data engineer

  • The project focused on identifying distributor anomalies for a retail company across various scenarios.
  • The goal was to flag these anomalies as 'red flags' to support better decision-making at a top level.
  • Understood all technical requirements for the project.
  • Used Azure Databricks, PySpark, a Blob storage account, and Data Factory v2.0 as the big data platform.
  • Used Pandas to read Excel files and converted them into PySpark DataFrames in Databricks notebooks.
  • Wrangled the data and sent the results to Blob storage.
  • Developed business logic for data wrangling using PySpark.
  • Implemented Data Factory pipelines to transfer data from blob storage into ADLS.

Python · Apache Spark · Databricks · ADLS · Azure · PySpark · SQL · Pandas · Data Factory v2.0

Key outcomes:

  • Enabled a retail company to identify distributor anomalies, leading to improved decision-making.

  • Successfully implemented data processing and transfer workflows for anomaly detection.

6+ years of industry experience

Logistics & Supply Chain · Reported in resume

Ready to work with Manoranjan?

Onboard within 48 hours. No long hiring cycles, no recruiter middleman.

At a Glance

Location: India
Experience: 6+ years
Work mode: remote
Direct hire: Possible
Start within: 48 hours
From: $2,443 / month

Single contract. Billed in USD.

Typically responds within 4 business hours.

5-day replacement guarantee
48-hour onboarding, single invoice
Direct chat — no recruiter middleman

Top Skills

Python · 10/10
PySpark · 10/10
Apache Spark · 9/10
Azure · 9/10
AWS · 9/10

Seniority signals

Owns production deploys · System owner

Verified · Vetted by Witarist
Technical skills assessed & verified
Background & identity checked
English communication verified
Ready to onboard in 48 hours

