Gaugran · Senior Multi-Cloud Data Engineer · 5+ yrs

Mid-Level

India · 5+ years experience · Remote

Available within 48 hrs

About Gaugran

Gaurang is a skilled Data Engineer with a strong background in cloud-based data solutions and machine learning. He has successfully designed and implemented robust data pipelines across various domains, ensuring data quality and observability. His hands-on experience with technologies like AWS, Azure, Databricks, and Apache Airflow allows him to manage complex data workflows efficiently. Gaurang is committed to delivering high-quality data solutions that drive business insights and decision-making.

Core expertise

Databricks · devops · 10/10
Python · language · 10/10
PySpark · language · 10/10
AWS · cloud · 9/10
Azure · cloud · 9/10
Apache Airflow · devops · 9/10
Snowflake · database · 9/10
PostgreSQL · database · 9/10
TensorFlow · 8/10
dbt · tooling · 8/10

Additional skills (17)

Databricks, Snowflake, PySpark, PostgreSQL, Python, Numpy, Pandas, Matplotlib, TensorFlow, AWS Glue

Why hire Gaugran?

Production deploy authority · Expert in AWS and Azure

Designed and implemented robust end-to-end data pipelines across multiple domains.

Achieved 89% prediction accuracy in a Real Estate Price Prediction Model through optimization.

Automated data workflows using Apache Airflow to ensure scalability and reliability.

Implemented custom data observability systems and alerting mechanisms for early detection of data quality issues.

Project highlights (5)

Google Analytics to Cloud Analytics Pipeline – Data Engineer

Overview: An end-to-end data pipeline that extracts data from Google Analytics and loads it into a cloud-based analytics platform for business insights.

Responsibilities:

Designed and implemented the end-to-end pipeline using AWS Glue, Databricks, and Apache Airflow (for orchestration).
Used Databricks to extract large datasets and integrated it with AWS S3 for cloud storage.
Automated scheduling and execution with Apache Airflow, optimizing resource usage.
Processed and transformed large-scale datasets with PySpark, supporting both real-time and batch workloads.
Implemented a multi-stage transformation process in Snowflake to deliver actionable insights.

AWS Glue, Databricks, Apache Airflow, Snowflake, PySpark

Key outcomes:

Designed and implemented a robust end-to-end data pipeline for Google Analytics data.
Ensured data accuracy and reliability through end-to-end data quality checks.
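
For illustration only, the kind of batch-level data quality check mentioned above might look like the sketch below. It is not code from the project: the function name `run_quality_checks`, the field names, and the two rules (minimum row count, no missing required fields) are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def run_quality_checks(rows, required_fields, min_rows=1):
    """Run simple batch-level checks (illustrative rules, not the project's)."""
    results = []

    # Check 1: the batch is not unexpectedly small.
    results.append(CheckResult(
        "row_count", len(rows) >= min_rows, f"{len(rows)} rows (min {min_rows})"))

    # Check 2: each required field is present and non-empty in every row.
    for field in required_fields:
        missing = sum(1 for r in rows if r.get(field) in (None, ""))
        results.append(CheckResult(
            f"not_null:{field}", missing == 0, f"{missing} missing values"))

    return results


# Hypothetical Google Analytics-style rows; the second has an empty session_id.
sample = [
    {"session_id": "a1", "page_views": 3},
    {"session_id": "", "page_views": 5},
]
report = run_quality_checks(sample, ["session_id", "page_views"])
failed = [r.name for r in report if not r.passed]
print(failed)
```

In an orchestrated pipeline, a non-empty `failed` list would typically stop the downstream load or raise an alert; how the project handled that is not stated in the profile.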

Product 360 Data Pipeline and Monitoring System – Data Engineer

Overview: A data pipeline and monitoring system for Product 360.

Responsibilities:

Developed and deployed the system using Databricks for processing and Snowflake for warehousing.
Ingested and processed large-scale datasets from Azure Data Lake Storage (ADLS) and Azure Blob Storage using Databricks.
Automated the pipeline with Apache Airflow to ensure scalable and reliable data flow into Snowflake.
Implemented a custom data observability system that tracks data health and detects anomalies and changes in patterns.

Databricks, Snowflake, Azure Data Lake Storage (ADLS), Apache Airflow, PySpark

Key outcomes:

Deployed a data pipeline and monitoring system for Product 360.
Implemented a custom data observability system with anomaly detection.
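
As a rough illustration of anomaly detection in a data observability system, the sketch below flags a daily row count that deviates strongly from recent history. The profile does not say which method was used; the z-score rule, the threshold of 3.0, and the name `is_anomalous` are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it lies more than `threshold` sample standard
    deviations from the mean of `history` (a simple z-score rule)."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # History is constant, so any different value is a change in pattern.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Hypothetical daily row counts for one table, followed by a sudden drop.
daily_row_counts = [1000, 1020, 980, 1010, 990]
alert = is_anomalous(daily_row_counts, 400)
print(alert)
```

A z-score rule is easy to explain and cheap to run per table, but it assumes roughly stable volumes; seasonal data would likely need a different baseline.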

Customer Data Integration and Consolidation Platform – Data Engineer

Overview: A customer data integration platform that consolidates data from multiple sources (CRM, e-commerce platforms) into a single source of truth.

Responsibilities:

Designed and implemented the platform to consolidate customer data from CRM and e-commerce platforms.
Used AWS Glue to automate the ETL process into AWS S3, ensuring data consistency.
Used Databricks and PySpark for large-scale processing and complex transformations.

AWS Glue, AWS S3, Databricks, Snowflake, Apache Airflow, PySpark

Key outcomes:

Designed and implemented a customer data integration platform for a single source of truth.
Automated ETL processes for customer data using AWS Glue into AWS S3.
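
To make the "single source of truth" idea concrete, here is a minimal sketch of consolidating customer records from two sources. It is an assumption-laden illustration, not the project's logic: matching on a normalized email address, the "first source wins" conflict rule, and the names `normalize_email` and `consolidate` are all hypothetical.

```python
def normalize_email(email):
    """Normalize the matching key so trivially different spellings collide."""
    return email.strip().lower()


def consolidate(crm_rows, shop_rows):
    """Merge records from two sources into one record per customer."""
    customers = {}
    for source, rows in (("crm", crm_rows), ("ecommerce", shop_rows)):
        for row in rows:
            key = normalize_email(row["email"])
            record = customers.setdefault(key, {"email": key, "sources": []})
            # Fill in fields not seen before; earlier sources win on conflicts.
            for field, value in row.items():
                if field != "email" and value not in (None, ""):
                    record.setdefault(field, value)
            if source not in record["sources"]:
                record["sources"].append(source)
    return customers


# Hypothetical rows: the same customer appears in both systems.
crm = [{"email": "Ana@Example.com ", "name": "Ana", "phone": ""}]
shop = [{"email": "ana@example.com", "name": "Ana P.", "phone": "555-0100"}]
merged = consolidate(crm, shop)
print(merged)
```

At scale the same idea would usually be expressed as a join plus conflict-resolution rules in PySpark or SQL rather than a Python dictionary.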

Health Care Data Extraction and Transformation – Data Engineer

Overview: A fully automated data extraction and transformation pipeline for healthcare systems.

Responsibilities:

Engineered the pipeline using AWS WorkMail, AWS S3, and AWS Lambda.
Integrated AWS Lambda with AWS WorkMail to automate ingestion of Excel files into an S3 bucket.
Triggered Lambda functions for data cleansing and transformation.

AWS WorkMail, AWS S3, AWS Lambda, PostgreSQL

Key outcomes:

Engineered a fully automated data extraction and transformation pipeline for healthcare systems.
Designed a fault-tolerant and scalable system capable of handling large data volumes.
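
One plausible shape for the Lambda side of such a pipeline is sketched below, assuming the cleansing function is triggered by an S3 event notification when a file lands in the bucket. The profile does not confirm that trigger; the bucket name, the cleansing rules, and the function names are hypothetical. The download and Excel parsing steps are left out because the source does not describe them.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


def clean_row(row):
    """Lower-case column names, trim string cells, map empty strings to None."""
    cleaned = {}
    for column, value in row.items():
        if isinstance(value, str):
            value = value.strip() or None
        cleaned[column.strip().lower()] = value
    return cleaned


def handler(event, context):
    # A real function would fetch each object and parse the Excel file here.
    return {"objects": extract_s3_objects(event)}


# Hypothetical event for a file named "reports/jan 2024.xlsx".
sample_event = {"Records": [{"s3": {
    "bucket": {"name": "inbox-bucket"},
    "object": {"key": "reports/jan+2024.xlsx"},
}}]}
result = handler(sample_event, None)
print(result)
```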

Real Estate Price Prediction Model – Data Scientist / Machine Learning Engineer

This project built an advanced real estate price prediction model with high accuracy using machine learning algorithms.
It aimed to provide stakeholders with easily interpretable predicted real estate prices.

Built the prediction model using linear regression, SVM, and decision trees.
Conducted extensive data analysis and feature engineering using Pandas, Numpy, and Seaborn for preprocessing.
Utilized Sklearn and TensorFlow to train, test, and validate various ML models.
Visualized key data insights and model results using Matplotlib and Seaborn.
Developed an optimization pipeline for model selection and hyperparameter tuning, achieving 89% prediction accuracy.

Python, Numpy, Pandas, Matplotlib, Seaborn, Sklearn, TensorFlow

Key outcomes:

Achieved 89% prediction accuracy for the real estate price prediction model.
Developed an optimization pipeline for model selection and hyperparameter tuning.
Enabled stakeholders to easily interpret predicted prices through effective visualizations.
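
The general idea behind hyperparameter tuning can be shown with a deliberately tiny stand-in. The project used Sklearn and TensorFlow models; the sketch below instead uses a one-feature ridge regression with a closed-form fit so that it runs with the standard library alone. The data, the candidate grid, and the function names are made up for illustration.

```python
def fit_ridge_slope(xs, ys, alpha):
    """Closed-form slope for y ~ w * x with an L2 penalty of strength alpha."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def mse(xs, ys, w):
    """Mean squared error of the predictions w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train, valid, alphas):
    """Return (best_alpha, best_score) by lowest validation error."""
    best = None
    for alpha in alphas:
        w = fit_ridge_slope(*train, alpha)
        score = mse(*valid, w)
        if best is None or score < best[1]:
            best = (alpha, score)
    return best


# Hypothetical data: price grows roughly 2 units per unit of the feature.
train = ([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
valid = ([5, 6], [10.0, 12.1])
best_alpha, best_score = grid_search(train, valid, [0.0, 1.0, 10.0])
print(best_alpha, best_score)
```

The key design point is that candidates are compared on held-out data rather than on the data they were fitted to; cross-validation applies the same principle across several splits.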

5+ years of industry experience

Logistics & Supply Chain · Reported in resume

Ready to work with Gaugran?

Onboard within 48 hours. No long hiring cycles, no recruiter middleman.

At a Glance

Location: India
Experience: 5+ years
Work mode: Remote
Direct hire: Possible
Start within: 48 hours
From: $2,012 / month

Single contract. Billed in USD.

Typically responds within 4 business hours.

5-day replacement guarantee

48-hour onboarding, single invoice

Direct chat — no recruiter middleman

Seniority signals

Owns production deploys · Greenfield architect · System owner · Recognised OSS contributor

Vetted by Witarist

Technical skills assessed & verified

Background & identity checked

English communication verified

Ready to onboard in 48 hours
