Shrey Soni  ·  Senior Snowflake Data Engineer  ·  8+ yrs

Senior
India  ·  8+ years experience  ·  Remote
Available within 48 hrs

About Shrey

Shrey Soni is a seasoned Data Engineer with over 8 years of experience in the field. He has a strong background in data engineering, marketing automation, and digital analytics tools, with a focus on Python and cloud technologies. His expertise includes orchestrating data workflows, automating ETL processes, and implementing data validation mechanisms. Shrey is passionate about leveraging data to drive strategic decision-making and operational excellence.

Core expertise

Python  ·  language  ·  9/10
AWS  ·  cloud  ·  8/10
Snowflake  ·  database  ·  8/10
SQL  ·  language  ·  8/10

Additional skills (21)

Python  ·  PySpark  ·  Hive  ·  Snowflake  ·  Hadoop  ·  MongoDB  ·  Apache Spark  ·  VS Code  ·  NLP  ·  Web Scraping

Why hire Shrey?

Production deploy authority  ·  Mentored juniors  ·  Recognized contributor in open source

Engineered an advanced data storage solution leveraging Amazon S3, optimizing data accessibility.

Automated 15+ ETL processes, reducing data processing time by 45%.

Implemented robust automated data validation mechanisms, guaranteeing data accuracy.

Migrated from SQL Server to Databricks, reducing operating costs by 52.35%.

Designed a data pipeline to organize data from 100+ sources while ensuring 99.8% uptime.

Reduced processing time by 85% through automated bug triage for SQL data ingestion.

Increased performance by 37% after remodeling and migrating ETL processes.

Project highlights (5)

NLP Sentiment Analysis  ·  Personal / Open-Source

NLP Sentiment Analysis — open-source web scraping + word cloud + count vectorizer. Live evidence: GitHub personal project.

Python  ·  NLP  ·  Web Scraping  ·  Count Vectorizer  ·  Word Cloud

Key outcomes:

  • Saved an average of 23 minutes per product review session by automating sentiment scoring across web-scraped reviews.

NLP Sentiment Analysis  ·  Data Scientist

Overview: Developed an NLP model to analyze product reviews. Responsibilities: Utilized web scraping and word cloud techniques to determine sentiment, saving time in review processes.

Python  ·  NLP  ·  Web Scraping

Key outcomes:

  • Reduced data-processing time by 45% via 15+ ETL automations in Databricks.

  • Reduced operational cost by 52.35% and increased performance by 37% by migrating SQL Server to Databricks Delta tables.

  • Built a pipeline organising 100+ data sources with 99.8% uptime.

ATC Dataset Analysis  ·  Data Engineer

Overview: Conducted a comprehensive analysis of the ATC dataset using Big Data tools. Responsibilities: Created partitioning of desired columns in Hive, performed data transformation using PySpark, and stored data into Azure Data Lake for visualization in Power BI.

PySpark  ·  HDFS  ·  Hive  ·  Azure Data Lake  ·  Power BI  ·  Azure Synapse

Key outcomes:

  • Reduced processing time by 85% via automated SQL-ingestion duplicate detection across merged Anaplan modules.

  • Improved forecasting precision through automated data-validation across the ETL lifecycle.

ATC Live Dataset: Big Data Analytics  ·  Associate Data Engineer

  • Comprehensive Data Pipeline using Azure Data Factory + Synapse to study the ATC Live Dataset (aviation).
  • Worked across 13+ Big Data Modules including Hadoop core, Azure Cloud, PySpark, NoSQL and SQL databases.
  • Devised event-scheduled ADF pipelines + Azure Synapse SQL pools.
  • Performed fishbone-diagram root-cause analysis; brainstormed recommendations.
Snowflake  ·  dbt  ·  Python  ·  SQL  ·  PySpark  ·  Hive  ·  Hadoop  ·  MongoDB  ·  Azure Data Factory  ·  Azure Synapse  ·  Apache Spark  ·  Data Modeling  ·  GCP BigQuery

Key outcomes:

  • Improved efficiency by 19% through fishbone RCA and increased throughput by 23% through stakeholder-aligned process recommendations.

ATC Dataset Analysis using Big Data (Personal Project, 2021)  ·  Personal Project

Personal ATC dataset analysis with Hive partitioning + PySpark + Azure Data Lake.

PySpark  ·  HDFS  ·  Hive  ·  Azure Data Lake  ·  Power BI  ·  Azure Synapse  ·  VS Code

Key outcomes:

  • Live Hive + PySpark pipeline feeding Synapse SQL Pools and Power BI dashboards for ATC dataset analytics.

8+ years of industry experience

EdTech  ·  Reported in resume
Logistics & Supply Chain  ·  Reported in resume
  • ATC Live Dataset: Big Data Analytics  ·  Associate Data Engineer  ·  Snowflake · dbt · Python · SQL +9

Ready to work with Shrey?

Onboard within 48 hours. No long hiring cycles, no recruiter middleman.

At a Glance

Location: India
Experience: 8+ years
Work mode: Remote
Direct hire: Possible
Start within: 48 hours
From: $1,796 / month

Single contract. Billed in USD.

Typically responds within 4 business hours.

5-day replacement guarantee
48-hour onboarding, single invoice
Direct chat — no recruiter middleman

Top Skills

Python  ·  9/10
AWS  ·  8/10
Snowflake  ·  8/10
SQL  ·  8/10
Seniority signals
Owns production deploys  ·  Greenfield architect  ·  System owner  ·  Code reviewer  ·  Mentor / leads juniors  ·  Recognised OSS contributor
Verified  ·  Vetted by Witarist
Technical skills assessed & verified
Background & identity checked
English communication verified
Ready to onboard in 48 hours
