Rajat is a Data Engineer who designs and implements data pipelines on AWS, Databricks, and Snowflake. He turns raw data into analytics-ready datasets and works with product and platform stakeholders to translate business needs into technical workflows. His data validation and cleansing logic keeps data quality high for analytics. Rajat follows data engineering best practices, with a strong focus on maintainable architectures.
Designed and maintained scalable ETL/ELT pipelines on AWS and Databricks
Built PySpark and Databricks-based processing jobs for batch and streaming use cases
Collaborated with product and platform stakeholders to translate operational needs into technical workflows
Ensured secure data access and adherence to platform governance standards
Delivered analytics-ready datasets for downstream reporting
Processed 15M+ records across AWS, Databricks, and Snowflake
Implemented data validation and cleansing logic to ensure pipeline reliability
Designed optimized Snowflake schemas for analytics consumption
Overview: Cloud-scale orchestration platform enabling automated recovery workflows and real-time event processing using Kafka and AWS services.
Responsibilities: Designed and maintained scalable event-driven pipelines using Kafka for real-time operational workflows. Built PySpark and Databricks-based processing jobs to support batch and streaming use cases. Ingested system events and operational data from APIs and cloud services. Implemented data validation and cleansing logic to ensure pipeline reliability.
Overview: Enterprise data automation platform delivering large-scale ETL/ELT pipelines and analytics-ready datasets in Snowflake.
Responsibilities: Designed and developed scalable ETL/ELT pipelines processing 15M+ records across AWS, Databricks, and Snowflake. Built ingestion frameworks for APIs, files, and database sources. Implemented PySpark transformation layers on Databricks for batch processing and Kafka-based pipelines for real-time ingestion.
Overview: Healthcare analytics platform supporting batch and streaming pipelines for provider and patient data.
Responsibilities: Built end-to-end ETL pipelines ingesting data from APIs and databases into Snowflake via Databricks. Implemented Kafka-based real-time ingestion alongside PySpark batch processing on Databricks.
Overview: Enterprise analytics solution transforming unstructured email content into structured reporting datasets.
Responsibilities: Designed ETL pipelines to ingest email data and transform it using PySpark on Databricks. Built structured datasets in PostgreSQL and Snowflake for downstream reporting.
Overview: Global job market analytics platform delivering batch ingestion and reporting pipelines.
Responsibilities: Developed ETL pipelines ingesting data from APIs and external files into Snowflake using Databricks. Built PySpark-based transformation and enrichment workflows on Databricks.
Rajat Dave
Data Engineer AWS