UMAIR NAWAZ

Cloud Data Engineer

Certified Data Engineer with hands-on experience in building and optimizing ETL pipelines, real-time data streaming, and cloud-native solutions. Proficient in AWS, Apache Spark, Kafka, Airflow, and Snowflake. Passionate about designing scalable data architectures that drive actionable insights and business impact.

Experience

Data Engineer

Jun 2025 - Present

Curve Digital Solutions

Spearheaded the optimization of a large-scale ETL pipeline, reducing execution time by over 80% through SQL restructuring and in-memory data transformations with Python and Polars. Built real-time data streaming pipelines using Apache Kafka to support event-driven architecture and reduce latency. Leveraged Apache Spark for large-scale processing by tuning job configurations and applying custom partitioning strategies. Automated complex workflows using Apache Airflow, enhancing reliability and monitoring. Collaborated with other teams to productionize machine learning models, improving scalability and deployment speed.

Cloud Data Engineer

Jan 2024 - May 2025

Saylani Tech

Designed and implemented scalable ETL pipelines using Apache Airflow and Apache Spark to automate the ingestion and transformation of large datasets. Developed optimized data models in Snowflake, significantly improving query performance and enabling real-time business insights. Built cloud-native solutions using AWS services such as S3 and Lambda, alongside Apache Kafka, to ensure reliable and efficient data processing. Ensured data integrity and accuracy through validation pipelines and collaborated with cross-functional teams to deliver end-to-end data solutions for analytics and reporting.

Featured Projects

DataPulse: Real-Time Serverless Data Ingestion Pipeline

DataPulse automates real-time ingestion and processing of financial, crypto, and forex data using AWS Lambda, EventBridge, S3, SNS, and SQS, with optional transformations.

Real-Time Stock Market Data Pipeline

Built a real-time pipeline for stock market data that uses Apache Kafka for streaming and multiple AWS services for storage and querying.

Real-Time Data Pipeline with SCD Implementation

Developed an end-to-end data pipeline that generates synthetic data using Python, transfers it to S3 via NiFi, and ingests it into Snowflake using Snowpipe, with Slowly Changing Dimension (SCD) Type 1 and Type 2 implementations for tracking and managing changes to the data.

Real Estate Data Pipeline

This project implements a scalable data pipeline to extract, transform, and load real estate data from Redfin into Snowflake using AWS services. The data is later visualized in Power BI to provide insights into real estate trends.
