Data Engineer
Jun 2025 - Present
Curve Digital Solutions
Spearheaded the optimization of a large-scale ETL pipeline, cutting execution time by over 80% through SQL restructuring and in-memory data transformations with Python and Polars. Built real-time data streaming pipelines with Apache Kafka to support an event-driven architecture and reduce latency. Tuned Apache Spark job configurations and applied custom partitioning strategies for large-scale processing. Automated complex workflows with Apache Airflow, improving reliability and monitoring. Collaborated with other teams to productionize machine learning models, improving scalability and deployment speed.




