- 👨‍💻 I'm currently working as a Database Administrator, building strong foundations in data management and reliability
- 🌱 Transitioning into Data Engineering by designing end-to-end batch and streaming pipelines
- 🛠️ Passionate about building scalable, reliable data pipelines that turn raw data into actionable insights
- 🎯 Open to collaborating on Data Engineering & Open Source projects
- 👨‍💻 Explore my work here: My Portfolio
- 📫 Reach me at [email protected]
- ⚡ Fun fact: I debug pipelines the way I play games, with persistence and strategy

YouTube Data Engineering Pipeline (Batch Processing)
End-to-end batch ETL pipeline implementing the Medallion Architecture (Bronze → Silver → Gold).
- Orchestrated with Apache Airflow (3.x)
- Transformations with Apache Spark
- Data lake layers on the local filesystem (Bronze/Silver/Gold)
- Serving layer in Postgres (analytics-ready tables)
- Interactive Streamlit + Altair dashboard via SQLAlchemy
- Ingests raw YouTube trending data (CSV/JSON), cleans, enriches, and computes derived metrics for BI
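
As a rough illustration of the Bronze → Silver → Gold flow described above, here is a minimal sketch using only the Python standard library. It is not the project's Spark code: the column names (`video_id`, `category_id`, `views`, `likes`), the sample rows, and the `like_rate` metric are assumptions made up for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw export; the real dataset's columns may differ.
RAW_CSV = """video_id,category_id,views,likes
a1,10,1000,50
a1,10,1000,50
b2,10,3000,150
c3,24,,7
d4,24,500,25
"""

def to_bronze(raw_text):
    """Bronze: land the raw rows unchanged (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def to_silver(bronze_rows):
    """Silver: cast types, drop rows with missing numbers, de-duplicate by video_id."""
    seen, silver = set(), []
    for row in bronze_rows:
        if not row["views"] or not row["likes"] or row["video_id"] in seen:
            continue
        seen.add(row["video_id"])
        silver.append({
            "video_id": row["video_id"],
            "category_id": int(row["category_id"]),
            "views": int(row["views"]),
            "likes": int(row["likes"]),
        })
    return silver

def to_gold(silver_rows):
    """Gold: aggregate per category and add a derived like rate (assumed metric)."""
    totals = defaultdict(lambda: {"videos": 0, "views": 0, "likes": 0})
    for row in silver_rows:
        agg = totals[row["category_id"]]
        agg["videos"] += 1
        agg["views"] += row["views"]
        agg["likes"] += row["likes"]
    return {
        cat: {**agg, "like_rate": agg["likes"] / agg["views"] if agg["views"] else 0.0}
        for cat, agg in totals.items()
    }

bronze = to_bronze(RAW_CSV)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Each layer only reads from the one before it, so the raw data stays untouched and any layer can be rebuilt from its predecessor.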

StockPulse (Streaming Pipeline)
Real-time streaming pipeline simulating stock ticks and processing them end-to-end.
- Ingestion via Kafka producer publishing to the stock_ticks topic
- Processing with Spark Structured Streaming (schema enforcement + derived metrics)
- Dual sinks: Postgres (serving layer) + Parquet (partitioned by index/date)
- Interactive Streamlit + Altair dashboard for real-time visualization
- Fully orchestrated with Apache Airflow
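
To make the "schema enforcement + derived metrics" step more concrete, here is a small standard-library sketch of the idea. It does not use Kafka or Spark and is not the project's code. The tick fields (`index`, `price`, `ts`), the `pct_change` metric, and the sample messages are assumptions for illustration. The `index=.../date=...` string mirrors the directory layout that partitioned Parquet output typically uses.

```python
import json
from datetime import datetime

# Hypothetical tick schema: field name -> required Python type.
TICK_SCHEMA = {"index": str, "price": float, "ts": str}

def parse_tick(message):
    """Return the tick as a dict, or None if it violates the assumed schema."""
    try:
        tick = json.loads(message)
    except json.JSONDecodeError:
        return None
    if not isinstance(tick, dict) or set(tick) != set(TICK_SCHEMA):
        return None
    if any(not isinstance(tick[k], t) for k, t in TICK_SCHEMA.items()):
        return None
    try:
        datetime.fromisoformat(tick["ts"])  # timestamp must be ISO 8601
    except ValueError:
        return None
    return tick

def process(messages):
    """Validate each message, add derived fields, and yield enriched ticks."""
    last_price = {}
    for message in messages:
        tick = parse_tick(message)
        if tick is None:
            continue  # drop malformed records
        prev = last_price.get(tick["index"])
        # Percent change versus the previous valid tick of the same index.
        tick["pct_change"] = (tick["price"] - prev) / prev * 100 if prev else None
        last_price[tick["index"]] = tick["price"]
        date = datetime.fromisoformat(tick["ts"]).date().isoformat()
        tick["partition"] = f"index={tick['index']}/date={date}"
        yield tick

messages = [
    '{"index": "IDX_A", "price": 100.0, "ts": "2024-01-02T09:30:00+00:00"}',
    '{"index": "IDX_A", "price": "bad", "ts": "2024-01-02T09:30:01+00:00"}',
    'not json',
    '{"index": "IDX_A", "price": 110.0, "ts": "2024-01-02T09:30:02+00:00"}',
]
enriched = list(process(messages))
```

Malformed records are dropped here for brevity; routing them to a separate location for inspection would be another reasonable choice.
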
Note: Top languages is only a metric of the languages my public code consists of and doesn't reflect experience or skill level.

