Data Engineer | Senior DBA
Transforming Complex Data into Actionable Insights
Hi, I’m Pravin Regismond. I work with data: processing it, storing it, and making it useful. I started my career in database administration and spent over two decades managing and optimizing systems. More recently, I’ve been focusing on data engineering, working with ETL pipelines, cloud platforms, and real-time data processing.
Feel free to check out my projects here or connect with me below.
Nov 2024
This project aimed to analyze car sales and dealer profits for SwiftAuto Traders by creating visualizations and presenting them as dashboards. The approach involved using Snowflake’s Snowsight to create and analyze business intelligence (BI) dashboards. Additionally, a Streamlit app was provided to produce the same visualizations using Streamlit-in-Snowflake (SiS).
Objectives:
Impact:
Skills: Python (Programming Language) · Data Engineering · Snowflake · Data Visualization · Security · Streamlit · Problem Solving · Data Warehousing · Business Intelligence
Sep 2024
This project aimed to predict the locations of the Freezing Point food truck by analyzing historical location data and creating a predictive model. The approach involved using Snowflake’s Snowpark ML and XGBoost to develop and evaluate the model. Additionally, the project included creating a complete end-to-end workflow for data processing and model training.
Objectives:
Impact:
Skills: Python (Programming Language) · Machine Learning · Snowflake · Data Engineering · Data Visualization · XGBoost · Problem Solving · Data Warehousing · Business Intelligence
Aug 2024
This project sought to build a logistic regression classifier using the PySpark machine learning library (MLlib) and Python to classify patients as diabetic or non-diabetic. My approach was to build a machine learning model that accurately predicts whether a patient has diabetes.
Objectives:
Impact:
Skills: Python (Programming Language) · Data Engineering · Apache Spark ML · PySpark · Machine Learning · Data Science · Problem Solving · Apache Spark
Jun 2024
This project involved identifying patterns in the volume and location of waste collection across Brazil. My approach was to design a data warehouse and then build visualizations of the waste collected by truck type, city, station ID, and month.
Objectives:
Impact:
Skills: Data Modeling · Data Engineering · Data Visualization · Problem Solving · PostgreSQL · Data Warehousing · IBM Cognos Analytics
Apr 2024
This project sought to improve traffic flow on national highways by analyzing road traffic data from various toll plazas. My approach was to consolidate the disparate data from different toll operators and IT systems into a single file and then create a data pipeline to continue collecting the streaming data into a database for future analysis. During the process, I encountered carriage return characters (^M) and provided two potential solutions.
Objectives:
Impact:
Skills: Extract, Transform, Load (ETL) · Python (Programming Language) · Apache Airflow · Data Engineering · Bash · MySQL · Problem Solving · Apache Kafka · Shell Script
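The description above does not say which two solutions were proposed for the carriage return characters (^M). As an illustration only, here are two common ways to handle them in Python. The function names and the sample toll records are hypothetical and are not taken from the project.

```python
def normalize_line_endings(text: str) -> str:
    """Option 1: convert CRLF and stray CR characters to LF before parsing."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def strip_trailing_cr(lines):
    """Option 2: split first, then trim a trailing CR from each line."""
    return [line.rstrip("\r") for line in lines]


# Hypothetical sample with Windows-style line endings (shown as ^M in some editors).
sample = "1,car,PTE\r\n2,truck,PTP\r\n"

cleaned_text = normalize_line_endings(sample)
cleaned_lines = strip_trailing_cr(sample.split("\n"))
```

The first option suits a one-off cleanup of a whole file before loading it. The second fits a line-by-line pipeline, where each record is cleaned as it arrives.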
Mar 2024
This project required the creation of a database that managers in London, Berlin, and New Delhi could query for the top 10 largest banks by market capitalization in their local currency. My approach was to compile the list of the top 10 largest banks ranked by market capitalization in billions of USD and then transform and store it in USD, GBP, EUR, and INR based on the provided exchange rates.
Objectives:
Impact:
Skills: Extract, Transform, Load (ETL) · Python (Programming Language) · Beautiful Soup · Data Engineering · Pandas · Web Scraping · Problem Solving · SQLite
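The currency transformation described in this project can be illustrated with a minimal sketch. It is not the project's actual code: the bank names, the exchange rates, and the column names (`MC_USD_Billion` and the derived `MC_<currency>_Billion` keys) are hypothetical placeholders, and plain Python dictionaries stand in for the Pandas DataFrame the project would have used.

```python
# Hypothetical exchange rates per 1 USD (placeholders, not the provided rates).
EXCHANGE_RATES = {"GBP": 0.8, "EUR": 0.9, "INR": 80.0}


def add_local_currencies(banks, rates):
    """Return a copy of each bank record with market cap converted per currency.

    banks: list of dicts that each hold a "MC_USD_Billion" value.
    rates: mapping of currency code to units of that currency per 1 USD.
    """
    converted = []
    for bank in banks:
        row = dict(bank)  # copy so the input records are left untouched
        for currency, rate in rates.items():
            # Round to two decimals, a common choice for reporting in billions.
            row[f"MC_{currency}_Billion"] = round(bank["MC_USD_Billion"] * rate, 2)
        converted.append(row)
    return converted


# Placeholder records; real rows would come from the scraped ranking.
banks = [
    {"Name": "Bank A", "MC_USD_Billion": 100.0},
    {"Name": "Bank B", "MC_USD_Billion": 50.0},
]
result = add_local_currencies(banks, EXCHANGE_RATES)
```

Each office can then query only the column for its own currency, while the USD figure remains the single source the other columns are derived from.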
Feb 2024
This project aimed to identify the optimal angle of attack and flow direction for airfoil noise reduction. My approach was to build extract, transform, load (ETL) and ML pipelines on data from a series of aerodynamic and acoustic tests of airfoil blade sections conducted in an anechoic wind tunnel.
Objectives:
Impact:
Skills: Extract, Transform, Load (ETL) · Python (Programming Language) · Data Engineering · Apache Spark ML · PySpark · Problem Solving · Apache Spark
Jan 2024
This project required a robust data pipeline capable of ingesting employee data in CSV format. To build it, I analyzed the data, implemented the necessary transformations, and enabled the extraction of insights from the processed data.
Objectives:
Impact:
Skills: Data Engineering · Problem Solving
Dec 2023
This project tasked me with providing analysts with usable data. My approach was to move data from external sources into various databases, transfer data between different types of databases, and execute basic queries across various databases.
Objectives:
Impact:
Skills: MongoDB · Data Engineering · IBM Cloudant · Problem Solving · Cassandra