Welcome to My Portfolio

Josh Capistrano

Passionate about building scalable data solutions and transforming complex business requirements into elegant technical implementations. Experienced in AWS, Python, Spark, and modern data engineering practices.

About Me

💼

Professional Experience

6+ years of experience in data engineering and software development at leading companies like Bloomberg and Fannie Mae.

🚀

Technical Expertise

Specialized in AWS services, Python, Spark, and building scalable ETL pipelines that process massive datasets efficiently.

🎯

Problem Solver

Proven track record of translating complex business requirements into technical solutions that drive real value.

6+
Years Experience
10M+
Records Processed Daily
4
Certifications
85%
Error Reduction Achieved

Technical Skills

Core Technologies

Python (Pandas, Boto3, PySpark) 95%
AWS (Lambda, S3, EMR, Redshift) 90%
Apache Spark 85%

Tools & Methodologies

SQL & Database Management 85%
Git & Version Control 90%
Agile Development 80%

Professional Experience

May 2023 - Present

Data Engineer

Bloomberg L.P., New York, NY
Designed and implemented Spark-based ETL pipelines that deliver timely, accurate data to clients on the Bloomberg Terminal
Developed and maintained Lambda-style microservices that are integral to the ETL pipelines, handling preprocessing tasks and improving data quality
Actively participated as a representative of the technology team in the department's knowledge council, contributing to the creation of comprehensive documentation on the tech stack used in projects
Collaborated closely with team representatives to understand and address business needs, ensuring that technical solutions aligned with client requirements and expectations
Played a key role in bridging the gap between technology and business by translating business requirements into technical specifications and delivering solutions that meet client needs effectively
Designed and implemented post-processing microservices tailored to align with specific business requirements, ensuring data quality and integrity before delivery to clients
Collaborated with stakeholders to understand business objectives and translated them into technical requirements for post-processing microservices, ensuring alignment with client needs
January 2020 - May 2023

Software Engineer

Fannie Mae, Reston, VA
Designed and developed scalable AWS infrastructure to store and process large datasets using PySpark, S3, EMR, and Redshift
Implemented ETL pipelines using Amazon Web Services and Python to extract data from various sources, perform data transformations, and load data into the data warehouse
Utilized AWS EC2 instances to create and manage EMR clusters for data ingestion, ensuring optimal performance and resource utilization
Implemented Spark jobs on AWS EMR to process large datasets, improving data ingestion efficiency and reducing processing time
Developed custom scripts to automate data ingestion, data quality checks, and data reconciliation processes
Utilized AWS Glue for scheduling and running ETL jobs, reducing manual intervention and increasing reliability
Collaborated with cross-functional teams to gather requirements, design solutions, and implement end-to-end data pipelines
Monitored performance and optimized AWS resource usage to reduce costs and increase efficiency
Automated deployment of Lambda functions using AWS CloudFormation
Designed, developed, and deployed custom software applications in Python and Java

Featured Projects

Real-Time Data Pipeline

Built a scalable ETL pipeline processing 10M+ records daily using AWS EMR, Lambda, and Redshift with automated data quality checks.

Python, PySpark, AWS, Redshift

Data Quality Framework

Developed a comprehensive data validation framework that reduced data errors by 85% using custom Python scripts and automated monitoring.

Python, Pandas, CloudWatch, SNS
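
For illustration only, here is a minimal sketch of the kind of rule-based check a validation framework like this might run. It is not the production code: it uses plain Python rather than Pandas so it runs without dependencies, and the `Rule` class, the `validate` function, and the `record_id` and `amount` fields are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# A record is modeled as a simple dict; the field names used below are hypothetical.
Record = dict[str, object]


@dataclass(frozen=True)
class Rule:
    """A named check that returns True when a record passes."""
    name: str
    check: Callable[[Record], bool]


def validate(records: Iterable[Record], rules: list[Rule]) -> dict[str, list[int]]:
    """Return, for each rule name, the indexes of the records that fail it."""
    failures: dict[str, list[int]] = {rule.name: [] for rule in rules}
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                failures[rule.name].append(index)
    return failures


def _amount_is_non_negative(record: Record) -> bool:
    amount = record.get("amount")
    return isinstance(amount, (int, float)) and amount >= 0


# Hypothetical rules: an identifier must be present and an amount must be >= 0.
RULES = [
    Rule("record_id_present", lambda r: bool(r.get("record_id"))),
    Rule("amount_non_negative", _amount_is_non_negative),
]

if __name__ == "__main__":
    sample = [
        {"record_id": "A1", "amount": 100.0},
        {"record_id": "", "amount": 5},
        {"record_id": "B2", "amount": -1},
    ]
    print(validate(sample, RULES))
```

Keeping each rule as a small named function makes it straightforward to report failures per rule, which is the sort of output an automated monitor could alert on.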

Microservices Architecture

Designed and implemented Lambda-based microservices for data preprocessing, improving processing speed by 60% and reducing costs by 40%.

AWS Lambda, API Gateway, S3, CloudFormation

Certifications

AWS Certified Solutions Architect

Amazon Web Services

2023

AWS Certified Developer

Amazon Web Services

2022

Apache Spark Certification

Databricks

2022

Python for Data Science

DataCamp

2021

Education

Rutgers University, School of Engineering

Bachelor of Science in Electrical and Computer Engineering

New Brunswick, NJ | December 2019

Concentration: Computer Engineering
Honors: Magna Cum Laude

Get In Touch

Let's Connect!

I'm always interested in hearing about new opportunities, collaborations, or just chatting about data engineering and technology.

Feel free to reach out through the form or connect with me on social media.