About Me

Sebastian Patron

I'm a Data Engineer with experience building data pipelines that handle millions of daily events, designing data warehouses from scratch at startups, and creating AI agents to streamline workflows. I spend a lot of time thinking about the future of technology, and I occasionally share those ideas here.

Professional Experience

I've spent the last few years diving deep into the startup world. Most recently, I was the first Data Engineer at a telemedicine startup, where I architected and executed the migration from Fivetran to a federated query-based ETL solution using dbt, AWS RDS, and Redshift. This change resulted in a 90% cost reduction and saved $60,000 in the first year alone. I also optimized our dbt models through strategic refactoring and smart materializations, reducing build times by 75% and making our analytics much more responsive.

Part of my role involved leading the migration from Mode Analytics to Tableau, which significantly improved self-service analytics adoption across the company. I developed several AWS Glue jobs using Python and Spark to flatten JSON data and load it efficiently into Redshift and S3. One of the most rewarding aspects was mentoring junior data analysts in Redshift, dbt, and pipeline optimization techniques.

Before that, I was the second engineer on the data team at a digital therapeutics startup. I helped develop ETL pipelines and set up our data lake and data warehouse using Python, Snowflake, AWS S3, and Glue. A particularly interesting challenge was developing a pipeline to export Firebase data via GCP BigQuery and Cloud Storage into AWS. I also implemented CI/CD workflows using GitHub Actions and Terraform to streamline our data pipeline deployments.

I also got the chance to work at a large organization, DriveTime, where I reduced SQL Server build time by 40% by refactoring stored procedures. I led the migration of the Inventory team's data infrastructure from SQL Server to Snowflake, which included designing and implementing a reusable framework for converting stored procedures. The experience taught me how to develop high-volume pipelines with Snowflake and Kafka that process millions of rows daily.

I started my data engineering career at an e-commerce startup, where, despite my intern title, I was their first technical hire. I created all of their ETL processes from BigCommerce using Python, Flask, and GCP, and built their first data warehouse in PostgreSQL (which worked surprisingly well for their data volume, even though I've since learned that columnar storage is better suited to analytical workloads). I also developed a Ruby on Rails app that used headless BigCommerce, though it's been years since I've touched Rails.

Personal Projects

When I'm not wrangling data for work, I'm analyzing Pokémon Showdown data through my Pokemon-Showdown-Airflow project. This Apache Airflow pipeline ETLs gigabytes of Pokémon Showdown logs into structured datasets using Python, with downstream analysis in both Spark and DuckDB to extract insights from competitive gameplay data. I've also been working on an SP404 WAV converter tool, and I'm currently exploring turning it into an Electron app based on feedback from friends who've found it useful.

Tech Stack

My technical toolkit includes languages like Python, SQL, Scala, and TypeScript. For data engineering, I'm proficient with dbt, Airflow, and Spark. I've worked extensively with cloud platforms including AWS (Glue, Lambda, Redshift, S3, RDS) and GCP (Cloud Storage, BigQuery). My database experience spans Snowflake, Redshift, BigQuery, and PostgreSQL. For visualization, I'm comfortable with Tableau, and I've worked with various APIs and libraries like PySpark. My workflow also involves GitHub, CI/CD pipelines, Docker, and Terraform for infrastructure as code. Recently, I've been experimenting with the OpenAI and Perplexity APIs because, well, who isn't these days? I also like to dabble in iOS development with Swift and backend development with TypeScript and Supabase.

Theme

This site is based on the Mediumish theme for Jekyll, but has been heavily customized.

Find me on GitHub

Check out my projects and contributions on GitHub.

GitHub Profile