Trevor Dalton

About Me

I'm a Senior Data Engineer at M Science, where I now focus on agentic AI, building pipelines that automate large-scale data labeling and classification for analyst consumption. I'm passionate about LLMs and the engineering challenges of putting them into production.

When I'm not coding, I'm lifting weights, playing pickleball, or reading. I'm currently on The Count of Monte Cristo by Alexandre Dumas. Always happy to talk books or anything else.


Experience

Senior Data Engineer
Period: January 2026 - Present
Data Engineer
Period: March 2022 - December 2025
  • Designed an agentic eReceipt tagging pipeline using LangGraph, BM25 retrieval, and OpenAI/Anthropic LLMs to automatically classify thousands of daily transactions against a video game product taxonomy
  • Fine-tuned multiple BERT-based classification models using PyTorch and Hugging Face to automate large-scale data labeling and improve data accuracy, achieving F1 scores above 98%. Deployed the models with MLflow, streamlining operations and cutting manual effort by 50%
  • Derived actionable insights from large-scale video game sales data using SQL and Python, uncovering user behavior trends to inform client product strategy and marketing decisions
  • Constructed PySpark/SQL ETL pipelines for end-to-end processing of large-scale data, from raw inputs to deliverables
  • Optimized AWS EC2 configurations, cutting pipeline costs and runtime by as much as 60%
  • Developed DataOps-controlled pipeline orchestration systems, reducing runtimes by over 20%
Python, Databricks, Apache Spark, PyTorch, Hugging Face, Snowflake, AWS, Apache Airflow, MySQL, Git, LangChain, Anthropic

Data Engineer
Period: May 2021 - March 2022
Data Engineering Intern
Period: May 2020 - May 2021
  • Engineered a multi-source ETL pipeline in JavaScript to connect, aggregate, and analyze data
  • Designed a frontend interface for a master data pipeline to derive insights from enterprise application data
  • Developed an ontology-driven ETL orchestration tool in React, used by 50+ developers
  • Built scalable desktop applications using React, Node.js, TypeScript, and Electron
  • Developed machine learning and graph database models for supply chain and maintenance forecasting to mitigate risk and reduce lifetime costs of advanced weapons systems
  • Delivered insights on large datasets using statistical analysis and Matplotlib visualizations
Python, TypeScript, JavaScript, Node.js, React, Neo4j, Apollo, GraphQL, MongoDB, MySQL, AWS, Git

Education

Master of Information and Data Science

The MIDS program at the Berkeley School of Information is recognized as one of the nation's top-tier data science programs. Its focus on collaborative problem-solving taught me how to form effective teams from a diverse set of individuals.

Graduated In: 2024
  • Natural Language Processing
  • Time Series and Panel Data Analysis
  • Computer Vision
  • Experiments and Causal Inference
  • Machine Learning
  • Data Engineering
  • Statistics
  • Research Design and Analysis
Bachelor of Computer Science

Utah's preeminent research institution, and where I first cut my teeth in software development. Despite the challenging curriculum, I was able to thrive thanks to my peers and professors, who offered ample support.

Graduated In: 2021
  • Algorithms
  • Artificial Intelligence
  • Data Visualization
  • Operating Systems
  • Database Systems
  • Machine Learning
  • NLP
  • Information Systems
  • Computer Systems
Associate of Science

Taking full advantage of the Success Academy Program, I enrolled in concurrent enrollment classes, which allowed me to graduate high school with enough credits to earn my associate degree from Utah Tech University at the age of 18. The supportive community and passionate professors ignited a love for computing that continues today.

Graduated In: 2018
  • Data Structures
  • Algorithms

Projects

Datadrip: AI for Financial Analysts

Datadrip automates earnings analyses for publicly traded companies using state-of-the-art technology. It frees research analysts from manual work by extracting and summarizing critical data instantly. Retail investors gain rapid insights into revenue trends and presentation summaries in a concise format. For financial analysts, Datadrip converts presentations into Excel sheets for seamless modeling integration, enhancing decision-making and maximizing returns effortlessly.

During Datadrip's 16-week development, the team leveraged recent models in computer vision, generative AI, and visual question answering. Datadrip exemplifies the evolving landscape of AI-driven financial analysis tools, highlighting the potential for these technologies to democratize access to critical financial information and elevate data-driven decision-making in the financial sector.

You can view the demo here!

Website
Bank Document Verifier

A collaborative workspace that enables financial institutions to automate their application processes, built with Angular, TypeScript, and SCSS. Banks can automatically verify important documents, such as W-2s and Schedule Cs, and approve applicants. Bank Document Verifier (BDV) supports real-time communication to ensure loans are processed accurately and on time.

The Bank Document Verifier team was awarded second place at the University of Utah School of Computing's Spring 2021 capstone presentations.

Demo
Aye-Aye: Semantic Lexicon Induction

A semi-supervised lexicon induction algorithm! Given just a few seed words, Aye-Aye can learn any semantic category to a high degree of precision.

GitHub Repo
Personal Website

A passion project I have maintained since 2018 to stay in touch with my web development roots and to give people an online portfolio for following my journey.

GitHub Repo
ASCII Image Converter

An application that allows users to submit images and convert them to their ASCII equivalents. Multiprocessing is used to speed up conversion on larger images. The image above is an egg I tested this on.
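
As a rough illustration of the idea, here is a minimal sketch of how row-level conversion might be spread across processes. It assumes the image has already been loaded as a 2D list of 0-255 grayscale values. The character ramp, the function names, and the `processes` parameter are illustrative assumptions and are not taken from the project's repository.

```python
from multiprocessing import Pool

# Characters ordered from darkest to lightest.
# This ramp is an illustrative choice, not the one used by the project.
ASCII_RAMP = "@%#*+=-:. "


def row_to_ascii(row):
    """Map one row of 0-255 grayscale values to a string of ramp characters."""
    last = len(ASCII_RAMP) - 1
    return "".join(ASCII_RAMP[value * last // 255] for value in row)


def image_to_ascii(pixels, processes=1):
    """Convert a 2D list of grayscale values to ASCII art, one line per row.

    With processes > 1, rows are distributed across worker processes.
    """
    if processes <= 1:
        lines = [row_to_ascii(row) for row in pixels]
    else:
        # Each row is independent, so rows can be converted in parallel.
        with Pool(processes) as pool:
            lines = pool.map(row_to_ascii, pixels)
    return "\n".join(lines)
```

Calling `image_to_ascii(pixels, processes=4)` would split the rows across four workers. On platforms that spawn rather than fork new processes, that call should sit under an `if __name__ == "__main__":` guard. For small images the overhead of starting processes likely outweighs the gain, which fits the note above that multiprocessing helps on larger images.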

GitHub Repo