Drew Harris ajharris

About

I build data science and machine learning systems that move cleanly from raw data to defensible insight, with an emphasis on well-motivated problems, reproducible pipelines, and model interpretability.

My work sits at the intersection of applied ML, scientific computing, and software engineering, often using public or operational data to prototype end-to-end analyses that could realistically run in production.

Current Focus & Active Projects

Distributed-ML
A distributed, dataset-agnostic CT preprocessing pipeline using Dask, designed for large clinical imaging datasets and downstream ML workflows.
publicdata_ca
A reusable data acquisition and normalization framework for Canadian public datasets (StatCan, CMHC, CIHI), supporting rapid ML case studies such as housing affordability indices and hospital utilization analysis.
Applied ML Case Studies
Short, tightly scoped projects demonstrating:
- Unsupervised learning (Isolation Forests, autoencoders)
- Feature engineering from messy public datasets
- Evaluation under limited or noisy ground truth
- Clear motivation and decision-oriented outputs
YesChef GPT
An AI-powered system that structures generative outputs into machine-readable components (ingredients, preparation steps, pickup notes), emphasizing controllability and downstream usability over novelty.
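
As an illustration of structuring generative output into machine-readable components, here is a minimal sketch that validates a model's JSON response against a small schema. It is not the YesChef GPT implementation: the `Recipe` class, its field names, the `parse_recipe` helper, and the sample text are all hypothetical, chosen only to mirror the components named above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    """Hypothetical schema; the field names are illustrative assumptions."""
    ingredients: list[str]
    preparation_steps: list[str]
    pickup_notes: str


def parse_recipe(raw: str) -> Recipe:
    """Validate raw model output (expected to be JSON) into a Recipe.

    Raises ValueError if the text is not JSON or a field is missing or mistyped.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")

    def string_list(key: str) -> list[str]:
        # Reject missing keys and anything that is not a list of strings.
        value = data.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"'{key}' must be a list of strings")
        return value

    # Pickup notes are treated as optional here (an assumption of this sketch).
    notes = data.get("pickup_notes", "")
    if not isinstance(notes, str):
        raise ValueError("'pickup_notes' must be a string")

    return Recipe(
        ingredients=string_list("ingredients"),
        preparation_steps=string_list("preparation_steps"),
        pickup_notes=notes,
    )


# Made-up sample standing in for a model response.
sample_output = (
    '{"ingredients": ["2 eggs", "1 cup flour"], '
    '"preparation_steps": ["Whisk the eggs.", "Fold in the flour."], '
    '"pickup_notes": "Ready after 5 pm."}'
)
recipe = parse_recipe(sample_output)
print(recipe.ingredients)
```

Rejecting malformed output at the boundary, rather than repairing it silently, is one way to keep downstream consumers working with a predictable shape.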

Background in Scientific Computing

C++ medical image registration using ITK (Insight Toolkit)
MATLAB pipelines using Marching Cubes for carotid artery tracing in CT angiography
Control systems for LED solar simulators supporting photovoltaic research
Formal training in medical physics, with strong grounding in measurement, uncertainty, and validation

Currently Exploring

Anomaly detection in healthcare operations
Early detection of unusual demand or utilization patterns using unsupervised and semi-supervised methods.
Public-sector ML pipelines
Designing reusable ingestion and feature pipelines that make public data viable for rapid experimentation.
Evaluation without labels
Practical techniques for validating unsupervised models when ground truth is incomplete or unavailable.
Bridging notebooks to systems
Turning exploratory analyses into maintainable, testable services without losing scientific intent.
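
Two of the themes above, unsupervised anomaly detection and evaluation without labels, can be sketched together in a few lines. The example below is an assumption-laden illustration, not code from these projects: it swaps in a simple median/MAD robust z-score detector (rather than an Isolation Forest or autoencoder), runs on synthetic data, and estimates recall by planting known spikes. The function names, the 3.5 threshold, and the `daily_visits` series are all hypothetical.

```python
import random
import statistics


def robust_z_scores(values):
    """Score each value by its distance from the median in scaled-MAD units."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        # No spread to measure against, so nothing can be called unusual.
        return [0.0 for _ in values]
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    return [(v - median) / (1.4826 * mad) for v in values]


def flag_anomalies(values, threshold=3.5):
    """Return the indices whose robust z-score exceeds the threshold."""
    return {i for i, z in enumerate(robust_z_scores(values)) if abs(z) > threshold}


def injected_recall(values, n_injected=5, scale=3.0, seed=0):
    """Estimate recall by planting known spikes, since real labels are absent."""
    rng = random.Random(seed)
    planted = set(rng.sample(range(len(values)), n_injected))
    perturbed = [v * scale if i in planted else v for i, v in enumerate(values)]
    flagged = flag_anomalies(perturbed)
    return len(planted & flagged) / n_injected


# Synthetic stand-in for a daily utilization series (not real data).
rng = random.Random(42)
daily_visits = [rng.gauss(200, 10) for _ in range(365)]
recall = injected_recall(daily_visits)
print(f"recall on injected spikes: {recall:.2f}")
```

Injected-anomaly recall only measures sensitivity to the kind of spike that was planted, so it complements rather than replaces review of real flagged cases.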

Perspective

I approach data science as an engineering discipline: start with a clear question, respect the data’s limitations, and build models that can be explained, tested, and trusted.

My goal is to work on problems where statistical thinking, ML techniques, and real-world constraints all matter, especially in healthcare, infrastructure, and public-data contexts.
