Pennsylvania | Statistician | Research Software Engineering
I work at the intersection of statistical methodology and research software engineering. My focus is on making late-stage clinical trials efficient, reproducible, and ready for regulatory review.
- ggsci - Scientific color palettes for ggplot2
- py-ggsci - Scientific color palettes for plotnine
- pkglite - Compact R package representations for regulatory submissions
- py-pkglite - Pack and restore source packages as text files
- rtflite - Lightweight RTF composer for Python
- pkgdown.offline - Build pkgdown websites without an internet connection
- revdeprun - Rust CLI for R package reverse dependency checks
- tinytopics - GPU-accelerated topic modeling via neural Poisson NMF
- tinyvdiff - Minimalist visual regression testing plugin for pytest
- pytest-r-snapshot - Snapshot testing against R reference outputs
- msaenet - Multi-step adaptive elastic-net for sparse regressions
- stackgbm - Model stacking for xgboost, lightgbm, and catboost
- oneclust - Maximum homogeneity clustering for univariate data
- protr - Protein sequence feature extraction in R
- liftr - Containerize R Markdown documents for reproducibility
- vscode-textmate-rstheme - VS Code theme inspired by RStudio IDE default theme
- vscode-markdown-stupefy - Convert smart punctuation to ASCII in VS Code
- DM Mono Ligaturized - DM Mono font with Fira Code ligatures
- r-base-shortcuts - Lesser-known base R idioms for concise and fast code
- awesome-shiny-extensions - Curated list of Shiny UI/server components
- Introducing pytest-r-snapshot: Verifying Python code against R outputs at scale
- Reverse dependency check speedrun: a data.table case study
- ggsci 4.0.0: 400+ new color palettes
- Introducing py-ggsci: ggsci color palettes for plotnine in Python
- tinytopics: GPU-accelerated topic modeling via constrained neural Poisson NMF
- Group sequential trials in industry: a 30-year perspective
- eCTD submission with analysis using R
- Training tissue-specific gene embeddings on GTEx data
- Statistics in Biopharmaceutical Research Best Paper Award 2025
- John M. Chambers Statistical Software Award, American Statistical Association
- Second place @ PrecisionFDA Brain Cancer Predictive Modeling and Biomarker Discovery Challenge
- First place @ PrecisionFDA BioCompute Object App-a-thon (Advanced Track)
- Published author and speaker (papers, books, talks)
- 2 million+ annual downloads across projects
- 3.5K+ GitHub stars across projects






