Shahid Ul Islam Khan (z9664)

Most teams stop at accuracy. I ask: why did the model decide that — and can you prove it?

I'm a Machine Learning Engineer specializing in Explainable AI (XAI) and clinical ML systems. My work sits at the boundary between statistical rigor and software engineering — building pipelines that are not only high-performing but accountable.

My research uncovered what I call the Explainability Paradox: visually convincing saliency maps that fail causal validity tests. That finding is now under peer review.

TrustLens — Open-Source ML Reliability Framework

Most evaluation stops at accuracy_score. TrustLens goes deeper.

A single analyze() call surfaces calibration drift, subgroup bias, failure patterns, and representation quality — the things that matter in production, but don't appear on leaderboards.

from trustlens import analyze

report = analyze(model, X_val, y_val, y_prob=proba)
# → Calibration · Bias · Failure Modes · Representation
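For a sense of what a calibration check involves, here is a minimal sketch of binary expected calibration error (ECE) in plain Python. It is not TrustLens's implementation: the function name `expected_calibration_error`, the equal-width binning, and the 10-bin default are all illustrative assumptions.

```python
def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binary ECE: weighted mean gap between predicted probability and
    observed positive rate, computed over equal-width probability bins.

    Hypothetical helper for illustration; not part of the trustlens API.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")

    bins = [[] for _ in range(n_bins)]
    for label, p in zip(y_true, y_prob):
        # Clamp the index so that p == 1.0 lands in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((label, p))

    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(p for _, p in members) / len(members)
        frac_pos = sum(label for label, _ in members) / len(members)
        ece += (len(members) / n) * abs(avg_conf - frac_pos)
    return ece
```

A model can score well on accuracy and still have a large ECE, which is why a calibration check is reported separately from accuracy_score.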

Live on PyPI · Built with production CI/CD (multi-Python testing, Ruff, MyPy) · Active contributor community

→ Full writeup | PyPI package | Repository

Research

Paper Under Review

Quantitative Faithfulness Benchmarking of CNNs vs. Vision Transformers: Implications for Clinical Trustworthiness

I trained three models (VGG16, ViT-B/16, and a custom CNN), ran GradCAM++ and EigenCAM on a chest X-ray dataset, and found something counterintuitive: visually plausible heatmaps lacked causal validity. A six-dimensional benchmark, together with Pixel Deletion (AOPC/AUC), showed that patch-based Transformer attention was causally faithful where the CNNs' saliency was not, even though the CNN heatmaps looked more "correct" to the human eye. I call this the Explainability Paradox.

Metrics used: Sparsity · Entropy · Inter-Method Agreement · AOPC/AUC · Bonferroni-corrected non-parametric testing
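The pixel-deletion idea can be sketched in a few lines of plain Python. This is an illustrative toy, not the paper's evaluation code: `deletion_aopc`, the flat-list "image", the zero baseline, and the linear `toy_model` are all assumptions made for the example. The variant shown averages the score drop over the first `steps` deletions, most salient pixel first.

```python
def deletion_aopc(model, pixels, saliency, steps, baseline=0.0):
    """Average drop in model score as the most salient pixels are deleted.

    `model` is any callable mapping a list of pixel values to a score.
    A faithful saliency map should produce a large average drop.
    """
    # Rank pixel indices from most to least salient.
    order = sorted(range(len(pixels)), key=lambda i: saliency[i], reverse=True)

    perturbed = list(pixels)  # copy so the caller's image is untouched
    original_score = model(perturbed)

    drops = []
    for i in order[:steps]:
        perturbed[i] = baseline  # "delete" the pixel
        drops.append(original_score - model(perturbed))
    return sum(drops) / len(drops) if drops else 0.0


# Toy setup (hypothetical): the model is a fixed weighted sum of pixels.
WEIGHTS = [0.5, 0.3, 0.2, 0.0]

def toy_model(pixels):
    return sum(w * p for w, p in zip(WEIGHTS, pixels))

image = [1.0, 1.0, 1.0, 1.0]
faithful_map = [0.5, 0.3, 0.2, 0.0]    # matches what the model actually uses
misleading_map = [0.0, 0.2, 0.3, 0.5]  # points at the pixels that matter least

faithful_score = deletion_aopc(toy_model, image, faithful_map, steps=3)
misleading_score = deletion_aopc(toy_model, image, misleading_map, steps=3)
```

In this toy case the faithful map yields the larger average drop, which is the property the benchmark tests: a heatmap can look reasonable and still rank pixels in an order the model does not depend on.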

→ Project writeup | Repository

Open-Source Contribution: Roboflow Supervision

Merged PR #2247 — Add OBB Support to ConfusionMatrix via MetricTarget

Implemented Oriented Bounding Box (OBB) support for ConfusionMatrix, enabling correct IoU computation for rotated detections through MetricTarget.ORIENTED_BOUNDING_BOXES. This aligned ConfusionMatrix with the library's existing metrics architecture and resolved a long-standing gap where OBB inputs were evaluated using axis-aligned IoU. The contribution included validation logic, regression tests, documentation updates, and full backward compatibility with existing workflows.
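To illustrate why axis-aligned IoU is the wrong measure for rotated detections, here is a minimal, dependency-free sketch of IoU between two oriented boxes given as four corner points. It is not the code from the PR and does not use Supervision's API: the function names, the corner-list input format, and the use of Sutherland-Hodgman clipping are assumptions chosen for the example.

```python
def polygon_area(points):
    """Signed shoelace area: positive when vertices run counter-clockwise."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def _side(a, b, p):
    """Cross product: >= 0 when p lies on or to the left of the edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def clip_convex(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` by a convex, CCW `clipper`."""
    output = list(subject)
    for i, a in enumerate(clipper):
        b = clipper[(i + 1) % len(clipper)]
        points, output = output, []
        for j, cur in enumerate(points):
            prev = points[j - 1]
            s_cur, s_prev = _side(a, b, cur), _side(a, b, prev)
            if (s_cur >= 0) != (s_prev >= 0):
                # The segment prev -> cur crosses the clip edge: add the crossing.
                t = s_prev / (s_prev - s_cur)
                output.append((prev[0] + t * (cur[0] - prev[0]),
                               prev[1] + t * (cur[1] - prev[1])))
            if s_cur >= 0:
                output.append(cur)
    return output


def obb_iou(box_a, box_b):
    """IoU of two oriented boxes, each a list of four (x, y) corners in order."""
    # The clipper must be counter-clockwise; reverse it if it is not.
    clipper = list(box_b) if polygon_area(box_b) >= 0 else list(box_b)[::-1]
    inter = abs(polygon_area(clip_convex(list(box_a), clipper)))
    union = abs(polygon_area(box_a)) + abs(polygon_area(box_b)) - inter
    return inter / union if union > 0 else 0.0


# A 2x2 square and the same square rotated 45 degrees about its centre.
r = 2 ** 0.5
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
diamond = [(1, 1 - r), (1 + r, 1), (1, 1 + r), (1 - r, 1)]
rotated_iou = obb_iou(square, diamond)
```

For this pair the oriented IoU is about 0.71, whereas comparing the square with the diamond's axis-aligned bounding box (a square of side 2√2) would give 0.5, so the two measures can disagree on whether a detection clears a matching threshold.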

Mathematical Foundations of Machine Learning

I write long-form, derivation-first articles that explain the mathematics behind machine learning from first principles. Every article emphasizes intuition, rigorous derivations, and practical connections rather than treating algorithms as black boxes.

⭐ Featured Articles

Article	What you'll learn
Probability Theory	Random variables, probability distributions, expectation, Bayes' theorem, and the statistical foundations of machine learning.
Matrix Calculus	Gradients, Jacobians, Hessians, and the calculus required to understand optimization and neural networks.
Gradient Descent	Why optimization works, how gradients are derived, convergence analysis, learning rates, and practical training dynamics.
Convex Optimization	Convex sets, convex functions, duality, KKT conditions, and why convexity matters in machine learning.
Backpropagation	A complete derivation of backpropagation using the chain rule, computation graphs, and gradient flow through deep networks.
Transformers	Self-attention, positional encoding, multi-head attention, encoder-decoder architecture, and the mathematics powering modern LLMs.

Show all articles →

Deployed Systems

System	Stack	Live	Highlight
CardioSense-AI	XGBoost · FastAPI · Docker · Optuna	🟢 Live	90.16% acc · 0.9524 AUC · "Least Effort Path" optimizer for patient intervention
Breast Cancer MLOps Suite	Random Forest · Z-Score Drift · Streamlit	🟢 Live	98.2% acc · Real-time out-of-distribution detection
Respiratory Disease Classifier	VGG16 · ViT-B/16 · GradCAM++ · LIME	Research	99% recall for COVID-19 · Explainability Paradox discovery
Apple Sales Intelligence	Scikit-Learn · SciPy SLSQP · Streamlit	🟢 Live	Constrained optimization for hardware-mix revenue maximization
Patient Safety Guardian	Gemini 2.5 Pro · Google ADK · Streamlit	🟢 Live	Kaggle Agents Intensive · Multi-agent clinical safety net · 100% critical interaction detection
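As a rough illustration of the Z-score drift check named in the table above, the sketch below flags incoming feature values that sit far from a reference distribution. It is a generic example under stated assumptions, not the deployed suite's code: the function name `zscore_drift`, the single-feature input, and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, stdev


def zscore_drift(reference, incoming, threshold=3.0):
    """Return indices of incoming values whose |z| exceeds the threshold.

    `reference` is a sample of one feature from the training data;
    `incoming` holds new values of the same feature seen at inference time.
    """
    mu = mean(reference)
    sigma = stdev(reference)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference feature has zero variance")

    flagged = []
    for i, value in enumerate(incoming):
        z = abs(value - mu) / sigma
        if z > threshold:
            flagged.append(i)
    return flagged


# Hypothetical feature values: the second incoming value is far out of range.
reference_values = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10]
incoming_values = [10.5, 25, 9]
flagged_indices = zscore_drift(reference_values, incoming_values)
```

A per-feature Z-score is cheap enough to run on every request, but it assumes roughly bell-shaped features and checks each one in isolation, so it will miss drift that only shows up in combinations of features.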

Technical Stack

ML / DL          PyTorch · XGBoost · Scikit-Learn · VGG16 · ViT · Optuna
XAI              SHAP · LIME · GradCAM++ · EigenCAM · Pixel Deletion (AOPC/AUC)
MLOps            FastAPI · Docker · GitHub Actions CI/CD · Streamlit · REST APIs
Data Engineering Python · SQL · Pandas · NumPy · PCA · K-Means · Plotly
Drift Detection  Z-Score · Counterfactual Analysis · Synthetic Stress Testing

GitHub Activity

"In God we trust. All others must bring data." — W. Edwards Deming

If your model can't explain itself, it has no business making decisions.
