alyayman2020/README.md

Data Scientist · AI Engineer

📍 Cairo, Egypt  |  🌐 Arabic / English / Deutsch



🧠 About Me

I'm a data scientist with a background in pharmacy and data analytics — two disciplines that trained me to think in systems, prioritize precision, and work with high-stakes data.

I hold an Academic Diploma in Data Science from Cairo University (GPA 3.78 / A) and am currently enrolled in the ITI Data Science Track — a competitive 9-month scholarship program covering ML, Deep Learning, LLMs, Big Data, MLOps, and Agentic AI.

I've built end-to-end ML pipelines that reach 92.6% recall on clinical classification tasks, automated complex workflows with Python AI agents that cut processing time by 87%, and analyzed 10,000+ orders across 80+ countries to inform executive strategy.

My focus: ML · LLMs · Arabic NLP · Healthcare AI · Agentic AI — where the most impactful and underserved problems live.

💡 From prescriptions to predictions — I follow the data.


🛠️ Tech Stack

🔤 Languages

🤖 Machine Learning & Deep Learning

Classification · Regression · Clustering · Time Series (GARCH) · NLP · Computer Vision · Ensemble Methods · SHAP Explainability

🧠 LLMs & Generative AI

Transformers · RAG Systems · AI Agents · Human-in-the-Loop · Generative AI · Prompt Engineering

📦 Data Engineering & Databases

Data Warehousing · Star Schema · OLAP · ETL Pipelines · ER Modeling

☁️ Cloud, MLOps & DevOps

MLOps Fundamentals · Model Deployment · REST API Serving · Version Control · CI/CD Basics · Agile/Scrum

📊 BI & Visualization


📊 Key Achievements

| Metric | Context |
| --- | --- |
| 🎯 92.6% Recall | Clinical ML pipeline with a 75% reduction in missed diabetes diagnoses |
| 87% Time Reduction | Python AI agent automating a 6-hour manual workflow → 30 min |
| 📦 10,000+ Orders Analyzed | BI analysis spanning 4 years and 80+ countries |
| 🌍 8 Applied DS Projects | WorldQuant University Lab: real estate, NLP, time series, A/B testing |
| 🏛️ ~10% Acceptance Rate | ITI Data Science Track, a competitive government scholarship |
| 🧠 GPA 3.78 / A | Academic Diploma in Data Science, Cairo University |

🔬 Areas of Interest

| Domain | Focus |
| --- | --- |
| 🤖 Machine Learning | End-to-end pipelines · Classification & Regression · Ensemble Methods · Model Explainability (SHAP) |
| 🧠 LLMs & Agentic AI | LLM engineering · AI Agents · RAG systems · Agentic workflows · Prompt engineering |
| 🌍 Arabic NLP | Arabic language processing · Text classification · Multilingual models · Arabic content AI |
| 🏥 Healthcare AI | Clinical ML · Disease classification · Pharmacogenomics · Healthcare informatics |

🚀 Featured Projects

🛒 E-Commerce DWH & Recommendation Engine

Star Schema data warehouse with 15,000 fact rows, 21 advanced window-function SQL queries, 7 KPI dashboards, and a hybrid SQL + ML product recommendation system (Apriori + cosine similarity, 400 recommendations across 100 products).

🗄️ 1 Fact + 6 Dims  ·  📊 21 SQL Queries  ·  🤖 55% SQL + 45% ML Fusion
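
As a rough illustration of what a 55% / 45% score fusion could look like, here is a minimal Python sketch. It assumes the fusion is a simple weighted sum of a rule-based score (such as an Apriori rule confidence computed on the SQL side) and a cosine similarity between product feature vectors; the README does not spell out the actual formula. The function names, the toy products, and all the numbers other than the two weights are hypothetical.

```python
import math

SQL_WEIGHT, ML_WEIGHT = 0.55, 0.45  # weights quoted in the project summary


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(product, rule_scores, vectors, top_k=2):
    """Rank other products by a weighted blend of rule score and cosine similarity."""
    blended = {}
    for other, vec in vectors.items():
        if other == product:
            continue
        # Assumption: a missing association rule contributes a score of 0.
        sql_part = rule_scores.get((product, other), 0.0)
        ml_part = cosine(vectors[product], vec)
        blended[other] = SQL_WEIGHT * sql_part + ML_WEIGHT * ml_part
    return sorted(blended, key=blended.get, reverse=True)[:top_k]


# Hypothetical toy inputs, not taken from the project's data.
rule_scores = {("laptop", "mouse"): 0.8, ("laptop", "bag"): 0.4}
vectors = {
    "laptop": [1.0, 0.0],
    "mouse": [0.0, 1.0],
    "bag": [1.0, 1.0],
    "tablet": [1.0, 0.0],
}
top = recommend("laptop", rule_scores, vectors)
print(top)
```

In this toy case the bag ranks first because it scores reasonably on both signals, while the mouse (strong rule, no feature similarity) and the tablet (no rule, identical features) each lean on only one.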

🩺 Diabetes Mellitus ML Classification

Cairo University capstone project. Comparative study of 5 ML algorithms on clinical data using a CRISP-DM pipeline with KNN imputation, F2-optimized threshold tuning, and SHAP explainability. Achieved 92.6% recall — 75% fewer missed diagnoses.

🎯 92.6% Recall  ·  📉 75% fewer missed diagnoses  ·  🏅 Grade A
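
To make the F2-optimized threshold tuning concrete, here is a minimal pure-Python sketch. It assumes a binary classifier that outputs probabilities and picks the cutoff that maximizes the F-beta score with beta = 2, which weights recall above precision. The helper names and the toy labels and probabilities are made up for illustration and are not the capstone's code or data.

```python
def fbeta(y_true, y_pred, beta=2.0):
    """F-beta score for binary labels; beta > 1 weights recall above precision."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def best_threshold(y_true, y_prob, beta=2.0):
    """Scan each observed probability as a cutoff; return (threshold, score)."""
    best_t, best_s = 0.5, -1.0
    for t in sorted(set(y_prob)):
        preds = [1 if p >= t else 0 for p in y_prob]
        s = fbeta(y_true, preds, beta)
        if s > best_s:  # strict comparison keeps the lowest cutoff on ties
            best_t, best_s = t, s
    return best_t, best_s


# Toy, made-up data for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.30, 0.35, 0.40, 0.55, 0.60, 0.65, 0.90]
threshold, score = best_threshold(y_true, y_prob)
print(threshold, round(score, 4))
```

On this toy data the chosen cutoff falls below the default 0.5: accepting two extra false positives is worth it under F2 because no positive case is missed. In a real pipeline the scan would normally run on a validation split rather than on the test set.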

🏭 Supplier Quality Analysis

End-to-end supply chain quality audit across suppliers, products, and manufacturing plants. Identified high-defect categories (Mechanical & Packaging), flagged the Detroit plant as highest-risk, and produced an interactive Tableau dashboard for executive-level exploration.

🔍 EDA + Statistical Analysis  ·  📊 Interactive Tableau Dashboard  ·  📄 Executive Report

🗄️ Bash DBMS

A fully functional, file-based Database Management System built entirely in Bash with a Zenity GUI. Supports full CRUD operations, 7 SELECT modes, primary key enforcement, datatype validation, and a clean 4-module architecture.

🛠️ 4-module architecture  ·  🔍 7 SELECT modes  ·  ✅ Full PK validation
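
The project itself is written in Bash; purely to illustrate the idea of primary key enforcement and datatype validation on a file-based table, here is a small Python sketch. It assumes one row per line with a `:` field separator and the primary key in the first column. The storage format, the `insert_row` helper, and the sample rows are all hypothetical and may differ from the real implementation.

```python
import os
import re
import tempfile


def insert_row(table_path, row, types, pk_index=0, sep=":"):
    """Append a row to a delimited text file after type and primary-key checks."""
    if len(row) != len(types):
        raise ValueError("column count mismatch")
    for value, kind in zip(row, types):
        if kind == "int" and not re.fullmatch(r"-?\d+", value):
            raise ValueError(f"{value!r} is not an int")
        if sep in value:
            raise ValueError("value contains the field separator")
    # Primary-key check: scan existing rows for the same key value.
    if os.path.exists(table_path):
        with open(table_path) as fh:
            for line in fh:
                fields = line.rstrip("\n").split(sep)
                if len(fields) > pk_index and fields[pk_index] == row[pk_index]:
                    raise ValueError(f"duplicate primary key {row[pk_index]!r}")
    with open(table_path, "a") as fh:
        fh.write(sep.join(row) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "users.tbl")  # hypothetical table file
    insert_row(path, ["1", "Ada"], ["int", "str"])
    try:
        insert_row(path, ["1", "Bob"], ["int", "str"])
        duplicate_rejected = False
    except ValueError:
        duplicate_rejected = True
    with open(path) as fh:
        stored = fh.read()
print(duplicate_rejected, repr(stored))
```

The second insert is rejected and the file still holds only the first row, which is the behaviour primary key enforcement is meant to guarantee.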


🎓 Education

| Institution | Credential |
| --- | --- |
| 🏛️ ITI (Information Technology Institute) | Data Science Track · 9-Month Scholarship · ~10% Acceptance Rate · 2025–2026 |
| 🎓 Cairo University, FGSSR | Academic Diploma in Data Science · GPA 3.78 · Grade A |
| 💊 Cairo University, Faculty of Pharmacy | Bachelor of Pharmacy · Graduation Research: Pharmacogenomics × Deep Learning |

🏅 Certifications

| Certification | Issuer |
| --- | --- |
| 🌍 Applied Data Science Lab | WorldQuant University |
| 📊 Data Analyst Specialist | Digital Egypt Pioneers Initiative (DEPI) |
| 🤖 Generative AI with AWS | Udacity |
| ✍️ 1 Million Prompters | Dubai Future Foundation |
| 🏢 McKinsey Forward Program | McKinsey & Company |
| 🧪 Clinical Research Principles | National Institutes of Health (NIH) |
| 🌱 ALX Data Science Program | ALX Africa · NLP · Regression · Unsupervised Learning |

🚀 Open to Data Science & ML Engineering roles — Egypt · MENA · Remote

Popular repositories

  1. mini-GPT Public

    Python 1

  2. supplier-quality-analysis-report Public

    Supplier Quality Analysis Project: Evaluates supplier and plant performance to optimize supply chain efficiency. Includes data cleaning, EDA, and a Tableau dashboard. Identifies trends, top supplie…

    Jupyter Notebook

  3. ecommerce-dwh-analytics-recommendation-engine Public

    From Raw Transactions to Smart Recommendations: An E-Commerce Analytical Data Warehouse

    Jupyter Notebook 1

  4. alyayman2020 Public

  5. diabetes-mellitus-ml-classification Public

    Comparative ML study on diabetes prediction — 92.6% Recall, 75% reduction in missed diagnoses | CRISP-DM · KNN Imputation · SHAP · Threshold Optimization | Cairo University Capstone (Grade A)

    Jupyter Notebook

  6. bash-dbms Public

    File-based DBMS built in Bash — GUI via Zenity | Create/Drop databases & tables, full CRUD, 7 SELECT modes, PK enforcement | ITI Data Science Track

    Shell