Hayden727/README.md

Hi, I'm Chenchen Hong 👋

AI Infrastructure · MLSys · Compilers

I build and optimize the systems that make large models fast, from compiler-level kernel work up to distributed inference serving.


🚀 What I work on

  • Multimodal & LLM Inference Infrastructure (main focus): performance engineering for multimodal serving on SGLang-omni, alongside LLM serving stacks (SGLang, vLLM), covering model integration, scheduling, memory efficiency, and throughput/latency optimization.
  • RL Infrastructure: systems and tooling for reinforcement learning workloads, including training/inference orchestration, rollout, and scaling.
  • Kernel Compiler Optimization: compiler-driven kernel optimization for ML workloads, spanning codegen, graph-level transformations, and automatic kernel generation/tuning (Triton, CUDA) on NVIDIA Hopper (H100) and Blackwell (B200).

🛠️ Tech & Tools

Python C++ CUDA Triton PyTorch



Pinned

  1. sgl-project/sglang-omni Public

    SGLang Omni: High-Performance Multi-Stage Pipeline Framework for Omni Models

    Python · 567 stars · 233 forks

  2. sgl-project/sglang Public

    SGLang is a high-performance serving framework for large language models and multimodal models.

    Python · 29.9k stars · 6.9k forks

  3. sgl-project/SpecForge Public

    Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

    Python · 954 stars · 277 forks

  4. NVIDIA-NeMo/Automodel Public

    🚀 PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support

    Python · 657 stars · 195 forks

  5. ctorch Public

    C++ · 1 star · 1 fork

  6. Hayden727.github.io Public

    CSS