ash80
  • Mistral AI
  • London, United Kingdom

Pinned

  1. RLHF_in_notebooks Public

    RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks

    Jupyter Notebook · 249 stars · 31 forks

  2. diffusion-gpt Public

    From baby GPT to diffusion GPT: An annotated implementation of a character-level discrete diffusion model (adapted from Karpathy’s baby GPT).

    Jupyter Notebook · 259 stars · 22 forks

  3. backtracking_gpt Public

    A GPT agent with a Text Interface tool

    Python · 15 stars · 1 fork