Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)

Python 1,947 139 Updated Jan 24, 2026

Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718

Python 370 30 Updated Sep 25, 2024

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,390 4,775 Updated Jun 2, 2025

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation me…

Python 1,473 144 Updated Mar 7, 2025

Ongoing research training transformer models at scale

Python 15,088 3,550 Updated Feb 1, 2026

Secrets of RLHF in Large Language Models Part I: PPO

Python 1,417 105 Updated Mar 3, 2024

Instruct-tune LLaMA on consumer hardware

Jupyter Notebook 18,979 2,216 Updated Jul 29, 2024

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 30,265 4,017 Updated Jul 17, 2024

ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and it is open source.

Python 9,513 701 Updated Jan 24, 2026

We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tuning) together for easy use. We welcome open-source enthusiasts…

Jupyter Notebook 2,800 253 Updated Dec 12, 2023

Paragraph-by-paragraph close readings of classic and new deep learning papers

32,505 2,774 Updated Mar 22, 2025

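Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow by using ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.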

Python 19,236 1,951 Updated Nov 19, 2025

Awesome-LLM: a curated list of Large Language Models

26,151 2,282 Updated Jul 31, 2025

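A collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at lower cost, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.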

22,178 2,103 Updated May 19, 2025

An introductory LLM tutorial for developers: the Chinese edition of Andrew Ng's large-model course series

Jupyter Notebook 23,159 2,817 Updated Jun 12, 2025

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pre-trained models, large models, multimodal models, and large language models

Python 5,514 512 Updated Dec 14, 2025

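Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins, project analysis and self-explanation for Python, C++, and other codebases, PDF/LaTeX paper translation and summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…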

Python 70,057 8,406 Updated Jan 25, 2026

Inference code for Llama models

Python 59,098 9,824 Updated Jan 26, 2025

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Python 41,219 5,210 Updated Jun 27, 2024

AI PDF chatbot agent built with LangChain & LangGraph

TypeScript 16,337 3,233 Updated Feb 20, 2025

A roundup of Prompt & LLM papers, open-source data and models, and AIGC applications

3,343 321 Updated Jan 19, 2026

[Open-Source Project] Combining MMOCR with Segment Anything & Stable Diffusion. Automatically detect, recognize and segment text instances, with several downstream tasks, e.g., Text Removal and Text…

Python 580 41 Updated Jan 30, 2024

This repository contains datasets and baselines for benchmarking Chinese text recognition.

Python 502 51 Updated Dec 2, 2022

High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle

Python 3,637 690 Updated Jan 31, 2026

PaddleFormers is an easy-to-use library of pre-trained large language model zoo based on PaddlePaddle.

Python 12,950 2,168 Updated Feb 1, 2026

Open source Python library for converting PDF to DOCX.

Python 3,287 471 Updated May 28, 2025

OCR pre-processing Toolbox

C++ 18 3 Updated Nov 29, 2022

A simple OCR preprocessing tool using Python with a GUI.

Python 33 5 Updated Dec 21, 2022

PaddleOCR AutoHotkey Version.

AutoHotkey 161 21 Updated Sep 9, 2025