buddhist-uni · Lsh0x · Mar 12, 2026 · Mar 12, 2026
diff --git a/.mcp.json b/.mcp.json
@@ -0,0 +1,11 @@
+{
+  "mcpServers": {
+    "buddhist-uni": {
+      "command": "/Users/lsh/projects/divers/buddhist-uni.github.io/search/.venv/bin/python",
+      "args": ["-m", "search.server.mcp_server"],
+      "env": {
+        "PYTHONPATH": "/Users/lsh/projects/divers/buddhist-uni.github.io"
+      }
+    }
+  }
+}
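The `command` and `PYTHONPATH` values above are absolute paths on one developer's machine. A minimal sketch of how the same config could be generated per checkout is shown below; the helper name `build_mcp_config` is hypothetical and not part of this PR, and the script assumes it is run from the repository root on a POSIX-style virtualenv layout (`.venv/bin/python`).

```python
import json
from pathlib import Path


def build_mcp_config(repo_root: Path) -> dict:
    """Build the .mcp.json payload for a given checkout location.

    Hypothetical helper: it mirrors the structure of the committed file
    but derives both paths from `repo_root` instead of hardcoding them.
    """
    root = repo_root.resolve()
    return {
        "mcpServers": {
            "buddhist-uni": {
                # Interpreter inside the virtualenv created under search/.venv
                "command": str(root / "search" / ".venv" / "bin" / "python"),
                "args": ["-m", "search.server.mcp_server"],
                # The repo root must be importable so `search.*` resolves
                "env": {"PYTHONPATH": str(root)},
            }
        }
    }


if __name__ == "__main__":
    # Assumption: the current working directory is the repository root.
    print(json.dumps(build_mcp_config(Path.cwd()), indent=2))
```

Redirecting the output to `.mcp.json` would give each contributor a config that matches their own checkout.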
diff --git a/search/README.md b/search/README.md
@@ -0,0 +1,165 @@
+# Buddhist University: Vector Search Engine + MCP Server
+
+Semantic search across **4,494 Buddhist resources** (canonical texts, academic articles, AV, courses) using a [Qdrant](https://qdrant.tech/) vector database and the `all-MiniLM-L6-v2` embedding model.
+
+Results are exposed through a **FastAPI REST API** and an **MCP server** that plugs directly into Claude.
+
+---
+
+## Architecture
+
+```
+_content/         ← 4494 markdown files (Jekyll source of truth)
+search/
+├── ingestion/    ← Pipeline: extraction → embeddings → Qdrant
+│   ├── extract.py        reuses website.py + frontmatter
+│   ├── embedder.py       sentence-transformers/all-MiniLM-L6-v2
+│   └── ingest.py         main pipeline (batches of 100, ~25 s)
+├── api/          ← FastAPI REST (port 8001)
+│   ├── main.py           app + CORS + routes
+│   ├── search.py         GET /search, GET /reading-path
+│   ├── courses.py        GET /courses, GET /courses/{id}, GET /teachers/{slug}
+│   └── models.py         Pydantic models
+├── server/       ← MCP Server (5 tools for Claude)
+│   ├── mcp_server.py     FastMCP + tool declarations
+│   └── tools.py          Qdrant functions shared by API + MCP
+├── tests/        ← 56 tests (unit + integration)
+│   ├── test_search.py
+│   ├── test_api.py
+│   └── pytest.ini
+├── docker-compose.yml    Qdrant
+├── requirements.txt
+└── README.md
+.mcp.json         ← MCP config for Claude Code (project root)
+```
+
+---
+
+## Setup (first time)
+
+### 1. Create the Python environment
+
+```bash
+cd buddhist-uni.github.io
+uv venv search/.venv --python 3.12
+uv pip install --python search/.venv/bin/python -r search/requirements.txt
+```
+
+### 2. Start Qdrant
+
+```bash
+cd search && docker-compose up -d
+# Check: curl localhost:6333/healthz
+# Dashboard: http://localhost:6333/dashboard
+```
+
+### 3. Index the 4494 documents (~25 seconds)
+
+```bash
+PYTHONPATH=$(pwd) search/.venv/bin/python -m search.ingestion.ingest
+```
+
+Options:
+```bash
+# Test run on 100 files
+--limit 100
+
+# Rebuild the index from scratch
+--recreate
+
+# Run a test query after ingestion
+--test-query "impermanence nibbana"
+```
+
+---
+
+## Usage
+
+### API REST (FastAPI)
+
+```bash
+PYTHONPATH=$(pwd) search/.venv/bin/uvicorn search.api.main:app --port 8001 --reload
+```
+
+**Interactive docs** → http://localhost:8001/docs
+
+| Endpoint | Example |
+|---|---|
+| `GET /search` | `/search?q=meditation+breath&limit=8` |
+| `GET /search` (filters) | `/search?q=nibbana&category=canon&min_stars=4` |
+| `GET /search` (tags) | `/search?q=compassion&tags=metta&tags=karuna` |
+| `GET /reading-path` | `/reading-path?topic=anatta&level=beginner` |
+| `GET /courses` | `/courses` |
+| `GET /courses/{id}` | `/courses/mn` · `/courses/pali-primer` |
+| `GET /teachers/{slug}` | `/teachers/bodhi` · `/teachers/thanissaro` |
+| `GET /health` | Qdrant status |
+
+**`/search` parameters**:
+- `q`: natural-language query (required)
+- `category`: `articles` `canon` `av` `booklets` `essays` `monographs` `papers` `excerpts` `reference`
+- `tags`: multiple tags, passed by repeating the parameter (e.g. `&tags=metta&tags=meditation`)
+- `course`: course slug (e.g. `mn`, `abhidhamma`)
+- `min_stars`: minimum quality rating, 1–5
+- `limit`: 1–20 (default 8)
+
+**`/reading-path` parameters**:
+- `topic`: subject to explore
+- `level`: `beginner` · `intermediate` · `advanced`
+- `limit`: 1–20 (default 10)
+
+### MCP Server (Claude Code)
+
+The `.mcp.json` file is already configured at the project root. Its `command` and `PYTHONPATH` entries are absolute paths, so adjust them to match your checkout.
+**Open a new Claude Code session in this folder** and the 5 tools become available automatically.
+
+| MCP tool | Description |
+|---|---|
+| `search_dharma(query, tags?, category?, limit?)` | Semantic search |
+| `get_course(course_id)` | Full curriculum for a course |
+| `list_courses()` | Lists the 16 structured courses |
+| `find_by_teacher(teacher_slug)` | Resources by teacher |
+| `get_reading_path(topic, level, limit)` | Guided reading path |
+
+**Available courses**: `an` `buddha` `buddhism` `chinese-primer` `ebts` `ethics` `form` `function` `imagery` `mn` `nibbana` `nibbana-mind-stilled` `pali-new-course` `pali-primer` `philosophy` `tranquility-and-insight`
+
+**Teachers** (examples): `bodhi` `thanissaro` `ajahn-chah` `ajahn-brahm` `analayo` `sujato` `nanavira`
+
+---
+
+## Tests
+
+```bash
+# Unit tests (no Qdrant needed)
+PYTHONPATH=$(pwd) search/.venv/bin/pytest search/tests/ -m "not integration" -v
+
+# Full suite (requires Qdrant)
+PYTHONPATH=$(pwd) search/.venv/bin/pytest search/tests/ -v
+
+# Expected result: 56 passed
+```
+
+---
+
+## Reindexing
+
+If you add new content to `_content/`:
+
+```bash
+# Incremental reindex (upsert, safe to re-run)
+PYTHONPATH=$(pwd) search/.venv/bin/python -m search.ingestion.ingest
+
+# Full reindex (starts from scratch)
+PYTHONPATH=$(pwd) search/.venv/bin/python -m search.ingestion.ingest --recreate
+```
+
+---
+
+## Tech stack
+
+| Component | Technology | Rationale |
+|---|---|---|
+| Vector DB | **Qdrant** (Docker) | Native metadata filters, HNSW, open source |
+| Embeddings | **all-MiniLM-L6-v2** | 384 dims, runs locally, fast (~500 docs/s), free |
+| API | **FastAPI** + uvicorn | Async, autodoc OpenAPI, Pydantic v2 |
+| MCP | **FastMCP** (Anthropic SDK) | stdio transport, compatible with Claude Code |
+| Extraction | **python-frontmatter** | Reuses the existing Jekyll pipeline |
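The `/search` parameters documented in the README (repeated `tags`, optional filters, `limit` between 1 and 20) can be assembled client-side as in the sketch below. This is an illustrative helper, not part of the PR: the function name `build_search_url` is hypothetical, the base URL assumes the local port 8001 from the README, and the range checks simply mirror the documented bounds rather than the server's actual validation.

```python
from urllib.parse import urlencode

# Assumption: the API runs locally on the port given in the README.
BASE_URL = "http://localhost:8001"


def build_search_url(q, tags=(), category=None, course=None,
                     min_stars=None, limit=8):
    """Return a GET /search URL for the documented query parameters."""
    if not q:
        raise ValueError("q is required")
    if not 1 <= limit <= 20:
        raise ValueError("limit must be between 1 and 20")
    if min_stars is not None and not 1 <= min_stars <= 5:
        raise ValueError("min_stars must be between 1 and 5")

    # A list of pairs keeps parameter order and allows repeated `tags` keys.
    params = [("q", q)]
    params += [("tags", tag) for tag in tags]
    if category is not None:
        params.append(("category", category))
    if course is not None:
        params.append(("course", course))
    if min_stars is not None:
        params.append(("min_stars", min_stars))
    params.append(("limit", limit))
    return f"{BASE_URL}/search?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("compassion", tags=["metta", "karuna"]))
    print(build_search_url("nibbana", category="canon", min_stars=4))
```

Passing the parameters as a list of pairs rather than a dict is what lets `tags` appear more than once in the query string.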
diff --git a/search/__init__.py b/search/__init__.py
diff --git a/search/api/__init__.py b/search/api/__init__.py
diff --git a/search/api/courses.py b/search/api/courses.py
@@ -0,0 +1,39 @@
+"""Courses endpoints."""
+
+from fastapi import APIRouter, HTTPException
+from search.api.models import CourseDetail, CourseSummary, SearchResult
+from search.server.tools import get_course as _get_course, list_courses as _list_courses, find_by_teacher as _find_by_teacher
+
+router = APIRouter()
+
+
+@router.get("/courses", response_model=list[CourseSummary], summary="List courses")
+def list_courses():
+    """Return the 16+ available structured courses."""
+    return [CourseSummary(**c) for c in _list_courses()]
+
+
+@router.get("/courses/{course_id}", response_model=CourseDetail, summary="Course detail")
+def get_course(course_id: str):
+    """
+    Return the full curriculum for a course, including all its resources.
+
+    Examples: `mn`, `dn`, `sn`, `an`, `abhidhamma`, `meditation`, `pali-primer`, `bn4`
+    """
+    result = _get_course(course_id)
+    if result is None:
+        raise HTTPException(status_code=404, detail=f"Course '{course_id}' not found")
+    return CourseDetail(**result)
+
+
+@router.get("/teachers/{teacher_slug}", response_model=list[SearchResult], summary="Content by teacher")
+def get_teacher(teacher_slug: str, limit: int = 20):
+    """
+    Return content by a teacher/author, up to `limit` items.
+
+    Examples: `bodhi`, `thanissaro`, `ajahn-chah`, `ajahn-brahm`, `analayo`, `sujato`
+    """
+    raw = _find_by_teacher(teacher_slug, limit=limit)
+    if not raw:
+        raise HTTPException(status_code=404, detail=f"Teacher '{teacher_slug}' not found or has no indexed content")
+    return [SearchResult(**{**r, "score": 1.0, "url": r.get("url", "")}) for r in raw]
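The teacher endpoint attaches a fixed `score` of 1.0 to each raw payload before building the response model. One detail worth checking when doing this: if a payload ever carries its own `score` key, passing `score=` alongside `**payload` raises a `TypeError`, whereas merging into a single dict first does not. The sketch below illustrates the difference with a hypothetical stand-in function (`result_stub`) in place of the real `SearchResult` model; whether the payloads returned by `find_by_teacher` actually include `score` is not shown in this diff.

```python
def to_result_kwargs(payload: dict, fixed_score: float = 1.0) -> dict:
    """Merge a raw payload with a fixed score; keys listed later win."""
    return {**payload, "score": fixed_score, "url": payload.get("url", "")}


def result_stub(score, url, **fields):
    """Hypothetical stand-in for a model constructor such as SearchResult."""
    return {"score": score, "url": url, **fields}


# Illustrative payload (made-up values) that already contains a score.
payload = {"title": "Example", "category": "canon", "score": 0.42}

# Explicit keyword plus ** unpacking fails when the payload has the same key.
try:
    result_stub(score=1.0, url="", **payload)
except TypeError as exc:
    print(f"duplicate keyword: {exc}")

# Merging into one dict first always yields exactly one value per key.
print(result_stub(**to_result_kwargs(payload)))
```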
diff --git a/search/api/main.py b/search/api/main.py
@@ -0,0 +1,65 @@
+"""
+Buddhist University Search API — FastAPI app.
+
+Usage:
+    PYTHONPATH=/path/to/buddhist-uni.github.io \\
+        uvicorn search.api.main:app --port 8001 --reload
+
+Docs:
+    http://localhost:8001/docs
+"""
+
+from fastapi import FastAPI
+from fastapi.middleware.cors import CORSMiddleware
+
+from search.api.search import router as search_router
+from search.api.courses import router as courses_router
+
+app = FastAPI(
+    title="Buddhist University Search API",
+    description=(
+        "Semantic search across 4494+ Buddhist resources: "
+        "canonical texts, academic articles, AV, structured courses."
+    ),
+    version="1.0.0",
+    docs_url="/docs",
+    redoc_url="/redoc",
+)
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_methods=["GET"],
+    allow_headers=["*"],
+)
+
+app.include_router(search_router, tags=["Search"])
+app.include_router(courses_router, tags=["Courses"])
+
+
+@app.get("/", tags=["Health"])
+def root():
+    return {
+        "name": "Buddhist University Search API",
+        "version": "1.0.0",
+        "docs": "/docs",
+        "endpoints": {
+            "search": "GET /search?q=...&tags=...&category=...&limit=8",
+            "reading_path": "GET /reading-path?topic=...&level=beginner",
+            "courses": "GET /courses",
+            "course_detail": "GET /courses/{id}",
+            "teacher": "GET /teachers/{slug}",
+        },
+    }
+
+
+@app.get("/health", tags=["Health"])
+def health():
+    from search.ingestion.qdrant_setup import get_client, COLLECTION_NAME
+    client = get_client()
+    info = client.get_collection(COLLECTION_NAME)
+    return {
+        "status": "ok",
+        "indexed_documents": info.points_count,
+        "collection": COLLECTION_NAME,
+    }
diff --git a/search/api/models.py b/search/api/models.py
@@ -0,0 +1,69 @@
+"""Pydantic models for the Buddhist University Search API."""
+
+from pydantic import BaseModel, Field
+
+
+class SearchResult(BaseModel):
+    score: float
+    title: str
+    category: str
+    tags: list[str] = []
+    authors: list[str] = []
+    course: str | None = None
+    year: int | None = None
+    stars: int | None = None
+    url: str
+    external_url: str | None = None
+    minutes: int | None = None
+    pages: str | None = None
+
+
+class SearchResponse(BaseModel):
+    query: str
+    total: int
+    results: list[SearchResult]
+
+
+class CourseItem(BaseModel):
+    title: str
+    category: str
+    tags: list[str] = []
+    authors: list[str] = []
+    year: int | None = None
+    stars: int | None = None
+    url: str
+    minutes: int | None = None
+    pages: str | None = None
+
+
+class CourseDetail(BaseModel):
+    id: str
+    title: str
+    subtitle: str = ""
+    description: str = ""
+    icon: str = ""
+    next_courses: list[str] = []
+    content_count: int
+    content: list[CourseItem]
+
+
+class CourseSummary(BaseModel):
+    id: str
+    title: str
+    subtitle: str = ""
+    icon: str = ""
+    next_courses: list[str] = []
+
+
+class ReadingPathItem(BaseModel):
+    path_order: int
+    level: str
+    score: float
+    title: str
+    category: str
+    tags: list[str] = []
+    authors: list[str] = []
+    stars: int | None = None
+    url: str
+    minutes: int | None = None
+    pages: str | None = None