An open-source Nepal school textbook catalog and API.
YoBook API collects public educational-book metadata from official and public sources, keeps CEHRD as the primary source, generates real cover images from PDF first pages, and serves everything through a simple Flask API and browser UI.
This project is free to use, copy, modify, distribute, and build on.
Please give credit when you use it by preserving the LICENSE and NOTICE files, and by mentioning:
Powered by YoBook API
The project code is released under the MIT License. Source textbook PDFs, book covers generated from those PDFs, trademarks, and third-party metadata remain owned by their original publishers and providers.
CEHRD Learning Portal is the primary source because it currently gives the cleanest official structure.
Other sources are still useful as secondary enrichment.
| Source | Role | What It Provides |
|---|---|---|
| CEHRD Learning Portal | Primary | Official grade/subject textbook PDFs from learning.cehrd.gov.np |
| CDC Nepal | Secondary | Official CDC publication links and curated textbook records |
| E-Pustakalaya | Secondary | Public digital-library records for Nepal education |
| Internet Archive | Supplementary | Digitized Nepal-related books and documents |
| Open Library | Supplementary | Additional public catalog metadata |
/covers/<file>
/docs
pip install -r requirements.txt
python api.py
Open:
http://127.0.0.1:5000/
Scrape the primary CEHRD source:
python scraper.py --source cehrd
Scrape one grade:
python scraper.py --source cehrd --grade 5
Scrape everything:
python scraper.py
Generate real covers from PDF first pages:
python generate_pdf_covers.py --source cehrd-learning
The generated covers are saved in:
data/covers/
GET /api/books
Useful filters:
| Query | Example |
|---|---|
| source | /api/books?source=cehrd-learning |
| grade | /api/books?grade=10 |
| subject | /api/books?subject=Science |
| q | /api/books?q=mathematics |
| limit | /api/books?limit=20 |
| page | /api/books?page=2 |
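The filters above can be combined in a single request. The following is a minimal client sketch, assuming the server runs at the local address shown earlier and that filters combine as ordinary query parameters. `build_books_url` and `fetch_books` are hypothetical helper names, not part of the project, and because the response shape is not described here the body is returned as parsed JSON without further interpretation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:5000"  # local development address from this README

# Filter names taken from the table above.
ALLOWED_FILTERS = ("source", "grade", "subject", "q", "limit", "page")


def build_books_url(base_url=BASE_URL, **filters):
    """Return a /api/books URL with the given filters as query parameters."""
    unknown = set(filters) - set(ALLOWED_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    # Keep a stable parameter order and skip filters that were not provided.
    params = [(name, filters[name]) for name in ALLOWED_FILTERS
              if filters.get(name) is not None]
    query = urlencode(params)
    return f"{base_url}/api/books" + (f"?{query}" if query else "")


def fetch_books(**filters):
    """Request the filtered list and return the decoded JSON body as-is."""
    # Needs a running server; the response structure is not assumed here.
    with urlopen(build_books_url(**filters), timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    print(build_books_url(source="cehrd-learning", grade=10, limit=20))
```

Building the URL separately from fetching it makes the query easy to inspect or log before any network call is made.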
GET /api/books/<id>
GET /api/sources
GET /api/stats
GET /docs
{
  "id": "cehrd-learning-g1-mathematics-40",
  "title": "Mathematics - Grade 1",
  "author": "Centre for Education and Human Resource Development",
  "grade": 1,
  "subject": "Mathematics",
  "language": "en",
  "country": "np",
  "curriculum": "CDC Nepal",
  "source": "cehrd-learning",
  "sourceUrl": "https://learning.cehrd.gov.np/mod/resource/view.php?id=40",
  "readUrl": "https://learning.cehrd.gov.np/mod/resource/view.php?id=40",
  "pdfUrl": "https://learning.cehrd.gov.np/pluginfile.php/...",
  "coverUrl": "/covers/cehrd-learning-g1-mathematics-40.jpg",
  "category": "Textbook",
  "keywords": ["CEHRD", "CDC", "textbook", "Nepal", "class 1", "Mathematics"]
}
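Note that `coverUrl` in the record above is a server-relative path, so a client has to resolve it against the API's base address before loading the image. A small sketch of that step follows, using a trimmed copy of the sample record. The base address is the local one from this README, and `absolute_cover_url` and `describe` are hypothetical helper names used only for illustration.

```python
from urllib.parse import urljoin

API_BASE = "http://127.0.0.1:5000/"  # assumed local server address

# Trimmed copy of the sample record shown above.
SAMPLE_BOOK = {
    "id": "cehrd-learning-g1-mathematics-40",
    "title": "Mathematics - Grade 1",
    "grade": 1,
    "subject": "Mathematics",
    "source": "cehrd-learning",
    "coverUrl": "/covers/cehrd-learning-g1-mathematics-40.jpg",
}


def absolute_cover_url(book, api_base=API_BASE):
    """Resolve a record's coverUrl against the API base address.

    urljoin leaves an already absolute URL unchanged, so this also works
    if a record ever carries a full cover URL.
    """
    return urljoin(api_base, book["coverUrl"])


def describe(book):
    """Return a one-line label built from fields present in the record."""
    return f'{book["title"]} ({book["subject"]}, grade {book["grade"]})'


if __name__ == "__main__":
    print(describe(SAMPLE_BOOK))
    print(absolute_cover_url(SAMPLE_BOOK))
```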
book-api/
    api.py                    Flask API and UI server
    scraper.py                Source scrapers
    generate_pdf_covers.py    Generates covers from PDF first pages
    requirements.txt          Python dependencies
    Procfile                  Production start command
    openapi.json              API schema
    templates/
        index.html            Browser UI
    data/
        all_books.json        Merged catalog, CEHRD first
        cehrd_learning.json   Primary CEHRD data
        cdc_nepal.json        CDC data
        pustakalaya.json      E-Pustakalaya data
        archive_org.json      Internet Archive data
        open_library.json     Open Library data
        covers/               Generated local book covers
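The layout above describes `all_books.json` as a merged catalog with CEHRD first. The project's actual merge logic is not shown here, so the following is only a sketch of one way such an ordering could be produced. It assumes each per-source file holds a list of records with `id` and `source` fields like the sample record, and that duplicates are dropped by `id`. `merge_catalogs` is a hypothetical name, and every source identifier other than `cehrd-learning` is invented for the example.

```python
def merge_catalogs(catalogs, primary="cehrd-learning"):
    """Merge per-source record lists, primary source first, de-duplicated by id."""
    merged, seen = [], set()
    all_records = [book for records in catalogs for book in records]
    # sorted() is stable: primary records move to the front (key False sorts
    # before True) while every other record keeps its input order.
    for book in sorted(all_records, key=lambda b: b.get("source") != primary):
        if book["id"] not in seen:
            seen.add(book["id"])
            merged.append(book)
    return merged


if __name__ == "__main__":
    # In-memory stand-ins for the per-source JSON files; the non-CEHRD
    # ids and source names are made up for this example.
    open_library = [{"id": "ol-1", "source": "open-library"}]
    cehrd = [
        {"id": "cehrd-learning-g1-mathematics-40", "source": "cehrd-learning"},
        {"id": "cehrd-learning-g1-mathematics-40", "source": "cehrd-learning"},
    ]
    for book in merge_catalogs([open_library, cehrd]):
        print(book["source"], book["id"])
```

Sorting on a boolean key rather than grouping by hand keeps the function short and preserves the relative order of the secondary sources.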
Recommended start command:
gunicorn api:app
Good hosting options:
For a simple public deployment, commit the JSON data and generated covers so the app works immediately after deploy.
If you use this project in an app, website, API, dataset, research project, or redistributed package, please include visible or documented credit:
Powered by YoBook API
Also keep the original LICENSE and NOTICE files with the code or distribution.
YoBook API does not claim ownership of CEHRD, CDC, E-Pustakalaya, Internet Archive, Open Library, or other third-party source content.
The scraper and API code, catalog structure, normalization logic, and documentation are open source. Textbook PDFs and generated PDF-cover images may be subject to the original publishers’ terms.
Contributions are welcome. Good first improvements include:
Please read CONTRIBUTING.md before opening a pull request.