Building Lumeon: an AI web-app for textbooks
In this post, we discuss how to create a full-stack AI web-app, following a five-phase process playbook.
What is Lumeon?
Lumeon is a web-app that helps you stay in the flow while reading textbooks by providing useful AI shortcuts for interacting with the content. It is an exploration of AI features embedded in documents. The core of Lumeon is about:
- Parsing and chunking long-form technical books correctly
- Creating AI-generated content from the parsed data
- Interacting (using AI) with the source content and the generated content
Tech Outline
We implement the project as a monorepo that has three components:
- Frontend
- Backend
  - Serving
  - Infra (DB, workers)
- Core logic
  - AI logic
  - Business logic
Process
We complete the project in the following 5 phases:
1. Research + Spec Phase
First, let's list our inspirations and the features we want to build. We came up with:
Features:
- User management / subscription / login / auth
- Upload PDFs
- PDF / media viewer
- Generate quizzes, lessons, flashcards, PPTs, etc. using AI
- Quiz report analysis
- Chat with PDF highlights (CMD-L)
- Semantic search over document chunks (CMD-K)
2. Prototype Phase
In this phase, we focus on the initial look and feel of the product and the features we want to bring to life. This is a haphazard mix of Figma and React app building that leads to a demo version that is mocked out end-to-end. Here is the demo version: https://demo.lumeon.app
After this phase, we have a Lumeon folder from which we can start the frontend server (ensure `.env.local` secrets are set!):

```bash
bun install
bun run dev
```
3. Backend Phase
This is the backbone of the app. We use Python / FastAPI for the backend because of the easy interop with Python AI SDKs. Here we focus on:
- Data Model
- API Routes
- Background Jobs
- Supabase DB for Persistence
After this phase, we have the backend folder set up, which we can serve with (ensure `.env` secrets are set!):

```bash
uv sync --dev
uv run dev.py
```
a. Data Model
We first define the types and schemas we will be working with. Defining strong types early on eases debugging and makes the code robust to iteration. Some of these BaseModel types will be used to generate Structured Outputs from LLMs, which is the core utility we leverage for AI-generated content. Others define the DB schema. Here are some data models in pydantic:
```python
from datetime import datetime
from typing import Any

from pydantic import BaseModel, Field

# Flashcard is another pydantic model defined elsewhere in the codebase.


# type for AI generation
class LessonContentAI(BaseModel):
    title: str = Field(
        ...,
        description="The title of the lesson. Use the given title of the section as the title of the lesson if it is already provided, otherwise generate a title for the lesson.",
    )
    description: str = Field(
        ...,
        description="One or two line description of the content. Get the most information across.",
    )
    content: str = Field(
        ...,
        description="A concise summary of the content. Be to the point and concise.",
    )


# type for database
class Lesson(BaseModel):
    id: str
    space_id: str
    content: LessonContentAI
    order_number: int
    flashcards: list[Flashcard] | None = None
    ppt: dict[str, Any] | None = None
    ppt_path: str | None = None
    created_at: datetime
    updated_at: datetime
```
b. API Routes
We use FastAPI routers for the serving logic. FastAPI makes middleware and auth a breeze, and auto-generates an OpenAPI spec covering all the routes.
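The full spec is not reproduced here, but a minimal sketch of a FastAPI router illustrates the pattern (the route and model names are hypothetical, not Lumeon's actual API):

```python
from fastapi import APIRouter, FastAPI
from pydantic import BaseModel

# Hypothetical route and model names, for illustration only.
router = APIRouter(prefix="/lessons", tags=["lessons"])


class LessonSummary(BaseModel):
    id: str
    title: str


@router.get("/", response_model=list[LessonSummary])
async def list_lessons() -> list[LessonSummary]:
    # The real app would query the DB; stubbed data keeps the sketch runnable.
    return [LessonSummary(id="1", title="Intro to Algorithms")]


app = FastAPI()
app.include_router(router)  # routes are auto-documented in the OpenAPI spec at /docs
```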
c. Background Jobs
Background jobs are an essential component for long-running async tasks. We use background jobs for parsing long-form documents and creating the AI-generated content, which involves many (>100) AI calls that need to be resolved. We can poll the status of a background job and inform the user at every step.
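The job machinery itself is not shown in this post; here is a minimal sketch using FastAPI's built-in BackgroundTasks with an in-memory status map (in the real app, job status would be persisted in the DB and the task would make the AI calls):

```python
import uuid

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
jobs: dict[str, str] = {}  # job_id -> status; a real app would persist this in the DB


def parse_document(job_id: str, file_path: str) -> None:
    jobs[job_id] = "running"
    # ... parse the document and resolve the (many) AI generation calls ...
    jobs[job_id] = "done"


@app.post("/jobs")
async def start_job(file_path: str, background_tasks: BackgroundTasks) -> dict[str, str]:
    job_id = str(uuid.uuid4())
    jobs[job_id] = "queued"
    background_tasks.add_task(parse_document, job_id, file_path)
    return {"job_id": job_id}


@app.get("/jobs/{job_id}")
async def job_status(job_id: str) -> dict[str, str]:
    # Clients poll this endpoint to inform the user at every step.
    return {"status": jobs.get(job_id, "unknown")}
```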
d. Supabase DB + Auth
We use Supabase for all three kinds of storage we need:
- Relational SQL DB
- Blob Storage
- Vector Store
We use blob storage for file uploads, and the vector store for embeddings, over which we perform RAG queries to power the CMD-K semantic search feature.
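As a sketch of how the CMD-K query path could look with supabase-py and pgvector (the `match_chunks` RPC is a hypothetical SQL function that a migration would define; the real schema may differ):

```python
import os

from openai import OpenAI
from supabase import create_client

openai_client = OpenAI()
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])


def semantic_search(query: str, top_k: int = 5) -> list[dict]:
    # Embed the query, then match it against stored chunk embeddings via pgvector.
    embedding = (
        openai_client.embeddings.create(model="text-embedding-3-small", input=query)
        .data[0]
        .embedding
    )
    # `match_chunks` is a hypothetical Postgres function created in a migration.
    result = supabase.rpc(
        "match_chunks", {"query_embedding": embedding, "match_count": top_k}
    ).execute()
    return result.data
```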
First, set up a blank Supabase project, then run the migrations (SQL scripts) from the command line to set it all up:

```bash
npx supabase init
npx supabase link
npx supabase db reset
npx supabase db push
```
4. Core Logic
The core logic for Lumeon is about book parsing and structured AI generation. We handle both with our open-source library lumos.
a. Book Parsing Logic (with lumos.book)
The core logic for the app involves parsing the structure of a book, starting from its table of contents (TOC). Here is the flow for how we pre-process a book:
- Extract the TOC
- Sanitize the TOC to focus on the main content
- Find chapter and section boundaries (at the page level)
- Partition the book into legible sections
- Chunk the sections for clean references
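The internals of lumos.book are not shown in this post, but as a rough sketch of the first step, PyMuPDF can pull the embedded TOC directly from the PDF outline:

```python
import fitz  # PyMuPDF


def extract_toc(pdf_path: str) -> list[dict]:
    # get_toc() returns [level, title, page] entries from the PDF outline.
    doc = fitz.open(pdf_path)
    toc = [
        {"level": level, "title": title, "page": page}
        for level, title, page in doc.get_toc()
    ]
    doc.close()
    return toc
```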
b. AI Structured Outputs (with lumos)
We have 6 AI flows in the Lumeon app; here we talk about how to implement the Instant-Quiz AI feature.
The usual way to generate AI content (text) is via the chat messages API from the major AI providers. We can improve the reliability of the generation using Structured Outputs, a way to generate well-defined structs. See more: openai-link
We wrote a small wrapper on litellm for Structured Outputs called lumos. Here is how one can generate structs:
```python
from lumos import lumos
from pydantic import BaseModel


class Quiz(BaseModel):
    steps: list[str]
    final_answer: str


lumos.call_ai(
    messages=[
        {"role": "system", "content": "You are a mathematician."},
        {"role": "user", "content": "What is 100 * 100?"},
    ],
    response_format=Quiz,
    model="gpt-4o-mini",
)
# Quiz(steps=['Multiply 100 by 100.', '100 * 100 = 10000.'], final_answer='10000')
```
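In Lumeon, this same pattern drives all the generation flows: the parsed book chunks are passed in as context, and types like LessonContentAI from the data model section serve as the response_format.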
5. Deployment
We have three deployments to handle:
- Next.js frontend on Vercel
- FastAPI serving backend on Render
- Docker containerized book-parsing backend on Render