
Building Lumeon: an AI web-app for textbooks

Written by Arun Patro

In this blog, we discuss how to create a full-stack AI web-app, with a five-step process playbook.

What is Lumeon?

Lumeon is a web-app that helps you stay in the flow while reading textbooks by providing useful AI shortcuts to interact with the content. It is an exploration of embedded AI features on documents. The core of Lumeon is about:

  1. Parsing and chunking long-form technical books correctly
  2. Creating AI-generated content from the parsed data
  3. Interacting (using AI) with the source content and the generated content

Tech Outline

We implement the project as a monorepo that has three components:

  1. Frontend
  2. Backend
    • Serving
    • Infra (like DB, Workers)
  3. Core logic
    • AI logic
    • Business logic
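
On disk, the monorepo layout could look something like this (the folder names are illustrative; only the three components above are fixed):

lumeon/
├── frontend/     # Next.js app
├── backend/      # FastAPI serving + infra (DB, workers)
└── lumos/        # core AI + business logic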


Process

We complete the project in the following five phases:

1. Research + Spec Phase

First, let's list the inspirations and the features we want to build. We came up with:

Features:

  • User management / subscription / login / auth
  • Upload PDFs
  • PDF / media viewer
  • Generate quizzes, lessons, flashcards, PPTs, etc. using AI
  • Quiz report analysis
  • Chat with PDF highlights (CMD-L)
  • Semantic search over document chunks (CMD-K)

2. Prototype Phase

In this phase, we focus on the initial look and feel of the product and the features we want to bring alive. This is a haphazard mix of Figma and React app building that leads us to a demo version that is mocked out end-to-end. Here is the demo version: https://demo.lumeon.app

After this phase, we have a Lumeon folder from which we can start the frontend server (ensure .env.local secrets are set!):

bun install
bun run dev
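
For reference, the .env.local file looks something like this (the variable names follow standard Next.js / Supabase conventions; the exact set here is an assumption):

NEXT_PUBLIC_SUPABASE_URL=https://<project>.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=<anon-key>
NEXT_PUBLIC_API_URL=http://localhost:8000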

3. Backend Phase

This is the backbone of the app. We use Python / FastAPI for the backend because of its easy interop with Python AI SDKs. Here we focus on:

  1. Data Model
  2. API Routes
  3. Background Jobs
  4. Supabase DB for Persistence

After this phase, we have the backend folder setup which we can serve with (ensure .env secrets are set!):

uv sync --dev
uv run dev.py
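
dev.py is just a thin entrypoint around uvicorn; here is a minimal sketch of what it could look like (the app module path is an assumption):

# dev.py -- minimal dev entrypoint; "app.main:app" is a hypothetical module path
import uvicorn

if __name__ == "__main__":
    uvicorn.run("app.main:app", host="0.0.0.0", port=8000, reload=True)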

a. Data Model

We first define the types and schemas we will be working with. Defining strong types early on makes debugging easier and keeps the code robust through iteration. Some of these BaseModel types will be used to generate Structured Outputs from LLMs, which is the core utility we leverage for AI-generated content. Some of them are for the DB schema. Here are some data models in Pydantic:

from datetime import datetime
from typing import Any

from pydantic import BaseModel, Field

# type for AI generation
class LessonContentAI(BaseModel):
    title: str = Field(
        ...,
        description="The title of the lesson. Use the given title of the section as the title of the lesson if it is already provided, otherwise generate a title for the lesson.",
    )
    description: str = Field(
        ...,
        description="One or two line description of the content. Get the most information across.",
    )
    content: str = Field(
        ...,
        description="A concise summary of the content. Be to the point and concise.",
    )

# type for database (Flashcard is another Pydantic model, defined elsewhere)
class Lesson(BaseModel):
    id: str
    space_id: str
    content: LessonContentAI
    order_number: int
    flashcards: list[Flashcard] | None = None
    ppt: dict[str, Any] | None = None
    ppt_path: str | None = None
    created_at: datetime
    updated_at: datetime
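
To tie the two together: the AI type is what the LLM fills in via Structured Outputs, and the DB type wraps it with identifiers and timestamps before persistence. Here is a minimal sketch using the lumos wrapper introduced below (the prompt and space ID are made up for illustration):

import uuid
from datetime import datetime, timezone

from lumos import lumos

section_text = "..."  # a parsed section of the book

# Let the LLM fill in the AI type via Structured Outputs
ai_content = lumos.call_ai(
    messages=[
        {"role": "system", "content": "Summarize this textbook section into a lesson."},
        {"role": "user", "content": section_text},
    ],
    response_format=LessonContentAI,
    model="gpt-4o-mini",
)

# Wrap it in the DB type before inserting into Supabase
lesson = Lesson(
    id=str(uuid.uuid4()),
    space_id="space-123",  # hypothetical workspace ID
    content=ai_content,
    order_number=1,
    created_at=datetime.now(timezone.utc),
    updated_at=datetime.now(timezone.utc),
)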

b. API Routes

We use FastAPI routers for the serving logic. FastAPI makes middleware and auth a breeze, and it auto-generates an OpenAPI spec for all the routes.
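
As an illustration, here is a minimal sketch of what one of the routers could look like (the route path, auth dependency, and in-memory store are assumptions for the example, not the actual Lumeon code):

from fastapi import APIRouter, Depends, FastAPI, HTTPException

router = APIRouter(prefix="/lessons", tags=["lessons"])

# Hypothetical auth dependency -- in practice this would verify a Supabase JWT
def get_current_user() -> str:
    return "user-123"

# In-memory stand-in for the Supabase-backed lessons table
FAKE_DB: dict[str, dict] = {}

@router.get("/{lesson_id}")
def get_lesson(lesson_id: str, user: str = Depends(get_current_user)) -> dict:
    lesson = FAKE_DB.get(lesson_id)
    if lesson is None:
        raise HTTPException(status_code=404, detail="Lesson not found")
    return lesson

app = FastAPI(title="Lumeon API")
app.include_router(router)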

c. Background Jobs

Background jobs are an essential component for long-running async tasks. We use background jobs for parsing long-form documents and for creating the AI-generated content. This involves many (>100) AI calls that need to be resolved. We can poll the status of a background job and inform the user at every step.
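
Here is a minimal sketch of this pattern using FastAPI's built-in BackgroundTasks (the actual workers differ; the in-memory job store stands in for a persistent one):

import uuid

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
JOBS: dict[str, dict] = {}  # in-memory job store; a real app would persist this

def process_book(job_id: str) -> None:
    JOBS[job_id]["status"] = "running"
    # ... stand-in for the real pipeline: parse -> chunk -> generate content ...
    JOBS[job_id]["status"] = "done"

@app.post("/jobs")
def start_job(background_tasks: BackgroundTasks) -> dict:
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "queued"}
    background_tasks.add_task(process_book, job_id)
    return {"job_id": job_id}

@app.get("/jobs/{job_id}")
def job_status(job_id: str) -> dict:
    # clients poll this route to keep the user informed
    return JOBS.get(job_id, {"status": "unknown"})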

d. Supabase DB + Auth

We can use Supabase for all three kinds of storage:

  1. Relational SQL DB
  2. Blob Storage
  3. Vector Store

We use the blob storage for file uploads, and the vector store for embeddings, over which we can perform RAG queries and power the CMD-K feature.

First, set up a blank Supabase project, then from the command line run the migrations (SQL scripts) to set it all up:

npx supabase init
npx supabase link
npx supabase db reset
npx supabase db push
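
With the project set up, here is a sketch of how the CMD-K semantic search could query the vector store (the match_chunks Postgres function and the embedding model are assumptions; supabase-py's rpc call is the real API):

from openai import OpenAI
from supabase import create_client

supabase = create_client("https://<project>.supabase.co", "<service-role-key>")
openai_client = OpenAI()

def semantic_search(query: str, top_k: int = 5) -> list[dict]:
    # embed the query with the same model used for the document chunks
    embedding = (
        openai_client.embeddings.create(model="text-embedding-3-small", input=query)
        .data[0]
        .embedding
    )
    # match_chunks is a hypothetical Postgres function that runs a pgvector
    # similarity search over the stored chunk embeddings
    response = supabase.rpc(
        "match_chunks", {"query_embedding": embedding, "match_count": top_k}
    ).execute()
    return response.data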

4. Core Logic

The core logic for Lumeon is about Book Parsing and Structured AI generation. We handle them both with our open-source library lumos.

a. Book Parsing Logic (with lumos.book)

The core logic for the app involves parsing the structure of a book, starting from its table of contents. Here is the flow for how we pre-process a book (see the sketch after this list):

  • extract the TOC
  • sanitize the TOC to focus on the main content
  • find chapter and section boundaries (on the page)
  • partition the book into legible sections
  • chunk the sections for clean references
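
As a sketch of the first step, here is how a table of contents can be read from a PDF outline with PyMuPDF (whether lumos.book uses PyMuPDF internally is an assumption):

import fitz  # PyMuPDF

doc = fitz.open("book.pdf")
# get_toc() returns (level, title, page) triples from the PDF outline
for level, title, page in doc.get_toc():
    print("  " * (level - 1) + f"{title} (p. {page})")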

b. AI Structured Outputs (with lumos)

We have six AI flows in the Lumeon app; here we talk about how to implement the Instant-Quiz AI feature.

The usual way to generate AI content (text) is via the ChatMessages API from major AI providers. We can improve the reliability of the generation using Structured Outputs, a way to generate well-defined structs. See more: openai-link

We wrote a small wrapper around litellm for Structured Outputs called lumos. Here is how one can generate structs:

from lumos import lumos
from pydantic import BaseModel


class Quiz(BaseModel):
    steps: list[str]
    final_answer: str


result = lumos.call_ai(
    messages=[
        {"role": "system", "content": "You are a mathematician."},
        {"role": "user", "content": "What is 100 * 100?"},
    ],
    response_format=Quiz,
    model="gpt-4o-mini",
)
# Quiz(steps=['Multiply 100 by 100.', '100 * 100 = 10000.'], final_answer='10000')

5. Deployment

We have three deployments to handle:

  1. Next.js frontend on Vercel
  2. FastAPI serving backend on Render
  3. Docker containerized book-parsing backend on Render