Learning Pipeline

Passive consumption → Searchable knowledge

32K videos tracked. 15K transcribed. 4,142 hours watched.

Watch and forget → transcribe, embed, retrieve.
"My best teachers have often been YouTube tutorials, reviews and demos. I found myself spending more than five hours a day watching videos at 2-3x speed."

External memory for minds that can't hold it all.

The Real Numbers

Videos tracked: 31,832
Transcribed: 15,456
Channels: 6,152
Hours watched: 4,142

The Problem

The Solution

Local-first learning infrastructure. Capture → Transcribe → Structure → Store → Retrieve. Zero API costs. Searchable knowledge base from video consumption.

CAPTURE      Download video/audio from any source       yt-dlp, browser extensions
TRANSCRIBE   Convert audio to searchable text           Whisper, Parakeet (local ML)
STRUCTURE    Extract topics, timestamps, key points     LLM processing
STORE        Searchable database with embeddings        Supabase, vector DB
RETRIEVE     Query across all consumed content          Semantic search
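
As an illustration of the CAPTURE stage, here is a minimal sketch of how an audio-only download could be driven through the yt-dlp command line. The function names, the `audio` output directory, and the `downloaded.txt` archive file are assumptions made for this example, not details of the project's actual setup.

```python
import subprocess
from pathlib import Path


def build_capture_command(url: str, out_dir: str,
                          archive_file: str = "downloaded.txt") -> list[str]:
    """Build a yt-dlp command that saves only the audio track of `url`.

    Directory and archive names are hypothetical placeholders.
    """
    return [
        "yt-dlp",
        "--extract-audio",               # transcription only needs the audio
        "--audio-format", "mp3",
        # Record finished downloads so repeated runs skip them.
        "--download-archive", str(Path(out_dir) / archive_file),
        # Name each file after the source's video id.
        "--output", str(Path(out_dir) / "%(id)s.%(ext)s"),
        url,
    ]


def capture(url: str, out_dir: str = "audio") -> None:
    """Run the download; assumes yt-dlp and ffmpeg are installed and on PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_capture_command(url, out_dir), check=True)
```

Keeping the command builder separate from the function that runs it makes the arguments easy to inspect or log before anything is downloaded.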

Research Questions

Preliminary Data

Scale Achieved

31,832 videos tracked. 15,456 transcribed. 6,152 channels. 1,407 rewatched. Transcription runs locally via Whisper/Parakeet, with zero API costs.
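
A rough sketch of what local transcription could look like with the open-source `openai-whisper` Python package. The `base` model size, the function names, and the row layout are assumptions for illustration. The source does not say which Whisper implementation or model is used, and Parakeet is not shown here.

```python
from typing import Any


def transcribe_locally(audio_path: str, model_name: str = "base") -> dict[str, Any]:
    """Transcribe on the local machine, with no API calls.

    Assumes `pip install openai-whisper` and ffmpeg are available.
    """
    import whisper  # imported lazily so the helper below works without it

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)


def segments_to_rows(video_id: str, result: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten a Whisper-style result into one row per timestamped segment.

    The row layout is a hypothetical storage shape, not the project's schema.
    """
    rows = []
    for seg in result.get("segments", []):
        text = seg["text"].strip()
        if not text:
            continue  # skip segments that contain only whitespace
        rows.append({
            "video_id": video_id,
            "start": round(float(seg["start"]), 2),  # seconds from the start
            "end": round(float(seg["end"]), 2),
            "text": text,
        })
    return rows
```

Typical use would be `rows = segments_to_rows(video_id, transcribe_locally(path))`, with the rows then passed on to whatever store the pipeline uses.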

Time Savings

Estimated 4+ hours/day saved by searching transcripts instead of re-watching. "What did that tutorial say about X?" answered in seconds instead of scrubbing.
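
To make the "answered in seconds" idea concrete, here is a minimal sketch of a phrase lookup over timestamped transcript segments. It returns a link that jumps straight to the matching moment. The row shape (`video_id`, `start`, `text`) and the function names are assumptions for this example, and plain substring matching stands in for the real search.

```python
def format_timestamp(seconds: float) -> str:
    """Render a second count as H:MM:SS."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def find_mentions(rows: list[dict], phrase: str) -> list[dict]:
    """Return every segment that mentions `phrase`, with a deep link to it.

    `rows` is a hypothetical list of dicts with `video_id`, `start`
    (seconds), and `text` keys.
    """
    needle = phrase.lower()
    hits = []
    for row in rows:
        if needle in row["text"].lower():
            start = int(row["start"])
            hits.append({
                "at": format_timestamp(row["start"]),
                # YouTube accepts a `t=<seconds>s` parameter to start playback there.
                "url": f"https://www.youtube.com/watch?v={row['video_id']}&t={start}s",
                "text": row["text"],
            })
    return hits
```

For example, `find_mentions(rows, "learning rate")` would list each segment containing that phrase, so the relevant passage can be reopened directly instead of scrubbing through the video.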

Consumption Patterns

Primary content: tutorials, tech reviews, lectures. Average watch speed: 2-3x. Peak consumption: late evening.

Pipeline Architecture

YouTube/Podcast → yt-dlp → Audio file
                              ↓
                    Whisper/Parakeet (local)
                              ↓
                    Transcript + timestamps
                              ↓
                    LLM extraction (topics, summary)
                              ↓
                    Supabase + embeddings
                              ↓
                    Semantic search interface
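
The final step of the diagram, ranking stored transcripts against a query, can be sketched in a few lines. This sketch makes one large assumption: a bag-of-words vector stands in for a real embedding model, and an in-memory list stands in for Supabase or another vector database. That makes it lexical rather than truly semantic. The ranking logic (cosine similarity, top k) has the same shape either way. `toy_embed` and `TranscriptIndex` are hypothetical names.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TranscriptIndex:
    """In-memory stand-in for a vector store."""

    def __init__(self) -> None:
        self._items: list[tuple[str, str, Counter]] = []

    def add(self, video_id: str, text: str) -> None:
        self._items.append((video_id, text, toy_embed(text)))

    def search(self, query: str, k: int = 3) -> list[tuple[float, str, str]]:
        """Return up to `k` (score, video_id, text) tuples, best match first."""
        q = toy_embed(query)
        scored = [(cosine(q, emb), vid, text) for vid, text, emb in self._items]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]
```

Swapping `toy_embed` for a real embedding function and `TranscriptIndex` for a database query would leave the calling code largely unchanged, which is the main reason to keep the two pieces separate.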

Roadmap

Documentation

Contribute

Share your own learning infrastructure, consumption data, or retention studies.

Open an issue →

Built with yt-dlp, Whisper, and local ML.