Learning Pipeline
Passive consumption → Searchable knowledge
32K videos tracked. 15K transcribed. 4,142 hours watched.
Watch and forget → transcribe, embed, retrieve.
"My best teachers have often been YouTube tutorials, reviews and demos. I found myself spending more than five hours a day watching videos at 2-3x speed."
External memory for minds that can't hold it all.
The Real Numbers
- 31,832 videos tracked
- 15,456 transcribed
- 6,152 channels
- 4,142 hours watched
The Problem
- Video is inefficient. An hour of video often covers what takes about 10 minutes to read, yet video carries content that text does not.
- Consumed ≠ retained. Watching at 2x speed helps throughput but not recall.
- No cross-reference. What did that tutorial say about X? Lost in watch history.
- API costs. Cloud transcription is expensive at scale (10K+ videos).
The Solution
Local-first learning infrastructure. Capture → Transcribe → Structure → Store → Retrieve. Zero API costs. Searchable knowledge base from video consumption.
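The five stages above can be sketched as a chain of small functions that pass one record along. This is a minimal illustration and not the project's actual code: `VideoRecord`, the stage functions, and `run_pipeline` are hypothetical names, the stage bodies are placeholders for the real download, transcription, and extraction tools, and a plain dict stands in for the database.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class VideoRecord:
    """One video moving through the pipeline (hypothetical schema)."""
    video_id: str
    audio_path: Optional[str] = None
    transcript: Optional[str] = None
    topics: List[str] = field(default_factory=list)


Stage = Callable[[VideoRecord], VideoRecord]


def capture(rec: VideoRecord) -> VideoRecord:
    # Placeholder: a real stage would download the audio track.
    rec.audio_path = f"audio/{rec.video_id}.m4a"
    return rec


def transcribe(rec: VideoRecord) -> VideoRecord:
    # Placeholder: a real stage would run a local speech-to-text model.
    rec.transcript = f"transcript of {rec.audio_path}"
    return rec


def structure(rec: VideoRecord) -> VideoRecord:
    # Placeholder: a real stage would extract topics and a summary.
    # Here, "topics" are simply the distinct words longer than 4 characters.
    words = (rec.transcript or "").split()
    rec.topics = sorted({w for w in words if len(w) > 4})
    return rec


def run_pipeline(
    video_ids: Iterable[str],
    stages: List[Stage],
    store: Dict[str, VideoRecord],
) -> Dict[str, VideoRecord]:
    """Run every stage in order for each video, then store the result."""
    for video_id in video_ids:
        rec = VideoRecord(video_id=video_id)
        for stage in stages:
            rec = stage(rec)
        store[video_id] = rec  # Store step: the dict stands in for a database
    return store


def retrieve(store: Dict[str, VideoRecord], topic: str) -> List[str]:
    """Return the ids of stored videos tagged with the given topic."""
    return [vid for vid, rec in store.items() if topic in rec.topics]


store = run_pipeline(["abc123"], [capture, transcribe, structure], {})
matches = retrieve(store, "transcript")
```

Keeping each stage a plain function over one record means a stage can be swapped (for example, a different transcription model) without touching the others.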
Research Questions
- Retention: Does searchable transcription improve recall vs. passive watching?
- Speed vs. depth: What's the optimal consumption speed for different content types?
- Active retrieval: How often do people actually search their knowledge base?
- Compression: Can AI summarization replace full consumption for some content?
Preliminary Data
31,832 videos tracked. 15,456 transcribed. 6,152 channels. 1,407 rewatched. Local ML transcription via Whisper/Parakeet, with zero API costs.
Estimated 4+ hours/day saved by searching transcripts instead of re-watching. "What did that tutorial say about X?" answered in seconds instead of scrubbing.
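To illustrate how a transcript lookup can replace scrubbing, here is a small sketch that searches timestamped segments and returns where in the video the match occurs. It assumes transcripts are stored as segments with a start time. The `Segment` type and both function names are hypothetical, and it uses plain keyword matching as a simple stand-in rather than embedding-based semantic search.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """A slice of transcript with its start time (hypothetical schema)."""
    video_id: str
    start_seconds: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a second offset as HH:MM:SS."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def search_segments(segments: List[Segment], query: str) -> List[Tuple[str, str, str]]:
    """Return (video_id, timestamp, text) for segments containing every query term."""
    terms = query.lower().split()
    hits = []
    for seg in segments:
        text = seg.text.lower()
        # Case-insensitive substring match on each term, not semantic similarity.
        if all(term in text for term in terms):
            hits.append((seg.video_id, format_timestamp(seg.start_seconds), seg.text))
    return hits


# Made-up sample data for illustration only.
segments = [
    Segment("vid1", 12.0, "Welcome to the tutorial"),
    Segment("vid1", 3725.0, "Docker volumes persist data between runs"),
    Segment("vid2", 95.5, "Use volumes sparingly in production"),
]
hits = search_segments(segments, "docker volumes")
```

Each hit carries a timestamp, so the answer points to the exact moment in the video instead of only the video itself.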
Primary content: tutorials, tech reviews, lectures. Average watch speed: 2-3x. Peak consumption: late evening.
Pipeline Architecture
YouTube/Podcast → yt-dlp → Audio file
↓
Whisper/Parakeet (local)
↓
Transcript + timestamps
↓
LLM extraction (topics, summary)
↓
Supabase + embeddings
↓
Semantic search interface
Roadmap
- Build transcription pipeline
- Process 10K+ videos
- Semantic search interface
- Retention study (before/after)
- Speed optimization research
- Open source pipeline
Documentation
Contribute
Share your own learning infrastructure, consumption data, or retention studies.
Open an issue →