YouTube Transcription & Search API

A Cloudflare Workers application that transcribes YouTube videos, stores transcriptions, and provides semantic search capabilities using AI embeddings.

Features

YouTube video transcription using Groq's Whisper model
Vector embeddings generation using Cloudflare AI
Semantic search across video transcriptions
Caching with Cloudflare KV
Workflow-based processing architecture

Prerequisites

Node.js (latest LTS version recommended)
Bun (the install, dev, and deploy commands below are run with it)
Cloudflare Workers account
Required API keys:
- Groq API key

Installation

Clone the repository
Install dependencies:

bun install

Configuration

Create a wrangler.toml file based on the provided template
Set up the required secrets using Wrangler:

wrangler secret put GROQ_API_KEY
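
Secrets stored with Wrangler are exposed to the Worker as string properties on its environment object. The sketch below shows one way to read the key defensively. The `Env` type and the `requireGroqKey` helper are illustrative names and are not taken from this repository.

```typescript
// Hypothetical shape of the Worker environment; only the secret is modelled.
type Env = { GROQ_API_KEY?: string };

// Returns the Groq API key, or fails early with a hint on how to set it.
function requireGroqKey(env: Env): string {
  const key = env.GROQ_API_KEY;
  if (!key) {
    throw new Error(
      "GROQ_API_KEY is not set; run `wrangler secret put GROQ_API_KEY`",
    );
  }
  return key;
}
```

Failing at the point of use keeps a missing secret from surfacing later as an opaque authentication error from the transcription provider.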

Development

Run the development server:

bun run dev

Deployment

Deploy to Cloudflare Workers:

bun run deploy

API Endpoints

Process Video

POST /api/process-video/:id
- Starts transcription process for a YouTube video
- Returns cached result if available
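
A minimal client sketch for this endpoint, assuming the Worker is reachable at a base URL you supply (the `example.workers.dev` host in the usage note is a placeholder). The response body is not documented here, so it is returned as untyped JSON. `processVideoUrl` and `processVideo` are illustrative names.

```typescript
// Builds the endpoint URL for a given YouTube video ID.
function processVideoUrl(baseUrl: string, videoId: string): string {
  return `${baseUrl}/api/process-video/${encodeURIComponent(videoId)}`;
}

// Sends the POST request. The response shape is an unknown here because
// this README does not specify it.
async function processVideo(
  baseUrl: string,
  videoId: string,
): Promise<unknown> {
  const res = await fetch(processVideoUrl(baseUrl, videoId), {
    method: "POST",
  });
  if (!res.ok) {
    throw new Error(`process-video failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Usage: `await processVideo("https://example.workers.dev", "<video id>")`.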

Check Status

GET /api/status/:instanceId
- Check the status of a running transcription workflow
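
Because transcription runs as a workflow, a client will usually poll this endpoint until the run finishes. The sketch below assumes the response is JSON with a `status` string and that `complete`, `errored`, and `terminated` are the terminal values (modelled on Cloudflare Workflows instance statuses). Neither assumption is confirmed by this README, so adjust both to the actual response.

```typescript
// Assumed response shape: only a `status` string is relied on.
type WorkflowStatus = { status: string };

// Assumed terminal states; other values are treated as "still in progress".
const TERMINAL_STATES = new Set(["complete", "errored", "terminated"]);

function isTerminal(status: string): boolean {
  return TERMINAL_STATES.has(status);
}

// Polls the status endpoint at a fixed interval until a terminal state
// is seen or the attempt budget is used up.
async function waitForWorkflow(
  baseUrl: string,
  instanceId: string,
  intervalMs = 2000,
  maxAttempts = 30,
): Promise<WorkflowStatus> {
  const url = `${baseUrl}/api/status/${encodeURIComponent(instanceId)}`;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(url);
    if (!res.ok) {
      throw new Error(`status check failed with HTTP ${res.status}`);
    }
    const body = (await res.json()) as WorkflowStatus;
    if (isTerminal(body.status)) {
      return body;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("workflow did not reach a terminal state in time");
}
```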

Search

GET /api/search?q=query
- Perform semantic search across transcribed videos
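
A short client sketch for the search endpoint. It uses `URLSearchParams` so that free-text queries are encoded correctly. The base URL is a placeholder, the helper names are illustrative, and the result format is left untyped because it is not described here.

```typescript
// Builds the search URL with a properly encoded `q` parameter.
function searchUrl(baseUrl: string, query: string): string {
  const params = new URLSearchParams({ q: query });
  return `${baseUrl}/api/search?${params.toString()}`;
}

// Runs the search and returns the parsed JSON body as-is.
async function searchTranscripts(
  baseUrl: string,
  query: string,
): Promise<unknown> {
  const res = await fetch(searchUrl(baseUrl, query));
  if (!res.ok) {
    throw new Error(`search failed with HTTP ${res.status}`);
  }
  return res.json();
}
```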

Architecture

The application uses several Cloudflare services:

Workers: Main application runtime
KV: Caching transcriptions
Vectorize: Vector database for embeddings
AI: Embedding generation
Workflows: Orchestrating the transcription process

Technical Details

Built with Hono.js framework
TypeScript for type safety
Uses Cloudflare's AI models for embeddings
Implements workflow-based processing for long-running tasks
Includes automatic retries and error handling
Implements caching strategies for performance

Error Handling

The application includes comprehensive error handling:

Workflow retries for transient failures
Graceful degradation
Detailed error logging
Client-friendly error responses
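
To illustrate what retrying a transient failure involves, here is a generic exponential-backoff helper. It is a standalone sketch and not this project's code: in the deployed application, retries are handled by the workflow layer, and the `limit` and `baseMs` values below are arbitrary defaults chosen for the example.

```typescript
// Delay before the next try: baseMs, 2*baseMs, 4*baseMs, ...
function backoffDelayMs(attempt: number, baseMs = 1000): number {
  return baseMs * 2 ** attempt;
}

// Runs `task`, retrying up to `limit` times after the first failure.
// The last error is rethrown once the retry budget is exhausted.
async function withRetries<T>(
  task: () => Promise<T>,
  limit = 3,
  baseMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= limit; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < limit) {
        const delay = backoffDelayMs(attempt, baseMs);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```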

Limitations

Maximum video length may be limited by processing time constraints
Request rates and per-request resources are bounded by Cloudflare Workers platform limits
Cached transcriptions expire after 30 days
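
The 30-day expiry corresponds to a KV time-to-live of 2,592,000 seconds. The cache-first sketch below shows how such a TTL could be applied. `KVLike` is a minimal local interface covering only the `get` and `put` calls used, and the `transcript:` key prefix and function names are assumptions made for the example.

```typescript
// 30 days expressed in seconds, the unit KV expects for expirationTtl.
const CACHE_TTL_SECONDS = 30 * 24 * 60 * 60;

// Minimal subset of the KV namespace API used by this sketch.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(
    key: string,
    value: string,
    options?: { expirationTtl?: number },
  ): Promise<void>;
}

// Hypothetical key scheme; the real project may use a different one.
function cacheKey(videoId: string): string {
  return `transcript:${videoId}`;
}

// Returns the cached transcript if present; otherwise transcribes,
// stores the result with a 30-day TTL, and returns it.
async function getOrTranscribe(
  kv: KVLike,
  videoId: string,
  transcribe: (id: string) => Promise<string>,
): Promise<string> {
  const cached = await kv.get(cacheKey(videoId));
  if (cached !== null) {
    return cached;
  }
  const text = await transcribe(videoId);
  await kv.put(cacheKey(videoId), text, {
    expirationTtl: CACHE_TTL_SECONDS,
  });
  return text;
}
```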

Contributing

Fork the repository
Create your feature branch
Commit your changes
Push to the branch
Create a new Pull Request

License

MIT License

Repository: TheRohit/cf-hono-app