Pycasts is a reference API for producing long-form spoken audio and scripts for podcasts, powered by Nitric, Suno Bark and Llama 3.2.
Here's a sample of what can be produced with this project:
power-rangers-small.webm
If you'd like a step-through guide on producing this API see here
First, install the project dependencies:
uv sync --all-extras
This project also uses smaller LLMs to produce podcast scripts. The model is baked directly into the resulting container at deploy time, via the models directory of the project.
This can be populated by downloading an appropriately sized and quantized Llama 3.2 model. As an example:
mkdir models
curl -L https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q4_K_L.gguf -o models/Llama-3.2-3B-Instruct-Q4_K_L.gguf
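Before starting or deploying, it can help to confirm that a model file actually landed in the models directory. The following is a minimal Python sketch, assuming the script is run from the project root; the helper name `list_gguf_models` is hypothetical and not part of the project.

```python
from pathlib import Path


def list_gguf_models(models_dir: str = "models") -> list[Path]:
    """Return the .gguf files found directly inside models_dir, sorted by name."""
    root = Path(models_dir)
    if not root.is_dir():
        # A missing directory simply means nothing has been downloaded yet.
        return []
    return sorted(p for p in root.iterdir() if p.is_file() and p.suffix == ".gguf")


if __name__ == "__main__":
    found = list_gguf_models()
    if not found:
        print("No .gguf model found in ./models; download one first.")
    for path in found:
        # Report the size so a truncated download is easy to spot.
        size_gb = path.stat().st_size / 1e9
        print(f"{path.name}: {size_gb:.2f} GB")
```

An unexpectedly small file size usually points to an interrupted download or a saved error page rather than a real model file.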
The project can then be run locally by installing the Nitric CLI and running:
nitric start
This starts the project on your local machine. You can then interact with the API using the provided local Nitric dashboard or your HTTP client of choice.
Audio models are downloaded ahead of time through the API and stored locally in a bucket. To do this, hit the /download-model endpoint:
curl -X POST http://localhost:4001/download-model
This assumes your API is hosted on port 4001 (check the CLI output of nitric start).
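The same call can be made from Python using only the standard library. This is a sketch rather than part of the project: the function name `build_download_request` is hypothetical, and the default base URL assumes port 4001 as in the curl example.

```python
import urllib.request

# Assumption: the local API port reported by `nitric start`.
BASE_URL = "http://localhost:4001"


def build_download_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (without sending) the body-less POST request for /download-model."""
    return urllib.request.Request(f"{base_url}/download-model", method="POST")


if __name__ == "__main__":
    # Sending the request requires the local API to be running.
    with urllib.request.urlopen(build_download_request()) as resp:
        print("download-model responded with HTTP", resp.status)
```

Building the request separately from sending it makes it easy to point the same code at a different port if the CLI output shows one.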
Once the model has been fetched, you're ready to start generating podcasts.
For example:
curl -X POST http://localhost:4001/podcast/peanut \
-H "Content-Type: text/plain" \
-d "A podcast about the history of the peanut"
This would produce a short podcast-style script and audio on the history of the peanut 🥜.
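For scripted use, the curl request above can be mirrored in Python. The sketch below assumes, as the example suggests, that the last path segment names the podcast and that the plain-text body carries the prompt; `build_podcast_request` is a hypothetical helper, not a function from this project.

```python
import urllib.request
from urllib.parse import quote


def build_podcast_request(
    name: str,
    prompt: str,
    base_url: str = "http://localhost:4001",  # assumption: port from `nitric start`
) -> urllib.request.Request:
    """Build (without sending) a POST to /podcast/<name> with a text/plain prompt."""
    return urllib.request.Request(
        # Percent-encode the name so spaces or slashes cannot alter the path.
        f"{base_url}/podcast/{quote(name, safe='')}",
        data=prompt.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_podcast_request("peanut", "A podcast about the history of the peanut")
    # Sending the request requires the local API to be running.
    with urllib.request.urlopen(request) as resp:
        print("podcast request responded with HTTP", resp.status)
```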
For now, watch the logs in your CLI to track progress. Your output audio will be available in the clips bucket once everything has been generated. Use the storage section of the local Nitric dashboard to download your finished podcast.
A reference for deploying to AWS is provided with the project in nitric.aws.yaml.
For information on prerequisites and setup, see the Nitric AWS provider docs.
You can also consult this guide for instructions specific to this project.
Nitric also supports GCP for this app, and a GCP reference may be included with this project in the future.
- Remove API endpoints and make podcasts on a schedule
- Use different voice models
- Add support for multiple speakers