Commit

Add api examples
daavoo committed Dec 12, 2024
1 parent ebb3d11 commit 74bcd43
Showing 2 changed files with 39 additions and 0 deletions.
37 changes: 37 additions & 0 deletions docs/step-by-step-guide.md
@@ -48,6 +48,33 @@ Cleaner input data ensures that the model works with reliable and consistent inf

- Ensures the document is clean and ready for the next step.

### 🔍 **API Example**

```py
from document_to_podcast.preprocessing import DATA_CLEANERS, DATA_LOADERS

input_file = "example_data/introducing-mozilla-ai-investing-in-trustworthy-ai.html"
data_loader = DATA_LOADERS[".html"]
data_cleaner = DATA_CLEANERS[".html"]

raw_data = data_loader(input_file)
print(raw_data[:200])
"""
<!doctype html>
<html class="no-js" lang="en-US">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="profile" href="https://gmpg.org/x
"""
clean_data = data_cleaner(raw_data)
print(clean_data[:200])
"""
Skip to content Mozilla Internet Culture Deep Dives Mozilla Explains Interviews Videos Privacy Security Products Firefox Pocket Mozilla VPN Mozilla News Internet Policy Leadership Mitchell Baker, CEO
"""
```
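
The example above selects a loader and a cleaner by file extension. As a rough illustration of that lookup pattern, here is a minimal, self-contained sketch. Every name in it (`LOADERS`, `CLEANERS`, `load_text`, `clean_html`, `clean_plain`, `preprocess`) is a hypothetical stand-in, not the actual contents of `document_to_podcast.preprocessing`, and the regex-based cleaner is only a toy.

```python
import re
from pathlib import Path


# Hypothetical stand-ins for illustration only; the real DATA_LOADERS and
# DATA_CLEANERS in document_to_podcast.preprocessing may work differently.
def load_text(path: str) -> str:
    """Read a file from disk as UTF-8 text."""
    return Path(path).read_text(encoding="utf-8")


def clean_html(text: str) -> str:
    """Toy cleaner: replace tags with spaces, then collapse whitespace."""
    without_tags = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", without_tags).strip()


def clean_plain(text: str) -> str:
    """Toy cleaner: collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


# Registries keyed by file extension, mirroring the lookup style shown above.
LOADERS = {".html": load_text, ".txt": load_text}
CLEANERS = {".html": clean_html, ".txt": clean_plain}


def preprocess(path: str) -> str:
    """Pick the loader and cleaner from the file extension and apply both."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADERS or suffix not in CLEANERS:
        raise ValueError(f"Unsupported file type: {suffix!r}")
    return CLEANERS[suffix](LOADERS[suffix](path))
```

Keying both registries by extension keeps the calling code identical across input formats; supporting a new format in this sketch would only mean adding one entry to each dictionary.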

## **Step 2: Podcast Script Generation**

In this step, the pre-processed text is transformed into a conversational podcast transcript. Using a Language Model, the system generates a dialogue that’s both informative and engaging.
@@ -73,6 +100,16 @@ In this step, the pre-processed text is transformed into a conversational podcas
- Supports both single-pass outputs (`text_to_text`) and real-time streamed responses (`text_to_text_stream`), offering flexibility for different use cases.


### 🔍 **API Example**

```py
from document_to_podcast.inference.model_loaders import load_llama_cpp_model
from document_to_podcast.inference.text_to_text import text_to_text

...
```
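
To illustrate the difference between single-pass and streamed output mentioned above, here is a small sketch that does not call the library at all. `single_pass_reply` and `stream_reply` are hypothetical stand-ins that replay a fixed list of chunks; they are not the signatures of `text_to_text` or `text_to_text_stream`, and no model is loaded.

```python
from typing import Iterator

# Fixed chunks standing in for model output (an assumption for illustration).
CHUNKS = ["Speaker 1:", " Welcome", " to", " the", " show."]


def single_pass_reply(prompt: str) -> str:
    """Return the whole reply at once, after generation has finished."""
    return "".join(CHUNKS)


def stream_reply(prompt: str) -> Iterator[str]:
    """Yield the reply chunk by chunk, as each piece becomes available."""
    for chunk in CHUNKS:
        yield chunk


if __name__ == "__main__":
    prompt = "Turn this article into a podcast script."

    # Single pass: nothing is available until the full text is returned.
    print(single_pass_reply(prompt))

    # Streaming: chunks can be shown (or processed) as they arrive.
    for chunk in stream_reply(prompt):
        print(chunk, end="", flush=True)
    print()
```

Both paths produce the same final text in this sketch; streaming only changes when the caller gets to see it, which tends to matter for interactive demos more than for batch jobs.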


## **Step 3: Audio Podcast Generation**

In this final step, the generated podcast transcript is brought to life as an audio file. Using a Text-to-Speech (TTS) model, each speaker in the script is assigned a unique voice, creating an engaging and professional-sounding podcast.
2 changes: 2 additions & 0 deletions mkdocs.yml
@@ -27,6 +27,8 @@ theme:
name: Switch to light mode
extra_css:
- assets/custom.css
features:
- content.code.copy

markdown_extensions:
- pymdownx.highlight:
