Add api examples

mozilla-ai · Dec 12, 2024 · 74bcd43
1 parent ebb3d11
Showing 2 changed files with 39 additions and 0 deletions.
diff --git a/docs/step-by-step-guide.md b/docs/step-by-step-guide.md
@@ -48,6 +48,33 @@ Cleaner input data ensures that the model works with reliable and consistent inf
 
    - Ensures the document is clean and ready for the next step.
 
+### 🔍 **API Example**
+
+```py
+from document_to_podcast.preprocessing import DATA_CLEANERS, DATA_LOADERS
+
+input_file = "example_data/introducing-mozilla-ai-investing-in-trustworthy-ai.html"
+data_loader = DATA_LOADERS[".html"]
+data_cleaner = DATA_CLEANERS[".html"]
+
+raw_data = data_loader(input_file)
+print(raw_data[:200])
+"""
+<!doctype html>
+<html class="no-js" lang="en-US">
+
+<head>
+  <meta charset="UTF-8">
+  <meta name="viewport" content="width=device-width, initial-scale=1">
+  <link rel="profile" href="https://gmpg.org/x
+"""
+clean_data = data_cleaner(raw_data)
+print(clean_data[:200])
+"""
+Skip to content Mozilla Internet Culture Deep Dives Mozilla Explains Interviews Videos Privacy Security Products Firefox Pocket Mozilla VPN Mozilla News Internet Policy Leadership Mitchell Baker, CEO
+"""
+```
+
 ## **Step 2: Podcast Script Generation**
 
 In this step, the pre-processed text is transformed into a conversational podcast transcript. Using a Language Model, the system generates a dialogue that’s both informative and engaging.
@@ -73,6 +100,16 @@ In this step, the pre-processed text is transformed into a conversational podcas
    - Supports both single-pass outputs (`text_to_text`) and real-time streamed responses (`text_to_text_stream`), offering flexibility for different use cases.
 
 
+### 🔍 **API Example**
+
+```py
+from document_to_podcast.inference.model_loaders import load_llama_cpp_model
+from document_to_podcast.inference.text_to_text import text_to_text
+
+...
+```
+
+
 ## **Step 3: Audio Podcast Generation**
 
 In this final step, the generated podcast transcript is brought to life as an audio file. Using a Text-to-Speech (TTS) model, each speaker in the script is assigned a unique voice, creating an engaging and professional-sounding podcast.

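The Step 1 example added above looks up a loader and a cleaner by file extension (`DATA_LOADERS[".html"]`, `DATA_CLEANERS[".html"]`). For readers unfamiliar with that pattern, here is a minimal, standard-library-only sketch of an extension-keyed registry. Everything in it (`LOADERS`, `CLEANERS`, `load_text`, `clean_html`, `clean_plain`, `preprocess`) is a hypothetical stand-in written for illustration; it is not the `document_to_podcast` implementation, and the real cleaners likely do more than strip tags.

```python
import re
from pathlib import Path
from typing import Callable, Dict


def load_text(path: str) -> str:
    """Hypothetical loader: read the whole file as UTF-8 text."""
    return Path(path).read_text(encoding="utf-8")


def clean_html(text: str) -> str:
    """Hypothetical cleaner: replace tags with spaces, then collapse whitespace."""
    without_tags = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", without_tags).strip()


def clean_plain(text: str) -> str:
    """Hypothetical cleaner: collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text).strip()


# Registries keyed by file extension, mirroring the lookup style in the diff.
LOADERS: Dict[str, Callable[[str], str]] = {".html": load_text, ".txt": load_text}
CLEANERS: Dict[str, Callable[[str], str]] = {".html": clean_html, ".txt": clean_plain}


def preprocess(path: str) -> str:
    """Pick the loader and cleaner from the file extension and apply both."""
    extension = Path(path).suffix.lower()
    if extension not in LOADERS or extension not in CLEANERS:
        raise ValueError(f"Unsupported file type: {extension!r}")
    raw_data = LOADERS[extension](path)
    return CLEANERS[extension](raw_data)
```

Keying both dictionaries by extension keeps the calling code identical for every supported format; supporting a new format in this sketch only means registering one more pair of functions.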
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -27,6 +27,8 @@ theme:
         name: Switch to light mode
   extra_css:
     - assets/custom.css
+  features:
+    - content.code.copy
 
 markdown_extensions:
   - pymdownx.highlight: