This project is a Python script that processes Markdown files, automatically replacing plain URLs with titled links. It fetches the content of each URL, generates a descriptive title using a Large Language Model (LLM), and writes the Markdown back out with the new, more informative links.
- Asynchronous processing of URLs for improved performance
- Supports multiple backend LLM providers
- Uses Jina AI's content extraction service for reliable web scraping
- Implements retry logic and rate limiting to handle network issues
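
The asynchronous processing and retry behaviour listed above could look roughly like the sketch below. It is not the code from `run.py`: the names `with_retries`, `process_all`, and the `fetch_title` callable are hypothetical, and a concurrency cap (a semaphore) stands in for rate limiting as a simple approximation.

```python
import asyncio
import random


async def with_retries(task_factory, attempts=3, base_delay=0.5):
    """Await task_factory() up to `attempts` times, backing off between tries."""
    for attempt in range(attempts):
        try:
            return await task_factory()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            # Exponential backoff plus a little jitter to avoid synchronized retries.
            await asyncio.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


async def process_all(urls, fetch_title, max_concurrency=5, attempts=3, base_delay=0.5):
    """Call fetch_title(url) for every URL concurrently and return {url: title}.

    `fetch_title` is a hypothetical coroutine function supplied by the caller.
    URLs whose title could not be obtained map to None.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url):
        async with semaphore:  # at most max_concurrency requests in flight
            try:
                title = await with_retries(lambda: fetch_title(url), attempts, base_delay)
            except Exception:
                title = None  # give up on this URL, keep processing the others
            return url, title

    return dict(await asyncio.gather(*(worker(u) for u in urls)))
```

A caller would pass its own coroutine, for example `asyncio.run(process_all(urls, my_fetch_title))`, and leave any URL that maps to `None` unchanged in the output.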

Requirements:
- Python 3.7+
- Required Python packages (see `requirements.txt`)
Edit the `config.yaml` file to set your preferred LLM backend provider and any necessary API keys.
- Configure your preferred LLM backend and API key in the `config.yaml` file.
- Run the script with the input Markdown file as an argument:
python3 run.py input_file.md
The processed file will be saved as `output.md` in the same directory.
To try it out, run the script on this README.md:
python3 run.py README.md
Then see what happens to the following links:
- https://www.markdownguide.org/
- https://docs.python.org/3/library/asyncio.html
- https://en.wikipedia.org/wiki/Language_model
After running `run.py` on this README, these links would be transformed into more informative, titled links. The output might look like this:
- Markdown Guide - Basic Syntax, Extended Syntax, Cheat Sheet
- asyncio — Asynchronous I/O — Python 3.11.5 documentation
- Language model - Wikipedia
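
The transformation from a plain URL to a titled link can be illustrated with a small regex-based sketch. This is an assumption about how such a replacement might be done, not the logic in `run.py`; the function `replace_plain_urls` and the `titles` mapping are hypothetical, and the pattern is deliberately simple (for example, it does not strip trailing punctuation from a URL).

```python
import re

# Match http(s) URLs that are not already a Markdown link target "](...)"
# or an autolink "<...>".
PLAIN_URL = re.compile(r"(?<!\]\()(?<!<)https?://[^\s<>)\]]+")


def replace_plain_urls(markdown, titles):
    """Replace each plain URL with [title](url) when a title is known.

    `titles` maps URL -> title (for example, titles generated by an LLM).
    URLs without a title are left unchanged.
    """
    def substitute(match):
        url = match.group(0)
        title = titles.get(url)
        if not title:
            return url
        # Escape square brackets so the title cannot break the link syntax.
        safe_title = title.replace("[", r"\[").replace("]", r"\]")
        return f"[{safe_title}]({url})"

    return PLAIN_URL.sub(substitute, markdown)
```

With `titles = {"https://www.markdownguide.org/": "Markdown Guide"}`, the line `- https://www.markdownguide.org/` becomes `- [Markdown Guide](https://www.markdownguide.org/)`, while URLs that are already inside a Markdown link are skipped.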

Supported LLM backend providers:
- OpenAI (GPT-3/4)
- Ollama
- DeepSeek
- Qwen
- ERNIE
- GLM
- Spark
- HunYuan
Contributions are welcome! Please feel free to submit a Pull Request.