Scalable PDF document processing with DataChain and Unstructured.io
Datasets + LLMs + Pydantic = DataChain... now with @huggingface! 💛
DataChain by @DVCorg just added @huggingface support! Create, load, and transform HF Datasets with LLMs easily.
- Pydantic for dataset schema
- Use your own or public HF Datasets
- Run your own or public HF Models