Visual Document NER and New Healthcare Models in NLU 5.3.1!
We are excited to announce NLU 5.3.1 has been released! It comes with Visual Document NER, enabling you to extract entities from image files like JPGs.
Additionally, 5 Healthcare Pipelines have been added for domains like Therapeutic Chemicals, HPO Resolvers, Voice of Patient, Oncology, and Generic Clinical.
Pipelines based on TextMatcherInternal are now supported as well.
Visual NER
- Tutorial Notebook
- Medium: Named Entity Recognition in Documents with Transformer Models using Visual-NLP: Part 1
- Medium: One-Liner Magic with Spark NLP: Deep Learning for NER in Documents — Part 2
VisualDocumentNER is a transformer-based model designed for Named Entity Recognition (NER) in documents. It serves as the primary interface for tasks such as detecting keys and values in datasets like FUNSD, representing the structure of a form. These keys and values are typically interconnected using a FormRelationExtractor model.
However, some VisualDocumentNER models are trained with a different approach, considering entities in isolation. These entities could be names, places, or medications, and the goal is not to connect these entities to others, but to utilize them individually.
Powered by Spark OCR's VisualDocumentNER
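As a rough illustration of the one-line workflow, the sketch below wraps `nlu.load(...).predict(...)` for a single image file. The helper names `is_supported_image` and `extract_entities_from_image`, the `SUPPORTED_EXTENSIONS` set, and the `nlu_ref` argument are assumptions made for this example. The actual Visual NER reference should be taken from the tutorial notebook, and a configured Visual NLP (Spark OCR) license is assumed.

```python
from pathlib import Path

# Assumption: extensions accepted by this sketch. The release notes only
# mention "image files like JPGs", so adjust this set as needed.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_supported_image(path):
    """Return True if the file extension is one this sketch accepts."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def extract_entities_from_image(nlu_ref, image_path):
    """Load a Visual NER pipeline by its NLU reference and run it on one image.

    `nlu_ref` is a placeholder: pass the Visual NER reference listed in the
    tutorial notebook.
    """
    if not is_supported_image(image_path):
        raise ValueError(f"Unsupported image type: {image_path}")
    # Imported lazily so the helper can be defined without Spark installed.
    import nlu

    pipeline = nlu.load(nlu_ref)
    return pipeline.predict(image_path)
```

Checking the file type before loading the pipeline keeps a mistyped path from triggering a model download first.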
New Healthcare Models
NLU ref | Model |
---|---|
en.resolve.atc_pipeline | atc_resolver_pipeline |
en.map_entity.hpo_resolver_pipe | hpo_resolver_pipeline |
en.explain_doc.pipeline_vop | explain_clinical_doc_vop |
en.explain_doc.clinical_generic.pipeline | explain_clinical_doc_generic |
en.explain_doc.clinical_oncology.pipeline | explain_clinical_doc_oncology |
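The references in the table can be passed to `nlu.load`. The following is a minimal sketch, assuming a valid healthcare license is already configured in the environment. The `HEALTHCARE_PIPELINES` dictionary simply mirrors the table, and the helper name `run_healthcare_pipeline` is hypothetical.

```python
# NLU reference -> underlying pipeline name, copied from the table above.
HEALTHCARE_PIPELINES = {
    "en.resolve.atc_pipeline": "atc_resolver_pipeline",
    "en.map_entity.hpo_resolver_pipe": "hpo_resolver_pipeline",
    "en.explain_doc.pipeline_vop": "explain_clinical_doc_vop",
    "en.explain_doc.clinical_generic.pipeline": "explain_clinical_doc_generic",
    "en.explain_doc.clinical_oncology.pipeline": "explain_clinical_doc_oncology",
}


def run_healthcare_pipeline(nlu_ref, text):
    """Run one of the new healthcare pipelines on a piece of text."""
    if nlu_ref not in HEALTHCARE_PIPELINES:
        raise ValueError(f"Unknown healthcare pipeline reference: {nlu_ref}")
    # Imported lazily; requires nlu, pyspark, and a healthcare license.
    import nlu

    return nlu.load(nlu_ref).predict(text)
```

A call could then look like `run_healthcare_pipeline("en.explain_doc.clinical_oncology.pipeline", "some clinical note text")`, where the input text is a placeholder.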
New Medium Articles
Tutorials on how to leverage Visual NLP's table extraction and Visual NER in 1 line and with custom pipelines:
- Deep Learning based Table Extraction using Visual NLP: Part 1
- One-Liner Magic with Spark NLP: Deep Learning for Table Extraction — Part 2
- Named Entity Recognition in Documents with Transformer Models using Visual-NLP: Part 1
- One-Liner Magic with Spark NLP: Deep Learning for NER in Documents — Part 2
📖 Additional NLU resources
- 140+ NLU Tutorials
- Streamlit visualizations docs
- The complete list of all 20000+ models & pipelines in 300+ languages is available on Models Hub
- Spark NLP publications
- NLU documentation
- Discussions: Engage with other community members, share ideas, and show off how you use Spark NLP and NLU!
Installation
#PyPI
pip install nlu pyspark