(work in progress)
This paper by Jorge Martinez-Gil surveys methods for measuring the similarity between open data catalogs (ODCs). It argues that, as open data initiatives grow, effective catalog similarity metrics are needed to improve data management and accessibility.
- Introduction
  - Explores the role of ODCs in data transparency and innovation.
  - Introduces the concept of catalog similarity.
- Related Works
  - Reviews literature on ODC similarity and standard data catalog vocabularies.
- Similarity Methods for Open Data Catalogs
  - 3.1: Repositories of Triples
  - 3.2: Repositories of Tokens
  - 3.3: Character Sequences
- Conclusion
  - Summarizes findings and suggests future research directions.
  - Highlights various strategies for ODC similarity measurement, including traditional and advanced semantic-based approaches.
  - Stresses the importance of selecting an appropriate method based on catalog characteristics and objectives.
  - Discusses future research possibilities in dynamic and real-time similarity assessment.
Overall, the paper emphasizes that ODC similarity can be measured in many ways and that the right approach depends on the requirements of the catalogs being compared.
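To make the three catalog representations in Section 3 (triples, tokens, character sequences) more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's method: Jaccard overlap and `difflib.SequenceMatcher` are generic measures chosen here for simplicity, and the function names and sample data are hypothetical.

```python
from difflib import SequenceMatcher


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty collections count as identical
    return len(a & b) / len(a | b)


def triple_similarity(triples_a, triples_b):
    """Catalogs as sets of (subject, predicate, object) triples."""
    return jaccard(triples_a, triples_b)


def token_similarity(text_a, text_b):
    """Catalogs as bags of lowercased, whitespace-separated tokens."""
    return jaccard(text_a.lower().split(), text_b.lower().split())


def char_similarity(text_a, text_b):
    """Catalogs as raw character sequences (difflib matching ratio)."""
    return SequenceMatcher(None, text_a, text_b).ratio()


if __name__ == "__main__":
    # Hypothetical sample data, invented for this illustration only.
    cat_a = {("ds1", "dct:title", "Air quality"),
             ("ds1", "dcat:theme", "environment")}
    cat_b = {("ds1", "dct:title", "Air quality"),
             ("ds1", "dcat:theme", "health")}
    print(triple_similarity(cat_a, cat_b))
    print(token_similarity("Air quality measurements 2023",
                           "air quality report 2023"))
    print(char_similarity("Air quality", "Air-quality"))
```

Each function returns a score in the range 0 to 1. Which representation fits best depends on the catalogs at hand: triples require structured (e.g. RDF-style) metadata, while tokens and character sequences work on plain text.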
For full details, refer to the complete paper.
If you use this work, please cite:
@article{martinez2023overview,
  title={An Overview of Approaches to Quantify Open Data Catalog Similarity},
  author={Martinez-Gil, Jorge},
  year={2023}
}
The material is provided under the MIT License.