Small datasets and files in many formats (including sample genomic files), for testing cloud buckets, SQL, NoSQL, Spark, or machine learning services.
- GCP Public Datasets - https://console.cloud.google.com/marketplace/browse?filter=solution-type:dataset
- Google Dataset Search - https://datasetsearch.research.google.com/
- AWS Public Datasets - https://aws.amazon.com/blogs/aws/tag/public-data-sets/
- AWS Registry of Open Data - https://registry.opendata.aws/
- Azure Open Datasets Catalog - https://azure.microsoft.com/en-us/services/open-datasets/catalog/
- Microsoft Research Open Data - https://msropendata.com/
- DataHub Dataset Collections - https://datahub.io/collections