- Apache Lucene
- Lucene++
- Apache Solr
- Open Semantic Search
- Subprojects
- Solr PHP UI
- Subprojects
- Elasticsearch
- Other Projects
- dejavu
- Fess
- Searchkit
- Other Projects
- OpenSearch
- Other Projects
- Gigablast
- YaCy
- Articles
- Vald
- Weaviate
- MWMBL
- Alexandria
- Wiby
- OpenSearchServer
- Metasearch
- MetaGer
- Not Web Scale
- meilisearch
- Typesense
- Smaller Engines
- Sonic
- ZincSearch
- https://lucene.apache.org/
- The open source Java library that powers Apache Solr and Elasticsearch, among many other search projects.
- https://github.com/luceneplusplus/LucenePlusPlus
- An open source C++ port of Lucene.
- https://solr.apache.org/
- See also the dedicated pages on Solr.
- https://opensemanticsearch.org/
- Under the hood it runs Apache Solr, but there are some significant changes that make listing Open Semantic Search separately worthwhile.1
- Solr PHP UI - Stars: 20 - Updated: 12/2021 - Checked: 2/2024
- A frontend for Open Semantic Search.
- GitHub Repo
- Solr Ontology Tagger - Stars: 39 - Updated: 1/2022 - Checked: 5/2023
- Solr Synonames - Stars: 5 - Updated: 10/2020 - Checked: 5/2023
- https://elastic.co/
- See also the dedicated pages on Elasticsearch.
- dejavu - Open source, JS web-based UI for Elasticsearch and OpenSearch.
- Fess - Open source, enterprise search server with web crawler and GUI. Written in Java.
- Searchkit - Updated: 3/2023 - Checked: 3/2023 - Stars: 4.6k - Open source library for building search UIs with JS, React, Vue, Angular, etc. Written primarily in TypeScript.
- https://opensearch.org/
- An open source fork of Elasticsearch started by Amazon.2
- See also the dedicated pages on OpenSearch.
- Please see Other Projects under Elasticsearch. Only projects that are for OpenSearch exclusively will be listed here.
- https://gigablast.com/
- GitHub Repo
- Founded in 2000 by Matt Wells as a closed source search engine, it was later open sourced. It is written in C++, is distributed, and includes both the engine and a crawler.
- Please see the dedicated page on YaCy.
- https://vald.vdaas.org/
- GitHub Repo
- An open source, distributed vector search engine written in Go and used by Yahoo Japan.
- https://weaviate.io/
- GitHub Repo
- Open source vector search engine written in Go.
- Semantic Search through Wikipedia with Weaviate
- https://mwmbl.org/
- GitHub Repo
- Open source, non-profit search engine written in Python.3
- https://www.alexandria.org/
- GitHub Repo
- Open source search engine that uses CommonCrawl and is written in C++.
- https://wiby.me/
- GitHub Repo
- Installation and Setup Instructions
- Open source search engine written in PHP, C, and Go.
- https://www.opensearchserver.com/
- GitHub Repo
- Open source search engine written in Java; includes a bundled crawler.
- Note: No updates since 8/2021 as of 3/2023.
- https://metager.org/
- Git Repo
- Open source metasearch engine run by a nonprofit.
- https://www.meilisearch.com/
- GitHub Repo
- An open source search engine written in Rust.
- https://typesense.org/
- GitHub Repo
- An open source Algolia alternative written in C/C++.4
- Sonic - Updated: 1/2023 - Checked: 3/2023 - Stars: 18k - A lightweight, speedy search backend written in Rust.
- ZincSearch - Updated: 3/2023 - Checked: 3/2023 - Stars: 14.7k - Lightweight alternative to Elasticsearch, written in Go. Includes a web UI.
Footnotes
- It isn't meant for web search in particular, but it offers a number of features that could be useful in a search engine, e.g. exploratory search as well as collaborative annotation and tagging. ↩
- The fork was started following controversial licensing changes by Elasticsearch. For more on the history of this controversy see Graham Gillen's Elasticsearch vs OpenSearch series. For a brief evaluation of OpenSearch's progress see Matt Asay's One year of OpenSearch: Grading AWS’ open source effort. ↩
- The project has some similarities with what I'm looking to do with Phoebe. It is open source, a non-profit, and the code is written in Python. ↩
- Some interesting functionality includes tunable ranking, sorting, faceting & filtering, grouping & distinct, federated search, and curation. It doesn't appear to be in web scale usage, but they've expressed interest in benchmarking larger datasets, so I submitted an issue requesting that CommonCrawl be benchmarked. ↩