ordering of search results is affected by Max Results #89
Comments
I would be willing to work on this.
How would you do it? Basically, the search engine shows the best result on top. If you search over a larger corpus, you can find better matches, which displace other results and change the ordering.
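A minimal sketch of that effect, not the project's actual code: the `Match` type, the `search` function, and the scores are all hypothetical. It assumes the engine scans candidates in a fixed order, stops once it has collected the requested number, and only then sorts what it found by score.

```go
package main

import (
	"fmt"
	"sort"
)

// Match is a hypothetical search hit; Score stands in for match quality.
type Match struct {
	Doc   string
	Score float64
}

// search takes the first maxResults candidates in scan order and only
// then sorts them best-first. This models "a larger limit searches a
// larger corpus"; it is an assumption, not the real implementation.
func search(candidates []Match, maxResults int) []Match {
	n := maxResults
	if n > len(candidates) {
		n = len(candidates)
	}
	if n < 0 {
		n = 0
	}
	// Copy so that sorting does not reorder the caller's slice.
	found := append([]Match(nil), candidates[:n]...)
	sort.SliceStable(found, func(i, j int) bool {
		return found[i].Score > found[j].Score
	})
	return found
}

func main() {
	// Candidates in the order the engine would encounter them.
	candidates := []Match{
		{"a.go", 0.4},
		{"b.go", 0.6},
		{"c.go", 0.9},
		{"d.go", 0.5},
	}
	fmt.Println(search(candidates, 2)) // order: b.go, a.go
	fmt.Println(search(candidates, 4)) // order: c.go, b.go, d.go, a.go
}
```

With a limit of 2 the top hit is `b.go`. With a limit of 4 the better match `c.go` is reached and takes the top spot, and `d.go` lands between `b.go` and `a.go`.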
One possibility would be to present the results in the order they occur within the posting lists. Items in posting lists could have an additional weight field corresponding to their estimated general relevance. The posting lists could be sorted according to the weights, and re-sorted as necessary. That would degrade the ordering though. The question needs some more thought.
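A rough sketch of this idea, with hypothetical names throughout (`Posting`, `sortByWeight`, `firstN`) and made-up weights. It assumes a single posting list whose entries carry a query-independent weight, and that results are returned in list order with no re-ranking.

```go
package main

import (
	"fmt"
	"sort"
)

// Posting is a hypothetical posting-list entry with the extra weight field.
type Posting struct {
	Doc    string
	Weight float64 // estimated general relevance, independent of the query
}

// sortByWeight orders a posting list by descending weight.
// Entries with equal weight keep their relative order.
func sortByWeight(list []Posting) {
	sort.SliceStable(list, func(i, j int) bool {
		return list[i].Weight > list[j].Weight
	})
}

// firstN returns up to maxResults documents in posting-list order,
// without any per-query re-ranking.
func firstN(list []Posting, maxResults int) []string {
	var docs []string
	for _, p := range list {
		if len(docs) >= maxResults {
			break
		}
		docs = append(docs, p.Doc)
	}
	return docs
}

func main() {
	list := []Posting{{"test.go", 0.2}, {"core.go", 0.9}, {"util.go", 0.5}}
	sortByWeight(list)
	fmt.Println(firstN(list, 2)) // order: core.go, util.go
	fmt.Println(firstN(list, 3)) // order: core.go, util.go, test.go
}
```

Under these assumptions the smaller result set is always a prefix of the larger one, so raising the limit never reorders earlier results. The cost is that the order ignores how well each entry matches the query, which is the degradation mentioned above.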
@ijt - If you mean "index shard" when you say "posting list", this is exactly how it works already. Within a shard, files are ordered by importance (important files first), so, e.g., all else being equal, you get matches from non-test files before test files. Then the shards themselves are ordered by a "quality" score, which is mainly derived from the GitHub star count. So matches in github.com/google/guava get preference over matches in android.googlesource.com/platform/external/guava, even though the content is the same. The problem is that matches themselves have quality. If you are looking for "idiot", then the word "idiot" in an unimportant shard is a better match than the identifier "bidiOther" in an important shard. If you increase the result count to include the unimportant shard, this will inevitably upset the ordering. One way out of this is to have a cheaper way to find quality matches. For example, currently we have
Increasing the max results can affect the ordering of the search results. Here is an example.
Having stable ordering of search results would be a useful property and less surprising for users.