Solr/Lucene Test Queries for Koha

From Koha Wiki

Add Solr/Lucene test queries here, each targeting a particular piece of expected record retrieval functionality.

The test queries are organised by query type so that we do not omit any important type of query. Please also help improve the organisation.

We need good test queries that exercise each query feature in isolation. Real queries usually combine several query features, so test queries that combine features are welcome too. We need both kinds: feature isolation test queries and feature combination test queries.

We want your test queries. If you have trouble classifying your test query and have no time to improve the organisation, then please take a guess about where to add your query or add it to the unclassified queries table.

See Solr/Lucene Documentation for documentation of Solr/Lucene query syntax.

Character Set Encoding in Queries

Topic | Query | Results | Development Status | Notes
Encoding | 体验汉语100句留学类 | Tiyan hanyu100ju liuxuelei | developed | [Is the "100" correct, or has the character set encoding been corrupted for the query?]
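
When checking whether an encoding problem comes from the query rather than from the index, it may help to confirm how the query is percent-encoded before it reaches Solr. The following Python sketch builds a select URL for the query above and checks that the UTF-8 text survives a round trip. The endpoint URL and the core name `koha` are assumptions for illustration, not the actual Koha configuration.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical Solr endpoint; host, port, and core name are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/koha/select"


def build_select_url(query: str) -> str:
    """Return a select URL with the query percent-encoded as UTF-8."""
    params = {"q": query, "wt": "json"}
    return SOLR_SELECT_URL + "?" + urlencode(params, encoding="utf-8")


def decode_query(url: str) -> str:
    """Recover the q parameter from a URL to check the round trip."""
    return parse_qs(urlsplit(url).query, encoding="utf-8")["q"][0]


url = build_select_url("体验汉语100句留学类")
print(url)
# True when the query text, including the "100", is unchanged by encoding.
print(decode_query(url) == "体验汉语100句留学类")
```

If the round trip succeeds but Solr still returns unexpected results, the corruption is more likely to be on the server or indexing side than in the query string.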

Punctuation in Queries

Topic | Query | Results | Development Status | Notes
Elisions | d'alain;l'amour | "Amours sauvages"; "Trop de soleil tue l'amour" | developed |

Record Schema Queries

Topic | Query | Results | Development Status | Notes

Connectors Queries

Boolean Connectors Queries

Examples include AND, OR, and NOT queries.

Topic | Query | Results | Development Status | Notes
AND implicit connector without index specified | maison prairie | Petite maison dans la prairie | |
NOT operator search with different indexes specified | str_publisher:Ed* NOT str_author:Atkinson* | published by "Ed*" with an author different from "Atkinson*" | should work |

Proximity Connectors Queries

Topic | Query | Results | Development Status | Notes
Proximity Search | "nuit jour"~5 | "Mes nuits sont plus belles que vos jours"; "La Nuit, le jour et toutes les autres nuits" | developed |

Term Set Grouping (Nested) Queries

Grouping of terms as with parentheses, such as ((("A B") OR ("A C")) AND (D NEAR E)) NOT (X OR Y).

Topic | Query | Results | Development Status | Notes
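
The grouping example above uses a generic NEAR connector. The standard Lucene query syntax expresses proximity as a quoted phrase followed by `~N` instead, so a nested test query has to be rewritten accordingly. The Python sketch below composes such a query from small helpers. The helper names are hypothetical and the proximity distance of 5 is an assumption chosen only for illustration.

```python
def phrase(text: str) -> str:
    """Quote a phrase, escaping any embedded double quotes."""
    return '"' + text.replace('"', '\\"') + '"'


def near(first: str, second: str, distance: int) -> str:
    """Proximity clause written in Lucene's "a b"~N form."""
    return f'"{first} {second}"~{distance}'


def any_of(*clauses: str) -> str:
    """Parenthesised OR group."""
    return "(" + " OR ".join(clauses) + ")"


def all_of(*clauses: str) -> str:
    """Parenthesised AND group."""
    return "(" + " AND ".join(clauses) + ")"


def excluding(clause: str, excluded: str) -> str:
    """Append a NOT clause to an existing clause."""
    return f"{clause} NOT {excluded}"


query = excluding(
    all_of(any_of(phrase("A B"), phrase("A C")), near("D", "E", 5)),
    any_of("X", "Y"),
)
print(query)
# (("A B" OR "A C") AND "D E"~5) NOT (X OR Y)
```

Building test queries from helpers like these keeps the parentheses balanced, which matters when several levels of grouping are combined.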

Term Format (Structure) Queries

Word, word list, phrase, number, string, and date queries are some examples of term format (structure) queries.

Topic | Query | Results | Development Status | Notes
Word list search without index specified | maison prairie | Petite maison dans la prairie | |
Phrase Simple Search with index specified | str_publisher:"Ed. France-Empire" | "Le crépuscule des maudits" | developed |
Expression without index specified | petite prairie' | NOT petite maison dans la prairie | | [Is this a phrase query?]
Date Search with index specified | srt_date_acqdate:"2010-08-31T00:00:00Z" | with date in ISO format | developed |

Relation Queries

Numeric and String Comparisons Queries

Examples include less than, greater than, equality, exact equality, and range within.

Topic | Query | Results | Development Status | Notes
Less than or equal to date specified by range beginning masking | srt_date_pubdate:[* TO 2000-01-01T00:00:00Z] | published before 2000 | | [If pubdate contains dates derived from years, then this query would return results from before 2001, expressed as 2000 and before.]
Greater than or equal to date specified by range end masking | srt_date_pubdate:[2000-01-01T00:00:00Z TO *] | published after 2000 | | [This query should return results published after 1999, expressed as from 2000 and later.]
Greater than or equal to callnumbers with masking and range end specified by masking | srt_str_callnumber:[200* TO *] | callnumber over 200 | | [This query should return results for call numbers over 199.999, expressed as 200 and greater.]
Range within Date Search explicitly | srt_date_pubdate:[1995-01-01T00:00:00Z TO 2000-01-01T00:00:00Z] | with date in range | developed |
Range within dates explicitly | srt_date_pubdate:[2000-01-01T00:00:00Z TO 2004-01-01T00:00:00Z] | published between 2000 and 2004 | | [This query should return records published within the range from 2000 to 2003 unless only the year is significant. The terminating date and time of the query should be 2004-12-31T23:59:59Z to ensure complete coverage of 2004 even when more than the year has been specified for some pubdate dates.]
Range within dates specified by masking | srt_date_pubdate:1999* | with date beginning with "1999" | developed | [Simple means of specifying a year. We should determine whether explicit range or masking is more efficient.]
Range within callnumbers with masking | srt_str_callnumber:[3?B* TO 99*] | callnumber from 1 to 99 (should be numerically sorted) | |
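
The notes above point out that a year range only covers its final year completely when the range ends at 23:59:59 on 31 December. The Python sketch below generates date range queries with that end point, and uses `*` for an open-ended side. The function name is hypothetical; the field name `srt_date_pubdate` is taken from the queries in the table.

```python
from typing import Optional


def year_range_query(field: str,
                     first_year: Optional[int],
                     last_year: Optional[int]) -> str:
    """Build an inclusive date range query covering whole years.

    Passing None for either year leaves that side of the range open.
    """
    start = "*" if first_year is None else f"{first_year:04d}-01-01T00:00:00Z"
    end = "*" if last_year is None else f"{last_year:04d}-12-31T23:59:59Z"
    return f"{field}:[{start} TO {end}]"


# Published between 2000 and 2004, with all of 2004 included.
print(year_range_query("srt_date_pubdate", 2000, 2004))
# srt_date_pubdate:[2000-01-01T00:00:00Z TO 2004-12-31T23:59:59Z]

# Published in 2000 or later.
print(year_range_query("srt_date_pubdate", 2000, None))
# srt_date_pubdate:[2000-01-01T00:00:00Z TO *]
```

Generating the end points in one place avoids the off-by-one-year ambiguity discussed in the notes, whichever interpretation of "before 2000" or "between 2000 and 2004" is finally chosen.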

Case Sensitivity Selection Queries

Queries which select whether to ignore or respect case sensitivity.

Topic | Query | Results | Development Status | Notes

Character Accent Sensitivity Selection Queries

Queries which select whether to ignore or respect character accent sensitivity.

Topic | Query | Results | Development Status | Notes

Stemming Queries

Queries which match word morphology variants.

Topic | Query | Results | Development Status | Notes
Stemming Search | juge | "Le jugement du soir" | developed |

Phonetic Similarity Queries

Topic | Query | Results | Development Status | Notes
Metaphone Search | amr | "Amours sauvages"; "Trop de soleil tue l'amour" | should work | Phonetic search gives results I do not understand for the moment, so it is disabled.

Partial Match Queries

The query term may partly overlap the indexed term, extend beyond it, or be contained within it. Examples: Query term 'autobus' extends beyond indexed term 'bus'. Query term 'end' is contained within indexed term 'endgame'.

Topic | Query | Results | Development Status | Notes
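
To make the two containment cases above concrete when writing expected results, the following Python sketch classifies a query term against an indexed term. It is plain string comparison for illustration only and does not reflect how Solr analyses or matches terms. The function name and the returned labels are hypothetical, and terms that merely share a fragment are not detected.

```python
def partial_match_kind(query_term: str, indexed_term: str) -> str:
    """Classify how a query term relates to an indexed term."""
    query = query_term.lower()
    indexed = indexed_term.lower()
    if query == indexed:
        return "exact"
    if indexed in query:
        # The indexed term is a substring of the longer query term.
        return "query extends beyond indexed term"
    if query in indexed:
        # The query term is a substring of the longer indexed term.
        return "query contained within indexed term"
    return "no containment"


print(partial_match_kind("autobus", "bus"))
# query extends beyond indexed term
print(partial_match_kind("end", "endgame"))
# query contained within indexed term
```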

Fuzzy Queries

Topic | Query | Results | Development Status | Notes
Fuzzy search | mautid~ | "Le crépuscule des maudits" | developed |
Fuzzy search | maud~ | "Le crépuscule des maudits" | developed |
Fuzzy search | alain~ | "Les proverbez d'Alain by Thomas Maillet (?)"; "Les mystères de la procession de Lille by édition critique par Alan E. Knight" | developed |

Spelling Suggestion Queries

Topic | Query | Results | Development Status | Notes
Spell Check | | | not yet supported |

Synonyms Queries

Topic | Query | Results | Development Status | Notes
Synonym search | damnés | "Le crépuscule des maudits" | developed | Depends on how the synonyms file is configured.

Relevancy and Result Set Sort Order Queries

Topic | Query | Results | Development Status | Notes
Callnumber ranges | srt_str_callnumber:[3?B* TO 99*] | callnumber from 1 to 99 (should be numerically sorted) | |
Callnumber ranges | srt_str_callnumber:[200* TO *] | callnumber over 200 | |
ordered by integer | | | | Can you be more precise, please? Which index? [Maybe this row was meant to refer to the row above, ordering by callnumber numerically. However, numerical sorting on callnumber needs a normalised callnumber for machine sorting, different from the form meant for humans to read, which, even if it is a DDC or UDC number, may have a Cutter code and copy number appended and a collection code prefixed.]
Boosting Keywords | livre grand^2 | "grand" would have a better weight in the query than "livre" | should work |

Locale Specific Queries

Specify sorting of the result set according to a specific language locale allowing overriding defaults.

Topic | Query | Results | Development Status | Notes

Always Matches Queries

Queries which match every record regardless of its content.

Topic | Query | Results | Development Status | Notes
Simple Search | *:* | all biblios | developed |

Index Selection Queries

Topic | Query | Results | Development Status | Notes
Serial | srt_str_ccode:"Revue" | any serial | |
Serial | srt_str_ccode:"Revue" txt-title-cover:science | any serial with title containing science | |
Serial | srt_str_ccode:"Revue" txt-title-cover:science str_homebranch:A | any serial with title containing science from library A | |


Position Queries

Specifies matching the query term from a particular character position relative to the beginning or end of the specified indexed field or field part. Queries requiring matches to start from the beginning of a field or field part are the most common.

Topic | Query | Results | Development Status | Notes

Truncation (Masking) Queries

Topic | Query | Results | Development Status | Notes
Masking all indexes and query terms Simple Search | *:* | all biblios | developed |
Simple Search | maudit | "Le crépuscule des maudits" | developed | [Is this result from implicit default right truncation or stemming?]
Single character masking Simple Search | str_publisher:Ed.?France-Emp?re | "Le crépuscule des maudits" | developed |
Right truncation Simple Search | str_publisher:Ed.* | "Le crépuscule des maudits" | developed |
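
When preparing expected results for masking queries, it can be useful to check by hand which stored values a mask should match. The Python sketch below uses the standard library's `fnmatchcase`, whose `?` and `*` wildcards behave like single-character and multi-character masks on a whole string. This is only an approximation for untokenised string fields: actual Solr matching depends on how the field is analysed, and masks containing square brackets would be interpreted differently by `fnmatchcase`. The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def mask_matches(mask: str, stored_value: str) -> bool:
    """Return True if the whole stored value matches the mask.

    '?' stands for exactly one character and '*' for any run of
    characters, including none. Matching is case-sensitive.
    """
    return fnmatchcase(stored_value, mask)


publisher = "Ed. France-Empire"

# Single character masking: each '?' replaces exactly one character.
print(mask_matches("Ed.?France-Emp?re", publisher))   # True

# Right truncation: '*' replaces everything after "Ed.".
print(mask_matches("Ed.*", publisher))                # True

# Without a trailing '*', the rest of the value is not covered.
print(mask_matches("Ed.?France", publisher))          # False
```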

Regular Expression Queries

Topic | Query | Results | Development Status | Notes

Field or Part Completeness Queries

Complete match between the query term and all the contents of the indexed field or other part of the indexed record.

Topic | Query | Results | Development Status | Notes

Unclassified Queries

Topic | Query | Results | Development Status | Notes
Search by similarity | | | not yet supported |