From Koha Wiki
We also have a page for technical detail to help you start working on it.
Essentially, the short-term goal is to allow Zebra to be turned off and have everything still work.
- Replace OPAC biblio search with ES [done]
- Replace OPAC authority search with ES [done]
- Replace staff client biblio search with ES [done]
- Replace staff authority search with ES [done]
- Improve general integration (packages, realtime indexing, etc.)
- Ensure that zebra still works properly [in progress]
- Add a UI that makes it possible to edit the mappings in the staff client [in progress]
- Add a browse interface [done]
- Ingest data from other sources into ES so Koha can search it natively
- Any other neat ideas that are possible by having a flexible search engine
- Make code release-ready
Basic 'Get Started' Steps
If you are familiar with kohadevbox, you just need to run
$ KOHA_ELASTICSEARCH=1 vagrant up
Manual install (master)
Have a Debian Jessie server.
Use the Koha community unstable repository, the Koha nightly unstable repository and jessie-backports (for openjdk-8). Add the following to /etc/apt/sources.list:
# jessie-backports
deb http://ftp.debian.org/debian/ jessie-backports main
# koha unstable
deb http://debian.koha-community.org/koha unstable main
# koha nightly unstable
deb https://apt.abunchofthings.net/koha-nightly unstable main
You will need apt via https
$ sudo apt install apt-transport-https
$ sudo apt update
$ sudo apt install -t jessie-backports openjdk-8-jre-headless
Install Elasticsearch 5.x. Short story:
$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-5.x.list
Install the analysis-icu plugin
$ sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install analysis-icu
Install the koha-elasticsearch metapackage
$ sudo apt-get install koha-elasticsearch
If you get the following error during install:
Couldn't write '262144' to 'vm/max_map_count', ignoring: Permission denied
skip setting the kernel parameters and install Elasticsearch again:
$ export ES_SKIP_SET_KERNEL_PARAMETERS=true
$ apt-get install elasticsearch
Add a configuration like this to your koha-conf.xml:
<elasticsearch>
    <server>localhost:9200</server> <!-- may be repeated to include all servers on your cluster -->
    <index_name>koha_robin</index_name> <!-- should be unique amongst all the indices on your cluster. _biblios and _authorities will be appended. -->
</elasticsearch>
Note: you can replace localhost with the hostname or IP address of an external Elasticsearch server, if you have one running elsewhere.
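Before going further, it can help to confirm that the server named in the configuration actually answers. The following is only a minimal sketch and not part of Koha: es_base_url and es_ping are hypothetical helper names, and it assumes plain HTTP without authentication and that curl is installed. A running Elasticsearch node normally answers a GET on its root URL with a small JSON document that includes its version number.

```shell
# Hypothetical helpers (not shipped with Koha) for checking a <server> value.

# Build the base URL for a host:port pair as written in <server>.
# Assumption: the server speaks plain HTTP without authentication.
es_base_url() {
    printf 'http://%s\n' "$1"
}

# Fetch the root endpoint; prints the node's JSON banner,
# or returns a non-zero status if the server cannot be reached.
es_ping() {
    curl --silent --show-error --max-time 5 "$(es_base_url "$1")"
}
```

Usage: es_ping localhost:9200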
Advanced Connection Settings
N.B. The following configuration options are only available from Koha version 19.05.
You may need to add other options e.g. if your Elasticsearch server is on a different network. Here are the available configuration keys and their default values:
client: 5_0::Direct
cxn_pool: Sniff
request_timeout: 60
Example configuration for connecting to Elasticsearch 6.x server on another network:
<elasticsearch>
    <server>some.other.network:9200</server> <!-- may be repeated to include all servers on your cluster -->
    <index_name>koha_robin</index_name> <!-- should be unique amongst all the indices on your cluster. _biblios and _authorities will be appended. -->
    <client>6_0::Direct</client> <!-- Client version to use -->
    <cxn_pool>Static</cxn_pool> <!-- Use Static connection pool (see https://metacpan.org/pod/Search::Elasticsearch#cxn_pool for more information) -->
</elasticsearch>
Advanced Index Settings
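By default, when you reset index mappings in Koha's administration or drop and recreate an index, the settings are read from the following files:
- admin/searchengine/elasticsearch/index_config.yaml
- admin/searchengine/elasticsearch/field_config.yaml
- admin/searchengine/elasticsearch/mappings.yaml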
If you'd like to use customized versions of these files, you can override any or all of the defaults by using the respective settings in koha-conf.xml (note that they reside outside the elasticsearch element):
<elasticsearch>
    <server>localhost:9200</server>
    <index_name>koha_robin</index_name>
</elasticsearch>
<elasticsearch_index_config>/etc/koha/searchengine/elasticsearch/index_config.yaml</elasticsearch_index_config>
<elasticsearch_field_config>/etc/koha/searchengine/elasticsearch/field_config.yaml</elasticsearch_field_config>
<elasticsearch_index_mappings>/etc/koha/searchengine/elasticsearch/mappings.yaml</elasticsearch_index_mappings>
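A typo in one of these paths is easy to miss, so a quick check of the configured files may be worthwhile. The sketch below is not part of Koha: check_es_overrides is a hypothetical helper, and it assumes that each setting appears at most once in koha-conf.xml in the form <name>/path</name>. It reports, for each of the three settings, whether it is unset, points to a readable file, or points to a missing one.

```shell
# Hypothetical helper (not shipped with Koha): report the state of the three
# override settings found in a koha-conf.xml file.
# Assumption: each setting appears at most once, as <name>/path</name>.
check_es_overrides() {
    conf="$1"
    for key in elasticsearch_index_config elasticsearch_field_config elasticsearch_index_mappings; do
        # Extract the text between the opening and closing tag, if present.
        path=$(sed -n "s|.*<$key>\(.*\)</$key>.*|\1|p" "$conf" | head -n 1)
        if [ -z "$path" ]; then
            echo "$key: not set (Koha default is used)"
        elif [ -r "$path" ]; then
            echo "$key: $path (readable)"
        else
            echo "$key: $path (MISSING)"
        fi
    done
}
```

Usage: check_es_overrides /path/to/koha-conf.xml (the location of koha-conf.xml depends on your installation).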
If Elasticsearch is updated to a new version, a corresponding analysis-icu plugin version will also need to be installed. Otherwise Elasticsearch may fail to start and an error such as the following will be logged in /var/log/elasticsearch/elasticsearch.log:
java.lang.IllegalArgumentException: plugin [analysis-icu] is incompatible with version [5.6.13]; was designed for version [5.6.12]
A new version of the plugin can be installed as follows:
$ sudo /usr/share/elasticsearch/bin/elasticsearch-plugin remove analysis-icu; sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install analysis-icu
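If Elasticsearch does not come back after an upgrade, searching the log for this specific message is a quick way to tell whether the plugin is the cause. This is a minimal sketch with a hypothetical helper name (icu_plugin_errors); the default path is the log location mentioned above, which may differ on your system.

```shell
# Hypothetical helper: print analysis-icu incompatibility errors from a log file.
# Returns a non-zero status when no such error is found.
icu_plugin_errors() {
    log="${1:-/var/log/elasticsearch/elasticsearch.log}"
    grep -F 'plugin [analysis-icu] is incompatible with version' "$log"
}
```

Usage: icu_plugin_errors, or icu_plugin_errors /some/other/elasticsearch.log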
Set the 'SearchEngine' system preference to 'Elasticsearch'
sudo koha-shell kohadev
cd kohaclone
misc/search_tools/rebuild_elastic_search.pl -v -d
On a manual setup, just run:
misc/search_tools/rebuild_elastic_search.pl -v -d
You have to start Elasticsearch before you can do the indexing.
(Note: you may get the error "Can't locate Catmandu/Importer/MARC.pm". If so, install the libcatmandu-marc-perl package and do NOT install this module from CPAN, which doesn't work.)
Do a search in the OPAC and/or staff client, and notice that it all works beautifully.
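If searches return nothing, one quick check is whether the rebuild actually put documents into the indexes. As noted in the configuration comments, Koha appends _biblios and _authorities to the configured index_name. The sketch below is not part of Koha: the helper names are hypothetical, and it assumes plain HTTP without authentication and uses Elasticsearch's _count endpoint to report the number of documents in each index.

```shell
# Hypothetical helpers (not shipped with Koha).

# Print the two index names Koha derives from <index_name>.
koha_es_indices() {
    printf '%s_biblios\n%s_authorities\n' "$1" "$1"
}

# Print the _count response for each index.
# Assumption: plain HTTP, no authentication.
# $1 = host:port from <server>, $2 = value of <index_name>.
koha_es_counts() {
    for idx in $(koha_es_indices "$2"); do
        printf '%s: ' "$idx"
        curl --silent --show-error --max-time 5 "http://$1/$idx/_count"
        echo
    done
}
```

Usage: koha_es_counts localhost:9200 koha_robin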
Search engine configuration
N.B. Many of the configuration options are only available from Koha version 18.11.
There is a default set of fields configured for the search indexes (see admin/searchengine/elasticsearch/mappings.yaml), but you can also use the "Search engine configuration" page, available from the Koha Administration page, to modify the mappings, the field weights for relevance ranking, etc. There's also an elasticsearch_index_mappings setting in koha-conf.xml that can be used to set a customized mappings.yaml as the defaults file used when the mappings are reset.
You can always reset the mappings to Koha's defaults by using the "Reset Mappings" button. Note that any customizations will be lost forever if the mappings are reset.
The Search fields tab allows you to view all the available search fields, their types and weights. Note that changing weights will take effect immediately after the changes have been saved.
MARC to Elasticsearch mappings
The Bibliographic records and Authorities tabs contain the mappings from MARC fields to search fields. Any changes here will only be reflected on existing records in the search index after rebuild_elastic_search.pl (see above) has been run.
The mapping column has the following syntax:
Description: Takes positions 35-37 from field 008
For data fields there are multiple possibilities depending on what is needed.
Description: Takes subfields a and b from field 245 and adds them to separate search fields. This will treat the subfields as completely separate values.
Description: Takes subfields a and b from field 245 and adds them to a single search field separated by a space. This will treat the subfields as belonging together so that e.g. phrase and proximity searches work properly and only a single value is displayed in facets. Note that a simple search still works with any of the words in the phrase, so there's normally no need to index related subfields separately.
Description: Combination of both above.
Note that all data field mappings will automatically handle alternate script fields (880) for MARC 21 records.
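To see concrete mapping values in use, the defaults file can be inspected directly. The following is a minimal sketch with a hypothetical helper name (list_marc_mappings). It assumes that each mapping entry in mappings.yaml carries its MARC expression on a line of the form "marc_field: VALUE"; check this against the file shipped with your Koha version.

```shell
# Hypothetical helper: list the distinct MARC mapping expressions in a mappings file.
# Assumption: each entry has a line of the form "marc_field: VALUE".
list_marc_mappings() {
    # Strip leading indentation, list dashes and the key, keep only the value.
    sed -n 's/^[[:space:]-]*marc_field:[[:space:]]*//p' "$1" | sort -u
}
```

Usage: list_marc_mappings admin/searchengine/elasticsearch/mappings.yaml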
Advanced Field Configuration
There are two files that control the behavior of the search index: admin/searchengine/elasticsearch/index_config.yaml and admin/searchengine/elasticsearch/field_config.yaml. If necessary, these files can be copied e.g. to the etc directory and elasticsearch_index_config and elasticsearch_field_config in koha-conf.xml set to point to them.
index_config.yaml contains the configuration for the analysis chain used to process any search terms for indexing or searching. You may need to e.g. modify the settings of the ICU folding filter to preserve the correct sort order. See ICU Folding Filter documentation for an example of how to configure the index for Swedish/Finnish folding. See Analysis documentation for more information on how the analysis works in Elasticsearch.
field_config.yaml contains the field configuration used to set up the Elasticsearch index. There's normally no need to modify this file.
For any changes to these files to take effect, rebuild_elastic_search.pl will need to be run with the -d parameter that forces the index to be recreated.