Elasticsearch

From Koha Wiki

Elasticsearch Support

This is a work in progress. Details are on Bug 12478. There is also an RFC-style document for high-level descriptions. A kanban-ish TODO list also exists.

We also have a page of technical details to help you start working on it.

Goals/Status

Essentially, the short-term goal is to allow Zebra to be turned off with everything still working.

Functional Goals

Short term:

  • Replace OPAC biblio search with ES [done]
  • Replace OPAC authority search with ES [done]
  • Replace staff client biblio search with ES [done]
  • Replace staff authority search with ES [done]
  • Improve general integration (packages, realtime indexing, etc.)
  • Ensure that Zebra still works properly [in progress]

Medium term:

  • Add a UI that makes it possible to edit the mappings in the staff client [in progress]
  • Add a browse interface [done]

Long term:

  • Ingest data from other sources into ES so Koha can search it natively
  • Any other neat ideas that are possible by having a flexible search engine

Non-functional Goals

Medium term:

  • Make code release-ready

Basic 'Get Started' Steps

Install elasticsearch

kohadevbox

If you are familiar with kohadevbox, you just need to run

 $ KOHA_ELASTICSEARCH=1 vagrant up

Manual install (master)

Have a Debian Jessie server.

Use the Koha community unstable repository, the Koha nightly unstable repository, and jessie-backports (for openjdk-8). Add the following to /etc/apt/sources.list:

# jessie-backports
deb http://ftp.debian.org/debian/ jessie-backports main
# koha unstable
deb http://debian.koha-community.org/koha unstable main
# koha nightly unstable
deb https://apt.abunchofthings.net/koha-nightly unstable main


You will need HTTPS transport support for apt:

$ sudo apt install apt-transport-https
$ sudo apt update


Install openjdk-8-jre-headless

$ sudo apt install -t jessie-backports openjdk-8-jre-headless


Install Elasticsearch 5.x. Short story:

$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-5.x.list
$ sudo apt update
$ sudo apt install elasticsearch

Long story: https://www.elastic.co/guide/en/elasticsearch/reference/current/deb.html


Install the analysis-icu plugin

$ sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install analysis-icu
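
To confirm that the plugin is in place, the same tool can list what is installed. The following is a minimal sketch, assuming the package layout used in the command above; the helper name has_analysis_icu is made up for illustration.

```shell
# Succeeds if the plugin list on stdin mentions analysis-icu.
# (has_analysis_icu is a hypothetical helper, not part of Elasticsearch.)
has_analysis_icu() {
    grep -q 'analysis-icu'
}

plugin_tool=/usr/share/elasticsearch/bin/elasticsearch-plugin

# Only query the tool if it exists at the assumed package path.
if [ -x "$plugin_tool" ]; then
    if sudo "$plugin_tool" list | has_analysis_icu; then
        echo "analysis-icu is installed"
    else
        echo "analysis-icu is missing"
    fi
fi
```

Elasticsearch has to be restarted after a plugin install before it picks the plugin up.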


Install the koha-elasticsearch metapackage

$ sudo apt-get install koha-elasticsearch

If you get the following error during the install:

 Couldn't write '262144' to 'vm/max_map_count', ignoring: Permission denied

try the following as root:

 $ export ES_SKIP_SET_KERNEL_PARAMETERS=true
 $ apt-get install elasticsearch
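
Skipping the kernel parameter step does not change the value itself, so it can be worth checking what the system currently has. This is a rough sketch that compares the current setting with the 262144 from the error message; the function name map_count_ok is hypothetical.

```shell
# Value the Elasticsearch package tries to set (taken from the error message).
required=262144

# Succeeds if the given value is at least the required one.
# (map_count_ok is a hypothetical helper name.)
map_count_ok() {
    [ "$1" -ge "$required" ]
}

# Read the current setting; fall back to 0 if /proc is not readable.
current=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)

if map_count_ok "$current"; then
    echo "vm.max_map_count is $current, which is enough"
else
    echo "vm.max_map_count is $current, lower than $required"
    echo "to raise it: sudo sysctl -w vm.max_map_count=$required"
fi
```

Inside a container the value usually has to be raised on the host, which is probably why the write was denied in the first place.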

Add a configuration like this to your koha-conf.xml:

 <elasticsearch>
     <server>localhost:9200</server>      <!-- may be repeated to include all servers on your cluster -->
     <index_name>koha_robin</index_name>  <!-- should be unique amongst all the indices on your cluster. _biblios and _authorities will be appended. -->
 </elasticsearch>

Note: you can replace localhost with the hostname or IP address of an external Elasticsearch server.
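
Since _biblios and _authorities are appended to index_name, it can help to print the resulting index names when several Koha instances share a cluster. The following is a small sketch, assuming the index_name element sits on a single line as in the example above; the function name koha_es_indices is made up for illustration.

```shell
# Print the two index names derived from <index_name> in a koha-conf.xml.
# Assumes the element is on one line, as in the example configuration.
# (koha_es_indices is a hypothetical helper name.)
koha_es_indices() {
    name=$(sed -n 's:.*<index_name>\([^<]*\)</index_name>.*:\1:p' "$1" | head -n 1)
    [ -n "$name" ] || return 1
    printf '%s_biblios\n%s_authorities\n' "$name" "$name"
}

# KOHA_CONF is the environment variable Koha uses to locate its config file.
if [ -r "${KOHA_CONF:-}" ]; then
    koha_es_indices "$KOHA_CONF"
fi
```

For the configuration above this would print koha_robin_biblios and koha_robin_authorities.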

Configuration

Load installer/data/mysql/elasticsearch_mapping.sql into your database

Set the 'SearchEngine' system preference to 'Elasticsearch'
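
Both configuration steps can also be done from the shell. What follows is only a sketch: it assumes a package install with an instance named kohadev (the name used in the indexing commands on this page) and the koha-mysql helper from the Koha packages. The function name search_engine_sql is hypothetical.

```shell
# Build the SQL statement that switches the SearchEngine system preference.
# (search_engine_sql is a hypothetical helper name.)
search_engine_sql() {
    printf "UPDATE systempreferences SET value='%s' WHERE variable='SearchEngine';\n" "$1"
}

# Show the statement without touching the database.
search_engine_sql Elasticsearch

# To apply both steps (assumed instance name: kohadev), run from the top
# of the Koha source tree:
#   sudo koha-mysql kohadev < installer/data/mysql/elasticsearch_mapping.sql
#   search_engine_sql Elasticsearch | sudo koha-mysql kohadev
```

Koha may cache system preferences, so setting the preference in the staff client remains the safer route; the SQL form is mostly useful for scripted test setups.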

Cross fingers

Trigger indexing. On kohadevbox:

 sudo koha-shell kohadev
 cd kohaclone
 misc/search_tools/rebuild_elastic_search.pl -v -d

On a manual setup, just run

 misc/search_tools/rebuild_elastic_search.pl -v -d

You might have to start Elasticsearch before you can do the indexing.
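
A quick way to see whether Elasticsearch is answering before indexing is to request its root URL. This is a minimal sketch, assuming curl is installed and the server address from the koha-conf.xml example; the function name es_is_up and the ES_SERVER variable are made up for illustration.

```shell
# Succeeds if an HTTP request to the given host:port gets a successful reply.
# (es_is_up is a hypothetical helper name.)
es_is_up() {
    curl -fsS --max-time 5 "http://$1/" >/dev/null 2>&1
}

# ES_SERVER is a hypothetical variable; the default matches the example config.
server="${ES_SERVER:-localhost:9200}"

if es_is_up "$server"; then
    echo "Elasticsearch is answering on $server"
else
    echo "Elasticsearch is not answering on $server"
    echo "on a systemd host, try: sudo systemctl start elasticsearch"
fi
```

Elasticsearch can take a little while to come up after starting, so a failed check right after a start does not necessarily mean something is wrong.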

Note: you may get the error "Can't locate Catmandu/Importer/MARC.pm". If so, install the libcatmandu-marc-perl package. Do NOT install this module from CPAN; it doesn't work.

Do a search in the OPAC and/or staff client, and notice that it all works beautifully.
