ICU do not undiacritic

From Koha Wiki

This page explains how to configure the Zebra search engine, when it uses the ICU option, so that selected characters keep their diacritics.

How to

Go to the Zebra configuration directory: etc/zebradb.

Edit etc/words-icu.xml (and etc/phrases-icu.xml, if it exists).

This line tells ICU to separate each letter from its diacritic (for example, "ê" becomes "e" followed by a combining "^"):

<transform rule="NFD"/>
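To see what NFD decomposition does to a single character, here is a minimal Python sketch. It uses the standard unicodedata module rather than Zebra or ICU itself, so it only illustrates the normalization step; the helper name decompose is hypothetical.

```python
import unicodedata

def decompose(text: str) -> list:
    """Return the Unicode names of the code points in the NFD form of text."""
    # NFD splits a precomposed letter into its base letter
    # followed by one or more combining marks.
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", text)]

print(decompose("ê"))  # base letter E, then the combining circumflex
print(decompose("å"))  # base letter A, then the combining ring above
```

Once the mark is a separate code point, a later step in the chain can drop it, which is how "ê" ends up indexed like "e".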

You can exclude some characters from this decomposition so that they keep their diacritics. For example, to preserve "å":

<transform rule="[^å] NFD"/>

Run a full reindex.

After that, searching for "a" will not match strings containing "å", and searching for "å" will not match strings containing "a".
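The effect of the filtered rule can be approximated in plain Python. This is only a sketch under stated assumptions, not Zebra's actual code path: it assumes the chain removes combining marks after the NFD step, and it composes the input to NFC first so that "å" is a single code point. The names fold_diacritics and PROTECTED are hypothetical.

```python
import unicodedata

# Characters that must keep their diacritics (mirrors the [^å] filter).
PROTECTED = frozenset("å")

def fold_diacritics(text: str, keep=PROTECTED) -> str:
    """Strip diacritics from text, except for the characters in keep."""
    out = []
    # Assumption: work on composed input so each protected letter is one code point.
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            out.append(ch)  # filtered out of NFD, so left untouched
            continue
        for part in unicodedata.normalize("NFD", ch):
            # Assumption: a later step drops nonspacing marks (category Mn).
            if unicodedata.category(part) != "Mn":
                out.append(part)
    return "".join(out)

print(fold_diacritics("blå fête"))            # å is kept, ê loses its accent
print(fold_diacritics("blå fête", keep=()))   # nothing protected, both are folded
```

With "å" protected, the indexed form of "blå" differs from the indexed form of "bla", which is why the two no longer match each other in searches.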


See also Correcting_Search_of_Polish_records and ICU_chains_configuration