ICU do not undiacritic

From Koha Wiki

This page explains how to configure the Zebra search engine, when it uses the ICU option, so that selected characters keep their diacritics instead of being reduced to their base letter.

How to

Go to the Zebra configuration directory: etc/zebradb.

Edit etc/words-icu.xml (and etc/phrases-icu.xml, if it exists).

This line decomposes each accented character into its base letter followed by a separate combining diacritic (for example, "ê" becomes "e" followed by a combining circumflex):

<transform rule="NFD"/>
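
To see what this decomposition does to a single character, here is a small illustration outside Zebra. It uses Python's standard unicodedata module, which applies the same Unicode NFD normalization form; the helper name decompose is made up for this sketch.

```python
import unicodedata


def decompose(text):
    """Return the code points of text after NFD normalization."""
    normalized = unicodedata.normalize("NFD", text)
    return [f"U+{ord(ch):04X}" for ch in normalized]


# "\u00ea" is the precomposed character ê.
print(decompose("\u00ea"))  # ['U+0065', 'U+0302']: "e" + combining circumflex
```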

You can exclude some characters from this decomposition so that they keep their diacritic. For example, to exclude "å":

<transform rule="[^å] NFD"/>

Run a full reindex.

After that, searching for "a" will not match strings containing "å", and searching for "å" will not match strings containing "a".
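
The matching behavior described here can be imitated in a few lines of Python. This is only a sketch, not how Zebra or ICU is implemented: it assumes that a later step in the chain removes the combining marks left by NFD (the page implies this but does not show that step), and the names KEEP and index_key are hypothetical.

```python
import unicodedata

# Characters excluded from decomposition, mirroring the filter [^å].
KEEP = frozenset({"\u00e5"})  # precomposed å


def index_key(text, keep=KEEP):
    """Imitate the chain: NFD on every character not in keep, then drop marks."""
    out = []
    for ch in text:
        if ch in keep:
            # Protected character: left untouched, diacritic included.
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        # Assumed later step: drop characters with a non-zero combining class.
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


print(index_key("g\u00e5rd") == index_key("gard"))  # False: å is kept
print(index_key("f\u00eate") == index_key("fete"))  # True: ê is still reduced
print(index_key("g\u00e5rd", keep=frozenset()) == index_key("gard"))  # True
```

In this sketch only the exact characters listed in KEEP are protected, so an uppercase "Å" would still be decomposed.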

See also Correcting_Search_of_Polish_records and ICU_chains_configuration
