ICU do not undiacritic
This page explains how to configure the Zebra search engine, when it uses the ICU option, so that selected characters keep their diacritics.
How to
Go to the Zebra configuration directory: etc/zebradb.
Edit etc/words-icu.xml (and etc/phrases-icu.xml if it exists).
This line decomposes each accented character into its base letter followed by a separate diacritic (for example "ê" => "e" + "^"):
<transform rule="NFD"/>
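To see what NFD decomposition does to a single character, here is a minimal Python sketch using the standard unicodedata module. It is only an illustration of the Unicode normalization form itself; Zebra performs this step through ICU, not Python, and the helper name decompose is hypothetical.

```python
import unicodedata


def decompose(text: str) -> list[str]:
    """Return the Unicode names of the code points in the NFD form of text."""
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", text)]


# "\u00ea" is the precomposed character "ê".
print(decompose("\u00ea"))
# prints ['LATIN SMALL LETTER E', 'COMBINING CIRCUMFLEX ACCENT']
```

The base letter comes first and the combining mark follows it, which is what lets a later step treat the two separately.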
You can exclude some characters from this decomposition so that they keep their diacritic. For example, to exclude "å":
<transform rule="[^å] NFD"/>
Run a full reindex.
After that, searching for "a" will not match strings containing "å", and searching for "å" will not match strings containing "a".
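The effect of the filtered rule on matching can be modelled with a short Python sketch. It rests on two assumptions that this page does not state: that a later step in the ICU chain removes the combining marks produced by NFD, and that the input text stores "å" as a single precomposed character. The function name index_key and its keep parameter are hypothetical and are not part of Zebra.

```python
import unicodedata


def index_key(text: str, keep: str = "") -> str:
    """Model of the chain: NFD on every character not in `keep`,
    then drop combining marks (assumed to happen later in the chain)."""
    out = []
    for ch in text:
        if ch in keep:
            # Excluded from NFD by the filter, so it stays precomposed.
            out.append(ch)
            continue
        for part in unicodedata.normalize("NFD", ch):
            # combining() returns 0 for base characters, non-zero for marks.
            if not unicodedata.combining(part):
                out.append(part)
    return "".join(out)


word = "bl\u00e5"  # "blå" with a precomposed "å"

# Unfiltered rule: "blå" and "bla" end up with the same key.
print(index_key(word) == index_key("bla"))

# Filtered rule: "å" is kept, so the two keys differ.
print(index_key(word, keep="\u00e5") == index_key("bla", keep="\u00e5"))
```

Under these assumptions the first comparison is true and the second is false, which corresponds to the search behaviour described above. Other accented characters are still reduced to their base letter.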
See also Correcting_Search_of_Polish_records and ICU_chains_configuration