Correcting Search of Arabic records

From Koha Wiki

This page is based on this article and [this conversation] (to get the attached file):

First, install the yaz-icu package:

sudo apt-get install yaz-icu

Go to More > Administration > Global system preferences > Searching.

Change UseICU to "Using" (if the syspref exists).

Change QueryFuzzy to "Don't try".

Change QueryStemming to "Don't try".

Go to /etc/koha/zebradb/etc/

Check the first lines of default.idx and correct them so that they match these lines:

 # Traditional word index
 # Used if completeness is 'incomplete field' (@attr 6=1) and
 # structure is word/phrase/word-list/free-form-text/document-text
 index w
 completeness 0
 position 1
 alwaysmatches 1
 firstinfield 1
 icuchain words-icu.xml
 
 
 # Phrase index
 # Used if completeness is 'complete {sub}field' (@attr 6=2, @attr 6=3)
 # and structure is word/phrase/word-list/free-form-text/document-text
 index p
 completeness 1
 firstinfield 1
 icuchain phrases-icu.xml 
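
To double-check the edit, the index definitions can be read back and the ICU chain attached to each one listed. The following is only a minimal sketch: the helper name `icu_chains` and the `SAMPLE` text are made up for illustration, and it assumes the simple "directive value" layout shown above, with `#` starting a comment.

```python
def icu_chains(idx_text):
    """Map each index name (e.g. 'w', 'p') to its icuchain file, or None."""
    chains = {}
    current = None
    for raw in idx_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "index":
            current = value
            chains[current] = None
        elif key == "sort":
            # Sort sections are not tracked in this sketch.
            current = None
        elif key == "icuchain" and current is not None:
            chains[current] = value
    return chains


# Hypothetical sample mirroring the lines shown above.
SAMPLE = """\
# Traditional word index
index w
completeness 0
position 1
alwaysmatches 1
firstinfield 1
icuchain words-icu.xml

# Phrase index
index p
completeness 1
firstinfield 1
icuchain phrases-icu.xml
"""

if __name__ == "__main__":
    for name, chain in icu_chains(SAMPLE).items():
        print(name, chain)
```

An index that maps to `None` has no `icuchain` line; it may, for example, still use a `charmap` line instead. To check the real file, pass the contents of default.idx to the function instead of `SAMPLE`.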

In the same folder, modify or create words-icu.xml with the following content:

 <icu_chain locale="ar">
   <transliterate rule="\'>\ "/>
   <transliterate rule="[:Number:] { '-' >  "/>
   <transform rule="[:Control:] Any-Remove"/>
   <tokenize rule="l"/>
   <transform rule="[[:WhiteSpace:][:Punctuation:]] Remove"/>
   <transform rule="NFD"/>
   <transform rule="[:Nonspacing Mark:] Remove"/>
   <transform rule="NFC"/>
   <transliterate rule="{ الا > ا "/>
   <transliterate rule="{ الأ > أ "/>
   <transliterate rule="{ الإ > إ "/>
   <transliterate rule="{ الآ > آ "/>
   <transliterate rule="{ الب > ب "/>
   <transliterate rule="{ الت > ت "/>
   <transliterate rule="{ الث > ث "/>
   <transliterate rule="{ الج > ج "/>
   <transliterate rule="{ الح > ح "/>
   <transliterate rule="{ الخ > خ "/>
   <transliterate rule="{ الد > د "/>
   <transliterate rule="{ الذ > ذ "/>
   <transliterate rule="{ الر > ر "/>
   <transliterate rule="{ الز > ز "/>
   <transliterate rule="{ الس > س "/>
   <transliterate rule="{ الش > ش "/>
   <transliterate rule="{ الص > ص "/>
   <transliterate rule="{ الض > ض "/>
   <transliterate rule="{ الط > ط "/>
   <transliterate rule="{ الظ > ظ "/>
   <transliterate rule="{ الع > ع "/>
   <transliterate rule="{ الغ > غ "/>
   <transliterate rule="{ الف > ف "/>
   <transliterate rule="{ الق > ق "/>
   <transliterate rule="{ الك > ك "/>
   <transliterate rule="{ الل > ل "/>
   <transliterate rule="{ الم > م "/>
   <transliterate rule="{ الن > ن "/>
   <transliterate rule="{ اله > ه "/>
   <transliterate rule="{ الو > و "/>
   <transliterate rule="{ الي > ي "/>
   <display/>
   <casemap rule="l"/>
 </icu_chain>

(Note that searching in other locales will still work even if you have your locale set to ar.)
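
To get a feel for what this chain appears to be aiming at, here is a rough Python approximation of two of its steps: the NFD / remove nonspacing marks / NFC sequence, and the removal of a leading "ال" before the listed letters. It is a sketch under assumptions, not a reimplementation of ICU: the function names are made up, tokenization is skipped, and treating the transliterate rules as "strip the definite article at the start of a token" is a guess at their intent. Whether the real rules fire only at the start of a token depends on ICU's rule semantics, which this code does not reproduce.

```python
import unicodedata

# Letters that the <transliterate> rules list after the "ال" prefix.
FOLLOWERS = set("اأإآبتثجحخدذرزسشصضطظعغفقكلمنهوي")


def strip_marks(token):
    """NFD, drop nonspacing marks (category Mn), then NFC, as in the chain."""
    decomposed = unicodedata.normalize("NFD", token)
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", kept)


def normalize(token):
    """Approximate the indexed form of one token (assumed intent, not ICU)."""
    token = strip_marks(token)
    # Assumption: the rules are meant to drop a leading definite article.
    if token.startswith("ال") and len(token) > 2 and token[2] in FOLLOWERS:
        token = token[2:]
    # Stand-in for <casemap rule="l"/>; it has no effect on Arabic letters.
    return token.lower()


if __name__ == "__main__":
    for word in ["الكِتَاب", "كتاب", "أحمد", "Koha"]:
        print(word, "->", normalize(word))
```

Under these assumptions, a vocalized word with the article and the bare unvocalized word reduce to the same form, which is why a search for either one can find records containing the other. Note also that the mark-removal step folds the hamza carriers (أ, إ, آ) to a plain alef, because they decompose into alef plus a nonspacing mark.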

If you are using packages, run:

 sudo koha-restart-zebra {yourinstance}
 sudo koha-rebuild-zebra -f {yourinstance}

(replacing {yourinstance} with the name of your instance)

If you are using a tarball or git installation (in which case you will need to adjust all the paths), run these two commands instead:

 /etc/init.d/koha-zebra-daemon restart
 /usr/share/koha/bin/migration_tools/rebuild_zebra.pl -b -r -v -w 

You may have to run something like this first (adjust the paths to your own installation):

 sudo bash
 export PERL5LIB=/usr/share/koha/lib
 export KOHA_CONF=/etc/koha/sites/iesh/koha-conf.xml

Congratulations, you now have a fully working Arabic search engine ;)