C & P Authority Control Improvements RFC

From Koha Wiki

Executive summary

C & P Bibliography Services has been retained to substantially overhaul the Koha authority system. In particular, the project covers the following components:

  1. additional authority functionality will be exposed in the OPAC
  2. authority import and export will be added to the staff client
  3. additional authority-based search options (such as "exploded" [broader, narrower, and/or related term] searching and "Did you mean?" suggestions based on authorities) will be implemented
  4. authority overlay and deduplication will be implemented
  5. a heading flipper script will be created

Description

The development is divided into 14 parts:

  1. A dropdown box will be added to the authority browser in the OPAC allowing the user to choose which index to search in authorities: the keyword index, the match heading index (i.e. all heading fields), the preferred heading index, or the main entry only index. (bug 8206)
  2. Because importing large authority files can result in unused authority records overwhelming used authority records in OPAC authority browse results, some libraries may want to disable the display of authorities which are not used in the bibliographic database. In order to make this possible (while still retaining the existing behavior of displaying all records in the authority file as a default), a system preference OPACShowUnusedAuthorities will be created, and a "Show all results" link added to search results when it is enabled. (bug 8205)
  3. Because see also references in authority records are intended to refer to other authority records, they really ought to be turned into hyperlinks in the various authority displays. They will be made into hyperlinks (using the authid in $9 if available, or a search string otherwise) in both the OPAC and the staff client. (bug 3462)
  4. At present, the only authority record view available in the OPAC is an "expanded MARC" view, which is of no use to patrons and of limited use to librarians. We propose adding a user-friendly authority details view similar in design to the bib details view. (bug 8204)
  5. Although the batch import functionality was originally designed with importing both bibliographic and authority records in mind, authority import was never actually implemented. This will be done quite simply, by adding an option to choose which type of record to import in the Stage MARC records tool. (bug 2060)
  6. Match points will also be extended for use with authority records, again completing a feature for which stubs already existed. (bug 7475)
  7. There is no way to export authority records other than via a direct SQL query, which makes it difficult for hosted libraries to get their data out of Koha without the help of their support vendors. We propose to change the "Export bibliographic/holdings" tool to the "Export data" tool, and add a tab to enable librarians with the proper permissions for the export tool to export their authority records directly from the staff client. (bug 8202)
  8. Some librarians have requested the ability to save individual authority records, so a "Save authority" button will be added next to the "Edit" button when viewing authority records. (bug 8203)
  9. One of the great promises of DOM indexing in Zebra (and the adoption of Solr as an indexing technology) is the ability to incorporate alternate forms of headings into bib records when exporting bibs for indexing. We will add the plumbing for arbitrary record filters, and implement a filter that looks up the alternate forms of each heading in a bib record and embeds them in the record before it is exported for indexing. (bug 7417)
  10. Instead of using only textual strings for see also links, the ability to use $9 in see also fields in authority records (5xx in MARC21, NORMARC, and UNIMARC) will be added, to simplify heading disambiguation. This will enable us to create a value builder plugin using the existing thesaurus functionality that will automatically populate the see also heading control fields with metadata about the link (broader term, narrower term, etc.) (bug 8207)
  11. At present, catalogers have only two options for adding a new authority while cataloging: turn on AutoCreateAuthorities and later edit the automatically created authority record to justify its creation; or save the bib record they are working on, create a new authority record in the authority module, wait for the new record to be indexed, and reopen the bib record to use the authority finder plugin. We intend to add a "Create authority" button to the authority finder plugin, which will allow the cataloger to create an authority in a new window and (similar to fast add in circ) automatically fill out the selected heading field in the bib record. (bug 8208)
  12. At times, people may want to search for terms related to the one they entered. For example, someone interested in books about Feet may also be interested in books about specific parts of feet, such as Toes and Heels. We will add three new special pseudo-CCL search prefixes which will first search the authority file for the specified heading, then search not just for that term but also for its broader terms (pseudo-CCL prefix su-br:), narrower terms (pseudo-CCL prefix su-na:), or all related terms (pseudo-CCL prefix su-rl:). (bug 8211)
  13. Because patrons may not always realize that they want to search for a related term instead of the term they entered, we will also add a "Did you mean?" feature to the OPAC, which suggests terms related to authorities that match the user's search. (bug 8209)
  14. Although not all authority records contain useful information, many do, and depending on the type of collection, access to that information can be critical to both librarians and patrons. In order to simplify looking up authorities referred to in particular records, a link from subject headings to their related authorities will be added to the OPAC. A patch which adds this to the "normal" mode bib details display has already Passed QA (bug 5888), so unless an unsolvable problem is found with that patch, implementation will focus on the XSLT view. (bug 8210)
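
The exploded search described in item 12 can be illustrated with a minimal sketch. It is written in Python for brevity rather than in Koha's Perl, and nearly everything in it is an assumption: the in-memory AUTHORITIES thesaurus, the expand_query helper, the reading of su-rl: as "broader, narrower, and related", and the CCL-like output form are all hypothetical. Only the three prefixes themselves come from the description above.

```python
# Hypothetical, made-up authority data: heading -> linked headings by type.
# A real implementation would search the authority file instead.
AUTHORITIES = {
    "Feet": {
        "broader": ["Legs"],
        "narrower": ["Toes", "Heels"],
        "related": ["Shoes"],
    },
}

# Assumed mapping from pseudo-CCL prefix to the relationship types it expands.
PREFIXES = {
    "su-br": ("broader",),
    "su-na": ("narrower",),
    "su-rl": ("broader", "narrower", "related"),
}


def expand_query(query):
    """Rewrite a pseudo-CCL exploded search into a plain subject query.

    Queries without one of the three special prefixes are returned unchanged.
    """
    prefix, sep, term = query.partition(":")
    if not sep or prefix not in PREFIXES:
        return query

    term = term.strip()
    headings = [term]  # always search for the term the user entered
    record = AUTHORITIES.get(term)
    if record is not None:
        for kind in PREFIXES[prefix]:
            for heading in record.get(kind, []):
                if heading not in headings:
                    headings.append(heading)

    # Illustrative output only: OR the headings together on a subject index.
    return " or ".join('su="{}"'.format(h) for h in headings)
```

Under these assumptions, expand_query("su-na:Feet") returns su="Feet" or su="Toes" or su="Heels", while a heading with no matching authority falls back to a search for the entered term alone.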

Technical addendum

In addition to the features described above, one of our goals with this development is to advance the migration from the C4 namespace to the Koha namespace and pave the way for incorporation of the Solr search code developed by BibLibre into Koha 3.10, or, at the latest, 3.12. To that end, we also propose to do the following:

  1. Building on the work in bug 7430, continue the move of Zebra search implementation code to Koha::Search::Engine::Zebra (or wherever the Solr taskforce has identified is the best place for that functionality) from its places in C4::Search, C4::AuthoritiesMarc, C4::Biblio, and elsewhere in C4. In particular, the following functionality will be moved:
  • C4::AuthoritiesMarc::SearchAuthorities
  • C4::Search::getRecords
  • C4::Search::SimpleSearch
  • C4::Heading and C4::Linker will be moved to the Koha:: namespace
  • The beginning of an object-oriented Koha::Record::Authority class that exposes authority functionality will be developed, and unmigrated code in C4::AuthoritiesMarc adjusted to use that object where possible (note: see below for my proposal on the layout of the Koha:: namespace)
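
The point of moving engine-specific code behind a common class is that callers no longer need to know which search engine is configured. A minimal sketch of that dispatch pattern follows, in Python rather than Koha's Perl. The class and function names (SearchEngine, ZebraEngine, SolrEngine, get_engine, search_authorities) are hypothetical stand-ins, not the interfaces proposed by the Solr taskforce, and the engines only echo the call instead of contacting a real server.

```python
from abc import ABC, abstractmethod


class SearchEngine(ABC):
    """Stand-in for an engine-specific class such as Koha::Search::Engine::Zebra."""

    @abstractmethod
    def search(self, query, index="biblios"):
        """Run a query against the named index and return the result."""


class ZebraEngine(SearchEngine):
    def search(self, query, index="biblios"):
        # A real implementation would talk to Zebra; this only reports the call.
        return {"engine": "zebra", "index": index, "query": query}


class SolrEngine(SearchEngine):
    def search(self, query, index="biblios"):
        # A real implementation would talk to Solr; this only reports the call.
        return {"engine": "solr", "index": index, "query": query}


# Assumed registry mapping a configured engine name to its class.
ENGINES = {"zebra": ZebraEngine, "solr": SolrEngine}


def get_engine(name):
    """Return an engine instance for the configured name (case-insensitive)."""
    try:
        return ENGINES[name.lower()]()
    except KeyError:
        raise ValueError("unknown search engine: {!r}".format(name)) from None


def search_authorities(engine_name, query):
    """Caller-facing helper: the same call works whichever engine is configured."""
    return get_engine(engine_name).search(query, index="authorities")
```

With this layout, code that currently calls engine-specific routines directly would only need the caller-facing helper, and adding an engine means registering one more class.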

Koha:: namespace

One issue that has been under discussion in the Koha community has been the layout and organization of the Koha:: namespace. I propose the following partial schema:

+ Koha::Cache - object-oriented class for caching data in Koha
|
+--- Koha::Cache::Fastmmap - mmap driver class for Koha caching
|    (implemented in bug 8092)
|
+--- Koha::Cache::Memcached - memcached driver class for Koha caching
|    (implemented by bug 7248)
|
+--- Koha::Cache::Memory - in-process driver class for Koha caching
     (implemented in bug 8092)

+ Koha::Calendar - object-oriented class for handling branch opening
  calendars

& Koha::Context - exporter class for basic configuration information
  (note: a better option might be Koha::Koha as an object-oriented
  alternative to Koha::Context; C & P will implement the start of
  the latter, if that is the consensus of the community)

& Koha::DateUtils - exporter shim class to ease migration to DateTime
  from date-only strings

# Koha::Filter - virtual parent class for all record processor filters
|
+--* Koha::Filter::[metadata schema]::* -
   | object-oriented classes extending Koha::RecordProcessor::Base
   | which implement particular record processing functionalities
   \
    +--- Koha::Filter::MARC::EmbedItems - filter for embedding items in
    |    bib records for the indexing process
    |
    +--- Koha::Filter::MARC::EmbedSeeFromHeadings - filter for embedding
         see from headings in bib records for the indexing process

+ Koha::Heading - object-oriented class representing
| authority-controlled headings
|
+--- Koha::Heading::MARC21 - object-oriented MARC21 heading handler for
|    Koha::Heading
|
+--- Koha::Heading::UNIMARC - object-oriented UNIMARC heading handler
     for Koha::Heading

+ Koha::Indexer::Utils - indexer utility functions (implemented by
  bug 7818)

# Koha::Linker - virtual parent class for heading-authority linker
| modules
|
+--* Koha::Linker::* - object-oriented linker modules for linking
   | headings to authorities
   \
    +--- Koha::Linker::Default - default linker module
    |
    +--- Koha::Linker::FirstMatch - linker module that always selects
    |    the first match it finds
    |
    +--- Koha::Linker::LastMatch - linker module that always selects the
         last match it finds

# Koha::Record - virtual parent class for all types of records handled
| by Koha
|
+--- Koha::Record::Biblio - object-oriented class representing
|    bibliographic records
|
+--+ Koha::Record::Authority - object-oriented class representing
|  | authority records
|  |
|  &--- Koha::Record::Authority::Handler::MARC21 - exporter class
|  |    with MARC21-specific routines for Koha::Record::Authority
|  |    (should be used only by Koha::Record::Authority)
|  |
|  &--- Koha::Record::Authority::Handler::UNIMARC - exporter class
|       with UNIMARC-specific routines for Koha::Record::Authority
|       (should be used only by Koha::Record::Authority)
|
+--* Koha::Record::Holdings - hypothetical object-oriented class
|    representing holdings records
|
+--- Koha::Record::Item - object-oriented class representing item
     records

+ Koha::RecordProcessor - object-oriented class for processing records
  using various filters
 
& Koha::Search - exporter class offering access to searching
| functionality via Koha::Search::Engine
|
+--+ Koha::Search::Engine - object-oriented class which dispatches
|  | calls into the specific search engine module
|  |
|  +--- Koha::Search::Engine::Solr - class which interfaces with Solr
|  |    for searching
|  |
|  +--- Koha::Search::Engine::Zebra - class which interfaces with Zebra
|       for searching
|
+--# Koha::Search::Plugin - virtual parent class for search plugins
|  |
|  +--- Koha::Search::Plugin::* - plugins implementing particular search
|       functionality
|
+--+ Koha::Search::Query - object-oriented class which generates queries
   |
   +--- Koha::Search::Query::Solr - class to generate Solr queries
   |
   +--- Koha::Search::Query::Zebra - class to generate Zebra queries

+ Koha::Template::Plugin::* - plugins for Template Toolkit

& Koha::Utils - exporter class for utility functions required by many
| parts of Koha (no dependencies other than Koha::Context)
|
&--- Koha::Utils::Authorities - exporter class for utility functions
|    related to authorities that do not act on an individual authority
|
&--- Koha::Utils::Biblios - exporter class for utility functions
     related to bibliographic records that do not act on an individual
     biblio

Legend:
+       indicates an object-oriented class (or classes, in some cases)
+---    indicates an object-oriented class that is further down in the
        hierarchy (may or may not be a subclass)
+--+    indicates an object-oriented subclass that is extended
#       indicates a virtual object-oriented parent class
+--#    indicates a virtual object-oriented parent class which is
        further down in the hierarchy
&       indicates an exporter class
&---    indicates an exporter subclass
*       indicates a hypothetical or example class
\       indicates that the class(es) below this symbol provide (a)
        specific example(s) of the hypothetical or example class above
        the symbol

NOTE: I do not propose to implement all of this as part of the authority
project. C & P's authority development will create the following classes
in the Koha:: namespace:
+ Koha::Filter
+ Koha::Filter::MARC::EmbedSeeFromHeadings
+ Koha::Heading::*
+ Koha::Record::Authority::*
+ Koha::RecordProcessor
+ Koha::Utils::Authorities (possibly)
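
To make the relationship between the record processor and its filters concrete, here is a minimal sketch of the pattern, in Python rather than Koha's Perl. It is an illustration under stated assumptions, not the proposed implementation: records are plain dictionaries instead of MARC records, the see_from lookup table replaces a real authority search, and the method names filter and process are hypothetical.

```python
class Filter:
    """Stand-in for the virtual parent class of record filters."""

    def filter(self, record):
        raise NotImplementedError("subclasses must implement filter()")


class EmbedSeeFromHeadings(Filter):
    """Adds the alternate (see from) forms of each heading to the record."""

    def __init__(self, see_from):
        # Assumed lookup table: authorized heading -> list of alternate forms.
        self.see_from = see_from

    def filter(self, record):
        result = dict(record)  # leave the caller's record untouched
        alternates = []
        for heading in record.get("headings", []):
            alternates.extend(self.see_from.get(heading, []))
        result["alternate_headings"] = alternates
        return result


class RecordProcessor:
    """Runs a record through an ordered chain of filters."""

    def __init__(self, filters):
        self.filters = list(filters)

    def process(self, record):
        for record_filter in self.filters:
            record = record_filter.filter(record)
        return record


# Usage: build the chain once, then process each record before indexing.
processor = RecordProcessor(
    [EmbedSeeFromHeadings({"Twain, Mark": ["Clemens, Samuel"]})]
)
indexed = processor.process({"title": "Example", "headings": ["Twain, Mark"]})
```

Because each filter takes a record and returns a record, further filters (for example, one that embeds items) could be appended to the chain without changing the processor.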