Explaining Plugins

From GreenstoneWiki

An outline of program flow when using import.pl for developers writing their own plugins:

import.pl calls the plugin methods begin, read, then end, in that order.

Processing starts at the import directory.

RecPlug handles directories, and will look through a directory to see what files are there. These files get passed to the plugin pipeline, first using metadata_read, then read.

  • The metadata_read method only gets called from RecPlug (and MetadataCSVPlug).

All plugins inherit from BasPlug.

  • BasPlug implements the metadata_read and read methods.
  • BasPlug read calls the process method.

Most plugins call the BasPlug read method, then do the format-specific work in their own process method.

  • Some plugins override read.

Plugins can implement either read or process (or both). (Note to self - give examples)
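
The flow described above can be modelled in a few lines. What follows is a minimal Python sketch, not Greenstone's Perl code: the names BasePlugin, TextPlugin, DirectoryPlugin, accepts and run_demo are hypothetical stand-ins (for BasPlug, a format-specific plugin and RecPlug), and the two-pass loop over each directory is an assumption about how metadata_read and read are ordered.

```python
import os
import tempfile


class BasePlugin:
    """Hypothetical stand-in for BasPlug: supplies default metadata_read and read."""

    def __init__(self):
        self.documents = []

    def accepts(self, path):
        # Assumption: a plugin recognises its own files by name.
        return False

    def metadata_read(self, path, extra_metadata):
        # Default: contribute no metadata for any file.
        return False

    def read(self, path, metadata):
        # Generic work lives here; the format-specific part is left to process.
        if not self.accepts(path):
            return False
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
        self.documents.append(self.process(path, text, metadata))
        return True

    def process(self, path, text, metadata):
        raise NotImplementedError("subclasses supply the format-specific step")


class TextPlugin(BasePlugin):
    """Inherits read unchanged and only writes its own process."""

    def accepts(self, path):
        return path.endswith(".txt")

    def process(self, path, text, metadata):
        return {
            "source": os.path.basename(path),
            "text": text.strip(),
            "metadata": dict(metadata),
        }


class DirectoryPlugin:
    """Hypothetical stand-in for RecPlug: walks a directory and feeds the pipeline."""

    def __init__(self, pipeline):
        self.pipeline = pipeline

    def read(self, directory):
        names = sorted(os.listdir(directory))
        paths = [os.path.join(directory, name) for name in names]
        extra_metadata = {}  # file name -> metadata gathered in the first pass
        # First pass: offer every file to metadata_read.
        for path in paths:
            if os.path.isfile(path):
                for plugin in self.pipeline:
                    if plugin.metadata_read(path, extra_metadata):
                        break
        # Second pass: recurse into subdirectories, offer files to read.
        for path in paths:
            if os.path.isdir(path):
                self.read(path)
                continue
            metadata = extra_metadata.get(os.path.basename(path), {})
            for plugin in self.pipeline:
                if plugin.read(path, metadata):
                    break


def run_demo():
    """Build a throwaway import directory and run the pipeline over it."""
    with tempfile.TemporaryDirectory() as import_dir:
        with open(os.path.join(import_dir, "a.txt"), "w", encoding="utf-8") as handle:
            handle.write("hello\n")
        with open(os.path.join(import_dir, "b.dat"), "w", encoding="utf-8") as handle:
            handle.write("not a text file\n")
        plugin = TextPlugin()
        DirectoryPlugin([plugin]).read(import_dir)
        return plugin.documents
```

In this sketch TextPlugin corresponds to the common case (inherit read, write process); a plugin that needs full control over how a file is opened would override read instead.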

Types of Plugin

There are two types: metadata plugins and document plugins.

  • The system currently makes no distinction between them.
  • All plugins process files:
    • some group several files into one document,
    • some split one file into several documents, and
    • some have a one-to-one mapping.
  • Plugins that split files up generally inherit from SplitPlug.
  • Plugins that process XML files generally inherit from XMLPlug. (They don't need to; it just avoids rewriting the same code.)

Operational summaries of example plugins

For basic information about plugins, see Plugins. Another good section is http://greenstone.sourceforge.net/wiki/index.php/More_about_plugins#How_do_I_get_my_XML_files_into_Greenstone.3F


MetadataCSVPlug

  • Takes comma-separated (.csv) files and extracts metadata from them (using the metadata_read method).
  • Assigns that metadata to the documents, which are then processed by their normal plugin.
  • The first line is a list of metadata names.
  • Subsequent lines, one per record, contain the values.
  • Requires a filename field containing the file name of the document to which the record's metadata will be assigned.
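
The CSV layout described above can be illustrated with a short Python sketch. This is not MetadataCSVPlug's Perl code: the function name read_csv_metadata, the column name "Filename" (its exact spelling is an assumption) and the sample records are all hypothetical.

```python
import csv
import io


def read_csv_metadata(csv_text, filename_field="Filename"):
    """Map each file name to the metadata found in its CSV record."""
    # The first line supplies the metadata names; DictReader uses it as the header.
    reader = csv.DictReader(io.StringIO(csv_text))
    if filename_field not in (reader.fieldnames or []):
        raise ValueError(f"CSV has no {filename_field!r} column")
    metadata_by_file = {}
    for record in reader:
        filename = (record.pop(filename_field) or "").strip()
        if not filename:
            continue  # a record without a file name cannot be attached to anything
        # Keep only named, non-empty values.
        metadata_by_file[filename] = {
            name: value
            for name, value in record.items()
            if name is not None and value
        }
    return metadata_by_file


# Hypothetical sample data: one header line, then one record per document.
SAMPLE_CSV = (
    "Filename,Title,Creator\n"
    "report.pdf,Annual Report,Jane Smith\n"
    "photo.jpg,Harbour at dusk,\n"
)
```

Calling read_csv_metadata(SAMPLE_CSV) gives a dictionary keyed by file name, which a later read pass could consult when the matching document is processed by its normal plugin.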


ImagePlug

  • Uses the ImageMagick utilities to
    • create derivatives (thumbnail images), and
    • extract image metadata (width, height, format).
  • ImagePlug can easily be extended to extract more extensive image metadata if required.


ReferPlug

  • Takes Refer-format bibliographies and reads them in (using the process method):
    • assigns metadata with the add_utf8_metadata or add_metadata methods
    • assigns text with the add_text method
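
As an illustration of a process step for Refer-format input, here is a minimal Python sketch. It assumes the usual Refer layout (records separated by blank lines, each field on a line starting with % and a one-letter tag). The ReferDocument class, the process_refer function, the tag-to-metadata-name mapping and the sample records are hypothetical and are not taken from Greenstone's Perl plugin.

```python
import re

# Assumed mapping from Refer tags to metadata names (illustrative only).
REFER_TAGS = {"A": "Creator", "T": "Title", "D": "Date", "J": "Journal"}


class ReferDocument:
    """Hypothetical document object with add_metadata and add_text methods."""

    def __init__(self):
        self.metadata = []
        self.text = ""

    def add_metadata(self, name, value):
        self.metadata.append((name, value))

    def add_text(self, text):
        self.text += text


def process_refer(text):
    """Split a Refer file into records and build one document per record."""
    documents = []
    for block in re.split(r"\n\s*\n", text.strip()):
        if not block.strip():
            continue
        doc = ReferDocument()
        for line in block.splitlines():
            # A field line looks like "%T Some title".
            if line.startswith("%") and len(line) > 2:
                name = REFER_TAGS.get(line[1])
                value = line[2:].strip()
                if name and value:
                    doc.add_metadata(name, value)
        # The raw record is kept as the document text.
        doc.add_text(block + "\n")
        documents.append(doc)
    return documents


# Hypothetical sample input with two records.
SAMPLE_REFER = (
    "%A A. Author\n"
    "%A B. Author\n"
    "%T First Example Title\n"
    "%D 2001\n"
    "\n"
    "%A C. Author\n"
    "%T Second Example Title\n"
)
```

This is also an example of a plugin that splits one file into several documents: each record in the bibliography becomes its own document.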

Note on methods: order called

  • metadata_read: first to be called, usually by RecPlug but also by MetadataCSVPlug
    • in RecPlug, Greenstone metadata.xml files are read by the metadata_read method
    • in MetadataCSVPlug, a .csv text file whose first line contains the field names is read by metadata_read
  • read: called after metadata_read
  • process: called last, from read (BasPlug's read calls process)

Adding metadata

  • add_utf8_metadata adds metadata that is already in UTF-8.
  • add_metadata is for metadata that is not already in UTF-8; it converts the value to UTF-8 before adding it.
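
The difference between the two methods can be shown with a small Python model. This is a sketch only: the MetadataStore class is hypothetical, and the encoding parameter (defaulting to ISO-8859-1) is an assumption, since the text above does not say how the source encoding is determined.

```python
class MetadataStore:
    """Hypothetical document object that stores metadata values as UTF-8 bytes."""

    def __init__(self):
        self.metadata = {}

    def add_utf8_metadata(self, name, value):
        # The caller promises that value is already UTF-8, so store it unchanged.
        self.metadata.setdefault(name, []).append(value)

    def add_metadata(self, name, value, encoding="iso-8859-1"):
        # Decode from the (assumed) source encoding, re-encode as UTF-8, then store.
        self.add_utf8_metadata(name, value.decode(encoding).encode("utf-8"))


store = MetadataStore()
store.add_utf8_metadata("Title", "Café".encode("utf-8"))
store.add_metadata("Title", "Café".encode("iso-8859-1"))
```

Both calls leave the same UTF-8 bytes in the store. Passing bytes that are not UTF-8 to add_utf8_metadata would store them unconverted, which is why the choice of method matters.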

- Thanks to Katherine Don for this text, which I have only edited slightly.