
NAME

mgintro++ - extended introduction to the MG system

DESCRIPTION

This manual assumes the reader has already read mgintro(1).

Creating Different Databases
To build a database other than one of the predefined ones ("alice", "davinci", "mailfiles", "allfiles"), the user has two choices. Either way, the end result must be text in which each document is terminated by a control-B. One can either produce one or more such files directly, or write a "get" command (typically a script or C program) that generates the text.

Using Input Files for mgbuild
If you do not want to write a "get" script and just want to use one or more text files as input, you must first put control-Bs into those files. For a simple example, take any text files, such as "test1.txt" and "test2.txt", and use vi(1) to insert a control-B at the end of each document by typing control-V followed by control-B in insert mode. Next, create a file with "set" statements in the following form:

set pipe = 0 # do not use pipe - use file instead
set input_files = 'test1.txt test2.txt'

Let's call this file "build_options". Now issue the command:

mgbuild -s build_options test

This should build a database called "test" in the $MGDATA directory, based on the source data of "test1.txt" and "test2.txt". The build_options file is simply sourced by mgbuild(1) after it has set up its variables. Therefore, any settings one makes in the build_options file will override the standard settings. See mgbuild(1) for more information.
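As an alternative to inserting control-Bs by hand in an editor, the input files can be generated by a small script. The following is a minimal sketch, not part of the MG distribution: the function name join_with_ctrl_b and the file names in the usage comment are hypothetical, and it assumes that each source file should become exactly one document.

```shell
#!/bin/sh
# Sketch only: join_with_ctrl_b is a hypothetical helper, not an MG command.
# It copies each named file to stdout and writes a control-B (octal 002)
# after it, so that every source file becomes one terminated document.
join_with_ctrl_b() {
    for f in "$@"; do
        cat "$f" || return 1    # stop if a file cannot be read
        printf '\002'           # document terminator
    done
}

# Usage (hypothetical source file names):
#   join_with_ctrl_b letter1 letter2 > test1.txt
```

The resulting file can then be listed in input_files in the build_options file described above.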

Writing A Get Program
Instead of using files as input, it is often more convenient to write a "get" program. mgbuild(1) calls this program to obtain the text data, with control-Bs as document terminators. The program should accept three options: (i) -init; (ii) -text; (iii) -cleanup.
mgbuild(1) calls the get program with -init first and with -cleanup at the end. In between, it calls the program with -text whenever it wants the text, and the program should then write the text to stdout. See mg_get(1) for an example.
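The protocol described above might be sketched as a shell script along the following lines. This is an illustration under stated assumptions, not the mg_get(1) shipped with MG: the SRC_DIR variable, its default value, and the choice of treating each *.txt file as one document are all hypothetical, and because the exact argument order used by mgbuild(1) is not specified here, the sketch simply scans all of its arguments for one of the three options.

```shell
#!/bin/sh
# Hypothetical "get" program: every *.txt file in SRC_DIR is one document.
# SRC_DIR and its default are assumptions made for this sketch.
SRC_DIR=${SRC_DIR:-$HOME/docs}

get_main() {
    for arg in "$@"; do
        case "$arg" in
            -init)
                return 0 ;;             # nothing to prepare in this sketch
            -text)
                for f in "$SRC_DIR"/*.txt; do
                    [ -f "$f" ] || continue
                    cat "$f"            # document text goes to stdout
                    printf '\002'       # control-B terminates the document
                done
                return 0 ;;
            -cleanup)
                return 0 ;;             # nothing to remove in this sketch
        esac
    done
    echo "usage: get [-init | -text | -cleanup]" >&2
    return 2
}

# Dispatch only when arguments are given, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    get_main "$@"
fi
```

A real get program would typically use -init to create temporary files or unpack archives and -cleanup to remove them again.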

Regular Builds

The MG system provides a static database; there are no update commands. To keep a database reasonably up to date, one can have it rebuilt automatically on a regular basis by cron(1). A crontab file can be created using:

crontab -e

A crontab file contains lines of the form:

minute hour day-of-month month day-of-week shell-command

See crontab(1) for more information.
An example crontab entry is:

15 02 * * * mgbuild allfiles >$MGDATA/allfiles/allfiles.log 2>&1

This rebuilds the MG database for "allfiles" (your mail in the folders) every morning at 2:15am, with standard output and standard error redirected to the log file $MGDATA/allfiles/allfiles.log.

Command Structure
There are 22 commands that make up the MG system. However, most users need to be aware of only a few: mgbuild(1), mgquery(1), and perhaps mg_get(1). Many of the other commands are called by mgbuild(1). The commands can be arranged in a hierarchy.

-------------------------------------
MG--+--image compression
    |       +--mgbilevel
    |       +--mgfelics
    |       +--mgtic
    |       +--mgticbuild
    |       +--mgticdump
    |       +--mgticprune
    |       +--mgticstat
    +--text
            +--compression
            |       +--mg_passes -T1
            |       +--mg_passes -T2
            |       +--mg_compression_dict
            |       +--mg_fast_comp_dict
            +--indexing
            |       +--mg_passes -N1
            |       +--mg_passes -N2
            |       +--mg_perf_hash_build
            |       +--mg_invf_dict
            |       +--mg_invf_rebuild
            +--weights
            |       +--mg_weights_build
            +--query
            |       +--mgquery
            +--tools
                    +--mg_invf_dump
                    +--mg_text_estimate
                    +--mgdictlist
                    +--mgstat
-------------------------------------

mgbuild(1) calls the following commands:
mg_passes(1), mg_compression_dict(1), mg_perf_hash_build(1), mg_invf_dict(1), mg_invf_rebuild(1), mg_weights_build(1)

SEE ALSO

mgintro(1), mgbuild(1), mg_get(1)
"Guide To The MG System", in Appendix A of the book:

Ian H. Witten, Alistair Moffat, and Timothy C. Bell
Managing Gigabytes: Compressing and Indexing Documents and Images
Van Nostrand Reinhold
1994
xiv + 429 pages
US$54.95
ISBN 0-442-01863-0
Library of Congress catalog number TA1637 .W58 1994.

