NAME

mgbuild - build an mg system database

SYNOPSIS

mgbuild [ -c ] [ -g get ] [ -s source ] collection-name

DESCRIPTION

mgbuild is a csh script that runs, in the correct order, all the programs needed to build a complete mg(1) system database, leaving it ready for queries by mgquery(1). By default it uses the mg_get(1) script to obtain the text of the collection.

OPTIONS

Options can occur in any order, but the collection name must be last.

-c
This specifies that the get program is "complex". A "complex" get program requires initialisation and cleanup, which are requested by running it with its -i and -c options respectively.

-g get
This specifies the program to use for getting the source text for the build. If no -g option is given, the default program mg_get(1) is used.

-s source
The mgbuild program consists of two parts. The first part initializes variables to default values. The second part uses these variables to control how the mg(1) database is built. This option specifies a program to execute between the first and second parts. The details of what the variables are, and how they may be changed, are in comments in the mgbuild program.

collection-name
This is the collection name, as required by the mg_get(1) program. It serves both as a case statement selector and as the name of a subdirectory that holds the indexing files.
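
As an illustration of what a "complex" get program might look like, here is a minimal sketch written as a POSIX shell function. It is not the real mg_get(1). The name my_get, the collection name demo, the argument order (collection name first, then a flag), the -t flag for emitting text, and the scratch directory are all assumptions made for the example; only the -i and -c options and the case-statement selection on the collection name come from the description above.

```shell
# Hypothetical "complex" get program (sketch only, not the real mg_get).
# Assumed calling convention: my_get <collection-name> <flag>
#   -i  initialise, -t  write the collection text to stdout, -c  clean up
my_get() {
    collection=$1
    flag=${2:--t}
    # Scratch directory used between -i and -c (an assumption of this sketch).
    workdir=${TMPDIR:-/tmp}/my_get.$collection

    # The collection name acts as the case statement selector.
    case "$collection" in
    demo)
        case "$flag" in
        -i) mkdir -p "$workdir"
            printf 'first document\nsecond document\n' > "$workdir/text" ;;
        -t) cat "$workdir/text" ;;
        -c) rm -rf "$workdir" ;;
        *)  echo "my_get: unknown flag $flag" >&2; return 2 ;;
        esac ;;
    *)
        echo "my_get: unknown collection $collection" >&2; return 1 ;;
    esac
}
```

Saved as an executable script that ends with a call such as my_get "$@", a program like this could then be named with -g and marked as complex with -c, for example: mgbuild -c -g ./my_get demo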

ENVIRONMENT

MGDATA
If this environment variable exists, then its value is used as the default directory where the mg(1) collection files are. If this variable does not exist, then the directory "." is used by default. The command line option -d directory overrides the directory in MGDATA.
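
The MGDATA fallback can be sketched in shell as follows. The function name collection_dir and the collection name alice are illustrative, and the assumption that the collection subdirectory sits directly under the MGDATA directory is inferred from the text above rather than taken from the mgbuild source.

```shell
# Sketch: resolve the directory assumed to hold a collection's files.
# If MGDATA exists its value is used; otherwise "." is used.
collection_dir() {
    # ${MGDATA-.} falls back to "." only when MGDATA is unset.
    printf '%s/%s\n' "${MGDATA-.}" "$1"
}

collection_dir alice
```

With MGDATA unset this prints ./alice; with MGDATA set to /data/mg it prints /data/mg/alice.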

FILES

*.invf
Inverted file.

*.invf.chunk
Inverted file chunk descriptor file. When the inverted file is created, it is written in chunks that use no more than a set amount of memory. This file describes those chunks.

*.invf.chunk.trans
Word-occurrence-order to lexical-order translation file. The *.invf.chunk file is written in word-occurrence order but is required by -N2 to be in lexical order.

*.invf.dict
Compressed stemmed dictionary.

*.invf.dict.blocked
The "on-disk" stemmed dictionary.

*.invf.dict.hash
Data for an order-preserving perfect hash function.

*.invf.idx
The index into the inverted file.

*.weight
The exact weights file.

*.text
Compressed documents.

*.text.stats
Text statistics.

*.text.dict
Compressed compression dictionary.

*.text.idx
Index into the compressed documents.

*.text.idx.wgt
Interleaved index into the compressed documents and document weights.

*.weight.approx
Approximate document weights.
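
To check a collection against the list above, a small helper along these lines may be useful. It is only a sketch: the function name missing_files is hypothetical, and it assumes each file is named after the collection and stored as <dir>/<name>/<name><suffix>. Whether every listed file is kept after a build is not stated here, so a missing file is not necessarily an error.

```shell
# Sketch: print the listed files that are absent for a collection.
# Assumption: files are stored as <dir>/<name>/<name><suffix>.
missing_files() {
    dir=$1
    name=$2
    for suffix in .invf .invf.chunk .invf.chunk.trans .invf.dict \
        .invf.dict.blocked .invf.dict.hash .invf.idx .weight .text \
        .text.stats .text.dict .text.idx .text.idx.wgt .weight.approx
    do
        # Report each expected file that does not exist.
        [ -e "$dir/$name/$name$suffix" ] || echo "$name$suffix"
    done
}
```

For example, missing_files "${MGDATA-.}" alice would print one line per absent file for a collection named alice.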

SEE ALSO

mg(1), mg_compression_dict(1), mg_fast_comp_dict(1), mg_get(1), mg_invf_dict(1), mg_invf_dump(1), mg_invf_rebuild(1), mg_passes(1), mg_perf_hash_build(1), mg_text_estimate(1), mg_weights_build(1), mgbilevel(1), mgdictlist(1), mgfelics(1), mgquery(1), mgstat(1), mgtic(1), mgticbuild(1), mgticdump(1), mgticprune(1), mgticstat(1).

