A compiler for formal metadata
This program, mp, is a compiler that parses formal metadata, checks the
syntax against the FGDC Content Standard for Digital Geospatial Metadata, and
generates output suitable for viewing with a web browser or text
editor. It runs on UNIX systems and on PCs running Windows 95, 98,
or NT. mp generates a textual report indicating errors in the metadata,
primarily in the structure but also in the values of some of the scalar
elements (i.e., those whose values are restricted by the standard).
The compiler, its source code, executables for UNIX (DG/UX, HP/UX, IRIX,
and Linux) and Microsoft Windows 95, 98, and NT, and its
own formal metadata are available through
<http://geology.usgs.gov/tools/metadata/>
Usage
Basic usage is
mp [options] input_file
where input_file is the name of a text file containing metadata
encoded as described in the encoding format
document or in SGML conforming to a specific
Document Type Definition (DTD). These command-line options are available:
  -c cfile    obtains configuration information from cfile
  -e efile    directs syntax errors to efile
  -t tfile    creates text output in tfile
  -h hfile    creates HTML output in hfile
  -f ffile    creates FAQ-style HTML output in ffile
  -s sfile    creates SGML output in sfile
  -d dfile    creates DIF output in dfile
Syntax error messages indicate the nature of discrepancies between
the input file and the standard, and the line numbers of the relevant
elements in the input file. If -e efile is not specified,
syntax errors are written to stderr, which is usually the console
(for MS-DOS) or the terminal from which the compiler is launched.
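For scripted use, the command line described above can be assembled and run from Python. This is only a sketch: it assumes an executable named mp is on the PATH, and the names OPTION_FLAGS, build_mp_command, and run_mp, as well as the dictionary keys, are hypothetical names chosen for this illustration, not part of mp. Only the option letters come from the list above.

```python
import subprocess

# Option letters are those listed above; the keys are illustrative
# names invented for this sketch, not anything mp itself defines.
OPTION_FLAGS = {
    "config": "-c",
    "errors": "-e",
    "text": "-t",
    "html": "-h",
    "faq": "-f",
    "sgml": "-s",
    "dif": "-d",
}


def build_mp_command(input_file, **outputs):
    """Return the argument list for one mp run: mp [options] input_file."""
    command = ["mp"]
    for name, path in outputs.items():
        command += [OPTION_FLAGS[name], path]
    command.append(input_file)
    return command


def run_mp(input_file, **outputs):
    """Run mp and return whatever it wrote to stderr.

    Assumes an executable named mp is on the PATH. When -e is not
    given, syntax errors go to stderr, so they end up in the result.
    """
    result = subprocess.run(
        build_mp_command(input_file, **outputs),
        stderr=subprocess.PIPE,
        text=True,
    )
    return result.stderr
```

For example, build_mp_command("meta.txt", errors="meta.err", html="meta.html") yields the argument list for mp -e meta.err -h meta.html meta.txt.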
Input
Since the FGDC Content Standard for Digital Geospatial Metadata, as the
name implies, specifies only the contents of metadata files and not
their encoding, it was necessary to devise a
specification for metadata encoding in order to develop and use this
compiler. The encoding format is purely textual and the fidelity of the
compiler to this format is fanatical.
Note: mp does not read word-processor documents; it reads only ASCII text, SGML, and XML!
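A rough pre-flight check can catch word-processor files before they are handed to mp. The sketch below is a heuristic, not something mp provides: the function name looks_like_ascii_text is hypothetical, and the test simply requires every byte to be printable ASCII, a tab, or a line ending. Under that assumption, SGML or XML input containing non-ASCII characters would also be flagged.

```python
def looks_like_ascii_text(data):
    """Return True if every byte is printable ASCII, tab, LF, or CR.

    Heuristic only: binary word-processor files normally contain
    bytes outside this set, so they fail the check.
    """
    allowed = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    return all(byte in allowed for byte in data)
```

A caller would read the candidate file in binary mode and pass its bytes to the function before running mp on it.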
Output
- Text output, if requested, follows the
encoding format. This provides a check of the compiler; any such
program should be able to reproduce its input without significant loss of
information.
- HTML output, if requested, uses descriptive lists to arrange the elements
hierarchically. The HEAD element of the HTML output contains
META elements corresponding to the Dublin
Core.
- FAQ-style HTML output, if requested, uses the general arrangement
of information found in Metadata in Plain Language
and re-expresses the metadata in a manner that is easier to read. This
format is not parseable in subsequent software processing. To see how
mp writes standard metadata elements in this output format, consult
Dublin Core elements are added to the HEAD
element as META tags.
NOTE: mp now provides in its HTML output a link to each of
the other output formats that you requested when running mp. These
links are relative to the current directory by default, and will work
correctly when someone retrieves a metadata record directly through
a web server. However, HTML metadata records retrieved through the
Clearinghouse gateway interface come tagged with the URL of the gateway;
consequently, these links will not work by default with HTML records
found through the gateway interface. To make these links work without
regard to the retrieval method, place a BASE tag into the
HEAD element of the output HTML code. As you might guess,
mp can do this for you, but it needs to know the URL where your
metadata will be available as web pages. It gets this information
from a config file entry as follows:
  output:
    html:
      base: URL
So if your web site has a URL like
http://www.our-data.org/metadata/
that will contain your metadata records, put this into your config file:
  output:
    html:
      base: http://www.our-data.org/metadata/
You must then run mp with the -c config_file command-line option,
substituting for config_file the name of the actual config
file you'll be using.
- SGML output uses the eight-character tags proposed by
the FGDC Clearinghouse Working Group. The SGML output is designed to work
with a Document Type Definition (DTD) that I have
developed and tested.
- XML output uses the eight-character tags given in the 1998
version of the CSDGM. The XML output is designed to work with a
Document Type Definition (DTD) that
I have developed and tested.
- Directory Interchange Format (DIF) output will require editing to fix
inconsistencies between the DIF and FGDC metadata standards, and to add
information required by DIF that is not clearly identified in the FGDC
scheme, such as Entry_ID.
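The config file entry described in the note above (output, html, base) can also be written by a script. This is a minimal sketch under stated assumptions: the function names html_base_config and write_config are hypothetical, and the sketch assumes that nesting in the config file is expressed by indentation, with a two-space step chosen arbitrarily for illustration.

```python
def html_base_config(base_url):
    """Return config text holding an output/html/base entry.

    Assumes nesting is shown by indentation; the two-space step is
    an arbitrary choice made for this sketch.
    """
    return "output:\n  html:\n    base: " + base_url + "\n"


def write_config(path, base_url):
    """Write the config text to path as plain ASCII."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(html_base_config(base_url))
```

Calling write_config("mp.cfg", "http://www.our-data.org/metadata/") would produce a file to pass to mp with -c mp.cfg; the file name mp.cfg is only an example.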
Acknowledgements
Questions, comments, and suggestions are welcome. A number of people have
assisted me with performance and portability issues, including
Chuck Denham (USGS),
Mark Graves (US Army),
Sol Katz (BLM),
Tom McCulloch (USGS),
Eric Miller (OCLC),
Doug Nebert (USGS),
Tom Northcutt (NASA),
Barbara Poore (USGS),
Chuck Stein (Mirror Imaging), and
Susan Stitt (NBS).
Technical contact:
Peter N. Schweitzer
Mail Stop 918, National Center
U.S. Geological Survey
Reston, VA 20192
Tel: (703) 648-6533
FAX: (703) 648-6560
Email: pschweitzer@usgs.gov
This file is <http://geology.usgs.gov/tools/metadata/tools/doc/mp.html>
Last updated 29-Apr-1999