GeMS — Content

Overview

Geologic maps are rich in semantic content, fundamental aspects of which must be preserved in a database for it to be useful. Such fundamental content appears in magenta highlight bold type in the outline below. Items in black bold type are secondary or supporting content. Digital forms for both primary and secondary content are further discussed below where the GeMS schema is described in detail (clicking on the content name will take you to the digital form description). Items in black italic bold type are secondary or supporting content for which a digital form is not specified herein.

Map Graphic—Consists of some or all of the following:

Base map—Graphical elements that provide spatial and cultural context for map data at a certain scale (for example, shaded relief, topographic and (or) bathymetric contours, drainage features, township and range grid, transportation network, cultural features, place names); although not required, inclusion of a base map is strongly recommended.
Map-Unit Polygons—Polygons that cover the map area with no gaps or overlaps; may include areas of open water, permanent snowfields and glaciers, and unmapped areas.
Contacts and Faults—Lines that, with a few exceptions, bound and separate map-unit polygons.
Other elements that are present as needed to record significant map content:

Orientation Data—Point features that record measurements of the orientation of rock fabrics (for example, bedding, foliation, paleocurrents, magnetic anisotropy).
Overlay Polygons—Areal features that overlie, underlie, are within, or cut across map-unit polygons (for example, shear zones, alteration zones, areas of artificial fill, surface projections of mined-out areas); need not conform to topological rules that constrain map-unit polygons; and are commonly represented by patterned overlays whose boundaries are not drawn.
Map-Unit Lines and Points—Line or point features that represent map units that are too small to show at scale of map graphic.
Miscellaneous Lines—Line features not categorized above (for example, dikes and sills, marker beds, traces of fold hinges, facies boundaries, structure contours, isograds, cross-section lines).
Miscellaneous Point Data—Point features that record, for example, sample localities for geochronologic and chemical analyses, fossil localities, prospect locations, and displacement (fault-slip) measurements.

Cross Sections, as appropriate—Each cross section has elements that are analogous to those of the map (listed above), except that the base map is replaced by a topographic profile.
Correlation of Map Units (CMU)—Diagram that depicts the ages and relations of (and among) map units; may include headings and grouping brackets; sometimes shows nonconformities or additional data such as radiometric ages; is not intended to be a true stratigraphic section or column; usually does not include areas of open water, permanent snowfields and glaciers, or unmapped areas.
Symbolization for all of the above, including the following:

Fills (Colors and Optional Patterns) for map-unit polygons.
Fills (Patterns) for overlay polygons.
Line Symbols and (or) Point Symbols for map-unit areas too small or narrow to show as polygons at map scale.
Text labels (or, in some cases, labels and leaders) for some (but not necessarily all) polygons.
Line Symbols Having Various Colors, Weights, and Patterns (Dashes, Dots)— Some line symbols contain repeated ornaments (for example, thrust teeth, queries), and some have single (nonrepeating) ornaments.
Point Symbols—Note that not all points in the database need be symbolized.
Font Files—If needed for point symbols or line-symbol ornaments.
Text labels (or, in some cases, labels and leaders) for some lines and groups of lines and for some points (for example, fault names, well identifiers).

Description of Map Units (DMU)—Contains labels, names, ages, and descriptions for each map unit shown on the map or in a cross section:

A DMU contained on a map sheet will show, in addition to the map-unit label, the area-fill symbol (color and sometimes a pattern) used to depict the unit on the map.
In some map reports, the DMU is not printed on the map sheet but, instead, is contained in an accompanying pamphlet; in these DMUs, typically only the map-unit label is shown, without color or pattern.
DMUs can be strongly hierarchical, using headings (for example, “Unconsolidated Deposits”) and paragraph styles (font size and style, indentation) to denote each unit’s position in the hierarchy.
DMUs may include bracketed headnotes (for example, “[Modified from Booth and others (2009)]”).
DMUs sometimes contain units that are not shown on the map or cross section (for example, a formation that is mapped entirely as its constituent subunits); such units will lack a map-unit label and an area-fill color or pattern in the DMU (because they are not depicted on the map), but, in order to fully communicate stratigraphic relations, their names and ages are listed in the DMU and their positions in the hierarchy are indicated.
DMUs typically do not describe areas of open water, permanent snowfields and glaciers, unmapped areas, and certain geologic overlays.

List of Map Units (LMU), if applicable—A distillation of the DMU, placed on the map sheet as a key to help identify map units when the DMU is contained in an accompanying pamphlet; an LMU contains the same map-unit labels, names, and ages as does the DMU but not the full map-unit descriptions; an LMU also uses the same hierarchical elements (headings, type styles, indentations, and so on) as the DMU.
Explanation of Map Symbols—Contains descriptions of line symbols, point symbols, and overlay patterns but (typically) not symbols for base-map features; map-symbol explanations can be hierarchical.
Miscellaneous Map-Collar Information—Consists of information such as report title, author(s), date of publication, publisher, series name and series number, map scale, various credit notes (for example, “mapped by”, “edited by”, “GIS database by”, “cartographic production by”), and base credit note (for example, specification of spatial reference framework or projection, source[s] and scale[s] of base map); content may vary from agency to agency, from one mapper to another, and from one map to another.
Discussion.
References Cited.
Figure(s), if applicable.
Table(s), if applicable.
Additional map(s), if applicable—For example, sources of map data, distribution of facies in the Cambrian.

It is worth noting that, in general, populating the database by assigning values to feature attribute fields is the responsibility of the database creator. In some cases, this may require manual entry of individual attribute values for each feature. However, in many cases, such assignments can be made in bulk or can be scripted, as noted below (see, for example, the scripts GeMS_SetSymbols_Arc10.py and GeMS_SetPlotAtScales_Arc10.py; both of these, or their replacements, are available at https://github.com/usgs/GeMS_Tools and at https://github.com/usgs/gems-tools-pro).

Extensions to Traditional Geologic Map Content

GeMS includes several extensions to traditional geologic map content. The following three are required: (1) a glossary of terms, (2) a simple classification of map-unit materials, and (3) certain feature-level metadata. Optional extensions may include supplemental standardized lithologic descriptions of map units and a simple table of miscellaneous map information.

(1) Glossary of Terms

Many published digital geologic map databases, and many paper geologic maps as well, provide few, if any, definitions for the technical terms used to name and describe map features. Some producers of geologic map databases have remedied this by providing formal Federal Geographic Data Committee (FGDC) metadata, which can contain detailed entity and attribute descriptions that encapsulate definitions (and definition sources) for these terms. Unfortunately, such metadata can be difficult to access and nearly impossible to relate automatically to the relevant features in a database.

GeMS implements a Glossary table that, for certain database fields, lists the terms that populate these fields, along with their definitions and the sources for these definitions. Terminology used in these database fields must be defined in this Glossary table. Contents of the Glossary table may be accessed by means of a database join (or relate) that is based on the term itself. Formal metadata for a feature class or table can then reference the Glossary table for definitions and definition sources; therefore, relisting these definitions within detailed entity and attribute metadata is not necessary. Terms used only in the Description field of a DescriptionOfMapUnits table need not be defined.
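The join on the term itself can be sketched as follows, using Python's standard-library sqlite3 module and an in-memory database. This is only an illustration: the table and field names (ContactsAndFaults, Type, Term, Definition, DefinitionSourceID), the identifier values, and the definition text are assumptions made for the example and are not taken from the text above.

```python
import sqlite3

# In-memory database with a tiny Glossary table and one feature table.
# All names and rows below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Glossary (
    Term TEXT PRIMARY KEY,
    Definition TEXT,
    DefinitionSourceID TEXT
);
CREATE TABLE ContactsAndFaults (
    ContactsAndFaults_ID TEXT PRIMARY KEY,
    Type TEXT
);
INSERT INTO Glossary VALUES ('contact', 'Illustrative definition of contact.', 'DAS1');
INSERT INTO Glossary VALUES ('fault', 'Illustrative definition of fault.', 'DAS1');
INSERT INTO ContactsAndFaults VALUES ('CAF001', 'contact');
INSERT INTO ContactsAndFaults VALUES ('CAF002', 'fault');
INSERT INTO ContactsAndFaults VALUES ('CAF003', 'thrust fault');
""")

# Join each feature to its definition, using the term itself as the key.
# A LEFT JOIN keeps features whose term has no Glossary entry.
joined = conn.execute("""
SELECT f.ContactsAndFaults_ID, f.Type, g.Definition
FROM ContactsAndFaults AS f
LEFT JOIN Glossary AS g ON g.Term = f.Type
ORDER BY f.ContactsAndFaults_ID
""").fetchall()

# Terms used in the feature table that still need a Glossary definition.
undefined_terms = sorted({term for _, term, definition in joined if definition is None})
print(undefined_terms)
```

A left join is used so that terms lacking a definition surface as gaps rather than silently dropping features; such gaps are exactly the entries that would still need to be added to the Glossary table.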

If populating the Glossary table seems excessively laborious, consider that, once terms are defined in a Glossary table, they are readily available for display within the map, either on screen or in print. Furthermore, they can be easily searched for and extracted for use in other publications. In most cases, definitions in the Glossary table can be copied directly (or paraphrased) from standard reference sources (for example, American Geosciences Institute’s [AGI’s] “Glossary of Geology (5th ed., revised)”; Neuendorf and others, 2011), with appropriate attribution, or from preexisting GeMS Glossary tables, with minor adjustments or amendments. Although building Glossary tables for the first few maps produced by a workgroup will be a significant effort, subsequent Glossary tables ought to become much easier to develop as content from previous Glossary tables is reused.

(2) Classification of Geologic Materials

A DMU conveys essential information about each map unit and is a cornerstone of the GeMS design. However, descriptions in a DMU can vary greatly in their content and format, and they commonly use specialized terminology that may be unfamiliar to the nongeologist. In addition, terminology may be used inconsistently from map to map (usually for valid reasons). Over time, many attempts have been made to organize and standardize descriptions of geologic materials, with the goals of improving our abilities to make regional compilations and to more effectively convey geologic information to the public. Of necessity, such efforts are compromises that only partly describe the nearly infinite variety of map-unit ages, compositions, textures, modes of genesis, and appearances.

In 2004, the North American Geologic Map Data Model Steering Committee (see https://ngmdb.usgs.gov/www-nadm/)— sponsored by the USGS, the Association of American State Geologists, and the Geological Survey of Canada—defined a general, conceptual data model for geologic maps, as well as a science language for describing various characteristics of earth materials. Their summary report on science language (North American Geologic Map Data Model Steering Committee Science Language Technical Team, 2004) presented classifications that have since been evaluated and adapted for many purposes; for example, the IUGS-CGI⁸ Geoscience Concept Definitions Working Group incorporated that work into a limited set of lithology categories (“SimpleLithology”) for use in GeoSciML⁹ interchange documents (see http://resource.geosciml.org/def/voc/).

A similar list of terms (“StandardLithology”) accompanied the initial release of NCGMP09 (version 1.0, October 2009). As with SimpleLithology, the StandardLithology list of terms was designed to be used with companion lists of proportion terms or values to encode the relative amounts of potentially numerous lithologies that could be found in each map unit. This approach encouraged multiple lithology entries for a map unit, thereby allowing detailed descriptions of map units. However, the level of effort required to populate StandardLithology was judged by many reviewers of version 1.0 to be too high. Therefore, this approach was abandoned for version 1.1 of NCGMP09 (and for GeMS).

Nevertheless, we remain convinced that standardized terminologies are beneficial, largely because of their potential to do the following:

Allow more uniform portrayal of rock and sediment types across multiple maps.
Facilitate queries for the presence of a particular rock type. By using a hierarchical classification, both the queried rock type and its related rock types can be found; for example, if “lava flows” is queried, then both “felsic lava flows” and “igneous rock” may also be returned.
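A hierarchical query of this kind can be sketched in a few lines of Python. The small hierarchy below is hypothetical and is not the GeoMaterial term list; it reuses the three terms from the example above and adds "mafic lava flows" and "sedimentary rock" purely for illustration.

```python
# Hypothetical hierarchy: each term maps to its parent term (None for a top-level term).
PARENT = {
    "igneous rock": None,
    "lava flows": "igneous rock",
    "felsic lava flows": "lava flows",
    "mafic lava flows": "lava flows",
    "sedimentary rock": None,
}


def ancestors(term):
    """Return the broader terms above `term`, nearest first."""
    found = []
    parent = PARENT[term]
    while parent is not None:
        found.append(parent)
        parent = PARENT[parent]
    return found


def descendants(term):
    """Return all narrower terms below `term`, at any depth."""
    found = []
    for child, parent in PARENT.items():
        if parent == term:
            found.append(child)
            found.extend(descendants(child))
    return found


def related_terms(term):
    """Return the queried term together with its broader and narrower terms."""
    return sorted(ancestors(term) + [term] + descendants(term))


print(related_terms("lava flows"))
```

Querying "lava flows" returns its parent ("igneous rock") and its children as well as the term itself, while the unrelated "sedimentary rock" is left out.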

Bearing in mind the importance of providing the public a simple, systematic view of the Nation’s geology, NCGMP09 v.1.1 included, and GeMS retains, a simplified classification of earth materials that is based on general lithologic and genetic character. This classification (called GeneralLithology in NCGMP09 but renamed GeoMaterial in GeMS) applies a single term to each map unit, providing information that a nonexpert could quickly use to identify map units that contain similar materials. Although GeoMaterial is a required field in GeMS, its content is not intended to be a substitute for the more detailed and precise lithologic terminology that would be included in a DMU or in a more detailed and specialized controlled-term list; rather, its purpose is as stated briefly above (and also as discussed in detail in Soller, 2009).

The GeoMaterial classification, which has been developed empirically, is based on commonly occurring geologic materials. It serves to organize the many and varied geologic material terms found in the source maps, which represent a wide range of geologic conditions across the Nation, in order to provide a consistent means of displaying map-unit materials. It was initially developed for the NGMDB Data Portal, a prototype site (ca. 2008) intended to raise discussion with NGMDB partners in the state geological surveys regarding how to provide the public with an integrated view of regional-scale geologic maps, with links to the source-map information. Documentation of the original term list, including the rationale, was provided in Soller (2009). The GeneralLithology classification in version 1.1 of NCGMP09 was slightly modified from the original, and it has been further modified (slightly) on the basis of six years of evaluation and test implementation by the state geological surveys and USGS. For inclusion in this version (version 2, GeMS), it has been renamed GeoMaterial. Note that, given the lengthy period of evaluation, as well as the inherent challenges in modifying a controlled-term list (for example, updating previously published databases to a revised classification), we do not anticipate modifications to this list in the near future.

For some purposes, a single, standard earth-materials classification may not adequately address the geology of a given region in sufficient detail. Therefore, scientists may wish to attach a terminology to their research databases that is more detailed and structured than what is stipulated in GeMS. A more structured controlled-term list, for example, might be desired in order to query a database for minor lithologies within a map unit that are not adequately indicated, either by the map-unit name, its GeoMaterial term, or its Description in the DescriptionOfMapUnits table. In such cases, evaluation of the salary and programming costs versus the research and societal benefits of including supplemental data tables and vocabularies may motivate a mapping project to extend the GeMS schema; if so, we advocate using either an optional table of the geologist’s own design or the optional StandardLithology table (described in appendix 2).

Appendix 1 - Terms and definitions for the GeoMaterial and GeoMaterialConfidence fields.
GeoMaterials — terms and definitions from Appendix 1 of the GeMS documentation, in spreadsheet form.
GeoMaterialDict - A nonspatial table that provides definitions and a hierarchy for GeoMaterial names prescribed by the GeMS database schema.
GeoMaterial and GeoMaterialConfidence fields in a DMU.

(3) Feature-Level Metadata

All features in a geologic map database should be accompanied by an explicit record of the data source. Many features should also be accompanied by explicit statements of scientific confidence—for example, how confident is the author that a feature exists? Or that it is correctly identified? How confidently are feature attributes known? We recognize that these are challenging questions to which the field geologist may not be comfortable providing an answer, except in the most general sense. But we also recognize that geologic information commonly is used in a GIS in conjunction with other types of information (for example, cadastral surveys, road networks, pipelines) and that terms such as “accurately located” have a markedly different meaning for a pipeline or property line than for a geologic contact. Thus, in order to provide a general indication of the scientific (existence and identity) confidence and locational accuracy of geologic-map features, GeMS implements per-feature descriptions of scientific confidence and locational accuracy. For more discussion on this topic, please see Section 4 in the introductory text of the “FGDC Digital Cartographic Standard for Geologic Map Symbolization” (Federal Geographic Data Committee [FGDC], 2006, available online at https://ngmdb.usgs.gov/fgdc_gds/geolsymstd.php; see also U.S. Geological Survey, 2006). The following sections discuss the feature-level metadata required in GeMS.

Data Source (Provenance)

Typically, a single map database will have very few data-source records because many features will have identical sources. For a database composed entirely of new mapping, a single data source may be cited (“this report”). Some data elements will have compound sources: for example, geochemical analysis of a rock sample will typically have one source for the map location and stratigraphic provenance of the sample (that is, the field geologist’s reported data) and another source for the chemical analysis (that is, the geochemist’s reported data). In such cases, having multiple source fields in the data table would be appropriate (for example, LocationSourceID and AnalysisSourceID).

Locational Confidence (Spatial Accuracy)

Reported locations of geologic features and observation points commonly are uncertain for a number of reasons. For example, (1) there may be error in locating observation points because of global positioning system (GPS) errors or an imprecise base map, (2) positions of subtle features may be poorly known relative to well-located observation points, or (3) the locations of features may be known only by inference from locations of other features or observations. This uncertainty could be expressed as uncertainty in absolute location (geodetic accuracy); however, because uncertainty in absolute location often is unknown (especially with legacy data), because most users locate geologic features in relation to a base map, and because most spatial analyses of geologic map data are in relation to the base map or to other data in the same database, we have chosen to focus on uncertainty of location relative to the positions of other features in the database (for example, “How well located is this contact with respect to surrounding lithologic and strike-and-dip observations?”). With a sufficiently large database, this is equivalent to uncertainty in location relative to the base map.

We define locational confidence (contained in the database field LocationConfidenceMeters) as the combination of the positioning error of a known point relative to the base map (“How precisely do I know where I am?”) and the uncertainty in location of a geologic feature relative to that known point (“How precisely, relative to where I am, can I place this contact?”). For a well-exposed, sharp contact, the second factor would be zero, and the locational confidence becomes equivalent to the positioning error.

This usage differs from that advocated by Section 4.2 in the introductory text of the FGDC cartographic standard (FGDC, 2006), which suggested that spatial accuracy be expressed using the following three attributes: (1) locatability (values of “observable”, “inferred”, or “concealed”); (2) zone of confidence (value of a distance [for example, a value equivalent to one-twenty-fifth of an inch at map scale], which may or may not be the same for all parts of a map); and (3) positioning (values of “within zone of confidence” or “may not be within zone of confidence”). We have departed from the recommendation in the FGDC cartographic standard in order to create databases that are simpler to understand and are less dependent upon visualization scale. In addition, we believe our approach is more informative because the FGDC cartographic standard does not include guidance for quantitatively recording how precisely a feature is located if it is not positioned within the zone of confidence.

For point features, LocationConfidenceMeters should be reported as the estimated radius (in meters) of the circle of uncertainty around the point location. For line features, it should be the half-width of the zone within which a line is asserted to be located. Values of LocationConfidenceMeters are recorded as floating-point numbers because they are real, measurable quantities, not because they are precisely known. Table 1 shows an example picklist of values for the LocationConfidenceMeters field that may provide insight into how this field may be populated. Note that the values in this picklist (table 1) are merely suggestions; use of other values for LocationConfidenceMeters certainly is acceptable. In situations where locational confidence changes along the length of a line, it may be best to split the line and assign different values of LocationConfidenceMeters to the different line segments.

Table 1. Example picklist of values for the LocationConfidenceMeters field.
[Abbreviations: DEMs, digital elevation models; GPS, global positioning system; m, meter(s); NAIP, National Agriculture Imagery Program]

Example value (m)	Comments
5	Appropriate for well-defined features located in the field by clear-sky GPS, by inspection of high-resolution topography (for example, 1- or 2-m-resolution lidar DEMs), or by inspection of large-scale, well-rectified digital orthophotographs (for example, NAIP images)
25	Reasonable for locations established by inspection of 1:24,000-scale maps, or for “accurately located” features digitized from 1:24,000-scale paper source maps
50	May be appropriate for some “approximately located” lines on 1:24,000-scale maps; other “approximately located” lines on the same map may have values of 100 m or more
100	Appropriate for “accurately located” features digitized from 1:100,000-scale paper source maps
250	Appropriate for “accurately located” features digitized from 1:250,000-scale paper source maps, or when a geologist, working at 1:24,000 scale, says, “My confidence in locating this feature is exceptionally low”

Values of LocationConfidenceMeters can be visualized with semitransparent, proportional-width symbols (similar to buffers in GIS jargon) in which the half-widths of the semitransparent symbols are equal to the LocationConfidenceMeters values assigned to the features. Such visualizations are powerful tools for evaluating the appropriateness of LocationConfidenceMeters values as a map is being prepared.
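To get a feel for how wide such symbols would appear, the ground distance can be converted to a width on the page at a given map scale. The helper below is a minimal sketch of that arithmetic; the function name and the choice of millimeters are assumptions made for illustration, and the sample values are those of the example picklist in table 1.

```python
def symbol_width_mm(location_confidence_m, scale_denominator):
    """Full on-page width, in millimeters, of a proportional-width symbol.

    The half-width of the symbol equals LocationConfidenceMeters on the ground,
    so the full width is twice that value, reduced by the map scale.
    """
    if location_confidence_m < 0:
        raise ValueError("expected a non-negative LocationConfidenceMeters value")
    full_width_m = 2.0 * location_confidence_m
    # Ground meters -> page meters -> page millimeters.
    return full_width_m / scale_denominator * 1000.0


# Widths at 1:24,000 scale for the example picklist values.
for value_m in (5, 25, 50, 100, 250):
    print(value_m, round(symbol_width_mm(value_m, 24000), 2))
```

For example, a value of 25 m at 1:24,000 scale corresponds to a symbol about 2.08 mm wide on the page (50 m of ground distance divided by 24,000).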

When new geologic map databases are created, we expect that geologists will assign values of LocationConfidenceMeters to individual features or groups of features on the basis of their knowledge of mapping procedures, field conditions, and the nature of specific features. Even with, for example, a factor-of-two uncertainty, such values assigned by the original mapper or author are preferable to a null value. When transcribing legacy maps, an experienced field geologist will commonly be able to estimate useful values for LocationConfidenceMeters. In rare cases, such estimation may not be practical, and, in such cases, traditional qualitative descriptors of line accuracy such as “contact, approximate” may be placed in the Type field; the meanings of such qualitative descriptors must be defined in the Glossary table. A null value (for example, value = −9; see discussion below) may then be assigned to LocationConfidenceMeters; note that this is a numeric value, not a text string.

For certain types of lines (for example, most map boundaries), positions are calculated or assigned, not observed. For these lines there is no positional uncertainty, and LocationConfidenceMeters should be assigned a value of 0.0.
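The three kinds of LocationConfidenceMeters values discussed above (an estimated distance, the numeric null, and 0.0 for calculated lines) can be told apart with a small check such as the one sketched below. The function and category names are hypothetical, and −9 is used as the null value only because it is the example given in the text.

```python
NULL_VALUE = -9.0  # example numeric null from the text; an assumption, not a fixed rule


def classify_location_confidence(value):
    """Classify a LocationConfidenceMeters value as 'null', 'calculated', or 'estimated'."""
    # The field is numeric, so text such as "25" or "-9" is rejected.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("LocationConfidenceMeters must be a number, not text")
    value = float(value)
    if value == NULL_VALUE:
        return "null"          # no estimate was practical
    if value == 0.0:
        return "calculated"    # position assigned, not observed (for example, a map boundary)
    if value > 0.0:
        return "estimated"     # radius or half-width of the zone of uncertainty
    raise ValueError("negative values other than the null value are not meaningful")


for sample in (25, 0.0, -9):
    print(sample, classify_location_confidence(sample))
```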

Scientific Confidence, Identity Confidence, and Existence Confidence

According to the FGDC cartographic standard (FGDC, 2006), scientific confidence may have either a single dimension or multiple dimensions. For a map-unit area, scientific confidence will have one dimension (that is, confidence that the map unit is correctly identified). In the case of faults, contacts, and other feature traces, the situation is more complex. For example, uncertainty may arise as to whether a boundary between two units is a contact or fault, or what kind of fault is mapped; in both cases, this uncertainty would be specified by an identity confidence value. In some cases, however, one may suspect (but not be certain) that a fault is present. Similarly, one may map features such as fold-hinge surface traces, dikes, and marker beds where their existence is suspected but not certain. These uncertainties would be specified by an existence confidence value such as “questionable”. Note that contacts rarely are mapped where their existence is uncertain; if different map units are identified, a boundary of some sort must exist between them, and so the identity of that boundary may be questionable, but not its existence.

GeMS includes ExistenceConfidence and IdentityConfidence fields for line feature classes and an IdentityConfidence field for polygon and point observation features. We discussed at length whether to combine these confidence concepts into a single ScientificConfidence field in the database, perhaps with four or six values that would allow for various combinations of existence and identity confidence, but we decided that it makes more sense to separate them, as is specified in the FGDC cartographic standard. In many situations, default values for an entire map area would be appropriate, as is noted elsewhere in this report; in other situations, perhaps tools to efficiently assign varying confidence values can be developed by the GeMS user community. We expect that symbolization will, in many cases, be assigned on the basis of the appropriate confidence terms and feature type.

For most databases, all values of ExistenceConfidence and IdentityConfidence likely will be either “certain” or “questionable” (see table 2), although GeMS allows values and definitions other than these. Values of ExistenceConfidence and IdentityConfidence must be defined in the Glossary table. For some digital transcriptions of legacy or paper geologic maps, it may not be possible for the transcriber to assign values of ExistenceConfidence or IdentityConfidence; in these cases, the value of “unspecified” should be used. Note, however, that if a reviewer during the review process encounters a value of “unspecified” in a database of new mapping, its use should be questioned.

Table 2. Example picklist of values for the IdentityConfidence field.
[Definitions modified from FGDC cartographic standard (Federal Geographic Data Committee, 2006), p. 16–17, A-iii]

Value	Definition
certain	Identity of a feature can be determined using relevant observations and scientific judgment; therefore, one can be reasonably confident in the credibility of this interpretation
questionable	Identity of a feature cannot be determined using relevant observations and scientific judgment; therefore, one cannot be reasonably confident in the credibility of this interpretation. For example, IdentityConfidence = “questionable” would be appropriate when a geologist reasons, “I can see some kind of planar feature that separates map units in this outcrop, but I cannot be certain if it is a contact or a fault”
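The two review rules stated above (confidence values must be defined in the Glossary table, and “unspecified” is suspect in a database of new mapping) could be checked along the following lines. This is a hedged sketch: the function name, the tuple layout of the input, and the sample feature identifiers are assumptions made for illustration.

```python
def review_confidence_values(features, glossary_terms, new_mapping):
    """Return (feature_id, field, problem) tuples for confidence values needing attention.

    `features` is an iterable of (feature_id, field_name, value) tuples, and
    `glossary_terms` is the set of terms defined in the Glossary table.
    """
    problems = []
    for feature_id, field, value in features:
        if value not in glossary_terms:
            problems.append((feature_id, field, "value not defined in Glossary"))
        elif value == "unspecified" and new_mapping:
            problems.append((feature_id, field, "'unspecified' used in new mapping"))
    return problems


# Hypothetical sample data.
glossary_terms = {"certain", "questionable", "unspecified"}
features = [
    ("CAF001", "ExistenceConfidence", "certain"),
    ("CAF001", "IdentityConfidence", "questionable"),
    ("CAF002", "IdentityConfidence", "unspecified"),
    ("CAF003", "ExistenceConfidence", "probable"),
]
print(review_confidence_values(features, glossary_terms, new_mapping=True))
```

With `new_mapping=False` (for example, a transcription of a legacy map), the “unspecified” value passes without comment, and only the undefined term is reported.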

Orientation Confidence

For orientation measurements (bedding, foliation, lineation, joints, etc.), it is useful to describe how accurately the orientation has been measured. For linear features (for example, fold axes, lineations), this error is usefully expressed as the radius of the error circle, similar to the alpha95 value often reported for paleomagnetic directions. For planar features (for example, bedding, foliation), the error is that of the pole to the plane. The OrientationPoints feature class includes an OrientationConfidenceDegrees field to record this uncertainty. Values of OrientationConfidenceDegrees are recorded as floating-point numbers because they are real, measurable quantities, not because they are precisely known.

Working with Multiple Feature Attributes

Some users of this schema will have had experience using databases in which features have a single attribute field (for example, LTYPE) and a single attribute (for example, “contact, inferred, queried”) that both defines the feature type and describes its locational and (or) scientific confidence. For these users, the use of multiple feature attributes (for example, Type and LocationConfidenceMeters for line features) prescribed herein for feature-level metadata may appear to require a significant increase in the amount of work needed to create a database. Although this document is not a vehicle for workflow suggestions, we note that simple modifications of existing workflows can greatly ease the workload of assigning multiple attributes to features. For example, features could be digitized and attributed with a single interim attribute value that could later be used to drive a script that assigns multiple attributes (for example, GeMS_AttributeByKeyValues_Arc10.py or its replacement, available at https://github.com/usgs/GeMS_Tools and at https://github.com/usgs/gems-tools-pro). In addition, feature templates in ArcGIS could be used to create features that have clusters of common attribute values. A workflow that tracks the genesis (that is, the initial creation) of features, perhaps by means of the DataSource attribute, may be useful for the bulk assignment of confidence attributes.
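The idea of driving multiple attributes from a single interim value can be sketched as a simple lookup. The code below is not the GeMS_AttributeByKeyValues script and does not use its file formats; the key strings, the InterimKey field name, and the attribute values assigned to each key are hypothetical choices that a project would set for itself.

```python
# Hypothetical lookup from an interim key to the attributes it stands for.
KEY_VALUES = {
    "contact, certain": {
        "Type": "contact",
        "LocationConfidenceMeters": 25.0,
        "ExistenceConfidence": "certain",
        "IdentityConfidence": "certain",
    },
    "contact, inferred, queried": {
        "Type": "contact",
        "LocationConfidenceMeters": 100.0,
        "ExistenceConfidence": "questionable",
        "IdentityConfidence": "questionable",
    },
}


def attribute_by_key(features, key_field="InterimKey"):
    """Expand each feature's interim key into multiple attributes.

    Returns the attributed features and a sorted list of keys with no lookup entry.
    Features with unknown keys are passed through unchanged.
    """
    attributed = []
    unknown_keys = set()
    for feature in features:
        values = KEY_VALUES.get(feature[key_field])
        if values is None:
            unknown_keys.add(feature[key_field])
            attributed.append(dict(feature))
        else:
            attributed.append({**feature, **values})
    return attributed, sorted(unknown_keys)


features = [
    {"ID": "CAF001", "InterimKey": "contact, certain"},
    {"ID": "CAF002", "InterimKey": "contact, inferred, queried"},
    {"ID": "CAF003", "InterimKey": "fault, concealed"},
]
attributed, unknown = attribute_by_key(features)
print(unknown)
```

Reporting unknown keys, rather than guessing at their attributes, keeps the lookup table as the single place where the meaning of each interim value is decided.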

In some cases, default data source, scientific confidence, and locational accuracy values for an entire map area are appropriate. Although default values may seem meaningless, changes in default values from map to adjacent maps, as well as between geologic and other GIS layers (for example, pipeline routes), are likely to be informative to map users. As software tools evolve, we anticipate changing workflows that produce more detailed metadata.

Naming Database Elements

Standardized names for database feature datasets, feature classes, nonspatial tables, and fields that clearly convey their meaning are critical to a functional database design. Field names in GeMS have been chosen according to the following rules:

Field names convey content to the geoscientist, to the GIS practitioner, and to the public.
Long field names are acceptable and informative.
Field names are easy to code and to calculate.
Field names reflect data type.
Field names use a uniform concatenation protocol (in this case, PascalCase; that is, the first letter of each word is upper case).
Field names do not exploit case sensitivity (note that case needs to be conserved because some languages and operating systems distinguish between “ThisName” and “thisName”).
Field names do not contain spaces or special characters.
Names that contain “_ID” (for example, TableName_ID, FeatureClassName_ID) are reserved for primary keys,¹⁰ which are maintained by the database creator (not the GIS software) and are used mostly to relate attributes stored in nonspatial tables to spatial features, as well as (optionally) to relate spatial features to additional, feature-specific attributes stored in other tables.
Names that contain “SourceID” (for example, DataSourceID, LocationSourceID) are reserved for foreign keys¹¹ to the DataSources table.
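
Several of the rules above lend themselves to a mechanical check. The following Python sketch is a rough approximation under stated assumptions, not an official GeMS validation: the regular expression only approximates PascalCase (it cannot tell where words begin), it treats the underscore as permitted only in a trailing "_ID", and the function names are hypothetical.

```python
import re

# Assumed pattern: a leading capital, then letters and digits, optionally "_ID".
FIELD_NAME_PATTERN = re.compile(r"[A-Z][A-Za-z0-9]*(_ID)?")


def check_field_names(names):
    """Return a dict mapping each problematic name to a short reason."""
    problems = {}
    seen_lower = {}
    for name in names:
        if not FIELD_NAME_PATTERN.fullmatch(name):
            problems[name] = "not PascalCase or contains spaces/special characters"
        lowered = name.lower()
        # Two names that differ only by case would exploit case sensitivity.
        if lowered in seen_lower and seen_lower[lowered] != name:
            problems[name] = f"differs only by case from {seen_lower[lowered]!r}"
        seen_lower.setdefault(lowered, name)
    return problems


def key_role(name):
    """Classify a name as a primary key, a DataSources foreign key, or neither."""
    if "_ID" in name:
        return "primary key"
    if "SourceID" in name:
        return "foreign key to DataSources"
    return "ordinary field"
```

For example, check_field_names(["ThisName", "thisName"]) flags "thisName", and key_role("LocationSourceID") reports a foreign key to the DataSources table.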

Names for feature datasets, feature classes, and nonspatial tables follow similar rules. Note that we have chosen to not encode the publication identity (that is, the map name and series number) in such names. Although doing so would simplify the joint display of multiple publications in an ArcMap project (because each layer name would automatically include the publication identifier for the feature class), our choice to use the same names in each delivery database keeps the naming scheme simple and facilitates the coding and sharing of tools to manipulate databases. We note that layer names in an ArcMap project can be easily changed to reflect the source database if desired.

Transparent Identifiers

In the database, the identifiers for map units, line types, and point feature types should be transparent; in other words, they should all have obvious, easily understood meanings. The map-unit identifier (MapUnit) is a foreign key to the DescriptionOfMapUnits table from the MapUnitPolys feature class and from various other tables, and it should correspond to, but need not be identical to, the label for that unit on the map (for example, Qal; see discussion of Labels below). However, some DMUs contain map units that are not symbolized either on the map, in the CMU, or in a cross section (for example, a formation that is mapped entirely as its subunits); these units will have null map-unit identifiers in the DescriptionOfMapUnits table, as will DMU headings and headnotes.

The type identifiers for lines and points are references to terms in the Glossary table, and, for this and other reasons, we recommend that these simply be the geologic term for the type of line or point feature represented. This is in contrast to a common practice that dictates that identifiers used as foreign keys in a database consist of numbers or text strings having no inherent or obvious meaning to users; these commonly are referred to as opaque identifiers. Although opaque identifiers may be more robust as foreign keys in a database, we assert that, to facilitate comprehension and use of information from a database, the use of transparent (that is, easily understood) identifiers is preferable. Note, however, that GeMS does not prohibit the use of opaque identifiers, particularly for primary key values (for example, TableName_ID).
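
Because transparent identifiers are ordinary geologic terms, the references described in this section are easy to check and easy to read when a check fails. The Python sketch below is a minimal illustration under stated assumptions: tables are represented as plain lists, the sample rows and the Name field are hypothetical, and the helper name unresolved_references is invented here.

```python
def unresolved_references(values, defined_terms):
    """Return the sorted non-null identifiers that are absent from the target table."""
    return sorted({v for v in values if v is not None} - set(defined_terms))


# Hypothetical rows; None stands for a null MapUnit (a DMU heading or headnote).
dmu_rows = [{"MapUnit": None, "Name": "SURFICIAL DEPOSITS"},
            {"MapUnit": "Qal", "Name": "Alluvium"}]
glossary_terms = ["contact", "fault"]

# Identifiers as they might appear on polygon and line features.
polygon_units = ["Qal", "Qal", "Tb"]
line_types = ["contact", "fault", "thrust fault"]

# Null MapUnit values in the DMU are legitimate and are not lookup targets.
dmu_units = [row["MapUnit"] for row in dmu_rows if row["MapUnit"] is not None]
missing_units = unresolved_references(polygon_units, dmu_units)
missing_terms = unresolved_references(line_types, glossary_terms)
```

With these sample rows, the unresolved identifiers are "Tb" and "thrust fault", which a reader can interpret without consulting a separate key; an opaque identifier would convey nothing in the same report.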

File Formats

In principle, we encourage the use of open file formats because (1) open formats facilitate the writing and redistribution of third-party code; (2) open formats reduce the risk of losing data when the format becomes obsolete and unreadable (when open formats are superseded, documentation for them is likely to remain available); and (3) open formats are more likely to change in a measured fashion than are proprietary formats. Many in the geologic mapping community are still coping with the costs of the relatively rapid transitions from Arc/Info coverages to shapefiles to ArcGIS personal, and now file, geodatabases. However, our desire to endorse open file formats is overshadowed by our need to prescribe a database file format that preserves topology, allows long attribute names, and works well within ArcGIS; thus, we specify that spatial data be released in Esri’s file geodatabase (.gdb) format. To make geologic map data more widely available, we also specify that data be released in shapefile format. We look forward to wider implementation and use of text-based, application-independent delivery formats such as GeoSciML. However, a primary motivation for specifying a standard schema (GeMS) is fully satisfied through a proprietary format; more specifically, organizational efficiency is gained by managing a large number of GIS files in the same format. When that format is anticipated to become obsolete, bulk migration to a new format can be efficiently achieved.

Text not contained in the database should be stored as plain text (.txt), HyperText Markup Language (.htm/.html), Open Document Format (.odt, ISO/IEC 26300:2006 or its successor), .docx, or publication-formatted Portable Document Format (.pdf) files, and these files should be managed with the GIS data. Tables may be stored in a wide variety of formats (.csv, .dat, .txt, .dbf, .ods, or .xls) or as XML (.xml) files, which most modern database software can import. Note that the venerable dBASE III (.dbf) format, integral to Esri shapefiles, has been abandoned by most other software developers and so is unchanging and, thus, is a reliable choice; apparently, no published standard for the .dbf format exists, but documentation of the format is readily available. For raster images, the patent on the lossless LZW compression format (commonly used for .tif or .gif files) has expired, and patents that may have restricted use of the lossy JPEG compression format (resulting in .jpg images) have been found to be invalid; thus, the choice between .png (another lossless compression format), .tif, .jpg, and .gif files for raster images should depend on technical considerations. Vector, or mixed vector-raster, images can be stored as .pdf or .svg files.

Required, As-Needed, and Optional Elements of a Digital Geologic Map Publication

For a digital geologic map publication named “mapXYZ”, the publication package typically includes the elements listed in table 3. The digital data package is essential and shall include the files identified in tables 4 and 5. As-needed elements are mandatory only if they are part of the content of the map report; for example, if a figure 1 is on the map sheet or in the pamphlet, then a figure1.png file (or its equivalent) must be present in the digital data package. Optional elements may or may not be present, at the discretion of the author or publisher. These elements are further discussed in the sections that follow (and in appendix 2).

Table 3. Elements of a digital geologic map publication named “mapXYZ”.
[Abbreviation: FGDC, Federal Geographic Data Committee]

Filename	Comments
Required elements
mapXYZ.pdf	Map graphic (high resolution; publication quality)
mapXYZ-browse.png (or .jpg, .tif)	Browse graphic or thumbnail of map (should be a small file)
mapXYZ-metadata.xml	FGDC-compliant metadata for the overall map publication; additional inclusion of metadata in a more readable form (for example, .txt, .html) as a supplementary file is recommended
mapXYZ-gdb.zip	Zip archive containing the map database and other elements of the digital data package (see table 4 below for contents)
mapXYZ-open.zip	Open shapefile version of database (see “Shapefile Version of the Database” section below for contents)
As-needed elements
mapXYZ-pamphlet.pdf	Map pamphlet (fully formatted; publication quality)
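
The required elements in table 3 can be checked against a list of delivered filenames. The Python sketch below is one possible approach, not a prescribed tool: the function name missing_required_elements is hypothetical, it checks names only (not file contents), and it ignores the as-needed pamphlet because that element depends on the content of the map report.

```python
def missing_required_elements(pub_name, filenames):
    """Return required publication elements (per table 3) absent from `filenames`."""
    present = set(filenames)
    missing = []
    for suffix in (".pdf", "-metadata.xml", "-gdb.zip", "-open.zip"):
        if pub_name + suffix not in present:
            missing.append(pub_name + suffix)
    # The browse graphic may be a .png, .jpg, or .tif file.
    browse_names = {f"{pub_name}-browse{ext}" for ext in (".png", ".jpg", ".tif")}
    if not browse_names & present:
        missing.append(f"{pub_name}-browse.png (or .jpg, .tif)")
    return missing


delivered = ["mapXYZ.pdf", "mapXYZ-browse.jpg",
             "mapXYZ-metadata.xml", "mapXYZ-gdb.zip"]
report = missing_required_elements("mapXYZ", delivered)
```

In this example the only missing element is the shapefile archive, mapXYZ-open.zip.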

Table 4. Required, as-needed, and optional contents of the zip archive (that is, the “mapXYZ-gdb.zip” file) that contains the database and other elements of the digital data package for a geologic map publication named “mapXYZ”.
[Abbreviation: FGDC, Federal Geographic Data Committee]

Filename	Comments
Required contents
mapXYZ.gdb (folder)	Folder that constitutes the map database
mapXYZ.mxd	ArcMap document stored using relative pathnames and including relevant macros
mapXYZ-metadata.xml	FGDC-compliant metadata for the overall map publication (copy of file referenced in table 3 above)
resources (folder)	Folder of digital resources that accompanies map database in a digital geologic map publication (see table 5 below for contents)
As-needed contents
mapXYZ-pamphlet.pdf	Map pamphlet (copy of file referenced in table 3 above)
mapXYZ-base.gdb (folder)	Folder with base-map data (required if not published elsewhere)
Optional contents
mapXYZ.pmf	ArcReader document
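
The required contents in table 4 can be checked in a similar way by listing the zip archive with Python's standard zipfile module. This sketch assumes that the listed elements sit at the root of the archive rather than inside a wrapper folder; the function names are hypothetical, and the archive built here is a dummy held in memory solely to exercise the check.

```python
import io
import zipfile


def top_level_entries(archive):
    """Return the set of top-level file and folder names in an open ZipFile."""
    return {name.split("/", 1)[0] for name in archive.namelist() if name}


def missing_required_contents(pub_name, archive):
    """Return required table 4 contents absent from the root of the archive."""
    required = [f"{pub_name}.gdb", f"{pub_name}.mxd",
                f"{pub_name}-metadata.xml", "resources"]
    present = top_level_entries(archive)
    return [name for name in required if name not in present]


# Build a small in-memory archive; member names and contents are dummies.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("mapXYZ.gdb/a00000001.gdbtable", b"")
    zf.writestr("mapXYZ-metadata.xml", b"<metadata/>")
    zf.writestr("resources/mapXYZ.style", b"")

with zipfile.ZipFile(buffer) as zf:
    missing = missing_required_contents("mapXYZ", zf)
```

Here the dummy archive lacks the ArcMap document, so the check reports mapXYZ.mxd as missing. Because a file geodatabase is a folder, the check looks only at the first path component of each archive member.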

Table 5. Required, as-needed, and optional contents of the resources folder that accompanies the database in a digital geologic map publication named “mapXYZ”.
[Abbreviation: FGDC, Federal Geographic Data Committee]

Filename	Comments
Required contents
mapXYZ.style	ArcGIS .style file that contains the area, line, and point symbols used to symbolize the map. Must include all symbols specified in database. It is recommended that the .style contain a subset of the symbols in the FGDC cartographic standard; please see Resources on the GeMS website (https://ngmdb.usgs.gov/Info/standards/GeMS/) for a suggested master .style file. Note that a .style file is not necessary if Esri cartographic representations are encoded in the database itself
fonts (.ttf or .otf files)	All nonstandard font files that are used for special characters or are referenced by the line or point symbols in the .style file
As-needed contents
CMU (as .pdf or .png file)	Graphic representation of the Correlation of Map Units (CMU) diagram. Needed only if (1) CMU is present in report and (2) CMU is not encoded within the map database
figures (.png, .pdf, .jpeg, .gif, or .tif files)	Must be numbered as in report; may be included as individual files or gathered into a folder
tables (.dbf, .ods, .xls, or other file formats)	Must be numbered as in report; may be included as individual tables or gathered into a folder
Optional contents
DMU (as .pdf or .docx file)	Additional document for Description of Map Units (DMU) (fully formatted, including headings)

Citation

U.S. Geological Survey National Cooperative Geologic Mapping Program, 2020, GeMS (Geologic Map Schema)—A standard format for the digital publication of geologic maps: U.S. Geological Survey Techniques and Methods, book 11, chap. B10, 74 p., https://doi.org/10.3133/tm11B10.