Metadata Quick Guide

This document describes metadata, the government standards which apply to it, and its application to web pages. It is an introduction and overall guide only; refer to the published standards for authoritative information.

Definition

  • metadata is data used to describe the nature and content of resources, in order to allow third-party systems to search those resources
  • metadata can only work if disparate systems use the same metadata standards
  • for UK statutory bodies, the standard is the e-Government Metadata Standard (eGMS), part of the e-Government Interoperability Framework (eGIF), adoption of which, and compliance with which, is mandatory

Metadata is extra information which one adds to a resource in order to describe it, so that the resource can be found more efficiently. A parallel example is a library catalogue entry for a book: the book is the resource, and the catalogue entry contains extra information about the book, including author, title, publication date, Dewey Decimal classification, and so on. The library catalogue then allows someone to search for a given book without having to read every book in the library. This is the role of metadata: it allows automated searches to find relevant resources without having to trawl the entire contents of documents.

Description

  • eGMS contains many elements which refer to specific information about resources (eg TITLE, CREATOR, DATE, CONTRIBUTOR)
  • some elements have refinements which describe different aspects of that element (eg DATE.ISSUED, DATE.MODIFIED)
  • we give these elements values to describe the resource (eg CREATOR = Tom Chadwin)
  • some elements require values in specific formats (eg DATEs must be formatted YYYY-MM-DD)
  • some elements must contain values picked from a controlled vocabulary (eg AUDIENCE)
  • these formats and vocabularies are called encoding schemes

The eGMS elements and refinements correspond to the fields of information in the library catalogue, such as title, author, and publication date. The values given to these elements and refinements must comply with encoding schemes so that automated searches always know, for example, what format a date is in. A well-known example of a controlled vocabulary is the Local Government Category List (LGCL), which supplies the subject categories under which resources are classified, so that an automated search knows the subject matter of each resource.
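As a rough illustration of how the pieces described above fit together, the sketch below annotates two tags: one with a refinement and a format-based encoding scheme, and one with a controlled vocabulary. The name DC.date.modified is an assumption which follows the naming pattern of the examples later in this guide; check the published standard for the exact spelling.

```html
<!-- Element DATE with refinement MODIFIED (name assumed from the guide's naming pattern).
     The W3CDTF encoding scheme fixes the value format as YYYY-MM-DD. -->
<meta name="DC.date.modified" scheme="W3CDTF" content="2003-04-30">

<!-- Element SUBJECT with refinement CATEGORY.
     The value must be a term picked from the controlled vocabulary named in scheme. -->
<meta name="eGMS.subject.category" scheme="LGCL" content="Explosives and fireworks">
```

In each tag, name carries the element and any refinement, scheme names the encoding scheme, and content holds the value. The scheme attribute belongs to HTML 4 and XHTML 1; it is obsolete in HTML5.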

Application

  • to apply metadata to resources, we need to specify all elements which will provide useful information
  • for each of these elements, we must specify the element name and its value, and often the encoding scheme
  • some elements are mandatory (CREATOR, DATE, SUBJECT.CATEGORY, TITLE)
  • some elements are mandatory if applicable to the resource in question (ACCESSIBILITY, IDENTIFIER, PUBLISHER)
  • some elements are recommended (COVERAGE, LANGUAGE)

To comply with the eGMS, and hence with the eGIF, every resource must include the elements classified as mandatory, together with any mandatory-if-applicable elements which apply to it. However, to exploit the full potential of the eGMS, apply as many further elements as are relevant, so that the resource is described as fully as possible.

For example, if some pages on your site are explicitly targeted at children, applying the element AUDIENCE and giving it the value "Children" will ensure that search engines can find those pages if someone searches for material suitable for children.
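A minimal sketch of how the four mandatory elements and the AUDIENCE example might be written for a web page is shown below. The element names DC.creator, DC.date.issued and DC.audience are assumptions which follow the naming pattern used elsewhere in this guide, and the values are illustrative only; the published standard is the authority on exact names and on any encoding scheme required for AUDIENCE.

```html
<!-- Mandatory elements: CREATOR, DATE, SUBJECT.CATEGORY, TITLE -->
<meta name="DC.creator" content="Northumberland National Park Authority">
<meta name="DC.date.issued" scheme="W3CDTF" content="2003-04-30">
<meta name="eGMS.subject.category" scheme="LGCL" content="Explosives and fireworks">
<meta name="DC.title" content="Otterburn Range firing times">

<!-- Optional element: AUDIENCE (name assumed; no encoding scheme shown) -->
<meta name="DC.audience" content="Children">
```

Only the first four tags are needed for compliance on a page to which no mandatory-if-applicable element applies; the AUDIENCE tag is the kind of extra information which makes targeted searches possible.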

Web pages

  • while the eGMS applies to all information resources, the medium which can immediately benefit the most is the web
  • if web pages have eGMS metadata applied to them, governmental search engines will be able to classify those pages accurately
  • the nationalparks.gov.uk site already uses the eGMS in order to classify pages from our individual National Park web sites

Any Content Management System which claims eGIF compliance must provide the facility for adding eGMS elements to pages. It should also prevent pages from being published until all mandatory elements have been supplied, and it should enforce specified encoding schemes such as the LGCL (or its successor).

(X)HTML

  • in (X)HTML, metadata is written in the code of the resource itself
  • it is placed within the <head> section of the page
  • it is added using the existing (X)HTML element <meta>

If you have non-CMS pages, it is a simple job to edit the code of the web page to add eGMS elements. Similarly, if your CMS is not eGMS-compliant, you can edit your page templates in order to include at least the mandatory elements.
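To show where such an edit would go, here is a bare page skeleton. It is a sketch only: the HTML 4.01 doctype, the character set and the body text are assumptions chosen for illustration, not taken from any particular site.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Otterburn Range firing times</title>
  <!-- eGMS metadata sits here, inside <head>, alongside the ordinary <title> -->
  <meta name="DC.title" content="Otterburn Range firing times">
  <!-- the remaining mandatory elements follow the same name/scheme/content pattern -->
</head>
<body>
  <p>Page content goes here.</p>
</body>
</html>
```

In an XHTML page the same tags must be self-closing, for example <meta name="DC.title" content="Otterburn Range firing times" />.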

Examples

  • <meta name="DC.title" content="Otterburn Range firing times">
  • <meta name="DC.date.valid" scheme="W3CDTF" content="2003-04-30/2003-06-01">
  • <meta name="eGMS.subject.category" scheme="LGCL" content="Explosives and fireworks">

© Northumberland National Park Authority, Eastburn, South Park, Hexham, Northumberland, NE46 1BS, United Kingdom
Tel: +44 (0)1434 605555 Fax: +44 (0)1434 611675 Email: enquiries@nnpa.org.uk