
Dublin Core

Dublin Core was one of the earliest attempts to add standardised metadata to the Web. Some governments mandate it for their own sites, and some HTML editors will add it for you. It has evolved since its first version.

Resources

Dublin Core

The Dublin Core originated at a workshop in Dublin, Ohio, in 1995. The core set consists of fifteen elements:

  1. Title
  2. Creator
  3. Subject
  4. Description
  5. Publisher
  6. Contributor
  7. Date
  8. Type
  9. Format
  10. Identifier
  11. Source
  12. Language
  13. Relation
  14. Coverage
  15. Rights

Including Dublin Core in HTML

Dublin Core elements can be included in HTML as part of the HEAD section, using the META tag. Each such tag has two main attributes, name and content: name is the name of the metadata element, and content is its value.

This is standard HTML, and anything can be included as metadata. So there needs to be some way of distinguishing Dublin Core metadata from any other metadata. This is done by prefixing each element name with "DC.", as in

<meta name="DC.Title" content="www.gov.au" />
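
As a rough illustration of the prefixing rule, the following Python sketch builds such a tag from an element name and a value. The function name dc_meta_tag is hypothetical (it is not part of any library), and the sketch assumes the value only needs ordinary HTML attribute escaping.

```python
from html import escape

def dc_meta_tag(element, value):
    """Build one META tag for a Dublin Core element (hypothetical helper).

    The "DC." prefix marks the name as Dublin Core; the value is escaped
    so that quotes or ampersands cannot break the attribute.
    """
    name = "DC." + element
    return '<meta name="{}" content="{}" />'.format(
        escape(name, quote=True), escape(value, quote=True)
    )

print(dc_meta_tag("Title", "www.gov.au"))
# prints: <meta name="DC.Title" content="www.gov.au" />
```

The output is meant to be pasted into the HEAD section of the page alongside any other META tags.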
      

Viewing Dublin Core metadata

When you use a browser to look at a Web site, you don't see whether or not it has any metadata. It is simply not shown. To view it in an ordinary browser, you have to view the page's HTML source. In Firefox, for example, this can be done by pressing Control and the letter 'u' together. For the web site http://www.gov.au this shows the following in the HEAD section:

<meta name="DC.Title" content="www.gov.au" />
<meta name="DC.Identifier" scheme="URI" content="http://www.gov.au" />
<meta name="DC.Subject" scheme="TAGS" content="Public administration; Government agencies; Government services; Federal government;
         State government; Local government; Access to information; Government information; Information services; Information systems;
          Australian Capital Territory; New South Wales; Northern Territory; Queensland; South Australia; Tasmania; Victoria; Western Australia" />
<meta name="DC.Description" content="www.gov.au is an Australian whole of government single point of access (portal) that 
          links to available Australian, State, Territory and Local government Internet entry points" />
<meta name="DC.Creator" scheme="GOLD" content="c=AU; o=Australian Government; ou=National Office for the Information Economy ;
          ou=Channel Development Branch" />
<meta name="DC.Publisher" scheme="GOLD" content="c=AU; o=Australian Government; ou=Australian Government Information Management Office" />
<meta name="DC.Language" scheme="RFC3066" content="en" />
<meta name="DC.Rights" content="Copyright Commonwealth of Australia 2007" />
<meta name="DC.Date.created" scheme="ISO8601" content="2007-10" />
<meta name="DC.Type.aggregationLevel" content="collection" />
<meta name="DC.Type.documentType" scheme="agls-document" content="homepage" />
<meta name="DC.Coverage.jurisdiction" content="Commonwealth of Australia; Australian Capital Territory; New South Wales; Northern Territory;
            Queensland; South Australia; Tasmania; Victoria; Western Australia" />
      

Values of attributes

Included in the fifteen elements is Date. Dates in particular are a minefield of options. US dates have month-day-year, while British dates have day-month-year and Chinese dates have year-month-day. And that doesn't even begin to add in hours, minutes and seconds, let alone variations such as timezones. The Wikipedia entry "Date format by country" lists some of the possibilities.

Such variation is not acceptable in an international context. It needs to be standardised. So, firstly, we need standards for the elements themselves. These are identified by the URI http://purl.org/dc/elements/1.1 . This is actually a URL, so we can follow it with a browser, and that takes us to another site, http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=elements#H3 , which is a section labelled "Section 3: Properties in the /elements/1.1/ namespace" in the "DCMI Metadata Terms" document. This document is regarded as the definitive statement of the meaning and syntax of the Dublin Core elements.

Under the "Term Name: date" subsection is a description of the Date element. It has its own URI http://purl.org/dc/elements/1.1/date which is also a URL. Pointing your browser to that URL just brings you back to this place - that's all there is for information!

Under the description, it states "Recommended best practice is to use an encoding scheme, such as the W3CDTF profile of ISO 8601" and refers to the URL http://www.w3.org/TR/NOTE-datetime which is "Date and Time Formats". The recommended format is a subset of ISO 8601. Basically it sets the date format as YYYY-MM-DDThh:mm:ss.sTZD (e.g. 1997-07-16T19:20:30.45+01:00) or some leading part of this, such as 1997-07-16 (16 July 1997).
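
To make the format concrete, here is a small Python sketch that checks whether a string has the shape of a W3CDTF date. The pattern and the function name is_w3cdtf are illustrative assumptions, written from the six granularities listed in "Date and Time Formats"; the check looks only at the shape of the string, not at whether the month, day or hour values are in range.

```python
import re

# Shape of a W3CDTF value: a year, optionally followed by month, day,
# and a time with a mandatory timezone designator (Z or +hh:mm / -hh:mm).
W3CDTF_PATTERN = re.compile(
    r"\d{4}"
    r"(-\d{2}"
    r"(-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?"
    r")?"
    r")?"
)

def is_w3cdtf(text):
    """Return True if text has the shape of a W3CDTF date (hypothetical helper)."""
    return W3CDTF_PATTERN.fullmatch(text) is not None

print(is_w3cdtf("1997-07-16T19:20:30.45+01:00"))  # True
print(is_w3cdtf("1997-07-16"))                    # True
print(is_w3cdtf("2007-10"))                       # True
print(is_w3cdtf("16/07/1997"))                    # False
```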

Languages are another source of variance. The document may be in Canadian French, for example. You wouldn't want to signal that by setting the Language to "Français canadien" - the 'ç' immediately brings in the problems of character set encodings: ASCII? not expressible; ISO 8859-1? yes, it is a single byte with value 231; UTF-8? yes, it is the two-byte sequence 195, 167 (50087 if the two bytes are read as one number). How do you know which one to choose?

To avoid these encoding issues, the Language is given in ASCII, using IETF RFC 4646, "Tags for Identifying Languages" (since superseded by RFC 5646). In this, French is just "fr", while more specific forms such as "fr-Latn-CA" are possible. Of course, the document body itself will be in Canadian French, but the metadata naming the language is in US-ASCII.
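
The byte values quoted for 'ç' can be checked with a few lines of Python. This is only a demonstration of the three encodings mentioned above; the variable names are arbitrary.

```python
ch = "\u00e7"  # the character 'ç'

# ASCII has no code for it, so encoding fails.
try:
    ch.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

latin1 = ch.encode("iso-8859-1")  # one byte
utf8 = ch.encode("utf-8")         # two bytes

print(ascii_ok)                     # False
print(list(latin1))                 # [231]
print(list(utf8))                   # [195, 167]
print(int.from_bytes(utf8, "big"))  # 50087
```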

A program to print Dublin Core attributes

This is not a programming book. But if you want to see the Dublin Core metadata for a Web page, you will have to run a program to show it, since the browsers don't make it easy. The linked program is in Python, which runs on Linux, Windows and Mac. If you run it with, say, python printMeta.py www.gov.au, it shows

Value of "DC.Title" is "www.gov.au"
Value of "DC.Title.alternate" is "Access to the information and services of the Australian, State, Territory and Local governments"
Value of "DC.Identifier" is "http://www.gov.au"
Value of "DC.Subject" is "Public administration; Government agencies; Government services; Federal government; State government; Local government; Access to information; Government information; Information services; Information systems; Australian Capital Territory; New South Wales; Northern Territory; Queensland; South Australia; Tasmania; Victoria; Western Australia"
Value of "DC.Description" is "www.gov.au is an Australian whole of government single point of access (portal) that links to available Australian, State, Territory and Local government Internet entry points"
Value of "DC.Creator" is "c=AU; o=Australian Government; ou=National Office for the Information Economy ; ou=Channel Development Branch"
Value of "DC.Publisher" is "c=AU; o=Australian Government; ou=Australian Government Information Management Office"
Value of "DC.Language" is "en"
Value of "DC.Rights" is "Copyright Commonwealth of Australia 2007"
Value of "DC.Date.created" is "2007-10"
Value of "DC.Type.aggregationLevel" is "collection"
Value of "DC.Type.documentType" is "homepage"
Value of "DC.Coverage.jurisdiction" is "Commonwealth of Australia; Australian Capital Territory; New South Wales; Northern Territory; Queensland; South Australia; Tasmania; Victoria; Western Australia"
      

The program is printMeta.py
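
The linked program is not reproduced here. As a rough idea of how such a program might work, the following sketch uses only Python's standard html.parser module to pick out META tags whose name starts with "DC.". The class and function names are hypothetical, and the sketch works on an HTML string rather than fetching a page; a real program would first download the page, for instance with urllib.request. It also assumes the prefix is written exactly as "DC.", which not every site does.

```python
from html.parser import HTMLParser

class DublinCoreParser(HTMLParser):
    """Collect (name, content) pairs from META tags named "DC.*"."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.startswith("DC."):
            self.found.append((name, attributes.get("content") or ""))

def dublin_core(html_text):
    """Return the Dublin Core metadata found in an HTML string."""
    parser = DublinCoreParser()
    parser.feed(html_text)
    parser.close()
    return parser.found

sample = """<html><head>
<meta name="DC.Title" content="www.gov.au" />
<meta name="DC.Language" scheme="RFC3066" content="en" />
<meta name="keywords" content="not Dublin Core" />
</head><body></body></html>"""

for name, value in dublin_core(sample):
    print('Value of "{}" is "{}"'.format(name, value))
# prints:
# Value of "DC.Title" is "www.gov.au"
# Value of "DC.Language" is "en"
```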

Note that it only shows the Dublin Core metadata on an HTML Web page. Some sites make a link to their metadata - such as the Dublin Core site itself, which puts its metadata as an RDF document instead at http://dublincore.org/index.shtml.rdf . This isn't visible to this simple program.

URIs for Dublin Core

While the Dublin Core was first used as metadata in HTML documents, with the general growth in metadata usage it can be used in many other places. For most of these, the syntax through HTML META tags is no longer appropriate. Instead, the elements need to be described by URIs.

Well, we've already mentioned what the URIs are for the elements: for the Date element it is http://purl.org/dc/elements/1.1/date . Similarly the Title has URI http://purl.org/dc/elements/1.1/title and so on.
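
The step from an HTML name such as "DC.Title" to the corresponding URI is mechanical, as this Python sketch suggests. The function name element_uri is hypothetical, and the choice to drop a refinement such as ".created" and keep only the element name is an assumption made for illustration.

```python
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"

# The fifteen element names, in lower case as they appear in the URIs.
ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

def element_uri(meta_name):
    """Map an HTML META name like "DC.Date.created" to its element URI."""
    parts = meta_name.split(".")
    if len(parts) < 2 or parts[0] != "DC":
        raise ValueError("not a Dublin Core name: " + meta_name)
    element = parts[1].lower()
    if element not in ELEMENTS:
        raise ValueError("unknown Dublin Core element: " + parts[1])
    return DC_NAMESPACE + element

print(element_uri("DC.Title"))         # http://purl.org/dc/elements/1.1/title
print(element_uri("DC.Date.created"))  # http://purl.org/dc/elements/1.1/date
```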

This section just repeats information from earlier. But it forms the bridge to using Dublin Core metadata in non-HTML contexts.


      