Australian Government Linked Data Working Group

Showcase

The AGLDWG aims to communicate the benefits and technical aspects of Linked Data use in government. Here are Linked Data systems and datasets already implemented by Australian government agencies, as well as presentations by group members.

Linked Data Catalogues

Linked Data resources for which this group has allocated persistent identifiers are:

http://linked.data.gov.au/def/ - definitional items register (taxonomies, ontologies)
http://linked.data.gov.au/dataset/ - (Linked Data) datasets register
http://linked.data.gov.au/org/ - organisations register (those involved with Australian Linked Data)

This catalogue lists only operational Linked Data resources, not those whose identifiers have been allocated as placeholders only. The catalogue itself is a Linked Data system and can be navigated via Linked Data link hopping.

Example Resources

Below are listed some examples of:

Vocabularies - structured lists with definitional content
Ontologies - structured concepts and categories in a subject area or domain
Datasets - things containing Linked Data: not definitional content but actual individuals
Systems & Tools - things that deliver Linked Data content and support functions

Vocabularies

Much of Linked Data relies on definitions. Indeed, the Semantic Web, which Linked Data is helping to build, is predicated on strong definitions for web resources. Vocabularies, standardised in their structure and delivery according to Linked Data and Semantic Web principles, provide online, look-up-able definitions for things. These can be used much more easily and powerfully than older vocabulary tools such as (paper) dictionaries, tables on web pages or XML code lists.

Here are Linked Data vocabularies and also an Australian national catalogue of research vocabularies:

Registry Status vocab
Australian Governments' Interactive Functions Thesaurus (AGIFT) vocab
ANDS's RVA - a vocabulary portal

Ontologies

Ontologies are data models that express knowledge within a domain. They are often more complex than vocabularies, although vocabularies themselves are a form of ontology.

A great number of foundational, or fundamental, ontologies have been produced to cater for such broadly required concepts as time (the TIME ontology), simple authoring information (Dublin Core) and tracing changes to things over time (PROV-O, the provenance ontology). This Linked Data WG has produced an ontology to define properties for datasets within the data.gov.au catalogue.

The AGRIF Ontology
GeoSPARQL Extensions Ontology
Plot Ontology

Datasets

Here are some examples of Linked Data datasets. Yes, the list is small now but we will be adding to it very soon!

Australian Linked Data Cache
ACORN SAT
Linked Data GNAF
Geoscience Australia's Samples Register

Systems & Tools

There are many different systems that can claim to be "Linked Data Systems": really anything that helps supply Linked Data. Some of them are dedicated to Linked Data, such as RDF triplestores and Linked Data APIs; others facilitate Linked Data along with other functions, such as general website content management.

Below are a few examples of Linked Data systems currently in operation within Australian government.

AGLDWG's Persistent ID Service
pyLDAPI - LD API framework
pyLODE - new style ontology documentation tool

Australian Governments' Interactive Functions Thesaurus (AGIFT)

AGIFT is a vocabulary delivered by the National Archives of Australia that lists functions performed by government. The web page delivering AGIFT is a system that allows for both human and machine-readable versions of the vocabulary formalised using the SKOS ontology.

The system used is the commercial PoolParty product.

Australian National Data Service's Research Vocabularies Australia

The ANDS provides a vocabulary hosting service for Australian government and academic users.

A search for the word 'rock' yields both individual terms ("Concepts") within vocabularies about rocks and whole vocabularies about them.

GA's vocabs in the RVA portal

AGLDWG PID URI Service

One of the core tasks of Linked Data is to uniquely and usefully identify resources - information items - on the web. This is usually done with URIs, which are just a slight extension to web page URLs that allows non-web page things to be linked to, e.g. vocabulary terms in machine-to-machine data formats.

The Linked Data WG previously used an advanced web proxy, the PID Service, but has recently migrated its URL redirection to URL rewrite rules in Apache. This migration reduces complexity and maintenance overhead.

The PID URI Service is used within the data.gov.au domain to manage PIDs made with a series of subdomains, such as environment.data.gov.au, reference.data.gov.au and others that accord with the AGLDWG's URI Guidelines, which indicate how to supply PIDs for use across government (tip: use PIDs associated with government functions, not organisations, as functions don't change, organisations do).

Agencies with datasets in the environment domain of government functions, such as the Bureau of Meteorology, can put Linked Data datasets online and use the proxying functions of this PID URI Service to create persistent identifiers for the datasets and their subcomponents which resolve to them, regardless of where and how they are implemented under the hood.

The PID URI http://environment.data.gov.au/def/op redirects to CSIRO's "Observable Properties" ontology, which is a Linked Data resource about environmental properties. It's hosted on a CSIRO system but the PID makes it accessible via a nice, ordered URI that won't change, even if CSIRO changes things (it can be redirected).
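
The redirection idea behind a PID can be sketched in a few lines of Python. This is an illustration only, not the AGLDWG's Apache configuration: the PID_TARGETS table, the resolve_pid function and the target address on example.org are all hypothetical, and the choice of a 303 status is an assumption.

```python
# Hypothetical table mapping persistent IDs to wherever the resource
# currently lives. Only the right-hand side changes when hosting moves.
PID_TARGETS = {
    "http://environment.data.gov.au/def/op":
        "https://example.org/csiro/observable-properties",  # placeholder target
}


def resolve_pid(pid_uri, targets=PID_TARGETS):
    """Return an (HTTP status, Location) pair for a PID URI.

    A known PID yields a 303 redirect to its current location;
    an unknown PID yields 404 and no location.
    """
    target = targets.get(pid_uri.rstrip("/"))
    if target is None:
        return 404, None
    return 303, target
```

If the hosting system changes, only the entry in the table is updated; the PID that users and other datasets refer to stays the same.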

Geoscience Australia's Samples Register

Geoscience Australia's Samples Register delivers metadata for physical samples stored in its repositories - internal databases. Multiple 'views' and 'formats' of samples' metadata are available, including general purpose Dublin Core Metadata Initiative metadata represented in RDF, and more specialised metadata according to sample-specific schemas, such as the SOSA ontology from the W3C's Spatial Data on the Web work.

The full catalogue (register) of all samples is available at http://pid.geoscience.gov.au/sample/ and W3C Data on the Web Best Practices are followed to allow for navigating the 2M samples.

Samples index: http://pid.geoscience.gov.au/sample/
Sample AU239's landing page (HTML): http://pid.geoscience.gov.au/sample/AU239
Sample AU239's Dublin Core metadata (HTML): http://pid.geoscience.gov.au/sample/AU239?_view=dc
Sample AU239's Dublin Core metadata (RDF): http://pid.geoscience.gov.au/sample/AU239?_view=dc&_format=text/turtle
Sample AU239's metadata in SOSA (RDF): http://pid.geoscience.gov.au/sample/AU239?_view=sosa&_format=text/turtle
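
The links above follow a regular pattern: the sample's PID URI, optionally followed by _view and _format query string arguments. A minimal Python sketch of building such links is given below. The sample_url function and its parameter names are hypothetical helpers for illustration; only the base URI and the _view and _format argument names come from the links listed above.

```python
from urllib.parse import urlencode

SAMPLE_BASE = "http://pid.geoscience.gov.au/sample/"


def sample_url(sample_id, view=None, fmt=None):
    """Build a URL for a sample, optionally selecting a view and a format."""
    params = {}
    if view is not None:
        params["_view"] = view
    if fmt is not None:
        params["_format"] = fmt
    url = SAMPLE_BASE + sample_id
    if params:
        # Keep "/" unescaped so media types read as they do in the listing.
        url += "?" + urlencode(params, safe="/")
    return url


print(sample_url("AU239", view="sosa", fmt="text/turtle"))
```

Calling sample_url("AU239") with no other arguments gives the landing page address.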

The GeoSPARQL Extensions Ontology

The GeoSPARQL Ontology, which is widely used for spatial data on the Web and which powers GeoSPARQL queries, has been extended by members of this WG to include some properties and other elements found to be needed for Australian spatial Linked Data projects, particularly the Location Index.

Simple properties for basic values of area as well as complete property chain axioms have been added.

http://linked.data.gov.au/def/geox

The Australian Linked Data Cache

A cache of "Australian" Linked Data (i.e. LD attempted to be sourced from Australia only, but this is hard) is being worked on by University of Canberra students and Geoscience Australia.

This dataset will be presented here in August, 2017.

ACORN-SAT

"Experimental Environmental Linked-data published by the Bureau of Meteorology"

The Bureau of Meteorology (BOM) in collaboration with the National Plan for Environmental Information Initiative, the Australian Government Information Management Office (AGIMO) and the Information Engineering Laboratory of the CSIRO is providing experimental resources for Linked Data under lab.environment.data.gov.au. The data published under this domain makes data available in a Linked Data fashion, and illustrates some of the capabilities that can be developed with Linked Data.

Currently, the following environmental data sets are available as Linked Data:

ACORN-SAT

See http://lab.environment.data.gov.au for more info.

TERN's Plot ontology

The Terrestrial Ecosystems Research Network (TERN) commissioned an OWL ontology to "provide a set of classes to support capture of plot- and site-based ecological survey data" with the result being the Plot Ontology.

The ontology is an extension to the W3C SSN/SOSA vocabulary and thus all data characterised using the Plot Ontology is compatible, at least in structure, with other SSN/SOSA data.

Plot ontology

GA's Public Data Model Ontology

Geoscience Australia is moving to present all of its public online resources in accordance with a single ontology: the GA Public Data Ontology.

The ontology describes how GA's datasets, services, vocabularies, vocab terms, licenses, samples and all other resources online represented in Linked Data are linked. Using the ontology you can see that a Service operatesOn Dataset(s) and that the cardinality is 1+, i.e. every GA Service will indicate at least one Dataset.
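
The "at least one Dataset per Service" rule can be illustrated with a small check over triples. This is a sketch using plain Python tuples rather than an RDF library, and it is not part of the ontology itself: the services_without_dataset function and the identifiers service-a, service-b and dataset-1 are made up for the example. Only the Service, Dataset and operatesOn names come from the description above.

```python
def services_without_dataset(triples):
    """Return, sorted, the Services with no operatesOn link to a Dataset."""
    services = {s for s, p, o in triples if p == "rdf:type" and o == "Service"}
    datasets = {s for s, p, o in triples if p == "rdf:type" and o == "Dataset"}
    linked = {s for s, p, o in triples if p == "operatesOn" and o in datasets}
    return sorted(services - linked)


# Made-up example data: service-b breaks the 1+ cardinality rule.
triples = [
    ("service-a", "rdf:type", "Service"),
    ("service-b", "rdf:type", "Service"),
    ("dataset-1", "rdf:type", "Dataset"),
    ("service-a", "operatesOn", "dataset-1"),
]
print(services_without_dataset(triples))  # prints ['service-b']
```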

The ontology provides semantics beyond those that a single legacy catalogue tool is able to provide.

PDM ontology

Machine Readable Australian Curriculum

"On behalf of ACARA, Education Services Australia publishes a machine readable version of the Australian Curriculum. The Australian Curriculum is published in machine readable form, using the Resource Description Framework (RDF). This uses Semantic Web technologies for an extensible encoding of metadata about the curriculum, expressed through relations between URIs."

http://rdf.australiancurriculum.edu.au

The Australian Government Records Interoperability Framework (AGRIF) ontology

The Australian Government Records Interoperability Framework (AGRIF) is a system of related semantic ontologies that describe the structure, functions, and activities of the Australian Government, providing sufficient context for the effective use – including but not limited to management – of Australian Government information assets. It complies with the World Wide Web Consortium’s Web Ontology Language (OWL2) Recommendation and makes reference to other Recommendations and existing domain ontologies for archival and preservation processes.

This ontology is expected to form one of the pillars of Semantic Web interoperability between Australian government organisations.

Python's Live OWL Documentation Environment

pyLODE is a development of the well-known LODE ontology documentation tool that's not available online any more.

pyLODE is written in Python and uses templating to deliver either HTML or Markdown documentation for ontologies by interpreting the ontology and its metadata according to a series of display rules.

Many of the ontologies published by the AGLDWG are documented using pyLODE.

See the code repository at https://github.com/rdflib/pyLODE/

Note it's also available for use either as a Python package or as a Command Line application.

Registry Status Ontology

This vocabulary is a re-published version of the Registry Ontology's Status vocabulary (online in RDF). This re-publication was performed to allow for the URIs of each vocab term (skos:Concept) to resolve to both human-readable and machine-readable forms (HTML and RDF, respectively) using HTTP content negotiation.

Note that just like the original form of this vocabulary, while it is a SKOS vocabulary implemented as a single Concept Scheme, it is also an OWL Ontology and that each Status is both a skos:Concept and a reg:Status.

This vocabulary was the first vocab published using the AGLDWG's PID URI domain of linked.data.gov.au.

http://linked.data.gov.au/def/reg-status

Some of its features:

static HTML and RDF files used to deliver a 'slash URI' vocabulary
URIs for the whole vocab and each term
code repo for this vocab at https://github.com/AGLDWG/reg-status
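
The HTTP content negotiation described for this vocabulary can be sketched as follows: the server reads the request's Accept header and picks the static HTML or RDF file to return. This is an illustration under stated assumptions, not the configuration in the code repo above. The file names index.html and index.ttl, the function names, and the choice to fall back to HTML when nothing matches are all hypothetical.

```python
# Hypothetical static files, one per supported media type.
SUPPORTED = {"text/html": "index.html", "text/turtle": "index.ttl"}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a media type without a q parameter has full weight
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose_file(accept_header, default="text/html"):
    """Pick the static file that best matches the Accept header."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return SUPPORTED[media_type]
        if media_type == "*/*":
            return SUPPORTED[default]
    # Assumption: serve the human-readable form when nothing matches.
    return SUPPORTED[default]
```

A web browser, which asks for text/html first, would receive the HTML page, while a client sending "Accept: text/turtle" would receive the RDF.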

pyLDAPI

A very small Python code module to add Linked Data API functionality to a Python Flask installation.

This module contains only a single Python file with a few static methods and classes that are intended to be added to a Flask API in order to add a series of extra functions to endpoints that the API delivers. It will also require the addition of one API endpoint - a ‘Register of Registers’ - if it is not already present.

An API using this module will get:

an alternates view for each Register and Object that the API delivers
- if the API declares the appropriate model views for each item
a Register of Registers
- a start-up function that auto-generates a Register of Registers is run when the API is launched
a basic, over-writeable, template for Registers’ HTML & RDF
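
The idea of auto-generating a Register of Registers at start-up can be illustrated in plain Python. This sketch is not pyLDAPI's code or API and does not use Flask: the ENDPOINTS list, its path, label and is_register keys, and the build_register_of_registers function are all hypothetical.

```python
def build_register_of_registers(endpoints):
    """Collect the endpoints declared as Registers into one listing."""
    return [
        {"uri": endpoint["path"], "label": endpoint["label"]}
        for endpoint in endpoints
        if endpoint.get("is_register", False)
    ]


# Hypothetical endpoint declarations for an API.
ENDPOINTS = [
    {"path": "/sample/", "label": "Samples", "is_register": True},
    {"path": "/sample/<id>", "label": "Sample", "is_register": False},
    {"path": "/about", "label": "About"},
]

# Run once when the API is launched.
REGISTER_OF_REGISTERS = build_register_of_registers(ENDPOINTS)
```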

Linked Data GNAF

A Linked Data (OWL/RDF + HTML) version of the Geocoded National Address File (G-NAF) which is a database of all Australian street addresses and their "geo-codes" (coordinate location points).

The dataset is delivered according to Linked Data principles by use of the pyLDAPI tool, which ensures it also conforms with Content Negotiation by Profile, an emerging Linked Data standard that allows for standard ways to request different profiles for data. This dataset supports a couple of different profiles.
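
A request for a particular profile might be sketched as below. This assumes the HTTP header approach of the draft Content Negotiation by Profile specification, in which the client sends an Accept-Profile header naming a profile URI in angle brackets. The profile_request_headers function and the profile URI on example.org are hypothetical; the profiles actually supported by this dataset are not listed here.

```python
def profile_request_headers(profile_uri, media_type="text/turtle"):
    """Build request headers asking for a given profile and media type."""
    return {
        "Accept": media_type,                       # which format
        "Accept-Profile": "<" + profile_uri + ">",  # which profile
    }


# Hypothetical profile URI, used only for illustration.
headers = profile_request_headers("http://example.org/profile/address")
```

These headers could then be sent with a request for the dataset's URI using any HTTP client.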

Metadata for the dataset as a whole is also available according to the recently updated DCAT vocabulary for describing datasets.

http://linked.data.gov.au/dataset/gnaf