Australian Government Linked Data Working Group

Showcase

The AGLDWG aims to communicate the benefits and technical aspects of Linked Data use in government. Here are Linked Data systems and datasets already implemented by Australian government agencies, as well as presentations by group members.

Some are listed below:

Vocabularies

Much of Linked Data relies on definitions; indeed the Semantic Web, which Linked Data is helping to build, is predicated on strong definitions for web resources. Vocabularies, standardised in their structure and delivery according to Linked Data and Semantic Web principles, provide online, look-up-able definitions for things. These can be used much more easily and powerfully than older vocabulary tools such as (paper) dictionaries, tables on web pages or XML code lists.

Here are Linked Data vocabularies and also an Australian national catalogue of research vocabularies:

Ontologies

Ontologies are data models that express knowledge within a domain. They are often more complex than vocabularies, although vocabularies themselves are a form of ontology.

A great number of foundational, or fundamental, ontologies have been produced to cater for such broadly required concepts as time (the TIME ontology), simple authoring information (Dublin Core) and tracing changes to things over time (PROV-O, the provenance ontology). This Linked Data WG has produced an ontology to define properties for datasets within the data.gov.au catalogue.

Datasets

Here are some examples of Linked Data datasets. Yes, the list is small now but we will be adding to it very soon!

Systems & Tools

There are many different systems that can claim to be "Linked Data Systems": really, anything that helps supply Linked Data. Some of them are dedicated to Linked Data, such as RDF triplestores and Linked Data APIs; others facilitate Linked Data along with other functions, such as general website content management.

Below are a few examples of Linked Data systems currently in operation within Australian government.

LD Presentations

Here are a few presentations about Linked Data given by members of this group and other Linked Data experts.

Australian Governments' Interactive Functions Thesaurus (AGIFT)

AGIFT website screenshot

AGIFT is a vocabulary delivered by the National Archives of Australia that lists functions performed by government. The web page delivering AGIFT is a system that provides both human-readable and machine-readable versions of the vocabulary, which is formalised using the SKOS ontology.

The system used is the commercial PoolParty product.

Australian National Data Service's Research Vocabularies Australia

The ANDS provides a vocabulary hosting service for Australian government and academic users.

ANDS' Research Vocabularies Australia portal

A search for the word 'rock' yields both individual terms ("Concepts") within vocabularies about rocks and whole vocabularies about them.

AGLDWG PID URI Service

One of the core tasks of Linked Data is to uniquely and usefully identify resources (information items) on the web. This is usually done with URIs, which are just a slight extension to web page URLs that allows non-web page things to be linked to, e.g. vocabulary terms in machine-to-machine data formats.

The Linked Data WG previously used an advanced web proxy, the PID Service, but has recently migrated its URL redirection to URL rewrite rules in Apache. This migration reduces complexity and maintenance overhead.

The PID URI Service is used within the data.gov.au domain to manage PIDs made with a series of subdomains, such as environment.data.gov.au, reference.data.gov.au and others, that accord with the AGLDWG's URI Guidelines. The Guidelines indicate how to supply PIDs for use across government (tip: use PIDs associated with government functions, not organisations, as functions don't change but organisations do).

Apache Rewrite Rules Example

Agencies with datasets in the environment domain of government functions, such as the Bureau of Meteorology, can put Linked Data datasets online and use the proxying functions of this PID URI Service to create persistent identifiers for the datasets and their subcomponents which resolve to them, regardless of where and how they are implemented under the hood.

The PID URI http://environment.data.gov.au/def/op resolves to the CSIRO "Observable Properties" ontology, which is a Linked Data resource about environmental properties. It's hosted on a CSIRO system, but the PID makes it accessible via a nice, ordered URI that won't change, even if CSIRO changes things (it can be redirected).
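As a rough illustration of what such a redirect rule does, here is a minimal Python sketch. It is not the group's Apache configuration: the rule table, the resolve_pid function and the example.org target address are all assumptions made for illustration, and the real address of the CSIRO system is not shown here.

```python
import re
from typing import Optional

# Hypothetical rule table: (pattern on the PID path, target URL template).
# The example.org host is a placeholder, not the real CSIRO address.
REWRITE_RULES = [
    (
        re.compile(r"^/def/op(?P<rest>/.*)?$"),
        "http://example.org/csiro/observable-properties{rest}",
    ),
]


def resolve_pid(path: str) -> Optional[str]:
    """Return the redirect target for a PID path, or None if no rule matches."""
    for pattern, template in REWRITE_RULES:
        match = pattern.match(path)
        if match:
            # Carry any trailing path (e.g. a term within the ontology) across.
            return template.format(rest=match.group("rest") or "")
    return None
```

If the hosting system moves, only the template on the right-hand side needs to change; the PID that users and other datasets refer to stays the same.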

Geoscience Australia's Samples Register

Geoscience Australia's Samples Register delivers metadata for physical samples stored in its repositories - internal databases. Multiple 'views' and 'formats' of samples' metadata are available, including general-purpose metadata using the Dublin Core Metadata Initiative's terms represented in RDF, and more specialised metadata according to sample-specific schemas, such as the SOSA ontology from the W3C's Spatial Data on the Web Working Group.

The full catalogue (register) of all samples is available at http://pid.geoscience.gov.au/sample/ and W3C Data on the Web Best Practices are followed to allow for navigating the 2M samples.

GA's Samples Register

The Dataset Ontology

While there are many ontologies that deal with datasets, such as the simple, well-known and widely-used DCAT, the purpose of this ontology is to cover aspects of organisational custodianship and governance using formulations common in many of the more rigorous ontologies such as ORG, rather than DCAT's simplistic constructs.

This ontology has the expressive power of ISO 19115, the metadata standard for geographic information, but, in using Linked Data and Semantic Web tools rather than the older XML, it is far more powerful.

http://reference.data.gov.au/def/ont/dataset

The Dataset Ontology Landing Page in data.gov.au

The Australian Linked Data Cache


A cache of "Australian" Linked Data (i.e. Linked Data that we attempt to source from Australia only, although this is hard) is being worked on by University of Canberra students and Geoscience Australia.

This dataset will be presented here in August, 2017.

ACORN-SAT

"Experimental Environmental Linked-data published by the Bureau of Meteorology"

ACORN-SAT homepage

The Bureau of Meteorology (BOM), in collaboration with the National Plan for Environmental Information Initiative, the Australian Government Information Management Office (AGIMO) and the Information Engineering Laboratory of the CSIRO, is providing experimental resources for Linked Data under lab.environment.data.gov.au. Publishing under this domain makes data available in a Linked Data fashion and illustrates some of the capabilities that can be developed with Linked Data.

Currently, the following environmental data sets are available as Linked Data:

See http://lab.environment.data.gov.au for more info.

FSDF's LINK ontology

The LINK

The LINK is the Australian Government's Foundational Spatial Data Framework initiative's online database of input data, agencies and so on that contribute to its themed datasets.

The LINK is presented online via a Content Management System (CMS) that makes all of its contents available via dynamic web pages drawn from a relational database. That database's structure has been designed in accordance with the LINK's OWL ontology.

In the future, the content of LINK will be able to be exported as a Linked Data dataset using a mapping from the CMS to the ontology.
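A mapping of this kind could look something like the following Python sketch, which turns rows of a relational table into subject-predicate-object triples. Everything named here is hypothetical: the example.org namespace, the dataset table, its columns and the property URIs are invented for illustration and are not taken from the LINK's database or ontology.

```python
import sqlite3

# Hypothetical namespace and column-to-property mapping (not real LINK terms).
BASE = "http://example.org/link/"
COLUMN_TO_PROPERTY = {
    "name": BASE + "def/name",
    "custodian": BASE + "def/custodian",
}


def dataset_rows_to_triples(conn):
    """Turn each row of the 'dataset' table into (subject, predicate, object) triples."""
    triples = []
    for row_id, name, custodian in conn.execute(
        "SELECT id, name, custodian FROM dataset ORDER BY id"
    ):
        # Mint a URI for the row, then emit one triple per mapped column.
        subject = f"{BASE}dataset/{row_id}"
        triples.append((subject, COLUMN_TO_PROPERTY["name"], name))
        triples.append((subject, COLUMN_TO_PROPERTY["custodian"], custodian))
    return triples


# Stand-in for the CMS database, built in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (id INTEGER PRIMARY KEY, name TEXT, custodian TEXT)")
conn.execute("INSERT INTO dataset (name, custodian) VALUES ('Placenames', 'Example Agency')")
triples = dataset_rows_to_triples(conn)
```

Because the database structure already follows the ontology, the mapping can stay this mechanical: each table corresponds to a class and each column to a property.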

GA's Public Data Model Ontology

The GA PDM

Geoscience Australia is moving to present all of its public online resources in accordance with a single ontology: the GA Public Data Model Ontology.

The ontology describes how GA's datasets, services, vocabularies, vocab terms, licenses, samples and all other resources online represented in Linked Data are linked. Using the ontology you can see that a Service operatesOn Dataset(s) and that the cardinality is 1+, i.e. every GA Service will indicate at least one Dataset.
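To make the "1+" cardinality concrete, here is a small Python sketch that checks it over some in-memory objects. The Dataset and Service classes, the operates_on field and the example.org URIs are assumptions for illustration only; the actual ontology expresses this constraint in OWL rather than in code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    uri: str


@dataclass
class Service:
    uri: str
    # Datasets this Service operatesOn; the ontology requires at least one.
    operates_on: List[Dataset] = field(default_factory=list)


def services_violating_cardinality(services):
    """Return the URIs of Services that indicate no Dataset (breaking the 1+ rule)."""
    return [service.uri for service in services if len(service.operates_on) < 1]


rocks = Dataset("http://example.org/dataset/rocks")
services = [
    Service("http://example.org/service/wms", [rocks]),
    Service("http://example.org/service/orphan"),
]
violations = services_violating_cardinality(services)
```

Here only the second, made-up Service is reported, because it indicates no Dataset at all.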

The ontology provides semantics beyond those that a single legacy catalogue tool can provide.

Machine Readable Australian Curriculum

"On behalf of ACARA, Education Services Australia publishes a machine readable version of the Australian Curriculum. The Australian Curriculum is published in machine readable form, using the Resource Description Framework (RDF). This uses Semantic Web technologies for an extensible encoding of metadata about the curriculum, expressed through relations between URIs."

The Australian Government Records Interoperability Framework (AGRIF) ontology

The Australian Government Records Interoperability Framework (AGRIF) is a system of related semantic ontologies that describe the structure, functions, and activities of the Australian Government, providing sufficient context for the effective use – including but not limited to management – of Australian Government information assets. It complies with the World Wide Web Consortium’s Web Ontology Language (OWL2) Recommendation and makes reference to other Recommendations and existing domain ontologies for archival and preservation processes.

This ontology is expected to form one of the pillars of Semantic Web interoperability between Australian government organisations.

The AGRIF Ontology's Record class

Live OWL Documentation Environment v. 2

LODE2 is a Linked Data tool for extracting classes, object properties, data properties, named individuals, annotation properties, general axioms and namespace declarations from OWL and OWL2 ontologies. These are rendered in an ordered list, along with their textual definitions, in a human-readable HTML web document.

Ontologies and the Linked Data technology stack are key to the success of data interoperability. Ontologies are highly machine-readable by construction, but much less readable for humans. LODE2 helps with expressing these ontologies in a human-readable way.
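The general idea of extracting terms and rendering them as HTML can be sketched in a few lines of Python. This is not how LODE2 itself is implemented: the sketch handles only named owl:Class elements in an RDF/XML document, and the tiny ontology, the example.org URIs and the function names are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# A tiny, made-up ontology in RDF/XML, used only for illustration.
ONTOLOGY = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/ont#Sample">
    <rdfs:comment>A physical sample held in a repository.</rdfs:comment>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/ont#Dataset">
    <rdfs:comment>A collection of data.</rdfs:comment>
  </owl:Class>
</rdf:RDF>"""


def extract_classes(rdf_xml):
    """Return sorted (URI, definition) pairs for every named owl:Class element."""
    root = ET.fromstring(rdf_xml)
    about = "{%s}about" % NS["rdf"]
    classes = []
    for element in root.findall("owl:Class", NS):
        uri = element.get(about)
        if uri is None:
            continue  # skip anonymous classes in this simplified sketch
        comment = element.find("rdfs:comment", NS)
        definition = comment.text if comment is not None else ""
        classes.append((uri, definition))
    return sorted(classes)


def render_html(classes):
    """Render the extracted classes as an ordered HTML list."""
    items = "".join(
        f"<li><code>{escape(uri)}</code>: {escape(definition)}</li>"
        for uri, definition in classes
    )
    return f"<ol>{items}</ol>"


classes = extract_classes(ONTOLOGY)
html_list = render_html(classes)
```

A full tool also has to cover properties, individuals, axioms and other RDF serialisations, which is where most of the real work lies.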

See the online version at http://lode2.linked.data.gov.au

Some of its features:

Registry Status Ontology

This vocabulary is a re-published version of the Registry Ontology's Status vocabulary (online in RDF). This re-publication was performed to allow for the URIs of each vocab term (skos:Concept) to resolve to both human-readable and machine-readable forms (HTML and RDF, respectively) using HTTP content negotiation.
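HTTP content negotiation means the server reads the client's Accept header and returns whichever supported format the client prefers. The Python sketch below shows one simplified way to make that choice. It is an illustration rather than the code behind linked.data.gov.au: the list of supported media types and the function names are assumptions, and partial wildcards such as text/* are ignored.

```python
# Media types this hypothetical endpoint can return, in server preference order.
SUPPORTED = ["text/html", "text/turtle", "application/rdf+xml"]


def parse_accept(header):
    """Parse an Accept header into a {media range: q-value} dictionary."""
    ranges = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        quality = 1.0  # q defaults to 1 when not given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        ranges[media_range] = quality
    return ranges


def choose_media_type(accept_header, supported=SUPPORTED):
    """Return the supported media type the client weights highest, or None."""
    ranges = parse_accept(accept_header)
    best, best_q = None, 0.0
    for offered in supported:
        # An exact match takes precedence over the */* wildcard.
        quality = ranges.get(offered, ranges.get("*/*", 0.0))
        if quality > best_q:
            best, best_q = offered, quality
    return best
```

With this logic a web browser, which typically asks for text/html first, would get the human-readable page, while a client asking for text/turtle would get RDF from the same URI.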

Note that, just like its original form, this vocabulary is a SKOS vocabulary implemented as a single Concept Scheme but is also an OWL Ontology, and each Status is both a skos:Concept and a reg:Status.

This vocabulary was the first vocab published using the AGLDWG's PID URI domain of linked.data.gov.au.

http://linked.data.gov.au/def/reg-status

pyLDAPI

A very small Python code module to add Linked Data API functionality to a Python Flask installation.

This module contains only a single Python file with a few static methods and classes that are intended to be added to a Flask API in order to add a series of extra functions to endpoints that the API delivers. It will also require the addition of one API endpoint - a ‘Register of Registers’ - if it is not already present.

An API using this module will get:

  • an alternates view for each Register and Object that the API delivers
    • if the API declares the appropriate model views for each item
  • a Register of Registers
    • a start-up function that auto-generates a Register of Registers is run when the API is launched
  • a basic, over-writeable, template for Registers’ HTML & RDF

pyLDAPI on PyPI
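
The start-up generation of a Register of Registers mentioned in the list above can be pictured with the framework-free Python sketch below. It does not use or reproduce pyLDAPI's actual API: the ROUTES table, its "kind" and "label" keys and the build_register_of_registers function are hypothetical names chosen only to show the idea.

```python
# Hypothetical route table: each endpoint declares whether it is a Register
# (a list of items), an Object (a single item) or an ordinary page.
# These names are illustrative and are not pyLDAPI's actual API.
ROUTES = {
    "/sample/": {"kind": "register", "label": "Samples"},
    "/sample/<id>": {"kind": "object", "label": "Sample"},
    "/survey/": {"kind": "register", "label": "Surveys"},
    "/about": {"kind": "page", "label": "About"},
}


def build_register_of_registers(routes):
    """Collect every endpoint flagged as a Register into one top-level list."""
    return sorted(
        (path, info["label"])
        for path, info in routes.items()
        if info["kind"] == "register"
    )


# Run once when the API starts, so the list always reflects the declared routes.
REGISTER_OF_REGISTERS = build_register_of_registers(ROUTES)
```

Generating the list at launch, rather than maintaining it by hand, means a newly added Register endpoint shows up in the Register of Registers without any further editing.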