Communities of practice

How do you get a five star (dataset) rating?


#1

I see all the datasets in the new search page have a star rating. How is that calculated and how do you get the proverbial five stars?


#2

Hi Stephen,
Outsider’s perspective, but it looks like it’s scored according to Tim Berners-Lee’s 5 star linked open data spec from https://5stardata.info/en/

More specifically, a description of how the scores on the search page are calculated can be found here - https://search.data.gov.au/page/dataset-quality
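To make the 5 star idea concrete, here is a rough sketch of how a score could be estimated from a file format and licence. It follows the general 5 star open data ladder (open licence, structured, non-proprietary, uses URIs, linked to other data), but the format buckets, the function names and the "best distribution wins" rule are all assumptions for illustration, not the rules the search page actually applies.

```python
# Illustrative only: these format buckets are assumptions, not the portal's rules.
PROPRIETARY_STRUCTURED = {"xls", "xlsx"}                # structured but proprietary
OPEN_STRUCTURED = {"csv", "json", "xml", "geojson"}     # structured and non-proprietary
LINKED_FORMATS = {"rdf", "ttl", "json-ld", "n3"}        # identify things with URIs


def star_rating(fmt: str, open_licence: bool, links_to_other_data: bool = False) -> int:
    """Estimate a 5 star open data score for a single distribution."""
    if not open_licence:
        return 0  # the ladder starts with an open licence
    fmt = fmt.strip().lower()
    if fmt in LINKED_FORMATS:
        return 5 if links_to_other_data else 4
    if fmt in OPEN_STRUCTURED:
        return 3
    if fmt in PROPRIETARY_STRUCTURED:
        return 2
    return 1  # on the web under an open licence, in any format


def dataset_rating(formats, open_licence: bool) -> int:
    """Assumption: a dataset scores as high as its best distribution."""
    return max((star_rating(f, open_licence) for f in formats), default=0)
```

For example, under these assumptions a dataset offered as both PDF and CSV would come out at three stars, because the CSV is the best distribution.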

I would suggest that a more useful way of scoring datasets on the portal (which QLD is actively working towards) is the adoption of open data certificates, which include more granular, specified criteria required to attain each rank.

(I can only post two links in a post, but more info on the ODI certificates can be found at theodi. org. au/open-data-certificates/)


#3

We’ve just updated the guide at https://search.data.gov.au/page/dataset-quality to be a bit more precise :).

The stuff the ODI is doing is great, but goes a bit beyond what we can reasonably automate at this stage… because search.data.gov.au has 60,000 datasets, even the most basic manual process is a massive undertaking so we’ve tried to stick with what we can estimate automatically.


#4

Hi Alex - ODIAu and QLD Gov are working on automating the awarding of certificates to all datasets that qualify at the portal level - I wouldn’t for a second suggest you do it by hand! Localisations and plugins for the portal are all in the testing stage now.


#5

I love the idea of automating the awarding of certificates, anything we can do to improve metadata. Especially if we can do it via the CKAN API (I’m assuming we’re talking about: https://certificates.theodi.org/en/autocertification)

The only problem I see is that it only awards Bronze level, which is open and web accessible data; so everything on Data.gov.au anyway. I’m also slightly hesitant to ping the owner of each dataset :sweat_smile:
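For the CKAN API route mentioned above, any automated certification run would presumably start by enumerating every dataset on the portal. A minimal sketch using CKAN's `package_search` action is below. The `base_url` argument is a placeholder for the portal's CKAN root, and the helper names are made up for illustration. Submitting anything to the ODI service is deliberately left out, since nothing in this thread describes that interface.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def package_search_url(base_url: str, start: int = 0, rows: int = 100) -> str:
    """Build a CKAN Action API package_search URL for one page of results."""
    query = urlencode({"start": start, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{query}"


def parse_page(body: str):
    """Return (total_count, datasets) from a package_search JSON response."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    result = payload["result"]
    return result["count"], result["results"]


def iter_datasets(base_url: str, rows: int = 100):
    """Yield every dataset dict on the portal, one page at a time."""
    start = 0
    while True:
        with urlopen(package_search_url(base_url, start, rows)) as resp:
            total, datasets = parse_page(resp.read().decode("utf-8"))
        yield from datasets
        start += rows
        # Stop once the reported total is covered or a page comes back empty.
        if start >= total or not datasets:
            break
```

Each yielded dict carries the dataset metadata (name, organisation, resources and so on), which is what an automated assessment would need as input.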


#6

I’ve been running a number of tests on ODI certificates for my agency (Qld Gov’t) and whilst the certificate assessment script does award a bronze to ‘almost’ every dataset published to the portal I don’t see a problem with that because the data is at least open and accessible. The challenge for each publisher is going to be if they want to achieve Gold or Platinum.


#7

Hi there, the Australian Government Linked Data Working Group maintain the guidelines for this on behalf of the AG. On my phone at the moment but will reply properly soon!


#8

Perhaps consider awards at a per agency level - sent to a central data custodian rather than individual business area.
That way there could be an easy roll-up to portfolio level, so internally agency heads would have a clear view of their relative data quality. You could even share this publicly…
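As a rough illustration of that roll-up, the sketch below groups per-dataset certificate levels by agency and reports both the counts and the weakest level held. The agency names, the input shape and the choice of "weakest level" as the headline figure are assumptions for illustration only. The tiers are ordered Bronze, Silver, Gold, Platinum.

```python
from collections import Counter, defaultdict

# Ordered lowest to highest; a level of None means no certificate was awarded.
LEVELS = ["bronze", "silver", "gold", "platinum"]


def agency_rollup(certificates):
    """Summarise (agency, level) pairs into per-agency counts and a floor level."""
    counts = defaultdict(Counter)
    for agency, level in certificates:
        counts[agency][level or "none"] += 1

    summary = {}
    for agency, tally in counts.items():
        if tally["none"]:
            floor = "none"  # at least one dataset has no certificate at all
        else:
            floor = min(tally, key=LEVELS.index)  # weakest level actually held
        summary[agency] = {"counts": dict(tally), "floor": floor}
    return summary


# Hypothetical input: one (agency, level) pair per dataset.
example = [
    ("Agency A", "gold"),
    ("Agency A", "bronze"),
    ("Agency B", "platinum"),
    ("Agency B", None),
]
print(agency_rollup(example))
```

Using the weakest level is a harsh summary; a median or a simple percentage at each level may be a fairer headline for an agency with many datasets.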


#9

I like the idea. It has some upsides but also creates some downsides.

Upsides
Better visibility around departments’ relative data quality and a nudge to standardise their data processes.
Aligns better with certain ODI criteria:
Social

  • Social media accounts used to promote data
  • Forum or mailing list for users
  • Dedicated comms team building user community

Downsides
Certificates are generally expected to be awarded per dataset.
Worse overview of individual datasets, and a few poor datasets may tank an agency’s cert rating.

Thoughts
Maybe a combination where data gets its normal ODC, but the custodians also get an aggregated ‘Agency ODC’ indicating their publication level?
Also, some criteria from the ODI still aren’t explained well enough, such as ‘Guaranteed timeliness’. The wording is very broad about what constitutes quickly, timely or real-time, and how would we go about automatically validating this?

Because when we get down to it, functionally the awarding of certificates needs to be an automated process.


#10

Well, we’ve cracked at least the first level of automated awarding of certificates: QLD now have 2,217 certificates awarded on https://certificates.theodi.org/en/datasets

The current discussion is about whether some of the criteria are better served being addressed at a portal level, rather than individual dataset level.


#11

That’s awesome news Dave - congratulations! :clap:t3::clap:t3::clap:t3::tada: