Communities of practice

Accessing records in searches that return more than 9,999 results

Hi - not certain if this is the correct forum, however a number of searches on data.gov.au return more than 9,999 records. It appears, however, that you can’t view the results past 9,999 (i.e. page 1000 onwards).

In such searches if you click page 1,000 you get the following …Error - Internal Server Error.

In some cases it is possible to use date filters etc. to keep the results below the threshold - however, date filters don’t seem to pick up all records.
For example - if you search on soil it returns in excess of 27,000 records. If you filter on date from Jan 1768 to Aug 2021 you only get 12,291. Does this mean there are records without a date attribute?

Interested if there is a method to view the results beyond page 999.
Rgds, C

Hi @auricht

You’re in the right place!

Clicking on page 1000 and getting an error definitely looks like a bug.
I’m unsure if this is only on data.gov.au or relates to Magda (the underlying engine that powers data.gov.au). In any case, I’ll raise a ticket for you.

“If you filter on date from Jan 1768 to Aug 2021 you only get 12,291. Does this mean there are records without a date attribute?”

While there may be instances where datasets do not have dates, the date filter does not just relate to the Created and Updated tags on each dataset.

For example, if you search for soil and set the date filter to Before Jan 1768, you’ll get an NLA dataset for a journal written in 1768, a year that happens to be in the title of the dataset.

Hi, many thanks for the message. FYI, I have been in touch with Data61, who advised that the page limit is something that is configured within the system. In this context…

The search engine’s design principle is to offer the most relevant results on the top pages and to set a hard limit on the total number of results returned (so it doesn’t have to go through all datasets), which improves performance.

This design usually works well, as users are unlikely to go through hundreds of pages; they might simply change the search keyword if they can’t find a result on the first few pages.

On this basis, the approach to accessing records beyond the 1,000-page limit is to use the registry API to query the metadata store directly.
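
For anyone wanting to try this, here is a rough Python sketch of paging through a registry API with page tokens. Please treat the details as assumptions rather than confirmed facts: the endpoint URL, the `aspect` and `pageToken` query parameters, and the `records`, `hasMore` and `nextPageToken` response fields reflect my understanding of how a Magda registry is typically exposed, and should be checked against the registry API documentation. The helper names `fetch_page` and `iter_records` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator, Optional

# ASSUMPTION: endpoint path and parameter names are not confirmed in this thread.
REGISTRY_URL = "https://data.gov.au/api/v0/registry/records"


def fetch_page(page_token: Optional[str], limit: int = 100) -> Dict:
    """Fetch one page of registry records over HTTP (assumed parameters)."""
    params = {"aspect": "dcat-dataset-strings", "limit": str(limit)}
    if page_token:
        params["pageToken"] = page_token
    url = REGISTRY_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_records(fetch: Callable[[Optional[str]], Dict]) -> Iterator[Dict]:
    """Yield every record by following page tokens until none remain."""
    token: Optional[str] = None
    while True:
        page = fetch(token)
        # ASSUMPTION: each page carries a "records" list.
        yield from page.get("records", [])
        token = page.get("nextPageToken")
        # Stop when the server reports no more pages or gives no token.
        if not page.get("hasMore") or not token:
            break
```

Something like `for record in iter_records(fetch_page): ...` would then walk the whole store. As far as I understand, the registry returns metadata records rather than ranked search results, so any keyword filtering (e.g. for soil) would need to be done on your side while iterating.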

Kind regards, Chris