Occurrence records in the ALA can be filtered using the spatially valid flag. This flag combines a set of tests applied to the record to assess how reliable its spatial data components are.


The flag is stored with the record so that spatial applications, such as the Spatial Portal, can quickly eliminate records that are likely to have severely inaccurate spatial information (e.g. incorrect latitude or longitude). It can also be used to filter records in the search interface or after records have been downloaded.


In downloads it will appear in the Location Quality column, and in queries direct to the API it will appear as the field spatiallyValid.


What makes a record spatially valid?

There's a list of tests. If the record fails any one of these, it's not considered to be spatially valid:

  • Zero coordinate - Coordinates are exactly 0 latitude/0 longitude, often indicating an actual null coordinate.
  • Country coordinate mismatch - The interpreted occurrence coordinates fall outside of the indicated country.
  • Coordinate invalid - A coordinate value is given in some form, but we are unable to interpret it. This can occur when text values we are unable to interpret are provided.
  • Coordinate out of range - The supplied coordinates lie outside of the range for decimal latitude/longitude values (-90 to 90 for latitude, -180 to 180 for longitude).

These tests rely on the record having some sort of coordinate location - a latitude and longitude or a grid reference - something that would allow you to find the record on a map. If a record doesn't have any coordinate-based location, then the record is considered not spatially valid.  Examples include where the location is just a town name, there is only a text description of the location, or the record is actually a drawing showing the characteristics of a species and doesn't have any location information.
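The tests above can be sketched in a few lines of Python. This is only an illustration of the logic described here, not the ALA's actual processing code: the function names and issue labels are hypothetical, and the country check is reduced to a flag passed in by the caller because the real test depends on spatial layers.

```python
import math
from typing import List, Optional


def parse_coordinate(value: Optional[str]) -> Optional[float]:
    """Return the value as a finite float, or None if it cannot be interpreted."""
    if value is None or not value.strip():
        return None
    try:
        number = float(value)
    except ValueError:
        return None
    return number if math.isfinite(number) else None


def spatial_issues(lat_text: Optional[str], lon_text: Optional[str],
                   inside_stated_country: bool = True) -> List[str]:
    """List the (hypothetically named) tests a record fails.

    inside_stated_country stands in for the country coordinate mismatch test,
    which in practice needs a lookup against country boundary layers.
    """
    lat_missing = lat_text is None or not lat_text.strip()
    lon_missing = lon_text is None or not lon_text.strip()
    if lat_missing and lon_missing:
        return ["NO_COORDINATES"]

    lat = parse_coordinate(lat_text)
    lon = parse_coordinate(lon_text)
    if lat is None or lon is None:
        return ["COORDINATE_INVALID"]
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return ["COORDINATE_OUT_OF_RANGE"]

    issues = []
    if lat == 0 and lon == 0:
        issues.append("ZERO_COORDINATE")
    if not inside_stated_country:
        issues.append("COUNTRY_COORDINATE_MISMATCH")
    return issues


def is_spatially_valid(lat_text, lon_text, inside_stated_country=True) -> bool:
    """A record is spatially valid only if it fails none of the tests."""
    return not spatial_issues(lat_text, lon_text, inside_stated_country)
```

For example, `is_spatially_valid("-35.28", "149.13")` passes every test, while a record whose location is only a text description fails because it has no coordinates at all.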


What has changed in 2021?

With the extensive update of ALA infrastructure in 2021, the spatially valid test was brought into line with the Global Biodiversity Information Facility (GBIF) spatially valid test. The calculation has changed in the following ways:

  • no longer checks whether there are user assertions (annotations) on records
  • no longer explicitly checks for failed decimal latitude/longitude conversion and failed UTM coordinate conversion. Instead, a higher-level test checks whether the processing can recognise the coordinates provided; if not, the record is flagged as spatially invalid
  • will now flag as spatially invalid records that fail the Country coordinate mismatch test. We have implemented a buffer area around the coast of Australia so that coordinates along the coast or just offshore do not incorrectly trigger this assertion. Some refinement of the buffer may be required to ensure coordinates correctly match up to the spatial layer defining Australia. The layers we use are the GADM Spatial layers. To determine whether the correct country information has been supplied, we check it against the International standard for country and country codes (click on the country codes search option, leave the search entry box blank and click on Search).
  • where no coordinates are given, the record will automatically be set to not spatially valid.

Detailed comparison

The following is a detailed comparison of the new and old spatially valid tests.


Zero coordinate - Coordinates are exactly 0/0, often indicating an actual null coordinate. 

In use in both the old and new implementations.


Coordinate out of range - The supplied coordinates lie outside of the range for decimal latitude/longitude values (-90 to 90 for latitude, -180 to 180 for longitude).

In use in both the old and new implementations.


Coordinate invalid - A coordinate value is given in some form, but we are unable to interpret it. 

In use with some changes. The new test requires valid coordinates to be given, otherwise the record is marked as spatially suspect. In the previous implementation, the specific test was called "Unparseable verbatim coordinates", and if only text values were given for a location, the ALA considered the record to be neither spatially valid nor invalid. The new test considers records with only text locations to be invalid.


Decimal latitude/longitude conversion failed - We cannot convert the supplied decimal latitude and longitude to the WGS84 datum (EPSG 4326) that the ALA uses. This may be because the supplied position cannot be unambiguously translated to WGS84 or because the supplied position is invalid.

In use in both the old and new implementations.

Note: this used to be a separate test; it is now rolled into Coordinate invalid.


Unable to convert UTM coordinates - If we only have an easting and northing and cannot convert these to a latitude and longitude, the record fails.

In use in both the old and new implementations.

Note: this used to be a separate test; it is now rolled into Coordinate invalid.


Country coordinate mismatch - The interpreted occurrence coordinates fall outside of the indicated country. Also called "Country invalid" in the ALA search interface.

This is now in use. Coastal areas are known to cause some issues where the spatial layers we use to determine the country boundaries do not quite align. If you are interested in organisms that occur in coastal habitats, you may want to turn off the spatially valid filter to check that the records you are interested in are not being incorrectly filtered out.


Geospatial issue - If there is a user-annotated geospatial issue attached to the record, then the record fails

No longer in use.


Taxonomic issue - If there is a user-annotated taxonomic issue attached to the record, then the record fails

No longer in use.


How is the flag used?

The flag can be used via the ALA search interface, by directly querying the search API, and in downloads.


Search interface: default ALA General data profile

The ALA uses the spatially valid flag in the ALA General default data quality filter to automatically exclude records that fail any one or more of the spatially valid tests listed above. You can toggle the spatially valid filter off using the checkbox either at the top of the search results or in the left hand pane.



Search interface: customising filters

If you turn off the ALA General default data profile (1), you can still use the spatially valid filter by changing the facet filtering. The facets are found on the right hand side of the search interface. You will need to Customise filters (2), and in the Location section select the checkbox for Spatial validity. The Spatially valid and Spatially suspect facets (3) will then display.



More detailed information on using the facets in searching is available in this article.


Directly querying the API

There are several ways of using the flag. In some applications, a spatially valid test is automatically added to any query, since it makes no sense to try and display spatially unusable records. The examples at the end of each entry are all searches for the genus Acacia, so that you can see how the filters work. 

  • spatiallyValid:true - shown in the biocache as Spatial validity: Spatially valid - is the general test used by spatial services, such as the spatial portal, to only include records that have usable latitude/longitude coordinates. Example
  • spatiallyValid:false - shown in the biocache as Spatial validity: Spatially suspect - shows records that have been supplied with coordinates but where there is something seriously wrong with the information provided, or records with no coordinates. Example
  • No spatial validity specified and no data profile - gives you everything. Example
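As a rough sketch, the filters above could be turned into query URLs like this. The base URL and the use of q and fq parameters are assumptions about the ALA occurrence search service rather than details given in this article, and build_search_url is a hypothetical helper; only the spatiallyValid field and its true/false values come from the text above. The sketch builds URLs without sending any requests, and it does not cover switching the data profile on or off.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed endpoint for the occurrence search API; check the current API
# documentation before relying on it.
BASE_URL = "https://biocache-ws.ala.org.au/ws/occurrences/search"


def build_search_url(taxon_query, spatially_valid=None):
    """Build a search URL, optionally filtered on the spatiallyValid field.

    spatially_valid=True  -> only spatially valid records
    spatially_valid=False -> only spatially suspect records
    spatially_valid=None  -> no spatial validity filter
    """
    params = [("q", taxon_query)]
    if spatially_valid is not None:
        flag = "true" if spatially_valid else "false"
        params.append(("fq", "spatiallyValid:" + flag))
    return BASE_URL + "?" + urlencode(params)


# The three cases from the list above, using the genus Acacia.
for choice in (True, False, None):
    url = build_search_url("genus:Acacia", choice)
    print(choice, parse_qs(urlparse(url).query))
```

Keeping the filter as an optional argument mirrors the three cases in the list: valid only, suspect only, or no spatial validity filter.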


Downloads

The spatially valid flag is included in the ALA legacy download format. Look for the Location Quality column, which is populated from the spatiallyValid field.
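A downloaded file could then be split on that column after the fact. The sketch below is illustrative only: the sample rows are made up, the other column names are hypothetical, and the values written to the Location Quality column (assumed here to be true/false) should be checked against a real download before use.

```python
import csv
import io

# Hypothetical extract of a download; real files have many more columns and
# may encode the Location Quality values differently.
SAMPLE_DOWNLOAD = """Record ID,Scientific Name,Latitude,Longitude,Location Quality
r1,Acacia dealbata,-35.28,149.13,true
r2,Acacia dealbata,0,0,false
r3,Acacia pycnantha,,,false
"""


def split_by_location_quality(csv_text, column="Location Quality",
                              valid_values=("true",)):
    """Return (valid_rows, suspect_rows) based on the given column."""
    wanted = {value.lower() for value in valid_values}
    valid, suspect = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(column) or "").strip().lower()
        (valid if value in wanted else suspect).append(row)
    return valid, suspect


valid_rows, suspect_rows = split_by_location_quality(SAMPLE_DOWNLOAD)
print("valid:", [row["Record ID"] for row in valid_rows])
print("suspect:", [row["Record ID"] for row in suspect_rows])
```

Passing the accepted values as a parameter keeps the sketch usable if the column turns out to hold labels other than true/false.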


What else do we check for?

There are a number of flags of lesser severity that relate to spatial validity. 

To use these checks, you need to enable the Assertions facet in the filter menu. Click on Customise Filters near the top left of the page and ensure that Assertions > Record Issues has been ticked. Next, search for the species or other taxa you are interested in; you will be presented with a list of issues in the Narrow Your Results panel. You may find it useful to disable the ALA default data profile filter to see the example queries below.

  • Coordinates centre of country. The coordinates are exactly at the centre of the country the record is in. There may be occasional records that are legitimate that pass this test but, for the most part, this indicates a record where some upstream processor has put in assumed coordinates when only the country is supplied as a location. In the ALA, for example, there are some very lost looking creatures in Northern NSW. The "centre of the country" is defined in the ALA as the centre point of the bounding box of the country; this may not match the definition that the upstream processor is using. This example shows Major Mitchell cockatoo records with coordinates given as the centre of Australia, where Norfolk Island is included in the calculation for the centre of Australia.
  • Supplied coordinates centre of state. The coordinates are exactly at the centre of the state or province the record is in. This test follows a similar pattern to the country test, above. 
  • Suspected outlier. The record lies outside the expected environment of the observed species (taxon). The ALA runs the reverse jackknife algorithm to detect records that are in locations where the environment is very different from most other records of that species. If the record exceeds a threshold on more than one (check) of the five environmental factors, then it is marked as a potential outlier.
  • Outside expert range for species. We have an expert-compiled spatial range for the species and the record location is outside that range. Expert-defined ranges are currently available for a limited number of species such as birds and fish.
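The "centre of country" and "centre of state" checks depend on how the centre is defined. The following sketch assumes the bounding-box definition given above and uses made-up bounding boxes, not the ALA's real layers; the function names and the tolerance are hypothetical, and boxes that cross the 180° meridian are not handled. It shows how including an outlying island in the box moves the centre point.

```python
import math


def bounding_box_centre(min_lat, min_lon, max_lat, max_lon):
    """Centre point of a bounding box as (latitude, longitude)."""
    return ((min_lat + max_lat) / 2.0, (min_lon + max_lon) / 2.0)


def is_at_centre(lat, lon, bbox, tolerance=1e-6):
    """True if the coordinates sit on the centre of the bounding box.

    tolerance (in degrees) is a hypothetical allowance for rounding.
    """
    centre_lat, centre_lon = bounding_box_centre(*bbox)
    return (math.isclose(lat, centre_lat, abs_tol=tolerance)
            and math.isclose(lon, centre_lon, abs_tol=tolerance))


# Illustrative boxes only: (min_lat, min_lon, max_lat, max_lon).
MAINLAND_BOX = (-40.0, 110.0, -10.0, 150.0)
EXTENDED_BOX = (-40.0, 110.0, -10.0, 170.0)  # stretched east to take in an island

print(bounding_box_centre(*MAINLAND_BOX))
print(bounding_box_centre(*EXTENDED_BOX))
print(is_at_centre(-25.0, 130.0, MAINLAND_BOX))
print(is_at_centre(-25.0, 130.0, EXTENDED_BOX))
```

Because the centre shifts with the box, a record placed at one processor's "centre of the country" may not be flagged if the ALA's bounding box differs.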

Note: we used to check for records with either latitude or longitude set to zero and flag them with a warning. This is no longer the case: records are flagged only where both latitude and longitude are set to zero.