Taxonomy in the ALA : ALA Support

Jump to section:

Why does the ALA need a taxonomic backbone?
How does the taxonomic backbone work in the ALA?
Useful Examples
- 1. Resolving an unrecognised name using a higher match
- 2. Resolving an older name using a synonym match
Summary
FAQ

Why does the ALA need a taxonomic backbone?

The Atlas of Living Australia (ALA) is a biodiversity data repository focused primarily on observations of individual life forms. These observations are referred to as species occurrence records, and they tell us at least three things: which species is found where and when.

Because biodiversity data are collected from many different sources and across time, they may be provided to us in a variety of formats and contain information that has not remained consistent over time. Species names, which are the product of the science of taxonomy, are one component of this dynamism. Names change over time and place as new discoveries are made and our understanding of species relationships changes, so the names attached to records can be inconsistent or conflicting. As a result, species names and classifications need to be kept updated, and occurrence records of the same species with conflicting and varying names need to be aggregated or separated – otherwise we risk presenting incomplete or scattered data of one species.

Learn more about the science of taxonomy and why species names change over time in our Introduction to Taxonomy article.

The ALA manages names as we do the other dynamic components of incoming data, by applying the global data standard for exchanging biodiversity data. This standard is called Darwin Core, and it describes vocabularies and data formats for recording species, location and other data relevant to capturing biodiversity information.
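As a rough illustration, an occurrence record expressed with Darwin Core terms could look like the sketch below. The term names (scientificName, vernacularName, eventDate, decimalLatitude, decimalLongitude) are standard Darwin Core terms; the values and the helper function missing_core_terms are invented for this example and are not part of any ALA tool.

```python
# A hypothetical occurrence record keyed by standard Darwin Core terms.
# The values are invented for illustration only.
record = {
    "scientificName": "Vombatus ursinus",
    "vernacularName": "Bare-nosed Wombat",
    "eventDate": "2023-05-14",
    "decimalLatitude": -36.43,
    "decimalLongitude": 148.33,
}

# The terms that answer the three core questions: which species, where, when.
CORE_TERMS = ("scientificName", "eventDate", "decimalLatitude", "decimalLongitude")


def missing_core_terms(rec):
    """Return the core terms that are absent or empty in a record."""
    return [term for term in CORE_TERMS if rec.get(term) in (None, "")]


print(missing_core_terms(record))
```

A record that lacks any of these terms cannot fully answer which species was found where and when.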

This article explains the process used by the ALA in which different authoritative sources of names are collected and transformed into a standardised taxonomic backbone that can be used across all of the ALA’s infrastructure.

We match names to build an index of occurrence records searchable by species: this works by joining like-with-like and building an index tree. A list of scientific names helps us search for species even if the names provided to us are slightly different. A common view of the taxonomic tree enables people to find records across higher taxa. For example, it allows users to find all records of birds by searching for “all records indexed to class:AVES”.
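The idea of indexing records under every rank of their classification can be sketched as follows. This is a simplified illustration, not the ALA's actual index: the record structure, the build_index function and the sample records are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical processed records: each carries the classification that
# was attached to it during name matching. Values are illustrative.
records = [
    {"id": "r1", "class": "Aves", "genus": "Dromaius", "species": "Dromaius novaehollandiae"},
    {"id": "r2", "class": "Aves", "genus": "Menura", "species": "Menura novaehollandiae"},
    {"id": "r3", "class": "Mammalia", "genus": "Vombatus", "species": "Vombatus ursinus"},
]


def build_index(recs, ranks=("class", "genus", "species")):
    """Index record ids under every (rank, NAME) pair in their classification."""
    index = defaultdict(list)
    for rec in recs:
        for rank in ranks:
            if rec.get(rank):
                index[(rank, rec[rank].upper())].append(rec["id"])
    return index


index = build_index(records)
print(index[("class", "AVES")])  # ids of all records indexed to class Aves
```

Because every record is filed under each rank it belongs to, a single lookup at class level returns all bird records regardless of which species name was supplied.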

Each occurrence record provided to the ALA that can be matched is matched to a scientific name within the taxonomic backbone. All the information we know about that scientific name, its identifiers, and its location in the hierarchy, is stored as part of the occurrence record.

How does the ALA source taxonomic information?

The ALA is not an authoritative source of taxonomy and does not maintain its own taxonomy. Rather, we draw taxonomic information from the National Species List (NSL), maintained by the Australian Biological Resources Study within the Australian Government, with input from the Australian Plant Census, coordinated by the Council of Heads of Australasian Herbaria. The NSL is the national source for species names in Australia. The NSL is supplemented with a small group of other sources to complete our taxonomic coverage to best reflect a complete taxonomy. The ALA updates links to these sources regularly but will often be a number of months behind current taxonomy. The NSL should always be checked as the source of truth.

It is also important to recognise that the ALA matches species records to the most probable current name. In the event of a species split into new species, for example, if the ALA is unable to clearly allocate a record, the record will remain with the original species name or be allocated to the genus level instead. For example, the grassland earless dragon (Tympanocryptis lineata pinguicolla) was recently split into T. pinguicolla, T. lineata, T. osbornei and T. maccartneyi. Because the ALA cannot currently split these records using names alone, most records will remain with T. pinguicolla or T. lineata.

The ALA does not provide names for all species in the world. We provide a taxonomic backbone for:

Australian species
Australian and New Zealand plants and fungi
species that are a biosecurity threat to Australia (species not yet found in Australia but whose arrival would threaten our unique biodiversity, agriculture, forestry and fisheries).

However, if you are interested in records of species from outside Australia held in Australian collections, for example the African elephant (Loxodonta africana), a search for observation records will return any digital record for this species, even though it is not in the taxonomy.

How does the taxonomic backbone work in the ALA?

The workflow in the ALA for creating a taxonomic backbone and applying it to occurrence records from hundreds of different data providers is described in the following steps:

Combine taxonomy data from multiple authoritative sources
Build an index of names and classify them into a tree
Match incoming occurrence records to names in the index

1. Combine taxonomy data from multiple authoritative sources

The ALA merges data from the following sources. Figure 1 shows the order of priority given to each.

the National Species List (NSL), which comprises the:
- Australian Faunal Directory (AFD) for the animal kingdom
- Australian Plant Census (APC) - an evolving consensus view of plant taxonomy
- Australian Plant Name Index (APNI) for Australian plants, and
- AusFungi, AusMoss, AusLichen and Australian Algal Names Index
ALA Species refinement lists - These are lists covering species missing from the NSL in Commonwealth, State and Territory conservation, sensitive, biosecurity, pest, weed and migratory lists
the New Zealand Organisms Register (NZOR), which provides taxonomy to supplement the NSL, enabling NZ and Australian herbarium species to be included as well as fungi and protists
Remaining gaps in protists, algae, bacteria, nematodes and viruses are supplemented with the Catalogue of Life (CoL)

Figure 1. Priority for building a taxonomic tree at the ALA: 1) AusFungi, 2) AusMoss and AusLichen, 3) AFD, 4) APNI and APC, 5) ALA lists (Conservation list, Biosecurity list, Pest and Weed list and Migratory Species list), 6) NZOR, 7) Catalogue of Life (CoL).

Previously the ALA merged taxonomies and then prioritised the names source. The revised taxonomy introduced in 2024 and continued in 2025 minimises the merging of taxonomic sources. For fauna, the National Species List (Australian Faunal Directory) is now the only accepted source, but it is supplemented by lists generated by the ALA covering species from conservation lists, biosecurity threats, identified pests and weeds and migratory species. These lists supply species and synonym corrections, either for species too recent to be in the NSL or to cover issues such as spelling errors. For vascular plants, the National Species List (the Australian Plant Census, supplemented by the Australian Plant Name Index and the New Zealand Organisms Register) is the major source, again supplemented by ALA-generated lists.

For non-vascular plants and algae the NSL is used. For Fungi, a combination of AusFungi and NZOR is used. For Protists, the Australian Faunal Directory, NZOR and Australian Algal Names Index are combined, but duplicates have been removed where possible.

The more traditional merge process is still used in a very limited sense where sources disagree or double up. As at 2025, the taxonomic classification tree has been simplified based on the one used in the NSL and other sources such as NZOR or CoL have been reconfigured to fit within this.
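Where sources disagree or double up, a priority-based merge can be sketched roughly as below. This is a minimal illustration under stated assumptions: the priority numbers follow the order in Figure 1, but the short source labels, the entry structure and the merge_sources function are invented for the example and do not describe the ALA's actual merge code.

```python
# Source order loosely follows Figure 1; a lower number means higher priority.
# The short source labels are chosen for this sketch only.
PRIORITY = {"AusFungi": 1, "AusMoss": 2, "AFD": 3, "APC": 4, "ALA": 5, "NZOR": 6, "CoL": 7}


def merge_sources(entries):
    """Keep, for each name, the entry from the highest-priority source."""
    chosen = {}
    for entry in entries:
        name = entry["name"]
        current = chosen.get(name)
        if current is None or PRIORITY[entry["source"]] < PRIORITY[current["source"]]:
            chosen[name] = entry
    return chosen


# Invented sample entries: one name supplied by two sources, one by a single source.
entries = [
    {"name": "Osphranter rufus", "source": "CoL"},
    {"name": "Osphranter rufus", "source": "AFD"},
    {"name": "Agaricus bisporus", "source": "NZOR"},
]

merged = merge_sources(entries)
print(merged["Osphranter rufus"]["source"])
```

When the same name arrives from both the AFD and CoL, the AFD entry is kept because it ranks higher in the priority order.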

We have introduced a new validation step, whereby we test the final draft index against the NSL, against statutory threatened species lists and against the species lists for states and territories to identify any missing or problematic names. We also check what names in our data are not matching and why. This has enabled us to reach a point where we match to 98.3% of names against all these sources.
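A validation check of this kind amounts to comparing the draft index with a reference list and reporting what is missing. The sketch below assumes a simple exact comparison of name strings; the function validate_index and the sample names are hypothetical, and the real validation is likely to be more involved.

```python
def validate_index(index_names, reference_names):
    """Return the reference names absent from the index and the match rate (%)."""
    reference = set(reference_names)
    if not reference:
        return [], 100.0
    missing = sorted(reference - set(index_names))
    rate = 100.0 * (len(reference) - len(missing)) / len(reference)
    return missing, rate


# Invented sample data: a tiny draft index and a hypothetical reference list.
index_names = ["Osphranter rufus", "Macropus giganteus", "Vombatus ursinus"]
reference_list = ["Vombatus ursinus", "Tympanocryptis pinguicolla"]

missing, rate = validate_index(index_names, reference_list)
print(missing, f"{rate:.1f}%")
```

The list of missing names is the useful output: each one is a candidate for investigation before the index is released.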

2. Build an index of names and classify them into a tree

The combined sources are then used to create an index of names, or 'names index', containing all of the information provided by the sources, including the unique names, identifiers and higher level classifications, to build the view of a taxonomic tree.

The sources for names include:

accepted names - the best available current scientific name
synonyms - names for species that now go by a different name
Indigenous knowledge names
common names, or vernacular names
identifiers

There are several other terms that you will encounter that represent variations on "accepted" but of lower certainty. "Unreviewed" means a name that has been derived (generally from the Australian Plant Name Index) and included in the backbone as the probable best available name, but that has not been subject to the scrutiny of the Australian Plant Census. "Inferred" before "Accepted" or "Synonym" means that the ALA has taken a decision to place a name or accept a name but that decision has not been formally reviewed by an expert. This will generally include things such as spelling errors encountered in threatened species lists or similar. "Unranked" is a name that has been included but which is of very low reliability.

All this data is made publicly available on the ALA's species pages (Figure 2 uses the ALA's bare-nosed wombat species page as an example).

Figure 2: Information from the authoritative taxonomic sources is available on the ALA's species pages in the Names and Classification tabs.

The Names tab (Figure 3) includes all of the names (synonyms, alternative spellings and vernacular names) and identifiers from various sources that have been allocated to that taxonomic unit.

Figure 3: An example of the Names tab on a species page showing the Accepted Name, any synonyms, vernacular names, identifiers and the sources of each.

The Classification tab (Figure 4) shows how the taxonomic units sit within the taxonomy backbone. The taxonomic backbone can be used to navigate up and down the taxonomic tree.

Figure 4: An example of the Classification tab on a species page, showing the full taxonomic tree for that taxon, including links to the species pages in the ALA for each of the taxa/ranks in the tree.
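Navigating up and down a taxonomic tree can be illustrated with a small parent-pointer structure. This is a sketch only: the TAXA dictionary and the lineage and children functions are assumptions made for the example, and the classification shown is a commonly cited one for the bare-nosed wombat, used here as sample data.

```python
# Each taxon points to its parent; None marks the root of this small tree.
TAXA = {
    "Animalia": {"rank": "kingdom", "parent": None},
    "Chordata": {"rank": "phylum", "parent": "Animalia"},
    "Mammalia": {"rank": "class", "parent": "Chordata"},
    "Diprotodontia": {"rank": "order", "parent": "Mammalia"},
    "Vombatidae": {"rank": "family", "parent": "Diprotodontia"},
    "Vombatus": {"rank": "genus", "parent": "Vombatidae"},
    "Vombatus ursinus": {"rank": "species", "parent": "Vombatus"},
}


def lineage(name):
    """Walk up the tree from a taxon to the root, returned root-first."""
    path = []
    while name is not None:
        path.append((TAXA[name]["rank"], name))
        name = TAXA[name]["parent"]
    return list(reversed(path))


def children(name):
    """Walk down one level: list the direct children of a taxon."""
    return sorted(child for child, info in TAXA.items() if info["parent"] == name)


for rank, taxon in lineage("Vombatus ursinus"):
    print(f"{rank}: {taxon}")
```

Storing only the parent of each taxon is enough to rebuild the full hierarchy for any name, which is the information attached to each matched occurrence record.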

3. Match incoming records to names in the index

3.1 Occurrence Records

During the ALA’s data ingestion process, each submitted occurrence record is given a Match Type. A match type of “exactMatch” means the supplied name is identical, or near identical, to a name in the index. A “fuzzyMatch” means the name has been placed with a close, but not identical, match. A “higherMatch” means the record could not be placed at the rank supplied and has instead been matched at the lowest rank at which a match could be found (for example, a subspecies name matched at species level). The ALA then populates each record with data describing the full taxonomic hierarchy. Figure 5 shows an example of the difference between the taxonomic data supplied to the ALA and that which the ALA has added to the record by name matching to the index.


Figure 5. Comparison table on an occurrence record of the Original, or supplied species data, against the complete taxonomic data (see Processed Values), derived after matching to the accepted name.
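The order in which match types are tried can be sketched as a simple cascade. This is not the ALA's name matching algorithm: the match_name function, the 0.9 similarity cutoff, the use of Python's difflib for fuzzy comparison, the "noMatch" label and the tiny ACCEPTED set are all assumptions made to illustrate the idea.

```python
import difflib

# A tiny stand-in for the names index (illustrative only).
ACCEPTED = {"Macropus giganteus", "Osphranter rufus", "Vombatus ursinus"}


def match_name(supplied, accepted=ACCEPTED, cutoff=0.9):
    """Return (matched_name, match_type) for a supplied scientific name."""
    # Normalise whitespace and capitalisation before comparing.
    name = " ".join(supplied.split())
    name = name[:1].upper() + name[1:].lower()

    if name in accepted:
        return name, "exactMatch"

    # Fuzzy step: only compare against names with the same number of words,
    # so a trinomial is not mistaken for a misspelt binomial.
    word_count = len(name.split())
    candidates = sorted(n for n in accepted if len(n.split()) == word_count)
    close = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    if close:
        return close[0], "fuzzyMatch"

    # Higher step: drop trailing epithets until a known name is found.
    parts = name.split()
    while len(parts) > 1:
        parts.pop()
        parent = " ".join(parts)
        if parent in accepted:
            return parent, "higherMatch"

    return None, "noMatch"


for supplied in ("vombatus ursinus", "Vombatus ursinis", "Macropus giganteus giganteus"):
    print(supplied, "->", match_name(supplied))
```

Trying the strictest comparison first means a record only falls back to a fuzzy or higher-rank match when no better placement is available.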

3.2 Species Lists

The ALA's Lists tool is a repository of species checklists, some of which are used by other tools in the ALA and include lists of threatened species and sensitive species. Loading a list will automatically use the ALA's name matching service to match any supplied scientificName field to the unique accepted name from the names index (Figure 6). This function is available to any of the ALA's users. Note that many of the lists in the tool are not maintained by the ALA.

Figure 6: The EPBC list of threatened species in the ALA's list tool, showing how the name in the supplied list is matched to the ALA's names index, indicated by the 'Scientific Name (matched)' column.

3.3 Species API

The ALA's species search API, sometimes referred to as the Biodiversity Information Explorer (BIE), for searching species and classification information is freely available to users. The most commonly used query is the generic search.
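A generic search request is typically an HTTP GET with the search term passed as a query parameter. The sketch below only builds such a URL; it does not call the service. The base URL and the parameter names q and pageSize are assumptions for illustration, so check the ALA's API documentation for the current endpoint and parameters before using them.

```python
from urllib.parse import urlencode

# Assumed base URL for the species search service; confirm against the
# ALA API documentation before relying on it.
BASE_URL = "https://api.ala.org.au/species/search"


def build_search_url(query, page_size=10, base_url=BASE_URL):
    """Build a generic species search URL (parameter names are assumptions)."""
    params = {"q": query, "pageSize": page_size}
    return f"{base_url}?{urlencode(params)}"


print(build_search_url("Osphranter rufus"))
```

Using urlencode keeps spaces and special characters in scientific names safely escaped in the request.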

Useful Examples

1. Resolving an unrecognised name using a higher match

The example in Figure 7 shows the supplied name for a record was a subspecies of the grey kangaroo (Macropus giganteus giganteus). This name is not currently accepted in the NSL and so the name has been matched to the higher rank (species level) during processing: the grey kangaroo (Macropus giganteus). Figure 7 shows the Original vs Processed comparison table for the record.


Figure 7. An example of the supplied value of a record that matched to a higher rank to align with the current accepted naming for Macropus giganteus (Eastern Grey Kangaroo).

2. Resolving an older name using a synonym match

A name that has been supplied as a well-known older name (for example, Macropus rufus) will still match to the correct accepted name (in this case Osphranter rufus) via the use of synonym matching. Figure 8 shows an example of a record with this type of match, and Figure 9 shows a species page, which clearly lists all synonyms.

Figure 8. An example of a record whose supplied scientific name (Macropus rufus) matched, via a synonym, to the current accepted name Osphranter rufus.


Figure 9. The species information page for Macropus giganteus (Eastern Grey Kangaroo), showing a list of all synonyms.
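Synonym matching can be pictured as a lookup from an older name to its accepted name. The sketch below is illustrative only: the resolve function and its return structure are assumptions, and the single synonym pair is taken from the example in this section.

```python
# Synonym -> accepted name; the single pair comes from the example above.
SYNONYMS = {"Macropus rufus": "Osphranter rufus"}
ACCEPTED = {"Osphranter rufus", "Macropus giganteus"}


def resolve(supplied):
    """Resolve a supplied name to an accepted name, following synonyms."""
    if supplied in ACCEPTED:
        return {"supplied": supplied, "matched": supplied, "via_synonym": False}
    if supplied in SYNONYMS:
        return {"supplied": supplied, "matched": SYNONYMS[supplied], "via_synonym": True}
    return {"supplied": supplied, "matched": None, "via_synonym": False}


print(resolve("Macropus rufus"))
```

Keeping the supplied name alongside the matched name mirrors the Original versus Processed comparison shown on occurrence records.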

Summary

The ALA does not produce its own taxonomy but relies upon authoritative sources, principally the National Species List, which it aggregates with supplementary sources.
If a perceived issue with the taxonomy is identified, the ALA will pass the issue on to the relevant source (if the issue is taxonomic) or rectify the issue in the next names loading (if the issue arises from the ALA’s aggregation).
Maintaining a large indexable database is complex, especially when the backbone of the system is ever-changing.
The dynamic nature of taxonomy means we require systems for processing old names, synonyms, etc.
The ALA creates a names index, which we then use to catalogue all incoming records to the best available name.

FAQ

Why can’t I find a species I know exists?

While the ALA goes to considerable effort to aggregate available information about all Australian species, there are some circumstances in which the ALA may not display information about a species you are interested in. These include:

The species may have changed due to a re-classification
The species is not listed in any of the sources of the ALA’s species names, e.g. it may be very newly described, or the name is a manuscript or phrase name
The spelling used may be different from that officially recognised
The common name used may not be recognised by the ALA

Why are Dingoes not represented in the Atlas?

The ALA draws its taxonomic information from several sources; for animals this is the National Species List (Australian Faunal Directory). The AFD lists the dingo as a synonym of Canis familiaris, reflecting current scientific opinion that dingoes are a distinct variety of dog. The AFD’s page on C. familiaris describes why dingoes are currently classified with common dogs.

We recognise that you might still want to search for dingoes and not other common dogs (which are all classified as C. familiaris). People who provide records of dingoes to the ALA can specifically say that they are dingoes. This is stored as the supplied common name, so you can search for records in the ALA that specifically include "dingo" as a name.

If you dive deeper and look at the names section of ALA’s Canis familiaris species page you’ll see that in the past dingoes have been classified as Canis dingo, Canis familiaris australasiae, Canis australiae, Canis macdonnellensis, Canis dingoides and even Canis lupus dingo.

Here are some recent papers which explore the phylogeny of the Dingo:

Field, M.A., Yadav, S., Dudchenko, O., Esvaran, M., Rosen, B.D., Skvortsova, K., Edwards, R.J., Keilwagen, J., Cochran, B.J., Manandhar, B. and Bustamante, S., 2022. The Australian dingo is an early offshoot of modern breed dogs. Science Advances, 8(16), p.eabm5944. https://doi.org/10.1126/sciadv.abm5944

Jackson, S.M., Fleming, P.J., Eldridge, M.D., Archer, M., Ingleby, S., Johnson, R.N. and Helgen, K.M., 2021. Taxonomy of the Dingo: It’s an ancient dog. Australian Zoologist, 41(3), pp.347-357. https://doi.org/10.7882/AZ.2020.049

Why are there duplicate species?

The ALA is bringing together information from many sources and seeks to link together information provided under different scientific names, where these names are considered synonyms of a single species. However, many names which should be treated as synonyms are not covered by our data sources.