

Why does the ALA need a taxonomic backbone?


The Atlas of Living Australia (ALA) is a biodiversity data repository focused primarily on observations of individual life forms. These observations are referred to as species occurrence records, and they tell us at least three things: which species is found where and when. 

Because biodiversity data are collected from many different sources and across time, they may be provided to us in a variety of formats and contain information that has not remained consistent over time. Species names, which are the product of the science of taxonomy, are one component of this dynamism. Names change over time and place, and can be inconsistent or conflicting as researchers make new discoveries and the understanding of species relationships changes. As a result, species names and classifications need to be kept up to date, and occurrence records of the same species with conflicting or varying names need to be aggregated or separated; otherwise we risk presenting incomplete or scattered data for a single species.


Learn more about the science of taxonomy and why species names change over time in our Introduction to Taxonomy article.


The ALA manages names as we do the other dynamic components of incoming data, by applying the global data standard for exchanging biodiversity data. This standard is called Darwin Core, and it describes vocabularies and data formats for recording species, location and other data relevant to capturing biodiversity information.
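
As a rough illustration, a single occurrence record expressed with Darwin Core terms might look like the sketch below. The term names (scientificName, eventDate, decimalLatitude and so on) are standard Darwin Core terms, but the values, the REQUIRED mapping and the missing_answers helper are invented for this example and are not part of any ALA system.

```python
# A hypothetical occurrence record keyed by Darwin Core term names.
# All values are made up for illustration only.
record = {
    "occurrenceID": "example-0001",
    "basisOfRecord": "HumanObservation",
    "scientificName": "Vombatus ursinus",
    "vernacularName": "Bare-nosed Wombat",
    "eventDate": "2023-05-14",
    "decimalLatitude": -35.28,
    "decimalLongitude": 149.13,
}

# The three questions an occurrence record should answer, and the
# Darwin Core terms assumed here to answer each of them.
REQUIRED = {
    "which species": ["scientificName"],
    "where": ["decimalLatitude", "decimalLongitude"],
    "when": ["eventDate"],
}


def missing_answers(rec):
    """Return the questions that the record cannot answer."""
    return [
        question
        for question, terms in REQUIRED.items()
        if any(rec.get(term) in (None, "") for term in terms)
    ]


print(missing_answers(record))  # []
```

A record that supplies only a name would be reported as missing "where" and "when".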


This article explains the process used by the ALA in which different authoritative sources of names are collected and transformed into a standardised taxonomic backbone that can be used across all of the ALA’s infrastructure. 

 

We match names to build an index of occurrence records searchable by species: this works by joining like-with-like and building an index tree. A list of scientific names helps us search for species even if the names provided to us are slightly different. A common view of the taxonomic tree enables people to find records across higher taxa. For example, it allows users to find all records of birds by searching for “all records indexed to class:AVES”.  


Each occurrence record provided to the ALA that can be matched is matched to a scientific name within the taxonomic backbone. All the information we know about that scientific name, its identifiers, and its location in the hierarchy, is stored as part of the occurrence record. 
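
The idea of searching across higher taxa can be sketched as follows. This is a minimal illustration only, assuming each matched record carries its classification as a simple dictionary; the record layout and the build_index and search functions are hypothetical and do not describe the ALA's actual index.

```python
from collections import defaultdict

# Toy occurrence records after name matching. Each one carries part of the
# classification of its matched name (simplified for illustration).
records = [
    {"id": 1, "scientificName": "Dromaius novaehollandiae",
     "classification": {"class": "Aves", "genus": "Dromaius"}},
    {"id": 2, "scientificName": "Menura novaehollandiae",
     "classification": {"class": "Aves", "genus": "Menura"}},
    {"id": 3, "scientificName": "Vombatus ursinus",
     "classification": {"class": "Mammalia", "genus": "Vombatus"}},
]


def build_index(recs):
    """Map (rank, NAME) to a list of record ids, ignoring case in the name."""
    index = defaultdict(list)
    for rec in recs:
        for rank, name in rec["classification"].items():
            index[(rank, name.upper())].append(rec["id"])
    return index


def search(index, query):
    """Look up a 'rank:NAME' query such as 'class:AVES'."""
    rank, name = query.split(":", 1)
    return index.get((rank.strip().lower(), name.strip().upper()), [])


index = build_index(records)
print(search(index, "class:AVES"))  # [1, 2]
```

Because the classification is stored on every record, a query at any rank is a single lookup rather than a walk through the tree.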

 

How does the ALA source taxonomic information?


The ALA is not an authoritative source of taxonomy and does not maintain its own taxonomy. Rather, we draw taxonomic information from the National Species List (NSL), maintained by the Australian Biological Resources Study within the Australian Government, with input from the Australian Plant Census, coordinated by the Council of Heads of Australasian Herbaria. The NSL is the national source for species names in Australia. The NSL is supplemented with a small group of other sources to complete our taxonomic coverage to best reflect a complete taxonomy. The ALA updates links to these sources regularly but will often be a number of months behind current taxonomy. The NSL should always be checked as the source of truth. 


It is also important to recognise that the ALA matches species records to the most probable current name. If a species is split into new species and the ALA is unable to clearly allocate a record, the record will remain with the original species name or be allocated to the genus level instead. For example, the grassland earless dragon (Tympanocryptis lineata pinguicolla) was recently split into T. pinguicolla, T. lineata, T. osbornei and T. mccartneyi. Because the ALA cannot currently split these records using names alone, most records will remain with T. pinguicolla or T. lineata.


The ALA also only attempts to provide names for Australian species, or for Australian and New Zealand plants. You will not find most international species in the taxonomy, except for species regarded as a biosecurity threat. However, if you are interested in records of species from outside Australia held in Australian collections, for example the African elephant (Loxodonta africana), a search for records will return any digital record for this species, even though it is not in the taxonomy.



How does the taxonomic backbone work in the ALA?

 

The workflow in the ALA for creating a taxonomic backbone and applying it to occurrence records from hundreds of different data providers is described in the following steps:


  1. Combine taxonomy data from multiple authoritative sources
  2. Build an index of names and classify them into a tree
  3. Match incoming occurrence records to names in the index



1. Combine taxonomy data from multiple authoritative sources


The ALA merges data from the following sources. Figure 1 shows the order of priority given to each.

  • the National Species List (NSL), which comprises the:
    • Australian Faunal Directory (AFD) for the animal kingdom
    • Australian Plant Census (APC) - an evolving consensus view of plant taxonomy
    • Australian Plant Name Index (APNI) for Australian plants, and 
    • AusFungi, AusMoss, AusLichen and algal lists 
  • ALA species refinement lists, which cover species missing from the NSL in Commonwealth, State and Territory conservation, sensitive, biosecurity, pest, weed and migratory lists
  • the New Zealand Organisms Register (NZOR), which supplements the NSL so that NZ and Australian herbarium species can be included, as well as algae, nematodes, protists, bacteria and viruses
  • the Catalogue of Life (CoL), which fills remaining gaps in protists, algae, bacteria, nematodes and viruses



ordered list of the priority of sources the ALA uses for taxonomic names

Figure 1. Priority for building a taxonomic tree at the ALA. This is: 1) AusFungi, 2) AusMoss and AusLichen, 3) AFD, 4) APNI and APC, 5) ALA lists (Conservation list, Biosecurity list, Pest and Weed list and Migratory Species list), 6) NZOR, 7) Catalogue of Life (CoL).


Previously the ALA merged taxonomies and then prioritised the names source. The revised taxonomy introduced in 2024 minimises the merging of taxonomic sources. For fauna, the National Species List (Australian Faunal Directory) is now the only accepted source, but it is supplemented by lists generated by the ALA covering species from conservation lists, biosecurity threats, identified pests and weeds, and migratory species. These lists supply species and synonym corrections for species too recent to be in the NSL, or address issues such as spelling errors. For vascular plants, the National Species List (the Australian Plant Census supplemented by the Australian Plant Name Index and New Zealand Organisms Register) is the major source, again supplemented by ALA-generated lists.


For non-vascular plants, nematodes, fungi, protists, algae, bacteria and viruses, the more traditional merge process is still used but will be reviewed later in 2024-25. In most cases, where sources disagree or double up, merging partly resolves this. If there is disparity in classification between NZOR and AusFungi for example, then the AusFungi classification will take precedence as it is an Australian resource. However, if there is disparity between NZOR and CoL, then NZOR will be used. If, however, the species is only present in CoL, then this classification will be used. 
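
The precedence rules described above can be sketched as a simple ranked lookup. The ranks below follow the order given in Figure 1; the SOURCE_RANK table, the pick_classification function and the placeholder classification strings are assumptions made for illustration, not the ALA's merge implementation.

```python
# Source priority following Figure 1 (a lower number means higher priority).
SOURCE_RANK = {
    "AusFungi": 1,
    "AusMoss": 2, "AusLichen": 2,
    "AFD": 3,
    "APNI": 4, "APC": 4,
    "ALA lists": 5,
    "NZOR": 6,
    "CoL": 7,
}


def pick_classification(candidates):
    """Given {source: classification} for one name, return the (source,
    classification) pair from the highest-priority source that supplies it.

    Sources sharing a rank are resolved by dictionary order, which is an
    arbitrary choice made only for this sketch.
    """
    source = min(candidates, key=SOURCE_RANK.__getitem__)
    return source, candidates[source]


# Placeholder classifications: the strings stand in for real hierarchies.
print(pick_classification({"NZOR": "classification A", "AusFungi": "classification B"}))
print(pick_classification({"CoL": "classification C", "NZOR": "classification A"}))
print(pick_classification({"CoL": "classification C"}))
```

The three calls mirror the three cases in the text: AusFungi wins over NZOR, NZOR wins over CoL, and CoL is used when it is the only source.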


 

2. Build an index of names and classify them into a tree


The combined sources are then used to create an index of names, or 'names index', containing all of the information provided by the sources, including the unique names, identifiers and higher level classifications, to build the view of a taxonomic tree.


The sources for names include:

  • accepted names - the best available current scientific name
  • synonyms - names for species that now go by a different name
  • Indigenous knowledge names
  • common names, or vernacular names
  • identifiers
  • phylogenetic tree information
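
One way to picture a names index is as a set of entries that each point to their parent, so that the tree can be rebuilt by following those links. The sketch below assumes a simplified entry layout; the NameEntry fields and the identifiers t1 to t4 are hypothetical, and the tree omits intermediate ranks for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NameEntry:
    taxon_id: str                      # hypothetical identifier, not a real NSL id
    scientific_name: str
    rank: str
    status: str                        # e.g. "accepted" or "synonym"
    parent_id: Optional[str] = None    # link to the next rank up the tree
    accepted_id: Optional[str] = None  # for synonyms: the accepted entry


# A deliberately shortened tree (ranks such as order are left out).
entries = [
    NameEntry("t1", "Mammalia", "class", "accepted"),
    NameEntry("t2", "Vombatidae", "family", "accepted", parent_id="t1"),
    NameEntry("t3", "Vombatus", "genus", "accepted", parent_id="t2"),
    NameEntry("t4", "Vombatus ursinus", "species", "accepted", parent_id="t3"),
]
by_id = {entry.taxon_id: entry for entry in entries}


def classification(taxon_id):
    """Walk parent links to the root and return {rank: name}, top rank first."""
    chain = []
    current = by_id.get(taxon_id)
    while current is not None:
        chain.append((current.rank, current.scientific_name))
        current = by_id.get(current.parent_id)
    return dict(reversed(chain))


print(classification("t4"))
```

Storing only the parent link keeps each entry small, while the full hierarchy for any name can still be derived on demand.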


There are several other terms that you will encounter that represent variations on "accepted" but of lower certainty. "Unreviewed" means a name that has been derived (generally from the Australian Plant Name Index) and included in the backbone as the probable best available name, but that has not been subject to the scrutiny of the Australian Plant Census. "Inferred" before "Accepted" or "Synonym" means that the ALA has taken a decision to place or accept a name, but that decision has not been formally reviewed by an expert. This will generally include things such as spelling errors encountered in threatened species lists or similar. "Unranked" is a name that has been included but which is of very low reliability.


All this data is made publicly available on the ALA's species pages (Figure 2 uses the ALA's bare-nosed wombat species page as an example).

Figure 2: Information from the authoritative taxonomic sources is available on the ALA's species pages in the Names and Classification tabs.



The Names tab (Figure 3) includes all of the names (synonyms, alternative spellings and vernacular names) and identifiers from various sources that have been allocated to that taxonomic unit. 

Figure 3: An example of the Names tab on a species page showing the Accepted Name, any synonyms, vernacular names, identifiers and the sources of each.  



The Classification tab (Figure 4) shows how the taxonomic unit sits within the taxonomic backbone. The taxonomic backbone can be used to navigate up and down the taxonomic tree.

Figure 4: An example of the Classification tab on a species page, showing the full taxonomic tree for that taxon, including links to the species pages in the ALA for each of those taxa/ranks.


3. Match incoming records to names in the index


3.1 Occurrence Records


During the ALA’s data ingestion process, occurrence records that are submitted with the most current species name match directly to an accepted name and are given a Match Type of “exactMatch”. The ALA will populate each record with data describing the full taxonomic hierarchy.  Figure 5 shows an example of the difference between the taxonomic data supplied to the ALA and that which the ALA has added to the record by name matching to the index. 


table showing 'original' versus 'processed' values

Figure 5. Comparison table on an occurrence record of the Original, or supplied species data, against the complete taxonomic data (see Processed Values), derived after matching to the accepted name.
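
The Original versus Processed comparison can be sketched in a few lines. This is an illustration only: the NAMES_INDEX contents are simplified, the process function is hypothetical, and apart from "exactMatch" (which is the Match Type named above) the "noMatch" label is an assumption.

```python
# Toy names index: accepted name -> classification (simplified, illustrative).
NAMES_INDEX = {
    "Vombatus ursinus": {
        "kingdom": "Animalia",
        "class": "Mammalia",
        "family": "Vombatidae",
        "genus": "Vombatus",
    },
}


def process(original):
    """Return 'processed' values for a supplied record.

    The original record is left untouched so that supplied and processed
    values can be shown side by side.
    """
    name = original.get("scientificName", "").strip()
    classification = NAMES_INDEX.get(name)
    if classification is None:
        return {"matchType": "noMatch"}  # hypothetical label
    return {"scientificName": name, "matchType": "exactMatch", **classification}


original = {"scientificName": "Vombatus ursinus"}
processed = process(original)

# Print a simple Original vs Processed table.
for field in sorted(set(original) | set(processed)):
    print(f"{field:15} {original.get(field, ''):20} {processed.get(field, '')}")
```

Only scientificName appears in the Original column; every other row is filled in from the names index.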



3.2 Species Lists


The ALA's Lists tool is a database of species checklists. Internally, it is used to manage lists such as conservation status and sensitive species lists. Loading a list will automatically use the ALA's name matching service to match any supplied scientificName field to the unique accepted name from the names index (Figure 6). This function is available to any of the ALA's users. Note that many of the lists in the tool are not maintained by the ALA.


Figure 6: The EPBC list of threatened species in the ALA's list tool, showing how the name in the supplied list is matched to the ALA's names index, indicated by the 'Scientific Name (matched)' column.



3.3 Species API


The ALA's species search API, sometimes referred to as the Biodiversity Information Explorer (BIE), for searching species and classification information is freely available to users. The most commonly used query is the generic search.
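
A generic search is typically a GET request with the search text passed as a query parameter. The sketch below only builds such a URL and sends nothing. The base URL is a placeholder, and the "q" and "pageSize" parameter names are assumptions, so all three should be checked against the ALA's API documentation before use.

```python
from urllib.parse import urlencode

# Placeholder only: substitute the base URL given in the ALA API documentation.
BASE_URL = "https://example.org/species"


def build_search_url(query, base_url=BASE_URL, **params):
    """Build a GET URL for a generic search.

    The 'q' parameter name and any extra parameters are assumptions that
    should be confirmed against the API documentation.
    """
    return f"{base_url}/search?{urlencode({'q': query, **params})}"


print(build_search_url("Vombatus ursinus"))
# https://example.org/species/search?q=Vombatus+ursinus
```

Keeping URL construction separate from the request itself makes it easy to inspect the query before anything is sent.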




Useful Examples


1. Resolving names from different authorities


The fungal species Malassezia pachydermatis has names from two different naming authorities (Figure 7). Malassezia pachydermatis (Weidman) C.W.Dodge is the accepted name. Although the same name is also available from NZOR, it is taken from AusFungi because AusFungi takes precedence over NZOR. Other synonyms applied to the name are Pityrosporum canis Gustafson (NZOR) and Pityrosporum pachydermatis Weidman (AusFungi). The ALA will use the identifier provided by AusFungi to uniquely identify this species.


species information page for a fungi species, showing the names tab with accepted name from AusFungi and synonyms from AusFungi and NZOR

Figure 7: Information about the naming authority used is displayed directly below the species name.



2. Resolving an unrecognised name using a higher match


The example in Figure 8 shows the supplied name for a record was a subspecies of the grey kangaroo (Macropus giganteus giganteus). This name is not currently accepted in the NSL and so the name has been matched to the higher rank (species level) during processing: the grey kangaroo (Macropus giganteus). Figure 8 shows the Original vs Processed comparison table for the record.


screenshot showing a table of original versus processed values for 'scientific name'

Figure 8. An example of the supplied value of a record that matched to a higher rank to align with the current accepted naming for Macropus giganteus (Eastern Grey Kangaroo).
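
A higher match of this kind can be approximated by trimming the supplied name one word at a time until something in the index is found. This is a simplified sketch: the ACCEPTED_NAMES table and match_with_fallback function are hypothetical, the "higherMatch" label is an assumption, and real names with authorship or other qualifiers would need proper parsing first.

```python
# Toy index of accepted names and their ranks (illustrative only).
ACCEPTED_NAMES = {
    "Macropus giganteus": "species",
    "Macropus": "genus",
}


def match_with_fallback(supplied):
    """Try the full name, then drop trailing words one at a time
    (subspecies -> species -> genus) until an accepted name is found."""
    parts = supplied.split()
    for n in range(len(parts), 0, -1):
        candidate = " ".join(parts[:n])
        if candidate in ACCEPTED_NAMES:
            # "higherMatch" is an assumed label for a match at a higher rank.
            match_type = "exactMatch" if n == len(parts) else "higherMatch"
            return {
                "matchedName": candidate,
                "rank": ACCEPTED_NAMES[candidate],
                "matchType": match_type,
            }
    return None


print(match_with_fallback("Macropus giganteus giganteus"))
```

The supplied subspecies is not in the toy index, so the record is matched at species level instead.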



3. Resolving an older name using a synonym match


A name that has been supplied as a well-known older name (for example, Macropus rufus) will still match to the correct accepted name (in this case Osphranter rufus) via the use of synonym matching. Figure 9 shows an example of a record with this type of match, and Figure 10 shows how a species page lists all synonyms for a name.


a table showing the matched scientific name of Osphranter rufus, as well as the supplied scientific name 'macropus rufus'

Figure 9. An example of a record whose supplied name (Macropus rufus) matched via a synonym to the current accepted name, Osphranter rufus.



species name page showing a list of several synonyms for scientific name for macropus giganteus (eastern grey kangaroo)

Figure 10. The species information page for Macropus giganteus (Eastern Grey Kangaroo), showing a list of all synonyms.
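
Synonym matching can be pictured as a lookup in which older names point to the current accepted name. The sketch below assumes a small index keyed on a normalised (lower-case, single-spaced) form of the name; the INDEX table, the resolve function and the "synonymMatch" label are hypothetical.

```python
# Toy index: normalised name -> (status, accepted name).
INDEX = {
    "osphranter rufus": ("accepted", "Osphranter rufus"),
    "macropus rufus": ("synonym", "Osphranter rufus"),
}


def resolve(supplied):
    """Resolve a supplied name to its accepted name, following synonyms."""
    key = " ".join(supplied.lower().split())  # ignore case and extra spaces
    entry = INDEX.get(key)
    if entry is None:
        return None
    status, accepted = entry
    return {
        "supplied": supplied,
        "matched": accepted,
        # "synonymMatch" is an assumed label for a match made via a synonym.
        "matchType": "exactMatch" if status == "accepted" else "synonymMatch",
    }


print(resolve("macropus rufus"))
```

The supplied value is kept alongside the matched name, so the original wording of the record is never lost.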




Summary

  • The ALA does not produce its own taxonomy but relies upon authoritative sources, principally the National Species List, which it aggregates with supplementary sources.  
  • If a perceived issue with the taxonomy is identified, the ALA will pass the issue on to the relevant source (if the issue is taxonomic) or rectify the issue in the next names loading (if the issue arises from the ALA’s aggregation).
  • Maintaining a large indexable database is complex, especially when the backbone of the system is ever-changing.
  • The dynamic nature of taxonomy means we require systems for processing old names, synonyms, etc.
  • The ALA creates a names index, which we then use to catalogue all incoming records to the best available name. 




FAQ


Why can’t I find a species I know exists?


While the ALA goes to considerable effort to aggregate available information about all Australian species, there are some circumstances in which the ALA may not display information about a species you are interested in. These include:

  • The species name may have changed following reclassification
  • The species is not listed in any of the sources of the ALA’s species names, e.g. it may be very newly described, or the name is a manuscript or phrase name.
  • The spelling used may be different from that officially recognised
  • The common name used may not be recognised by the ALA



Why are Dingoes not represented in the Atlas?


The ALA draws its taxonomic information from several sources; for animals this is the National Species List (Australian Faunal Directory). The AFD lists the dingo as a synonym of Canis familiaris, reflecting current scientific opinion that dingoes are a distinct variety of dog. The AFD’s page on C. familiaris describes why dingoes are currently classified with common dogs.

 

We recognise that one might still want to search for dingoes and not other common dogs (which are all classified as C. familiaris). People who provide records of dingoes to the ALA can also specifically say that they are dingoes. This is stored as the supplied common name, so you can search for records in the ALA that specifically include "dingo" as a name.


If you dive deeper and look at the names section of ALA’s Canis familiaris species page you’ll see that in the past dingoes have been classified as Canis dingo, Canis familiaris australasiae, Canis australiae, Canis macdonnellensis, Canis dingoides and even Canis lupus dingo. 
 

Here are some recent papers which explore the phylogeny of the Dingo:

 

Field, M.A., Yadav, S., Dudchenko, O., Esvaran, M., Rosen, B.D., Skvortsova, K., Edwards, R.J., Keilwagen, J., Cochran, B.J., Manandhar, B. and Bustamante, S., 2022. The Australian dingo is an early offshoot of modern breed dogs. Science Advances, 8(16), p.eabm5944. https://doi.org/10.1126/sciadv.abm5944

 

Jackson, S.M., Fleming, P.J., Eldridge, M.D., Archer, M., Ingleby, S., Johnson, R.N. and Helgen, K.M., 2021. Taxonomy of the Dingo: It’s an ancient dog. Australian Zoologist, 41(3), pp.347-357. https://doi.org/10.7882/AZ.2020.049

 

Why are there duplicate species?


The ALA is bringing together information from many sources and seeks to link together information provided under different scientific names, where these names are considered synonyms of a single species. However, many names which should be treated as synonyms are not covered by our data sources.