When you are developing your surveys in BioCollect, it is important to think about how you can make your data high quality. If you choose to share your data with the ALA, it may be used many hundreds of times in the future, and in ways that you had not considered.

BioCollect has been designed to make it as easy as possible for you to collect high-quality data with good metadata. Taking the time to fill out your survey metadata, and to design a form that collects good data, will help future users to make the most of your data.

Below are a few things you should think about when designing your project and survey.

Determining fitness for use

When you are thinking about the quality of your data, it may be helpful to consider what future users would want to know about it. Below are some questions you should be able to answer about your data. The answers should be documented in your survey and project information.

  • Where has the data come from?
  • Who collected it?
  • Are the accuracy and precision of the data adequate for my intended use?
  • Were the collection, treatment methods and equipment suitable for my intended use?
  • Is the data comprehensive & complete?
  • Were the collection and treatment methods applied consistently across the whole dataset?
  • What quality assurance, curation, validation & management processes have been applied to the data?
  • What are the known and implied biases in the data?
  • What conditions apply to using the data?
  • How should the dataset be cited or referenced?
  • How can the data be accessed and in what format(s) is it available?
  • Is a data management plan available for the data?

Data quality general principles

  • Metadata that describes the processes and protocols by which the data was created and treated is critical in helping data consumers make informed choices about fitness for use.
  • Accuracy & precision are important for data interpretation and analysis.
  • The more accurate a record is, the higher its “quality”, as it has more potential for re-use across a wider range of situations.
  • Data precision can be reduced for a particular use, but it cannot be enhanced. Therefore, the more precise you can be at the point of making a record, the greater the utility of the record.
  • Software design is an important factor in improving data quality (DQ) and quality assurance (QA) processes and outcomes, but it is only part of the solution. Project and data owners also need to take responsibility for these processes and employ a range of off-system solutions.
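
The principle that precision can be reduced but not enhanced can be illustrated with location coordinates. The following is a minimal Python sketch and is not part of BioCollect: the function name reduce_precision and the sample coordinates are hypothetical, chosen only for illustration. It rounds two distinct locations to one decimal place and shows that they become identical, so the original detail cannot be recovered from the coarser record.

```python
def reduce_precision(lat, lon, decimals):
    """Round a latitude/longitude pair to a given number of decimal places.

    Hypothetical helper for illustration only; not a BioCollect function.
    """
    return (round(lat, decimals), round(lon, decimals))


# Two distinct sightings, each recorded to five decimal places
# (sample values, assumed for this example).
site_a = (-35.27603, 149.11215)
site_b = (-35.31288, 149.08840)

# Reducing precision is always possible.
coarse_a = reduce_precision(*site_a, decimals=1)
coarse_b = reduce_precision(*site_b, decimals=1)

print(coarse_a)  # (-35.3, 149.1)
print(coarse_b)  # (-35.3, 149.1)

# After rounding, the two records are indistinguishable, so there is
# no way to work back from the coarse value to either original location.
print(coarse_a == coarse_b)  # True
```

Recording at the highest precision available, and reducing it later only where a particular use calls for it, keeps both options open.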