Linking Sequence Data Records to their Voucher Specimens

Molecular data such as DNA sequences are an important and growing component of biological research.  In research fields such as taxonomy and evolutionary biology, it is critically important to associate gene sequence data with morphological, ecological, behavioral, and other types of data.  The goal of this online registry is to create a system that permits records in nucleotide sequence databases (as well as other kinds of databases) to include links that point to the voucher specimens from which the DNA sequences were derived.

In 2005, the Consortium for the Barcode of Life (CBOL) proposed to GenBank, at the National Center for Biotechnology Information (NCBI), a method for linking sequence records to voucher specimens.  This method was developed in collaboration with the Global Biodiversity Information Facility (GBIF) and other major biodiversity database initiatives.  The linkage uses a structured data format (see FAQs) based on the Darwin Core data standards developed by Biodiversity Information Standards (TDWG, formerly the Taxonomic Databases Working Group).  The data format was accepted by GenBank and then proposed to EMBL and DDBJ, the other members of the International Nucleotide Sequence Database Collaboration (INSDC).  The structured data field for voucher specimens was approved by the INSDC in May 2005.

Institutional Acronyms and Collection Codes

The structured data field for voucher specimens consists of three parts:

The universally-recognized acronym for the institution that holds the voucher specimen,

The institution’s code for the collection in which the voucher specimen is kept, and

The unique catalog number (or other identifier) in the catalog of specimens in that collection.

By combining these three elements, the voucher specimen data field should point to a unique specimen.  A voucher specimen may be the source of many samples (frozen tissue, skeleton, DNA, etc.), and associating these samples with the original specimen can be a significant challenge.  This registry does not address that issue.
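The three-part structure can be sketched in code. The following is a minimal illustration only, assuming the parts are written in a colon-separated form such as ABC:Birds:12345; the names VoucherId and parse_voucher are hypothetical and are not part of the registry or of any database software.

```python
from typing import NamedTuple


class VoucherId(NamedTuple):
    """Hypothetical container for the three parts of a voucher identifier."""
    institution: str     # institutional acronym
    collection: str      # collection code within the institution
    catalog_number: str  # catalog number (or other identifier) in that collection


def parse_voucher(text: str) -> VoucherId:
    """Split 'institution:collection:catalog' into its three parts.

    Assumes a colon separator. The catalog number is kept as a string
    because identifiers are not guaranteed to be purely numeric.
    """
    parts = [p.strip() for p in text.split(":")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three non-empty parts, got {text!r}")
    return VoucherId(*parts)


voucher = parse_voucher("ABC:Birds:12345")
print(voucher.institution, voucher.collection, voucher.catalog_number)
```

Keeping the three parts as separate fields, rather than one opaque string, makes it possible to look up the institution and the collection independently.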

This online registry permits biorepositories to register their institutional acronyms and collection codes.

A “biorepository” can be

  1. a museum, herbarium, zoo, or botanical garden,
  2. a culture collection or medical research institute,
  3. an individual researcher or a private collector, or
  4. any other reference collection of biological specimens used for research.

The data collected through this registry will become part of NCBI’s data infrastructure and will be incorporated into the Biodiversity Collections Index (BCI), a central repository of information on biological collections.

Non-Institutional Biorepositories

Individual researchers and private collectors are urged to register their collections if specimens from those collections will serve as vouchers for molecular studies.  In this online registry, such collections are grouped together under the common institutional acronym “personal” and have unique collection codes under that institution.

Using the Registry

This online database was initially populated with approximately 7,000 institutional acronyms compiled by NCBI from publications (e.g., the 1993 Insect and Spider Collections of the World) and directories of repositories (e.g., Index Herbariorum).  

To register an institutional acronym,

Look for the institution in the database by going to Institution on the blue navigation bar above and selecting Find an Institution.  You can sort the list by acronym, institution name, or country, and you can search by acronym, institution name, city, state, or country.

If you find the institution and wish to register the acronym listed with the institution name, then click on Edit and fill in the required information.

If you find the institution but wish to register it under a different acronym, then go to Institution on the blue navigation bar and select Create an Institution.  Complete all the required information, save, and confirm the data.

If the acronym listed for your institution is also being used by another institution, a message to that effect will appear.  Please either select a different acronym or contact us to resolve the conflict.

After confirming the institutional data, you will be asked if you wish to register the collections within the institution.  Many institutions have multiple collections, and it’s essential to know the codes for the collections in order to distinguish specimens in different collections that might bear the same catalog numbers (e.g., ABC:Birds:12345 versus ABC:Fish:12345).  

Enter the required information for each of the collections maintained by the institution. If the institution has a single collection (as defined below), this step can be skipped and you can exit the registry.

After the institution and collection data have been entered, saved and confirmed, confirmation emails will be sent to all the contact addresses entered.  The institution will be contacted by email or phone to confirm the authenticity of the entries. A new record will not appear on the registration site until it has been confirmed independently.

For the present purpose, a collection is defined as a set of specimens that are cataloged with a separate numbering system. If an institution uses a single numbering system for all its specimens, then it has a single catalog for a single collection. If an institution uses separate numbering systems for three different sets of specimens (e.g., mammals, birds, and fish), then it has three collections, each with its own catalog.
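Because a catalog number is unique only within its own collection, the collection code is what keeps identifiers such as ABC:Birds:12345 and ABC:Fish:12345 apart. A minimal sketch of that point follows; the records are hypothetical and do not describe real specimens.

```python
# Hypothetical specimen records keyed by
# (institution acronym, collection code, catalog number).
specimens = {
    ("ABC", "Birds", "12345"): "bird specimen",
    ("ABC", "Fish", "12345"): "fish specimen",
}

# Dropping the collection code collapses both records onto a single key,
# so the identifier no longer points to a unique specimen.
without_collection = {(inst, cat) for inst, _coll, cat in specimens}

print(len(specimens))           # number of distinct full identifiers
print(len(without_collection))  # number of distinct keys without the collection code
```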

Non-Institutional Collections

The registry may contain acronyms that have been used previously to identify specimens in GenBank that are held by individual researchers and private collectors.  For this reason, people wishing to register their non-institutional collections should use the procedure described above.  When they specify that theirs is a non-institutional collection, the system will automatically register the acronym as a collection within the shared institutional acronym “personal”. 
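The placement rule for non-institutional collections can be sketched as follows. This is an illustration of the behavior described above, not the registry's actual implementation; the function name and the use of None for "no collection registered yet" are assumptions.

```python
from typing import Optional, Tuple

# Shared institutional acronym named in the text above.
PERSONAL = "personal"


def placement(acronym: str, non_institutional: bool) -> Tuple[str, Optional[str]]:
    """Return (institution acronym, collection code) for a new registration.

    A non-institutional acronym becomes a collection code under the shared
    "personal" institution. An institutional acronym is registered as an
    institution, with collections to be added separately (None here).
    """
    if non_institutional:
        return (PERSONAL, acronym)
    return (acronym, None)


# "JSMITH" is a hypothetical acronym for a private collection.
print(placement("JSMITH", non_institutional=True))
print(placement("ABC", non_institutional=False))
```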

