This manual contains basic information on commonly used standards for accession documentation and formats for data exchange.
Genesys PGR (Plant Genetic Resources) is a free online global portal accessible at www.genesys-pgr.org that allows the exploration of the world’s crop diversity through a single website. The data published on Genesys follows the Multi-crop Passport Descriptors (MCPD) standard.
The manual introduces:
- The FAO WIEWS database and WIEWS Institute codes
- FAO/Bioversity Multi-crop Passport Descriptors
- Genesys extensions to MCPD
- Other standards relevant to accession documentation
Collections of plant genetic resources in genebanks document at least the following information for every accession:
Acquisition date, ACQDATE: the date on which the accession entered the collection
The accession number is the unique identifier assigned to material as it enters the collection. This identifier is often made up of a prefix, a sequence number, and sometimes a suffix.
The prefix is commonly used to differentiate between different crop collections maintained by a genebank.
Prefix | Collection |
---|---|
TMe | Manihot esculenta (Cassava) collection |
TVSu | Vigna subterranea (Bambara groundnut) collection |
TZm | Zea mays (Maize) collection |
The sequence number is assigned manually or by a computer system to ensure there are no duplicates. Some institutes prefer to zero-pad the number, as in 00000102.
A suffix allows for differentiating samples of the same original material. A suffix might be used after making a selection from the original accession (e.g. a single seed descent) to be maintained as a separate sample. The exact meaning of the suffix is different for every institute.
Prefix | Sequence number | Suffix | Accession number |
---|---|---|---|
TMe | 419 | | TMe-419 |
TVSu | 13 | | TVSu-13 |
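The composition of an accession number from its parts can be sketched in a few lines of Python. This is only an illustration: the function name `format_accession_number`, the hyphen separator (taken from the examples in the table) and the way a suffix is appended are assumptions, since every institute defines its own scheme.

```python
def format_accession_number(prefix, sequence, suffix="", separator="-", pad=0):
    """Join prefix, sequence number and optional suffix into one identifier.

    Hypothetical helper: the separator and suffix handling are assumptions.
    """
    # zfill left-pads with zeros; pad=0 leaves the number unchanged
    parts = [prefix, str(sequence).zfill(pad)]
    if suffix:
        parts.append(suffix)
    return separator.join(parts)


print(format_accession_number("TMe", 419))
print(format_accession_number("TVSu", 13))
print(format_accession_number("TZm", 102, pad=8))
```

Keeping the prefix, sequence number and suffix as separate values in the documentation system makes it easier to sort accessions numerically and to detect duplicate sequence numbers.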
Material enters a collection through collecting activities, breeding programs or acquisition from other institutes. In each case, the material will already have some kind of identifier assigned by the collector, breeder or other institute.
An accession name is the vernacular name of the material, and is commonly captured by the collector or assigned by the breeder.
Genebank accessions obtained through collecting missions should maintain data about the site, date of collection and collector information.
Lines developed by an institute’s breeding programs may be included in its collection. Information provided by the breeders should include the pedigree or ancestral information (selection history) of the material, along with names and identifiers used by the breeding program and the codes and names of institutes that developed the material.
Material coming from other institutes and genebanks must be accompanied by accession passport data as documented in the source genebank.
Country of origin is the country where the material was collected or bred, not the country of the source genebank.
Accession documentation should capture any identifiers provided by the source institute. This data allows for validation and curation of passport data between genebanks and allows researchers to obtain material from either collection.
Accession genus, species, species author, subtaxon and subtaxon authority are usually known, but are subject to change after expert identification or change in the taxonomic system.
GRIN Taxonomy for Plants and Mansfeld’s World Database of Agricultural and Horticultural Crops can be used to validate accession taxa.
Ex situ genebanks maintain plant genetic resources as seed, in the field, in vitro, in cryo or in DNA collections. A single accession may be maintained as several individual inventories or lots. Each inventory follows different management policies and is maintained in different conditions. For example, different inventories may be held in cryo and in vitro, or in base and active collections.
See Storage under the MCPD standard for more on how to capture multiple types of storage.
The United Nations Food and Agriculture Organization (FAO) maintains the World Information and Early Warning System (WIEWS) on Plant Genetic Resources for Food and Agriculture (PGRFA). WIEWS was established as a worldwide dynamic mechanism to foster information exchange among FAO Member Countries, and as an instrument for the periodic assessment of the state of the world’s PGRFA.
The FAO WIEWS database contains basic information about institutes working with PGRFA. The data includes full names, acronyms, website links and contact information.
Genesys regularly updates the list of institutes from the FAO WIEWS database and makes them accessible at https://www.genesys-pgr.org/wiews/active.
This data cannot be directly managed through Genesys. Changes must be applied to the WIEWS database itself.
An FAO WIEWS Institute Code consists of the 3-letter ISO 3166-1 alpha-3 code of the country where the institute is located plus a number (e.g. COL001, USA1004).
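A basic shape check for institute codes could look like the sketch below. The pattern is an assumption inferred from the two examples above (three uppercase letters followed by at least three digits); it does not confirm that a code is actually registered in WIEWS, and the name `split_wiews_code` is hypothetical.

```python
import re

# Assumption: three uppercase letters followed by three or more digits,
# inferred from the examples COL001 and USA1004.
WIEWS_CODE = re.compile(r"([A-Z]{3})(\d{3,})")


def split_wiews_code(code):
    """Return (country_code, number) or None if the shape does not match."""
    match = WIEWS_CODE.fullmatch(code.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_wiews_code("COL001"))
print(split_wiews_code("USA1004"))
print(split_wiews_code("col-1"))
```

A check like this only catches typing mistakes; whether the code exists still has to be verified against the WIEWS institute list.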
The MCPD standard relies on WIEWS codes. The automated import of institute data through this code also allows Genesys to present individual pages for genebanks registered in the FAO WIEWS database.
WIEWS Institute Code | Genesys URL |
---|---|
COL001 | |
NGA039 | |
A new WIEWS code can be requested by contacting your National Focal Point or wiews@fao.org.
The WIEWS code of an institute may change. In that case, the old record is marked as inactive and will refer to the newly assigned code. Genesys will render a message stating that the institute record is archived, and provide a link to the new code.
The Multi-crop Passport Descriptors (MCPD) V.2.1 were released in 2015 as an update to the MCPD V.2 from 2012. The MCPD V.2, in turn, was a revision of the first FAO/IPGRI publication, released in 2001, and was expanded to accommodate emerging needs such as the broader use of GPS tools and the implementation of the International Treaty on Plant Genetic Resources for Food and Agriculture Multilateral System for access and benefit sharing.
The 2001 list, developed jointly by Bioversity International (formerly IPGRI) and FAO, has been widely used as the international standard to facilitate germplasm passport information exchange. These descriptors aim to be compatible with Bioversity’s crop descriptor lists, with the descriptors used for FAO WIEWS and with the Genesys PGR global portal.
For each MCPD, a brief explanation of content, coding scheme and, in parentheses, suggested fieldname are provided to assist in the computerized exchange of this type of data.
The authors of the MCPD recognize that networks or groups of users may further expand the descriptor list to meet their specific needs. As long as these additions allow for easy conversion to the format proposed in MCPD V.2 and V.2.1, basic passport data can be exchanged worldwide in a consistent manner.
Field name | Description |
---|---|
PUID | Any persistent unique identifier assigned to the accession so it can be unambiguously referenced at the global level and the information associated with it harvested through automated means. Report one PUID for each accession. |
INSTCODE | FAO WIEWS code of the institute where the accession is maintained. |
ACCENUMB | Unique identifier of the accession within a genebank. |
COLLNUMB | Original identifier assigned by the collector(s) of the sample, normally composed of the name or initials of the collector(s) followed by a number. |
COLLCODE | FAO WIEWS code of the institute that collected the sample. |
COLLNAME | Name of the institute that collected the sample. This descriptor should only be used if COLLCODE cannot be filled because the FAO WIEWS code for this institute is not available. |
COLLINSTADDRESS | Address of the institute that collected the sample. This descriptor should only be used if COLLCODE cannot be filled because the FAO WIEWS code for this institute is not available. |
COLLMISSID | Identifier of the collecting mission as used by the collecting institute. |
GENUS | Genus name for taxon. An initial uppercase letter is required. |
SPECIES | Specific epithet portion of the scientific name in lowercase letters. |
SPAUTHOR | The authority for the species name. |
SUBTAXA | A subtaxon can be used to store any additional taxonomic identifier. |
SUBTAUTHOR | The subtaxon authority at the most detailed taxonomic level. |
CROPNAME | Common name of the crop. |
ACCENAME | Either a registered or other designation given to the material received, other than the donor’s accession number (DONORNUMB) or collecting number (COLLNUMB). An initial uppercase letter is required. |
ACQDATE | The date on which the accession entered the collection. |
ORIGCTY | 3-letter ISO 3166-1 code of the country in which the sample was originally collected (for a landrace, crop wild relative or farmers' variety), bred or selected (for breeding lines, GMOs, segregating populations, hybrids, modern cultivars, etc.). |
COLLSITE | Location information below the country level that describes where the accession was collected, preferably in English. This might include the distance in kilometers and direction from the nearest town, village or map grid reference point. |
DECLATITUDE | Latitude expressed in decimal degrees. Positive values are north of the Equator; negative values are south of the Equator. |
DECLONGITUDE | Longitude expressed in decimal degrees. Positive values are east of the Greenwich Meridian; negative values are west of the Greenwich Meridian. |
COORDUNCERT | Uncertainty associated with the coordinates in meters. Leave the value empty if the uncertainty is unknown. |
COORDDATUM | The geodetic datum or spatial reference system upon which the coordinates given in decimal latitude and longitude are based. |
GEOREFMETH | The georeferencing method used. |
ELEVATION | Elevation of collecting site expressed in meters above sea level. Negative values are allowed. |
COLLDATE | Collecting date of the sample. |
BREDCODE | FAO WIEWS code of the institute that has bred the material. If the holding institute has bred the material, the breeding institute code (BREDCODE) should be the same as the holding institute code (INSTCODE). |
BREDNAME | Name of the institute (or person) that bred the material. This descriptor should only be used if BREDCODE cannot be filled because a FAO WIEWS code is not available or applicable. |
SAMPSTAT | Biological status of the accession. |
ANCEST | Information about pedigree. |
COLLSRC | Collecting/acquisition source. |
DONORCODE | FAO WIEWS code of the donor institute. |
DONORNAME | Name of the donor institute (or person). This descriptor should be used only if DONORCODE cannot be filled because a FAO WIEWS code is not available or applicable. |
DONORNUMB | Identifier assigned to an accession by the donor. Follows the ACCENUMB standard. |
OTHERNUMB | Any other identifiers known to exist in other collections for this accession. |
DUPLSITE | FAO WIEWS code of the institute(s) where a safety duplicate of the accession is maintained. The WIEWS institute code for the Svalbard Global Seed Vault is NOR051. |
DUPLINSTNAME | Name of the institute(s) where a safety duplicate of the accession is maintained. This descriptor should be used only if DUPLSITE cannot be filled because a FAO WIEWS code is not available. |
STORAGE | Type of germplasm storage. If germplasm is maintained under different types of storage, multiple choices are allowed, separated by a semicolon (e.g. 11;13). |
MLSSTAT | The status of an accession with regards to the Multilateral System (MLS) of the International Treaty on Plant Genetic Resources for Food and Agriculture. Leave the value empty if the status is not known. |
REMARKS | The remarks field is used to add notes or to elaborate on descriptors with value 99 or 999 (Other). |
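Some of the rules in the table above lend themselves to automated checks before data is exchanged. The sketch below covers only a handful of descriptors and is not a complete MCPD validator. The function name `check_passport_record`, the dictionary layout, the coordinate bounds (the usual limits of decimal degrees) and the rule that COORDUNCERT cannot be negative are assumptions made for illustration.

```python
def check_passport_record(record):
    """Return a list of problems found in a dict keyed by MCPD field names."""
    problems = []

    genus = record.get("GENUS", "")
    if genus and not genus[0].isupper():
        problems.append("GENUS must start with an uppercase letter")

    species = record.get("SPECIES", "")
    if species and species != species.lower():
        problems.append("SPECIES must be in lowercase letters")

    # (field, lowest value, highest value) in decimal degrees
    bounds = [("DECLATITUDE", -90.0, 90.0), ("DECLONGITUDE", -180.0, 180.0)]
    for field, low, high in bounds:
        value = record.get(field)
        if value in (None, ""):
            continue  # missing coordinates are not treated as an error here
        if not low <= float(value) <= high:
            problems.append(f"{field} is outside {low} to {high}")

    # Assumption: an uncertainty in meters cannot be negative
    uncertainty = record.get("COORDUNCERT")
    if uncertainty not in (None, "") and float(uncertainty) < 0:
        problems.append("COORDUNCERT must not be negative")

    return problems


record = {"GENUS": "manihot", "SPECIES": "esculenta", "DECLATITUDE": 123.4}
for problem in check_passport_record(record):
    print(problem)
```

Returning a list of messages instead of raising on the first error lets a curator see every problem in a record at once.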
A persistent unique identifier (PUID) is assigned to an accession so it can be unambiguously referenced at the global level and the information associated with it harvested through automated means. One PUID should be reported for each accession.
There are various standards for PUIDs, including DOI, UUID and LSID. The Secretariat of the International Treaty on Plant Genetic Resources for Food and Agriculture is facilitating the assignment of DOI to genetic resources at the accession level (http://www.planttreaty.org/doi).
UUID (Universally unique identifier) is an identifier standard used in software. A UUID is simply a 128-bit value (16 bytes). For human-readable display, many systems use a canonical format of hexadecimal text with inserted hyphen characters. For example:
de305d54-75b4-431b-adb2-eb6b9e546014
The intent of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. In this context the word unique should be taken to mean "practically unique" rather than "guaranteed unique".
Different variants and versions of UUID exist. Version 4 (Random UUID) is the most commonly used in software.
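As an illustration, Python's standard `uuid` module can generate a random (version 4) UUID and parse the canonical text form shown above. The variable names are arbitrary, and this is only a sketch of how a documentation system might create such identifiers.

```python
import uuid

# Generate a new random (version 4) UUID for an accession record
accession_uuid = uuid.uuid4()
print(accession_uuid)             # canonical hyphenated hexadecimal text
print(len(accession_uuid.bytes))  # size of the raw value in bytes

# Parse the example from the text and inspect its version
parsed = uuid.UUID("de305d54-75b4-431b-adb2-eb6b9e546014")
print(parsed.version)
```

Once assigned, a UUID should be stored with the accession record and never regenerated, otherwise it cannot serve as a persistent identifier.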
Genebanks not applying a true PUID to their accessions should use, and request recipients to use, the concatenation of INSTCODE, ACCENUMB and GENUS as a globally unique identifier, similar in most respects to a PUID, whenever they exchange information on accessions with third parties (e.g. NOR017:NGB17773:ALLIUM).
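The concatenation described above can be sketched as follows. The colon separator and the uppercase genus follow the example NOR017:NGB17773:ALLIUM; the function names are hypothetical, and the splitting logic assumes that neither INSTCODE nor GENUS contains a colon.

```python
def fallback_identifier(instcode, accenumb, genus):
    """Build INSTCODE:ACCENUMB:GENUS for accessions without a true PUID."""
    # Uppercasing the genus mirrors the example in the text (an assumption)
    return ":".join([instcode, accenumb, genus.upper()])


def split_fallback_identifier(identifier):
    """Split the identifier back into (INSTCODE, ACCENUMB, GENUS)."""
    # Split from both ends so a colon inside ACCENUMB does not break parsing
    instcode, rest = identifier.split(":", 1)
    accenumb, genus = rest.rsplit(":", 1)
    return instcode, accenumb, genus


identifier = fallback_identifier("NOR017", "NGB17773", "Allium")
print(identifier)
print(split_fallback_identifier(identifier))
```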
Genesys will read the CROPNAME as provided and attempt to link the name with an existing crop record in Genesys. Genesys currently supports the following crop names:
apple, banana, barley, beans, breadfruit, cassava, chickpea, coconut, cowpea, eggplant, fababean, fingermillet, grasspea, lentil, lettuce, maize, pearlmillet, pigeonpea, potato, rice, sorghum, sunflower, sweetpotato, taro, tomato, wheat, yam
The up-to-date list of crops and their coded names is available at https://www.genesys-pgr.org/c/.
As more data is uploaded to Genesys we will add aliases to crops, making sure that future uploads properly link the accession with the specified crop.
You are encouraged to use the crop names listed above, but more importantly, let helpdesk@genesys-pgr.org know if your crop is not yet listed.
Values for INSTCODE, COLLCODE, BREDCODE, DONORCODE and DUPLSITE must be provided as FAO WIEWS codes of institutes.
The coding scheme for biological status can be used at two different levels of detail: either as a general code (e.g. 100, 200) or a more specific code (e.g. 110, 120).
Code | SAMPSTAT field |
---|---|
100 | Wild |
110 | Natural |
120 | Semi-natural/wild |
130 | Semi-natural/sown |
200 | Weedy |
300 | Traditional cultivar/landrace |
400 | Breeding/research material |
410 | Breeder’s line |
411 | Synthetic population |
412 | Hybrid |
413 | Founder stock/base population |
414 | Inbred line (parent of hybrid cultivar) |
415 | Segregating population |
416 | Clonal selection |
420 | Genetic stock |
421 | Mutant (e.g. induced/insertion mutant, tilling population) |
422 | Cytogenetic stock (e.g. chromosome addition/substitution, aneuploid, amphiploid) |
423 | Other genetic stock (e.g. mapping population) |
500 | Advanced or improved cultivar (conventional breeding methods) |
600 | GMO (by genetic engineering) |
999 | Other (elaborate in REMARKS field) |
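The two levels of detail can be related to each other in code. The sketch below stores the codes listed above in a dictionary and derives the general code from a specific one. The names `SAMPSTAT` and `general_sampstat` are hypothetical, and rounding down to the nearest hundred is an assumption based on how the codes are numbered.

```python
SAMPSTAT = {
    100: "Wild",
    110: "Natural",
    120: "Semi-natural/wild",
    130: "Semi-natural/sown",
    200: "Weedy",
    300: "Traditional cultivar/landrace",
    400: "Breeding/research material",
    410: "Breeder's line",
    411: "Synthetic population",
    412: "Hybrid",
    413: "Founder stock/base population",
    414: "Inbred line (parent of hybrid cultivar)",
    415: "Segregating population",
    416: "Clonal selection",
    420: "Genetic stock",
    421: "Mutant",
    422: "Cytogenetic stock",
    423: "Other genetic stock",
    500: "Advanced or improved cultivar",
    600: "GMO",
    999: "Other",
}


def general_sampstat(code):
    """Return the general code for a specific one, e.g. 411 gives 400."""
    if code not in SAMPSTAT:
        raise ValueError(f"Unknown SAMPSTAT code: {code}")
    if code == 999:
        return code  # 'Other' has no general parent code
    return code // 100 * 100


for code in (110, 411, 423, 999):
    general = general_sampstat(code)
    print(code, "->", general, SAMPSTAT[general])
```

Grouping by the general code is useful when summarizing a collection, for example when counting wild material regardless of how precisely each accession was coded.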
If germplasm is maintained under different types of storage, multiple values are allowed. For example, when an accession is maintained in active and base collections, STORAGE corresponds to both 11 and 13 and can be encoded as 11;13.
Code | STORAGE field |
---|---|
10 | Seed collection |
11 | Short term |
12 | Medium term |
13 | Long term |
20 | Field collection |
30 | In vitro collection |
40 | Cryopreserved collection |
50 | DNA collection |
99 | Other (elaborate in REMARKS field) |
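Reading a multi-valued STORAGE entry such as 11;13 might be handled as in the sketch below. The names `STORAGE` and `parse_storage` are hypothetical, and rejecting unknown codes with an error is a design choice of this example rather than a rule of the standard.

```python
STORAGE = {
    10: "Seed collection",
    11: "Short term",
    12: "Medium term",
    13: "Long term",
    20: "Field collection",
    30: "In vitro collection",
    40: "Cryopreserved collection",
    50: "DNA collection",
    99: "Other",
}


def parse_storage(value):
    """Split a semicolon-separated STORAGE value into a list of known codes."""
    codes = [int(part) for part in value.split(";") if part.strip()]
    unknown = [code for code in codes if code not in STORAGE]
    if unknown:
        raise ValueError(f"Unknown STORAGE codes: {unknown}")
    return codes


for code in parse_storage("11;13"):
    print(code, STORAGE[code])
```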
Field name | Description |
---|---|
ACCEURL | Accession URL. |
 | Indicates current availability of accession for distribution. |
HISTORIC | Indicates whether the record represents an accession no longer actively maintained by the genebank. |
UUID | Universally unique identifier of the accession record. |
ECPGR originally extended the MCPD list with the Accession URL field ACCEURL. The field should contain a direct link to the provider’s online portal where additional data about the accession may be available.
ACCEURL: http://my.iita.org/accession2/accession/TDr-3616
Genesys allows end-users to request material from holding institutes. Accession records marked as not available in Genesys will be excluded from users’ requests.
In addition to setting the availability flag, genebanks must opt in to allow end-users to request material through Genesys.
Accessions are on occasion removed from a collection. This is especially true for pre-bred material and genetic stocks that are maintained by the genebank for a limited period of time. The records about such material must not be deleted from databases, as they can potentially be tracked to other collections where the material is still actively maintained.
The holding genebank may want to mark such records by setting the value of the HISTORIC field to true. Values null (not specified) and false indicate that the record represents an actively managed accession.
Historic accessions cannot be requested through Genesys.
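The combined effect of the historic and availability flags on requests can be sketched as below. This is not Genesys code: the function name `can_be_requested` and the dictionary key `AVAILABLE` are hypothetical, and the genebank-level opt-in mentioned earlier is not modelled.

```python
def can_be_requested(record, availability_key="AVAILABLE"):
    """Decide whether an accession record could be included in a request.

    The key name for the availability flag is an assumption.
    """
    # Only an explicit true marks a record as historic; None and False do not
    if record.get("HISTORIC") is True:
        return False
    # Records explicitly marked as not available are excluded
    if record.get(availability_key) is False:
        return False
    return True


print(can_be_requested({"HISTORIC": None, "AVAILABLE": True}))
print(can_be_requested({"HISTORIC": True, "AVAILABLE": True}))
print(can_be_requested({"HISTORIC": False, "AVAILABLE": False}))
```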
The ISO 3166 standard defines codes for the representation of names of countries and their subdivisions. ISO 3166-1 alpha-3 codes are three-letter country codes. Genesys uses http://download.geonames.org/export/dump/countryInfo.txt as the source of ISO 3166 country codes.
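A lookup of alpha-3 codes could be built from that file roughly as follows. The sketch assumes the file is tab-separated, that lines starting with `#` are comments, and that the second and fifth columns hold the alpha-3 code and the country name; check these assumptions against the downloaded file. The function name and the abbreviated sample rows are for illustration only.

```python
def read_alpha3_codes(text):
    """Map ISO 3166-1 alpha-3 codes to country names.

    Assumes tab-separated columns: ISO, ISO3, ISO-Numeric, fips, Country, ...
    """
    codes = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        columns = line.split("\t")
        if len(columns) > 4:
            codes[columns[1]] = columns[4]
    return codes


# Abbreviated sample in the assumed layout (the real file has more columns)
sample = (
    "#ISO\tISO3\tISO-Numeric\tfips\tCountry\n"
    "CO\tCOL\t170\tCO\tColombia\n"
    "NG\tNGA\t566\tNI\tNigeria\n"
)
countries = read_alpha3_codes(sample)
print(countries.get("COL"))
print("XYZ" in countries)
```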
Special thanks go to Michael Mackay, Angela Marcela Hernandez and Edwin Rojas for their input, feedback and support.
You can contact the author, Matija Obreza, at matija.obreza@croptrust.org.