Accession passport data basics

Matija Obreza

version 2.2, December 2015
Documentation commit 9bd43760111b7f5ca0c6ca180b40094d33a4db7c

1. Introduction

This manual contains basic information on commonly used standards for accession documentation and formats for data exchange.

Genesys PGR (Plant Genetic Resources) is a free online global portal that allows the exploration of the world’s crop diversity through a single website. The data published on Genesys follows the Multi-crop Passport Descriptors standard.

The manual introduces

2. Acknowledgements

Special thanks go to Michael Mackay, Angela Marcela Hernandez and Edwin Rojas for their input, feedback and support.

You can contact Matija Obreza at

3. Accession documentation in genebanks

Collections of PGRFA material in genebanks document at least the following for each accession

A single accession is usually maintained as several individual inventories or lots, where each inventory follows different management policies and is maintained in different conditions (e.g. cryo and in vitro, or base and active collection).

Inventory management is a topic of genebank collection management and is not further described here.

3.1. Accession number

The accession number is the unique identifier assigned to the material as it enters the collection. This identifier generally has three components:

Prefix + Sequence number + Suffix

The prefix is commonly used to differentiate between different crop collections maintained by the genebank.

Some prefixes used by the IITA genebank
  • TMe Cassava Manihot esculenta collection

  • TVSu Bambara groundnut Vigna subterranea collection

  • TZm Maize Zea mays collection

The sequence number is assigned manually or by a computer system to ensure there are no duplicates. Some institutes prefer to zero-pad the number (e.g. 00000102).

The suffix allows differentiating samples of the same original material. A suffix might be used after making a selection from the original accession (e.g. a single seed descent) to be maintained as a separate sample. The exact meaning of the suffix is different for every institute.
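The three components can be combined in code. The following is a minimal sketch, not the scheme of any particular genebank: the function name, the hyphen separator and the `pad` parameter are illustrative assumptions, since each institute defines its own separator, padding and suffix rules.

```python
def format_accession_number(prefix, sequence, suffix="", pad=0, sep="-"):
    """Join prefix, sequence number and optional suffix into one identifier.

    `sep` and `pad` are assumptions made for this example only.
    """
    if sequence < 0:
        raise ValueError("sequence number must not be negative")
    number = str(sequence).zfill(pad)  # pad=0 leaves the number unpadded
    # Skip empty components so no stray separators are produced.
    parts = [part for part in (prefix, number, suffix) if part]
    return sep.join(parts)


print(format_accession_number("TMe", 102))               # TMe-102
print(format_accession_number("TZm", 102, pad=8))        # TZm-00000102
print(format_accession_number("TVSu", 15, suffix="A"))   # TVSu-15-A
```

Keeping the sequence number as an integer and padding only on output avoids treating 102 and 00000102 as different accessions.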

Table 1. Example accession numbers
Prefix Sequence number Suffix Accession number







3.2. Other accession identifiers

Material enters the collection through collecting missions, from breeding programs, or by acquisition from other institutes. In each case, the material will already have some identifier assigned by the collector, breeder or other institute.

Accession name is the vernacular name of the material and is commonly captured by the collector or assigned by the breeder.

3.2.1. Collected material

Genebank accessions obtained through collecting missions should retain data about the collecting site, the collecting dates and the collector.

3.2.2. Breeders material

Lines developed by breeding programs of the institute may be included in the collection. Information provided by the breeders should include the pedigree or ancestral information (selection history) of the material, along with names and identifiers used by the breeding program and the codes and names of institutes that developed the material.

3.2.3. Acquisitions

Material coming from other institutes and genebanks must be accompanied by accession passport data as documented in the source genebank.

Country of origin is the country where the material was collected or bred, not the country of the source genebank.

Accession documentation should capture any identifiers provided by the source institute. This data allows for validation and curation of passport data between the genebanks and allows researchers to obtain material from either collection.

3.3. Taxonomy

Accession genus, species, species author, subtaxon and subtaxon authority are usually known, but are subject to change after expert identification or a change in the taxonomic system.

3.4. Storage and maintenance

Ex situ genebanks maintain PGR material as seed, in the field, in vitro, cryo or in DNA collections. Inventories (lots) of one accession may be managed by different methods (e.g. seed and cryo). See Storage in the MCPD standard for how to capture multiple types of storage.

4. FAO WIEWS

The World Information and Early Warning System (WIEWS) on Plant Genetic Resources for Food and Agriculture (PGRFA) has been established by FAO as a world-wide dynamic mechanism to foster information exchange among Member Countries and as an instrument for the periodic assessment of the State of the World’s PGRFA.

The FAO WIEWS database contains basic information about institutes working with PGRFA. The data includes full names, acronyms, website links and contact information.

Genesys regularly updates the list of institutes from the FAO WIEWS database and makes them accessible at

This data cannot be directly managed through Genesys; changes must be applied to the WIEWS database.

4.1. WIEWS Institute Codes

The FAO WIEWS code of the institute consists of the 3-letter ISO 3166-1 alpha-3 country code of the country where the institute is located plus a number (e.g. COL001, USA1004).
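The shape of a WIEWS code can be checked before data is exchanged. The sketch below is only a format check under the rule stated above (three upper-case letters followed by digits); the function name is hypothetical, and the check does not confirm that the country code exists or that the institute is registered in WIEWS.

```python
import re

# Three upper-case letters (country code) followed by one or more digits.
WIEWS_SHAPE = re.compile(r"[A-Z]{3}[0-9]+")


def split_wiews_code(code):
    """Return (country_code, number) or None when the shape does not match."""
    code = code.strip()
    if WIEWS_SHAPE.fullmatch(code) is None:
        return None
    return code[:3], code[3:]


print(split_wiews_code("COL001"))   # ('COL', '001')
print(split_wiews_code("USA1004"))  # ('USA', '1004')
print(split_wiews_code("col001"))   # None
```

The number is returned as text so that leading zeros, which are part of the code, are preserved.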

The Multi-Crop Passport Descriptors standard relies on WIEWS codes.

The automated import of institute data allows Genesys to present individual pages for genebanks registered in the FAO WIEWS database.

Table 2. Direct access to genebank pages using WIEWS code
WIEWS Code Genesys URL



4.2. Obtaining a WIEWS code

A new WIEWS INSTCODE can be generated online by contacting your country's National Focal Point or

4.3. Inactive WIEWS codes

The WIEWS code of an institute may change. In that case, the record is marked as inactive and it will refer to the newly assigned code. Genesys will render a message that the institute record is archived and provide a link to the new code:

5. Multi-Crop Passport Descriptors

The Multi-crop Passport Descriptors (MCPD V.2.1) is an update to MCPD V.2 which was released in 2012. The MCPD V.2 was a revision of the first FAO/IPGRI publication released in 2001, expanded to accommodate emerging needs, such as the broader use of GPS tools, or the implementation of the International Treaty on Plant Genetic Resources for Food and Agriculture Multilateral System for access and benefit sharing.

This MCPD V.2.1 list is an expansion of the first version of the MCPD; the descriptors and allowed values of the first version form a subset of those in this revision. The 2001 list, developed jointly by Bioversity International (formerly IPGRI) and FAO, has been widely used and is considered the international standard to facilitate germplasm passport information exchange. These descriptors aim to be compatible with Bioversity’s crop descriptor lists, with the descriptors used for the FAO World Information and Early Warning System (WIEWS) on plant genetic resources (PGR), and with the Genesys PGR global portal.

For each multi-crop passport descriptor, a brief explanation of content, coding scheme and, in parentheses, suggested fieldname are provided to assist in the computerized exchange of this type of data.

The authors of the MCPD recognize that networks or groups of users may further expand the MCPD list to meet their specific needs. As long as these additions allow for an easy conversion to the format proposed in MCPD V.2, basic passport data can be exchanged worldwide in a consistent manner.

5.1. MCPD Descriptors

Table 3. MCPD descriptors
Field name
Description

PUID
Any persistent, unique identifier assigned to the accession so it can be unambiguously referenced at the global level and the information associated with it harvested through automated means. Report one PUID for each accession.

INSTCODE
FAO WIEWS code of the institute where the accession is maintained.

ACCENUMB
Unique identifier of the accession within a genebank.

COLLNUMB
Original identifier assigned by the collector(s) of the sample, normally composed of the name or initials of the collector(s) followed by a number (e.g. FM9909). This identifier is essential for identifying duplicates held in different collections.

COLLCODE
FAO WIEWS code of the institute collecting the sample.

COLLNAME
Name of the institute collecting the sample. This descriptor should only be used if COLLCODE cannot be filled because the FAO WIEWS code for this institute is not available.

COLLINSTADDRESS
Address of the institute collecting the sample. This descriptor should only be used if COLLCODE cannot be filled because the FAO WIEWS code for this institute is not available.

COLLMISSID
Identifier of the collecting mission used by the Collecting Institute (e.g. CIATFOR-052, CN426).

GENUS
Genus name for taxon. Initial upper case letter required.

SPECIES
Specific epithet portion of the scientific name in lower case letters.
The abbreviation sp. or spp. is allowed when exact species name is unknown.

SPAUTHOR
Provide the authority for the species name.

SUBTAXA
Subtaxon can be used to store any additional taxonomic identifier. The following abbreviations are allowed: subsp. (for subspecies); convar. (for convariety); var. (for variety); f. (for form); Group (for cultivar group).

SUBTAUTHOR
Provide the subtaxon authority at the most detailed taxonomic level.

CROPNAME
Common name of the crop. Example: malting barley, macadamia, maize.

ACCENAME
Either a registered or other designation given to the material received, other than the donor’s accession number (DONORNUMB) or collecting number (COLLNUMB). First letter upper case.

ACQDATE
Date on which the accession entered the collection where YYYY is the year, MM is the month and DD is the day. Missing data (MM or DD) should be indicated with hyphens or 00 [double zero].

ORIGCTY
3-letter ISO 3166-1 code of the country in which the sample was originally collected (e.g. landrace, crop wild relative, farmers' variety), bred or selected (breeding lines, GMOs, segregating populations, hybrids, modern cultivars, etc.).

COLLSITE
Location information below the country level that describes where the accession was collected, preferably in English. This might include the distance in kilometers and direction from the nearest town, village or map grid reference point (e.g. 7 km south of Curitiba in the state of Parana).

DECLATITUDE
Latitude expressed in decimal degrees. Positive values are North of the Equator; negative values are South of the Equator (e.g. -44.6975).

DECLONGITUDE
Longitude expressed in decimal degrees. Positive values are East of the Greenwich Meridian; negative values are West of the Greenwich Meridian (e.g. +120.9123).

COORDUNCERT
Uncertainty associated with the coordinates in meters. Leave the value empty if the uncertainty is unknown.

COORDDATUM
The geodetic datum or spatial reference system upon which the coordinates given in decimal latitude and longitude are based (e.g. WGS84, ETRS89, NAD83). The GPS uses the WGS84 datum.

GEOREFMETH
The georeferencing method used (GPS, determined from map, gazetteer, or estimated using software). Leave the value empty if georeferencing method is not known.

ELEVATION
Elevation of collecting site expressed in meters above sea level. Negative values are not allowed.

COLLDATE
Collecting date of the sample, where YYYY is the year, MM is the month and DD is the day. Missing data (MM or DD) should be indicated with hyphens or 00 [double zero].

BREDCODE
FAO WIEWS code of the institute that has bred the material. If the holding institute has bred the material, the breeding institute code (BREDCODE) should be the same as the holding institute code (INSTCODE). Follows INSTCODE standard.

BREDNAME
Name of the institute (or person) that bred the material. This descriptor should only be used if BREDCODE cannot be filled because the FAO WIEWS code for this institute is not available.

SAMPSTAT
Biological status of the accession.

ANCEST
Information about either pedigree or other description of ancestral information (e.g. parent variety in case of mutant or selection). For example, a pedigree "Hanna/7*Atlas//Turk/8*Atlas" or a description such as "mutation found in Hanna", "selection from Irene" or "cross involving amongst others Hanna and Irene".

COLLSRC
Collecting/acquisition source

DONORCODE
FAO WIEWS code of the donor institute. Follows INSTCODE standard.

DONORNAME
Name of the donor institute (or person). This descriptor should be used only if DONORCODE cannot be filled because the FAO WIEWS code for this institute is not available.

DONORNUMB
Identifier assigned to an accession by the donor. Follows ACCENUMB standard.

OTHERNUMB
Any other identifiers known to exist in other collections for this accession. Use the following format: INSTCODE:ACCENUMB;INSTCODE:identifier;… INSTCODE and identifier are separated by a colon : without space. Pairs of INSTCODE and identifier are separated by a semicolon ; without space. When the institute is not known, the identifier should be preceded by a colon.

DUPLSITE
FAO WIEWS code of the institute(s) where a safety duplicate of the accession is maintained.
The WIEWS institute code for Svalbard Global Seed Vault is NOR051.

DUPLINSTNAME
Name of the institute where a safety duplicate of the accession is maintained.

STORAGE
Type of germplasm storage. If germplasm is maintained under different types of storage, multiple choices are allowed, separated by a semicolon (e.g. 20;30).

MLSSTAT
The status of an accession with regards to the Multilateral System (MLS) of the International Treaty on Plant Genetic Resources for Food and Agriculture. Leave the value empty if the status is not known.

REMARKS
The remarks field is used to add notes or to elaborate on descriptors with value 99 or 999 (= Other). Prefix remarks with the field name they refer to and a colon (:) without space (e.g. COLLSRC:riverside). Distinct remarks referring to different fields are separated by semicolon without space.
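Two of the formats described in the table lend themselves to a short parsing example: the dates with optional month and day, and the list of other identifiers (OTHERNUMB in MCPD). The sketch below is illustrative only. Both function names are hypothetical, the date is assumed to be written as eight characters in the order YYYYMMDD, and no calendar validation is attempted.

```python
def parse_mcpd_date(value):
    """Split a YYYYMMDD string into (year, month, day).

    Month and day are returned as None when written as "--" or "00".
    """
    if len(value) != 8 or not value[:4].isdigit():
        raise ValueError("expected 8 characters starting with a 4-digit year")

    def optional_part(text):
        if text in ("--", "00"):
            return None  # missing month or day
        if not text.isdigit():
            raise ValueError("month and day must be digits, '--' or '00'")
        return int(text)

    return int(value[:4]), optional_part(value[4:6]), optional_part(value[6:8])


def parse_othernumb(value):
    """Return a list of (instcode, identifier); instcode is None if unknown."""
    pairs = []
    for item in value.split(";"):
        if not item:
            continue  # tolerate an empty value or a trailing semicolon
        instcode, colon, identifier = item.partition(":")
        if not colon:
            # Assumption: an item without any colon is a bare identifier.
            instcode, identifier = "", item
        pairs.append((instcode or None, identifier))
    return pairs


print(parse_mcpd_date("201512--"))   # (2015, 12, None)
print(parse_mcpd_date("20150000"))   # (2015, None, None)
# The second identifier below is a made-up pairing used for illustration.
print(parse_othernumb("NOR017:NGB17773;:FM9909"))
# [('NOR017', 'NGB17773'), (None, 'FM9909')]
```

Returning None for missing parts keeps "unknown" distinct from a real value, which matters when dates are later compared or sorted.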

5.1.1. Persistent unique identifier

Any persistent, unique identifier assigned to the accession so it can be unambiguously referenced at the global level and the information associated with it harvested through automated means. Report one PUID for each accession.

There are various "types" of PUIDs: DOI, UUID, LSID, etc.

The Secretariat of the ITPGRFA is facilitating the assignment of DOIs to PGRFA at the accession level.

UUID (Universally unique identifier) is an identifier standard used in software. A UUID is simply a 128-bit value (16 bytes).

For human-readable display, many systems use a canonical format using hexadecimal text with inserted hyphen characters. For example:


The intent of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. In this context the word unique should be taken to mean "practically unique" rather than "guaranteed unique".

Different variants and versions of UUID exist. Version 4 (Random UUID) is most commonly used in software.
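As an illustration, a random (version 4) UUID can be generated with the Python standard library. This is only a sketch of how a UUID looks and behaves; it does not describe how Genesys or any genebank system assigns identifiers, and the variable name is arbitrary.

```python
import uuid

record_id = uuid.uuid4()      # random (version 4) UUID

print(record_id)              # canonical text: 8-4-4-4-12 hexadecimal digits
print(record_id.version)      # 4
print(len(record_id.bytes))   # 16 (bytes), i.e. 128 bits

# Parsing the canonical text form gives back the same 128-bit value.
assert uuid.UUID(str(record_id)) == record_id
```

The printed value differs on every run, because the identifier is drawn at random rather than issued by a central registry.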

Genebanks not applying a true PUID to their accessions should use, and request recipients to use, the concatenation of INSTCODE, ACCENUMB, and GENUS as a globally unique identifier similar in most respects to the PUID whenever they exchange information on accessions with third parties (e.g. NOR017:NGB17773:ALLIUM).
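The concatenation can be sketched as follows. The function name is hypothetical, and writing the genus in upper case is an assumption drawn from the example above (NOR017:NGB17773:ALLIUM), not a documented rule.

```python
def fallback_identifier(instcode, accenumb, genus):
    """Build INSTCODE:ACCENUMB:GENUS for accessions without a true PUID."""
    parts = [instcode.strip(), accenumb.strip(), genus.strip().upper()]
    if not all(parts):
        raise ValueError("INSTCODE, ACCENUMB and GENUS are all required")
    return ":".join(parts)


print(fallback_identifier("NOR017", "NGB17773", "Allium"))
# NOR017:NGB17773:ALLIUM
```

All three parts are required: the same accession number may be reused across institutes, and sometimes across crop collections within one institute.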

5.1.2. Crop name

Genesys will read the CROPNAME as provided and attempt to link the name with an existing crop record in Genesys. Genesys currently supports the following crop names:

apple, banana, barley, beans, breadfruit, cassava, chickpea, coconut, cowpea, eggplant, fababean,
fingermillet, grasspea, lentil, lettuce, maize, pearlmillet, pigeonpea, potato, rice, sorghum,
sunflower, sweetpotato, taro, tomato, wheat, yam

The up-to-date list of crops and their coded names is available at

As more data is uploaded to Genesys we will add aliases to crops, making sure that future uploads properly link the accession with the specified crop.

You are encouraged to use the crop names listed above, but more importantly, let us know if your crop is not yet listed.

5.1.3. Institute codes in MCPD

Values for INSTCODE, COLLCODE, BREDCODE, DONORCODE and DUPLSITE must be provided as FAO WIEWS codes of institutes.

5.1.4. Biological status of accession

The coding scheme proposed can be used at two different levels of detail: either by using the general codes such as 100, 200, 300, 400, or by using the more specific codes such as 110, 120, etc.

Allowed values for SAMPSTAT field
  • 100 Wild

    • 110 Natural

    • 120 Semi-natural/wild

    • 130 Semi-natural/sown

  • 200 Weedy

  • 300 Traditional cultivar/landrace

  • 400 Breeding/research material

    • 410 Breeder’s line

    • 411 Synthetic population

    • 412 Hybrid

    • 413 Founder stock/base population

    • 414 Inbred line (parent of hybrid cultivar)

    • 415 Segregating population

    • 416 Clonal selection

    • 420 Genetic stock

    • 421 Mutant (e.g. induced/insertion mutants, tilling populations)

    • 422 Cytogenetic stocks (e.g. chromosome addition/substitution, aneuploids, amphiploids)

    • 423 Other genetic stocks (e.g. mapping populations)

  • 500 Advanced or improved cultivar (conventional breeding methods)

  • 600 GMO (by genetic engineering)

  • 999 Other (Elaborate in REMARKS field)
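The two levels of detail can be related in code. The following sketch copies the allowed values listed above into a lookup table and maps a specific code to its general code; the names `SAMPSTAT` and `general_sampstat` are illustrative, and treating 999 as its own general code is an assumption of this example.

```python
SAMPSTAT = {
    100: "Wild", 110: "Natural", 120: "Semi-natural/wild",
    130: "Semi-natural/sown",
    200: "Weedy",
    300: "Traditional cultivar/landrace",
    400: "Breeding/research material", 410: "Breeder's line",
    411: "Synthetic population", 412: "Hybrid",
    413: "Founder stock/base population", 414: "Inbred line",
    415: "Segregating population", 416: "Clonal selection",
    420: "Genetic stock", 421: "Mutant", 422: "Cytogenetic stocks",
    423: "Other genetic stocks",
    500: "Advanced or improved cultivar",
    600: "GMO",
    999: "Other",
}


def general_sampstat(code):
    """Map a SAMPSTAT code to its general (hundreds) level."""
    if code not in SAMPSTAT:
        raise ValueError(f"not an allowed SAMPSTAT value: {code}")
    if code == 999:
        return 999  # "Other" has no general parent code
    return code // 100 * 100


print(general_sampstat(414), SAMPSTAT[general_sampstat(414)])
# 400 Breeding/research material
```

Grouping by the general code lets datasets that use different levels of detail be summarized together.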

5.1.5. Accession storage

If germplasm is maintained under different types of storage, multiple values are allowed. When an accession is maintained in both active and base collections, STORAGE corresponds to 11 and 13 and can be encoded as 11;13.

Allowed values for STORAGE field
  • 10 Seed collection

    • 11 Short term

    • 12 Medium term

    • 13 Long term

  • 20 Field collection

  • 30 In vitro collection

  • 40 Cryopreserved collection

  • 50 DNA collection

  • 99 Other (elaborate in REMARKS field)
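A multi-valued STORAGE entry can be split and checked against the allowed values. This is a minimal sketch; the names `STORAGE` and `parse_storage` are hypothetical, and rejecting unknown codes with an error is a choice made for this example.

```python
STORAGE = {
    10: "Seed collection",
    11: "Short term",
    12: "Medium term",
    13: "Long term",
    20: "Field collection",
    30: "In vitro collection",
    40: "Cryopreserved collection",
    50: "DNA collection",
    99: "Other",
}


def parse_storage(value):
    """Split a semicolon-separated STORAGE value into a list of codes."""
    codes = []
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # empty value or trailing semicolon
        code = int(item)  # raises ValueError for non-numeric text
        if code not in STORAGE:
            raise ValueError(f"not an allowed STORAGE value: {code}")
        codes.append(code)
    return codes


print(parse_storage("11;13"))                         # [11, 13]
print([STORAGE[c] for c in parse_storage("20;30")])
# ['Field collection', 'In vitro collection']
```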

5.2. Genesys extensions to MCPD

Table 4. MCPD extensions
Field name Description


Accession URL


Indicates current availability of accession for distribution


Indicates whether the record represents an accession no longer actively maintained by the genebank


Universally unique identifier of the accession record

5.2.1. Accession URL

ECPGR originally extended the MCPD list with the Accession URL field (ACCEURL). The field should contain the direct link to the provider’s on-line portal where additional data about the accession may be available.

Passport data of IITA’s TDr-3616 yam accession

5.2.2. Accession availability

Genesys allows end-users to request material from holding institutes. Accession records marked as not available in Genesys will be excluded from the user’s request.

In addition to the availability flag, genebanks must opt in to allow end-users to request material through Genesys.

5.2.3. Historic records

Accessions are on occasion removed from the collection. This is especially true for pre-bred material and genetic stocks that are maintained by the genebank for a limited period of time. The records about such material must not be deleted from the databases as they can potentially be tracked to other collections where the material is still actively maintained.

The holding genebank may want to mark these records by setting the value of HISTORIC field to true.

Values null (not specified) and false indicate that the record represents an actively managed accession.

Historic accessions cannot be requested through Genesys.

6. Other relevant standards

6.1. ISO-3166 Country codes

The ISO-3166 standard defines codes for the representation of names of countries and their subdivisions. ISO-3166-1 alpha-3 codes are three-letter country codes. The Wikipedia page contains the listing of valid country codes. Genesys uses as the source of ISO-3166 country codes.

6.2. UN M.49

UN defines standard country or area codes and geographical regions for statistical use: