
An Entire Botanical Garden, 760 plant specimens, of Genomes

Researchers in China provide genome sequence data and species identification for 760 plant specimens from Ruili Botanical Garden, South-West China.

Researchers from the China National GeneBank, BGI, and the Forestry Bureau of Ruili, China have sampled and sequenced 761 samples, representing 689 vascular plant species from 137 families and 49 orders. The plant samples are all from in and around the 500-hectare Ruili Botanical Garden, a subtropical part of China bordering Myanmar. Being in a biologically rich part of China, the garden is committed to protecting endangered and Chinese-endemic plants, including the preservation and archiving of these germplasm resources to assist with their long-term conservation. This project is the world's first scientific and systematic attempt to digitize a whole botanical garden based on genomic as well as voucher specimen information.

On the scientific potential of this resource, Xun Xu, BGI’s CEO and an author on the paper, says: “Current understanding of the evolution of plants and their diversity in a phylogenomic context is limited because of the lack of genome-scale information across phylogenetically diverse species. This innovative project integrates a new way of thinking about the digitization of all the plant species to augment evolutionary and ecological research in botanical gardens.”

In total, the researchers produced 54 terabytes of sequencing data, with an average sequencing depth of 60X per species. In addition to the basic challenge of carrying out DNA sequencing on this number of species, another major task was scaling up the species identification, digitizing images of the specimens, and building a new herbarium at the China National GeneBank (CNGB) in Shenzhen to store them. So far, of the 761 specimens, sequence and chloroplast data have enabled the identification of 257 plants at the species level and 504 at the family level. Deep learning has also been successfully applied to 181 species to enable them to be identified to the species level.

Author Ting Yang says that this was “the largest amount of data I have ever processed. During the data analyses, I think the biggest challenges was sequence checking and results examination.” This required researchers to individually check the sequencing data for each of the 761 samples and compare the chloroplast gene sequences with herbarium specimens for species identification.

Collecting all the samples was a further difficulty that had to be overcome before any sequencing could begin. Author Jinpu Wei states: “We cooperated with experts from the Ruili Forestry Bureau to collect plant materials distributed in the area of Ruili for the establishment of a digital botanical garden. After 45-days of tiring effort, we collected 1,093 plant materials. Although it was challenging for us to transport the materials properly, we finally managed to ensure the high quality of these plant materials for future research.”

Corresponding author, Xin Liu, adds that the project “was a baseline project to fine tune and standardize the sampling, methodologies, and the data accumulation and analyses techniques for large-scale genome projects like the 10KP (10 thousand Plant Genome Project). From this project, we have gained considerable and useful experience for subsequent sample collection, sequencing, and assembly. At the same time, the data produced from this study can be effectively used in subsequent genome projects.”

Despite having constructed only one sequencing library for each species, the authors were able to assemble preliminary genomes for 17 of them, reflecting the quality and reuse potential of the DNA.
Researchers at the Chinese University of Hong Kong have already independently assembled the genomes of species of particular interest to them. The potential for the wider research community to study their species of interest, improve other genomes, develop tools and methods, and provide education opportunities for new generations of scientists is enormous.

Lead author Huan Liu added that “Genomic characterization will provide a large amount of basic data for plant genome assembly, which will be an excellent start for the 10KP project. At the same time, it lays a good foundation for the future research on the correlation mechanism from macroscopic ecology and biodiversity to microscopic molecular level.”

To promote more extensive data sharing than just making sequence data available, the researchers are also making the digitized images available and providing access to the herbarium. The Herbarium (HCNGB) serves as a living plant database that records the position of species grown in the Ruili Botanical Garden and monitors the status of each species.

All the digital data generated here (images, raw sequencing data, assembled chloroplast genomes, and preliminary nuclear genome assemblies) are available via the NCBI SRA (accession number PRJNA43840), the GigaScience GigaDB database, and the China National GeneBank CNSA. Additionally, to enable the data to be searched and the genomes and species identifications to be updated, the metadata is indexed and linked via DataCite and GigaDB. All resources are released without restriction under a CC0 waiver.

Author Dr Sunil Kumar Sahu highlighted that this is the most important legacy of the project: “This dataset is of great value to plant researchers, and more importantly, can serve as a reference for future planetary-scale genome sequencing projects including the Earth BioGenome Project (EBP) and 10 thousand Plant Genome Project (10KP).”

Read the paper: GigaScience

Image credit: GigaScience

