The Complete Chloroplast Genome of Coptis teeta (Ranunculaceae), An Endangered Plant Species Endemic to the Eastern Himalaya

Ya-Fang Gao1, Xiao-Li Liu1, Guo-Dong Li1*, Zi-Gang Qian1*, Yong-Hong Zhang2 and Ying-Ying Liu1,3

1Faculty of Traditional Chinese Pharmacy, Yunnan University of Traditional Chinese Medicine, Kunming, P.R. China

2School of Life Sciences, Yunnan Normal University, Kunming, P.R. China

3Yunnan Institute for Food and Drug, Kunming, P.R. China

*Corresponding Author:
Guo-Dong Li and Zi-Gang Qian
Faculty of Traditional Chinese Pharmacy
Yunnan University of Traditional Chinese Medicine
Kunming-650500, P.R. China.
E-mail: [email protected] ; [email protected]

Received Date: August 23, 2018; Accepted Date: September 06, 2018; Published Date: September 10, 2018

Citation: Gao YF, Liu XL, Li GD, Qian ZG, Zhang YH, et al. (2018) The Complete Chloroplast Genome of Coptis teeta (Ranunculaceae), An Endangered Plant Species Endemic to the Eastern Himalaya. Biochem Mol Biol J Vol. 4: No.3:19. DOI: 10.21767/2471-8084.100068

 
Abstract

Coptis teeta is an endangered medicinal plant endemic to the Eastern Himalaya. It is categorized as Endangered (EN) by the International Union for Conservation of Nature (IUCN). In the present study, the whole chloroplast genome of C. teeta was sequenced using next-generation sequencing (NGS). The circular chloroplast genome is 154,280 bp in size and exhibits the typical quadripartite structure, comprising two inverted repeat (IR) regions of 24,583 bp each, one large single-copy (LSC) region of 87,519 bp and one small single-copy (SSC) region of 17,595 bp. The genome contains 125 genes, including 81 protein-coding genes (PCGs), 36 tRNA genes and 8 rRNA genes. The total GC content of the C. teeta chloroplast genome is 38.3%, and the GC content of the IR regions (43.3%) is higher than that of the LSC (36.7%) and SSC (32.2%) regions. Forty-two forward and twenty-three reverse repeats were detected in the cp genome of C. teeta. The genome is rich in SSRs, with 62 SSRs identified in total. The phylogenetic tree showed that species of the Ranunculaceae formed a monophyletic clade and that the intra-family topology was consistent with previous studies. The results strongly supported C. teeta and its congener C. chinensis as sister groups, with a 100% bootstrap value.

Keywords

Coptis teeta Wallich; Chloroplast genome; Endangered species; IUCN; Phylogenetic

Introduction

Coptis teeta Wallich, a perennial herb of the Ranunculaceae, is endemic to the Eastern Himalaya and has a narrow distribution range. It is a shade-tolerant species, mainly distributed in moist, temperate, evergreen broad-leaved forests in northwest Yunnan, China, and northeast India. It occupies highly specialized niches in temperate oak-rhododendron forests and is restricted to elevations between 2350 and 3100 m [1,2]. The rhizome of this species, known as Yunnan goldthread (Yunlian in Chinese), has been an important herb in Chinese herbology since the period of Shen-Nong (3000 B.C.) [1]. It has notable pharmacological activity and has been used to treat various conditions such as diarrhea, disorders of glucose metabolism, hypertension, and cardiovascular and cerebrovascular diseases [3]. A previous study revealed that the species has highly specific microsite requirements that cannot be met in other habitats. Owing to over-exploitation, several anthropogenic factors and environmental disruption, the wild population of C. teeta has decreased rapidly in recent years [4]. C. teeta is listed in the IUCN Red List of Threatened Species (http://www.iucnredlist.org/) as an endangered species with status “A2cd”. It is also included in Category II of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES) [5]. It is therefore necessary to protect this endangered plant, both for its high economic and ecological value and for the conservation of biodiversity.

To date, few studies of this species have been performed, owing to the lack of genomic data for C. teeta. Previous studies have mainly focused on the phylogenetic analysis and biogeographic pattern of Coptis, using three markers (the plastid regions psbA-trnH and trnL-trnF and the nuclear ITS) and six markers (five plastid and one nuclear), respectively [6,7]. In this study, as part of the genome sequencing project of C. teeta, we assembled and annotated its complete plastid genome and describe its characteristics.

Materials and Methods

Plant material and DNA extraction

Fresh leaves of C. teeta were collected from Gongshan County (27.73°N, 98.66°E), Yunnan Province, and voucher specimens were deposited at Yunnan University of Traditional Chinese Medicine. Total genomic DNA was extracted using a modified plant genome kit (Bioteke, Beijing, China). DNA quality was assessed by electrophoresis on a 1% agarose gel (Figure 1), and the concentration of a 1 μL DNA sample was measured with a NanoDrop spectrophotometer (ThermoFisher Scientific, Wilmington, Delaware, USA). The measured concentration was 62.6 ng/μL, above 50 ng/μL.

Figure 1: Electrophoretic separation of total DNA on a 1% agarose gel.

Genome sequencing, assembly and annotation

A sequencing library was constructed and sequenced on the Illumina HiSeq 2500 platform (PE150; Illumina, CA, USA). All raw reads were filtered using NGS QC Toolkit v2.3.3 with default parameters to obtain clean reads from which low-quality regions had been discarded [8]. The plastome was assembled de novo using the GetOrganelle pipeline (https://github.com/Kinggerm/GetOrganelle). The complete chloroplast genome was annotated with the online annotation tool GeSeq (https://chlorobox.mpimp-golm.mpg.de/geseq.html) [9], using the published cp genome of C. chinensis (NCBI accession number: NC_036485) as the reference sequence, and the annotation was then corrected manually in Geneious R11 [10]. The plastid genome map was drawn using the OGDRAW program (http://ogdraw.mpimp-golm.mpg.de/) [11]. The annotated cp genome of C. teeta has been deposited in GenBank under accession number MH359096.

Repeats and simple sequence repeats (SSRs) analysis

REPuter [12] was used to find forward and reverse tandem repeats ≥15 bp, with the minimum alignment score and maximum period size set to 100 and 500, respectively. IMEx [13] was used to identify SSRs, with the minimum number of repeats set to 10, 5, 4, 3, 3 and 3 for mono-, di-, tri-, tetra-, penta- and hexanucleotides, respectively.
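To make the SSR thresholds concrete, the following Python sketch searches a sequence for perfect SSRs using the same minimum repeat numbers. It is only an illustration based on a regular-expression scan, not the IMEx algorithm, and the names `MIN_REPEATS`, `is_primitive` and `find_ssrs` are hypothetical.

```python
import re

# Minimum repeat numbers per motif length, as stated in the text
# (mono-, di-, tri-, tetra-, penta-, hexanucleotides).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'TT' or 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, end, motif, repeats) tuples with 1-based inclusive coordinates."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A unit of the given length followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # already reported under the shorter unit, if long enough
            repeats = len(match.group(0)) // unit_len
            hits.append((match.start() + 1, match.end(), motif, repeats))
    return sorted(hits)


# Toy sequence (not from the C. teeta genome): a (T)10, an (AT)5 and an (AATA)3 SSR.
toy = "GC" + "T" * 10 + "GCGC" + "AT" * 5 + "G" + "AATA" * 3 + "C"
for hit in find_ssrs(toy):
    print(hit)
```

A dedicated tool such as IMEx additionally handles imperfect and compound repeats, which this sketch ignores.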

Phylogenetic analysis

The phylogenetic analysis was based on 31 published chloroplast genomes and was used to infer the phylogenetic position of C. teeta within the family Ranunculaceae. The cp genome of Nandina domestica (GenBank: DQ923117) was included as the outgroup. The LSC, SSC and one IR region of the 32 chloroplast genomes in total were aligned using MAFFT 7.308 [14]. The maximum likelihood (ML) tree was reconstructed with RAxML 8.2.11 [15] under the GTR+G nucleotide substitution model, and node support was estimated by bootstrap analysis with 1000 replicates.

Results and Discussion

Characteristics of chloroplast genome of C. teeta

The complete chloroplast genome of C. teeta is a circular DNA molecule 154,280 bp in length, comprising four regions: one large single-copy (LSC) region (87,519 bp), one small single-copy (SSC) region (17,595 bp) and two inverted repeat (IR) regions (24,583 bp each) (Figure 2). The overall GC content was 38.3%. The IR regions had a higher GC content (43.3%) than the LSC (36.7%) and SSC (32.2%) regions. This was caused by the high GC content (55.5%) of the four ribosomal RNA (rRNA) genes present in the IR regions, similar to the situation in C. chinensis Franchet [16].
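The reported region sizes and GC contents can be cross-checked with simple arithmetic. The short Python sketch below is not part of the original analysis; it uses only the values quoted above and assumes that the overall GC content is the length-weighted mean of the regional values (the variable names are illustrative).

```python
# Region lengths (bp) and GC contents (%) as reported in the text.
# The two IR copies have the same length, so the IR entry is counted twice.
regions = {
    "LSC": {"length": 87_519, "gc": 36.7, "copies": 1},
    "SSC": {"length": 17_595, "gc": 32.2, "copies": 1},
    "IR": {"length": 24_583, "gc": 43.3, "copies": 2},
}

# Total genome length: LSC + SSC + 2 x IR.
total_length = sum(r["length"] * r["copies"] for r in regions.values())

# Length-weighted mean GC content across the four regions.
weighted_gc = (
    sum(r["length"] * r["copies"] * r["gc"] for r in regions.values()) / total_length
)

print(total_length)           # 154280
print(round(weighted_gc, 1))  # 38.3
```

Because the regional GC values are themselves rounded to one decimal place, agreement at one decimal place is the most that can be expected from this check.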

Figure 2: Plastome map of Coptis teeta. The darker gray in the inner circle corresponds to GC content, while the lighter gray corresponds to AT content.

The chloroplast genome of C. teeta contains 125 genes, comprising 81 protein-coding genes (PCGs), 36 transfer RNA (tRNA) genes and 8 rRNA genes. Among these genes, ndhA, ndhB, rpl2, rpoC1, atpF, rps16, trnA-UGC, trnI-GAU, trnV-UAC, trnL-UAA, trnG-UCC and trnK-UUU contain one intron each, while the clpP and ycf3 genes contain two introns. The intron of the trnK-UUU gene (2,853 bp) is larger than the other introns. The IR regions include seven tRNAs (trnN-GUU, trnR-ACG, trnA-UGC, trnI-GAU, trnV-GAC, trnL-CAA, trnI-CAU), four rRNAs without introns (rrn16, rrn23, rrn4.5 and rrn5) and four PCGs (ycf1, ycf2, rps7, ndhB, rps15), and all of these genes are fully duplicated. Additionally, one tRNA (trnL-UAG) and ten PCGs (rpl32, rps15, ccsA, ndhD, psaC, ndhE, ndhG, ndhI, ndhA, ndhH) are located in the SSC region of the C. teeta chloroplast genome.

Repeat and SSR analysis

For repeat structure analysis, 42 forward and 23 reverse repeats with a minimum repeat size of 15 bp were detected in the cp genome of C. teeta (Table 1). Most of these repeats were between 15 and 20 bp long. The longest forward repeat was 39 bp; one copy is located in the intergenic region between trnV-GAC and rps7 in the inverted repeat (IR) region, and the other is located in ycf3 in the LSC. In 31 repeats, both copies start in the same region: 21 in the LSC region, 7 in the IR regions and 3 in the SSC region. In the other 34 repeats, the two copies start in different regions.
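The distinction between the two repeat types can be illustrated with a naive exact-match search. In the sketch below, a forward repeat is a pair of identical k-mers in the same orientation, and a reverse repeat is a pair in which the second copy is the first read backwards (not its reverse complement). This is a simplified illustration under those assumptions, not the REPuter algorithm, and the function name `exact_repeats` is hypothetical.

```python
def exact_repeats(seq, k):
    """Return (forward, reverse) lists of 1-based start-position pairs for exact k-mers."""
    # Index every k-mer by its start positions.
    index = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], []).append(i + 1)

    # Forward repeats: the same k-mer occurring at two different positions.
    forward = [
        (a, b)
        for starts in index.values()
        for n, a in enumerate(starts)
        for b in starts[n + 1:]
    ]

    # Reverse repeats: a k-mer and its reversed string, first copy upstream of the second.
    reverse = [
        (a, b)
        for kmer, starts in index.items()
        for a in starts
        for b in index.get(kmer[::-1], [])
        if a < b
    ]
    return sorted(forward), sorted(reverse)


# Toy sequence (not from the C. teeta genome): "ACGTC" occurs twice,
# and its reversed string "CTGCA" occurs once near the end.
toy = "ACGTC" + "GG" + "ACGTC" + "TAA" + "CTGCA"
forward, reverse = exact_repeats(toy, 5)
print(forward)
print(reverse)
```

Real repeat finders extend such seeds to maximal length and score them statistically, which is where the E-values in Table 1 come from.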

ID Repeat Start 1 Type Size (bp) Repeat Start 2 E-Value Region Gene
1 161 F 15 94911 6.23 IRb/LSC ycf1; IGS
2 1319 F 15 146257 6.23 IRb/SSC ycf1; ndhA
3 1686 F 18 8903 0.0974 IRb trnN-GUU; trnI-GAU
4 1729 F 15 22897 6.23 IRb trnN-GUU; ycf2
5 2352 F 15 34195 6.23 IRb/LSC trnR-ACG; trnS-GCU
6 2352 F 15 61944 6.23 IRb/LSC trnR-ACG; trnS-UGA
7 3964 F 15 7706 6.23 IRb rrn23; trnI-GAU
8 6084 F 15 91596 6.23 IRb/LSC IGS
9 6598 F 19 7684 0.0244 IRb trnA-UGC; trnI-GAU
10 7003 F 15 95317 6.23 IRb/LSC trnA-UGC;IGS
11 7017 F 18 74997 0.0244 IRb/LSC trnA-UGC; trnF-GAA
12 8184 F 18 8215 0.0974 IRb IGS
13 9475 F 15 148599 6.23 IRb/SSC rrn16; ndhH
14 10179 F 17 57871 0.39 IRb/LSC trnV-GAC; trnT-GGU
15 11638 F 16 38957 1.56 IRb/LSC IGS
16 11991 F 39 70172 2.21E-14 IRb/LSC IGS; ycf3
17 19923 F 16 20059 1.56 IRb ycf2
18 20113 F 16 71145 1.56 IRb/LSC ycf2; IGS
19 24047 F 16 94901 1.56 IRb/LSC ycf2; IGS
20 30432 F 20 62669 0.00609 LSC IGS
21 32229 F 16 81798 1.56 LSC IGS
22 34192 F 21 61941 0.00152 LSC trnS-GCU; trnS-UGA
23 35625 F 19 62985 0.0244 LSC trnG-GCC; trnG-UCC
24 38000 F 16 38051 1.56 LSC atpF
25 39908 F 17 101565 0.39 LSC IGS; petB
26 45078 F 16 94306 1.56 LSC rpoC2; IGS
27 46933 F 16 98952 1.56 LSC rpoC1; psbB
28 54520 F 16 102145 1.56 LSC IGS; petB
29 55783 F 16 77512 1.56 LSC IGS
30 57577 F 17 139324 0.39 LSC/SSC IGS
31 58768 F 16 92720 1.56 LSC IGS
32 62639 F 17 71507 0.39 LSC IGS
33 62772 F 16 152783 1.56 LSC/SSC IGS
34 63190 F 21 92511 0.00152 LSC trnfM-CAU; trnP-UGG
35 65213 F 21 67437 0.00152 LSC psaB; psaA
36 70175 F 37 146807 3.54E-13 LSC/SSC ycf3; ndhA
37 77524 F 16 95015 1.56 LSC IGS
38 106539 F 16 139969 1.56 LSC/SSC rpl14; IGS
39 108087 F 16 138701 1.56 LSC/SSC IGS
40 138345 F 16 153380 1.56 SSC IGS
41 139327 F 16 150849 1.56 SSC IGS
42 139331 F 17 153983 0.39 SSC IGS
43 412 R 15 52362 6.23 SSC/LSC IGS; petB
44 1320 R 15 17771 6.23 SSC/IRa IGS; ycf1
45 10547 R 15 19893 6.23 SSC/IRa psaC; IGS
46 11639 R 15 101219 6.23 SSC/LSC ndhD; IGS
47 12190 R 15 118155 6.23 SSC/LSC ndhD; atpA
48 30692 R 16 74169 1.56 IRa/LSC rps7; atpB
49 32228 R 17 77513 0.39 IRa/LSC ndhB; ndhK
50 38958 R 18 101216 0.0974 IRa/LSC ycf2; IGS
51 39215 R 16 81430 1.56 IRa/LSC ycf2; rps14
52 39633 R 16 90680 1.56 IRa/LSC ycf2; rps14
53 40216 R 16 62753 1.56 IRa/LSC ycf2; petL
54 44548 R 17 139326 0.39 LSC/IRb rpl22; ndhB
55 44548 R 16 150849 1.56 LSC/IRb rpl22; rrn23
56 55782 R 17 81799 0.39 LSC psbB; rps4
57 55869 R 17 73441 0.39 LSC psbB; atpB
58 71197 R 16 147968 1.56 LSC/IRb rbcL; trnA-UGC
59 73193 R 16 108492 1.56 LSC atpB; rpoC2
60 77510 R 16 93741 1.56 LSC ndhK; psbC
61 77512 R 17 81798 0.39 LSC ndhK; rps4
62 81801 R 16 93738 1.56 LSC rps4; psbC
63 83628 R 16 139655 1.56 LSC/IRb ycf1; ndhB
64 85488 R 16 149157 1.56 SSC/IRb IGS; rrn23
65 139356 R 16 139960 1.56 IRb ndhB
               
F: Forward; R: Reverse; IGS: intergenic spacer

Table 1: Repeat sequences in C. teeta chloroplast genome.

cpSSR markers are widely used to study the population genetics and evolutionary processes of wild plants [17,18]. In total, 62 SSRs were found in the cp genome of C. teeta, most of them in the LSC (Table 2). Among them, 31 (50.0%) were mononucleotide SSRs, 15 (24.2%) were dinucleotide SSRs, 6 (9.7%) were trinucleotide SSRs, 8 (12.9%) were tetranucleotide SSRs, 1 (1.6%) was a pentanucleotide SSR and 1 (1.6%) was a hexanucleotide SSR. Only twelve SSRs were located in genes; the others were in intergenic regions. Thirty (96.8%) of the mononucleotide SSRs belonged to the A/T type, which is consistent with the hypothesis that cpSSRs are generally composed of short polyadenine (poly A) or polythymine (poly T) repeats and rarely contain tandem guanine (G) or cytosine (C) repeats. These cpSSR markers can be used in the conservation genetics of C. teeta.
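The share of each SSR class follows directly from the counts in Table 2. The brief Python sketch below recomputes the percentages from those counts; it is an arithmetic check only, and the variable names are illustrative.

```python
# Number of SSRs per motif class, as counted in Table 2.
ssr_counts = {"mono": 31, "di": 15, "tri": 6, "tetra": 8, "penta": 1, "hexa": 1}

total = sum(ssr_counts.values())

# Percentage of all SSRs in each class, rounded to one decimal place.
percentages = {name: round(100 * n / total, 1) for name, n in ssr_counts.items()}

# Share of mononucleotide SSRs that are A/T type (30 of the 31 in Table 2).
at_mono_share = round(100 * 30 / ssr_counts["mono"], 1)

print(total)
print(percentages)
print(at_mono_share)
```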

ID Motif Repeats Length (bp) Start End Region Gene
1  (T) 10 10 2610 2619 IRb  
2  (AATA) 3 12 11645 11656 IRb  
3  (ATCT) 3 12 29571 29582 LSC trnK-UUU
4  (A) 10 10 30435 30444 LSC  
5  (C) 11 11 31241 31251 LSC rps16
6  (AT) 7 14 32232 32245 LSC  
7  (A) 11 11 33582 33592 LSC  
8  (T) 11 11 33825 33835 LSC  
9  (T) 10 10 35187 35196 LSC trnG-UCC
10  (T) 10 10 35734 35743 LSC  
11  (CTGT) 3 12 36989 37000 LSC atpA
12  (T) 10 10 40322 40331 LSC  
13  (T) 10 10 42352 42361 LSC  
14  (T) 14 14 44550 44563 LSC rpoC2
15  (AT) 5 10 45302 45311 LSC rpoC2
16  (TA) 5 10 45923 45932 LSC rpoC2
17  (TA) 5 10 53640 53649 LSC  
18  (A) 10 10 54985 54994 LSC  
19  (TA) 7 14 55785 55798 LSC  
20  (TA) 5 10 55828 55837 LSC  
21  (T) 13 13 57582 57594 LSC  
22  (TTATA) 3 15 58077 58091 LSC  
23  (TA) 6 12 58573 58584 LSC  
24  (AAAG) 3 12 59161 59172 LSC  
25  (A) 10 10 62672 62681 LSC  
26  (ATA) 4 12 68971 68982 LSC  
27  (A) 11 11 72877 72887 LSC  
28  (A) 14 14 73196 73209 LSC  
29  (AT) 5 10 73741 73750 LSC  
30  (T) 10 10 75418 75427 LSC  
31  (TA) 7 14 77514 77527 LSC  
32  (ATA) 4 12 77525 77536 LSC  
33  (ACCA) 3 12 78836 78847 LSC trnV-UAC
34  (ATA) 4 12 81391 81402 LSC atpB
35  (AT) 7 14 81801 81814 LSC  
36  (AT) 6 12 86612 86623 LSC  
37  (T) 11 11 86623 86633 LSC  
38  (A) 12 12 86710 86721 LSC  
39  (T) 10 10 89930 89939 LSC  
40  (TA) 6 12 93742 93753 LSC  
41  (T) 10 10 94315 94324 LSC  
42  (ATA) 4 12 95016 95027 LSC  
43  (T) 10 10 97412 97421 LSC clpP
44  (ATT) 4 12 100305 100316 LSC psbN
45  (TTCT) 3 12 108128 108139 LSC  
46  (A) 16 16 108491 108506 LSC  
47  (TTAT) 3 12 125025 125036 IRa  
48  (A) 10 10 134063 134072 IRa  
49  (T) 14 14 139329 139342 SSC  
50  (AT) 5 10 139695 139704 SSC  
51  (AT) 5 10 143541 143550 SSC  
52  (AT) 5 10 145476 145485 SSC  
53  (ATA) 4 12 146996 147007 SSC ndhA
54  (T) 14 14 150851 150864 SSC  
55  (T) 11 11 151012 151022 SSC  
56  (CATT) 3 12 151870 151881 SSC  
57  (T) 12 12 152189 152200 SSC  
58  (T) 11 11 152711 152721 SSC  
59  (CTTTTA) 3 18 152750 152767 SSC  
60  (A) 11 11 153416 153426 SSC  
61  (T) 11 11 153836 153846 SSC  
62  (T) 11 11 153984 153994 SSC  

Table 2: Simple sequence repeats (SSRs) in the C. teeta chloroplast genome.

Phylogenetic analysis

The phylogenetic tree showed that species of the Ranunculaceae formed a monophyletic clade (Figure 3) and that the intra-family topology was consistent with previous studies [6,16,19]. The results strongly supported C. teeta and its congener C. chinensis as sister groups, with a 100% bootstrap value. This newly reported chloroplast genome will provide new insights for phylogenetic studies within the Ranunculaceae and facilitate the future conservation of C. teeta.

Figure 3: The plastome phylogeny of Ranunculaceae. Bootstrap values are shown next to the nodes.

Conclusion

In this study, we reported and analyzed the first complete chloroplast genome of C. teeta, an endemic and endangered plant that is the source of a famous traditional Chinese medicine.

The circular chloroplast genome is 154,280 bp in size and exhibits the typical quadripartite structure, comprising two inverted repeat (IR) regions of 24,583 bp each, one large single-copy (LSC) region of 87,519 bp and one small single-copy (SSC) region of 17,595 bp.

The cp genome of C. teeta is rich in SSRs, which will be an informative source for developing new molecular markers to evaluate genetic diversity and to support effective conservation strategies for this species. The phylogenetic analysis showed that C. teeta and C. chinensis form one clade as sister groups. This information will be useful for phylogenetic analysis of the genus Coptis and will also enhance our understanding of the evolutionary relationships within the Ranunculaceae.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 81560613), Special subsidies for public health services of TCM “the national survey of TCM resources” (DSS, MOF. No 66/2017), and the Key laboratory training program in Yunnan (2017DG006).
