We develop bioinformatics methods and computational tools to analyze large-scale datasets from high-throughput technologies, with the aim of understanding information at the system level, including the genome, transcriptome and proteome. We apply next-generation sequencing technologies and bioinformatics to several major species (including bamboo, China fir and Populus trichocarpa) to infer gene regulatory networks by integrating omics data. We also investigate the interplay between non-coding RNAs (miRNA/circular RNA) and different forms of post-transcriptional regulation (alternative splicing/alternative polyadenylation) to provide a theoretical basis for improving the production of important species.

Research focus 1: interplay between miRNA and alternative splicing regulation

Small RNAs and alternative splicing are important post-transcriptional regulatory mechanisms in most eukaryotes. In both animals and plants, mature small RNAs have been identified and extensively studied, and alternative splicing has also been observed in higher plants. However, the interplay between small RNAs and splicing regulation is largely unknown in planta. We therefore focus on the relationship between alternative splicing and miRNA regulation.

We are pleased to release the first version of ASmiRdb, a database of the interplay between miRNA and alternative splicing in plants. The database is accompanied by easy-to-use web query interfaces for data visualization and downstream analysis. In particular, members of the forest genomics community can submit their own datasets to the web service to search for small RNA target sites located in alternatively spliced (AS) regions.
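The search for small RNA target sites located in AS regions is, at its core, an interval-overlap problem. The following is a minimal sketch of that idea only, not ASmiRdb's actual implementation. The `Interval` class, the function names and the example coordinates are all hypothetical, and coordinates are assumed to be 0-based and half-open, as in BED files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A named genomic interval (hypothetical helper type, 0-based, half-open)."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def target_sites_in_as_regions(target_sites, as_regions):
    """Pair every small RNA target site with each AS region it overlaps."""
    hits = []
    for site in target_sites:
        for region in as_regions:
            if overlaps(site, region):
                hits.append((site.name, region.name))
    return hits


# Made-up example data, for illustration only.
sites = [
    Interval("chr1", 120, 141, "miR-A_site1"),
    Interval("chr1", 500, 521, "miR-B_site1"),
    Interval("chr2", 120, 141, "miR-A_site2"),
]
regions = [
    Interval("chr1", 100, 200, "gene1_retained_intron"),
    Interval("chr2", 300, 400, "gene2_skipped_exon"),
]

result = target_sites_in_as_regions(sites, regions)
print(result)
```

The nested loop is adequate for small inputs. For genome-scale datasets, sorting the intervals or using an interval tree would be the more usual choice.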

Research focus 2: interplay between circular RNA and alternative splicing regulation

Circular RNA (or circRNA) is a type of RNA which, unlike the better-known linear RNA, forms a covalently closed continuous loop: the 3' and 5' ends normally present in an RNA molecule are joined together. This feature confers numerous properties on circular RNAs, many of which have only recently been identified. A case study reported that a circRNA can bind strongly to its cognate DNA locus, forming an RNA:DNA hybrid (R-loop) that regulates exon skipping. However, genome-wide mechanistic insight into the interplay between circular RNA and alternative splicing regulation is still lacking, and this is the focus of our work.

Phyllostachys edulis, moso bamboo, or mao zhu (Chinese name: 毛竹), is a temperate species of giant timber bamboo. It can reach heights of up to 28 m and is the species most commonly used in the bamboo textile industry of China. Phyllostachys edulis spreads through both asexual and sexual reproduction. Asexual reproduction is the most common and best-known mode: the plant sends up new culms from underground rhizomes. Lateral buds on the rhizome form new shoots, which grow rapidly after emerging from the soil and reach an average culm height of 13 meters within 38 days (Li et al., 1998; Song et al., 2016). The fast growth of the new shoots depends entirely on the well-developed rhizome-root system, which can spread widely in the horizontal direction and connects the young culms with other mature bamboos (Li et al., 2000; Embaye et al., 2005; Zhou et al., 2005; Song et al., 2016). The rhizome system plays important roles in energy storage, transport and vegetative reproduction (Li et al., 1998).

moso bamboo PacBio database

However, post-transcriptional regulation and circular RNAs have not been comprehensively studied in the development of the rhizome system in bamboo. We therefore combined single-molecule long-read sequencing with second-generation sequencing (SGS: circular RNA sequencing, RNA-seq and PAS-seq) to identify and quantify circular RNAs, alternative splicing (AS) and alternative polyadenylation (APA) genome-wide in the rhizome system. Taken together, our results suggest that post-transcriptional regulation and long non-coding RNAs may play vital roles in the underground rhizome-root system.
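To illustrate what distinguishes a circular RNA junction from an ordinary splice junction, the sketch below classifies a junction by the genomic order of its splice donor and acceptor. It shows only the general principle of back-splicing and is not the pipeline used in this study. The function name and the coordinates are hypothetical.

```python
def is_back_splice(donor: int, acceptor: int, strand: str) -> bool:
    """Return True if a junction joins a donor to an UPSTREAM acceptor.

    In linear splicing the 5' splice site (donor) is joined to a 3' splice
    site (acceptor) that lies downstream in the transcript. A back-splice
    junction, the hallmark of a circRNA, joins the donor to an acceptor that
    lies upstream instead. On the minus strand, "downstream in the
    transcript" means a smaller genomic coordinate, so the test is mirrored.
    """
    if strand == "+":
        return donor > acceptor
    if strand == "-":
        return donor < acceptor
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


# Made-up junctions: (donor position, acceptor position, strand).
junctions = [
    (1_000, 2_000, "+"),  # linear: donor upstream of acceptor
    (5_000, 3_000, "+"),  # back-splice on the plus strand
    (8_000, 7_000, "-"),  # linear on the minus strand
    (7_000, 8_000, "-"),  # back-splice on the minus strand
]

labels = ["back-splice" if is_back_splice(*j) else "linear" for j in junctions]
print(labels)
```

In practice, circRNA detection from sequencing reads also has to account for alignment ambiguity and read support, which this sketch does not attempt to model.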

Research focus 3: Developing bioinformatics methods and computational tools

We develop bioinformatics methods and computational tools to analyze large-scale datasets from high-throughput technologies, with the aim of understanding information at the system level, including the genome, transcriptome and epigenetics.

We are pleased to release the first version of PRAPI, a one-stop solution for Iso-Seq analysis that comprehensively analyzes alternative transcription initiation (ATI), alternative splicing (AS), alternative cleavage and polyadenylation (APA), natural antisense transcripts (NAT), and circular RNAs (circRNAs).

Single-molecule real-time (SMRT) isoform sequencing (Iso-Seq) on the Pacific Biosciences (PacBio) platform has received increasing attention for its ability to capture full-length isoforms, so comprehensive tools for Iso-Seq bioinformatics analysis are extremely useful. PRAPI can combine Iso-Seq full-length isoforms with short-read data, such as RNA-Seq or polyadenylation site sequencing (PAS-seq), for differential expression analysis of NAT, AS, APA and circRNAs. Furthermore, PRAPI can annotate new genes and correct mis-annotated genes when a gene annotation is available. Finally, PRAPI generates high-quality vector graphics to visualize and highlight the Iso-Seq results.
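As an illustration of one of the analyses listed above, the sketch below flags candidate APA genes by clustering the 3' end positions of full-length reads and keeping genes with at least two well-supported clusters. This is a simplified sketch under stated assumptions, not PRAPI's actual algorithm or interface. The function names, the 24 nt clustering window, the minimum read support and the example data are all assumptions chosen for illustration, and strand is ignored for brevity.

```python
from collections import defaultdict


def cluster_positions(positions, max_gap=24):
    """Group sorted positions so that neighbours within max_gap share a cluster."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def apa_candidates(read_ends, max_gap=24, min_reads=2):
    """Return genes with two or more supported poly(A) site clusters.

    read_ends is an iterable of (gene_id, three_prime_end_position) pairs.
    The result maps each candidate gene to a list of (start, end) cluster spans.
    """
    by_gene = defaultdict(list)
    for gene, pos in read_ends:
        by_gene[gene].append(pos)

    candidates = {}
    for gene, positions in by_gene.items():
        supported = [
            c for c in cluster_positions(positions, max_gap) if len(c) >= min_reads
        ]
        if len(supported) >= 2:
            candidates[gene] = [(c[0], c[-1]) for c in supported]
    return candidates


# Made-up read 3' ends, for illustration only.
read_ends = [
    ("geneA", 1000), ("geneA", 1005),
    ("geneA", 1500), ("geneA", 1510), ("geneA", 1512),
    ("geneB", 300), ("geneB", 302), ("geneB", 900),
]

apa = apa_candidates(read_ends)
print(apa)
```

Here geneA has two supported clusters and is reported, while geneB has only one cluster that meets the read-support threshold and is not. Both thresholds would need tuning on real data.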