What is a fusion gene? What are chimeric RNAs?
A fusion gene is a chimeric gene formed by the fusion of partial sequences of two genes, usually due to chromosomal translocations or deletions. These chimeric genes can form abnormal transcripts or proteins in subsequent biological processes, which can lead to or promote the development of tumors.
For example, chronic myeloid leukemia (CML), also called chronic granulocytic leukemia, is characterized at the molecular level by the BCR-ABL fusion gene. This gene encodes a fusion protein with strong tyrosine kinase activity, which drives excessive cell proliferation and inhibits apoptosis, contributing to the development of the disease.
The first observation of what proved to be a gene fusion dates back to 1960, when Nowell and Hungerford reported that two patients with CML carried a characteristic small chromosome, later named the "Philadelphia chromosome".
A "chimeric RNA" is any transcript that consists of exons from different parental genes. Not all fusion transcripts derive from fusion genes: chimeric RNAs can also arise from trans-splicing of two independent precursor mRNAs or from cis-splicing between two adjacent genes (read-through transcription).
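The definition above can be made concrete with a minimal sketch. It assumes each exon of a transcript has already been annotated with its parental gene. The `Exon` record, the function names, the coordinates and the 50 kb `max_gap` threshold are all hypothetical choices for illustration, not part of any published tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    gene: str    # parental gene the exon is annotated to
    chrom: str
    start: int   # placeholder coordinates, not real annotation
    end: int
    strand: str


def parental_genes(exons):
    """Return the parental gene names in transcript order, without repeats."""
    seen = []
    for exon in exons:
        if exon.gene not in seen:
            seen.append(exon.gene)
    return seen


def is_chimeric(exons):
    """A transcript is chimeric if its exons come from more than one gene."""
    return len(parental_genes(exons)) > 1


def could_be_readthrough(exons, max_gap=50_000):
    """Hypothetical heuristic: two neighbouring genes on the same chromosome
    and strand, separated by at most max_gap bases (assumed threshold)."""
    genes = parental_genes(exons)
    if len(genes) != 2:
        return False
    if len({(e.chrom, e.strand) for e in exons}) != 1:
        return False
    spans = []
    for gene in genes:
        gene_exons = [e for e in exons if e.gene == gene]
        spans.append((min(e.start for e in gene_exons),
                      max(e.end for e in gene_exons)))
    (_, first_end), (second_start, _) = sorted(spans)
    gap = second_start - first_end
    return 0 <= gap <= max_gap


# BCR and ABL1 lie on different chromosomes; the coordinates are made up.
bcr_abl1 = [
    Exon("BCR", "chr22", 1_000, 1_200, "+"),
    Exon("ABL1", "chr9", 5_000, 5_300, "+"),
]
# GENE_A and GENE_B are hypothetical adjacent genes.
readthrough = [
    Exon("GENE_A", "chr1", 10_000, 10_200, "+"),
    Exon("GENE_B", "chr1", 25_000, 25_400, "+"),
]
```

Here `is_chimeric` is true for both transcripts, while `could_be_readthrough` flags only the second. RNA data alone cannot separate a genuine fusion gene from trans-splicing; that distinction needs evidence of a rearrangement at the DNA level.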
Please read our article Chimeric RNA and Sequencing Technologies: Advancing Detection and Research for basic knowledge of chimeric RNA.
Why is it important to identify fusion transcripts in cancer research?
Chromosomal abnormalities occur frequently in human tumors. Chromosomal translocations and gene fusions were originally identified in hematological malignancies, where disease subtypes can be defined by the type of chromosomal abnormality detected. Across cancers more broadly, certain recurrent gene fusions serve as diagnostic markers and have been targeted therapeutically with substantial clinical success. The development and widespread use of sequencing technologies has accelerated the identification and detection of genetic aberrations.
Indeed, accurate detection of fusion genes or transcripts is important for the prevention, treatment and overall understanding of such oncological diseases.
Methods of Fusion Characterization
Traditional methods of cell biology analysis
There are many experimental and computational methods to detect fusion transcripts. Prior to next-generation sequencing (NGS), fusion identification in hematological malignancies relied on traditional cytogenetic karyotyping to detect relatively large chromosomal rearrangements. Later techniques identified more rearrangements; examples include fluorescence in situ hybridization (FISH), spectral karyotyping (SKY), multicolor FISH (M-FISH), comparative genomic hybridization (CGH) and high-density array comparative genomic hybridization (a-CGH).
However, traditional cytogenetic and non-cytogenetic methods are largely based on predefined fusion targets. As such, they require a priori knowledge and are not suitable for large-scale de novo gene fusion discovery. In contrast, sequencing-based methods such as whole genome sequencing (WGS) and RNA sequencing (RNA-Seq) are widely used to identify previously unknown gene fusions.
Sequencing-based assays
High-throughput de novo gene fusion discovery has become a reality with the development of NGS technology, which can analyze the entire genome and transcriptome to exhaustively identify copy number alterations, somatic point mutations, structural rearrangements and gene expression alterations. High-throughput, deep sequencing platforms are now widely used to characterize cancer genomes, at a throughput that FISH cannot match. The Cancer Genome Atlas (TCGA) has shown that DNA and RNA sequence aberrations in at least 25 different cancer types can be identified genome-wide using NGS. The use of targeted RNA sequencing increases the sensitivity of fusion detection and provides a more comprehensive characterization of the tumor transcriptome.
NGS not only generates a large amount of data in a single run, allowing the discovery of new transcripts, but also expands the potential to predict fusion loci. Combining sequencing results with phenotypic data helps to identify fusions and other somatic sequence variations, and to pinpoint the changes in the cancer genome that are functionally relevant. Several bioinformatics tools have been developed to detect fusion transcripts from RNA-Seq data, such as ChimeraScan, SnowShoes-FTD, TopHat-Fusion, FusionMap and FusionSeq.
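One kind of evidence that RNA-Seq fusion callers commonly rely on is the discordant read pair, in which the two mates of a paired-end read map to different genes. The sketch below only illustrates that idea; it is not the algorithm of any of the tools named above. It assumes each read pair has already been reduced to the gene names its two mates were assigned to, and the function name and input format are hypothetical.

```python
from collections import Counter


def count_discordant_pairs(mate_genes):
    """Count read pairs whose two mates map to different genes.

    mate_genes: iterable of (gene_of_mate1, gene_of_mate2) tuples, with
    None for a mate that could not be assigned to a gene (assumed format).
    """
    counts = Counter()
    for gene_a, gene_b in mate_genes:
        if gene_a is None or gene_b is None or gene_a == gene_b:
            continue  # unassigned or concordant pair: no fusion evidence
        # Sort so that (EML4, ALK) and (ALK, EML4) count as the same pair.
        counts[tuple(sorted((gene_a, gene_b)))] += 1
    return counts


pairs = [
    ("EML4", "ALK"),
    ("ALK", "EML4"),
    ("EML4", "EML4"),
    ("ALK", None),
    ("EML4", "ALK"),
]
support = count_discordant_pairs(pairs)
```

Sorting the gene names discards which gene is the 5' partner, a simplification a real caller would not make. Real tools also look for reads that span the fusion junction itself.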
However, the greatest computational challenge in identifying fusion transcripts is the high rate of false positives, which arises from the direct application of short-read mappers. PacBio SMRT and Nanopore sequencing technologies provide direct access to full-length transcripts without assembly; this yields higher-quality transcripts and facilitates the study of mRNA structure, including alternative splicing, fusion genes and allelic expression.
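A common way to limit false positives from short-read mapping is to require a minimum amount of read support and to discard candidates between genes with similar sequences, where reads are easily mismapped. The following is a minimal sketch of such a filter. The `FusionCandidate` record, the default thresholds and the homolog list are all assumptions chosen for illustration and would need tuning for real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionCandidate:
    gene5: str           # 5' partner gene
    gene3: str           # 3' partner gene
    split_reads: int     # reads that cross the fusion junction
    spanning_pairs: int  # read pairs with one mate in each gene


def filter_candidates(candidates, min_split=2, min_spanning=2,
                      homologous_pairs=frozenset()):
    """Keep candidates with enough support that are not known homologs.

    homologous_pairs: set of frozensets of gene names with similar
    sequences (hypothetical list supplied by the caller).
    """
    kept = []
    for cand in candidates:
        if frozenset((cand.gene5, cand.gene3)) in homologous_pairs:
            continue  # likely a mapping artifact between similar genes
        if cand.split_reads < min_split or cand.spanning_pairs < min_spanning:
            continue  # too little read evidence
        kept.append(cand)
    return kept


# GENE_P, GENE_Q, GENE_X1 and GENE_X2 are hypothetical gene names.
candidates = [
    FusionCandidate("EML4", "ALK", split_reads=12, spanning_pairs=8),
    FusionCandidate("GENE_P", "GENE_Q", split_reads=1, spanning_pairs=0),
    FusionCandidate("GENE_X1", "GENE_X2", split_reads=30, spanning_pairs=25),
]
homologs = {frozenset(("GENE_X1", "GENE_X2"))}
kept = filter_candidates(candidates, homologous_pairs=homologs)
```

In this example only the EML4-ALK candidate survives: the second lacks support and the third is removed as a homologous pair despite its high read counts.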
Applications of fusion transcript identification in oncology research and therapy
For the development of targeted therapies
In disease research, the identification of fusion transcripts can contribute to the development of new therapies. For example, EML4-ALK fusions occur in approximately 5% of non-small cell lung cancer (NSCLC) cases, where they promote cell growth and reduce apoptosis; their discovery led to the development of crizotinib (Xalkori) as an ALK-targeted treatment. Just 3 years after Pfizer optimized the compound's chemical structure and specificity for ALK, Xalkori was approved by the FDA for the treatment of metastatic NSCLC.
Fusion transcripts as valuable biomarkers
Among the various genomic aberrations in cancer, recurrent gene fusions have been identified as a major category of mutations in hematological malignancies; they follow different patterns of occurrence depending on their origin, spectrum, tissue specificity, structure and function.
Several molecular abnormalities are included in the latest WHO classification of hematological malignancies. Similarly, molecular analysis is becoming a tool for the differential diagnosis of soft tissue sarcomas: for example, SS18-SSX fusions in synovial sarcoma, EWSR1 fusions in Ewing's sarcoma and PAX3/7-FKHR (FOXO1) fusions in alveolar rhabdomyosarcoma. Likewise, EML4-ALK fusions in lung cancer and RAF family rearrangements in different subsets of solid cancers are used to stratify disease.
To date, more than 2,000 such tumor-specific gene fusions have been documented (http://cgap.nci.nih.gov/Chromosomes/Mitelman); they are established or potential prognostic biomarkers or drug targets.
NGS for detection of fusion transcripts in FFPE tumor samples
Formalin-fixed paraffin-embedded (FFPE) tissue samples are the most commonly used clinical specimens for cancer diagnosis, but RNA degradation due to fixation and storage poses a challenge for detection and analysis. RNA sequencing methods developed for FFPE samples now allow researchers to maximize the data recovered from low-quality RNA. With target capture technology, transcript profiles can be obtained at higher throughput and lower sequencing depth, and with lower RNA input requirements.
References:
- Nowell PC, Hungerford DA. A minute chromosome in human chronic granulocytic leukemia. Science. 1960;132:1497.
- Qu K, Baker J, Ma Y. Applications of NGS to screen FFPE tumours for detecting fusion transcripts. In: Next Generation Sequencing in Cancer Research, Volume 2: From Basepairs to Bedsides. 2015:155-177.
- Mukherjee S, Heng HH, Frenkel-Morgenstern M. Emerging role of chimeric RNAs in cell plasticity and adaptive evolution of cancer cells. Cancers. 2021;13(17):4328.
- Pederzoli F, Bandini M, Marandino L, et al. Targetable gene fusions and aberrations in genitourinary oncology. Nature Reviews Urology. 2020;17(11):613-625.
- Bayliss R, Choi J, Fennell DA, et al. Molecular mechanisms that underpin EML4-ALK driven cancers and their response to targeted drugs. Cellular and Molecular Life Sciences. 2016;73:1209-1224.