Small RNA-seq data analysis: a worked example

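I have a hands-on microRNA sequencing data analysis walkthrough on Bilibili. It mainly uses data downloaded from EBI under project PRJNA486534, covering 11 small RNA library samples. The course still needs other datasets as companion exercises, though. One candidate is this paper: Plasma extracellular RNA profiles in healthy and cancer patients. Sci Rep 2016 Jan 20;6:19413. PMID: 26786760. The researchers enrolled 192 people, covering 3 cancer types plus healthy controls: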

  • 100 colon cancer
  • 36 prostate cancer
  • 6 pancreatic cancer patients
  • 50 healthy individuals

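    The clinical characteristics are summarized in a figure in the original post:
    (figure: summary of clinical characteristics, image not preserved)
    Average sequencing depth was 12M reads per sample, of which 5.4M on average mapped successfully, distributed across: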
  • microRNAs(miRNAs) (~40.4%)
  • piwi-interacting RNAs(piwiRNAs) (~40.0%)
  • pseudo-genes (~3.7%)
  • long noncoding RNAs (lncRNAs) (~2.4%)
  • tRNAs (~2.1%)
  • mRNAs (~2.1%)
    For a deeper understanding, see: miRNA, siRNA, piRNA: Knowns of the unknown. - NCBI
    See also: miRNA naming conventions | Bioinformatics Blog

    Small RNA-seq data analysis

    Reference sequences used:

  • miRBase (http://www.mirbase.org, Release 21)
  • piwiRNABank (http://piwiRNAbank.ibab.ac.in)
  • siRNAdb (http://siRNA.cgb.ki.se)
  • FLJ Human cDNA Database (http://flj.lifesciencedb.jp/top/, v3.2)
  • human genome references were downloaded from NCBI (ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/, Release 106)
    Analysis pipeline:
    Reads were aligned with bowtie, using the parameters: -l 18 -v 1 -m 2 --norc --best --strata
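    To make the parameter list concrete, here is a minimal sketch that assembles the bowtie command line in Python. The index prefix, file names and thread count are hypothetical placeholders, and the positional argument layout (index, reads, output) assumes the classic bowtie 1 invocation; newer releases may expect the index to be given with -x. The post does not say which reference was aligned first or how adapters were trimmed, so this covers only the alignment call itself.

```python
import shlex
from typing import List


def build_bowtie_cmd(index_prefix: str, reads_fastq: str, out_sam: str,
                     threads: int = 4) -> List[str]:
    """Assemble a bowtie (version 1) command using the parameters quoted from the paper.

    index_prefix, reads_fastq, out_sam and threads are illustrative assumptions.
    """
    return [
        "bowtie",
        "-l", "18",            # seed length as listed in the paper (bowtie documents -l for -n mode)
        "-v", "1",             # allow at most 1 mismatch over the whole read
        "-m", "2",             # suppress reads with more than 2 reportable alignments
        "--norc",              # do not align against the reverse-complement strand
        "--best", "--strata",  # report only alignments from the best mismatch stratum
        "-p", str(threads),    # number of alignment threads (assumption, not from the paper)
        "-S",                  # write SAM output
        index_prefix,
        reads_fastq,
        out_sam,
    ]


# Hypothetical file names, only to show what the final command looks like.
cmd = build_bowtie_cmd("mirbase_hairpin", "sample_trimmed.fastq", "sample.sam")
print(shlex.join(cmd))
```

    The resulting list can be passed to subprocess.run(cmd, check=True) once bowtie and a built index are available.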

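    Differential analysis: 100 colon cancer patients vs. 50 healthy controls

    The 100 colon cancer patients were split into 4 groups by stage (I–IV), and each group was compared with the 50 healthy controls in its own differential analysis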

  • We compared each stage (I–IV, N = 25 per stage) to the controls separately.
  • This analysis identified 1, 47, 120, and 167 unique RNA transcripts showing significant differences (FDR < 0.05) in stage I, II, III, and IV, respectively.
    The other two cancers were likewise compared with the healthy controls in a differential analysis.
    Of all significant changes, two miRNAs (miR-125a-5p and miR-1343-3p) were significantly decreased in all evaluated types of cancer.
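    The FDR < 0.05 cutoff above implies a multiple-testing correction over per-transcript p-values. As a rough illustration, the sketch below implements the Benjamini-Hochberg adjustment, which is one common way to control the FDR; the post does not state which statistical test or correction the authors used, so treat this as an assumption. The p-values in the example are toy numbers, not results from the paper.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values for four hypothetical transcripts.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_fdr = benjamini_hochberg(toy_p)
significant = [i for i, q in enumerate(toy_fdr) if q < 0.05]
print(toy_fdr, significant)
```

    In a real run, the input would be one p-value per transcript from each comparison (each stage vs. controls, each cancer type vs. controls), and the transcripts passing the cutoff would be intersected across comparisons.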

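    Diagnostic or prognostic models

    With survival data you can build a prognostic model; without it, only a diagnostic model is possible.
    Diagnostic model: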

  • miR-1343-3p alone was able to generate an area under the curve (AUC) of 0.59, 0.72, 0.73, 0.82 for colon cancer stages I, II, III and IV, 0.75 and 0.87 for hormone-sensitive prostate cancer (HSPC) and castration-resistant prostate cancer (CRPC), respectively.
  • miR-125a-5p alone was able to generate an AUC of 0.62, 0.76, 0.74, 0.77 for colon cancer stages I, II, III and IV, 0.77 and 0.86 for HSPC and CRPC, respectively.
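    A single-marker AUC like the ones quoted above can be computed directly from expression values without fitting a model. The sketch below uses the rank-based (Mann-Whitney) definition of the AUC: the probability that a randomly chosen patient scores higher than a randomly chosen control, with ties counted as one half. Because both miRNAs are reported as decreased in cancer, the expression is negated so that lower expression counts as a higher cancer score. The function name and the expression values are made up for illustration, and the paper may have computed its AUCs differently.

```python
from typing import Sequence


def single_marker_auc(cancer_expr: Sequence[float],
                      healthy_expr: Sequence[float],
                      lower_in_cancer: bool = True) -> float:
    """AUC of one marker for separating cancer from healthy samples.

    Equivalent to the Mann-Whitney U statistic divided by the number of pairs.
    """
    sign = -1.0 if lower_in_cancer else 1.0
    wins = 0.0
    for c in cancer_expr:
        for h in healthy_expr:
            if sign * c > sign * h:
                wins += 1.0   # the cancer sample ranks as "more cancer-like"
            elif c == h:
                wins += 0.5   # ties count as half
    return wins / (len(cancer_expr) * len(healthy_expr))


# Toy log2 expression values, not taken from the paper.
cancer = [2.0, 3.0, 4.0]
healthy = [3.5, 5.0, 6.0]
print(round(single_marker_auc(cancer, healthy), 3))
```

    For the assignment, the inputs would be the normalized expression of miR-1343-3p or miR-125a-5p in one patient group and in the 50 healthy controls.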

    Supplementary materials of the paper

  • Supplementary Table S1 shows our complete list of the 493 RNA transcripts with log2 transformed RPM >5.
  • A complete list of these miRNA isoforms is provided in the supplementary file (Supplementary Table S2).
  • This analysis revealed various expression stabilities among 60 selected transcripts across the 192 samples (Supplementary Table S3).
  • This analysis revealed 20 exRNA transcripts to be significantly associated with sex and 15 RNA transcripts with age (FDR < 0.05) (Supplementary Table S4).
  • The statistical significance for all RNA transcripts is listed in Supplementary Table S5.
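    The "log2 transformed RPM > 5" filter behind Supplementary Table S1 can be sketched as follows. RPM (reads per million) scales each transcript's count by the number of mapped reads in the sample. The pseudocount of 1 and the per-sample application of the threshold are assumptions made here; the post does not say how the authors handled zero counts or whether they filtered on a per-sample or an averaged value. The counts in the example are invented.

```python
import math
from typing import Dict


def log2_rpm(counts: Dict[str, int], total_mapped: int,
             pseudocount: float = 1.0) -> Dict[str, float]:
    """Convert raw read counts of one sample to log2(RPM + pseudocount)."""
    return {
        name: math.log2(count * 1e6 / total_mapped + pseudocount)
        for name, count in counts.items()
    }


def filter_expressed(values: Dict[str, float], threshold: float = 5.0) -> Dict[str, float]:
    """Keep transcripts whose log2 RPM exceeds the threshold."""
    return {name: v for name, v in values.items() if v > threshold}


# Invented counts for one sample with 1,000,000 mapped reads.
toy_counts = {"miR-A": 100, "miR-B": 1}
toy_log2 = log2_rpm(toy_counts, total_mapped=1_000_000)
print(filter_expressed(toy_log2))
```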

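    Apprentice assignment

    Download this sequencing dataset, run the same upstream analysis as the paper to obtain the expression matrix, run the same differential analysis, and then build the same diagnostic model from the differentially expressed miRNAs!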
