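A few days ago, while browsing the Biostars forum, I came across a question about miRNA analysis. The poster had downloaded the raw data for a published paper from NCBI's SRA and was unsure how to process it. The data he listed came from an Ion Torrent sequencer, and since I had never worked with miRNA-seq data before, I took some time to teach myself. Because I already had an RNA-seq background, it was fairly easy to pick up. I am recording my learning process here in the hope that it helps others who follow the same path.
The paper I chose was published in 2014. The authors stimulated human iPSC-derived cardiomyocytes (hiPSC-CMs) with ET-1 and compared miRNA and mRNA expression before and after stimulation. I did not look closely at the biological significance of the paper; I only walk through it from a data-analysis perspective. mRNA expression was measured on the Affymetrix Human Genome U133 Plus 2.0 Array, which is easy to analyze: obtain the expression matrix, then use the limma package to find differentially expressed genes. The miRNA analysis is a bit more involved. The authors used an Ion Torrent sequencer, but the data downloaded from SRA are already adapter-trimmed reads in FASTQ format, so there is no need to worry about sequencer-specific handling here.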
## Aggarwal P, Turner A, Matter A, Kattman SJ et al. RNA expression profiling of human iPSC-derived cardiomyocytes in a cardiac hypertrophy model. PLoS One 2014;9(9):e108051. PMID: 25255322
## The accession numbers are:
## 1. SuperSeries (mRNA+miRNA) - GSE60293
## 2. mRNA expression array - GSE60291 (Affymetrix Human Genome U133 Plus 2.0 Array)
## 3. miRNA-Seq - GSE60292 (Ion Torrent)
Ion Torrent's Torrent Suite version 3.6 was used for basecalling
Raw sequencing reads were aligned using the SHRiMP2 aligner and were aligned against the human reference genome (hg19) for novel miRNA prediction and then against a custom reference sequence file containing miRBase v.20 known human miRNA hairpins, tRNA, rRNA, adapter sequences and predicted novel miRNA sequences.(Genome_build: hg19, miRBase v.20 human miRNA hairpins)
The miRDeep2 package (default parameters) was used to predict novel (as yet undescribed) miRNAs
Alignments with less than 17 bp matches and a custom 3′ end phred q-score threshold of 17 were filtered out.
miRNA quantification was done with HTSeq v0.5.3p3 in its default union mode.
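To make the union mode concrete for myself, here is a minimal Python sketch of the rule as I understand it: collect every feature that overlaps any position of the read, then count the read only if exactly one feature is found. This is not HTSeq code. The function names, the feature names `hairpin_A` and `hairpin_B`, the coordinates, and the labels `no_feature` and `ambiguous` are all made up for illustration, and the sketch assumes a single reference sequence and ignores strand.

```python
from collections import Counter


def assign_read_union(read_start, read_end, features):
    """Assign one read to a feature using a union-style rule.

    Coordinates are half-open intervals [start, end) on one reference.
    `features` is a list of (name, start, end) tuples (hypothetical layout).
    """
    # Union of all features that overlap the read at any position.
    overlapping = {
        name
        for name, start, end in features
        if start < read_end and read_start < end
    }
    if not overlapping:
        return "no_feature"   # illustrative label
    if len(overlapping) > 1:
        return "ambiguous"    # illustrative label
    return next(iter(overlapping))


def count_reads(reads, features):
    """Tally reads per feature; `reads` is a list of (start, end) tuples."""
    counts = Counter()
    for read_start, read_end in reads:
        counts[assign_read_union(read_start, read_end, features)] += 1
    return counts


# Hypothetical annotation with two overlapping hairpins.
features = [("hairpin_A", 100, 172), ("hairpin_B", 160, 230)]
# Three hypothetical reads: inside A only, inside the A/B overlap, outside both.
reads = [(105, 127), (165, 170), (300, 322)]
counts = count_reads(reads, features)
```

With these toy inputs each of the three categories receives one read, which shows why reads falling where two annotated features overlap are dropped from the per-miRNA counts.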
Differential miRNA expression was analyzed using the DESeq (v.1.12.1) R/Bioconductor package
In this study, differentially expressed genes with a false discovery rate of at most 10% (FDR ≤ 0.1) and a log2 fold change greater than 1.5 or less than −1.5 were considered significant.
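The significance rule above can be sketched in a few lines of Python. This is only an illustration of the two thresholds, not the authors' code. The dictionary keys `log2FoldChange` and `padj` are named after the columns I would expect in a DESeq result table, which is an assumption, and the miRNA ids and values are invented.

```python
import math


def is_significant(log2_fold_change, padj, fdr_cutoff=0.1, lfc_cutoff=1.5):
    """Return True if a row passes both the FDR and fold-change thresholds."""
    # Rows without a usable adjusted p-value (None or NaN) never pass.
    if padj is None or math.isnan(padj):
        return False
    return padj <= fdr_cutoff and abs(log2_fold_change) > lfc_cutoff


# Invented example rows; ids and numbers are placeholders, not real results.
rows = [
    {"id": "miR_a", "log2FoldChange": 2.3, "padj": 0.01},
    {"id": "miR_b", "log2FoldChange": -1.8, "padj": 0.10},
    {"id": "miR_c", "log2FoldChange": 1.2, "padj": 0.001},
    {"id": "miR_d", "log2FoldChange": -3.0, "padj": 0.25},
    {"id": "miR_e", "log2FoldChange": 4.0, "padj": float("nan")},
]

significant_ids = [
    row["id"]
    for row in rows
    if is_significant(row["log2FoldChange"], row["padj"])
]
```

Taking the absolute value of the log2 fold change covers both directions at once, so up-regulated and down-regulated miRNAs are kept by the same test.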
Target gene prediction was performed using the TargetScan (version 6.2) database
We also used miRTarBase (version 4.3), to identify targets that have been experimentally validated
## miRDeep2 and miReap: predict the exact precursor sequence from the mature sequence.
Supplementary_files_format_and_content: tab-delimited text files containing raw read counts for known mature human miRNAs (i.e. the expression matrix).
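Since the supplementary files are per-sample count tables, they need to be merged into one matrix before any differential analysis. Below is a minimal Python sketch of that merge. I am assuming each file has two tab-separated columns (miRNA name, raw count) and no header, which I have not verified against the actual GEO files; the sample names, miRNA names, and counts are invented, and `io.StringIO` stands in for real files.

```python
import csv
import io


def read_counts(handle):
    """Parse a two-column tab-delimited table into {mirna_name: count}."""
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        counts[row[0]] = int(row[1])
    return counts


def build_matrix(per_sample):
    """Merge per-sample count dicts into (row_names, column_names, matrix).

    A miRNA missing from a sample gets a count of 0.
    """
    samples = list(per_sample)
    mirnas = sorted({name for counts in per_sample.values() for name in counts})
    matrix = [
        [per_sample[sample].get(name, 0) for sample in samples]
        for name in mirnas
    ]
    return mirnas, samples, matrix


# Invented file contents standing in for two downloaded count tables.
control_file = io.StringIO("miR_a\t10\nmiR_b\t0\n")
et1_file = io.StringIO("miR_a\t25\nmiR_c\t3\n")

per_sample = {
    "control": read_counts(control_file),
    "ET1": read_counts(et1_file),
}
mirnas, samples, matrix = build_matrix(per_sample)
```

Filling missing miRNAs with 0 keeps every sample on the same set of rows, which is what a count-based differential expression tool expects as input.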
We detected 836 known human mature miRNAs in the control-CMs and 769 in the ET1-CMs
Based on our miRNA-Seq data, we predicted 506 sequences to be potentially novel, as yet undescribed miRNAs.
In order to validate the expression profiles of the miRNAs detected, we performed RT-qPCR on a subset of five known human mature and five of our predicted novel miRNAs.
We obtained a total of 1,922 predicted miRNA-mRNA pairs represented by 309 genes and 174 known mature human miRNAs.
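One way to arrive at a summary of this shape is to intersect predicted targets with the two differential expression lists and then count the distinct pairs, miRNAs, and genes. The Python sketch below shows that idea only; the exact pairing criteria the authors used are not spelled out in these notes, so the rule here (both the miRNA and its predicted target must be differentially expressed) is my assumption, and all names and data are invented.

```python
def pair_targets(de_mirnas, de_genes, predicted_targets):
    """Return the set of (mirna, gene) pairs where both sides are in the DE lists.

    `predicted_targets` maps a miRNA name to a set of predicted target genes
    (hypothetical layout for a target-prediction table).
    """
    pairs = set()
    for mirna, targets in predicted_targets.items():
        if mirna not in de_mirnas:
            continue
        for gene in targets & de_genes:
            pairs.add((mirna, gene))
    return pairs


def summarize_pairs(pairs):
    """Return (number of pairs, number of distinct miRNAs, number of distinct genes)."""
    mirnas = {mirna for mirna, _ in pairs}
    genes = {gene for _, gene in pairs}
    return len(pairs), len(mirnas), len(genes)


# Invented inputs; none of these names come from the paper.
de_mirnas = {"miR_a", "miR_b"}
de_genes = {"GENE1", "GENE2"}
predicted_targets = {
    "miR_a": {"GENE1", "GENE3"},
    "miR_b": {"GENE1", "GENE2"},
    "miR_c": {"GENE2"},
}

pairs = pair_targets(de_mirnas, de_genes, predicted_targets)
n_pairs, n_mirnas, n_genes = summarize_pairs(pairs)
```

Because one miRNA can target many genes and one gene can be targeted by many miRNAs, the number of pairs is typically much larger than either the gene count or the miRNA count, as in the figures quoted above.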