<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>生信菜鸟团 &#187; RNA</title>
	<atom:link href="http://www.bio-info-trainee.com/tag/rna/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bio-info-trainee.com</link>
	<description>Join the discussion on the forum at biotrainee.com, or follow the WeChat official account of the same name, biotrainee.</description>
	<lastBuildDate>Sat, 28 Jun 2025 14:30:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.33</generator>
	<item>
		<title>RNA-seq pipelines need to evolve!</title>
		<link>http://www.bio-info-trainee.com/1022.html</link>
		<comments>http://www.bio-info-trainee.com/1022.html#comments</comments>
		<pubDate>Fri, 25 Sep 2015 14:46:21 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Transcriptome software]]></category>
		<category><![CDATA[RNA]]></category>
		<category><![CDATA[Pipeline]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1022</guid>
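		<description><![CDATA[TopHat was first published six years ago, and Cufflinks five years ago. St &#8230; <a href="http://www.bio-info-trainee.com/1022.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>TopHat was first published six years ago.</p>
<p>Cufflinks dates back five years as well.</p>
<p>STAR aligns about 50 times faster than TopHat, and HISAT is about 1.2 times faster again than STAR.</p>
<p>StringTie assembles about 25 times faster than Cufflinks while using less than half the memory.</p>
<p>For differential analysis, Ballgown shows higher specificity and accuracy than Cuffdiff, and takes less than one thousandth of the time.</p>
<p>For quantification, Bowtie2 + eXpress outperforms both TopHat2 + Cufflinks and Bowtie2 + RSEM.</p>
<p>Sailfish goes further and skips the alignment step altogether, quantifying directly from k-mer counts; its specificity and accuracy are acceptable, and it is about 25 times faster.</p>
<p>kallisto likewise needs no alignment, and is about 5 times faster than Sailfish!</p>
<p>Reference: <a href="https://speakerdeck.com/stephenturner/rna-seq-qc-and-data-analysis-using-the-tuxedo-suite">https://speakerdeck.com/stephenturner/rna-seq-qc-and-data-analysis-using-the-tuxedo-suite</a></p>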
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1022.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Aligning RNA-seq reads with GMAP/GSNAP</title>
		<link>http://www.bio-info-trainee.com/1016.html</link>
		<comments>http://www.bio-info-trainee.com/1016.html#comments</comments>
		<pubDate>Thu, 24 Sep 2015 14:22:13 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Transcriptome software]]></category>
		<category><![CDATA[GSNAP]]></category>
		<category><![CDATA[RNA]]></category>
		<category><![CDATA[Alignment]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1016</guid>
		<description><![CDATA[The software was published at: http://bioinformatics.oxfordjourna &#8230; <a href="http://www.bio-info-trainee.com/1016.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<div>
<div>The software was published at: <a href="http://bioinformatics.oxfordjournals.org/content/26/7/873.abstract">http://bioinformatics.oxfordjournals.org/content/26/7/873.abstract</a></div>
<p>Slides explaining the software: <a href="http://www.mi.fu-berlin.de/wiki/pub/ABI/CompMethodsWS11/MHuska_GSNAP.pdf">http://www.mi.fu-berlin.de/wiki/pub/ABI/CompMethodsWS11/MHuska_GSNAP.pdf</a></p>
<div>A worked example: <a href="http://qteller.com/RNAseq-analysis-recipe.pdf">http://qteller.com/RNAseq-analysis-recipe.pdf</a></div>
<div>A shell script: <a href="https://github.com/vsbuffalo/rna-seq-example">https://github.com/vsbuffalo/rna-seq-example</a></div>
<div>Download page: <a href="http://research-pub.gene.com/gmap/">http://research-pub.gene.com/gmap/</a></div>
<div>Some researchers consider this software's alignments better than TopHat's. Although many more RNA-seq aligners have appeared since, I will still take a quick look at it. It started out in 2005 as GMAP, a tool built for aligning low-throughput EST sequences, and later evolved into GSNAP.</div>
<div>Step 1: download and install GMAP/GSNAP</div>
<div>wget <a href="http://research-pub.gene.com/gmap/src/gmap-gsnap-2015-09-21.tar.gz">http://research-pub.gene.com/gmap/src/gmap-gsnap-2015-09-21.tar.gz</a></div>
</div>
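<div>This is a standard Linux source package; be sure to read the README before installing: <a href="http://research-pub.gene.com/gmap/src/README">http://research-pub.gene.com/gmap/src/README</a></div>
<div>Unpack it, enter the directory, and run the usual three source-install steps: first ./configure, then make, and finally make install.</div>
<div>By default it installs under /usr/local/bin. You will probably need to change this, because you may not have write permission there: install into a directory of your own, then add that directory to your PATH environment variable.</div>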
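<div>The installation note above can be sketched as follows. This is only a minimal sketch: the install location in PREFIX and the function names install_gmap and prepend_to_path are illustrative assumptions, not part of GMAP. The only GMAP-specific call is ./configure with the standard --prefix option.</div>

```shell
#!/usr/bin/env bash
# Sketch only: PREFIX and both function names are assumptions for illustration.
PREFIX="${PREFIX:-$HOME/software/gmap}"   # hypothetical user-writable location

# Build and install into PREFIX instead of the default /usr/local.
# Argument: the unpacked source directory.
install_gmap() {
    local src_dir="$1"
    ( cd "$src_dir" || exit 1
      ./configure --prefix="$PREFIX" || exit 1
      make || exit 1
      make install )
}

# Prepend a directory to PATH unless it is already listed.
prepend_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

prepend_to_path "$PREFIX/bin"
```

<div>Call install_gmap with the unpacked source directory once, and put the prepend_to_path line (or an equivalent export) in your shell startup file so that gmap, gsnap and gmap_build are found in later sessions.</div>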
<div></div>
<div>Step 2: prepare the data</div>
<div>Alignment usually needs only two inputs: an indexed reference genome and the sequencing reads to be aligned.</div>
<div>In this workflow, GSNAP is additionally given the matching GTF annotation file, from which known splice sites are extracted.</div>
<div>First, the reference genome. The software does provide a pre-indexed hg19 reference genome, <a href="http://research-pub.gene.com/gmap/genomes/hg19.tar.gz">Human genome, version hg19 (5.5 GB)</a> (http://research-pub.gene.com/gmap/genomes/hg19.tar.gz), but the download is slow and the index does not work with every GSNAP version, so here I index my own reference genome instead.</div>
<div>gmap_build -D ./ -d my_hg19 my_hg19.fa</div>
<div>Then go to Ensembl and download the hg19 GTF file.</div>
<div>The downloaded GTF file also has to be indexed, which takes two steps:</div>
<div>cat my_hg19.gtf | ~/software/gmap-2011-10-16/util/gtf_splicesites &gt; my_hg19.splicesites</div>
<div>cat my_hg19.splicesites | iit_store -o my_hg19.gtf.index</div>
<div>Then copy over the RNA-seq read files to be aligned.</div>
<div></div>
<div>Step 3: run the program</div>
<div>This is just a single alignment step:</div>
<div>
<div data-canvas-width="408.51666666666677">gsnap</div>
<div data-canvas-width="408.51666666666677">-D /home/jschnable/gsnap_indexes/</div>
<div data-canvas-width="408.51666666666677">-d arabidopsisv10</div>
<div data-canvas-width="11.05">--nthreads=50</div>
<div data-canvas-width="11.05">-B 5</div>
<div data-canvas-width="11.05">-s  /home/jschnable/gsnap_indexes/arabidopsisv10.iit</div>
<div>-n 2</div>
<div>-Q</div>
<div>--nofails</div>
<div data-canvas-width="11.2">--format=sam temp.fastq</div>
<div data-canvas-width="88.01666666666667">&gt; results.sam</div>
<div data-canvas-width="88.01666666666667">There are quite a few parameters, so check the documentation yourself; <a href="http://qteller.com/RNAseq-analysis-recipe.pdf">http://qteller.com/RNAseq-analysis-recipe.pdf</a> explains them in great detail.</div>
</div>
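<div>Because the command above has many options, it can help to assemble it in one place and inspect it before running. The following is a minimal dry-run sketch: the function name gsnap_cmd and its argument order are assumptions made for illustration, and the flags are simply the ones used in the command above. It only prints the command line; it does not call gsnap.</div>

```shell
#!/usr/bin/env bash
# Sketch only: gsnap_cmd is a hypothetical helper that prints, but does not
# run, a gsnap command using the same flags as the example above.
# Arguments: index directory, genome name, splice-site file, threads, FASTQ.
gsnap_cmd() {
    local index_dir="$1" genome="$2" splicesites="$3" threads="$4" fastq="$5"
    local cmd=(gsnap -D "$index_dir" -d "$genome"
               --nthreads="$threads" -B 5
               -s "$splicesites" -n 2 -Q --nofails
               --format=sam "$fastq")
    echo "${cmd[*]}"
}

gsnap_cmd /home/jschnable/gsnap_indexes/ arabidopsisv10 \
    /home/jschnable/gsnap_indexes/arabidopsisv10.iit 50 temp.fastq
```

<div>Once the printed line looks right, it can be run in the shell with the output redirected to results.sam, as in the original command. A thread count of 50 only makes sense on a machine with that many cores.</div>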
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1016.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A complete RNA-seq study guide!</title>
		<link>http://www.bio-info-trainee.com/703.html</link>
		<comments>http://www.bio-info-trainee.com/703.html#comments</comments>
		<pubDate>Tue, 05 May 2015 04:57:08 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Miscellany and notes]]></category>
		<category><![CDATA[RNA]]></category>
		<category><![CDATA[Transcriptome]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=703</guid>
		<description><![CDATA[This takes about two months! If the cloud-drive links inside have expired, contact me directly at 1227278128, or my &#8230; <a href="http://www.bio-info-trainee.com/703.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<h3>This takes about two months! If the cloud-drive links inside have expired, contact me directly at 1227278128 or through my group 201161227. All the resources can be found here: <a href="http://pan.baidu.com/s/1jIvwRD8" target="_blank">http://pan.baidu.com/s/1jIvwRD8</a></h3>
<p>A search turns up a great many pipelines; here I briefly share some of the material I found earlier.</p>
<p>&nbsp;</p>
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/05/RNA-seq完整学习手册141.png"><img class="alignnone size-full wp-image-704" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/05/RNA-seq完整学习手册141.png" alt="RNA-seq完整学习手册141" width="554" height="332" /></a></p>
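<p>Peking University also has material explaining the principles of RNA-seq.</p>
<p>Link: http://pan.baidu.com/s/1kTmWmv9 Password: 6yaz</p>
<p>I even have a BGI training course! It is a five-day training tutorial, and as I recall the materials cost more than 5,000 yuan at the time!</p>
<p>Link: http://pan.baidu.com/s/1nt5OV5B Password: gyul</p>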
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/05/RNA-seq完整学习手册294.png"><img class="alignnone size-full wp-image-705" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/05/RNA-seq完整学习手册294.png" alt="RNA-seq完整学习手册294" width="263" height="157" /></a></p>
<p>Youku also has videos; search for them yourself.</p>
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/05/RNA-seq完整学习手册312.png"><img class="alignnone size-full wp-image-706" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/05/RNA-seq完整学习手册312.png" alt="RNA-seq完整学习手册312" width="410" height="254" /></a></p>
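<p>Then there are a few pipelines, that is, bioinformatics analysis workflows. Even if you know nothing yet, following a pipeline step by step should not be a problem.</p>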
<p>export PATH=/share/software/bin:$PATH</p>
<p>bowtie2-build ./data/GRCh37_chr21.fa  chr21</p>
<p>tophat -p 1 -G ./data/genes.gtf -o P460.thout chr21 ./data/P460_R1.fq  ./data/P460_R2.fq</p>
<p>tophat -p 1 -G ./data/genes.gtf -o C460.thout chr21 ./data/C460_R1.fq  ./data/C460_R2.fq</p>
<p>cufflinks -p 1 -o P460.clout P460.thout/accepted_hits.bam</p>
<p>cufflinks -p 1 -o C460.clout C460.thout/accepted_hits.bam</p>
<p>samtools  view  -h  P460.thout/accepted_hits.bam  &gt;  P460.thout/accepted_hits.sam</p>
<p>samtools  view  -h  C460.thout/accepted_hits.bam  &gt;  C460.thout/accepted_hits.sam</p>
<p>echo ./P460.clout/transcripts.gtf &gt; assemblies.txt</p>
<p>echo ./C460.clout/transcripts.gtf &gt;&gt; assemblies.txt</p>
<p>cuffmerge -p 1 -g ./data/genes.gtf -s ./data/GRCh37_chr21.fa  assemblies.txt</p>
<p>cuffdiff -p 1 -u merged_asm/merged.gtf  -b ./data/GRCh37_chr21.fa  -L P460,C460 -o P460-C460.diffout P460.thout/accepted_hits.bam C460.thout/accepted_hits.bam</p>
<p>samtools  index  P460.thout/accepted_hits.bam</p>
<p>samtools  index  C460.thout/accepted_hits.bam</p>
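<p>The two echo lines that build assemblies.txt in the pipeline above grow awkward with more samples. A minimal sketch of the same step as a loop follows; the function name write_assemblies is an illustrative assumption, while the sample names and the .clout/transcripts.gtf layout are taken from the commands above.</p>

```shell
#!/usr/bin/env bash
# Sketch only: write_assemblies is a hypothetical helper. It prints one
# Cufflinks assembly path per sample, one per line, as cuffmerge expects.
write_assemblies() {
    local sample
    for sample in "$@"; do
        echo "./${sample}.clout/transcripts.gtf"
    done
}

write_assemblies P460 C460 > assemblies.txt
cat assemblies.txt
```

<p>Adding a third sample then only means adding one more name to the write_assemblies call.</p>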
<p>&nbsp;</p>
<p>And another one:</p>
<p>#!/bin/bash</p>
<p># Approx 75-80m to complete as a script</p>
<p>cd ~/RNA-seq</p>
<p>ls -l data</p>
<p>&nbsp;</p>
<p>tophat --help</p>
<p>&nbsp;</p>
<p>head -n 20 data/2cells_1.fastq</p>
<p>&nbsp;</p>
<p>time tophat --solexa-quals \</p>
<p>-g 2 \</p>
<p>--library-type fr-unstranded \</p>
<p>-j annotation/Danio_rerio.Zv9.66.spliceSites \</p>
<p>-o tophat/ZV9_2cells \</p>
<p>genome/ZV9 \</p>
<p>data/2cells_1.fastq data/2cells_2.fastq                  # 17m30s</p>
<p>&nbsp;</p>
<p>time tophat --solexa-quals \</p>
<p>-g 2 \</p>
<p>--library-type fr-unstranded \</p>
<p>-j annotation/Danio_rerio.Zv9.66.spliceSites \</p>
<p>-o tophat/ZV9_6h \</p>
<p>genome/ZV9 \</p>
<p>data/6h_1.fastq data/6h_2.fastq                          # 17m30s</p>
<p>&nbsp;</p>
<p>samtools index tophat/ZV9_2cells/accepted_hits.bam</p>
<p>samtools index tophat/ZV9_6h/accepted_hits.bam</p>
<p>&nbsp;</p>
<p>cufflinks --help</p>
<p>time cufflinks  -o cufflinks/ZV9_2cells_gff \</p>
<p>-G annotation/Danio_rerio.Zv9.66.gtf \</p>
<p>-b genome/Danio_rerio.Zv9.66.dna.fa \</p>
<p>-u \</p>
<p>--library-type fr-unstranded \</p>
<p>tophat/ZV9_2cells/accepted_hits.bam                  # 2m</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>time cufflinks  -o cufflinks/ZV9_6h_gff \</p>
<p>-G annotation/Danio_rerio.Zv9.66.gtf \</p>
<p>-b genome/Danio_rerio.Zv9.66.dna.fa \</p>
<p>-u \</p>
<p>--library-type fr-unstranded \</p>
<p>tophat/ZV9_6h/accepted_hits.bam                      # 2m</p>
<p>&nbsp;</p>
<p># guided assembly</p>
<p>time cufflinks  -o cufflinks/ZV9_2cells \</p>
<p>-g annotation/Danio_rerio.Zv9.66.gtf \</p>
<p>-b genome/Danio_rerio.Zv9.66.dna.fa \</p>
<p>-u \</p>
<p>--library-type fr-unstranded \</p>
<p>tophat/ZV9_2cells/accepted_hits.bam                  # 16m</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>time cufflinks  -o cufflinks/ZV9_6h \</p>
<p>-g annotation/Danio_rerio.Zv9.66.gtf \</p>
<p>-b genome/Danio_rerio.Zv9.66.dna.fa \</p>
<p>-u \</p>
<p>--library-type fr-unstranded \</p>
<p>tophat/ZV9_6h/accepted_hits.bam                      # 13m</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>time cuffdiff -o cuffdiff/ \</p>
<p>-L ZV9_2cells,ZV9_6h \</p>
<p>-T \</p>
<p>-b genome/Danio_rerio.Zv9.66.dna.fa \</p>
<p>-u \</p>
<p>--library-type fr-unstranded \</p>
<p>annotation/Danio_rerio.Zv9.66.gtf \</p>
<p>tophat/ZV9_2cells/accepted_hits.bam \</p>
<p>tophat/ZV9_6h/accepted_hits.bam                        # 7m</p>
<p>&nbsp;</p>
<p>head -n 20 cuffdiff/gene_exp.diff</p>
<p>&nbsp;</p>
<p>sort -t$'\t' -g -k 13 cuffdiff/gene_exp.diff \</p>
<p>&gt; cuffdiff/gene_exp_qval.sorted.diff</p>
<p>&nbsp;</p>
<p>head -n 20 cuffdiff/gene_exp_qval.sorted.diff</p>
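<p>The sort command in the script above sorts the header line together with the data rows. A minimal sketch that keeps the header on top and sorts only the data by q-value (column 13) is shown here; the function name sort_by_qvalue is an illustrative assumption, and the small demo table contains made-up placeholder values, not real results.</p>

```shell
#!/usr/bin/env bash
# Sketch only: sort_by_qvalue is a hypothetical helper. It prints the header
# row first, then the remaining rows sorted numerically by column 13.
sort_by_qvalue() {
    local table="$1"
    head -n 1 "$table"
    tail -n +2 "$table" | sort -t$'\t' -g -k 13,13
}

# Tiny made-up table in the gene_exp.diff column layout (placeholder values).
demo=$(mktemp)
printf 'test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\tvalue_1\tvalue_2\tlog2(fold_change)\ttest_stat\tp_value\tq_value\tsignificant\n' > "$demo"
printf 'g1\tg1\tA\tchr1:1-2\tZV9_2cells\tZV9_6h\tOK\t1\t2\t1\t0.5\t0.2\t0.4\tno\n' >> "$demo"
printf 'g2\tg2\tB\tchr1:3-4\tZV9_2cells\tZV9_6h\tOK\t1\t8\t3\t4.0\t0.0001\t0.001\tyes\n' >> "$demo"

sort_by_qvalue "$demo"
```

<p>On real data, pass cuffdiff/gene_exp.diff instead of the demo file and redirect the output to cuffdiff/gene_exp_qval.sorted.diff.</p>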
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/703.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Transcriptome: annotating Trinity results with TransDecoder</title>
		<link>http://www.bio-info-trainee.com/346.html</link>
		<comments>http://www.bio-info-trainee.com/346.html#comments</comments>
		<pubDate>Thu, 19 Mar 2015 12:38:33 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Transcriptome software]]></category>
		<category><![CDATA[ORF]]></category>
		<category><![CDATA[RNA]]></category>
		<category><![CDATA[trinity]]></category>
		<category><![CDATA[Annotation]]></category>
		<category><![CDATA[Prediction]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=346</guid>
		<description><![CDATA[   1: Download and install the software. Download and install the software:  wget https://code &#8230; <a href="http://www.bio-info-trainee.com/346.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><b>   1: Download and install the software</b></p>
<p>Download the software:  wget <a href="https://codeload.github.com/TransDecoder/TransDecoder/tar.gz/2.0.1">https://codeload.github.com/TransDecoder/TransDecoder/tar.gz/2.0.1</a></p>
<p>Unpack it, enter the directory, and look at the files inside.</p>
<p>Run make and it is ready to use; it appears to depend on Perl modules.</p>
<p><b><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/03/转录组-TransDecoder-预测ORF420.png"><img class="alignnone size-full wp-image-347" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/03/转录组-TransDecoder-预测ORF420.png" alt="转录组-TransDecoder-预测ORF420" width="284" height="171" /></a></b></p>
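<p>TransDecoder.LongOrfs is the program we need this time. Looking at it, it is indeed a Perl script, so Perl is still quite useful.</p>
<p><b>2: Prepare the data</b></p>
<p><b>It ships with test data that is fairly comprehensive and also fairly complex, so I will not paste it here. In my case, I use the FASTA-format transcriptome assembled by Trinity to predict ORFs.</b></p>
<p><b>3: Run the command</b></p>
<p><b>The test commands it provides are also rather involved:</b></p>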
<p><b>## generate alignment gff3 formatted output</b></p>
<p><b>../util/cufflinks_gtf_to_alignment_gff3.pl transcripts.gtf &gt; transcripts.gff3</b></p>
<p><b> </b></p>
<p><b>## generate transcripts fasta file</b></p>
<p><b>../util/cufflinks_gtf_genome_to_cdna_fasta.pl transcripts.gtf test.genome.fasta &gt; transcripts.fasta </b></p>
<p><b> </b></p>
<p><b>## Extract the long ORFs</b></p>
<p><b>../TransDecoder.LongOrfs -t transcripts.fasta</b></p>
<p><b>Of course, we only need to look at the last step; that is the key part.</b></p>
<p>Here I predict ORFs directly on our Trinity-assembled transcripts:</p>
<p>/home/jmzeng/bio-soft/TransDecoder/TransDecoder.LongOrfs  -t Trinity.fasta</p>
<p>The command is very simple.</p>
<p><b><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/03/转录组-TransDecoder-预测ORF1471.png"><img class="alignnone size-full wp-image-349" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/03/转录组-TransDecoder-预测ORF1471.png" alt="转录组-TransDecoder-预测ORF1471" width="907" height="120" /></a></b></p>
<p>The output includes the predicted protein file, which Trinotate requires in order to annotate the transcripts.</p>
<p><b><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/03/转录组-TransDecoder-预测ORF1714.png"><img class="alignnone size-full wp-image-350" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/03/转录组-TransDecoder-预测ORF1714.png" alt="转录组-TransDecoder-预测ORF1714" width="266" height="74" /></a></b></p>
<p><b> </b></p>
<p><b>4: Interpreting the output files</b></p>
<p><b>longest_orfs.cds  </b><b>the predicted CDS nucleotide sequences</b></p>
<p><b>longest_orfs.gff3  </b><b>the predicted ORFs in GFF3 format</b></p>
<p><b>longest_orfs.pep</b><b>   the predicted protein sequences</b></p>
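<p>A quick sanity check on these outputs is to count how many ORFs were predicted. The following is a minimal sketch that counts FASTA records by their header lines; the function name count_fasta_records is an illustrative assumption, and the two-record demo file is made up for the example.</p>

```shell
#!/usr/bin/env bash
# Sketch only: count_fasta_records is a hypothetical helper that counts
# FASTA records, i.e. lines that start with the '>' header marker.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Made-up two-record peptide file, for illustration only.
demo=$(mktemp)
printf '>orf1\nMKT*\n>orf2\nMAA*\n' > "$demo"

count_fasta_records "$demo"
```

<p>Running the same function on longest_orfs.pep and longest_orfs.cds should give the same number, since each predicted ORF has one peptide and one CDS record.</p>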
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/346.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
