<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>生信菜鸟团 &#187; limma</title>
	<atom:link href="http://www.bio-info-trainee.com/tag/limma/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bio-info-trainee.com</link>
	<description>Join the discussion on the biotrainee.com forum, or follow the WeChat official account of the same name, biotrainee.</description>
	<lastBuildDate>Sat, 28 Jun 2025 14:30:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.33</generator>
	<item>
		<title>Tutorials I have written with rmarkdown</title>
		<link>http://www.bio-info-trainee.com/2372.html</link>
		<comments>http://www.bio-info-trainee.com/2372.html#comments</comments>
		<pubDate>Wed, 15 Mar 2017 09:16:05 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[DESeq2]]></category>
		<category><![CDATA[GEOquery]]></category>
		<category><![CDATA[limma]]></category>
		<category><![CDATA[rmarkdown]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=2372</guid>
		<description><![CDATA[Writing tutorials with rmarkdown is really convenient, especially for R-related topics such as applications of R packages, &#8230; <a href="http://www.bio-info-trainee.com/2372.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
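				<content:encoded><![CDATA[<div>Writing tutorials with rmarkdown is really convenient, especially for R-related topics such as R package usage, visualization, or statistics. Below is a short list of tutorials I have written this way. They combine text and figures, and best of all they take little effort: no manual layout, no uploading or organizing images.</div>
<p>In most cases the file name at the end of each link tells you what the article covers:</p>
<div>First, walkthroughs of several R packages:<br />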
<a href="http://www.bio-info-trainee.com/bioconductor_China/software/limma.html" target="_blank">http://www.bio-info-trainee.com/ ... software/limma.html</a><br />
<a href="http://www.bio-info-trainee.com/bioconductor_China/software/DESeq2.html" target="_blank">http://www.bio-info-trainee.com/ ... oftware/DESeq2.html</a><br />
<a href="http://www.bio-info-trainee.com/bioconductor_China/software/GEOquery.html" target="_blank">http://www.bio-info-trainee.com/ ... tware/GEOquery.html</a><br />
<a href="http://www.bio-info-trainee.com/bioconductor_China/software/limma_voom.html" target="_blank">http://www.bio-info-trainee.com/ ... are/limma_voom.html</a><br />
Occasionally I also write tutorials for packages that are not part of Bioconductor:<br />
<a href="http://www.bio-info-trainee.com/bioconductor_China/software/GOplot.html" target="_blank">http://www.bio-info-trainee.com/ ... oftware/GOplot.html</a><br />
<a href="http://www.bio-info-trainee.com/bioconductor_China/software/Rcircos.html" target="_blank">http://www.bio-info-trainee.com/ ... ftware/Rcircos.html</a></div>
<p><span id="more-2372"></span></p>
<div></div>
<div>Here is a statistics tutorial on logical analysis:</div>
<div><a href="http://www.bio-info-trainee.com/tmp/tutorial_for_logical_analysis.html">http://www.bio-info-trainee.com/tmp/tutorial_for_logical_analysis.html</a></div>
<div>Here is how to draw 15 common visualizations of an expression matrix:</div>
<div><a href="http://bio-info-trainee.com/tmp/basic_visualization_for_expression_matrix.html">http://bio-info-trainee.com/tmp/basic_visualization_for_expression_matrix.html</a></div>
<div></div>
<div>
<h1 class="title toc-ignore">Plotting COSMIC mutation signatures with deconstructSigs</h1>
</div>
<div><a href="http://biotrainee.com/jmzeng/markdown/deconstuctSigs.html" target="_blank">http://biotrainee.com/jmzeng/markdown/deconstuctSigs.html</a></div>
<div></div>
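<div>This is the most complete ANOVA tutorial I have come across. I did not write it, but it is excellent, so I will not duplicate the effort:</div>
<div><a href="http://biotrainee.com/jmzeng/markdown/ANOVA.html" target="_blank">http://biotrainee.com/jmzeng/markdown/ANOVA.html</a> I recommend reading it.</div>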
<div></div>
<div>
<h1 class="title toc-ignore">Table of contents of a standard genetic testing report  <a href="http://www.biotrainee.com/jmzeng/blogMyGenome/name_introduction.html" target="_blank">http://www.biotrainee.com/jmzeng/blogMyGenome/name_introduction.html</a></h1>
</div>
<div></div>
<div></div>
<div></div>
<h1><strong><span style="color: #ff0000;">Below is a collection of final project reports for high-throughput sequencing analyses:</span></strong></h1>
<div></div>
<div> Simple <span style="color: #6e8b3d;">RNA-seq</span> final project report</div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/Ref_RNAseq_result/index.html" target="_blank">http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/Ref_RNAseq_result/index.html</a></div>
<div></div>
<div>
<div>16S rDNA hypervariable region sequencing: final project report</div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/16sRNA/index.html">http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/16sRNA/index.html</a></div>
<div></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/MetaGenome_result/index.html">Example metagenome analysis final report</a></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/MetaGenome_result/index.html">http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/MetaGenome_result/index.html</a></div>
<div></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/Pacbio_Genome_result/index.html">Example bacterial genome analysis final report</a></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/Pacbio_Genome_result/index.html">http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/Pacbio_Genome_result/index.html</a></div>
<div></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/SmallRNA_result/index.html">Example small RNA final project report</a></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/SmallRNA_result/index.html">http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/SmallRNA_result/index.html</a></div>
<div></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/lncRNA_result/index.html">Example lncRNA final project report</a></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/lncRNA_result/index.html">http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/lncRNA_result/index.html</a></div>
<div></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/chip-report/index.html">Example ChIP-Seq final report</a></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/chip-report/index.html">http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/chip-report/index.html</a></div>
<div></div>
<div></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/Denovo_transcriptome/index.html">Example de novo transcriptome sequencing final project report</a></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/Denovo_transcriptome/index.html">http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/Denovo_transcriptome/index.html</a></div>
<div></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/WGCNA_Traits_result/index.html">Example WGCNA analysis final report</a></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/WGCNA_Traits_result/index.html">http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/WGCNA_Traits_result/index.html</a></div>
<div></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/iTRAQ_Result/index.html">iTRAQ protein quantification final project report</a></div>
<div><a href="http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/iTRAQ_Result/index.html">http://www.biotrainee.com/jmzeng/html_report/d/e/e/p/i/n/iTRAQ_Result/index.html</a></div>
<div></div>
</div>
<div></div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/2372.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Self-taught miRNA-seq analysis, part 7: obtaining mRNA expression for the miRNA-paired samples</title>
		<link>http://www.bio-info-trainee.com/1716.html</link>
		<comments>http://www.bio-info-trainee.com/1716.html#comments</comments>
		<pubDate>Fri, 01 Jul 2016 15:57:59 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>
		<category><![CDATA[hgu133plus2]]></category>
		<category><![CDATA[limma]]></category>
		<category><![CDATA[miRNA-seq]]></category>
		<category><![CDATA[差异分析]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1716</guid>
		<description><![CDATA[This part is not really miRNA-seq analysis; in essence it is Affymetrix mR &#8230; <a href="http://www.bio-info-trainee.com/1716.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>This part is not really miRNA-seq analysis. In essence it is an analysis of Affymetrix mRNA expression array data, on the most widely used platform at that (GPL570, HG-U133_Plus_2). Because these arrays were run on samples paired with the miRNA samples, and both sets of results will later feed into co-expression network analysis and similar steps, I am posting my analysis of the array data here. The paper states that "Messenger RNA expression analysis identified 731 probe sets with significant differential expression", and the authors' list of significant genes is available here:<span id="more-1716"></span>## <a href="http://journals.plos.org/plosone/article/asset?unique&amp;id=info:doi/10.1371/journal.pone.0108051.s002">http://journals.plos.org/plosone/article/asset?unique&amp;id=info:doi/10.1371/journal.pone.0108051.s002</a><br />
## mRNA expression array - GSE60291  (Affymetrix Human Genome U133 Plus 2.0 Array)</p>
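<p>hgu133plus2 array data is very common. You can download the raw array data for this study from GEO and process it with the affy and limma packages, or use the GEOquery package to download the expression matrix the authors already processed and run the differential analysis directly. I chose the latter. My approach differs from the authors' in one respect: I first annotate every probe with its gene, then keep only the probe with the highest mean expression for each gene. The authors instead ran the differential analysis on the probe-level expression matrix and annotated the probes in the results afterwards. I cannot say definitively which approach is better. The code is as follows:</p>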
<blockquote><p>rm(list=ls())<br />
library(GEOquery)<br />
library(limma)<br />
GSE60291 &lt;- getGEO('GSE60291', destdir=".",getGPL = FALSE)</p>
<p># Expression matrix<br />
<strong><span style="color: #ff0000;">exprSet</span></strong>=exprs(GSE60291[[1]])<br />
library("annotate")<br />
GSE60291[[1]]<br />
## Group information<br />
pdata=pData(GSE60291[[1]])<br />
<span style="color: #ff0000;"><strong>treatment</strong></span>=factor(unlist(lapply(pdata$title,function(x) strsplit(as.character(x),"-")[[1]][1])))<br />
#treatment=relevel(treatment,'control')<br />
## Annotate probes with gene symbols<br />
platformDB='hgu133plus2.db'<br />
library(platformDB, character.only=TRUE)<br />
probeset &lt;- featureNames(GSE60291[[1]])<br />
#EGID &lt;- as.numeric(lookUp(probeset, platformDB, "ENTREZID"))<br />
SYMBOL &lt;-  lookUp(probeset, platformDB, "SYMBOL")<br />
## For each gene, keep the probe with the highest mean expression<br />
a=cbind(SYMBOL,exprSet)<br />
## remove the duplicated probeset<br />
rmDupID &lt;-function(a=matrix(c(1,1:5,2,2:6,2,3:7),ncol=6)){<br />
exprSet=a[,-1]<br />
rowMeans=apply(exprSet,1,function(x) mean(as.numeric(x),na.rm=T))<br />
a=a[order(rowMeans,decreasing=T),]<br />
exprSet=a[!duplicated(a[,1]),]<br />
#<br />
exprSet=exprSet[!is.na(exprSet[,1]),]<br />
rownames(exprSet)=exprSet[,1]<br />
exprSet=exprSet[,-1]<br />
return(exprSet)<br />
}<br />
exprSet=rmDupID(a)<br />
rn=rownames(exprSet)<br />
exprSet=apply(exprSet,2,as.numeric)<br />
rownames(exprSet)=rn<br />
exprSet[1:4,1:4]<br />
#exprSet=log(exprSet) ## based on e<br />
boxplot(exprSet,las=2)<br />
## Differential analysis of the array data with limma<br />
design=model.matrix(~ treatment)<br />
fit=lmFit(exprSet,design)<br />
fit=eBayes(fit)<br />
#vennDiagram(decideTests(fit))<br />
DEG=topTable(fit,coef=2,n=Inf,adjust='BH')<br />
dim(DEG[abs(DEG$logFC)&gt;1.2 &amp; DEG$adj.P.Val&lt;0.05,])  ## 806 genes<br />
write.csv(DEG,"ET1-normal.DEG.csv")</p></blockquote>
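<p>The probe-collapsing step in the code above can be checked on a small example. The following is only a sketch in base R under stated assumptions: the matrix values, the probe-to-symbol mapping, and the helper name collapse_probes are all made up for illustration and are not part of annotate or limma.</p>
<blockquote>
<pre class="r"><code class="r"># Toy probe-level matrix: 4 probes x 3 samples (made-up values)
expr &lt;- matrix(c(5, 6, 7,
                 9, 8, 10,
                 2, 3, 1,
                 4, 4, 4),
               nrow = 4, byrow = TRUE,
               dimnames = list(c("p1", "p2", "p3", "p4"), c("s1", "s2", "s3")))
# Hypothetical annotation: two probes for TP53, one without a symbol
symbol &lt;- c(p1 = "TP53", p2 = "TP53", p3 = "EGFR", p4 = NA)

# Hypothetical helper: keep one probe per gene, the one with the highest mean
collapse_probes &lt;- function(expr, symbol) {
  keep &lt;- !is.na(symbol)                            # drop unannotated probes
  expr &lt;- expr[keep, , drop = FALSE]
  symbol &lt;- symbol[keep]
  ord &lt;- order(rowMeans(expr), decreasing = TRUE)   # highest mean first
  expr &lt;- expr[ord, , drop = FALSE]
  symbol &lt;- symbol[ord]
  first &lt;- !duplicated(symbol)                      # first hit per gene wins
  out &lt;- expr[first, , drop = FALSE]
  rownames(out) &lt;- unname(symbol[first])
  out
}

gene_expr &lt;- collapse_probes(expr, symbol)
gene_expr</code></pre>
</blockquote>
<p>In this toy case probe p2 represents TP53 because its mean is higher than that of p1, and p4 is dropped because it has no symbol.</p>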
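<p>The resulting file ET1-normal.DEG.csv holds our differential analysis results. Compared with the results published with the paper, they are almost identical.</p>
<p>Filtering on |logFC| &gt; 1.2 and adjusted P value &lt; 0.05 yields 806 genes.</p>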
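<p>The two cutoffs can be applied by column name, which is easier to read than column positions. This is a minimal sketch on a made-up table that only borrows the logFC and adj.P.Val column names from topTable output; the gene names and numbers are invented.</p>
<blockquote>
<pre class="r"><code class="r"># Toy table with topTable-style column names (made-up values)
DEG &lt;- data.frame(
  logFC     = c(2.1, -1.5, 0.4, 1.3, -0.2),
  adj.P.Val = c(0.001, 0.03, 0.02, 0.20, 0.80),
  row.names = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

lfc_cut &lt;- 1.2    # |logFC| cutoff used in the post
p_cut   &lt;- 0.05   # adjusted P value cutoff used in the post

# A gene must pass both filters
is_sig &lt;- abs(DEG$logFC) &gt; lfc_cut &amp; DEG$adj.P.Val &lt; p_cut
sig &lt;- DEG[is_sig, ]
rownames(sig)

# Split the hits by direction of change
table(direction = ifelse(sig$logFC &gt; 0, "up", "down"))</code></pre>
</blockquote>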
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1716.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Differential analysis of microarray data with the samr package</title>
		<link>http://www.bio-info-trainee.com/1608.html</link>
		<comments>http://www.bio-info-trainee.com/1608.html#comments</comments>
		<pubDate>Thu, 05 May 2016 11:43:04 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[基础数据库]]></category>
		<category><![CDATA[基础软件]]></category>
		<category><![CDATA[bioconductor]]></category>
		<category><![CDATA[limma]]></category>
		<category><![CDATA[samr]]></category>
		<category><![CDATA[差异分析]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1608</guid>
		<description><![CDATA[There are already plenty of tools and packages for differential analysis, and limma is very mature, so I had not &#8230; <a href="http://www.bio-info-trainee.com/1608.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
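				<content:encoded><![CDATA[<blockquote><p>There are already plenty of tools and packages for differential analysis, and limma is very mature, so I had not planned to cover this one. But a fellow student asked about it, so I gave it a quick test and took the chance to see how it differs from limma. Here is a record of that test.</p></blockquote>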
<blockquote><p>Learning a package is actually simple: find its official page and read the manual. <a href="https://cran.r-project.org/web/packages/samr/samr.pdf">Link to the manual</a></p>
<p>&nbsp;</p></blockquote>
<p><span id="more-1608"></span></p>
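<p>The samr package is even simpler. It centers on one method, <strong>SAM</strong>, exposed through different functions depending on the data: <strong>SAMseq</strong> for high-throughput sequencing data and <strong>samr</strong> for microarray data. Here I only cover microarray data, followed by a brief comparison with limma.</p>
<p>So all we need is to prepare the data and learn how to call samr.</p>
<p>As before, we use the example data from the CLL package, starting with the expression matrix and the group information.</p>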
<blockquote>
<pre class="r"><code class="r"><span class="identifier">suppressPackageStartupMessages</span><span class="paren">(</span><span class="keyword">library</span><span class="paren">(</span><span class="identifier">CLL</span><span class="paren">)</span><span class="paren">)</span>
<span class="identifier">data</span><span class="paren">(</span><span class="identifier">sCLLex</span><span class="paren">)</span>
<span class="identifier">exprSet</span><span class="operator">=</span><span class="identifier">exprs</span><span class="paren">(</span><span class="identifier">sCLLex</span><span class="paren">)</span>   <span class="comment">## sCLLex is a dataset object shipped with the CLL package</span>
<span class="identifier">samples</span><span class="operator">=</span><span class="identifier">sampleNames</span><span class="paren">(</span><span class="identifier">sCLLex</span><span class="paren">)</span>
<span class="identifier">pdata</span><span class="operator">=</span><span class="identifier">pData</span><span class="paren">(</span><span class="identifier">sCLLex</span><span class="paren">)</span>
<span class="identifier">group_list</span><span class="operator">=</span><span class="identifier">as.character</span><span class="paren">(</span><span class="identifier">pdata</span><span class="paren">[</span>,<span class="number">2</span><span class="paren">]</span><span class="paren">)</span>
<span class="identifier">group_list</span></code></pre>
<pre><code>##  [1] "progres." "stable"   "progres." "progres." "progres." "progres."
##  [7] "stable"   "stable"   "progres." "stable"   "progres." "stable"  
## [13] "progres." "stable"   "stable"   "progres." "progres." "progres."
## [19] "progres." "progres." "progres." "stable"</code></pre>
<pre class="r"><code class="r"><span class="identifier">as.numeric</span><span class="paren">(</span><span class="identifier">as.factor</span><span class="paren">(</span><span class="identifier">group_list</span><span class="paren">)</span><span class="paren">)</span></code></pre>
<pre><code>##  [1] 1 2 1 1 1 1 2 2 1 2 1 2 1 2 2 1 1 1 1 1 1 2</code></pre>
</blockquote>
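<p>The expression matrix exprSet and the group information group_list can now go straight into the differential analysis. samr is somewhat particular about the group input: it needs a numeric vector such as 1,1,1,2,2,2, so I use as.numeric(as.factor(group_list)); see the code below.</p>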
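<p>It helps to see which label ends up as 1 and which as 2. A small base R sketch with toy labels (not the full sCLLex vector) shows that the codes follow the factor level order, which is alphabetical unless levels are given explicitly:</p>
<blockquote>
<pre class="r"><code class="r">group_list &lt;- c("progres.", "stable", "progres.", "stable")  # toy labels

g &lt;- factor(group_list)
levels(g)             # sorted alphabetically: "progres." comes before "stable"
y &lt;- as.integer(g)    # 1 = first level, 2 = second level
y

# Giving the levels explicitly makes the coding independent of sort order
y2 &lt;- as.integer(factor(group_list, levels = c("stable", "progres.")))
table(label = group_list, code = y2)</code></pre>
</blockquote>
<p>Checking this mapping before running the analysis makes it easier to interpret the sign of the resulting statistics.</p>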
<blockquote>
<pre class="r"><code class="r"><span class="identifier">suppressPackageStartupMessages</span><span class="paren">(</span><span class="keyword">library</span><span class="paren">(</span><span class="identifier">samr</span><span class="paren">)</span><span class="paren">)</span>
<span class="identifier">data</span><span class="operator">=</span><span class="identifier">list</span><span class="paren">(</span><span class="identifier">x</span><span class="operator">=</span><span class="identifier">exprSet</span>,<span class="identifier">y</span><span class="operator">=</span><span class="identifier">as.numeric</span><span class="paren">(</span><span class="identifier">as.factor</span><span class="paren">(</span><span class="identifier">group_list</span><span class="paren">)</span><span class="paren">)</span>, 
          <span class="identifier">geneid</span><span class="operator">=</span><span class="identifier">as.character</span><span class="paren">(</span><span class="number">1</span><span class="operator">:</span><span class="identifier">nrow</span><span class="paren">(</span><span class="identifier">exprSet</span><span class="paren">)</span><span class="paren">)</span>,
          <span class="identifier">genenames</span><span class="operator">=</span><span class="identifier">rownames</span><span class="paren">(</span><span class="identifier">exprSet</span><span class="paren">)</span>, 
          <span class="identifier">logged2</span><span class="operator">=</span><span class="literal">TRUE</span>
<span class="paren">)</span>
<span class="identifier">samr.obj</span><span class="operator">&lt;-</span><span class="identifier">samr</span><span class="paren">(</span><span class="identifier">data</span>, <span class="identifier">resp.type</span><span class="operator">=</span><span class="string">"Two class unpaired"</span>, <span class="identifier">nperms</span><span class="operator">=</span><span class="number">100</span><span class="paren">)</span></code></pre>
</blockquote>
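<p>That is already enough to run the analysis. What matters is how to tune the function's arguments and how to interpret what it returns (the samr.obj object is central to using samr well).</p>
<p>The genenames I pass here are really probe IDs; change that for a real analysis. I set nperms to 100, which can also be changed; 1000 is a typical value.</p>
<p>Besides finding differentially expressed genes directly, the package has several standalone functions.</p>
<p>The first normalizes the expression matrix:</p>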
<blockquote>
<pre class="r"><code class="r"><span class="identifier">x.norm</span> <span class="operator">&lt;-</span> <span class="identifier">samr.norm.data</span><span class="paren">(</span><span class="identifier">data</span><span class="operator">$</span><span class="identifier">x</span><span class="paren">)</span>
<span class="identifier">par</span><span class="paren">(</span><span class="identifier">mfrow</span><span class="operator">=</span><span class="identifier">c</span><span class="paren">(</span><span class="number">1</span>,<span class="number">2</span><span class="paren">)</span><span class="paren">)</span>
<span class="identifier">boxplot</span><span class="paren">(</span><span class="identifier">exprSet</span>, <span class="identifier">col</span> <span class="operator">=</span> <span class="identifier">rainbow</span><span class="paren">(</span><span class="identifier">ncol</span><span class="paren">(</span><span class="identifier">exprSet</span><span class="paren">)</span><span class="paren">)</span>,<span class="identifier">main</span><span class="operator">=</span><span class="string">"before normalization"</span>,<span class="identifier">las</span><span class="operator">=</span><span class="number">2</span><span class="paren">)</span>
<span class="identifier">boxplot</span><span class="paren">(</span><span class="identifier">x.norm</span>,  <span class="identifier">col</span> <span class="operator">=</span> <span class="identifier">rainbow</span><span class="paren">(</span><span class="identifier">ncol</span><span class="paren">(</span><span class="identifier">exprSet</span><span class="paren">)</span><span class="paren">)</span>,<span class="identifier">main</span><span class="operator">=</span><span class="string">"after normalization"</span>,<span class="identifier">las</span><span class="operator">=</span><span class="number">2</span><span class="paren">)</span></code></pre>
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2016/05/QQ截图20160505194154.png"><img class="alignnone size-full wp-image-1609" src="http://www.bio-info-trainee.com/wp-content/uploads/2016/05/QQ截图20160505194154.png" alt="QQ截图20160505194154" width="720" height="503" /></a></p>
</blockquote>
<p>&nbsp;</p>
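<p>Judging from the plots, normalization makes little visible difference.</p>
<p>I will not go through the other functions one by one; feel free to explore them yourself.</p>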
<p>* samr.plot(samr.obj, del, min.foldchange=0)</p>
<p>* samr.plot(samr.obj, del=.3)</p>
<p>* samr.assess.samplesize.obj&lt;- samr.assess.samplesize(samr.obj, data, log2(1.5))</p>
<p>* samr.assess.samplesize.plot(samr.assess.samplesize.obj)</p>
<p>The main question is how the differences found by samr compare with those found by limma.</p>
<blockquote>
<pre class="r"><code class="r"><span class="comment">## First, extract the p-values from the samr differential analysis</span>
<span class="identifier">pv</span><span class="operator">=</span><span class="identifier">samr.pvalues.from.perms</span><span class="paren">(</span><span class="identifier">samr.obj</span><span class="operator">$</span><span class="identifier">tt</span>, <span class="identifier">samr.obj</span><span class="operator">$</span><span class="identifier">ttstar</span><span class="paren">)</span>
<span class="comment">## Then extract the p-values from the limma differential analysis</span>
<span class="keyword">library</span><span class="paren">(</span><span class="identifier">limma</span><span class="paren">)</span> 
<span class="identifier">design</span><span class="operator">=</span><span class="identifier">model.matrix</span><span class="paren">(</span><span class="operator">~</span><span class="identifier">factor</span><span class="paren">(</span><span class="identifier">sCLLex</span><span class="operator">$</span><span class="identifier">Disease</span><span class="paren">)</span><span class="paren">)</span>
<span class="identifier">fit</span><span class="operator">=</span><span class="identifier">lmFit</span><span class="paren">(</span><span class="identifier">sCLLex</span>,<span class="identifier">design</span><span class="paren">)</span>
<span class="identifier">fit</span><span class="operator">=</span><span class="identifier">eBayes</span><span class="paren">(</span><span class="identifier">fit</span><span class="paren">)</span>
<span class="identifier">options</span><span class="paren">(</span><span class="identifier">digits</span> <span class="operator">=</span> <span class="number">4</span><span class="paren">)</span>
<span class="identifier">DEG_limma</span><span class="operator">=</span><span class="identifier">topTable</span><span class="paren">(</span><span class="identifier">fit</span>,<span class="identifier">coef</span><span class="operator">=</span><span class="number">2</span>,<span class="identifier">adjust</span><span class="operator">=</span><span class="string">'BH'</span>,<span class="identifier">n</span><span class="operator">=</span><span class="literal">Inf</span><span class="paren">)</span> 
<span class="identifier">pv_limma</span><span class="operator">=</span><span class="identifier">DEG_limma</span><span class="operator">$</span><span class="identifier">P.Value</span>
<span class="identifier">names</span><span class="paren">(</span><span class="identifier">pv_limma</span><span class="paren">)</span><span class="operator">=</span><span class="identifier">rownames</span><span class="paren">(</span><span class="identifier">DEG_limma</span><span class="paren">)</span>
<span class="identifier">head</span><span class="paren">(</span><span class="identifier">pv</span><span class="paren">[</span><span class="identifier">sort</span><span class="paren">(</span><span class="identifier">names</span><span class="paren">(</span><span class="identifier">pv</span><span class="paren">)</span><span class="paren">)</span><span class="paren">]</span><span class="paren">)</span></code></pre>
<pre><code>##  100_g_at   1000_at   1001_at 1002_f_at 1003_s_at   1004_at 
##    0.2531    0.4144    0.5671    0.5686    0.4687    0.6340</code></pre>
<pre class="r"><code class="r"><span class="identifier">head</span><span class="paren">(</span><span class="identifier">pv_limma</span><span class="paren">[</span><span class="identifier">sort</span><span class="paren">(</span><span class="identifier">names</span><span class="paren">(</span><span class="identifier">pv_limma</span><span class="paren">)</span><span class="paren">)</span><span class="paren">]</span><span class="paren">)</span></code></pre>
<pre><code>##  100_g_at   1000_at   1001_at 1002_f_at 1003_s_at   1004_at 
##    0.2497    0.4312    0.5349    0.5498    0.4361    0.6473</code></pre>
<pre class="r"><code class="r"><span class="identifier">cor</span><span class="paren">(</span><span class="identifier">pv</span><span class="paren">[</span><span class="identifier">sort</span><span class="paren">(</span><span class="identifier">names</span><span class="paren">(</span><span class="identifier">pv</span><span class="paren">)</span><span class="paren">)</span><span class="paren">]</span>,<span class="identifier">pv_limma</span><span class="paren">[</span><span class="identifier">sort</span><span class="paren">(</span><span class="identifier">names</span><span class="paren">(</span><span class="identifier">pv_limma</span><span class="paren">)</span><span class="paren">)</span><span class="paren">]</span><span class="paren">)</span></code></pre>
<pre><code>## [1] 0.9976</code></pre>
</blockquote>
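<p>Numerically there is no real difference: the correlation coefficient is as high as 0.9976.</p>
<p>So the conclusion is that there is no need to juggle so many packages; limma is enough, and even a plain t-test is fine.</p>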
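<p>For reference, a plain row-wise t-test needs only base R. The sketch below runs on simulated data (not sCLLex), so the numbers mean nothing biologically. It also shows aligning two named p-value vectors by name before correlating them, which is what the sort(names(...)) indexing above achieves.</p>
<blockquote>
<pre class="r"><code class="r">set.seed(1)   # simulated data, for illustration only
n_gene &lt;- 200
group &lt;- rep(c("progres.", "stable"), times = c(6, 6))
expr &lt;- matrix(rnorm(n_gene * length(group)), nrow = n_gene,
               dimnames = list(paste0("probe", seq_len(n_gene)), NULL))
# Shift the first 10 probes in one group so that some true differences exist
expr[1:10, group == "stable"] &lt;- expr[1:10, group == "stable"] + 2

# One two-sample t-test per row (t.test uses the Welch variant by default)
pv_t &lt;- apply(expr, 1, function(x) {
  t.test(x[group == "stable"], x[group == "progres."])$p.value
})

# A second result in a different row order, as a sorted results table would be
pv_other &lt;- rev(pv_t)

# Align by name, not by position, before comparing
common &lt;- intersect(names(pv_t), names(pv_other))
cor(pv_t[common], pv_other[common])</code></pre>
</blockquote>
<p>Here the correlation should be 1 because both vectors hold the same values. Correlating them by position instead would silently compare different probes.</p>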
<p>plot and summary can also be applied directly to the samr result object samr.obj.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1608.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Does differential analysis need a contrast matrix?</title>
		<link>http://www.bio-info-trainee.com/1514.html</link>
		<comments>http://www.bio-info-trainee.com/1514.html#comments</comments>
		<pubDate>Sat, 09 Apr 2016 02:33:51 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[生信基础]]></category>
		<category><![CDATA[limma]]></category>
		<category><![CDATA[差异分析]]></category>
		<category><![CDATA[比较矩阵]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1514</guid>
		<description><![CDATA[limma is the most popular differential analysis software. It has since added the voom method, so it can handle &#8230; <a href="http://www.bio-info-trainee.com/1514.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
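				<content:encoded><![CDATA[<blockquote><p>limma is the most popular differential analysis software. It has since added the voom method, so it can analyze both microarray data and high-throughput RNA-seq data. Many other differential analysis tools follow a similar design.</p></blockquote>
<p>As I have said before, differential analysis needs three inputs:</p>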
<ul>
<li>an expression matrix</li>
<li>a group (design) matrix</li>
<li>a contrast matrix</li>
</ul>
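<p>The first two are clearly required: given an expression matrix, the samples must be assigned to groups before anything can be analyzed. But I have seen several examples, some with a contrast matrix and some without.</p>
<p>After reading the limma manual closely, I found that the question has a simple answer.</p>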
<h2><a id="user-content-大家仔细观察下面的两个代码" class="anchor" href="https://github.com/bioconductor-china/basic/blob/master/makeContrasts.md#大家仔细观察下面的两个代码"></a>Look closely at the two code snippets below</h2>
<h3><a id="user-content-首先是不需要差异比较矩阵的" class="anchor" href="https://github.com/bioconductor-china/basic/blob/master/makeContrasts.md#首先是不需要差异比较矩阵的"></a>First, without a contrast matrix</h3>
<div class="highlight highlight-source-r">
<blockquote>
<pre>    library(<span class="pl-smi">CLL</span>)
    data(<span class="pl-smi">sCLLex</span>)
    library(<span class="pl-smi">limma</span>)
    <span class="pl-v">design</span><span class="pl-k">=</span>model.matrix(<span class="pl-k">~</span><span class="pl-k">factor</span>(<span class="pl-smi">sCLLex</span><span class="pl-k">$</span><span class="pl-smi">Disease</span>))
    <span class="pl-v">fit</span><span class="pl-k">=</span>lmFit(<span class="pl-smi">sCLLex</span>,<span class="pl-smi">design</span>)
    <span class="pl-v">fit</span><span class="pl-k">=</span>eBayes(<span class="pl-smi">fit</span>)
    options(<span class="pl-v">digits</span> <span class="pl-k">=</span> <span class="pl-c1">4</span>)
    <span class="pl-c">#topTable(fit,coef=2,adjust='BH') </span>
    <span class="pl-k">&gt;</span> topTable(<span class="pl-smi">fit</span>,<span class="pl-v">coef</span><span class="pl-k">=</span><span class="pl-c1">2</span>,<span class="pl-v">adjust</span><span class="pl-k">=</span><span class="pl-s"><span class="pl-pds">'</span>BH<span class="pl-pds">'</span></span>)
               <span class="pl-smi">logFC</span> <span class="pl-smi">AveExpr</span>      <span class="pl-smi">t</span>   <span class="pl-smi">P.Value</span> <span class="pl-smi">adj.P.Val</span>     <span class="pl-smi">B</span>
    <span class="pl-ii">39400_at</span>  <span class="pl-c1">1.0285</span>   <span class="pl-c1">5.621</span>  <span class="pl-c1">5.836</span> <span class="pl-c1">8.341e-06</span>   <span class="pl-c1">0.03344</span> <span class="pl-c1">3.234</span>
    <span class="pl-ii">36131_at</span> <span class="pl-k">-</span><span class="pl-c1">0.9888</span>   <span class="pl-c1">9.954</span> <span class="pl-k">-</span><span class="pl-c1">5.772</span> <span class="pl-c1">9.668e-06</span>   <span class="pl-c1">0.03344</span> <span class="pl-c1">3.117</span>
    <span class="pl-ii">33791_at</span> <span class="pl-k">-</span><span class="pl-c1">1.8302</span>   <span class="pl-c1">6.951</span> <span class="pl-k">-</span><span class="pl-c1">5.736</span> <span class="pl-c1">1.049e-05</span>   <span class="pl-c1">0.03344</span> <span class="pl-c1">3.052</span>
    <span class="pl-ii">1303_at</span>   <span class="pl-c1">1.3836</span>   <span class="pl-c1">4.463</span>  <span class="pl-c1">5.732</span> <span class="pl-c1">1.060e-05</span>   <span class="pl-c1">0.03344</span> <span class="pl-c1">3.044</span>
    <span class="pl-ii">36122_at</span> <span class="pl-k">-</span><span class="pl-c1">0.7801</span>   <span class="pl-c1">7.260</span> <span class="pl-k">-</span><span class="pl-c1">5.141</span> <span class="pl-c1">4.206e-05</span>   <span class="pl-c1">0.10619</span> <span class="pl-c1">1.935</span>
    <span class="pl-ii">36939_at</span> <span class="pl-k">-</span><span class="pl-c1">2.5472</span>   <span class="pl-c1">6.915</span> <span class="pl-k">-</span><span class="pl-c1">5.038</span> <span class="pl-c1">5.362e-05</span>   <span class="pl-c1">0.11283</span> <span class="pl-c1">1.737</span>
    <span class="pl-ii">41398_at</span>  <span class="pl-c1">0.5187</span>   <span class="pl-c1">7.602</span>  <span class="pl-c1">4.879</span> <span class="pl-c1">7.824e-05</span>   <span class="pl-c1">0.11520</span> <span class="pl-c1">1.428</span>
    <span class="pl-ii">32599_at</span>  <span class="pl-c1">0.8544</span>   <span class="pl-c1">5.746</span>  <span class="pl-c1">4.859</span> <span class="pl-c1">8.207e-05</span>   <span class="pl-c1">0.11520</span> <span class="pl-c1">1.389</span>
    <span class="pl-ii">36129_at</span>  <span class="pl-c1">0.9161</span>   <span class="pl-c1">8.209</span>  <span class="pl-c1">4.859</span> <span class="pl-c1">8.212e-05</span>   <span class="pl-c1">0.11520</span> <span class="pl-c1">1.389</span>
    <span class="pl-ii">37636_at</span> <span class="pl-k">-</span><span class="pl-c1">1.6868</span>   <span class="pl-c1">5.697</span> <span class="pl-k">-</span><span class="pl-c1">4.804</span> <span class="pl-c1">9.355e-05</span>   <span class="pl-c1">0.11811</span> <span class="pl-c1">1.282</span>
</pre>
</blockquote>
</div>
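<p>Why coef=2 works without a contrast matrix can be seen on a single toy probe. This sketch uses only base R (model.matrix and lm.fit, not limma), and the four expression values are made up. With the default intercept design, the second coefficient is already the difference between the two group means; with a no-intercept design, the coefficients are the group means themselves, so a difference has to be requested explicitly.</p>
<blockquote>
<pre class="r"><code class="r">group &lt;- factor(c("progres.", "progres.", "stable", "stable"))
y &lt;- c(5, 7, 9, 11)                     # toy expression values for one probe

design1 &lt;- model.matrix(~ group)        # intercept + one difference column
design2 &lt;- model.matrix(~ 0 + group)    # one column per group mean
colnames(design2) &lt;- c("progres", "stable")

b1 &lt;- lm.fit(design1, y)$coefficients   # b1[2] = mean(stable) - mean(progres.)
b2 &lt;- lm.fit(design2, y)$coefficients   # b2 = the two group means

# An explicit comparison of the two group means, progres minus stable
contrast &lt;- c(progres = 1, stable = -1)
c(coef2 = unname(b1[2]), contrast = sum(contrast * b2))</code></pre>
</blockquote>
<p>The two numbers have the same magnitude and opposite signs, because the second coefficient measures stable minus progres. while the explicit comparison measures progres minus stable. Only the direction of the comparison differs, not the test itself.</p>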
<h3><a id="user-content-然后是需要差异比较矩阵的" class="anchor" href="https://github.com/bioconductor-china/basic/blob/master/makeContrasts.md#然后是需要差异比较矩阵的"></a>Then, with a contrast matrix</h3>
<div class="highlight highlight-source-r">
<blockquote>
<pre>    library(<span class="pl-smi">CLL</span>)
    data(<span class="pl-smi">sCLLex</span>)
    library(<span class="pl-smi">limma</span>)
    <span class="pl-v">design</span><span class="pl-k">=</span>model.matrix(<span class="pl-k">~</span><span class="pl-c1">0</span><span class="pl-k">+</span><span class="pl-k">factor</span>(<span class="pl-smi">sCLLex</span><span class="pl-k">$</span><span class="pl-smi">Disease</span>))
    colnames(<span class="pl-smi">design</span>)<span class="pl-k">=</span>c(<span class="pl-s"><span class="pl-pds">'</span>progres<span class="pl-pds">'</span></span>,<span class="pl-s"><span class="pl-pds">'</span>stable<span class="pl-pds">'</span></span>)
    <span class="pl-v">fit</span><span class="pl-k">=</span>lmFit(<span class="pl-smi">sCLLex</span>,<span class="pl-smi">design</span>)
    <span class="pl-v">cont.matrix</span><span class="pl-k">=</span>makeContrasts(<span class="pl-s"><span class="pl-pds">'</span>progres-stable<span class="pl-pds">'</span></span>,<span class="pl-v">levels</span> <span class="pl-k">=</span> <span class="pl-smi">design</span>)
    <span class="pl-v">fit2</span><span class="pl-k">=</span>contrasts.fit(<span class="pl-smi">fit</span>,<span class="pl-smi">cont.matrix</span>)
    <span class="pl-v">fit2</span><span class="pl-k">=</span>eBayes(<span class="pl-smi">fit2</span>)
    options(<span class="pl-v">digits</span> <span class="pl-k">=</span> <span class="pl-c1">4</span>)
    topTable(<span class="pl-smi">fit2</span>,<span class="pl-v">adjust</span><span class="pl-k">=</span><span class="pl-s"><span class="pl-pds">'</span>BH<span class="pl-pds">'</span></span>)

               <span class="pl-smi">logFC</span> <span class="pl-smi">AveExpr</span>      <span class="pl-smi">t</span>   <span class="pl-smi">P.Value</span> <span class="pl-smi">adj.P.Val</span>     <span class="pl-smi">B</span>
    <span class="pl-ii">39400_at</span> <span class="pl-k">-</span><span class="pl-c1">1.0285</span>   <span class="pl-c1">5.621</span> <span class="pl-k">-</span><span class="pl-c1">5.836</span> <span class="pl-c1">8.341e-06</span>   <span class="pl-c1">0.03344</span> <span class="pl-c1">3.234</span>
    <span class="pl-ii">36131_at</span>  <span class="pl-c1">0.9888</span>   <span class="pl-c1">9.954</span>  <span class="pl-c1">5.772</span> <span class="pl-c1">9.668e-06</span>   <span class="pl-c1">0.03344</span> <span class="pl-c1">3.117</span>
    <span class="pl-ii">33791_at</span>  <span class="pl-c1">1.8302</span>   <span class="pl-c1">6.951</span>  <span class="pl-c1">5.736</span> <span class="pl-c1">1.049e-05</span>   <span class="pl-c1">0.03344</span> <span class="pl-c1">3.052</span>
    <span class="pl-ii">1303_at</span>  <span class="pl-k">-</span><span class="pl-c1">1.3836</span>   <span class="pl-c1">4.463</span> <span class="pl-k">-</span><span class="pl-c1">5.732</span> <span class="pl-c1">1.060e-05</span>   <span class="pl-c1">0.03344</span> <span class="pl-c1">3.044</span>
    <span class="pl-ii">36122_at</span>  <span class="pl-c1">0.7801</span>   <span class="pl-c1">7.260</span>  <span class="pl-c1">5.141</span> <span class="pl-c1">4.206e-05</span>   <span class="pl-c1">0.10619</span> <span class="pl-c1">1.935</span>
    <span class="pl-ii">36939_at</span>  <span class="pl-c1">2.5472</span>   <span class="pl-c1">6.915</span>  <span class="pl-c1">5.038</span> <span class="pl-c1">5.362e-05</span>   <span class="pl-c1">0.11283</span> <span class="pl-c1">1.737</span>
    <span class="pl-ii">41398_at</span> <span class="pl-k">-</span><span class="pl-c1">0.5187</span>   <span class="pl-c1">7.602</span> <span class="pl-k">-</span><span class="pl-c1">4.879</span> <span class="pl-c1">7.824e-05</span>   <span class="pl-c1">0.11520</span> <span class="pl-c1">1.428</span>
    <span class="pl-ii">32599_at</span> <span class="pl-k">-</span><span class="pl-c1">0.8544</span>   <span class="pl-c1">5.746</span> <span class="pl-k">-</span><span class="pl-c1">4.859</span> <span class="pl-c1">8.207e-05</span>   <span class="pl-c1">0.11520</span> <span class="pl-c1">1.389</span>
    <span class="pl-ii">36129_at</span> <span class="pl-k">-</span><span class="pl-c1">0.9161</span>   <span class="pl-c1">8.209</span> <span class="pl-k">-</span><span class="pl-c1">4.859</span> <span class="pl-c1">8.212e-05</span>   <span class="pl-c1">0.11520</span> <span class="pl-c1">1.389</span>
    <span class="pl-ii">37636_at</span>  <span class="pl-c1">1.6868</span>   <span class="pl-c1">5.697</span>  <span class="pl-c1">4.804</span> <span class="pl-c1">9.355e-05</span>   <span class="pl-c1">0.11811</span> <span class="pl-c1">1.282</span></pre>
</blockquote>
</div>
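<p>Run both pieces of code and you will see that the two approaches give equivalent results: the statistics match, and only the sign of logFC and t is flipped because the direction of the comparison is reversed (for example, 37636_at has logFC -1.6868 in one table and 1.6868 in the other).</p>
<p>Whether a contrast matrix is needed depends mainly on how the group (design) matrix was built:</p>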
<p>design=model.matrix(~factor(sCLLex$Disease))</p>
<p>design=model.matrix(~0+factor(sCLLex$Disease))</p>
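<p>These two are fundamentally different!</p>
<p>The first form already encodes each comparison as a column of the design matrix: there is one column per comparison. The first column is the intercept and can be ignored; from the second column onward, the coef argument lets you extract the differential analysis results one comparison at a time.</p>
<p>The second form only assigns samples to groups. How the groups are compared is something you must define yourself by building a contrast matrix with the makeContrasts function.</p>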
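<p>As a minimal sketch of this difference, the snippet below builds both kinds of matrix with base R's model.matrix on a small made-up factor. The factor is only a stand-in for sCLLex$Disease (an assumption for illustration, not the real data), and the variable names are hypothetical.</p>

```r
# Toy grouping factor standing in for sCLLex$Disease (hypothetical data)
disease = factor(c('progres', 'progres', 'stable', 'stable', 'stable'))

# With an intercept: column 1 is the baseline (progres),
# column 2 already encodes the comparison stable vs. progres
design_intercept = model.matrix(~disease)
print(colnames(design_intercept))

# Without an intercept: one indicator column per group,
# no comparison is encoded yet, so a contrast matrix is needed later
design_groups = model.matrix(~0 + disease)
print(colnames(design_groups))
```

<p>The first matrix has the columns (Intercept) and diseasestable, so its second column can be passed directly as coef. The second matrix has the columns diseaseprogres and diseasestable, which only mark group membership.</p>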
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1514.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Differential expression analysis of microarray data with the limma package</title>
		<link>http://www.bio-info-trainee.com/1194.html</link>
		<comments>http://www.bio-info-trainee.com/1194.html#comments</comments>
		<pubDate>Fri, 11 Dec 2015 14:34:55 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Basic software]]></category>
		<category><![CDATA[limma]]></category>
		<category><![CDATA[Differential analysis]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1194</guid>
		<description><![CDATA[Download the R package, then read its manual; you need to prepare three inputs yourself (expression matrix, group matrix, contrast &#8230; <a href="http://www.bio-info-trainee.com/1194.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
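				<content:encoded><![CDATA[<p>Download the R package and read its manual. You need to prepare three inputs yourself (an expression matrix, a group matrix, and a contrast matrix), and the analysis then takes three steps in total (lmFit, eBayes, topTable).</p>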
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image0011.png"><img class="alignnone size-full wp-image-1195" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image0011.png" alt="image001" width="1002" height="476" /></a></p>
<p>First, prepare the first input: the gene expression matrix.</p>
<p>You can look up the download link on NCBI yourself, then read the file into R:</p>
<p>exprSet=read.table("GSE63067_series_matrix.txt.gz",comment.char = "!",stringsAsFactors=FALSE,header=TRUE)</p>
<p>rownames(exprSet)=exprSet[,1]</p>
<p>exprSet=exprSet[,-1]</p>
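<p>To see what these three lines do without downloading anything, here is a small sketch that runs the same steps on a tiny made-up file. The file content, probe IDs and sample names are hypothetical stand-ins for a real series matrix file.</p>

```r
# Write a tiny stand-in for a GEO series matrix file (hypothetical content)
toy_file = tempfile(fileext = '.txt')
writeLines(c(
  '!Series_title\t"toy example"',
  '"ID_REF"\t"GSM1"\t"GSM2"',
  '"1007_s_at"\t7.1\t7.4',
  '"1053_at"\t5.2\t5.0',
  '!series_matrix_table_end'
), toy_file)

# Lines starting with '!' are treated as comments and skipped
exprSet = read.table(toy_file, comment.char = '!',
                     stringsAsFactors = FALSE, header = TRUE)
rownames(exprSet) = exprSet[, 1]   # probe IDs become the row names
exprSet = exprSet[, -1]            # drop the ID column, keep expression values
print(exprSet)
```

<p>The metadata lines are dropped because of comment.char, leaving a table with one row per probe and one column per sample.</p>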
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image0021.png"><img class="alignnone size-full wp-image-1196" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image0021.png" alt="image002" width="682" height="354" /></a></p>
<p>Next, build the group matrix, as shown below:</p>
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image0031.png"><img class="alignnone size-full wp-image-1197" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image0031.png" alt="image003" width="294" height="376" /></a></p>
<p>Then build the contrast matrix, which states which groups you want to compare against each other, as shown below:</p>
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image0041.png"><img class="alignnone size-full wp-image-1198" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image0041.png" alt="image004" width="542" height="112" /></a></p>
<p>Finally, output the results:</p>
<p>I ran 6 comparisons, so 6 sets of results are produced.</p>
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image005.png"><img class="alignnone size-full wp-image-1199" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image005.png" alt="image005" width="591" height="157" /></a></p>
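<p>The whole workflow can be sketched roughly as follows. This is not the code from the screenshots: the expression values are simulated, the group names A to D are hypothetical, and treating the 6 comparisons as all pairwise comparisons among 4 groups is an assumption made only for illustration.</p>

```r
library(limma)

set.seed(1)
# Simulated log-expression matrix: 100 hypothetical probes x 12 samples
exprSet = matrix(rnorm(100 * 12, mean = 8), nrow = 100,
                 dimnames = list(paste0('probe', 1:100), paste0('S', 1:12)))
group = factor(rep(c('A', 'B', 'C', 'D'), each = 3))  # hypothetical groups

# Group matrix: one indicator column per group, no intercept
design = model.matrix(~0 + group)
colnames(design) = levels(group)

# Contrast matrix: all 6 pairwise comparisons among the 4 groups (assumed)
pair_names = combn(levels(group), 2,
                   FUN = function(p) paste(p[2], p[1], sep = '-'))
cont.matrix = makeContrasts(contrasts = pair_names, levels = design)

# The three steps: lmFit, eBayes (after contrasts.fit), topTable
fit = lmFit(exprSet, design)
fit2 = eBayes(contrasts.fit(fit, cont.matrix))

# One result table per comparison, extracted by coef
results = lapply(colnames(cont.matrix), function(cf)
  topTable(fit2, coef = cf, number = Inf, adjust.method = 'BH'))
names(results) = colnames(cont.matrix)
print(names(results))
```

<p>Each element of results is one comparison's table, which could then be written out with write.csv if separate output files are wanted.</p>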
<p>Finally, open the result tables and interpret them; the manual's explanation of the columns is shown below.</p>
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image006.png"><img class="alignnone size-full wp-image-1200" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/image006.png" alt="image006" width="1026" height="584" /></a></p>
<p>The complete code is available on my GitHub.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1194.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
