<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>生信菜鸟团 &#187; arrayexpress</title>
	<atom:link href="http://www.bio-info-trainee.com/tag/arrayexpress/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bio-info-trainee.com</link>
	<description>Join the discussion on the forum at biotrainee.com, or follow the WeChat official account of the same name, biotrainee</description>
	<lastBuildDate>Sat, 28 Jun 2025 14:30:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.33</generator>
	<item>
		<title>Downloading microarray data from EBI's ArrayExpress database with an R package</title>
		<link>http://www.bio-info-trainee.com/1432.html</link>
		<comments>http://www.bio-info-trainee.com/1432.html#comments</comments>
		<pubDate>Thu, 03 Mar 2016 14:13:26 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[arrayexpress]]></category>
		<category><![CDATA[bioconductor]]></category>
		<category><![CDATA[GEO]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1432</guid>
		<description><![CDATA[This package differs little from GEOquery: one targets NCBI's GEO database, the other &#8230; <a href="http://www.bio-info-trainee.com/1432.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<div>This package differs little from GEOquery: one targets NCBI's GEO database, the other EBI's ArrayExpress database. Only people writing automation scripts really need it; individual analysts usually look up an experiment on the database home page, grab the download links, and download the files one by one.</div>
<div>Downloading microarray data from EBI's ArrayExpress database:</div>
<div>Home page: <a href="https://www.ebi.ac.uk/arrayexpress/">https://www.ebi.ac.uk/arrayexpress/</a></div>
<div>As of 2016-3-1 11:41:27:</div>
<div>63890 experiments</div>
<div>1912744 assays</div>
<div>40.53 TB of archived data, which is a sizable amount</div>
<div>All of the data can be downloaded from the FTP server: <a href="ftp://ftp.ebi.ac.uk/pub/databases/arrayexpress/data/experiment/BUGS/">ftp://ftp.ebi.ac.uk/pub/databases/arrayexpress/data/experiment/BUGS/</a></div>
<div>The files are stored neatly by experiment ID.</div>
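<div>The ID-based layout can be sketched as a small base-R helper. It assumes (this is not stated above) that each experiment sits in a subdirectory named after the middle part of its accession, e.g. <code>MEXP</code> for <code>E-MEXP-3291</code>. The name <code>ae_ftp_dir</code> is hypothetical and not part of any package.</div>

```r
# Hypothetical helper: build the FTP directory for an ArrayExpress accession.
# Assumed layout: BASE/CODE/ACCESSION/, where CODE is the middle part of an
# accession such as "E-MEXP-3291". This layout is an assumption.
ae_ftp_dir = function(accession,
                      base = "ftp://ftp.ebi.ac.uk/pub/databases/arrayexpress/data/experiment") {
  parts = strsplit(accession, "-", fixed = TRUE)[[1]]
  # Expect exactly three parts: "E", a collection code, and a number.
  if (length(parts) != 3 || parts[1] != "E") {
    stop("unexpected accession format: ", accession)
  }
  sprintf("%s/%s/%s/", base, parts[2], accession)
}

ae_ftp_dir("E-MEXP-3291")
```

<div>The helper only builds a string; check the resulting directory in a browser or FTP client before relying on it in a script.</div>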
<div>You can also use an R package: the ArrayExpress R package</div>
<div>Vignette: <a href="https://bioconductor.org/packages/release/bioc/vignettes/ArrayExpress/inst/doc/ArrayExpress.pdf">https://bioconductor.org/packages/release/bioc/vignettes/ArrayExpress/inst/doc/ArrayExpress.pdf</a></div>
<div>The package comes from this paper: <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723004/">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723004/</a></div>
<div>That was 2009, when few people used R and even a simple package like this could be published as a paper, which seems incredible today!</div>
<div></div>
<div>In fact, most of the data have a counterpart in the GEO database. For example, <a href="https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-55645/">https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-55645/</a> corresponds to GEO series GSE55645.</div>
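<div>The correspondence in this example (<code>E-GEOD-55645</code> and <code>GSE55645</code>) can be sketched in base R. The helper name <code>ae_to_geo</code> is hypothetical, and the rule that every <code>E-GEOD-</code> accession maps to the <code>GSE</code> series with the same number is generalized from this single example.</div>

```r
# Hypothetical helper: map an ArrayExpress accession imported from GEO
# (prefix "E-GEOD-") to a GEO series ID; returns NA for anything else.
ae_to_geo = function(accession) {
  is_geo = grepl("^E-GEOD-[0-9]+$", accession)
  ifelse(is_geo, sub("^E-GEOD-", "GSE", accession), NA_character_)
}

ae_to_geo(c("E-GEOD-55645", "E-MEXP-3291"))
```

<div>A script can use the <code>NA</code> entries to pick out the experiments that exist only in ArrayExpress.</div>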
<div>For example, searching for NASH expression data: <a href="https://www.ebi.ac.uk/arrayexpress/search.html?query=NASH++expression">https://www.ebi.ac.uk/arrayexpress/search.html?query=NASH++expression</a> returns 30 results, of which only 4 are unique to the ArrayExpress database!</div>
<div>if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")</div>
<div>BiocManager::install("ArrayExpress")</div>
<div>library(ArrayExpress)</div>
<div>Searching on the website: <a href="https://www.ebi.ac.uk/arrayexpress/search.html?query=NASH++expression+Homo+sapiens" target="_blank">https://www.ebi.ac.uk/arrayexpress/search.html?query=NASH++expression+Homo+sapiens</a></div>
<div><a href="http://www.bio-info-trainee.com/wp-content/uploads/2016/03/1.png"><img class="alignnone size-full wp-image-1433" src="http://www.bio-info-trainee.com/wp-content/uploads/2016/03/1.png" alt="1" width="761" height="593" /></a></div>
<div>In R, the equivalent search is:</div>
<div>sets = queryAE(keywords = "NASH+expression", species = "homo+sapiens")</div>
<div><a href="http://www.bio-info-trainee.com/wp-content/uploads/2016/03/2.png"><img class="alignnone size-full wp-image-1434" src="http://www.bio-info-trainee.com/wp-content/uploads/2016/03/2.png" alt="2" width="731" height="303" /></a></div>
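<div>The result is the same!</div>
<div>To download the data, use:</div>
<div>back = getAE("E-MEXP-3291")</div>
<div>The download is nothing special: the package stores the file links and simply calls R's download function!</div>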
<div><a href="http://www.bio-info-trainee.com/wp-content/uploads/2016/03/3.png"><img class="alignnone size-full wp-image-1435" src="http://www.bio-info-trainee.com/wp-content/uploads/2016/03/3.png" alt="3" width="784" height="270" /></a></div>
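<div>Usually there is no need to download the raw data files yourself. The function below returns a data object directly, from which you can get the expression matrix and the experiment metadata.</div>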
<p>rawset = ArrayExpress("E-MEXP-3291")</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1432.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
