<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>生信菜鸟团 &#187; Genome versions</title>
	<atom:link href="http://www.bio-info-trainee.com/tag/%e5%9f%ba%e5%9b%a0%e7%bb%84%e7%89%88%e6%9c%ac/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bio-info-trainee.com</link>
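	<description>Feel free to leave a comment and join the discussion on the biotrainee.com forum, or follow the WeChat official account of the same name, biotrainee</description>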
	<lastBuildDate>Sat, 28 Jun 2025 14:30:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.33</generator>
	<item>
		<title>How the Various Genome Versions Correspond</title>
		<link>http://www.bio-info-trainee.com/1469.html</link>
		<comments>http://www.bio-info-trainee.com/1469.html#comments</comments>
		<pubDate>Tue, 15 Mar 2016 11:50:00 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Basic databases]]></category>
		<category><![CDATA[Basic data formats]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ENSEMBL]]></category>
		<category><![CDATA[ncbi]]></category>
		<category><![CDATA[UCSC]]></category>
		<category><![CDATA[Genome versions]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1469</guid>
		<description><![CDATA[I was inspired by SOAPfuse to compile how the various genome versions correspond to one another; this is the complete version. &#8230; <a href="http://www.bio-info-trainee.com/1469.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
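				<content:encoded><![CDATA[<pre>I was inspired by SOAPfuse to compile how the various genome versions correspond to one another; this is the complete version.</pre>
<pre>No more confusion over genome versions: I have also tracked down the download links, so you can fetch the genome FASTA files, GTF annotation files, and so on for any version.</pre>
<div>First, how the NCBI assemblies correspond to UCSC names and ENSEMBL releases:</div>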
<div></div>
<div>
<blockquote>
<div>NCBI36 (hg18): ENSEMBL release_52.</div>
<div>GRCh37 (hg19): ENSEMBL release_59/61/64/68/69/75.</div>
<div>GRCh38 (hg38): ENSEMBL release_76/77/78/80/81/82.</div>
</blockquote>
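<div>The correspondence above can be written down as a small lookup, which is handy when a pipeline receives a UCSC name but needs the NCBI/GRC name. This is only a sketch: <code>ucsc_to_grc</code> is a hypothetical helper name, and it covers just the three human assemblies listed here (hg18 corresponds to NCBI Build 36, named NCBI36).</div>

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tool):
# map a UCSC human assembly name to the matching NCBI/GRC name.
ucsc_to_grc() {
    case "$1" in
        hg18) echo "NCBI36" ;;
        hg19) echo "GRCh37" ;;
        hg38) echo "GRCh38" ;;
        *)    echo "unknown assembly: $1"; return 1 ;;
    esac
}

# Example: look up the NCBI/GRC name for hg19
ucsc_to_grc hg19
```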
<div></div>
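<div>As you can see, ENSEMBL has many releases per assembly, and they are easy to mix up.</div>
<div>UCSC naming is simpler: just hg18, hg19, and hg38. hg19 is the most commonly used, but I recommend that everyone move to hg38.</div>
<div>NCBI looks simple too, with just builds 36, 37, and 38 (NCBI36, GRCh37, GRCh38), but there is more to it than that:</div>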
<div>
<blockquote>
<pre>Feb 13 2014 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/April_14_2003/">April_14_2003</a>
Apr 06 2006 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.33/">BUILD.33</a>
Apr 06 2006 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.34.1/">BUILD.34.1</a>
Apr 06 2006 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.34.2/">BUILD.34.2</a>
Apr 06 2006 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.34.3/">BUILD.34.3</a>
Apr 06 2006 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.35.1/">BUILD.35.1</a>
Aug 03 2009 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.36.1/">BUILD.36.1</a>
Aug 03 2009 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.36.2/">BUILD.36.2</a>
Sep 04 2012 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.36.3/">BUILD.36.3</a>
Jun 30 2011 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.37.1/">BUILD.37.1</a>
Sep 07 2011 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.37.2/">BUILD.37.2</a>
Dec 12 2012 00:00    Directory <a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.37.3/">BUILD.37.3</a></pre>
</blockquote>
</div>
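<div>As you can see, there are 37.1, 37.2, 37.3, and so on. These minor versions generally mean that the annotation was updated; the genome sequence itself usually does not change.</div>
<div>Just remember that the hg19 genome is about 3 GB uncompressed and roughly 800 to 900 MB compressed.</div>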
<div></div>
<div>If you want to download a GTF annotation file, the genome version is especially important.</div>
<div></div>
<div>For NCBI: <span style="font-family: Arial,Helvetica,sans-serif;"><a href="ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/GFF/">ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/GFF/          ## latest version (hg38)</a></span></div>
<div><span style="font-family: Arial,Helvetica,sans-serif;"><a href="ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/">ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/    ## other versions</a></span></div>
<div></div>
<div>For ENSEMBL:</div>
<div><a href="ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens/Homo_sapiens.GRCh37.75.gtf.gz" rel="nofollow">ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens/Homo_sapiens.GRCh37.75.gtf.gz</a></div>
<div>Change the release number in the middle of the path to get any other version: <a href="ftp://ftp.ensembl.org/pub/">ftp://ftp.ensembl.org/pub/</a></div>
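<div>A minimal sketch of building such a URL from the release number. <code>ensembl_gtf_url</code> is a hypothetical helper, and it assumes that other releases follow the same file naming as the release-75 link; the assembly label in the file name has to match the release, so check the FTP directory listing before relying on it.</div>

```shell
#!/bin/sh
# Hypothetical helper: build the human GTF URL for an ENSEMBL release.
# Usage: ensembl_gtf_url RELEASE ASSEMBLY
# Assumption: the file name pattern of release 75 also holds for the
# release you ask for; verify against the FTP listing.
ensembl_gtf_url() {
    release="$1"
    assembly="$2"
    echo "ftp://ftp.ensembl.org/pub/release-${release}/gtf/homo_sapiens/Homo_sapiens.${assembly}.${release}.gtf.gz"
}

# Reproduces the release-75 link for GRCh37
ensembl_gtf_url 75 GRCh37
```

<div>The printed URL can then be handed to a downloader, for example <code>wget "$(ensembl_gtf_url 75 GRCh37)"</code>.</div>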
<div>For UCSC, it is a little more involved:</div>
<div>
<div>You need to select a series of options in the Table Browser:</div>
<div><a href="http://genome.ucsc.edu/cgi-bin/hgTables">http://genome.ucsc.edu/cgi-bin/hgTables</a></div>
<div></div>
<blockquote>
<div>1. Navigate to <a href="http://genome.ucsc.edu/cgi-bin/hgTables" target="_blank" rel="nofollow">http://genome.ucsc.edu/cgi-bin/hgTables</a></div>
<div></div>
<div>2. Select the following options:<br />
clade: Mammal<br />
genome: Human<br />
assembly: Feb. 2009 (GRCh37/hg19)<br />
group: Genes and Gene Predictions<br />
track: UCSC Genes<br />
table: knownGene<br />
region: Select "genome" for the entire genome.<br />
output format: GTF - gene transfer format<br />
output file: enter a file name to save your results to a file, or leave blank to display results in the browser</div>
<div></div>
<div>3. Click 'get output'.</div>
</blockquote>
</div>
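<div>Now for the important part: once the version relationships are clear, it is time to download.</div>
<div>Downloading from UCSC is very convenient; you only need to build the URL from the short assembly name:</div>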
<div>
<blockquote>
<div><a href="http://hgdownload.cse.ucsc.edu/goldenPath/mm10/bigZips/chromFa.tar.gz">http://hgdownload.cse.ucsc.edu/goldenPath/mm10/bigZips/chromFa.tar.gz</a></div>
<div><a href="http://hgdownload.cse.ucsc.edu/goldenPath/mm9/bigZips/chromFa.tar.gz">http://hgdownload.cse.ucsc.edu/goldenPath/mm9/bigZips/chromFa.tar.gz</a></div>
<div><a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz</a></div>
<div><a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/chromFa.tar.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/chromFa.tar.gz</a></div>
</blockquote>
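<div>The URL pattern above can be sketched as a one-line function. <code>ucsc_chromfa_url</code> is a hypothetical helper name, and the sketch assumes the archive is called <code>chromFa.tar.gz</code> for the assembly you pass in; file names are not guaranteed to be identical across assemblies, so check the bigZips directory listing if a download fails.</div>

```shell
#!/bin/sh
# Hypothetical helper: build the UCSC per-chromosome FASTA archive URL
# from the short assembly name (for example hg19 or mm10).
# Assumption: the archive is named chromFa.tar.gz for that assembly.
ucsc_chromfa_url() {
    echo "http://hgdownload.cse.ucsc.edu/goldenPath/$1/bigZips/chromFa.tar.gz"
}

# Print the hg19 link from the list above
ucsc_chromfa_url hg19
```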
<div>Or use a shell script to download only the chromosomes you specify:</div>
<blockquote>
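<div>for i in $(seq 1 22) X Y M;<br />
do echo "$i";<br />
wget "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chr${i}.fa.gz";</div>
<div>## NCBI can be used here instead, with this as the URL prefix: ftp://ftp.ncbi.nih.gov/genomes/M_musculus/ARCHIVE/MGSCv3_Release3/Assembled_Chromosomes/chr<br />
done<br />
gunzip chr*.fa.gz<br />
## start from an empty file so that a rerun does not append duplicate sequences<br />
rm -f hg19.fasta<br />
for i in $(seq 1 22) X Y M;<br />
do cat "chr${i}.fa" &gt;&gt; hg19.fasta;<br />
done<br />
## remove the per-chromosome files; they are named chr*.fa, not chr*.fasta<br />
rm -f chr*.fa</div>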
</blockquote>
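<div>A quick sanity check after concatenating is to count the FASTA header lines. The sketch below uses a hypothetical helper, <code>count_fasta_records</code>, and demonstrates it on a tiny temporary file; for the merged hg19 file built from chr1 to chr22 plus X, Y, and M, the count should be 25, assuming each per-chromosome file holds exactly one sequence.</div>

```shell
#!/bin/sh
# Hypothetical helper: count the sequences in a FASTA file by counting
# header lines (lines that start with ">").
count_fasta_records() {
    grep -c '^>' "$1"
}

# Demo on a small temporary file with two records
demo=$(mktemp)
printf '>chr1\nACGT\n>chr2\nGGCC\n' > "$demo"
count_fasta_records "$demo"
rm -f "$demo"
```

<div>On the real output this would be <code>count_fasta_records hg19.fasta</code>.</div>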
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1469.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
