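This is another of the questions readers write in about most: how to download the start and end coordinates of particular regions of the genome, that is, genomic features. These usually come as GTF or BED files, for example the coordinates of all exons on human hg19, of all genes, or of all lncRNAs, rRNAs and so on. Here I use the CpG Islands annotation file as an example and show four ways to download it.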
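As a rough illustration of one such route, the sketch below converts a dump of UCSC's `cpgIslandExt` table into a 4-column BED file. It assumes the hg19 table layout (bin, chrom, chromStart, chromEnd, name, ...) and the download path shown in the comment; the function name `cpg_table_to_bed` is made up for this example and is not necessarily one of the four methods described in the post.

```shell
#!/usr/bin/env bash
# Sketch only: turn a cpgIslandExt table dump into 4-column BED.
# Assumed column order: bin, chrom, chromStart, chromEnd, name, ...
# To fetch the table first (not run here):
#   wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/cpgIslandExt.txt.gz

cpg_table_to_bed() {
    # $1: path to cpgIslandExt.txt or cpgIslandExt.txt.gz
    local table="$1"
    local reader="cat"
    case "$table" in
        *.gz) reader="gunzip -c" ;;
    esac
    # Drop the leading bin column and keep chrom, start, end, name.
    # Names such as "CpG: 116" contain a space, which is removed here.
    $reader "$table" | awk 'BEGIN { FS = OFS = "\t" } { gsub(/ /, "", $5); print $2, $3, $4, $5 }'
}
```

Typical use would be `cpg_table_to_bed cpgIslandExt.txt.gz > hg19_cpg_islands.bed`.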
Dec 15
- chromatin structure (5C)
- open chromatin (DNase-seq and FAIRE-seq)
- histone modifications and DNA-binding of over 100 transcription factors (ChIP-seq)
- RNA transcription (RNAseq and CAGE)
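SOAPfuse is what prompted me to compile how the various genome builds correspond to one another, and this is the complete version.
No more confusion over genome versions: I also tracked down all the download links, so you can fetch the FASTA files, GTF annotation files and so on for any build.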
GRCh36 (hg18; officially NCBI36): ENSEMBL release_52
GRCh37 (hg19): ENSEMBL release_59/61/64/68/69/75
GRCh38 (hg38): ENSEMBL release_76/77/78/80/81/82
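The mapping above can be turned into a small helper that builds the Ensembl GTF download URL for a given release. This is only a sketch: the release cut-offs (up to 54 for NCBI36, 55 to 75 for GRCh37, 76 and later for GRCh38), the use of `NCBI36` in the file name for the hg18-era build, and the FTP path pattern are assumptions based on Ensembl's usual layout, and the function names are invented for illustration. Check the URL in a browser before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: map an Ensembl release number to a human assembly name
# and build the matching GTF URL. Cut-offs and path layout are assumptions.

ensembl_assembly() {
    # $1: Ensembl release number, e.g. 75
    local release="$1"
    if [ "$release" -le 54 ]; then
        echo "NCBI36"      # hg18
    elif [ "$release" -le 75 ]; then
        echo "GRCh37"      # hg19
    else
        echo "GRCh38"      # hg38
    fi
}

ensembl_gtf_url() {
    # $1: Ensembl release number
    local release="$1"
    local assembly
    assembly=$(ensembl_assembly "$release")
    echo "ftp://ftp.ensembl.org/pub/release-${release}/gtf/homo_sapiens/Homo_sapiens.${assembly}.${release}.gtf.gz"
}
```

For example, `wget "$(ensembl_gtf_url 75)"` would request the release 75 annotation for hg19.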
Feb 13 2014 00:00 Directory April_14_2003
Apr 06 2006 00:00 Directory BUILD.33
Apr 06 2006 00:00 Directory BUILD.34.1
Apr 06 2006 00:00 Directory BUILD.34.2
Apr 06 2006 00:00 Directory BUILD.34.3
Apr 06 2006 00:00 Directory BUILD.35.1
Aug 03 2009 00:00 Directory BUILD.36.1
Aug 03 2009 00:00 Directory BUILD.36.2
Sep 04 2012 00:00 Directory BUILD.36.3
Jun 30 2011 00:00 Directory BUILD.37.1
Sep 07 2011 00:00 Directory BUILD.37.2
Dec 12 2012 00:00 Directory BUILD.37.3
1. Navigate to http://genome.ucsc.edu/cgi-bin/hgTables
2. Select the following options:
clade: Mammal
genome: Human
assembly: Feb. 2009 (GRCh37/hg19)
group: Genes and Gene Predictions
track: UCSC Genes
table: knownGene
region: Select "genome" for the entire genome.
output format: GTF - gene transfer format
output file: enter a file name to save your results to a file, or leave blank to display results in the browser
3. Click 'get output'.
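Once the Table Browser output has been saved, a quick sanity check helps confirm that the file really is GTF. The sketch below counts records per feature type and flags any non-comment line that does not have the nine tab-separated columns GTF requires. The function name `gtf_feature_counts` and the file name in the usage line are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: summarise a GTF file by feature type (column 3)
# and report lines that do not have exactly 9 tab-separated fields.

gtf_feature_counts() {
    # $1: path to a GTF file
    awk -F '\t' '
        /^#/    { next }              # skip comment/header lines
        NF != 9 { bad++; next }       # GTF records have 9 columns
                { count[$3]++ }
        END {
            for (f in count) print f "\t" count[f]
            if (bad > 0) print "malformed\t" bad
        }' "$1" | LC_ALL=C sort
}
```

For instance, `gtf_feature_counts hg19_knownGene.gtf` prints one line per feature type with its count.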
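# Download every hg19 chromosome (1-22, X, Y, M) from UCSC.
# NCBI's FTP site can be used instead, e.g. ftp://ftp.ncbi.nih.gov/genomes/M_musculus/ARCHIVE/MGSCv3_Release3/Assembled_Chromosomes/ (file names start with chr).
for i in $(seq 1 22) X Y M; do
    echo "$i"
    wget "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chr${i}.fa.gz"
done
gunzip chr*.fa.gz
# Concatenate the chromosomes in order into a single FASTA file.
for i in $(seq 1 22) X Y M; do
    cat "chr${i}.fa" >> hg19.fasta
done
# Remove the per-chromosome files (they end in .fa, not .fasta).
rm -f chr*.fa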
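After merging, it is worth checking that the combined file contains the expected number of sequences (25 for chromosomes 1-22, X, Y and M). A minimal sketch follows; the function name `check_fasta` is invented for this example.

```shell
#!/usr/bin/env bash
# Sketch only: count FASTA header lines and compare with an expected number.

check_fasta() {
    # $1: FASTA file, $2: expected number of sequences
    local fasta="$1" expected="$2"
    local n
    n=$(grep -c '^>' "$fasta")
    if [ "$n" -eq "$expected" ]; then
        echo "OK: $n sequences"
    else
        echo "MISMATCH: found $n, expected $expected"
        return 1
    fi
}
```

For the merged genome this would be called as `check_fasta hg19.fasta 25`. A mismatch usually means a download failed or the script was run twice, since `>>` appends to an existing file.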
X-DAS-Version: DAS/0.95
X-DAS-Status: 200
Content-Type:text
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: X-DAS-Version X-DAS-Status X-DAS-Capabilities

UCSC DAS Server.
See http://www.biodas.org for more info on DAS.
Try http://genome.ucsc.edu/cgi-bin/das/dsn for a list of databases.
See our DAS FAQ (http://genome.ucsc.edu/FAQ/FAQdownloads#download23) for more information.
Alternatively, we also provide query capability through our MySQL server; please see our FAQ for details (http://genome.ucsc.edu/FAQ/FAQdownloads#download29).
Note that DAS is an inefficient protocol which does not support all types of annotation in our database. We recommend you access the UCSC database by downloading the tab-separated files in the downloads section (http://hgdownload.cse.ucsc.edu/downloads.html) or by using the Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables) instead of DAS in most circumstances.
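For small regions, a DAS request can still be handy despite the server's own advice. The sketch below only builds the request URL. It assumes the `dna?segment=chrom:start,end` query form described in the UCSC DAS FAQ, and the function name `das_dna_url` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: build a UCSC DAS URL for the DNA of one region.
# The query form (db/dna?segment=chrom:start,end) is an assumption
# taken from the UCSC DAS FAQ; verify it before scripting against it.

das_dna_url() {
    # $1: database (e.g. hg19), $2: chromosome, $3: start, $4: end
    local db="$1" chrom="$2" start="$3" end="$4"
    echo "http://genome.ucsc.edu/cgi-bin/das/${db}/dna?segment=${chrom}:${start},${end}"
}

# Example (needs network access, so it is left commented out):
#   curl -s "$(das_dna_url hg19 chr1 100000 100100)"
```

The response is XML, so for anything beyond a quick look the downloads section or the Table Browser recommended above are the better choice.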