Nov 05

Point mutations explained

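A point mutation is the replacement of one base in a DNA molecule by another, which changes the DNA sequence; it is one type of gene mutation. Point mutations fall into two classes, transitions and transversions. A transition is a substitution between bases of the same class (AT→GC, GC→AT); a transversion is a substitution between bases of different classes (AT→TA or CG, GC→CG or TA).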

DNA substitution mutations are of two types. Transitions are interchanges of two-ring purines (A ↔ G) or of one-ring pyrimidines (C ↔ T): they therefore involve bases of similar shape. Transversions are interchanges of purine for pyrimidine bases, which therefore involve exchange of one-ring and two-ring structures.

When analyzing driver mutations, we distinguish several kinds of point mutation:

  • 1. CpG transitions
  • 2. CpG transversions
  • 3. C:G transitions
  • 4. C:G transversions
  • 5. A:T transitions
  • 6. A:T transversions

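There are 64 codons, and each codon has 9 possible single-base changes (3 positions × 3 alternative bases). How do we build a table that lists every possible mutation together with its category?

Something like the table below, with 64 × 9 = 576 rows (columns: mutation, 0-based position in the codon, original base, new base, category number from the list above):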

head category.acgt
AAA>AAT 2 A T 6
AAA>AAC 2 A C 6
AAA>AAG 2 A G 5
AAA>ATA 1 A T 6
AAA>ACA 1 A C 6
AAA>AGA 1 A G 5
AAA>TAA 0 A T 6
AAA>CAA 0 A C 6
AAA>GAA 0 A G 5
AAT>AAA 2 T A 6

tail category.acgt
GGC>GGG 2 C G 2
GGG>AGG 0 G A 3
GGG>TGG 0 G T 4
GGG>CGG 0 G C 4
GGG>GAG 1 G A 3
GGG>GTG 1 G T 4
GGG>GCG 1 G C 4
GGG>GGA 2 G A 3
GGG>GGT 2 G T 4
GGG>GGC 2 G C 4

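I thought this would be trivial, but it turned out to be quite fiddly to write.

[Screenshot: 1Capture]

One function used in the script decides which of the six categories above a mutation belongs to:

[Screenshot: 2]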
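Since the original script survives only as screenshots, here is a minimal Python sketch of how such a table could be generated. It is not the author's code: the function names are made up for illustration, and the CpG test is a strict one that only looks inside the codon (a C followed by G, or a G preceded by C). Because of that assumption, rows at codon boundaries can differ from the table above; for example, this sketch labels GGC>GGG as category 4, whereas the table lists it as 2.

```python
from itertools import product

BASES = "ATCG"
PURINES = {"A", "G"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (ref in PURINES) == (alt in PURINES)


def in_cpg(codon, pos):
    """Strict CpG test inside the codon only (an assumption of this sketch)."""
    base = codon[pos]
    if base == "C":
        return pos + 1 < 3 and codon[pos + 1] == "G"
    if base == "G":
        return pos > 0 and codon[pos - 1] == "C"
    return False


def category(codon, pos, alt):
    """Return 1-6, following the numbered list of categories above."""
    ref = codon[pos]
    ts = is_transition(ref, alt)
    if ref in "AT":
        return 5 if ts else 6      # A:T transitions / transversions
    if in_cpg(codon, pos):
        return 1 if ts else 2      # CpG transitions / transversions
    return 3 if ts else 4          # other C:G transitions / transversions


def mutation_table():
    """All 64 x 9 single-base changes as (label, pos, ref, alt, category)."""
    rows = []
    for codon in map("".join, product(BASES, repeat=3)):
        for pos in (2, 1, 0):
            for alt in BASES:
                if alt == codon[pos]:
                    continue
                mutant = codon[:pos] + alt + codon[pos + 1:]
                rows.append((f"{codon}>{mutant}", pos, codon[pos], alt,
                             category(codon, pos, alt)))
    return rows


if __name__ == "__main__":
    for row in mutation_table():
        print(*row)
```

Keeping the CpG rule in its own function makes it easy to swap in a different definition, for instance one that also uses the neighbouring codon.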

References: https://www.mun.ca/biology/scarr/Transitions_vs_Transversions.html

https://en.wikipedia.org/wiki/Transversion

http://www.uvm.edu/~cgep/Education/Mutations.html

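A point mutation, also called a single base substitution, is a mutation caused by a change in a single base.

Point mutations are divided into transitions and transversions.

Transition: a purine replaced by another purine, or a pyrimidine by another pyrimidine.

Transversion: a purine replaced by a pyrimidine, or the reverse.

A diagram makes this easier to see:

[Figure: Transitions_vs_Transversions]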

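Nov 01

WES (part 7): looking at de novo variants

Finding de novo variants is also part of SNP calling, with one difference: the software takes the sequencing data of a whole trio (father, mother, child) into account to find de novo mutations.

The software documents this feature here: http://varscan.sourceforge.net/trio-calling-de-novo-mutations.html

There is also a paper devoted to the relationship between ASD (autism) and de novo variants, but I found it vague and not very useful.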

Trio Calling for de novo Mutations


Min coverage:   10

Min reads2:     4

Min var freq:   0.2

Min avg qual:   15

P-value thresh: 0.05

Adj. min reads2:        2

Adj. var freq:  0.05

Adj. p-value:   0.15

Reading input from trio.filter.mpileup

1371416525 bases in pileup file (about 1.37 G positions)

83123183 met the coverage requirement of 10 (about 83 M positions have a depth of at least 10X)

145104 variant positions (132268 SNP, 12836 indel) (about 145,000 variant positions found in total)

4403 were failed by the strand-filter

139153 variant positions reported (126762 SNP, 12391 indel)

502 de novo mutations reported (376 SNP, 126 indel) (only 502 are actually reported as de novo mutations)

1734 initially DeNovo were re-called Germline

12 initially DeNovo were re-called MIE

3 initially DeNovo were re-called MultAlleles

522 initially MIE were re-called Germline

1 initially MIE were re-called MultAlleles

3851 initially Untransmitted were re-called Germline

Then I looked at the output file trio.mpileup.output.snp.vcf

The software explains it like this: The output of the trio subcommand is a single VCF in which all variants are classified as germline (transmitted or untransmitted), de novo, or MIE.

  • FILTER - mendelError if MIE, otherwise PASS
  • STATUS - 1=untransmitted, 2=transmitted, 3=denovo, 4=MIE
  • DENOVO - if present, indicates a high-confidence de novo mutation call

The information in it is still not very clear to me.

So I first simplified the trio.de_novo.mutaion.snp.vcf file to look only at the genotypes!
head status.txt   (column order: dad, mom, child)
STATUS=2 0/0 0/1 0/1
STATUS=2 1/1 1/1 1/1
STATUS=2 0/1 0/0 0/1
STATUS=2 1/1 1/1 1/1
STATUS=1 0/1 0/0 0/0
STATUS=1 0/1 0/0 0/0
STATUS=2 1/1 1/1 1/1
STATUS=2 1/1 1/1 1/1
STATUS=2 1/1 1/1 1/1
STATUS=2 0/1 0/1 0/1
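The summary is:
  26564 STATUS=1 untransmitted: a parent carries the variant, the child does not (0/0 0/1 0/0, 0/1 0/0 0/0, etc.)
  97764 STATUS=2 transmitted: a parent carries it and so does the child (1/1 1/1 1/1, 0/1 0/1 1/1, etc.)
    385 STATUS=3 de novo: neither parent carries it but the child does (0/0 0/0 0/1 or 0/0 0/0 1/1)
   1485 STATUS=4 MIE: the child's genotype is inconsistent with the parents' (1/1 0/1 0/0, etc.)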
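The four groups above can be reproduced from the genotypes alone with a simple Mendelian check. The sketch below is only an illustration: the function names are hypothetical, it assumes the sample columns are in dad, mom, child order with GT as the first FORMAT field, and it ignores the read counts and p-values that VarScan itself uses, so its labels will not always agree with VarScan's STATUS.

```python
import re


def alleles(gt):
    """Split a genotype such as '0/1' or '0|1' into a list of allele strings."""
    return gt.replace("|", "/").split("/")


def mendelian_consistent(dad, mom, child):
    """True if the child could have received one allele from each parent."""
    c = alleles(child)
    d, m = set(alleles(dad)), set(alleles(mom))
    return (c[0] in d and c[1] in m) or (c[0] in m and c[1] in d)


def classify_trio(dad, mom, child):
    """Return 1=untransmitted, 2=transmitted, 3=de novo, 4=MIE, None if missing."""
    if any("." in gt for gt in (dad, mom, child)):
        return None                      # missing genotype, not classified here
    child_has_alt = set(alleles(child)) != {"0"}
    parent_has_alt = set(alleles(dad) + alleles(mom)) != {"0"}
    if child_has_alt and not parent_has_alt:
        return 3                         # alt allele appears only in the child
    if not mendelian_consistent(dad, mom, child):
        return 4                         # Mendelian inheritance error
    return 2 if child_has_alt else 1


def status_row(vcf_line):
    """Reduce one VCF data line to 'STATUS=n dadGT momGT childGT'."""
    fields = vcf_line.rstrip("\n").split("\t")
    status = re.search(r"STATUS=\d+", fields[7]).group(0)
    genotypes = [sample.split(":")[0] for sample in fields[9:12]]
    return " ".join([status] + genotypes)
```

For example, `classify_trio("1/1", "0/1", "0/0")` returns 4, matching the STATUS=4 example in the summary.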

I annotated the result with ANNOVAR:

/home/jmzeng/bio-soft/annovar/convert2annovar.pl -format vcf4  trio.mpileup.output.snp.vcf > trio.snp.annovar

/home/jmzeng/bio-soft/annovar/annotate_variation.pl -buildver hg19  --geneanno --outfile  trio.snp.anno trio.snp.annovar /home/jmzeng/bio-soft/annovar/humandb

The result:

A total of 132268 locus in VCF file passed QC threshold, representing 132809 SNPs (86633 transitions and 46176 transversions) and 3 indels/substitutions
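As a quick sanity check on the ANNOVAR summary above, the transition/transversion (Ts/Tv) ratio can be computed from the two counts it reports. A minimal sketch (the function name is just for illustration):

```python
def ts_tv_ratio(transitions, transversions):
    """Ratio of transition count to transversion count."""
    return transitions / transversions


# Counts taken from the ANNOVAR message above.
print(f"{ts_tv_ratio(86633, 46176):.2f}")  # prints 1.88
```

For human data, ratios of about 2 for whole-genome calls and closer to 3 for exome calls are commonly quoted, so a noticeably lower value may hint at false positives. Treat this as a rule of thumb only.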

More than 20,000 variants ended up annotated as exonic:

23794  284345 3123333 trio.snp.anno.exonic_variant_function

This should be very meaningful, but I have not yet learned how to do the downstream analysis, so I will stop here for now.

 

28

Simulating chrY sequencing reads and aligning them to chrX to check homology

First download the two chromosome sequences

wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chrX.fa.gz;

wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chrY.fa.gz;

152M Mar 21  2009 chrX.fa

58M Mar 21  2009 chrY.fa

Then build a bwa index for chrX

bwa index chrX.fa

[bwa_index] Pack FASTA... 1.97 sec

[bwa_index] Construct BWT for the packed sequence...

[BWTIncCreate] textLength=310541120, availableWord=33850812

[BWTIncConstructFromPacked] 10 iterations done. 55838672 characters processed.

[BWTIncConstructFromPacked] 20 iterations done. 103157920 characters processed.

[BWTIncConstructFromPacked] 30 iterations done. 145211344 characters processed.

[BWTIncConstructFromPacked] 40 iterations done. 182584528 characters processed.

[BWTIncConstructFromPacked] 50 iterations done. 215797872 characters processed.

[BWTIncConstructFromPacked] 60 iterations done. 245313968 characters processed.

[BWTIncConstructFromPacked] 70 iterations done. 271543920 characters processed.

[BWTIncConstructFromPacked] 80 iterations done. 294853104 characters processed.

[bwt_gen] Finished constructing BWT in 88 iterations.

[bwa_index] 98.58 seconds elapse.

[bwa_index] Update BWT... 0.96 sec

[bwa_index] Pack forward-only FASTA... 0.91 sec

[bwa_index] Construct SA from BWT and Occ... 33.18 sec

[main] Version: 0.7.8-r455

[main] CMD: /lrlhps/apps/bioinfo/bwa/bwa-0.7.8/bwa index chrX.fa

[main] Real time: 141.623 sec; CPU: 135.605 sec

chrX is only 152M, so this is fast: done in about two minutes!

Then simulate sequencing reads from chrY (PE100, insert 400)

209M Oct 28 10:19 read1.fa

209M Oct 28 10:19 read2.fa

The simulation script is simple:

 

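#!/usr/bin/perl
# Simulate paired-end reads (2 x 100 bp) from chrY; pass chrY.fa as the argument.
use strict;
use warnings;

my $chrY = '';
while (<>) {
    chomp;
    next if /^>/;                 # skip the FASTA header line
    $chrY .= uc $_;
}
my $j = 0;
open my $fh_l, '>', 'read1.fa' or die "Cannot write read1.fa: $!";
open my $fh_r, '>', 'read2.fa' or die "Cannot write read2.fa: $!";
foreach (1 .. 4) {                # four passes, each with random step sizes
    for (my $i = 600; $i < length($chrY) - 600; $i += 50 + int(rand(10))) {
        my $up   = substr($chrY, $i, 100);         # forward read
        my $down = substr($chrY, $i + 400, 100);   # mate, 400 bp downstream
        next unless $up   =~ /[ATCG]/;             # skip reads made only of N
        next unless $down =~ /[ATCG]/;
        $down = reverse $down;                     # reverse-complement the mate
        $down =~ tr/ATCG/TAGC/;
        $j++;
        print $fh_l ">read_$j/1\n$up\n";
        print $fh_r ">read_$j/2\n$down\n";
    }
}
close $fh_l;
close $fh_r;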

Then align with bwa mem

bwa mem -t 12 -M chrX.fa read*.fa >read.sam

I used 12 threads, so this was also very fast

[main] Version: 0.7.8-r455

[main] CMD: /apps/bioinfo/bwa/bwa-0.7.8/bwa mem -t 12 -M chrX.fa read1.fa read2.fa

[main] Real time: 136.641 sec; CPU: 1525.360 sec

643M Oct 28 10:24 read.sam

Then summarize the alignment results

samtools view -bS read.sam >read.bam

158M Oct 28 10:26 read.bam

samtools flagstat read.bam

3801483 + 0 in total (QC-passed reads + QC-failed reads)

0 + 0 duplicates

2153410 + 0 mapped (56.65%:-nan%)

3801483 + 0 paired in sequencing

1900666 + 0 read1

1900817 + 0 read2

645876 + 0 properly paired (16.99%:-nan%)

1780930 + 0 with itself and mate mapped

372480 + 0 singletons (9.80%:-nan%)

0 + 0 with mate mapped to a different chr

0 + 0 with mate mapped to a different chr (mapQ>=5)

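Looking through the SAM file myself, I also found the homology to be really high: of the roughly 3.8 million simulated reads, about 1.2 million aligned with 100% identity.

So for a female sample it is perfectly normal for some reads to align to chrY. To determine sex, you have to look at the regions that differ between X and Y.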
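The count of perfectly aligned reads mentioned above can be obtained from the SAM file directly. The sketch below is one possible way to do it, not necessarily how the number in this post was produced: it assumes "100% aligned" means a primary alignment whose CIGAR is exactly 100M and whose NM tag (edit distance, which bwa mem writes) is 0. The function names are hypothetical.

```python
def is_perfect_match(sam_line, read_len=100):
    """True for a primary alignment with CIGAR '<read_len>M' and NM:i:0."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:          # read is unmapped
        return False
    if flag & 0x900:        # secondary (0x100) or supplementary (0x800) record
        return False
    if fields[5] != f"{read_len}M":
        return False        # clipped, or contains insertions/deletions
    return "NM:i:0" in fields[11:]


def count_perfect(lines, read_len=100):
    """Count perfect matches among SAM lines, skipping '@' header lines."""
    return sum(
        1
        for line in lines
        if not line.startswith("@") and is_perfect_match(line, read_len)
    )
```

An open file handle can be passed straight in, for example `count_perfect(open("read.sam"))`, since the function only iterates over lines.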

21

An MIT PhD decides to leave academia, and more than a thousand people join the heated discussion (part 3)

Dr Petra Eichelsdoerfer, February 17, 2014 at 11:17 AM

My university lacked the funds to support me while applying for my first grant after my post-doc. So I was unable to re-write it after its first rejection. Although it was heartbreaking to walk away, I had a family to support. This was nearly four years ago. How many others have had to make the same decision I have? What will be the long-term consequences?

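My university was also short of funds and could not support me while I applied for my first grant after my postdoc, so after the first rejection I was unable to revise and resubmit it. That was nearly four years ago. Walking away was heartbreaking, but I had a family to support. How many others have been in the same position, and how many have made the same choice I did?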

20

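An MIT PhD decides to leave academia, and more than a thousand people join the heated discussion (part 2)

More than a thousand people really did join the discussion. I am translating the readers' comments in two parts: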

Lenny Teytelman, February 15, 2014 at 7:53 AM

The reason I did not want to publish this - a single voice is invariably dismissed. So, I want to assemble in a central place as many essays like this from students, postdocs, and professors. The funding crisis will not be addressed until the severity of it is acknowledged and NIH, politicians, and scientists are alarmed enough. Please e-mail me your stories to lenny at zappylab dot com (whether new or published elsewhere). I will put together a site aggregating all of them.

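The reason I did not publish this earlier is that a single voice is easily dismissed. So I want to collect, in one place, as many essays like this as possible from students, postdocs and professors. The funding crisis will not be addressed until its severity is acknowledged and NIH, politicians and scientists are alarmed enough. Please e-mail me your stories, whether new or already published elsewhere, and I will put them together and publish them in one place.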

Bioluminessa, February 17, 2014 at 10:55 AM

Hi Lenny. Thank you for sharing your perspective! Here is another to add to your collection of essays. I wrote this piece the other day coming to similar conclusions. http://bioluminate.blogspot.com/2014/02/the-seven-stages-of-grief-for-academic.html

Thank you for sharing your perspective. Here is another story that may fit your collection.

sheiselsewhere, February 25, 2014 at 11:54 AM

I often get told that I shouldn't be so negative and that things will get better. Unfortunately, I don't have the time to wait. Here is my contribution.

http://sheiselsewhere.mosdave.com/2014/02/16/singing-for-supper/

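People have long told me that I should not be so pessimistic and that everything will get better. Unfortunately, I no longer have the time to wait for things to improve. Here is my story.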

Dregev21, February 28, 2014 at 2:13 PM

Wow, thank you for posting this! I have gone through a very similar situation and have also decided to quit pursuing this dream. I was a 4th year PhD student at the University of Florida (where I had already had to change labs since my first mentor moved to UAB) and my project was going nowhere fast. I also started seeing academia for what it has become; an industry of cheap labor and false hopes. But like you, I stayed in it for as long as I could because of my love for science, learning and teaching. I quit and got out with a MS degree this past November and I am very happy with my decision. I began working as a research coordinator at UF, making more money and like you felt liberated and free from the constant stress of graduate work and research. I believe most students come in to graduate scholl for the same reasons, but it has become so disheartening and scary, that it didn't seem worth it to me anymore. I think it is important for current students to know and understand that there are other things to do in life that are more fruitfull, less stressful and just as intelectually stimulating and rewarding. In any case, thank you for sharing!

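Wow, thank you for sharing! I went through something very similar and have also decided to give up the dream I had been chasing. I was a fourth-year PhD student at the University of Florida, and I had already changed labs once because my first mentor moved to UAB, so my project changed too. From my experience, I also came to see academia as full of cheap labor and false hopes. But like you, I held on for a long time out of love for science. Last November I quit and left with a master's degree, and I am very happy with that decision. My first job was as a research coordinator at UF; I earn more money and, like you, finally feel free of the endless pressure of graduate research. I think current students should know that there are many other things to do in life that are more rewarding and less stressful.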

Travels with Moby March 1, 2014 at 12:58 PM

Good Luck to you. I made this choice for many similar reasons about 6 years ago. But I was a college professor at a small college. You have aptly described the scenario. Even without the added stress of the grant machine, the choices that we are forces to make that divorce ourselves from family, friends to pursue this academic dream are incredibly costly. What I did find after a year in industry, is that I was not alone, I met former academics in industry and elsewhere that have expressed the same concerns. I wish you the best, and you are not alone.

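Good luck! I made the same decision six years ago, though I was a professor at a small college. You described our situation very accurately. Even without the pressure of grants, the choices this path forces on us pull us away from family and friends; pursuing the academic dream is really costly. What I found after a year in industry is that I was not alone: I met former academics there and elsewhere who voiced the same concerns. I wish you all the best. You are not alone.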

Kevin Zelnio, March 3, 2014 at 8:51 AM

Good luck! Life is better outside academia lol. I left 2 phds (got my masters after first one) and a decade long research career with 12 and then a 5+ year science communication career, left the country and started a microbrewery in Sweden. My skills as a scientist have been instrumental in my new profession as a beer maker (serious lab and sanitation skills here!) and a business person (improved and more diverse funding sources! AKA investors and people who drink BEER - which is like everybody). I cried a lot, I won't lie. Almost wrecked my marriage and the stress turned me into a horrible father for a while. Its just not a sustainable career for some types of people. Which is a shame, because the career is selecting for the same type of people and missing out on a diversity of life styles which could most likely benefit the scientific community in a number of ways. Here was my story: http://deepseanews.com/2013/02/19294/

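Good luck! Life outside academia is better, haha. After a research career of more than a decade and then more than five years in science communication, I left the country and started a microbrewery in Sweden. The skills I picked up as a scientist help a lot in my new job as a brewer, especially rigorous lab habits and sanitation technique, and the experience of chasing funding turns out to suit business well too. I cried a lot over this change, I admit. It almost wrecked my marriage and for a while made me a terrible father. For some kinds of people this is simply not a sustainable career, which is a shame. Here is my story.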

Unknown, March 3, 2014 at 2:10 PM

I don't see why I should view your departure as
a bad sign for the life sciences. As an engineer,
we celebrate when our students graduate, go
start a company or join an existing one, and
create products that make the world a better
place. Or, go work at a national laboratory,
the FCC, a non-profit, or any of the other types of
jobs where engineers make a contribution.

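I do not think your departure should be seen as a loss for science. As engineers, we celebrate when our students graduate and start a company or join an existing one to create products that make life better. Or they go to work at a national laboratory, the FCC, a non-profit, or any of the other kinds of jobs where engineers contribute.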

Lenny Teytelman, March 3, 2014 at 2:26 PM

First of all, it is terrific that you are supportive of graduate students who go on to be productive outside of academia! Unfortunately, in life sciences, you often lose support of your mentor the second you say that you do not plan to be a tenure track professor.
Second, and most importantly - the reason our departures and anxieties are cause for concern - being a professor, in the current funding climate, requires a level of sacrifice for science that fewer and fewer of the most talented and brightest scientists will make. Our taxpayers spend an extraordinary amount funding research. If the best scientists leave academia, research will suffer. Training of the future scientists will suffer. Science, inside and outside of academia will suffer.

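First, it is great that you support graduate students who go on to be productive outside academia. Unfortunately, in the life sciences you often lose your mentor's support the moment you say you do not plan to become a tenure-track professor.

Second, and most important, the reason our departures and anxieties are worrying: in today's funding climate, being a professor demands a level of sacrifice that fewer and fewer of the most talented scientists are willing to make. Taxpayers put a great deal of money into research. If the best scientists leave academia, research will suffer, the training of future scientists will suffer, and science inside and outside academia will suffer.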

greg, March 3, 2014 at 3:07 PM

Your story collection is a great idea. I hope you'll keep the sources of the site open? I bet a lot of people would like to contribute to making that project stand out - I would certainly be helping out.

Thanks a ton for your blog post. Your last point about leaving your wife if she'd treated you as badly as science does is awesome. I'm just coming to the realization that you seem to already have: the notion that "If you can see yourself possibly loving any profession as much as you love science, you're not cut out for science" is unhealthy - it's a mark of the sort of brainwashing that academia does to you.

Best wishes on your future path.

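Collecting all these stories is a great idea. I hope you will keep the site's sources open. I am sure many people would like to contribute and help the project stand out, and I would certainly help.

Thank you so much for this post. Your last point, that you would leave your wife if she treated you as badly as science does, is brilliant. I am starting to realize what you already have: the idea that doubting your devotion to science means you are not cut out for it is unhealthy; it is just the brainwashing that academia does to you.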

Jessica Wilson, March 6, 2014 at 11:43 AM

This is fantastic writing, despite the sadness. I sympathize (finishing PhD in neuroscience, considering heading out).

I'd love to try and make a video with some of the stories you've accumulated. I'm already looking through that Google Doc you posted right now, and my heart is breaking.

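This is great writing, despite the sadness. As someone finishing a PhD in neuroscience and thinking about leaving, I sympathize.

I would love to make a video from some of the stories you have collected. I am reading through the Google Doc you posted, and it is heartbreaking.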

CB, March 12, 2014 at 6:09 PM

Really great stuff. I have reread this post a dozen times over the past couple weeks, as I am a postdoc currently on the precipice of throwing in the towel on my academic career. I find the last sentence particularly meaningful. I can't shake the feeling that giving up on this career that I have been laser-focused on for ten years feels an awful lot like a traumatic breakup. But the simple truth is exactly as you described, academic science simply doesn't respect its professionals nearly enough for the best of us to stick around.

Ugh, breakups suck!

 

Michael Ruddy, March 14, 2014 at 1:55 PM

How appropriate for a Valentine post ... if you do not love everything about what you are doing – move on until you find it!

Haha, how appropriate for a Valentine's Day post. If you do not love what you are doing, move on and keep looking until you find what you love.

Nick Efford, February 15, 2014 at 8:53 AM

I sympathise and wish you a successful and fulfilling future, wherever that takes you. The pressures in UK academia are much the same, as is the relatively low pay. We've seen our pay fall 13% taking into account inflation over the last 5 or 6 years, and universities refuse to offer a decent pay increase despite increasing their income from students and despite the fact that they are sitting on huge cash reserves. My own institution would rather spend £50 million on new buildings than reward its staff for their dedication.

Like you and countless others, I'm reluctant to leave a job that can be very exciting and stimulating. But the truth is that the stress levels make it increasingly unsustainable. There is constant pressure to write papers and secure research funding and simultaneous pressure to improve teaching quality, but there is a failure to recognise that time is a finite resource, so one activity must inevitably be traded off against the other.

I don't expect to receive the same remuneration as I would in industry, but I do need one of two things to happen: either working conditions need to improve or the pay needs to improve to reflect the real pressures of the job. I've sacrificed too many evenings and weekends over the years, and that has had a negative impact on personal physical and mental health as well as family relationships. If something doesn't give soon, I could well end up following you out of academia.

The trade union for academics in the UK is currently locked in a bitter pay dispute with the universities. You can find out more about it at http://fairpay.web.ucu.org.uk/

Good luck!

Nick

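I sympathize, and I wish you a successful and fulfilling future wherever it takes you. The pressures in UK academia are much the same, and the pay is also relatively low. Allowing for inflation, our pay has fallen 13% over the last five or six years, yet universities refuse to offer a decent rise even though their income from students has grown and they are sitting on large cash reserves. My own institution would rather spend £50 million on new buildings than reward its staff for their dedication.

Like you and countless others, I am reluctant to leave a job that can be exciting and stimulating. But rising stress makes it harder and harder to sustain. We must publish papers, win research funding and improve teaching quality all at once, but time is finite, so one activity inevitably gets traded off against another.

I do not expect the same pay as in industry. I only hope that either working conditions improve or pay rises to reflect the real pressure of the job. Over the years I have given up countless evenings and weekends, which has hurt my physical and mental health and my family relationships. If things do not improve soon, I may well follow you out of academia.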

16

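The most comprehensive collection of transcriptome analysis software

I came across this site by accident. People abroad really do take this kind of thing seriously: the software list clearly took its author a great deal of effort. From: https://en.wiki2.org/wiki/List_of_RNA-Seq_bioinformatics_tools
The list covers the 18 areas of transcriptome analysis below. Reading it made me realize how much I still have to learn: I had thought of transcriptome analysis as differential expression followed by annotation, at most with some fusion gene detection or a look at miRNA and lncRNA. There turns out to be far more to it, which is why biology can absorb any number of researchers: there are always more research directions.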

    1 Quality control and pre-processing data
        1.1 Quality control and filtering data
        1.2 Detection of chimeric reads
        1.3 Errors Correction
        1.4 Pre-processing data
    2 Alignment Tools
        2.1 Short (Unspliced) aligners
        2.2 Spliced aligners
            2.2.1 Aligners based on known splice junctions (annotation-guided aligners)
            2.2.2 De novo Splice Aligners
                2.2.2.1 De novo Splice Aligners that also use annotation optionally
                2.2.2.2 Other Spliced Aligners
    3 Normalization, Quantitative analysis and Differential Expression
        3.1 Multi-tool solutions
    4 Workbench (analysis pipeline / integrated solutions)
        4.1 Commercial Solutions
        4.2 Open (free) Source Solutions
    5 Alternative Splicing Analysis
        5.1 General Tools
        5.2 Intron Retention Analysis
    6 Bias Correction
    7 Fusion genes/chimeras/translocation finders/structural variations
    8 Copy Number Variation identification
    9 RNA-Seq simulators
    10 Transcriptome assemblers
        10.1 Genome-Guided assemblers
        10.2 Genome-Independent (de novo) assemblers
            10.2.1 Assembly evaluation tools
    11 Co-expression networks
    12 miRNA prediction
    13 Visualization tools
    14 Functional, Network & Pathway Analysis Tools
    15 Further annotation tools for RNA-Seq data
    16 RNA-Seq Databases
    17 Webinars and Presentations
    18 References

 

16

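Shared whole-genome sequencing data for 3,000+ rice accessions, mainly variant data

The more bioinformatics I learn, the more I feel that the big-data era has arrived. Many researchers today could do their work from home, since most of the data are public. But what one person can achieve in isolation is limited, and the exchange of ideas with others still matters a lot.

This resource lists whole-genome paired-end sequencing data for 3,024 rice accessions, shared on Amazon's cloud, aligned to five different rice reference genomes, with variants called mainly with GATK.
The data providers also give a standard SNP calling pipeline.
I have used a similar pipeline myself.
SNP Pipeline Commands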

1. Index the reference genome using bwa index

   /software/bwa-0.7.10/bwa index /reference/japonica/reference.fa

2. Align the paired reads to reference genome using bwa mem. 
   Note: Specify the number of threads or processes to use using the -t parameter. The possible number of threads depends on the machine where the command will run.

   /software/bwa-0.7.10/bwa mem -M -t 8 /reference/japonica/reference.fa /reads/filename_1.fq.gz /reads/filename_2.fq.gz > /output/filename.sam

3. Sort SAM file and output as BAM file

   java -Xmx8g -jar /software/picard-tools-1.119/SortSam.jar INPUT=/output/filename.sam OUTPUT=/output/filename.sorted.bam VALIDATION_STRINGENCY=LENIENT CREATE_INDEX=TRUE

4. Fix mate information

   java -Xmx8g -jar /software/picard-tools-1.119/FixMateInformation.jar INPUT=/output/filename.sorted.bam OUTPUT=/output/filename.fxmt.bam SO=coordinate VALIDATION_STRINGENCY=LENIENT CREATE_INDEX=TRUE

5. Mark duplicate reads

   java -Xmx8g -jar /software/picard-tools-1.119/MarkDuplicates.jar INPUT=/output/filename.fxmt.bam OUTPUT=/output/filename.mkdup.bam METRICS_FILE=/output/filename.metrics VALIDATION_STRINGENCY=LENIENT CREATE_INDEX=TRUE MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000

6. Add or replace read groups

   java -Xmx8g -jar /software/picard-tools-1.119/AddOrReplaceReadGroups.jar INPUT=/output/filename.mkdup.bam OUTPUT=/output/filename.addrep.bam RGID=readname PL=Illumina SM=readname CN=BGI VALIDATION_STRINGENCY=LENIENT SO=coordinate CREATE_INDEX=TRUE

7. Create index and dictionary for reference genome

   /software/samtools-1.0/samtools faidx /reference/japonica/reference.fa
   
   java -Xmx8g -jar /software/picard-tools-1.119/CreateSequenceDictionary.jar REFERENCE=/reference/japonica/reference.fa OUTPUT=/reference/reference.dict

8. Realign Target 

   java -Xmx8g -jar /software/GenomeAnalysisTK-3.2-2/GenomeAnalysisTK.jar -T RealignerTargetCreator -I /output/filename.addrep.bam -R /reference/japonica/reference.fa -o /output/filename.intervals -fixMisencodedQuals -nt 8

9. Indel Realigner

   java -Xmx8g -jar /software/GenomeAnalysisTK-3.2-2/GenomeAnalysisTK.jar -T IndelRealigner -fixMisencodedQuals -I /output/filename.addrep.bam -R /reference/japonica/reference.fa -targetIntervals /output/filename.intervals -o /output/filename.realn.bam 

10. Merge individual BAM files if there are multiple read pairs per sample

   /software/samtools-1.0/samtools merge /output/filename.merged.bam /output/*.realn.bam

11. Call SNPs using Unified Genotyper

   java -Xmx8g -jar /software/GenomeAnalysisTK-3.2-2/GenomeAnalysisTK.jar -T UnifiedGenotyper -R /reference/japonica/reference.fa -I /output/filename.merged.bam -o filename.merged.vcf -glm BOTH -mbq 20 --genotyping_mode DISCOVERY -out_mode EMIT_ALL_SITES
16

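A continuously updated collection of NGS read mappers

I stumbled on this site, which is more complete and more professional than the wiki list. It collects the better-known mappers, lists them with brief comments, and is based mainly on a 2012 review, "Tools for mapping high-throughput sequencing data", which many people have probably read. Beginners in bioinformatics should search the literature databases for reviews on keywords that interest them and read widely; it never hurts.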

<img src="http://www.ebi.ac.uk/~nf/hts_mappers/mappers_timeline.jpeg" alt="Mappers Timeline" width="800">

Features Comparison

The following Table enables a comparison of mappers based on different characteristics. The table can be sorted by column (just click on the column name). The data was collected from different sources and in some cases was provided by the developers. For execution times and memory requirements we refer to the above mentioned review (supplementary data is available here).

The Data column indicates if the mapper is specifically tailored for DNA, RNA, miRNA, or bisulfite sequences.The Seq.Plat. column indicates if the mapper supports natively reads from a specific sequencing platform or not (N). The version column indicates the version of the mapper considered. Read length limits are showed in two columns: minimum read length (Min. RL) and maximum read length (Max. RL.). Unless otherwise stated the unit is base pairs. The support for mismatches and short indels is also presented including, when possible, the maximum number of allowed mismatches and indels: by default the value is presented in bases; in some cases the value is presented as a percentage of the read size; or as score, meaning that mapper uses a score function. The alignments reported column indicate the alignments reported when a read maps to multiple locations. The alignment column indicates if the reads are aligned end-to-end (Globally) or not (Locally). The Parallel column indicates if the mapper can be run in parallel and, if yes, how: using a shared-memory (SM) or/and a distributed memory (DM) computer. The QA (quality awareness) column indicates if the mapper uses read quality information during the mapping. The support for paired-end/mate-pair reads is indicated in the PE column. The Splicing column indicates, for the RNA mappers, if the detection of splice junctions is made de novo or/and through user provided libraries (Lib). The Index column indicates if the reads or/and the reference are indexed. The number of citations was obtained from Google Scholar on 13 June 2015.
16

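Nature's statistics collection: Statistics in biology

In biology, biostatistics is about the only part with real technical depth and a real barrier to entry, and it is a pain point for most researchers. If you are up to it, have a look at Nature's collection on statistics, which focuses on statistics as applied to the natural sciences.

A few memorable quotes from it:
Statistics does not tell us whether we are right. It tells us the chances of being wrong.
Quality is often more important than quantity.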
The meaning of error bars is often misinterpreted, as is the statistical significance of their overlap.
Good experimental designs mitigate experimental error and the impact of factors not under study.
List of articles:
Research methods: Know when your numbers are significant
Scientific method: Statistical errors
Weak statistical standards implicated in scientific irreproducibility
The fickle P value generates irreproducible results
Vital statistics
Experimental biology: Sometimes Bayesian statistics are better
A call for transparent reporting to optimize the predictive value of preclinical research
Power failure: why small sample size undermines the reliability of neuroscience
Basic statistical analysis in genetic case-control studies
Erroneous analyses of interactions in neuroscience: a problem of significance
Analyzing 'omics data using hierarchical models
Advantages and pitfalls in the application of mixed-model association methods
Quality control and conduct of genome-wide association meta-analyses
Circular analysis in systems neuroscience: the dangers of double dipping
A solution to dependency: using multilevel analysis to accommodate nested data
How does multiple testing correction work?
What is Bayesian statistics?
What is a hidden Markov model?
The articles below cover what is basically standard textbook statistics, but being published in a Nature journal makes them look a lot more impressive
Points of significance: Importance of being uncertain
Points of Significance: Error bars
Points of significance: Significance, P values and t-tests
Points of significance: Power and sample size
Points of Significance: Visualizing samples with box plots
Points of significance: Comparing samples, part I
Points of significance: Comparing samples, part II
Points of significance:  Nonparametric tests
Points of significance: Designing comparative experiments
Points of significance: Analysis of variance and blocking
Points of Significance:  Replication
Points of Significance:  Nested designs
Points of Significance: Two-factor designs
Points of significance: Sources of variation
Points of Significance: Split plot design
Points of Significance: Bayes' theorem
Points of significance: Bayesian statistics
Points of Significance: Sampling distributions and the bootstrap
Points of Significance: Bayesian networks
A study with low statistical power has a reduced chance of detecting a true effect, but it is less well appreciated that low power also reduces the likelihood that a statistically significant result reflects a true effect. Here, we show that the average statistical power of studies in the neurosciences is very low. The consequences of this include overestimates of effect size and low reproducibility of results. There are also ethical dimensions to this problem, as unreliable research is inefficient and wasteful. Improving reproducibility in neuroscience is a key priority and requires attention to well-established but often ignored methodological principles.
16

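A bioinformatician's path to learning MySQL

I have always known that MySQL is useful, even in bioinformatics, and I have read quite a few MySQL tutorials on and off, but never had a chance to apply them. Applying something is the best way to learn it, and since I needed it these past few days, I went over how a bioinformatician might learn MySQL.
Then search for a handful of tips,
and you are more or less ready to start.
We do not use the database to build web pages; all we need is to query data from public databases. Most people would query through the web interface, or bulk-download from FTP and write their own scripts. I used to think the same way, so MySQL seemed pointless: anything it could do, I could do with a script. But anything this popular must have its strengths.
As I see it, the strength of MySQL is that you do not need to store large numbers of files; you query what you need when you need it. To keep a local copy of a database we would have to create a pile of folders for refGene, Entrez Gene, transcripts, exons and so on, each with its own pile of files, and all of it again per species. That is a lot of trouble, and a database avoids it.
For example, we can connect to the UCSC database (provided the mysql command is available on your machine and you are online):
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A
That is all it takes to log in to the UCSC database remotely with mysql. You can then run show databases; or use hg19; and so on.
There are more than two hundred databases, mostly different species and assembly versions. The hg19 database alone holds more than ten thousand tables with comprehensive information about hg19.
There are many other public databases to practice on.
From: https://www.biostars.org/p/474/#9095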

for example, I would cite:

UCSC http://genome.ucsc.edu/FAQ/FAQdownloads#download29
ENSEMBL http://uswest.ensembl.org/info/data/mysql.html
GO http://www.geneontology.org/GO.database.shtml#mirrors

1000 Genomes: since June 16, 2011: http://www.1000genomes.org/public-ensembl-mysql-instance

mysql -h mysql-db.1000genomes.org -u anonymous -P 4272

Flybase has direct access to its postgres chado database.
http://flybase.org/forums/viewtopic.php?f=14&t=114
hostname: flybase.org port: 5432 username: flybase password: no password database name: flybase
e.g. psql -h flybase.org -U flybase flybase

mysql -h database.nencki-genomics.org -u public
mysql -h useastdb.ensembl.org -u anonymous -P 5306

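You can log in to any of them to see what is there and to practice MySQL syntax, although of the four basic operations (insert, delete, update, select) only select is available.
We can then connect to the database from R, Perl or Python, which works well; I currently lean toward R.
So I skimmed the manual of the RMySQL package and connected successfully: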
# Connect to the MySQL server; the equivalent shell command is:
# mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A
# The -A flag is optional but is recommended for speed
library(RMySQL)
my.host="genome-mysql.cse.ucsc.edu";
my.user="genome";
my.password="";
my.db="hg19";
# there are 203 databases, such as hg18, hg38, mm9, mm10, ce10
con <- dbConnect(MySQL(), host=my.host, user=my.user, password=my.password, dbname=my.db)
dbListTables(con) # there are 11016 tables in this hg19 database

Simple, isn't it? As long as you study carefully, these practical things are actually fairly easy.
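For comparison, the same kind of connection can be sketched in Python. This is an illustration rather than part of the original post: it assumes the third-party `pymysql` package is installed, reuses the UCSC host and user shown above, and queries the `refGene` table of hg19 by gene symbol. The function names are hypothetical.

```python
def connect_ucsc(db="hg19"):
    """Open a connection to the public UCSC MySQL server (needs network access)."""
    import pymysql  # third-party package; imported here so the rest runs without it

    return pymysql.connect(
        host="genome-mysql.cse.ucsc.edu", user="genome", database=db
    )


def gene_transcripts(cursor, symbol):
    """Return (name, chrom, strand, txStart, txEnd) rows for one gene symbol."""
    cursor.execute(
        "SELECT name, chrom, strand, txStart, txEnd "
        "FROM refGene WHERE name2 = %s",
        (symbol,),  # parameter binding instead of string formatting
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = connect_ucsc()
    try:
        with connection.cursor() as cursor:
            for row in gene_transcripts(cursor, "TP53"):
                print(row)
    finally:
        connection.close()
```

Passing the cursor in, rather than connecting inside the query function, keeps the query logic usable with any DB-API cursor.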

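The book below is also good; it covers connecting to databases from R, Perl or Python quite thoroughly.

Of course, if you want to see MySQL applied in bioinformatics, there is plenty more study material below.
For a more advanced, concrete example, see the design of the GO database: http://geneontology.org/page/lead-database-schema
Judging from this one, Python is much nicer than Perl: http://www.personal.psu.edu/iua1/courses/files/2010/week15.pdf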
16

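You can actually sell TCGA data, as long as you analyze it a little

I could hardly believe my eyes: so you can make money this way too.
The authors of this database published a paper in 2011 on finding fusion genes: Edgren, Henrik, et al. "Identification of fusion genes in breast cancer by paired-end RNA-sequencing." Genome Biol 12.1 (2011): R6.

Building on that, they processed the cancer samples in the TCGA project, produced a fusion gene dataset, and now sell it.

The price goes up to about 10,000 euros, more than 70,000 RMB, for very little outlay. The TCGA data are public and free, yet a second round of processing is enough to turn them into a product, which I find rather galling.
So far they have processed 7625 cancer samples from TCGA, built a fusion gene dataset covering 28 cancer types, and packaged it as a product called FusionSCOUT.
The prices are as follows: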

Pricing of FusionSCOUT datasets:

  • Single gene in one cancer set: 490 € / 580 $ per dataset
  • Single gene fusions across all cancers: 4900 € / 5800 $ per dataset
  • Individual cancer set: 990 € / 1250 $ per dataset
  • Full TCGA dataset: 9900 € / 12500 $ per dataset
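The website introduces the product as quoted below, and claims that 3500 research groups have already used their data. I think that is pure bragging: the paper itself has only a hundred-odd citations, and 3500 purchases of this one product would make the author enormously rich (if every sale were the full dataset at 9900€, that would be roughly 34.65 million euros), which is frightening to think about. Besides, the website is poor and painfully slow to reach from China, which amounts to giving up every wealthy Chinese customer, so how could it possibly have 3500 sales? Ridiculous!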

One of the latest therapeutics angles in the fight against cancer is fusion genes and their regulation. To aid in fusion gene research and reveal the multitude of gene fusion event in cancer samples MediSapiens has developed a proprietary FusionSCOUT pipeline for identifying fusion genes from RNA sequencing datasets.

Currently we have analysed 7625 tumour samples from the TCGA project building a fusion gene dataset covering 28 different cancers within the TCGA project which can be accessed through our FusionSCOUT product.

Using this pipeline, we have discovered 3930 samples with gene fusions with 9667 different fusion genes. We've discovered numerous novel gene fusions as well as new cancer types in which previously known fusions appear.

You can now purchase these gene fusion datasets with a few mouse clicks and get the world's most comprehensive gene fusions from cancer sets within days.

FusionSCOUT cancer Reports

With FusionSCOUT you can access the full listings of all fusion genes in specific cancer datasets. Find new leads for possible cause of the cancer, examine the pathways that are affected by different fusions, stratify patients by shared fusion genes or search for potential target for drugs and companion diagnostics.

Once you purchase a FusionSCOUT dataset we will send you a detailed report with information on the fused genes, sample ID from the TCGA dataset, fusion frequencies across the dataset as well as fusion mRNA sequences and lists of protein domains present in the fusion transcripts.

By ordering the MediSapiens FusionSCOUT dataset, you'll get:

  • A list of all gene fusions that involve your gene of interest, across all TCGA cancer types
  • TCGA sample IDs for the samples with fusions
  • Exact exon junctions for the fusions, including alternatively spliced variants and data on whether reading frame is retained
  • Detailed list of protein domains retained in the fusion genes
  • cDNA sequence for the fusion mRNAs

Contact us to access the most up-to-date and comprehensive datasets of fusion gene events in different cancers! contact@medisapiens.com

Check out also our Fusion Gene Detection pipeline service for your samples!

Dataset missing? Email us and we'll add your favorite dataset to FusionSCOUT!

FusionSCOUT Cancer sets, March 2015

Cancer type Number of samples Number of fusion genes
Acute Myeloid Leukemia, LAML 153 69
Adrenocortical carcinoma, ACC 79 115
Bladder Urothelial Carcinoma, BLCA 273 473
Brain Lower Grade Glioma, LGG 467 309
Breast Invasive Carcinoma, BRCA 1029 3267
Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma, CESC 195 190
Colon Adenocarcinoma, COAD 287 212
Glioblastoma multiforme, GBM 170 379
Head and Neck Squamous Cell Carcinoma, HNSC 412 386
Kidney Chromophobe, KICH 66 19
Kidney Renal Clear Cell Carcinoma, KIRC 523 217
Kidney Renal Papillary Cell Carcinoma, KIRP 226 145
Liver Hepatocellular Carcinoma, LIHC 198 317
Lung Adenocarcinoma, LUAD 456 991
Lung Squamous Cell Carcinoma, LUSC 482 1374
Lymphoid Neoplasm Diffuse Large B-cell Lymphoma, DLBC 28 18
Mesothelioma, MESO 36 26
Ovarian Serous Cystadenocarcinoma, OV 420 1166
Pancreatic Adenocarcinoma, PAAD 84 46
Pheochromocytoma and Paraganglioma, PCPG 184 83
Prostate Adenocarcinoma, PRAD 336 859
Rectum Adenocarcinoma, READ 85 74
Sarcoma, SARC 161 799
Skin Cutaneous Melanoma, SKCM 355 620
Stomach Adenocarcinoma, STAD 190 311
Thyroid Carcinoma, THCA 506 195
Uterine Carcinosarcoma, UCS 57 229
Uterine Corpus Endometrial Carcinoma, UCEC 167 422
14

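Several well-known international conferences related to bioinformatics

The conferences are:
ASHG: Annual Meeting of the American Society of Human Genetics
AGBT: Advances in Genome Biology & Technology
ASM: annual meeting of the American Society for Microbiology
ASHI: The American Society for Histocompatibility and Immunogenetics
BOSC: Bioinformatics Open Source Conference
ISMB/ECCB
ACMG: The ACMG Annual Clinical Genetics Meeting
BoG: annual Biology of Genomes meeting at Cold Spring Harbor
The list above is in no particular order.
The annual meeting of the American Society of Human Genetics (ASHG) is a major event in genetics and currently the largest human genetics conference. The 2015 meeting takes place October 6-10 in Baltimore, Maryland, and has attracted more than 6500 scientists, who will present and discuss the latest advances in every area of human genetics.
Conference website: http://www.ashg.org/
It is a grand occasion and highly regarded in the field!
All the conference slides can be downloaded, but from China you need a way around the firewall.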

https://storify.com/andrewsu/ashg14-speaker-slides

Advances in Genome Biology & Technology (AGBT)

Introduction in Chinese:

The 23rd Annual International Conference on Intelligent Systems for Molecular Biology (ISMB 2015)
14th Annual European Conference on Computational Biology (ECCB 2015)
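This is another long-established conference; its website is: https://www.iscb.org/
The 2015 conference materials are now available for direct download:
The ASM meeting is a bit less substantial:
Conference website: http://www.asm.org/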

WHO WE ARE

The American Society for Microbiology (ASM) is the oldest and largest single life science membership organization in the world. Membership has grown from 59 scientists in 1899 to more than 39,000 members today, with more than one third located outside the United States. The members represent all aspects of the microbial sciences including microbiology educators.

The mission of ASM is to promote and advance the microbial sciences.

ASM accomplishes this mission through a variety of products, services and activities.

  • We provide a platform for sharing the latest scientific discoveries through our books, journals, meetings and conferences.
  • We help strengthen sustainable health systems around the world though our laboratory capacity building and global engagement programs.
  • We advance careers through our professional development programs and certifications.
  • We train and inspire the next generation of scientists through our outreach and educational programs.

ASM members have a passion for the microbial sciences, a desire to connect with their colleagues and a drive to be involved with the profession. Whether it is publishing in an ASM Journal, attending an ASM meeting or volunteering on one of the Society's many boards and committees.

Big parts of our everyday lives, from energy production, waste recycling, new sources of food, new drug development and infectious diseases to environmental problems and industrial processes-are studied in the microbial sciences.

Microbiology boasts some of the most illustrious names in the history of science--Pasteur, Koch, Fleming, Leeuwenhoek, Lister, Jenner and Salk--and some of the greatest achievements for mankind. Within the 20th century, a third of all Nobel Prizes in Physiology or Medicine have been awarded to microbiologists.

ASHI is mainly about immunology:

Website: http://www.ashi-hla.org/

The ASHI 41st Annual Meeting site is now live, for the latest updates visit 2015.ashi-hla.org.
About ASHI

The American Society for Histocompatibility and Immunogenetics (ASHI) is a not-for-profit association of clinical and research professionals including immunologists, geneticists, molecular biologists, transplant physicians and surgeons, pathologists and technologists. As a professional society involved in histocompatibility, immunogenetics and transplantation, ASHI is dedicated to advancing the science and application of histocompatibility and immunogenetics; providing a forum for the exchange of information; and advocating the highest standards of laboratory testing in the interest of optimal patient care.

BOSC: Bioinformatics Open Source Conference

Wikipedia has a detailed introduction:

The Bioinformatics Open Source Conference (BOSC) is an academic conference on open source programming in bioinformatics organised by the Open Bioinformatics Foundation. The conference has been held annually since 2000 and is run as a two-day satellite meeting preceding the Intelligent Systems for Molecular Biology (ISMB) conference.

annual Biology of Genomes (BoG) meeting

The ACMG meeting is clinically oriented and gets less coverage.
Conference website: http://www.acmgmeeting.net/

ABOUT

The ACMG Annual Clinical Genetics Meeting provides genetics professionals with the opportunity to learn how genetics and genomics are being integrated into medical or clinical practice. The ACMG Annual Meeting Program Committee has developed a high caliber scientific program that will present the latest developments and research in clinical genetics and genomics

 

09

Comparing VCF variant data against published variants

After calling SNPs on NGS data, the output is usually in VCF format, with one line per variant, for example:
20      2451451 .       G       T       1939.77 .
AC=1;AF=0.500;AN=2;BaseQRankSum=-10.134;DP=239;Dels=0.00;FS=2.276;HaplotypeScore=0.0000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQ0=0;MQRankSum=-0.258;QD=8.12;ReadPosRankSum=0.823;SOR=0.870
GT:AD:DP:GQ:PL  0/1:150,89:239:99:1968,0,3874
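The first few columns record which chromosome the variant is on and its coordinate on that chromosome, the base the reference genome has at that site, and the base we observed instead.
The last two columns are the FORMAT string and the per-sample values it describes: here the genotype (GT), the read depths of the reference and alternate alleles (AD), the total depth (DP), the genotype quality (GQ), and the genotype likelihoods (PL).
The eighth column (INFO) is the most complex: it can hold up to several hundred annotations, depending on which software called the SNPs and which software annotated them afterwards.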
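The column layout described above can be made concrete with a small parser. This is only a minimal sketch: the function name `parse_vcf_line` is made up for illustration, the INFO field of the example record is shortened, and a real pipeline would normally rely on an established VCF library rather than hand-rolled string splitting.

```python
def parse_vcf_line(line):
    """Parse one tab-separated VCF data line into a dictionary."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt, qual, filt, info = cols[:8]

    # Column 8 (INFO): ';'-separated KEY=VALUE pairs; a bare KEY is a flag.
    info_map = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info_map[key] = value
        elif item and item != ".":
            info_map[item] = True

    record = {
        "chrom": chrom, "pos": int(pos), "id": var_id,
        "ref": ref, "alt": alt.split(","),
        "qual": qual, "filter": filt, "info": info_map,
    }
    # Columns 9 and 10: FORMAT keys paired with the first sample's values.
    if len(cols) >= 10:
        record["sample"] = dict(zip(cols[8].split(":"), cols[9].split(":")))
    return record


# The example record from the text, rebuilt with explicit tabs
# (INFO shortened to a few of its keys for readability).
example = "\t".join([
    "20", "2451451", ".", "G", "T", "1939.77", ".",
    "AC=1;AF=0.500;AN=2;DP=239;MQ=60.00;QD=8.12",
    "GT:AD:DP:GQ:PL", "0/1:150,89:239:99:1968,0,3874",
])
rec = parse_vcf_line(example)
# AD holds "ref_depth,alt_depth" for this sample.
ref_depth, alt_depth = (int(x) for x in rec["sample"]["AD"].split(","))
print(rec["chrom"], rec["pos"], rec["ref"], rec["alt"], ref_depth, alt_depth)
```

Keeping INFO as a plain dictionary of strings avoids guessing at types, since which keys appear depends on the caller and annotator used.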
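Next we want to know whether the variants we found have already been reported in other published databases, so we can download many public databases to compare against. The ones I have come across are listed below; downloading all of them would take an estimated 0.5 TB.
dbsnp144 (provided by NCBI, the most authoritative)
cgi69
ExAC.vcf.gz (the Exome Aggregation Consortium, provided by the Broad Institute)
Cosmic_v73.ann.vcf.gz (a catalogue of cancer mutations)
finalTCGA.vcf.gz (the TCGA project, also cancer-related)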
icgc.vcf.gz
dbNSFP2.6vcf
SCLP.ann.vcf.gz
CCLE.ann.vcf.gz
ESP6500-SIv2.vcf.gz (Variants from the Exome Sequencing Project (ESP))
adni-sum
safs-sum.indel.vcf.gz
gonl.vcf.gz
ssm.vcf.gz
ssi.vcf.gz
uk10k.vcf.gz
1000g-ph3v5.gff.gz (the 1000 Genomes Project)
gwasCatalog.gff.gz
phewascatalog.gff.gz
dbgap-gwas.gff.gz
interproDomain.gff.gz
clinvar.gff.gz
RegulomeDB.gff.gz
CancerGAMAdb.gff.gz
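The comparison itself comes down to asking, for each called variant, which databases contain the same (chromosome, position, ref, alt) combination. The sketch below illustrates that idea on tiny in-memory VCF snippets. The function names and the toy records (including the `rs_toy` ID) are invented for illustration; the real files are gzip-compressed and far too large to load into sets like this, so dedicated annotation tools or indexed lookups are the usual choice. The GFF-formatted databases in the list would need a different parser.

```python
def variant_keys(vcf_lines):
    """Yield (chrom, pos, ref, alt) for every ALT allele in VCF data lines."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):  # multi-allelic sites list several ALTs
            yield (chrom, int(pos), ref, alt)


def known_in(databases, query_lines):
    """Map each query variant to the sorted names of databases containing it."""
    index = {name: set(variant_keys(lines)) for name, lines in databases.items()}
    return {
        key: sorted(name for name, keys in index.items() if key in keys)
        for key in variant_keys(query_lines)
    }


# Toy data, invented for this example only.
databases = {
    "dbsnp": ["##fileformat=VCFv4.1",
              "20\t2451451\trs_toy\tG\tT,A\t.\t.\t."],
    "cosmic": ["20\t999\t.\tC\tG\t.\t.\t."],
}
query = [
    "20\t2451451\t.\tG\tT\t1939.77\t.\tDP=239",
    "20\t5000\t.\tA\tC\t50\t.\tDP=30",
]
hits = known_in(databases, query)
for key, names in hits.items():
    print(key, names if names else "novel")
```

Matching on the exact allele, not just the position, matters because a database may report a different substitution at the same site.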
29

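Five recommended professors in bioinformatics

In no particular order:

Istvan Albert, a professor at Pennsylvania State University

He wrote a book: https://www.biostarhandbook.com/
He also runs an online course that awards a certificate: http://www.personal.psu.edu/iua1/certificate.html
He also recommends a book on R: http://onepager.togaware.com/

Obi L. Griffith, a professor at Washington University School of Medicine

His homepage: http://www.obigriffith.org/

One of his better-known contributions is www.rnaseq.wiki
He is very active on the Biostars bioinformatics forum.
His courses include Molecular Basis of Cancer (BIO5288) and Genetics and Genomics of Disease (BIO5487) at Washington University School of Medicine.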
I was a TA for Genome Analysis (MEDG505) and the bioinformatics section of Advanced Human Molecular Genetics (MEDG520) and a guest instructor for Cell Biology For Biomedical Engineering Graduate Students (APSC552), Cell and Organismal Biology (BIOL111) and Cell Biology (BIOL200) at UBC.

Malachi Griffith, a professor at Washington University School of Medicine

His personal homepage: http://www.malachigriffith.org/index.htm

His GitHub page: https://github.com/malachig

www.dgidb.org
www.alexaplatform.org

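Pablo Cingolani of McGill University

He is the author of SnpEff.

His GitHub: https://github.com/pcingola
He currently works at McGill University.

Stephen Turner, a professor at the University of Virginia

His personal homepage: http://stephenturner.us/

All of his public slides: https://speakerdeck.com/stephenturner
I want to single out Stephen in particular, because he has produced an especially large amount of teaching material.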
09

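Differential gene expression analysis of Affymetrix gene expression microarray data

I mainly worked through one tutorial on differential analysis that is very detailed and comprehensive. I will briefly outline the tutorial first and then post my own code.

GEO originally had only three levels of records: Sample, Platform, and Series.
There are now 14,927 platforms, covering chips from the mainstream vendors such as Affymetrix, Agilent, and Illumina, their applications in different areas (SNP, SNV, GWAS, and so on), and a variety of organisms (human, mouse, rat).
This analysis workflow applies only to Affymetrix gene expression chips.
The table of contents is as follows:
Because that post was itself a repost, its link has gone dead; the current link is as follows:
Searching again by the section titles will certainly turn up the content; dead links are only to be expected.
I have organized and re-annotated the details in a Youdao Cloud Note.
Anyone who reads this tutorial carefully should be able to get started with differential analysis of microarray data, but it is only one approach to differential analysis, and a rather dated one.
Nowadays differential analysis packages built for high-throughput sequencing, such as DESeq and edgeR, are more popular. Even for microarray data from more than a decade ago, there is no need to download the raw CEL files; you can directly download each project's expression matrix, the Series Matrix File(s).
Then read it directly in R with read.table once the arguments are set appropriately!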
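To show what "setting the arguments appropriately" has to deal with, here is a rough Python sketch of reading a Series Matrix file. It assumes the usual layout: metadata lines starting with `!`, and a tab-separated table enclosed between `!series_matrix_table_begin` and `!series_matrix_table_end` whose header row starts with `ID_REF`. The function name and the toy content (`GSM1`, `probe_a`, and so on) are made up for illustration.

```python
import csv
import io


def to_number(text):
    """Convert a cell to float; empty or non-numeric cells become None."""
    try:
        return float(text)
    except ValueError:
        return None


def read_series_matrix(handle):
    """Return (sample_ids, {probe_id: [values]}) from a Series Matrix file."""
    samples, rows, in_table = [], {}, False
    # csv.reader also strips the double quotes GEO puts around strings.
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue
        tag = fields[0]
        if tag == "!series_matrix_table_begin":
            in_table = True
        elif tag == "!series_matrix_table_end":
            break
        elif in_table and tag == "ID_REF":
            samples = fields[1:]
        elif in_table:
            rows[tag] = [to_number(x) for x in fields[1:]]
    return samples, rows


# Toy file content, invented for this example.
toy = (
    '!Series_title\t"toy series"\n'
    "!series_matrix_table_begin\n"
    '"ID_REF"\t"GSM1"\t"GSM2"\n'
    '"probe_a"\t7.5\t8.25\n'
    '"probe_b"\t\t6.0\n'
    "!series_matrix_table_end\n"
)
samples, rows = read_series_matrix(io.StringIO(toy))
print(samples, rows)
```

For a downloaded file, the same function could be given a handle from `gzip.open(path, "rt")`, since GEO distributes these files gzip-compressed.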
06

jQuery study notes

From now on I will share posts like this directly through Youdao Cloud Notes, which saves space on this free cloud server.

jQuery study notes, part 1: basic syntax

http://note.youdao.com/share/?id=82021515144eb4820762e9fdbc686340&type=note

jQuery notes, part 2: effects

http://note.youdao.com/share/?id=08eb606b2084b9b0d8c9eb5ef72e3433&type=note

jQuery notes, part 3: manipulating HTML elements

http://note.youdao.com/share/?id=fb8ff7deeb186adb82751838bf82cfbe&type=note

jQuery notes, part 4: loops, traversal, conditionals, and similar statements

http://note.youdao.com/share/?id=746ac6f1a801351f49d13cb3d7a335bf&type=note

jQuery notes, part 5: Ajax

http://note.youdao.com/share/?id=0b2c6fb8c89e307ec79602e6d67e7c66&type=note

jQuery reference manual: complete function list

http://note.youdao.com/share/?id=2e926f98c9bd51b1192d309706f8c1ca&type=note

 

 

29

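Must-read literature for cancer research

I recently needed to learn some background on cancer and came across this reading list. I thought it was excellent, so I am recommending it to everyone.

Take your time with it; a month should be enough to get through all of these papers.

Complete list of cancer types: http://www.cancer.gov/types
Complete list of cancer drugs: http://www.cancer.gov/about-cancer/treatment/drugs
Almost all information about cancer can be found on this site: http://www.cancer.gov/
including popular science, treatment, diagnosis, prognosis, classification, drugs, prediction, and more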

different_kinds_of_cancer_in_CCLE

Cancer Precision Medicine: Improving Evidence in Practice - August 24, 2015

NCI-MATCH Trial Opens, AACR blog post, August 2015

NCI-MATCH launch highlights new trial design in precision-medicine era
McNeal C, JNCI, August 2015

The Cancer Genomics Resource List, 2014
Zutter MM et al. CAP Lab Improvement Program, Archives of Pathology, August 2015

Personalized medicine and economic evaluation in oncology: all theory and no practice?
Garattini L et al. Expert Rev Pharmacoecon Outcomes Res 2015 Aug 9. 1-6

Precision medicine trials bring targeted treatments to more patients, C. Helwick, ASCO Post, Jul 25

Next-generation sequencing to guide cancer therapy
Gagan J et al, Genome Medicine, July 29, 2015

Feasibility of large-scale genomic testing to facilitate enrollment onto genomically matched clinical trials.
Meric-Bernstam F et al. J. Clin. Oncol. 2015 May 26.

Brave-ish new world-what's needed to make precision oncology a practical reality.
MacConaill LE et al. JAMA Oncol 2015 Jul 16.

Genomic profiling: Building a continuum from knowledge to care
Helen C et al. JAMA Oncology, July 2015

Are we there yet?
When it comes to curing cancer, targeted therapies and genomic sequencing are helping, but we still have far to go. Genome Magazine, June 29, 2015

Artificial intelligence, big data, and cancer
Kantarjian H et al, JAMA Oncology, June 2015

Multigene panel testing in oncology practice - how should we respond?
Kurian AW et al. JAMA Oncology, June 2015

Use of whole genome sequencing for diagnosis and discovery in the cancer genetics clinic.
Foley SB et al. EBioMedicine 2015 Jan 2(1) 74-81

The future of molecular medicine: biomarkers, BATTLEs, and big data
ES Kim, ASCO University, June 2015

NCI-MATCH trial will link targeted cancer drugs to gene abnormalities

Targeted agent and profiling utilization registry study, from the American Society for Clinical Oncology

ASCO study aims to learn from patient access to targeted cancer drugs used off-label, American Society for Clinical Oncology

Improving evidence developed from population-level experience with targeted agents
McLellan M et al Issue Brief. Conference on Clinical Cancer Research November 2014

Implementing personalized cancer care.
Schilsky RL et al. Nat Rev Clin Oncol 2014 Jul (7) 432-8

Accelerating the delivery of patient-centered, high-quality cancer care.
Abrahams E et al. Clin. Cancer Res. 2015 May 15. (10) 2263-7

Next-generation clinical trials: Novel strategies to address the challenge of tumor molecular heterogeneity.
Catenacci DV et al. Mol Oncol 2015 May (5) 967-996

Cancer Precision Medicine: Improving Evidence in Practice - May 29, 2015

Diagnosis and treatment of cancer using genomics
Vockley JG et al. BMJ, May 28, 2015

Precision Medicine: Cancer and Genomics - May 12, 2015

Promise, peril seen in personalized cancer therapy, by Marie McCullough, Philadelphia Inquirer, May 10

A decision support framework for genomically informed investigational cancer therapy.
Meric-Bernstam F et al. J. Natl. Cancer Inst. 2015 Jul (7)

Divide and conquer: The molecular diagnosis of cancer, by Louis M. Staudt, National Cancer Institute, Apr 13

Health: Make precision medicine work for cancer care
To get targeted treatments to more cancer patients pair genomic data with clinical data, and make the information widely accessible, Mark A. Rubin. Nature News, Apr 15

Using somatic mutations to guide treatment decisions
Horlings H et al. JAMA Oncology, March 12, 2015

The landscape of precision cancer medicine clinical trials in the United States
Roper N et al. Cancer Treatment Reviews 2015

What is “precision medicine”? Information from the National Cancer Institute

Impact of cancer genomics on precision medicine for the treatment of cancer, from the Cancer Genome Atlas, NCI

US precision-medicine proposal sparks questions, by Sara Reardon, Nature News, Jan 22

Obama's 'precision medicine' means gene mapping, NBC News, Jan 21

What is President Obama's 'precision medicine' plan, and how might it help you? By Lenny Bernstein, Jan 21

Recent reviews

Companion diagnostics: the key to personalized medicine.
Jørgensen JT. Expert Rev Mol Diagn. 2015 Feb;15(2):153-6

Promoting precision cancer medicine through a community-driven knowledgebase.
Geifman N, et al. J Pers Med. 2014 Dec 15;4(4):475-88.

Toward a prostate cancer precision medicine.
Rubin MA. Urol Oncol. 2014 Nov 20.

Prioritizing targets for precision cancer medicine.
Andre F, et al. Ann Oncol. 2014 Dec;25(12):2295-303

Toward precision medicine with next-generation EGFR inhibitors in non-small-cell lung cancer.
Yap TA, Popat S. Pharmgenomics Pers Med. 2014 Sep 19;7:285-95.

Genomically driven precision medicine to improve outcomes in anaplastic thyroid cancer.
Pinto N, et al. J Oncol. 2014;936285

Translating genomics for precision cancer medicine.
Roychowdhury S, Chinnaiyan AM. Annu Rev Genomics Hum Genet. 2014;15:395-415

The Cancer Genome Atlas: Accomplishments and Future - April 3, 2015

The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge
Tomczak K, et al. Contemp Oncol (Pozn). 2015; 19(1A): A68-A77.

The Cancer Genome Atlas' 4th Annual Scientific Symposium
May 11-12 ~ Bethesda, MD

The Cancer Genome Atlas (TCGA) Data Portal
Portal provides a platform for researchers to search, download, and analyze data sets generated by TCGA

Cancer Genomics Hub: A resource of the National Cancer Institute, from the UCSC Genome Browser

Molecular classification of gastric adenocarcinoma: translating new insights from The Cancer Genome Atlas Research Network.
Sunakawa Y et al. Curr Treat Options Oncol 2015 Apr (4) 331

TCGA data and patient-derived orthotopic xenografts highlight pancreatic cancer-associated angiogenesis.
Gore J et al. Oncotarget 2015 Feb 25.

Radiogenomics of clear cell renal cell carcinoma: preliminary findings of The Cancer Genome Atlas-Renal Cell Carcinoma (TCGA-RCC) Imaging Research Group.
Shinagare AB et al. Abdom Imaging 2015 Mar 10.

Proteomics of colorectal cancer in a genomic context: First large-scale mass spectrometry-based analysis from the Cancer Genome Atlas.
Jimenez CR et al. Clin. Chem. 2015 Feb 26.

End of cancer-genome project prompts rethink
Geneticists debate whether focus should shift from sequencing genomes to analysing function. Heidi Ledford, Nature News and Comments, January 2015

Cancer Genomics: Insights into Driver Mutations - March 10, 2015

Seek and destroy: Relating cancer drivers to therapies
E. Martinez-Ledesma et al. Cell, March 9, 2015

In silico prescription of anticancer drugs to cohorts of 28 tumor types reveals targeting opportunities
C Rubio-Perez et al. Cancer Cell, March 9, 2015

MADGiC: a model-based approach for identifying driver genes in cancer.
Keegan D. Korthauer et al. Bioinformatics, January 2015

Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine.
Benjamin J Raphael et al. Genome Medicine 2014

Novel recurrently mutated genes in African American colon cancers.
Guda K et al. Proc Natl Acad Sci U S A. 2015 Jan 12

Sparse expression bases in cancer reveal tumor drivers.
Logsdon BA, et al. Nucleic Acids Res. 2015 Jan 12

Patient-specific driver gene prediction and risk assessment through integrated network analysis of cancer omics profiles.
Bertrand D, et al. Nucleic Acids Res. 2015 Jan 8

Identification of constrained cancer driver genes based on mutation timing.
Sakoparnig T, et al. PLoS Comput Biol. 2015 Jan 8;11(1):e1004027

CaMoDi: a new method for cancer module discovery.
Manolakos A, et al. BMC Genomics. 2014 Dec 12;15 Suppl 10:S8.

VHL, the story of a tumour suppressor gene.
Gossage L, et al. Nat Rev Cancer. 2014 Dec 23;15(1):55-64

Targeting the MET pathway for potential treatment of NSCLC.
Li A, et al. Expert Opin Ther Targets. 2014 Dec 23:1-12

Deciphering oncogenic drivers: from single genes to integrated pathways.
Chen J, et al. Brief Bioinform. 2014 Nov 5.

Driver and passenger mutations in cancer.
Pon JR, et al. Annu Rev Pathol. 2014 Oct 17

Hereditary Cancer Genetic Testing: Where are We? - December 18, 2014

NCI paper: Prevalence and correlates of receiving and sharing high-penetrance cancer genetic test results: Findings from the Health Information National Trends Survey
Taber J.M. et al. Public Health Genomics, January 2015

Clinical decisions: Screening an asymptomatic person for genetic risk--polling results
Schulte J, et al. N Engl J Med 2014 Nov;371(20):e30

Testing for hereditary breast cancer: Panel or targeted testing? Experience from a clinical cancer genetics practice.
Doherty J, J Genet Couns. 2014 Dec 5

Hereditary colorectal cancer syndromes: American Society of Clinical Oncology clinical practice guideline endorsement of the familial risk-colorectal cancer: European Society for Medical Oncology clinical practice guidelines.
Stoffel EM, et al. J Clin Oncol. 2014 Dec 1

Population testing for cancer predisposing BRCA1/BRCA2 mutations in the Ashkenazi-Jewish community: A randomized controlled trial.
Manchanda R, et al. J Natl Cancer Inst. 2014 Nov 30;107(1)

Cost-effectiveness of population screening for BRCA mutations in Ashkenazi Jewish women compared with family history-based testing.
Manchanda R et al. J Natl Cancer Inst. 2014 Nov 30;107(1). pii: dju380. doi: 10.1093/jnci/dju380. Print 2015 Jan.

Check out our Cancer Genetic Testing  Update Page for additional information and links

Cancer Genomic Tests (October 30, 2014)

Cancer Precision Medicine: Where Are We? - September 18, 2014

NIH announces the launch of 3 integrated precision medicine trials; ALCHEMIST is for patients with certain types of early-stage lung cancer, August 2014

National Cancer Institute's Precision Medicine Initiatives for the New National Clinical Trials Network. Jeffrey Abrams et al. ASCO Annual Meeting 2014

Personalized medicine: Special treatment.
Michael Eisenstein. Nature, September 11, 2014

Why the controversy? Start sequencing tumor genes at diagnosis. Tumor sequencing at the time of diagnosis can give significant insight for successful cancer treatment, by Shelly Gunn, Genetic Engineering & Biotechnology News, Sep 10

National Cancer Institute information: Precision medicine and targeted therapy

Genomics and precision oncology: What's a targeted therapy for cancer? An updated list of approved drugs from the National Cancer Institute (2014)

Therapy: This time it's personal
Gravitz L Nature 509, S52-S54 2014 May 29

Multi-marker solid tumor panels using next-generation sequencing to direct molecularly targeted therapies
Michael Marrone, et al. PLoS Currents Evidence on Genomic Tests 2014 May 27

Impact of cancer genomics on precision medicine for the treatment of cancer, from the National Cancer Institute

Cancer genomics and precision medicine in the 21st century, PowerPoint presentation from the National Human Genome Research Institute

 

28

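Sharing materials from the TCGA annual symposia

Anyone in bioinformatics has probably heard of TCGA, especially those working on cancer research. There have been four annual symposia so far, and the materials shared are mainly slides in PDF format. If you need them, you can click through to each page and download them one by one to study at your own pace, or download them directly using the links below.

There were originally talk videos shared on YouTube, but YouTube is blocked in China, so just read the slides.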

http://www.genome.gov/17516564

The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.

TCGA is a joint effort of the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), which are both part of the National Institutes of Health, U.S. Department of Health and Human Services.

Meetings

The PDF links are as follows:

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Laird.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Durbin.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Ley.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Sartor.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Ciriello.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Imielinski.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Gao.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Carter.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Ng.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Parvin.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Raphael.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Lawrence.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Kreisberg.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Marra.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Helman.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Stuart.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Cooper.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Levine.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Natsoulis.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Haussler.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Erkkila.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Gehlenborg.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Qiao.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Sivachenko.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Sumazin.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Gutman.pdf

http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Mardis.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/01_Shaw.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/02_Chanock.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/03_Staudt.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/05_Creighton.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/06_Stojanov.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/07_Karchin.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/08_Mungall.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/09_Hakimi.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/10_Gao.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/11_Hayes.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/12_Troester.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/13_Knobluach.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/14_Raphael.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/15_Akbani.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/16_Giordano.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/17_Weinstein.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/18_Zheng.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/19_Getz.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/20_VanDneBroek.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/21_Liao.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/22_Khazanov.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/23_Levine.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/24_Miller.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/25_Ewing.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/26_Cirello.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/27_Verhaak.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/28_Hofree.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/29_Meyerson.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/30_Yang.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/31_Wheeler.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/32_Parfenov.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/33_Bernard-Rovira.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/34_Hast.pdf

http://www.genome.gov/Multimedia/Slides/TCGA2/36_Sellars.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/04_Brat.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/05_Mungall.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/06_Boutros.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/07_Zmuda.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/08_Benz.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/09_Zheng.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/11_Creighton.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/12_Aksoy.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/13_Dinh.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/14_Stuart.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/15_Amin.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/16_Gross.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/15_Akbani.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/18_Giordano.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/19_Amin-Mansour.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/20_Oesper.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/21_Gatza.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/22_Bernard.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/23_Sinha.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/24_Akbani.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/25_Watson.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/26_Martignetti.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/27_Bandlamudi.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/28_Fu.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/29_Akdemir.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/30_Bass.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/31_Hakimi.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/32_Wheeler.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/33_Lehmann.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/34_Gordenin.pdf

http://www.genome.gov/Multimedia/Slides/TCGA3/35_Wyczalkowski.pdf

 

http://www.genome.gov/Multimedia/Slides/TCGA4/02_Zenklusen.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/03_Hutter.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/04_Brat.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/05_Mungall.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/06_Linehan.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/07_Brooks.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/08_Wu.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/09_Giger.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/10_Wilkerson.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/11_Orsulic.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/12_Zhong.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/13_Knijnenburg.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/14_Akbani.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/15_Wang.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/16_Poisson.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/17_Alaeimahabadi.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/18_Noushmehr.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/19_Pantazi.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/20_Shih.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/21_Stransky.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/22_Giordano.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/23_Davidsen.pdf

http://www.genome.gov/Multimedia/Slides/TCGA4/24_Gross.pdf
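
The PDF links listed above could be fetched in bulk with a short script. This is a rough sketch only: the helper names and the `tcga_slides` directory are made up for illustration, and the URLs may no longer resolve. Because the same file name can appear under more than one meeting (for example 04_Brat.pdf under both TCGA3 and TCGA4), the meeting folder is prefixed to the local file name.

```python
import os
from urllib.parse import urlsplit
from urllib.request import urlretrieve


def local_name(url):
    """Build a collision-free file name from the meeting folder and file name."""
    parts = urlsplit(url).path.strip("/").split("/")
    meeting, filename = parts[-2], parts[-1]
    if filename.startswith(meeting):  # e.g. TCGA1/TCGA1_Laird.pdf
        return filename
    return f"{meeting}_{filename}"    # e.g. TCGA3/04_Brat.pdf -> TCGA3_04_Brat.pdf


def download_all(urls, dest_dir="tcga_slides"):
    """Download every URL into dest_dir, skipping files already present."""
    os.makedirs(dest_dir, exist_ok=True)
    for url in urls:
        target = os.path.join(dest_dir, local_name(url))
        if not os.path.exists(target):
            urlretrieve(url, target)


urls = [
    "http://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Laird.pdf",
    "http://www.genome.gov/Multimedia/Slides/TCGA3/04_Brat.pdf",
    "http://www.genome.gov/Multimedia/Slides/TCGA4/04_Brat.pdf",
]
names = [local_name(u) for u in urls]
print(names)
# download_all(urls)  # uncomment to actually fetch the files
```

Skipping files that already exist lets an interrupted run be restarted without downloading everything again.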