A recent software tool for finding cancer driver genes:
Cancer is thought to result from the accumulation of causal somatic mutations throughout the lifetime of an individual.
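These cancer-driving mutations mainly affect three classes of genes: (1) oncogenes, (2) tumor-suppressor genes, and (3) stability genes.
The first mutations initiate tumorigenesis, and subsequent mutations drive tumor progression.
Identifying these mutations is very useful for understanding gene function and for designing drug targets.
Distinguishing driver genes from passenger genes makes better use of the massive amount of mutation data available from the various databases.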
Frequency-based methods for making this distinction rely on an estimate of a background mutation rate, which represents the rate of random passenger mutations.
This is the approach proposed by Ding et al. (2008), but it ignores the following four factors:
1. mutation type (transition versus transversion)
2. nucleotide context (which base is at the mutation site)
3. dinucleotide context (which bases are located at neighboring sites to the mutation)
4. expression level of the gene
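The frequency-based idea described above can be illustrated with a small sketch. It is not the procedure of Ding et al. (2008) or of any specific tool; it simply assumes that every base of a gene in every sample mutates independently at one uniform background rate, and asks how surprising the observed mutation count is under a binomial model. The function names and the example numbers (gene length, sample count, rate) are hypothetical. Because the rate is a single constant, the sketch ignores exactly the four factors listed above.

```python
def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), assuming 0 <= p < 1.

    Terms are built with a recurrence, which is adequate while n * p is
    small (the usual case for background mutation rates).
    """
    if k <= 0:
        return 1.0
    term = (1.0 - p) ** n          # P(X = 0)
    cdf = 0.0
    for i in range(k):             # accumulate P(X = 0) .. P(X = k - 1)
        cdf += term
        term *= (n - i) / (i + 1) * p / (1.0 - p)
    return max(0.0, 1.0 - cdf)


def gene_p_value(n_mutations, gene_length_bp, n_samples, background_rate):
    """P-value for seeing at least n_mutations under a uniform background rate.

    Assumption: one constant rate per base per sample for all genes.
    """
    n_sites = gene_length_bp * n_samples
    return binomial_upper_tail(n_mutations, n_sites, background_rate)


if __name__ == "__main__":
    # Hypothetical gene: 1,500 bp, 100 tumor samples, rate of 1e-6 per base.
    for observed in (1, 3, 8):
        print(observed, gene_p_value(observed, 1500, 100, 1e-6))
```

A small p-value marks the gene as mutated more often than the assumed background would explain, which is the signal that frequency-based methods treat as evidence for a driver.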
Sjöblom et al. (2006) account for nucleotide and dinucleotide context in searching for drivers of breast and colorectal cancer.
MuSiC (Dees et al., 2012) accounts for mutation type and allows for sample-specific mutation rates.
MutSigCV (Lawrence et al., 2013) also allows for the inclusion of gene-specific factors such as expression level and replication timing.
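To make "accounting for mutation type" concrete, here is a minimal sketch that classifies a substitution as a transition or a transversion and computes an expected background count when the two types have different rates. It is not how MuSiC or MutSigCV are implemented; the function names, the `rate_by_type` dictionary, and the rates in the example are hypothetical, and each rate is assumed to be per specific substitution, per site, per sample.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def mutation_type(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def expected_background_count(sequence, n_samples, rate_by_type):
    """Expected number of background substitutions in a gene.

    Every A/C/G/T site has one possible transition and two possible
    transversions, so its total rate is ts + 2 * tv (assumed rates).
    """
    n_sites = sum(1 for base in sequence.upper() if base in "ACGT")
    per_site = rate_by_type["transition"] + 2 * rate_by_type["transversion"]
    return n_samples * n_sites * per_site


if __name__ == "__main__":
    print(mutation_type("A", "G"), mutation_type("A", "C"))
    rates = {"transition": 2e-6, "transversion": 1e-6}  # hypothetical values
    print(expected_background_count("ACGTACGT", 100, rates))
```

Extending the same idea to nucleotide or dinucleotide context would mean indexing the rate by the base at the site and its neighbors rather than by substitution type alone.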
In fact, criteria other than mutation frequency are also important, which is why there are three tools: SIFT (first reported by Ng and Henikoff (2001), later updated by Kumar et al. (2009)), PolyPhen (Adzhubei et al., 2010), and MutationAssessor (Reva et al., 2011).
These tools integrate sequence context, position, and protein characteristics to assess a mutation’s functional impact.
To summarize, the criteria for identifying cancer driver genes include:
3. gene-specific features such as replication timing and expression level that are known to affect background rates of mutation,
4. mutation-specific scores that assess functional impact, and the spatial patterning of mutations that only becomes apparent when thousands of samples are considered.
The paper therefore proposes a unified empirical Bayesian Model-based Approach for identifying Driver Genes in Cancer (MADGiC) that utilizes each of these features.
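The general shape of a Bayesian driver/passenger decision can be sketched as a two-component mixture: each gene is either a passenger or a driver, and Bayes' rule turns a prior probability and two likelihoods into a posterior probability of being a driver. The sketch below is only an illustration of that idea and is not MADGiC's actual model. The Poisson likelihoods, the function names, and all numbers are assumptions, and the prior is passed in as a plain argument, whereas an empirical Bayes method would estimate it from data.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def posterior_driver_probability(n_mutations, lam_passenger, lam_driver, prior_driver):
    """Posterior probability that a gene is a driver, by Bayes' rule.

    Assumptions: the mutation count is Poisson with mean lam_passenger for
    passengers and lam_driver for drivers; prior_driver is supplied directly.
    """
    like_driver = poisson_pmf(n_mutations, lam_driver)
    like_passenger = poisson_pmf(n_mutations, lam_passenger)
    numerator = prior_driver * like_driver
    return numerator / (numerator + (1.0 - prior_driver) * like_passenger)


if __name__ == "__main__":
    # Hypothetical gene: 0.15 expected background mutations, drivers assumed
    # to carry ten times more, and a 1% prior probability of being a driver.
    for observed in (0, 2, 8):
        print(observed, posterior_driver_probability(observed, 0.15, 1.5, 0.01))
```

In a fuller model, the passenger likelihood would depend on the features listed above (mutation type, context, replication timing, expression level), and the driver likelihood would also reward high functional impact scores and clustered mutation positions.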