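Applications of TCGA expression data, part 2: boxplots of a given gene across cancer types, or its expression across all normal tissues!

It seems there is no length limit on post titles, great! The goal of this post is very simple. As the title says: draw boxplots of a given gene across different cancer types, or look at its expression across all normal tissues. Here is a concrete example: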

[Figure 1]

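Anyone who knows a little R can see that you only need to change the gene and the cancer types by hand to draw the figure above. To get this far, though, you first need to understand the previous step, importing the data into a MySQL database:

Applications of TCGA expression data, part 1: downloading the data and importing it into MySQL

The code is as follows: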

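rm(list=ls())   # start from a clean workspace
library(RMySQL)
# The gene and the two cancer-type tables to compare; edit these by hand
searchGene <- 'VCX3B'
searchTable1 <- 'tumor_gbm_rpkm'
searchTable2 <- 'tumor_lgg_rpkm'
# Select the gse62944 database at connection time instead of sending a separate USE statement
con <- dbConnect(MySQL(), host="127.0.0.1", port=3306, user="root", password="11111111", dbname="gse62944")
dbListTables(con)
# Fetch the row for the gene from each table; type='sh' forces single quotes on every platform
query <- paste0('select * from ', searchTable1, ' where genesymbol = ', shQuote(searchGene, type='sh'))
gbm <- dbGetQuery(con, query)
query <- paste0('select * from ', searchTable2, ' where genesymbol = ', shQuote(searchGene, type='sh'))
lgg <- dbGetQuery(con, query)
# Drop the first column (genesymbol) and keep one numeric value per sample
gbm <- data.frame(value=as.numeric(gbm[,-1]), type='gbm')
lgg <- data.frame(value=as.numeric(lgg[,-1]), type='lgg')
dat1 <- rbind(gbm, lgg)
# Boxplot per cancer type, with the individual samples overlaid as jittered points
boxplot(value ~ type, data = dat1, lwd = 2, ylab = 'value')
stripchart(value ~ type, vertical = TRUE, data = dat1,
           method = "jitter", add = TRUE, pch = 20, col = 'blue')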
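The two-table code above repeats the same query and reshape steps, so it can be generalized to any number of cancer types. The following is only a minimal sketch under stated assumptions: `buildQuery` and `rowToLong` are hypothetical helper names that do not appear in the original code, the `tumor_<type>_rpkm` naming is inferred from the two tables used above, and the rows in `fakeResults` are made-up values standing in for what `dbGetQuery()` would return, so the example runs without a database.

```r
# Hypothetical helper: build the per-table query used above.
buildQuery <- function(table, gene) {
  # Table names cannot be quoted like values, so accept only simple identifiers.
  stopifnot(grepl("^[A-Za-z0-9_]+$", table))
  paste0("select * from ", table, " where genesymbol = ", shQuote(gene, type = "sh"))
}

# Hypothetical helper: one-row wide result -> long data frame for boxplot().
rowToLong <- function(row, type) {
  stopifnot(nrow(row) == 1)                 # expect exactly one matching gene
  values <- as.numeric(unlist(row[1, -1]))  # drop the leading genesymbol column
  data.frame(value = values, type = type)
}

# Made-up rows standing in for dbGetQuery(con, buildQuery(table, gene)).
fakeResults <- list(
  gbm = data.frame(genesymbol = "VCX3B", S1 = 0.5, S2 = 1.2, S3 = 0.0),
  lgg = data.frame(genesymbol = "VCX3B", S4 = 2.1, S5 = 0.3)
)

# One query string per cancer type (table naming is an assumption).
queries <- vapply(paste0("tumor_", names(fakeResults), "_rpkm"), buildQuery,
                  character(1), gene = "VCX3B")

# Stack all cancer types into one long data frame.
dat <- do.call(rbind, Map(rowToLong, fakeResults, names(fakeResults)))
print(table(dat$type))
```

With a live connection, each element of `fakeResults` would instead come from `dbGetQuery(con, queries[[i]])`, and `dat` can be passed to `boxplot(value ~ type, data = dat)` and `stripchart()` exactly as before.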
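There are many other applications; the key point is simply how to pull the data out of SQL and visualize it.
[Figure 2]
For example, the figure above comes from querying the normal expression matrix and plotting the tumor-adjacent tissues of many cancer types together: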
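# Query the same gene in the table of normal samples (reuses con and searchGene from above)
sqlTable <- 'normalrpkm'
sqlQuery <- paste0('select * from ', sqlTable, ' where genesymbol = ', shQuote(searchGene, type='sh'))
normalExpression <- dbGetQuery(con, sqlQuery)
# Drop the last column before reshaping; what remains is treated as one value per sample
normalExpression <- normalExpression[,-length(normalExpression)]
# Wide (one column per sample) to long (one row per sample)
normalExpression <- data.frame(sampleID=names(normalExpression),
                               values=as.numeric(normalExpression)
                               )
# Sample-to-cancer-type annotation table
normalCancerType2amples <- dbGetQuery(con, 'select * from normalcancertype2amples')
# The expression columns use "." where the annotation IDs use "-", so make them match
normalCancerType2amples$sampleID <- gsub("-", ".", normalCancerType2amples$sampleID)
dat2 <- merge(normalExpression, normalCancerType2amples, by='sampleID')
# One box per cancer type; las=2 turns the axis labels perpendicular to the axis
boxplot(values ~ CancerType, data = dat2, lwd = 2, ylab = 'values', las=2, main=searchGene)
stripchart(values ~ CancerType, vertical = TRUE, data = dat2,
           method = "jitter", add = TRUE, pch = 20, col = 'blue')
dbDisconnect(con)   # close the connection when done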
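
The `gsub()` step before `merge()` is easy to overlook, so here is a small sketch of why it matters. It assumes, as the code above suggests, that the expression table's column names use dots while the annotation table's sample IDs use hyphens. The sample IDs, cancer types, and values below are made up for illustration and are not taken from the real tables.

```r
# Made-up long-format expression values; IDs use "." as in the column names.
expr <- data.frame(
  sampleID = c("TCGA.AA.0001", "TCGA.BB.0002", "TCGA.CC.0003"),
  values   = c(1.5, 0.2, 3.1)
)

# Made-up annotation table; IDs use "-" as in the original sample barcodes.
annot <- data.frame(
  sampleID   = c("TCGA-AA-0001", "TCGA-BB-0002", "TCGA-CC-0003"),
  CancerType = c("BRCA", "BRCA", "LUAD")
)

# Without normalizing the IDs, no key matches and merge() returns zero rows.
unmatched <- merge(expr, annot, by = "sampleID")

# After replacing "-" with ".", every sample finds its cancer type.
annot$sampleID <- gsub("-", ".", annot$sampleID, fixed = TRUE)
matched <- merge(expr, annot, by = "sampleID")

print(nrow(unmatched))  # 0
print(nrow(matched))    # 3
```

Because `merge()` silently drops rows without a partner, comparing `nrow()` before and after the merge is a cheap way to notice an ID mismatch before plotting.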
