<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>生信菜鸟团 &#187; tutorial</title>
	<atom:link href="http://www.bio-info-trainee.com/category/tutorial/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bio-info-trainee.com</link>
	<description>You are welcome to join the discussion on the forum at biotrainee.com, or to follow the WeChat official account of the same name, biotrainee.</description>
	<lastBuildDate>Sat, 28 Jun 2025 14:30:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.33</generator>
	<item>
		<title>Blogging along with jimmy</title>
		<link>http://www.bio-info-trainee.com/4474.html</link>
		<comments>http://www.bio-info-trainee.com/4474.html#comments</comments>
		<pubDate>Thu, 11 Jul 2019 08:51:20 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=4474</guid>
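		<description><![CDATA[While tidying up my 生信菜鸟团 blog recently, I noticed that Alibaba Cloud (阿里云) is running a promotion again. It is not two years free this time, but &#8230; <a href="http://www.bio-info-trainee.com/4474.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>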
				<content:encoded><![CDATA[<div class="markdown-here-wrapper" data-md-url="http://www.bio-info-trainee.com/wp-admin/post.php?post=4474&amp;action=edit&amp;message=6">
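<p style="margin: 0px 0px 1.2em !important;">While tidying up my 生信菜鸟团 blog recently, I noticed that Alibaba Cloud (阿里云) is running a promotion again. It is not two years free this time, but it comes close: three years for just over 500 yuan!<span id="more-4474"></span><br />
<a href="https://wanwang.aliyun.com/hosting?spm=5176.200021.297964.9.3e7d4e358nFA7U">https://wanwang.aliyun.com/hosting?spm=5176.200021.297964.9.3e7d4e358nFA7U</a><br />
<img src="http://www.bio-info-trainee.com/wp-content/uploads/2019/07/image-20190711145501756.png" alt="image-20190711145501756" /><br />
For a personal blog this configuration is more than enough. It is also hosted in mainland China, and if you buy your domain from Alibaba Cloud as well, everything is very convenient to manage.<br />
Note that this is shared virtual hosting, so there is no SSH login. If you need a real cloud server, buy an Alibaba Cloud ECS instance, which also has student offers: log in to the official Alibaba Cloud website, choose ECS (云服务器ECS) under "Products and Services" (“产品与服务”), and click "Buy Now" (立即购买). With luck you will catch one of Alibaba's promotions; there are student machines and new-user trial machines at 199 yuan per year.<br />
Of course, because the host is in mainland China, the site has to go through ICP filing (备案). Follow the law!</p>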
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">First, buy hosting and a domain</h3>
<p style="margin: 0px 0px 1.2em !important;">Choose the hosting plan that fits your budget.<br />
The domain then needs to be resolved (via DNS) to your hosting.</p>
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Then complete the ICP filing</h3>
<p style="margin: 0px 0px 1.2em !important;">Follow the law; just work through the prompts step by step.</p>
<h3 id="-wordpress-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Finally, set up the WordPress blog</h3>
<p style="margin: 0px 0px 1.2em !important;">The tutorials others have already written make this easy to learn:<br />
<a href="https://www.alibabacloud.com/zh/getting-started/projects/set-up-wordpress-with-one-click-solution">一键部署Wordpress 网站－ 阿里云新手学堂 - Alibaba Cloud</a><br />
<a href="https://yq.aliyun.com/articles/221634">阿里云ECS服务器搭建wordpress个人博客网站【详细图文教程】-云栖 …</a><br />
<a href="https://www.zhihu.com/question/36495153">如何在阿里云服务器上搭建wordpress博客？ - 知乎</a></p>
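<h3 id="-github-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">You can also use a free GitHub blog</h3>
<p style="margin: 0px 0px 1.2em !important;">Honestly, I may get flamed for saying this, since it amounts to freeloading. Not everyone will choose this route, though, because free GitHub blogs come with various restrictions and are usually slow to access. If you are interested, it takes about three minutes; see the tutorial: <a href="https://zhuanlan.zhihu.com/p/28321740">https://zhuanlan.zhihu.com/p/28321740</a><br />
Working with GitHub through the desktop client comes down to modify -&gt; commit -&gt; publish -&gt; view<br />
That is: make your changes -&gt; commit a version -&gt; publish to the cloud -&gt; view the result on the website.</p>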
<ul style="margin: 1.2em 0px; padding-left: 2em;">
<li style="margin: 0.5em 0px;">Download the client: <a href="https://desktop.github.com/">https://desktop.github.com/</a></li>
<li style="margin: 0.5em 0px;">HTML templates: <a href="https://html5up.net/">https://html5up.net/</a></li>
</ul>
<h3 id="-markdown-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Write in markdown if you can</h3>
<p>Sharing notes generally calls for markdown syntax. People who are unfamiliar with it may feel intimidated, but once you spend 15 minutes learning it you will fall in love with writing. Trust me.<br />
The second session of my tips series covers markdown: <a href="https://www.bilibili.com/video/av25131640">https://www.bilibili.com/video/av25131640</a><br />
To learn markdown, start with the basic syntax: <a href="https://mp.weixin.qq.com/s/hZj91FWIaw4cS39_jgjl7g">https://mp.weixin.qq.com/s/hZj91FWIaw4cS39_jgjl7g</a><br />
For an editor, I recommend typora: <a href="https://vip.biotrainee.com/d/82-typora-markdown/10">https://vip.biotrainee.com/d/82-typora-markdown/10</a><br />
And finally, on note-taking: <a href="https://vip.biotrainee.com/d/268">https://vip.biotrainee.com/d/268</a><br />
Here is what markdown looks like: <a href="https://raw.githubusercontent.com/jmzeng1314/my_WGCNA/master/readme.md">https://raw.githubusercontent.com/jmzeng1314/my_WGCNA/master/readme.md</a><br />
<a href="https://github.com/jmzeng1314/my_WGCNA/blob/master/readme.md">https://github.com/jmzeng1314/my_WGCNA/blob/master/readme.md</a><br />
Files like the one above are markdown: the first link shows the raw source, the second shows how GitHub renders it.<br />
There is also rmarkdown, which is even more convenient; see my free videos on Tencent Classroom (腾讯课堂): <a href="https://ke.qq.com/course/274681?tuin=4926c730">https://ke.qq.com/course/274681?tuin=4926c730</a></p>
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Your first task once the blog is up</h3>
<p>Copy the complete set of exercises from my 生信技能树 weekend class into your own blog, making sure to credit the source, and post your own answers there once you have finished the assignments.</p>
<h4 id="r-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.2em;">R exercises</h4>
<p>Beginner, 10 problems; try to understand and complete them with the help of the reference code: <a href="http://www.bio-info-trainee.com/3793.html">http://www.bio-info-trainee.com/3793.html</a><br />
Intermediate requirements: <a href="http://www.bio-info-trainee.com/3750.html">http://www.bio-info-trainee.com/3750.html</a><br />
Advanced, complete the 20 problems: <a href="http://www.bio-info-trainee.com/3415.html">http://www.bio-info-trainee.com/3415.html</a><br />
Statistics topic, 30 problems: <a href="http://www.bio-info-trainee.com/4385.html">http://www.bio-info-trainee.com/4385.html</a><br />
Visualization topic, 30 problems: <a href="http://www.bio-info-trainee.com/4387.html">http://www.bio-info-trainee.com/4387.html</a></p>
<h4 id="linux-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.2em;">LINUX exercises</h4>
<p>The minimum requirement is to complete my 20 linux problems: <a href="http://www.bio-info-trainee.com/2900.html">http://www.bio-info-trainee.com/2900.html</a><br />
Next, complete the exercises on bioinformatics data formats (blast/blat/fa-fq/sam-bam/vcf/bed/gtf-gff) and collect the specification documents for these formats.<br />
Shell exercises on fasta and fastq files: <a href="http://www.bio-info-trainee.com/3575.html">http://www.bio-info-trainee.com/3575.html</a><br />
Shell exercises on sam and bam files: <a href="http://www.bio-info-trainee.com/3578.html">http://www.bio-info-trainee.com/3578.html</a><br />
Shell exercises on VCF files: <a href="http://www.bio-info-trainee.com/3577.html">http://www.bio-info-trainee.com/3577.html</a></p>
<h4 id="rna-seq-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.2em;">RNA-seq: just a short quiz</h4>
<p><a href="http://www.bio-info-trainee.com/3920.html">http://www.bio-info-trainee.com/3920.html</a></p>
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">The second task</h3>
<p>Flesh out the details of the materials from my apprentice training program. The code is at:<br />
<a href="https://www.jianshu.com/p/49d035b121b8">https://www.jianshu.com/p/49d035b121b8</a> WES data analysis pipeline for apprentices<br />
<a href="https://www.jianshu.com/p/a84cd44bac67">https://www.jianshu.com/p/a84cd44bac67</a> RNA hands-on video walkthrough from the author of 10000+ original bioinformatics tutorials<br />
<a href="https://www.jianshu.com/p/5bce43a537fd">https://www.jianshu.com/p/5bce43a537fd</a> ATAC-seq hands-on data analysis for apprentices<br />
<a href="https://mp.weixin.qq.com/s/a4qAcKE1DoukpLVV_ybobA">https://mp.weixin.qq.com/s/a4qAcKE1DoukpLVV_ybobA</a> ChIP-seq data processing pipeline for apprentices<br />
The lecture notes are at:<br />
Apprentice week 1, document link: <a href="https://mubu.com/doc/38tEycfrQg">https://mubu.com/doc/38tEycfrQg</a> password: vl3q<br />
Apprentice week 2, document link: <a href="https://mubu.com/doc/38y7pmgzLg">https://mubu.com/doc/38y7pmgzLg</a> password: p6fo<br />
Apprentice week 3, document link: <a href="https://mubu.com/doc/1iDucLlG5g">https://mubu.com/doc/1iDucLlG5g</a> password: 7uch<br />
Apprentice week 4, document link: <a href="https://mubu.com/doc/11taEb9ZYg">https://mubu.com/doc/11taEb9ZYg</a> password: wk29<br />
The course covers:<br />
<img src="http://www.bio-info-trainee.com/wp-content/uploads/2019/07/image-20190711150442295.png" alt="image-20190711150442295" /></p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/4474.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Power skill: batch-extracting tables from PDFs automatically</title>
		<link>http://www.bio-info-trainee.com/4466.html</link>
		<comments>http://www.bio-info-trainee.com/4466.html#comments</comments>
		<pubDate>Thu, 27 Jun 2019 07:55:17 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=4466</guid>
		<description><![CDATA[I recently gave my apprentices an assignment: reproduce a figure from a paper, shown below. It is simple: take the reference &#8230; <a href="http://www.bio-info-trainee.com/4466.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<div class="markdown-here-wrapper" data-md-url="http://www.bio-info-trainee.com/wp-admin/post-new.php">
<p style="margin: 0px 0px 1.2em !important;">I recently gave my apprentices an assignment: reproduce a figure from a paper, shown below:</p>
<p style="margin: 0px 0px 1.2em !important;"><img src="http://www.bio-info-trainee.com/wp-content/uploads/2019/06/20190622073150.png" alt="image-20190627153442135" /><span id="more-4466"></span></p>
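<p style="margin: 0px 0px 1.2em !important;">It is simple: take the 28 immune gene sets from the reference, run ssGSEA on the expression matrix downloaded from GEO, and present the result as a heatmap. The harder part is probably understanding those 28 immune gene sets and obtaining the gene list behind each one. I had assumed the source was</p>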
<ul style="margin: 1.2em 0px; padding-left: 2em;">
<li style="margin: 0.5em 0px;"><strong>the 2013 Immunity paper</strong>: Spatiotemporal Dynamics of Intratumoral Immune Cells Reveal the Immune Landscape in Human Cancer</li>
<li style="margin: 0.5em 0px;">but it turned out to be <strong>the 2017 Cell Reports paper</strong>: Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade</li>
</ul>
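<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">An outstanding apprentice</h3>
<p style="margin: 0px 0px 1.2em !important;">When I received the code one apprentice submitted, though, I was <strong>pleasantly surprised</strong>: she had read the gene sets automatically from the paper's supplementary PDF with the R package pdftools and formatted them into a gene set list for the downstream ssGSEA analysis. The code is ugly, but it gets the job done. The PDF looks like this:</p>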
<p style="margin: 0px 0px 1.2em !important;"><img src="http://www.bio-info-trainee.com/wp-content/uploads/2019/06/image-20190627153442135.png" alt="image-20190627153442135" /></p>
<p style="margin: 0px 0px 1.2em !important;">As you can see, pages 20 to 36 hold the gene set information.</p>
<p style="margin: 0px 0px 1.2em !important;">The code that reads the PDF and extracts the information is as follows:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-r" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;">rm(list=ls())
<span class="hljs-keyword" style="color: #333333; font-weight: bold;">library</span>(pdftools)
options(stringsAsFactors = <span class="hljs-literal">F</span>)
b &lt;- pdf_text(<span class="hljs-string" style="color: #dd1144;">'SupplementaryTables.pdf'</span>)
geneset_substract&lt;- <span class="hljs-keyword" style="color: #333333; font-weight: bold;">function</span>(tmp){split_to_line&lt;- gsub(<span class="hljs-string" style="color: #dd1144;">'\r'</span>,<span class="hljs-string" style="color: #dd1144;">''</span>,strsplit(tmp,split = <span class="hljs-string" style="color: #dd1144;">'\n'</span>)[[<span class="hljs-number" style="color: #008080;">1</span>]])
 gene_name&lt;- apply(data.frame(split_to_line),<span class="hljs-number" style="color: #008080;">1</span>,<span class="hljs-keyword" style="color: #333333; font-weight: bold;">function</span>(x){ line&lt;- strsplit(x,split=<span class="hljs-string" style="color: #dd1144;">' '</span>)[[<span class="hljs-number" style="color: #008080;">1</span>]]
 pos&lt;- grep(<span class="hljs-string" style="color: #dd1144;">'[A-Za-z\\d+]|\\d+'</span>,line)
 res &lt;- line[pos[<span class="hljs-number" style="color: #008080;">1</span>]]})
 cell_type&lt;- apply(data.frame(split_to_line),<span class="hljs-number" style="color: #008080;">1</span>,<span class="hljs-keyword" style="color: #333333; font-weight: bold;">function</span>(x){ line&lt;- strsplit(x,split=<span class="hljs-string" style="color: #dd1144;">' '</span>)[[<span class="hljs-number" style="color: #008080;">1</span>]]
 pos&lt;- grep(<span class="hljs-string" style="color: #dd1144;">'[A-Za-z\\d+]|\\d+'</span>,line)
 res &lt;- line[pos]
 res &lt;- res[c(-<span class="hljs-number" style="color: #008080;">1</span>,-length(res))]
 s &lt;- <span class="hljs-string" style="color: #dd1144;">''</span>
 <span class="hljs-keyword" style="color: #333333; font-weight: bold;">for</span> (i <span class="hljs-keyword" style="color: #333333; font-weight: bold;">in</span> <span class="hljs-number" style="color: #008080;">1</span>:length(res)){
 s&lt;- paste(s,res[i])}
 <span class="hljs-keyword" style="color: #333333; font-weight: bold;">return</span>(s)})
 result&lt;- data.frame(gene_name,cell_type)
 <span class="hljs-keyword" style="color: #333333; font-weight: bold;">return</span>(result)
 }
gene_set &lt;- data.frame()
<span class="hljs-keyword" style="color: #333333; font-weight: bold;">for</span>(i <span class="hljs-keyword" style="color: #333333; font-weight: bold;">in</span> <span class="hljs-number" style="color: #008080;">20</span>:<span class="hljs-number" style="color: #008080;">36</span>){
 gene_set&lt;- rbind(gene_set,geneset_substract(b[i]))
}
gene_set &lt;- gene_set[c(-<span class="hljs-number" style="color: #008080;">1</span>,-<span class="hljs-number" style="color: #008080;">2</span>),]
list &lt;- list()
<span class="hljs-keyword" style="color: #333333; font-weight: bold;">for</span>(i <span class="hljs-keyword" style="color: #333333; font-weight: bold;">in</span> <span class="hljs-number" style="color: #008080;">1</span>:length(unique(gene_set$cell_type))){
 list[[i]] &lt;- gene_set$gene_name[gene_set$cell_type== (unique(gene_set$cell_type)[i])]
}
names(list)&lt;- unique(gene_set$cell_type)
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">The resulting gene set list looks like this:</p>
<p style="margin: 0px 0px 1.2em !important;"><img src="http://www.bio-info-trainee.com/wp-content/uploads/2019/06/image-20190627153547654.png" alt="image-20190627153547654" /></p>
<p style="margin: 0px 0px 1.2em !important;">For the downstream ssGSEA analysis and heatmap visualization, see the Monday data-mining series on 生信菜鸟团; I will leave that as a cliffhanger here!</p>
<h3 id="apply-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Put the apply family of functions to good use</h3>
<p style="margin: 0px 0px 1.2em !important;">Still, I found the apprentice's code too ugly, so I reworked it:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-r" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;">rm(list=ls())
<span class="hljs-keyword" style="color: #333333; font-weight: bold;">library</span>(pdftools)
options(stringsAsFactors = <span class="hljs-literal">FALSE</span>)

b &lt;- pdf_text(<span class="hljs-string" style="color: #dd1144;">'SupplementaryTables.pdf'</span>)

# Split pages 20 to 36 into lines and pool them into one character vector
tmp = unlist(lapply(<span class="hljs-number" style="color: #008080;">20</span>:<span class="hljs-number" style="color: #008080;">36</span>, <span class="hljs-keyword" style="color: #333333; font-weight: bold;">function</span>(i){
 trimws(strsplit(b[[i]],split = <span class="hljs-string" style="color: #dd1144;">'\n'</span>)[[<span class="hljs-number" style="color: #008080;">1</span>]])
}))
# Drop the first two lines (presumably the table header)
tmp=tmp[-c(<span class="hljs-number" style="color: #008080;">1</span>,<span class="hljs-number" style="color: #008080;">2</span>)]
<span class="hljs-keyword" style="color: #333333; font-weight: bold;">library</span>(stringr)
# Per line: drop the last field, keep the first field as the gene name,
# and paste the remaining fields back together as the cell type
tmp=do.call(rbind,lapply(str_split(tmp,<span class="hljs-string" style="color: #dd1144;">' '</span>), <span class="hljs-keyword" style="color: #333333; font-weight: bold;">function</span>(x){
 x=x[-length(x)]
 gene_name&lt;- x[<span class="hljs-number" style="color: #008080;">1</span>]
 cell_type&lt;- trimws(paste(x[-<span class="hljs-number" style="color: #008080;">1</span>],collapse = <span class="hljs-string" style="color: #dd1144;">' '</span>))
 <span class="hljs-keyword" style="color: #333333; font-weight: bold;">return</span>(c(gene_name,cell_type))
}))
# Group the gene names (column 1) by cell type (column 2) into a named list
immune_list &lt;- split(tmp[,<span class="hljs-number" style="color: #008080;">1</span>],tmp[,<span class="hljs-number" style="color: #008080;">2</span>])
</code></pre>
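<p style="margin: 0px 0px 1.2em !important;">To see what the parse-then-<code>split()</code> pattern above does without needing the PDF, here is a minimal sketch on a few made-up lines. The gene names, cell types, and trailing column in <code>toy_lines</code> are hypothetical stand-ins for what <code>pdf_text()</code> might return, and the sketch splits on runs of whitespace as an assumption about the layout; the real supplementary table may look different.</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-r" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"># Hypothetical lines imitating one page of extracted text:
# gene name, cell type (may contain spaces), and a trailing column to drop.
toy_lines &lt;- c(
  "GENE1    Toy B cell        ref1",
  "GENE2    Toy B cell        ref1",
  "GENE3    Toy CD8 T cell    ref2",
  "GENE4    Toy CD8 T cell    ref2"
)

# Split on runs of whitespace so repeated spaces do not create empty tokens.
tokens &lt;- strsplit(trimws(toy_lines), "\\s+")

# Per line: drop the last token, keep the first as the gene name,
# and paste everything in between back into the cell-type name.
parsed &lt;- do.call(rbind, lapply(tokens, function(x) {
  x &lt;- x[-length(x)]
  c(gene_name = x[1], cell_type = paste(x[-1], collapse = " "))
}))

# split() groups the gene names by cell type, giving a named list.
toy_list &lt;- split(parsed[, "gene_name"], parsed[, "cell_type"])
str(toy_list)
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">A named list of character vectors like <code>toy_list</code> has the same shape as <code>immune_list</code>: one element per cell type, holding that cell type's genes.</p>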
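<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Afterword</h3>
<p style="margin: 0px 0px 1.2em !important;">I believe this trick is useful in many situations, not just in bioinformatics. If you learn it from our tutorial and put it to use, feel free to email us and share your experience.</p>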
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/4466.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Want to do single-cell data analysis? Complete this R test problem</title>
		<link>http://www.bio-info-trainee.com/4458.html</link>
		<comments>http://www.bio-info-trainee.com/4458.html#comments</comments>
		<pubDate>Sat, 15 Jun 2019 13:18:18 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=4458</guid>
		<description><![CDATA[Open RStudio and run the code below: set.seed(0.12345) n &#8230; <a href="http://www.bio-info-trainee.com/4458.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<div class="markdown-here-wrapper" data-md-url="http://www.bio-info-trainee.com/wp-admin/post-new.php">
<p style="margin: 0px 0px 1.2em !important;">Open RStudio and run the code below:<span id="more-4458"></span></p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-r" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;">set.seed(<span class="hljs-number" style="color: #008080;">0.12345</span>)
n=<span class="hljs-number" style="color: #008080;">26</span>
df=data.frame(LETTERS,rnorm(n),rnorm(n),
 rnorm(n),rnorm(n),rnorm(n))
a=lapply(<span class="hljs-number" style="color: #008080;">2</span>:ncol(df), <span class="hljs-keyword" style="color: #333333; font-weight: bold;">function</span>(i){
 x=df[,c(<span class="hljs-number" style="color: #008080;">1</span>,i)]
 x=x[x[,<span class="hljs-number" style="color: #008080;">2</span>]&gt;<span class="hljs-number" style="color: #008080;">0</span>,]
 <span class="hljs-keyword" style="color: #333333; font-weight: bold;">return</span>(x)
})
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">The code starts from a data frame like this:</p>
<p style="margin: 0px 0px 1.2em !important;"><a href="http://www.bio-info-trainee.com/wp-content/uploads/2019/06/11431560561839_.pic_hd.png"><img class="alignnone size-full wp-image-4459" src="http://www.bio-info-trainee.com/wp-content/uploads/2019/06/11431560561839_.pic_hd.png" alt="11431560561839_-pic_hd" width="1434" height="790" /></a></p>
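<p style="margin: 0px 0px 1.2em !important;">The lapply loop then turns it into a list whose elements have different numbers of rows. The task is to restore this ragged list to a data frame or matrix, roughly like this:</p>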
<p style="margin: 0px 0px 1.2em !important;"><a href="http://www.bio-info-trainee.com/wp-content/uploads/2019/06/11441560561942_.pic_hd.png"><img class="alignnone size-full wp-image-4460" src="http://www.bio-info-trainee.com/wp-content/uploads/2019/06/11441560561942_.pic_hd.png" alt="11441560561942_-pic_hd" width="1256" height="686" /></a></p>
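<p style="margin: 0px 0px 1.2em !important;">Values that were filtered out because they were not greater than 0 should be filled in as 0 when they are restored to the data frame.</p>
<p style="margin: 0px 0px 1.2em !important;">If you can solve this on your own, congratulations: you save the 3199 training fee and can go straight to our offline single-cell training course.</p>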
<p style="margin: 0px 0px 1.2em !important;"><a href="https://mp.weixin.qq.com/s/ePD7XyvDeYdKkuSMI65cbg" target="_blank">https://mp.weixin.qq.com/s/ePD7XyvDeYdKkuSMI65cbg </a></p>
<h2 id="activity-name" class="rich_media_title"><a href="https://mp.weixin.qq.com/s/ePD7XyvDeYdKkuSMI65cbg" target="_blank">The annual 生信技能树 offline single-cell training course is now enrolling </a></h2>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/4458.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sequencing depth and GC content of hg38 in 200 kb windows</title>
		<link>http://www.bio-info-trainee.com/4452.html</link>
		<comments>http://www.bio-info-trainee.com/4452.html#comments</comments>
		<pubDate>Wed, 12 Jun 2019 05:47:28 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=4452</guid>
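		<description><![CDATA[[Live] My Genome 47: the relationship between sequencing depth and GC content. I used to write my own scripts for this, which were too complex for readers &#8230; <a href="http://www.bio-info-trainee.com/4452.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>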
				<content:encoded><![CDATA[<div class="markdown-here-wrapper" data-md-url="http://www.bio-info-trainee.com/wp-admin/post-new.php">
<h3 style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;"><a class="gj_safe_a" href="http://mp.weixin.qq.com/s?__biz=MzAxMDkxODM1Ng==&amp;mid=2247483912&amp;idx=2&amp;sn=98431355a5aaa363741bd66de50272c4&amp;chksm=9b4842b3ac3fcba535b26486ffbb95892d172f8240f9b55a281e19b50fff713d05a53aefb45a&amp;scene=21#wechat_redirect" target="_blank">[Live] My Genome 47: the relationship between sequencing depth and GC content</a> I used to write my own scripts for this, but they were too complex for readers to follow</h3>
<h3 id="-hg38-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Download the hg38 reference genome</h3>
<p style="margin: 0px 0px 1.2em !important;">A Google search finds it directly:</p>
<p style="margin: 0px 0px 1.2em !important;"><img src="http://www.bio-info-trainee.com/wp-content/uploads/2019/06/image-20190612132629183.png" alt="image-20190612132629183" /><span id="more-4452"></span></p>
<p style="margin: 0px 0px 1.2em !important;">This gives the download link. The compressed file is close to 1 GB and contains the reference genome, which is roughly 3 GB.</p>
<p style="margin: 0px 0px 1.2em !important;"><a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz</a></p>
<h3 id="-hg38-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Get the length of each hg38 chromosome</h3>
<p style="margin: 0px 0px 1.2em !important;">Use the faidx command that ships with the Python pyfaidx module:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-bash" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;">pip install pyfaidx
gunzip hg38.fa.gz   # unpack the download first
faidx hg38.fa -i chromsizes &gt; sizes.genome
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">The result:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">chr1 248956422
chr10 133797422
chr11 135086622
chr12 133275309
chr13 114364328
chr14 107043718
chr15 101991189
chr16 90338345
chr17 83257441
chr18 80373285
chr19 58617616
chr2 242193529
chr20 64444167
chr21 46709983
chr22 50818468
chr3 198295559
chr4 190214555
chr5 181538259
chr6 170805979
chr7 159345973
chr8 145138636
chr9 138394717
chrM 16569
chrX 156040895
chrY 57227415
</code></pre>
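<p style="margin: 0px 0px 1.2em !important;">If pyfaidx is not available, the same two-column table can be derived from the FASTA file itself. The following is only a minimal awk sketch and not part of the original workflow; fasta_sizes is a hypothetical helper name, and it assumes an uncompressed FASTA arrives on standard input or as a file argument:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-bash" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"># Hypothetical helper (not from the original post):
# print "name&lt;TAB&gt;length" for every record of a FASTA file.
fasta_sizes() {
  awk '
    /^&gt;/ {
      if (name != "") print name "\t" len   # flush the previous record
      name = substr($1, 2)                  # header up to the first blank, without "&gt;"
      len = 0
      next
    }
    { len += length($0) }                   # sequence line: add its length
    END { if (name != "") print name "\t" len }
  ' "$@"
}
# usage: gunzip -c hg38.fa.gz | fasta_sizes &gt; sizes.genome
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">This prints every sequence in the file, in file order, so the output may need filtering if only the main chromosomes are wanted.</p>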
<h3 id="-200kb-bed-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Generate a BED file of 200 kb windows</h3>
<p style="margin: 0px 0px 1.2em !important;">This needs a small script. I use Perl, the language I know best:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-perl" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;">cat sizes.genome |perl -alne <span class="hljs-string" style="color: #dd1144;">'{print "$F[0]\t",200000*$_,"\t",200000*($_+1) foreach 0..$F[1]/200000-1}'</span> &gt; <span class="hljs-number" style="color: #008080;">200</span>k.bed
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">The output file starts like this:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">chr1 0 200000
chr1 200000 400000
chr1 400000 600000
chr1 600000 800000
chr1 800000 1000000
chr1 1000000 1200000
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">Note that the mitochondrial genome (chrM) is shorter than 200 kb, so a later step prints this warning:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">Feature (chrM:0-200000) beyond the length of chrM size (16569 bp). Skipping.
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">This is harmless, though: the oversized chrM window is simply skipped.</p>
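<p style="margin: 0px 0px 1.2em !important;">The Perl one-liner emits only full-size windows: the trailing partial window of each chromosome is dropped, and a chromosome shorter than one window (chrM) still gets a full 0 to 200000 interval, which is what triggers the warning. A minimal awk sketch that clips the last window to the chromosome length instead is shown here; make_windows is a hypothetical helper name, not something from the original post. bedtools also has a makewindows subcommand (bedtools makewindows -g sizes.genome -w 200000) that should give an equivalent tiling.</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-bash" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"># Hypothetical helper (not from the original post):
# tile every chromosome into fixed-size windows and clip the last one.
#   $1 = chromosome sizes file (name, length)   $2 = window size in bp
make_windows() {
  awk -v w="$2" '
    {
      for (start = 0; start &lt; $2; start += w) {
        end = start + w
        if (end &gt; $2) end = $2          # clip the final window to the chromosome length
        printf "%s\t%d\t%d\n", $1, start, end
      }
    }
  ' "$1"
}
# usage: make_windows sizes.genome 200000 &gt; 200k.bed
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">With clipped windows the last interval of each chromosome is shorter than 200 kb, so read counts in those windows are better compared per base than as raw totals.</p>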
<h3 id="-bedtools-200kb-gc-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Compute genomic GC content per 200 kb window with bedtools</h3>
<p style="margin: 0px 0px 1.2em !important;">Because bedtools is a mature, ready-made tool, this takes a single line:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">bedtools nuc -fi hg38.fa -bed 200k.bed | cut -f 1-3,5 &gt; 200k_gc.bed
# 4_pct_at 5_pct_gc 6_num_A 7_num_C 8_num_G 9_num_T 10_num_N
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">The file looks like this:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">$head 200k_gc.bed
#1_usercol 2_usercol 3_usercol 5_pct_gc
chr1 0 200000 0.420110
chr1 200000 400000 0.220065
chr1 400000 600000 0.315425
chr1 600000 800000 0.427140
chr1 800000 1000000 0.534730
chr1 1000000 1200000 0.608690
chr1 1200000 1400000 0.616775
chr1 1400000 1600000 0.567760
chr1 1600000 1800000 0.550655
</code></pre>
<h3 id="-bedtools-200kb-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Count reads per 200 kb window with bedtools</h3>
<p style="margin: 0px 0px 1.2em !important;">This uses the bedtools multicov subcommand:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">bedtools multicov -bams alignment/SRR3081110.bam -bed 200k.bed &gt; 200K_counts.bed
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">The output file:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">$head 200K_counts.bed
chr1 0 200000 53
chr1 200000 400000 44
chr1 400000 600000 58
chr1 600000 800000 146
chr1 800000 1000000 39
chr1 1000000 1200000 16
chr1 1200000 1400000 16
chr1 1400000 1600000 33
chr1 1600000 1800000 43
chr1 1800000 2000000 63
</code></pre>
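<p style="margin: 0px 0px 1.2em !important;">To relate depth to GC content, the two tables have to be combined window by window. Because bedtools nuc skipped the chrM window while the count table may still contain it, joining on the coordinates is safer than pasting the files line by line. A minimal awk sketch follows; join_gc_counts is a hypothetical helper name, not part of the original post:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-bash" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"># Hypothetical helper (not from the original post):
# merge GC fraction and read count on chrom/start/end.
#   $1 = GC table    (chrom start end gc; header line starts with "#")
#   $2 = count table (chrom start end count)
join_gc_counts() {
  awk 'BEGIN { OFS = "\t" }
    FNR == NR {                          # first file: remember GC per window
      if ($0 !~ /^#/) gc[$1, $2, $3] = $4
      next
    }
    ($1, $2, $3) in gc {                 # second file: keep windows present in both
      print $1, $2, $3, gc[$1, $2, $3], $4
    }
  ' "$1" "$2"
}
# usage: join_gc_counts 200k_gc.bed 200K_counts.bed &gt; 200k_gc_counts.tsv
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">The output has five columns (chromosome, start, end, GC fraction, read count) and can be loaded into R for plotting; the output file name in the usage line is only an example.</p>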
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/4452.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Testing for differentially methylated regions in several ways with the DSS package</title>
		<link>http://www.bio-info-trainee.com/4392.html</link>
		<comments>http://www.bio-info-trainee.com/4392.html#comments</comments>
		<pubDate>Fri, 31 May 2019 09:28:26 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=4392</guid>
		<description><![CDATA[Let's learn a package together! Some background: CpG sites in mammalian genomes are usually concentrated in regions called CpG islands (Cp &#8230; <a href="http://www.bio-info-trainee.com/4392.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<div class="markdown-here-wrapper" data-md-url="http://www.bio-info-trainee.com/wp-admin/post-new.php">
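<p style="margin: 0px 0px 1.2em !important;">Let's learn a package together!<span id="more-4392"></span></p>
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Some background</h3>
<p style="margin: 0px 0px 1.2em !important;">CpG sites in mammalian genomes are usually concentrated in regions called CpG islands (CGIs), and about 60% of human gene promoters are known to contain a CpG island. The genomic regions within 2000 base pairs (2 kb) upstream and downstream of a CpG island are called CpG shores. CpG shelves are the regions within a further 2 kb upstream and downstream of the CpG shores, and open sea refers to everything outside the CpG islands, shores, and shelves. Together these four categories make up the CpG resort. CpG density decreases from island to open sea.</p>
<h3 id="3-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Three technologies</h3>
<p style="margin: 0px 0px 1.2em !important;">The main ones are the bisulfite sequencing methods WGBS and RRBS, plus arrays:</p>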
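<p style="margin: 0px 0px 1.2em !important;"><strong>Whole Genome Bisulfite Sequencing (WGBS)</strong> is the gold standard for DNA methylation studies. It combines bisulfite treatment with whole-genome DNA sequencing to profile methylation across the entire genome at single-base resolution, so the methylation level of each individual cytosine can be estimated precisely and a fine-grained genome-wide methylation map can be built. The data volume is very large.</p>
<p style="margin: 0px 0px 1.2em !important;"><strong>Reduced representation bisulfite sequencing</strong> (RRBS) is an <strong>accurate, efficient, and economical</strong> method for studying DNA methylation. It enriches promoter and CpG island regions by restriction digestion (Msp I) followed by bisulfite sequencing, which gives both high resolution in detecting methylation status and efficient use of the sequencing data. As a cost-effective approach, RRBS has broad prospects in studies of large clinical cohorts.</p>
<p style="margin: 0px 0px 1.2em !important;"><strong>Illumina Infinium BeadChip arrays</strong>, including HumanMethylation450 (450K) and MethylationEPIC (850K). Infinium arrays suffer from dye bias, differing probe chemistries, and position effects. All of these are known to affect the results and must be corrected during data processing. On the 450K array, cross-reactive probes and probes that map ambiguously to multiple locations in the human genome affect about 140,000 of the 485,000 probes (29%), which reduces the number of usable probes to about 345,000. The problem persists in the newer 850K array, which includes &gt; 90% of the 450K probes.</p>
<p style="margin: 0px 0px 1.2em !important;">A published comparison of these technologies: Empirical comparison of reduced representation bisulfite sequencing and Infinium BeadChip reproducibility and coverage of DNA methylation in humans</p>
<p style="margin: 0px 0px 1.2em !important;">The main analysis targets are <strong>differentially methylated regions (DMRs) and DMR-associated differentially expressed genes</strong></p>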
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">The data</h3>
<p style="margin: 0px 0px 1.2em !important;">Here we use <a href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52140">https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52140</a></p>
<p style="margin: 0px 0px 1.2em !important;">It provides a downloadable signal matrix for each sample</p>
<p style="margin: 0px 0px 1.2em !important;"><img src="http://www.bio-info-trainee.com/wp-content/uploads/2019/05/image-20190530121638089.png" alt="image-20190530121638089" /></p>
<p style="margin: 0px 0px 1.2em !important;">Download the data and take a look:</p>
<p style="margin: 0px 0px 1.2em !important;"><img src="http://www.bio-info-trainee.com/wp-content/uploads/2019/05/image-20190531144259742.png" alt="image-20190531144259742" /></p>
<p style="margin: 0px 0px 1.2em !important;">The files in the archive look like this:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">head GSM1084240_A3R_d6_250.cpgs.txt
CHR POS A3R_d6_250
chr1 10525 21/23
chr1 10542 21/23
chr1 10609 1/23
chr1 10617 0/24
chr1 10620 0/24
chr1 10631 0/24
chr1 10633 0/24
chr1 10636 0/24
chr1 10638 0/24
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">DSS expects input in which <em>each row represents one CpG site</em>, with these columns:</p>
<ul style="margin: 1.2em 0px; padding-left: 2em;">
<li style="margin: 0.5em 0px;">Column 1: chromosome</li>
<li style="margin: 0.5em 0px;">Column 2: position</li>
<li style="margin: 0.5em 0px;">Column 3: total reads</li>
<li style="margin: 0.5em 0px;">Column 4: methylated reads</li>
</ul>
<p style="margin: 0px 0px 1.2em !important;">So the third column of the downloaded data has to be split into these counts, and the result loaded into R, before DSS can use it.</p>
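<p style="margin: 0px 0px 1.2em !important;">The split can also be done on the command line before R is involved. This is a minimal awk sketch, not the post's own method; cpgs_to_dss is a hypothetical helper name. It follows the same reading of the a/b column as the R code in this post (a = methylated reads, b = unmethylated reads). That reading is an assumption: if b is actually the total coverage, N should be b alone, so check the GEO series description first.</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-bash" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"># Hypothetical helper (not from the original post):
# turn "CHR POS a/b" rows into the chr, pos, N, X columns that DSS expects.
# Reads a single file argument or standard input.
cpgs_to_dss() {
  awk 'BEGIN { OFS = "\t"; print "chr", "pos", "N", "X" }
    NR == 1 { next }                     # skip the original header line
    {
      split($3, c, "/")                  # c[1] = a, c[2] = b
      # Assumption: a = methylated reads, b = unmethylated reads.
      # If b is the total coverage instead, print c[2] as N.
      print $1, $2, c[1] + c[2], c[1]
    }
  ' "$@"
}
# usage: cpgs_to_dss GSM1084240_A3R_d6_250.cpgs.txt &gt; A3R_d6_250.dss.txt
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">The output file name in the usage line is only an example. The resulting table has the four columns listed above and can be read with fread.</p>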
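<h3 id="dss-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Introducing the DSS package</h3>
<p style="margin: 0px 0px 1.2em !important;">The plan is to download the data files of the project above, load them into R, and analyze them with the DSS package.</p>
<p style="margin: 0px 0px 1.2em !important;"><strong>DSS (Dispersion Shrinkage for Sequencing data)</strong> is a Bioconductor package designed for differential analysis of high-throughput sequencing data. <strong>It is mainly used with BS-seq (bisulfite sequencing) data to detect differentially methylated loci (DML) and differentially methylated regions (DMR) between groups</strong>, that is, to call DML or DMR.</p>
<p style="margin: 0px 0px 1.2em !important;"><strong>Using DSS mainly involves:</strong></p>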
<ul style="margin: 1.2em 0px; padding-left: 2em;">
<li style="margin: 0.5em 0px;">Preparing the input files</li>
<li style="margin: 0.5em 0px;">Testing all sites with the DMLtest function</li>
<li style="margin: 0.5em 0px;">Selecting statistically significant sites with the callDML function</li>
<li style="margin: 0.5em 0px;">Calling DMRs with the callDMR function</li>
<li style="margin: 0.5em 0px;">Visualizing DMRs with the showOneDMR function</li>
</ul>
<p style="margin: 0px 0px 1.2em !important;">First, load the files of the GSE52140 dataset described above:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-r" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"><span class="hljs-keyword" style="color: #333333; font-weight: bold;">library</span>(data.table)
<span class="hljs-keyword" style="color: #333333; font-weight: bold;">library</span>(stringr)
<span class="hljs-keyword" style="color: #333333; font-weight: bold;">library</span>(tidyverse) 
allDat &lt;- lapply(list.files(<span class="hljs-string" style="color: #dd1144;">'GSE52140_RAW/'</span>,pattern=<span class="hljs-string" style="color: #dd1144;">'*cpgs.txt.gz'</span>),<span class="hljs-keyword" style="color: #333333; font-weight: bold;">function</span>(f){
 <span class="hljs-comment" style="color: #999988; font-style: italic;"># f="GSM1251242_H2R_d0.cpgs.txt.gz";</span>
 print(f);
 tmp=fread(file.path(<span class="hljs-string" style="color: #dd1144;">'GSE52140_RAW/'</span>,f))
 chr=as.character(tmp$CHR)
 pos=as.character(tmp$POS) 
 newTmp=separate(tmp,col =<span class="hljs-number" style="color: #008080;">3</span>,into = c(<span class="hljs-string" style="color: #dd1144;">"methy"</span>, <span class="hljs-string" style="color: #dd1144;">"unmethy"</span>), sep = <span class="hljs-string" style="color: #dd1144;">"/"</span>)
 newTmp$all=as.numeric(newTmp$methy)+as.numeric(newTmp$unmethy)
 newTmp=as.data.frame(newTmp[,c(<span class="hljs-number" style="color: #008080;">1</span>,<span class="hljs-number" style="color: #008080;">2</span>,<span class="hljs-number" style="color: #008080;">5</span>,<span class="hljs-number" style="color: #008080;">3</span>)])
 colnames(newTmp)=c(<span class="hljs-string" style="color: #dd1144;">'chr'</span>, <span class="hljs-string" style="color: #dd1144;">'pos'</span> ,<span class="hljs-string" style="color: #dd1144;">'N'</span> ,<span class="hljs-string" style="color: #dd1144;">'X'</span>)
 <span class="hljs-keyword" style="color: #333333; font-weight: bold;">return</span>(newTmp)
})

<span class="hljs-comment" style="color: #999988; font-style: italic;">## Note that the number of sites differs between samples</span>
do.call(rbind,lapply(allDat,dim))
tmp=do.call(cbind,lapply(allDat,head))

sn=gsub(<span class="hljs-string" style="color: #dd1144;">'.cpgs.txt.gz'</span>,<span class="hljs-string" style="color: #dd1144;">''</span>,list.files(<span class="hljs-string" style="color: #dd1144;">'GSE52140_RAW/'</span>,pattern=<span class="hljs-string" style="color: #dd1144;">'*cpgs.txt.gz'</span>))
sn=gsub(<span class="hljs-string" style="color: #dd1144;">'GSM.*?_'</span>,<span class="hljs-string" style="color: #dd1144;">''</span>,sn)
sn
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">This reads in all 17 files. The sample names are:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">&gt; sn
 [1] "A0R_d0_rep1" "A3R_d0_rep1" "A3R_d6_250" "A3R_d6_1000" 
 [5] "A3R_d13_250" "A3R_d13_1000" "H0R_d0_rep1" "H3R_d0_rep1" 
 [9] "H3R_d6_250" "H3R_d13_250" "A0R_d0_rep2" "A3R_d0_rep2" 
[13] "H0R_d0_rep2" "H3R_d0_rep2" "A1R_d0" "A2R_d0" 
[17] "H2R_d0"
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">At this point the variable is fairly large and may strain your computer's memory.</p>
<p style="margin: 0px 0px 1.2em !important;"><img src="http://www.bio-info-trainee.com/wp-content/uploads/2019/05/image-20190531153428574.png" alt="image-20190531153428574" /></p>
<p style="margin: 0px 0px 1.2em !important;">The three examples below show how to use the DSS package. They rely on the sample naming scheme above, which encodes the following:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;"># lung cancer cell lines A549 (A) and HTB56 (H)
# normal cell lines (0R) 
# a highly metastatic phenotype (3R)
# 5-Azacytidine treatment at low concentrations (250 nM &amp; 1000 nM) 
# for 6 days, additional 7 days in regular medium
</code></pre>
<h3 id="-vs-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Single sample vs. single sample</h3>
<p style="margin: 0px 0px 1.2em !important;">The code follows. The key steps are building the object and running the statistical test:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-R" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"><span class="hljs-keyword" style="color: #333333; font-weight: bold;">library</span>(DSS)
<span class="hljs-keyword" style="color: #333333; font-weight: bold;">require</span>(bsseq) 
<span class="hljs-keyword" style="color: #333333; font-weight: bold;">if</span>(<span class="hljs-literal">T</span>){
 BSobj &lt;- makeBSseqData(allDat[<span class="hljs-number" style="color: #008080;">1</span>:<span class="hljs-number" style="color: #008080;">2</span>],
 c(<span class="hljs-string" style="color: #dd1144;">"A0R"</span>, <span class="hljs-string" style="color: #dd1144;">"A3R"</span>) )[<span class="hljs-number" style="color: #008080;">1</span>:<span class="hljs-number" style="color: #008080;">1000</span>,]
 BSobj
 save(BSobj,file = <span class="hljs-string" style="color: #dd1144;">'single-BSobj.Rdata'</span>)
 <span class="hljs-comment" style="color: #999988; font-style: italic;"># There is no biological replicates in at least one condition.</span>
 dmlTest &lt;- DMLtest(BSobj, group1=c(<span class="hljs-string" style="color: #dd1144;">"A0R"</span>), group2=c(<span class="hljs-string" style="color: #dd1144;">"A3R"</span>),smoothing=<span class="hljs-literal">TRUE</span>)
 head(dmlTest) 
}
</code></pre>
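<h3 id="-vs-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">A group of replicates vs. another group</h3>
<p style="margin: 0px 0px 1.2em !important;">The code follows. Again the key steps are building the object and running the statistical test. Here we compare the two replicates of the "A0R_d0" group with the two replicates of the "A3R_d0" group:</p>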
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-R" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"><span class="hljs-keyword" style="color: #333333; font-weight: bold;">if</span>(<span class="hljs-literal">T</span>){
 sn[c(<span class="hljs-number" style="color: #008080;">1</span>,<span class="hljs-number" style="color: #008080;">11</span>,<span class="hljs-number" style="color: #008080;">2</span>,<span class="hljs-number" style="color: #008080;">12</span>)]
 BSobj &lt;- makeBSseqData(allDat[c(<span class="hljs-number" style="color: #008080;">1</span>,<span class="hljs-number" style="color: #008080;">11</span>,<span class="hljs-number" style="color: #008080;">2</span>,<span class="hljs-number" style="color: #008080;">12</span>)],
 sn[c(<span class="hljs-number" style="color: #008080;">1</span>,<span class="hljs-number" style="color: #008080;">11</span>,<span class="hljs-number" style="color: #008080;">2</span>,<span class="hljs-number" style="color: #008080;">12</span>)] )[<span class="hljs-number" style="color: #008080;">1</span>:<span class="hljs-number" style="color: #008080;">1000</span>,]
 BSobj 
 save(BSobj,file = <span class="hljs-string" style="color: #dd1144;">'group-BSobj.Rdata'</span>)
 dmlTest &lt;- DMLtest(BSobj, group1=c(<span class="hljs-string" style="color: #dd1144;">"A0R_d0_rep1"</span>,<span class="hljs-string" style="color: #dd1144;">"A0R_d0_rep2"</span>),
 group2=c(<span class="hljs-string" style="color: #dd1144;">"A3R_d0_rep1"</span>,<span class="hljs-string" style="color: #dd1144;">"A3R_d0_rep2"</span>),smoothing=<span class="hljs-literal">F</span>)
 head(dmlTest) 
}
</code></pre>
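<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Multiple comparison schemes</h3>
<p style="margin: 0px 0px 1.2em !important;">The code follows. The key steps are still building the object and running the statistical test, but now a design matrix has to be built and a DMLfit step is added.</p>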
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-r" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"> sn
 sn[grep(<span class="hljs-string" style="color: #dd1144;">'rep'</span>,sn)]
 cellline=substring(sn[grep(<span class="hljs-string" style="color: #dd1144;">'rep'</span>,sn)],<span class="hljs-number" style="color: #008080;">1</span>,<span class="hljs-number" style="color: #008080;">1</span>)
 type=substring(sn[grep(<span class="hljs-string" style="color: #dd1144;">'rep'</span>,sn)],<span class="hljs-number" style="color: #008080;">2</span>,<span class="hljs-number" style="color: #008080;">2</span>)
 design=data.frame(cellline=cellline,type=type)
 design
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">The resulting design table is shown below:</p>
<p style="margin: 0px 0px 1.2em !important;"><img src="http://www.bio-info-trainee.com/wp-content/uploads/2019/05/image-20190531162248144.png" alt="image-20190531162248144" /></p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-R" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"> sn
 sn[grep(<span class="hljs-string" style="color: #dd1144;">'rep'</span>,sn)]
 cellline=substring(sn[grep(<span class="hljs-string" style="color: #dd1144;">'rep'</span>,sn)],<span class="hljs-number" style="color: #008080;">1</span>,<span class="hljs-number" style="color: #008080;">1</span>)
 type=substring(sn[grep(<span class="hljs-string" style="color: #dd1144;">'rep'</span>,sn)],<span class="hljs-number" style="color: #008080;">2</span>,<span class="hljs-number" style="color: #008080;">2</span>)
 design=data.frame(cellline=cellline,type=type)
 design

 <span class="hljs-comment" style="color: #999988; font-style: italic;"># Building the object is very time-consuming</span>
 BSobj &lt;- makeBSseqData(allDat[c(grep(<span class="hljs-string" style="color: #dd1144;">'rep'</span>,sn))],
 sn[grep(<span class="hljs-string" style="color: #dd1144;">'rep'</span>,sn)]) 
 BSobj 
 save(BSobj,file = <span class="hljs-string" style="color: #dd1144;">'multi-BSobj.Rdata'</span>)
 load(file = <span class="hljs-string" style="color: #dd1144;">'multi-BSobj.Rdata'</span>)
 DMLfit=DMLfit.multiFactor(BSobj,design,
 formula = ~cellline+type+cellline:type)

 colnames(DMLfit$X) 
 <span class="hljs-comment" style="color: #999988; font-style: italic;"># Any of 'coef', 'term', or 'Contrast' can be used here; this only demonstrates coef</span>
 dmlTest=DMLtest.multiFactor(DMLfit,coef=<span class="hljs-number" style="color: #008080;">2</span>)
 head(dmlTest)
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">Note that the result of <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">DMLtest.multiFactor</code> does not need <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">callDML</code>; calling <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">callDMR</code> is enough.</p>
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Working with the results</h3>
<p style="margin: 0px 0px 1.2em !important;">Whichever comparison you run, you end up with a <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">dmlTest</code> variable that feeds the downstream steps: identifying significantly differentially methylated regions and genes, and visualizing them. The code is as follows:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-r" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"><span class="hljs-comment" style="color: #999988; font-style: italic;"># 3.Call DML by using callDML function. The results DMLs are sorted by the significance.</span>
dmls &lt;- callDML(dmlTest, p.threshold=<span class="hljs-number" style="color: #008080;">0.001</span>)
head(dmls)
<span class="hljs-comment" style="color: #999988; font-style: italic;">##To detect loci with difference greater than 0.1, do:</span>
dmls2 &lt;- callDML(dmlTest, delta=<span class="hljs-number" style="color: #008080;">0.1</span>, p.threshold=<span class="hljs-number" style="color: #008080;">0.001</span>)
head(dmls2)

<span class="hljs-comment" style="color: #999988; font-style: italic;"># 4.Call DMR by using callDMR function</span>
<span class="hljs-comment" style="color: #999988; font-style: italic;">##Regions with many statistically significant CpG sites are identified as DMRs.</span>
dmrs &lt;- callDMR(dmlTest, p.threshold=<span class="hljs-number" style="color: #008080;">0.01</span>)
head(dmrs)
<span class="hljs-comment" style="color: #999988; font-style: italic;">##To detect regions with difference greater than 0.1, do:</span>
dmrs2 &lt;- callDMR(dmlTest, delta=<span class="hljs-number" style="color: #008080;">0.1</span>, p.threshold=<span class="hljs-number" style="color: #008080;">0.05</span>)
head(dmrs2)

<span class="hljs-comment" style="color: #999988; font-style: italic;"># 5.The DMRs can be visualized using showOneDMR function</span>
showOneDMR(dmrs[<span class="hljs-number" style="color: #008080;">1</span>,], BSobj)
</code></pre>
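<p style="margin: 0px 0px 1.2em !important;">All of these parameters are adjustable; choose the statistical significance thresholds that suit your own data.</p>
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Homework</h3>
<p style="margin: 0px 0px 1.2em !important;">Analyze the paper Enduring epigenetic landmarks define the cancer microenvironment and obtain the differentially methylated regions (DMRs) between patients and the DMR-associated differentially expressed genes (DE-DMRs).</p>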
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/4392.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Testing conda mirrors in 2019</title>
		<link>http://www.bio-info-trainee.com/4130.html</link>
		<comments>http://www.bio-info-trainee.com/4130.html#comments</comments>
		<pubDate>Tue, 07 May 2019 04:00:39 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=4130</guid>
		<description><![CDATA[Many people have recently reported that the conda mirrors are down, so I felt I should test this for my readers: wget  &#8230; <a href="http://www.bio-info-trainee.com/4130.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<div class="markdown-here-wrapper" data-md-url="http://www.bio-info-trainee.com/wp-admin/post-new.php">
<p style="margin: 0px 0px 1.2em !important;">Many people have recently reported that the conda mirrors are down, so I felt I should test this for my readers:<span id="more-4130"></span></p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-shell" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;">wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
</code></pre>
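<p style="margin: 0px 0px 1.2em !important;">Just run the installer as usual.</p>
<p style="margin: 0px 0px 1.2em !important;">If you did not let the installer add conda to your environment variables, you can reinstall or edit your .bashrc by hand. The account I tested with here is the ubuntu user.</p>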
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">no change /home/ubuntu/miniconda3/condabin/conda
no change /home/ubuntu/miniconda3/bin/conda
no change /home/ubuntu/miniconda3/bin/conda-env
no change /home/ubuntu/miniconda3/bin/activate
no change /home/ubuntu/miniconda3/bin/deactivate
no change /home/ubuntu/miniconda3/etc/profile.d/conda.sh
no change /home/ubuntu/miniconda3/etc/fish/conf.d/conda.fish
no change /home/ubuntu/miniconda3/shell/condabin/Conda.psm1
no change /home/ubuntu/miniconda3/shell/condabin/conda-hook.ps1
no change /home/ubuntu/miniconda3/lib/python3.7/site-packages/xonsh/conda.xsh
no change /home/ubuntu/miniconda3/etc/profile.d/conda.csh
modified /home/ubuntu/.bashrc
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">The changes made to this file are fairly intricate. Most people may not follow them, and they do not need to.</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-shell" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"># &gt;&gt;&gt; conda initialize &gt;&gt;&gt;
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/home/ubuntu/miniconda3/bin/conda' 'shell.bash' 'hook' 2&gt; /dev/null)"
if [ $? -eq 0 ]; then
 eval "$__conda_setup"
else
 if [ -f "/home/ubuntu/miniconda3/etc/profile.d/conda.sh" ]; then
 . "/home/ubuntu/miniconda3/etc/profile.d/conda.sh"
 else
 export PATH="/home/ubuntu/miniconda3/bin:$PATH"
 fi
fi
unset __conda_setup
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">To view the default channels, run: <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">conda config --show channels</code></p>
<h3 id="conda-base-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Auto-activation of conda's base environment can be turned off</h3>
<p style="margin: 0px 0px 1.2em !important;">That way you do not have to worry about conda polluting the system environment.</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-shell" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;">source /home/ubuntu/.bashrc
conda config --set auto_activate_base false
</code></pre>
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Trying a download and install</h3>
<p style="margin: 0px 0px 1.2em !important;">First I tested <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">conda install numpy</code> and the download speed was excellent.</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;"> blas pkgs/main/linux-64::blas-1.0-mkl
 intel-openmp pkgs/main/linux-64::intel-openmp-2019.3-199
 libgfortran-ng pkgs/main/linux-64::libgfortran-ng-7.3.0-hdf63c60_0
 mkl pkgs/main/linux-64::mkl-2019.3-199
 mkl_fft pkgs/main/linux-64::mkl_fft-1.0.12-py37ha843d7b_0
 mkl_random pkgs/main/linux-64::mkl_random-1.0.2-py37hd81dba3_0
 numpy pkgs/main/linux-64::numpy-1.16.3-py37h7e9f1db_0
 numpy-base pkgs/main/linux-64::numpy-base-1.16.3-py37hde5b4d6_0
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">Even mkl-2019.3 (203.3 MB) finished downloading in a few seconds.</p>
<p style="margin: 0px 0px 1.2em !important;">Next, try bioinformatics software such as fastqc.</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-shell" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;">conda deactivate
conda create -n qc
conda activate qc
conda info
conda install fastqc -c bioconda
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">That was also very fast. At this point my channel configuration was:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">custom_channels:
 pkgs/main: https://repo.anaconda.com
 pkgs/free: https://repo.anaconda.com
 pkgs/r: https://repo.anaconda.com
 pkgs/pro: https://repo.anaconda.com
custom_multichannels:
 defaults:
 - https://repo.anaconda.com/pkgs/main
 - https://repo.anaconda.com/pkgs/free
 - https://repo.anaconda.com/pkgs/r
 local:
debug: False
default_channels:
 - https://repo.anaconda.com/pkgs/main
 - https://repo.anaconda.com/pkgs/free
 - https://repo.anaconda.com/pkgs/r
default_python: 3.7
</code></pre>
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Trying other mirrors</h3>
<p style="margin: 0px 0px 1.2em !important;">First, the Tsinghua (TUNA) mirror:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
conda config --set show_channel_urls yes
conda config --show channels
conda deactivate
conda create -n qinghua
conda activate qinghua
conda install fastqc -c bioconda
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">It was a little slower, but still perfectly usable.</p>
<p style="margin: 0px 0px 1.2em !important;">Worth noting: this install pulled in <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">openjdk-11.0.1</code>, whereas the default channels installed <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">openjdk-8.0.152</code>.</p>
<p style="margin: 0px 0px 1.2em !important;">Next, the Tencent mirror:</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">conda config --add channels https://mirrors.cloud.tencent.com/anaconda/pkgs/free/
conda config --add channels https://mirrors.cloud.tencent.com/anaconda/cloud/bioconda/
conda config --add channels https://mirrors.cloud.tencent.com/anaconda/cloud/msys2/
conda config --add channels https://mirrors.cloud.tencent.com/anaconda/cloud/menpo/
conda config --add channels https://mirrors.cloud.tencent.com/anaconda/cloud/peterjc123/
conda config --add channels https://mirrors.cloud.tencent.com/anaconda/pkgs/main/
conda config --add channels https://mirrors.cloud.tencent.com/anaconda/cloud/conda-forge/
conda config --add channels https://mirrors.cloud.tencent.com/anaconda/cloud/pytorch/
conda config --set show_channel_urls yes
conda config --show channels
conda deactivate
conda create -n tercent -y 
conda activate tercent 
conda install fastqc -c bioconda
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">It was not accessible.</p>
<h3 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.3em;">Removing mirrors that were added</h3>
<p style="margin: 0px 0px 1.2em !important;">Since the Tencent mirror failed, it needs to be removed, which just means editing the file <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">~/.condarc</code>.</p>
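<p style="margin: 0px 0px 1.2em !important;">A quick way to check whether a mirror answers before adding it is to request its index file. The sketch below assumes that <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">curl</code> is installed and that the channel serves <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">noarch/repodata.json</code>, as conda channels normally do. The helper name <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">channel_reachable</code> is hypothetical and not part of conda.</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-shell" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"># Hypothetical helper: succeeds when CHANNEL/noarch/repodata.json responds.
channel_reachable() {
  # --head requests headers only, so the (large) index is not downloaded;
  # --fail turns HTTP errors into a non-zero exit status.
  curl --silent --fail --head --max-time 5 --output /dev/null "${1%/}/noarch/repodata.json"
}

# Channel URLs taken from the commands in this post.
for url in \
  https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda \
  https://mirrors.cloud.tencent.com/anaconda/cloud/bioconda
do
  if channel_reachable "$url"; then
    echo "reachable:   $url"
  else
    echo "unreachable: $url"
  fi
done
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">A server that rejects HEAD requests would be reported as unreachable by this check, so treat the result as a hint rather than proof.</p>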
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">conda config --show channels
</code></pre>
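<p style="margin: 0px 0px 1.2em !important;">A minimal sketch of that edit, run on a throwaway copy instead of the real <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">~/.condarc</code>. The sample file contents and the <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">condarc</code> variable are assumptions for illustration; the filter simply drops every line that names the Tencent mirror host.</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code class="hljs language-shell" style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block; overflow: auto; overflow-x: auto; color: #333333; background: #f8f8f8; text-size-adjust: none;"># Throwaway stand-in for ~/.condarc (contents are an assumed example).
condarc="$(mktemp)"
printf '%s\n' \
  'channels:' \
  '  - https://mirrors.cloud.tencent.com/anaconda/cloud/bioconda/' \
  '  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda' \
  '  - defaults' \
  'show_channel_urls: true' | tee "$condarc"

# Keep every line except those naming the Tencent host (-F: literal match).
grep -vF 'mirrors.cloud.tencent.com' "$condarc" | tee "$condarc.cleaned"

# With conda itself, one entry can be removed by its exact URL instead:
#   conda config --remove channels https://mirrors.cloud.tencent.com/anaconda/cloud/bioconda/
</code></pre>
<p style="margin: 0px 0px 1.2em !important;">After reviewing the cleaned copy you would move it over the real file, then confirm the result with <code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; display: inline;">conda config --show channels</code>.</p>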
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/4130.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why learn programming: an elegant fix for Excel gene name errors</title>
		<link>http://www.bio-info-trainee.com/2997.html</link>
		<comments>http://www.bio-info-trainee.com/2997.html#comments</comments>
		<pubDate>Tue, 30 Jan 2018 05:50:42 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=2997</guid>
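		<description><![CDATA[Why learn programming: I was helping a classmate process the differential analysis results he got from a company; of course, what I was given was an Exce &#8230; <a href="http://www.bio-info-trainee.com/2997.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<h1 class="md-end-block md-heading">Why learn programming</h1>
<p><span class="md-line md-end-block">I was helping a classmate process the differential analysis results he had received from a company. Naturally, what I was given was an Excel spreadsheet. As usual, I exported it to csv and read it into R, planning to draw a volcano plot and run a GO/KEGG enrichment analysis along the way. Out of habit I glanced at the data structure and sorted the table by gene name, and, well, ha ha~</span><span id="more-2997"></span></p>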
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2018/01/geneSymbol-error.jpeg"><img class="alignnone size-full wp-image-2998" src="http://www.bio-info-trainee.com/wp-content/uploads/2018/01/geneSymbol-error.jpeg" alt="genesymbol-error" width="1422" height="584" /></a></p>
<p><span class="md-line md-end-block">This is a major pitfall.</span></p>
<p><span class="md-line md-end-block">Two papers have been written about this very problem:</span></p>
<ul class="ul-list" data-mark="-">
<li><span class="md-line md-end-block"><span class=""><a spellcheck="false" href="https://doi.org/10.1186/1471-2105-5-80">Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics</a></span> (2004)</span></li>
<li class=""><span class="md-line md-end-block"><span class=""><a spellcheck="false" href="https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1044-7">Gene name errors are widespread in the scientific literature</a></span><span class=""> (2016)</span></span></li>
</ul>
<p><span class="md-line md-end-block">Someone also asked about it on a forum, in a thread with as many as 2K views: <span spellcheck="false"><a href="https://www.biostars.org/p/211861/">https://www.biostars.org/p/211861/</a></span> </span></p>
<blockquote><p><span class="md-line md-end-block">Some gene names start with APR*/MARC*/SEPT* etc default converted into date format.</span></p></blockquote>
<p><span class="md-line md-end-block">It has also been shared on our 生信技能树 forum: <span class=""><a spellcheck="false" href="http://www.biotrainee.com/thread-908-1-1.html"><span class=""><em>Excel</em></span> will mangle your gene names, no questions asked!</a></span></span></p>
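<pre class="md-fences md-end-block" lang="" contenteditable="false">Silently altering 20% of genetics papers!

In August 2016, three scientists published a paper in Genome Biology reporting that 20% of genetics papers contain gene name conversion errors caused by Excel. Their scan of the literature showed that gene name errors are widespread: under its default settings, Excel converts gene names into dates or floating-point numbers.

For example, the gene names SEPT2 and MARCH1 are converted to 2-Sep and 1-Mar respectively, and the identifier 2310009E13 is converted to the floating-point number 2.31E+13.</pre>
<p><span class="md-line md-end-block">But if you can program, the fix is simple: rebuild the gene symbol column from a field that Excel did not touch.</span></p>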
<pre class="md-fences md-end-block" lang="R" contenteditable="false"><span class="cm-variable">a</span><span class="cm-operator cm-dollar">$</span><span class="cm-variable">Gene.Symbol</span><span class="cm-operator">=</span><span class="cm-variable">unlist</span>(<span class="cm-variable">lapply</span>(<span class="cm-variable">as.character</span>(<span class="cm-variable">a</span><span class="cm-operator cm-dollar">$</span><span class="cm-variable">gene_assignment</span>),<span class="cm-keyword">function</span>(<span class="cm-variable">x</span>){<span class="cm-variable">trimws</span>(<span class="cm-variable">strsplit</span>(<span class="cm-variable">x</span>,<span class="cm-string">'//'</span>)[[<span class="cm-number">1</span>]][<span class="cm-number">2</span>])}))</pre>
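<p><span class="md-line md-end-block">It can also help to check whether a gene list has already been damaged before trusting it. The following shell snippet is only a minimal sketch and not the method used above: the function name flag_date_like_symbols is hypothetical, it assumes one gene symbol per line on standard input, and it only catches the day-month pattern (such as 2-Sep), not the floating-point conversions.</span></p>
<pre class="md-fences md-end-block" lang="shell" contenteditable="false"># Hypothetical helper: print every input line that looks like an
# Excel date conversion of a gene symbol (one or two digits, a dash,
# then a three-letter month abbreviation, case-insensitive).
flag_date_like_symbols() {
  grep -E -i '^[0-9]{1,2}-(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)$'
}

# Example: only the two mangled names are printed (2-Sep and 1-Mar)
printf '%s\n' TP53 2-Sep MARCH1 1-Mar | flag_date_like_symbols</pre>
<p><span class="md-line md-end-block">An empty result does not prove the file is clean, but any hit is a strong hint that the table passed through Excel with default settings.</span></p>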
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/2997.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A variant-calling workflow</title>
		<link>http://www.bio-info-trainee.com/2790.html</link>
		<comments>http://www.bio-info-trainee.com/2790.html#comments</comments>
		<pubDate>Mon, 30 Oct 2017 02:55:56 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>
		<category><![CDATA[variants]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=2790</guid>
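		<description><![CDATA[Put simply, variant calling means aligning the thousands upon thousands of sequence fragments produced by high-throughput sequencing to a suitable reference genome, &#8230; <a href="http://www.bio-info-trainee.com/2790.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>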
				<content:encoded><![CDATA[<div class="markdown-here-wrapper" data-md-url="http://www.bio-info-trainee.com/wp-admin/post-new.php">
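<p style="margin: 0px 0px 1.2em !important;">Put simply, variant calling means aligning the thousands upon thousands of sequence fragments produced by high-throughput sequencing to a suitable reference genome, and then finding the small differences between the successfully aligned fragments and the reference. That involves the fastq format that stores the sequencing data, the alignment tool, the sam format produced by alignment, the tool that finds the small differences, and the vcf file that holds the results, along with the choice of software and the tuning of parameters at every step. Most important of all, of course, is getting the whole pipeline to run end to end and understanding what you are doing.</p>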
<p style="margin: 0px 0px 1.2em !important;"><span id="more-2790"></span></p>
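<h1 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.6em; border-bottom: 1px solid #dddddd;">A simulated project</h1>
<ul style="margin: 1.2em 0px; padding-left: 2em;">
<li style="margin: 0.5em 0px;">First, download the fasta sequences of chromosomes X and Y; they are available from UCSC.</li>
<li style="margin: 0.5em 0px;">Then build a bwa index for chromosome X.</li>
<li style="margin: 0.5em 0px;">Next, simulate sequencing data from chromosome Y. The simulation program is very simple; it generates sequencing fragments from chromosome Y (PE100, insert size 400).</li>
<li style="margin: 0.5em 0px;">Then align the simulated reads to the chromosome X reference and summarize the alignment results.</li>
<li style="margin: 0.5em 0px;">Finally, call variant sites on the bam file of successfully aligned reads.</li>
</ul>
<p style="margin: 0px 0px 1.2em !important;">The code is as follows:</p>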
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;">## bwa-0.7.15 was installed from source
## samtools was installed with conda
cd tmp/chrX_Y/hg19/
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chrX.fa.gz 
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chrY.fa.gz 
gunzip chrX.fa.gz
gunzip chrY.fa.gz
~/biosoft/bwa/bwa-0.7.15/bwa index chrX.fa
perl simulate.pl chrY.fa ## this perl script is at http://www.bio-info-trainee.com/wp-content/uploads/2015/10/tmp.png 
~/biosoft/bwa/bwa-0.7.15/bwa mem -t 5 -M chrX.fa read*.fa &gt;read.sam
samtools view -bS read.sam &gt;read.bam
samtools flagstat read.bam
samtools sort -@ 5 -o read.sorted.bam read.bam
samtools view -h -F4 -q 5 read.sorted.bam |samtools view -bS|samtools rmdup - read.filter.rmdup.bam
samtools index read.filter.rmdup.bam
samtools mpileup -ugf ~/tmp/chrX_Y/hg19/chrX.fa read.filter.rmdup.bam |bcftools call -vmO z -o read.bcftools.vcf.gz
## load the fa/bam/vcf files into IGV for visualization and take a screenshot of one variant site
## see http://www.biotrainee.com/thread-696-1-1.html
</code></pre>
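<p style="margin: 0px 0px 1.2em !important;">The commands above match the samtools release used at the time. As far as I know, more recent samtools/bcftools releases provide the BCF-producing pileup as <code>bcftools mpileup</code> instead of <code>samtools mpileup -ug</code>. The last step could then be sketched as below; this is an assumption-laden sketch rather than the original pipeline: the function name <code>call_variants</code> is hypothetical, and it assumes a bcftools 1.x binary on the PATH.</p>
<pre style="font-size: 1em; font-family: Consolas, Inconsolata, Courier, monospace; line-height: 1.2em; margin: 1.2em 0px;"><code style="font-size: 0.85em; font-family: Consolas, Inconsolata, Courier, monospace; margin: 0px 0.15em; padding: 0.5em 0.7em; white-space: pre; border: 1px solid #cccccc; background-color: #f8f8f8; border-radius: 3px; display: block !important; overflow: auto;"># Hypothetical wrapper around the final variant-calling step.
# $1: reference fasta, $2: sorted and indexed bam, $3: output vcf.gz
call_variants() {
  ref="$1"
  bam="$2"
  out="$3"
  # pileup as uncompressed BCF, then multiallelic calling, variants only,
  # written as bgzip-compressed VCF
  bcftools mpileup -Ou -f "$ref" "$bam" | bcftools call -mv -Oz -o "$out"
}

# usage: call_variants chrX.fa read.filter.rmdup.bam read.bcftools.vcf.gz
</code></pre>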
<h1 id="-" style="margin: 1.3em 0px 1em; padding: 0px; font-weight: bold; font-size: 1.6em; border-bottom: 1px solid #dddddd;">The variant-discovery workflow</h1>
<p style="margin: 0px 0px 1.2em !important;">The complete workflow can be quite complex:</p>
<p style="margin: 0px 0px 1.2em !important;"><a href="http://www.bio-info-trainee.com/wp-content/uploads/2017/10/Workflow-for-pharmacogenomics-using-WES-or-WGS-After-mapping-to-the-reference-sequence.jpg"><img class="alignnone size-full wp-image-2796" src="http://www.bio-info-trainee.com/wp-content/uploads/2017/10/Workflow-for-pharmacogenomics-using-WES-or-WGS-After-mapping-to-the-reference-sequence.jpg" alt="workflow-for-pharmacogenomics-using-wes-or-wgs-after-mapping-to-the-reference-sequence" width="600" height="439" /></a></p>
<p style="margin: 0px 0px 1.2em !important;">Even the upstream variant-discovery steps alone can be quite complex:</p>
<p style="margin: 0px 0px 1.2em !important;"><a href="http://www.bio-info-trainee.com/wp-content/uploads/2017/10/Variant-analysis-workflow-specifications.png"><img class="alignnone size-full wp-image-2793" src="http://www.bio-info-trainee.com/wp-content/uploads/2017/10/Variant-analysis-workflow-specifications.png" alt="variant-analysis-workflow-specifications" width="570" height="519" /></a></p>
<p style="margin: 0px 0px 1.2em !important;">Both figures come from the 2017 BMC Bioinformatics paper <a href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1454-2">MC-GenomeKey: a multicloud system for the detection and annotation of genomic variants</a></p>
</div>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/2790.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Self-study of reference-free RNA-seq analysis, lesson 1: reading the reference papers</title>
		<link>http://www.bio-info-trainee.com/1889.html</link>
		<comments>http://www.bio-info-trainee.com/1889.html#comments</comments>
		<pubDate>Wed, 21 Sep 2016 09:49:23 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>
		<category><![CDATA[de novo]]></category>
		<category><![CDATA[reference-free transcriptome]]></category>
		<category><![CDATA[生信技能树]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1889</guid>
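		<description><![CDATA[This is a post I wrote for the newly launched 生信技能树 forum; it also suits this blog, so I am reposting it here: htt &#8230; <a href="http://www.bio-info-trainee.com/1889.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>This is a post I wrote for the newly launched 生信技能树 forum; it also suits this blog, so I am reposting it here: <a href="http://www.biotrainee.com/thread-243-1-1.html" target="_blank">http://www.biotrainee.com/thread-243-1-1.html </a></p>
<p>Until now I had only done reference-based transcriptome analysis, which just requires finding the reference genome and annotation file and <span style="color: #ff0000;">then following the QC--&gt;alignment--&gt;counts--&gt;DEG--&gt;annotation workflow.</span><br />
Now I am starting on something new, reference-free transcriptome analysis, and I am recording my study notes here. As always, the first step is collecting material. This time I focus on 5 papers that each describe a complete de novo transcriptome analysis workflow:<br />
<span style="color: #000000;"><span style="font-family: Arial;"><a class="gj_safe_a" href="http://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-15-554" target="_blank">http://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-15-554</a>  2014, a standard de novo transcriptome analysis of petal senescence in gardenia (data below): assembly with Trinity, annotation against the NCBI non-redundant (Nr) database, differential expression analysis (the flowering period was divided into 4 stages), GO/KEGG annotation, and finally experimental validation by RT-qPCR.</span></span><br />
<span style="color: #000000;"><span style="font-family: Arial;">It additionally annotated against the Clusters of Orthologous Groups (COG) database.</span></span></p>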
<table class="t_table" cellspacing="0">
<tbody>
<tr>
<td colspan="6"></td>
</tr>
<tr>
<td></td>
<td>
<div align="center">Raw Reads</div>
</td>
<td>
<div align="center">Clean Reads</div>
</td>
<td>
<div align="center">Contigs</div>
</td>
<td>
<div align="center">Unigenes</div>
</td>
<td>
<div align="center">Annotated</div>
</td>
</tr>
<tr>
<td>
<div align="center">Transcriptome</div>
</td>
<td>
<div align="center">55,092,396</div>
</td>
<td>
<div align="center">50,335,672</div>
</td>
<td>
<div align="center">102,263</div>
</td>
<td>
<div align="center">57,503</div>
</td>
<td>
<div align="center">39,459</div>
</td>
</tr>
</tbody>
</table>
<p>&nbsp;</p>
<div align="center"></div>
<p><span style="color: #000000;"><span style="font-family: Arial;"><a class="gj_safe_a" href="http://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-15-236" target="_blank">http://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-15-236</a>  2014, a study of the Brazilian rubber tree: an RNA library pooled from multiple tissue samples, polyT library preparation, 454 sequencing, assembly with est2Assembly and gsassembler, and annotation against NCBI RefSeq and the Plant Protein Database. Because there were no sample groups, no differential analysis was needed; the work only had to find SNV and SSR markers, and it also ended with GO/KEGG annotation.</span></span></p>
<p><span style="color: #000000;"><span style="font-family: Arial;"><a class="gj_safe_a" href="https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2633-2" target="_blank">https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2633-2</a> 2015, radish: transcriptome sequencing on Illumina, assembly with Trinity, unigene expression quantified as RPKM, and the Trinity output annotated with BLASTx against the NR, NT, Swiss-Prot, GO, COG and KEGG databases, with Blast2GO used for the GO annotation. It also ended with RT-qPCR validation; some genes are expressed significantly higher in leaf than in other tissues. Raw data are available: <a class="gj_safe_a" href="http://www.ncbi.nlm.nih.gov/sra/?term=SRX1671013" target="_blank">http://www.ncbi.nlm.nih.gov/sra/?term=SRX1671013</a> </span></span><br />
<span style="color: #000000;"><span style="font-family: Arial;">Transcriptome analysis results: A total of 54.64 million clean reads and 111,167 contigs representing 53,642 unigenes were obtained from the radish leaf transcriptome.</span></span></p>
<p><span style="color: #000000;"><span style="font-family: Arial;"><a class="gj_safe_a" href="http://www.nature.com/articles/srep08259" target="_blank">http://www.nature.com/articles/srep08259</a> 2015, celery: a study of lignin during leaf development. Sequencing yield: A total of 32,477,416 quality reads were recorded for the leaves at Stage 1, 53,675,555 at Stage 2, and 27,158,566 at Stage 3, respectively. Assembly was again done with Trinity, with the kmer value set to 25. Assembly result: 33,213 unigenes with an average length of 1,478 bp, a maximum length of 17,075 bp, and an N50 of 2,060 bp. Annotation used the eggNOG/GO/KEGG databases. The main text of the paper gives detailed links to the software and databases used.</span></span><br />
<span style="color: #000000;"><span style="font-family: Arial;">Finally, real-time PCR assays were used to look at gene expression differences among the roots, stems, petioles, and leaf blade tissues.</span></span></p>
<p><span style="color: #000000;"><span style="font-family: Arial;"><a class="gj_safe_a" href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0128659" target="_blank">http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0128659</a> A transcriptome study of the ovary and testis of the swimming crab Portunus trituberculatus, again following the standard de novo transcriptome analysis workflow and well worth using as a model.</span></span><br />
<span style="color: #000000;"><span style="font-family: Arial;">The raw data were uploaded to NCBI: SRR1920180 and SRR1920180</span></span></p>
<p><span style="color: #000000;"><span style="font-family: Arial;">Once you have summarized the data analysis workflows of these 5 papers, you will have a fairly good idea of how to do reference-free de novo transcriptome analysis.</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1889.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A complete bioinformatics workflow for RNA-seq data, lesson 1: downloading data from the literature</title>
		<link>http://www.bio-info-trainee.com/1876.html</link>
		<comments>http://www.bio-info-trainee.com/1876.html#comments</comments>
		<pubDate>Tue, 09 Aug 2016 12:34:14 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[tutorial]]></category>
		<category><![CDATA[transcriptome software]]></category>
		<category><![CDATA[--split-3]]></category>
		<category><![CDATA[airway]]></category>
		<category><![CDATA[fastq-dump]]></category>
		<category><![CDATA[SRA]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1876</guid>
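		<description><![CDATA[Here I use the airway dataset, the one most commonly used in bioconductor, because differential expression ana &#8230; <a href="http://www.bio-info-trainee.com/1876.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Here I use the airway dataset, the one most commonly used in bioconductor. Differential expression analysis is a major focus of bioconductor, and its packages use this dataset when they introduce their algorithms and give demonstrations. The description can be found in the GEO database: <a href="http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52778">http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52778</a>  It shows Illumina HiSeq 2000 (Homo sapiens), 75bp paired-end. This information matters: it determines how to convert the sra files after download and how to align them. The authors also uploaded all of the raw sequencing data to SRA: <a href="http://www.ncbi.nlm.nih.gov/sra?term=SRP033351 ">http://www.ncbi.nlm.nih.gov/sra?term=SRP033351 </a> . On a linux server, a simple script can download all of the sequencing data in bulk, after which the raw files can be renamed according to the metadata described in GEO.</p>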
<blockquote><p>for ((i=508;i&lt;=523;i++)) ;do wget ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP033/SRP033351/<span style="color: #ff0000;"><strong>SRR1039$i/SRR1039$i.sra;done</strong></span><br />
ls *sra |while read id; do ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 $id;done</p></blockquote>
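<p>You need to look at the data records in SRA yourself; with those in hand, the script above is not hard to write. Because this is Illumina paired-end sequencing, we use fastq-dump --split-3 to convert the sra files to fastq. Since there are 16 runs, it is best to rename them in the same step, so I generated the renaming commands in bulk with a script.</p>
<p>To save space I compress the output with --gzip, and I change the output file name with the -A parameter. The generated commands are:</p>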
<blockquote><p>nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/<strong><span style="color: #ff0000;">fastq-dump --split-3 --gzip -A N61311_untreated</span></strong> SRR1039508.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N61311_Dex SRR1039509.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N61311_Alb SRR1039510.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N61311_Alb_Dex SRR1039511.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N052611_untreated SRR1039512.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N052611_Dex SRR1039513.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N052611_Alb SRR1039514.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N052611_Alb_Dex SRR1039515.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N080611_untreated SRR1039516.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N080611_Dex SRR1039517.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N080611_Alb SRR1039518.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N080611_Alb_Dex SRR1039519.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N061011_untreated SRR1039520.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N061011_Dex SRR1039521.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N061011_Alb SRR1039522.sra &amp;<br />
nohup ~/biosoft/sratoolkit/sratoolkit.2.6.3-centos_linux64/bin/fastq-dump --split-3 --gzip -A N061011_Alb_Dex SRR1039523.sra &amp;</p></blockquote>
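<p>The 16 commands above follow a regular pattern, so they can be produced by a short loop. The following is only a sketch of one way to generate them, not the script actually used: the function name print_dump_commands is hypothetical, it assumes fastq-dump is on the PATH (the commands above use the full path plus nohup), and it relies on the run order visible in the listing, where each donor has four consecutive runs in the order untreated, Dex, Alb, Alb_Dex.</p>
<pre><code># Hypothetical generator for the renaming commands.
# Walks the donors and treatments in the order of the listing above and
# pairs each combination with the next run accession, SRR1039508 to SRR1039523.
print_dump_commands() {
  i=508
  for donor in N61311 N052611 N080611 N061011; do
    for treatment in untreated Dex Alb Alb_Dex; do
      echo "fastq-dump --split-3 --gzip -A ${donor}_${treatment} SRR1039${i}.sra"
      i=$((i + 1))
    done
  done
}

# usage: print_dump_commands
</code></pre>
<p>The printed lines should be checked against the GEO metadata before they are run, since the mapping from run accession to sample name is the part that is easy to get wrong.</p>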
<p>The 16 samples come from the same 4 donors and are all HASM cells. The treatment details are as follows:</p>
<div>Sequencing background:</div>
<div>HASM cell line: human airway smooth muscle.</div>
<div>The Illumina TruSeq assay was used to prepare 75bp paired-end libraries for HASM cells from <b><span style="color: #ff0000;">four white male donors</span></b> under four treatment conditions:</div>
<blockquote>
<div>1) no treatment;</div>
<div>2) treatment with a β2-agonist (i.e. Albuterol, 1μM for 18h);</div>
<div>3) treatment with a glucocorticosteroid (i.e. Dexamethasone (Dex), 1μM for 18h);</div>
<div>4) simultaneous treatment with a β2-agonist and glucocorticoid</div>
</blockquote>
<div>and the libraries were sequenced with an Illumina Hi-Seq 2000 instrument.</div>
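<div>For now we only align the fastq data to the reference genome and compute the expression level of each sample; the later grouping and differential expression analysis has to be tailored to the question being asked.</div>
<p>The downloaded sra files have the following sizes:</p>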
<blockquote><p>-rw-rw-r-- 1 jmzeng jmzeng 1.6G Aug 9 04:21 SRR1039508.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.5G Aug 9 05:20 SRR1039509.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.6G Aug 9 06:14 SRR1039510.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.5G Aug 9 07:05 SRR1039511.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.1G Aug 9 08:07 SRR1039512.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.3G Aug 9 09:17 SRR1039513.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 3.1G Aug 9 10:56 SRR1039514.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.9G Aug 9 11:56 SRR1039515.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.1G Aug 9 13:02 SRR1039516.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.6G Aug 9 14:16 SRR1039517.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.3G Aug 9 15:17 SRR1039518.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.0G Aug 9 16:05 SRR1039519.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.1G Aug 9 16:56 SRR1039520.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.4G Aug 9 17:57 SRR1039521.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.0G Aug 9 18:46 SRR1039522.sra<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.4G Aug 9 19:28 SRR1039523.sra</p></blockquote>
<p>After conversion, the paired-end fastq files are as follows:</p>
<blockquote><p> -rw-rw-r-- 1 jmzeng jmzeng 2.5G Aug 9 20:12 N052611_Alb_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.5G Aug 9 20:12 N052611_Alb_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.3G Aug 9 20:44 N052611_Alb_Dex_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.3G Aug 9 20:44 N052611_Alb_Dex_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 289M Aug 9 20:44 N052611_Alb_Dex.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 951M Aug 9 20:59 N052611_Dex_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 954M Aug 9 20:59 N052611_Dex_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.7G Aug 9 20:53 N052611_untreated_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.7G Aug 9 20:53 N052611_untreated_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.5G Aug 9 20:45 N061011_Alb_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.5G Aug 9 20:45 N061011_Alb_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.9G Aug 9 20:59 N061011_Alb_Dex_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.9G Aug 9 20:59 N061011_Alb_Dex_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 16M Aug 9 20:45 N061011_Alb.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.4G Aug 9 20:48 N061011_Dex_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.4G Aug 9 20:48 N061011_Dex_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.2G Aug 9 20:00 N061011_untreated_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.2G Aug 9 20:00 N061011_untreated_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 759M Aug 9 20:00 N061011_untreated.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.9G Aug 9 20:03 N080611_Alb_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.9G Aug 9 20:03 N080611_Alb_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.3G Aug 9 19:59 N080611_Alb_Dex_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.3G Aug 9 19:59 N080611_Alb_Dex_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 535M Aug 9 19:59 N080611_Alb_Dex.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.1G Aug 9 20:06 N080611_Dex_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 2.1G Aug 9 20:06 N080611_Dex_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.6G Aug 9 20:01 N080611_untreated_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.6G Aug 9 20:01 N080611_untreated_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.3G Aug 9 08:09 N61311_Alb_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.3G Aug 9 08:09 N61311_Alb_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.3G Aug 9 08:08 N61311_Alb_Dex_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.3G Aug 9 08:08 N61311_Alb_Dex_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.2G Aug 9 08:07 N61311_Dex_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.2G Aug 9 08:07 N61311_Dex_2.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.3G Aug 9 08:09 N61311_untreated_1.fastq.gz<br />
-rw-rw-r-- 1 jmzeng jmzeng 1.3G Aug 9 08:09 N61311_untreated_2.fastq.gz</p></blockquote>
<p>All of the subsequent analyses will be based on this data.</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1876.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
