<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>生信菜鸟团 &#187; makeVennDiagram</title>
	<atom:link href="http://www.bio-info-trainee.com/tag/makevenndiagram/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bio-info-trainee.com</link>
	<description>Join the discussion on the biotrainee.com forum, or follow the WeChat official account of the same name, biotrainee.</description>
	<lastBuildDate>Sat, 28 Jun 2025 14:30:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.33</generator>
	<item>
		<title>Should you keep only uniquely aligned reads when calling peaks on ChIP-seq data?</title>
		<link>http://www.bio-info-trainee.com/1869.html</link>
		<comments>http://www.bio-info-trainee.com/1869.html#comments</comments>
		<pubDate>Sun, 07 Aug 2016 13:13:18 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[CHIP-seq]]></category>
		<category><![CDATA[ChIPpeakAnno]]></category>
		<category><![CDATA[makeVennDiagram]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1869</guid>
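		<description><![CDATA[I am entirely self-taught in ChIP-seq data processing, so there are many details I have had to pick up gradually. This note records &#8230; <a href="http://www.bio-info-trainee.com/1869.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>I am entirely self-taught in ChIP-seq data processing, so there are many details I have had to pick up gradually. This post records one of them: once the sequencer's fastq reads have been aligned to the reference genome, how should the alignment file be filtered before it is handed to a peak caller? <strong><span style="color: #ff0000;">Some blog posts I have read keep only reads with a mapping quality above 30, while others keep only uniquely aligned reads</span></strong>. Here I compare the two approaches on a public dataset, described below:<span id="more-1869"></span></p>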
<div>
<div><a href="http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE74311">http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE74311</a></div>
</div>
<div>Reference pipeline: <a href="https://github.com/jmzeng1314/NGS-pipeline/tree/master/CHIPseq">https://github.com/jmzeng1314/NGS-pipeline/tree/master/CHIPseq</a></div>
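<div>To make the two filtering strategies compared in this post concrete, here is a minimal Python sketch that applies each one to SAM-format alignment lines. It is not taken from the pipeline linked above, and the post does not say which aligner or filter commands were used. The sketch assumes Bowtie2-style output, where a read with more than one valid alignment carries an <code>XS:i:</code> tag, and treats the absence of that tag as "unique". The helper names <code>keep_high_quality</code>, <code>keep_unique</code> and <code>toy_read</code>, and the example reads, are made up for illustration.</div>

```python
def parse_sam_line(line):
    """Return (flag, mapq, optional_tags) for one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: bitwise FLAG
    mapq = int(fields[4])   # column 5: mapping quality
    tags = fields[11:]      # columns 12+: optional TAG:TYPE:VALUE fields
    return flag, mapq, tags


def is_mapped(flag):
    # Bit 4 of the SAM FLAG marks an unmapped read.
    return (flag // 4) % 2 == 0


def keep_high_quality(line, min_mapq=30):
    """Strategy 1: keep mapped reads whose MAPQ is above the threshold."""
    flag, mapq, _ = parse_sam_line(line)
    return is_mapped(flag) and mapq > min_mapq


def keep_unique(line):
    """Strategy 2 (assumption): a read is unique when it has no XS:i: tag."""
    flag, _, tags = parse_sam_line(line)
    has_second_alignment = any(tag.startswith("XS:i:") for tag in tags)
    return is_mapped(flag) and not has_second_alignment


def toy_read(name, flag, mapq, tags):
    """Build a fake SAM line; only FLAG, MAPQ and the tags matter here."""
    core = [name, str(flag), "chr1", "100", str(mapq), "36M",
            "*", "0", "0", "A" * 36, "I" * 36]
    return "\t".join(core + tags)


SAM_LINES = [
    toy_read("r1", 0, 42, ["AS:i:0"]),                # confident and unique
    toy_read("r2", 0, 1, ["AS:i:0", "XS:i:0"]),       # multi-mapping
    toy_read("r3", 16, 35, ["AS:i:-5", "XS:i:-20"]),  # good MAPQ, has XS tag
    toy_read("r4", 4, 0, []),                         # unmapped
    toy_read("r5", 0, 23, ["AS:i:-30"]),              # no XS tag, low MAPQ
]

hq_names = [line.split("\t")[0] for line in SAM_LINES if keep_high_quality(line)]
unique_names = [line.split("\t")[0] for line in SAM_LINES if keep_unique(line)]
print("MAPQ filter keeps:  ", hq_names)
print("unique filter keeps:", unique_names)
```

<div>In the toy data, read r3 passes the quality filter but not the uniqueness filter, and r5 does the opposite. Reads like these are where the two strategies disagree.</div>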
<div>
<div>
<table border="1" cellspacing="0" cellpadding="2">
<tbody>
<tr>
<td valign="top"><a href="http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1916974">GSM1916974</a></td>
<td valign="top">H3K27ac ChIP-seq</td>
<td valign="top">
<div>
<h2>SRR2774675</h2>
</div>
</td>
</tr>
<tr>
<td valign="top"><a href="http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1916975">GSM1916975</a></td>
<td valign="top">input DNA</td>
<td valign="top">
<div>
<h2>SRR2774676</h2>
</div>
</td>
</tr>
</tbody>
</table>
</div>
<div>First, download SRR2774675.sra and SRR2774676.sra from the SRA database:</div>
<div>
<div><a href="http://www.ncbi.nlm.nih.gov/sra?term=SRP065184">http://www.ncbi.nlm.nih.gov/sra?term=SRP065184</a></div>
<div>With my GitHub pipeline the alignment is done quickly. I ran the peak caller on the alignments filtered by each of the two methods and obtained two peak files.</div>
<div>39709 highQuaily_peaks.bed<br />
39709 highQuaily_summits.bed</div>
<div>The number of peaks from the two runs does not differ noticeably. Let's take a quick look at the first few lines:</div>
<div><a href="http://www.bio-info-trainee.com/wp-content/uploads/2016/08/12.png"><img class="alignnone size-full wp-image-1870" src="http://www.bio-info-trainee.com/wp-content/uploads/2016/08/12.png" alt="1" width="767" height="181" /></a></div>
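<div>bedtools could be used to check how the two files (shown side by side above) intersect, but here I use the ready-made functions in the R package ChIPpeakAnno to get the result directly.</div>
<div>For details on ChIPpeakAnno, see its manual. Here is the code I used:</div>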
<blockquote>
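<div>library(ChIPpeakAnno)<br />
library(ChIPseeker) # assumption: readPeakFile() is provided by ChIPseeker, not ChIPpeakAnno<br />
# read the two peak sets as GRanges objects<br />
highPeak &lt;- readPeakFile( 'highQuaily_peaks.bed' )<br />
uniquePeak &lt;- readPeakFile( 'unique_peaks.bed' )<br />
# find overlapping peaks and draw the Venn diagram to a PNG file<br />
ol &lt;- findOverlapsOfPeaks(highPeak, uniquePeak)<br />
png('overlapVenn.png')<br />
makeVennDiagram(ol)<br />
dev.off()</div>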
</blockquote>
<div>Then open the resulting Venn diagram:</div>
<div><a href="http://www.bio-info-trainee.com/wp-content/uploads/2016/08/overlapVenn.png"><img class="alignnone size-full wp-image-1871" src="http://www.bio-info-trainee.com/wp-content/uploads/2016/08/overlapVenn.png" alt="overlapVenn" width="480" height="480" /></a></div>
<div>The two approaches give almost identical results. If you are interested, you can look into what causes the peaks that are unique to each set.</div>
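<div>For readers who want to inspect those set-specific peaks without R, the following is a rough Python sketch that lists the peaks in one BED file with no overlapping peak in the other. It is not how <code>findOverlapsOfPeaks</code> works internally (that function also merges overlapping peaks), so counts may differ slightly from the Venn diagram. The function names and the toy intervals are hypothetical; intervals are treated as half-open, as in the BED format.</div>

```python
from bisect import bisect_left
from collections import defaultdict


def read_bed(lines):
    """Parse BED lines into (chrom, start, end); extra columns are ignored."""
    peaks = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        peaks.append((chrom, int(start), int(end)))
    return peaks


def build_index(peaks):
    """Per chromosome: sorted starts plus a running maximum of the ends."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [s for s, _ in intervals]
        running_max_end = []
        best = 0
        for _, e in intervals:
            best = max(best, e)
            running_max_end.append(best)
        index[chrom] = (starts, running_max_end)
    return index


def has_overlap(peak, index):
    """True if some indexed interval shares at least one base with `peak`."""
    chrom, start, end = peak
    if chrom not in index:
        return False
    starts, running_max_end = index[chrom]
    # Number of indexed intervals that begin strictly before this peak ends.
    n_before = bisect_left(starts, end)
    # One of them overlaps if the furthest end among them passes our start.
    return n_before > 0 and running_max_end[n_before - 1] > start


def unique_peaks(peaks, other_peaks):
    """Peaks from `peaks` that overlap nothing in `other_peaks`."""
    index = build_index(other_peaks)
    return [p for p in peaks if not has_overlap(p, index)]


# Toy data standing in for highQuaily_peaks.bed and unique_peaks.bed.
high_peaks = read_bed(["chr1\t100\t200\tp1", "chr1\t500\t600\tp2", "chr2\t50\t80\tp3"])
uniq_read_peaks = read_bed(["chr1\t150\t250\tq1", "chr1\t600\t700\tq2", "chr3\t10\t20\tq3"])

only_high = unique_peaks(high_peaks, uniq_read_peaks)
only_uniq = unique_peaks(uniq_read_peaks, high_peaks)
print("only in MAPQ-filtered set:  ", only_high)
print("only in unique-filtered set:", only_uniq)
```

<div>To run this on the real files, pass <code>open('highQuaily_peaks.bed')</code> and <code>open('unique_peaks.bed')</code> to <code>read_bed</code> in place of the toy lists. Adjacent peaks that merely touch (one ends where the next begins) are counted as non-overlapping here.</div>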
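<div>In conclusion, for the peak-calling step of ChIP-seq analysis, <strong>keeping only reads with a mapping quality above 30 or keeping only uniquely aligned reads makes little difference from a data-processing standpoint; the choice mainly depends on what your experiment is meant to show.</strong></div>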
<div></div>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1869.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
