<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>生信菜鸟团 &#187; Sex</title>
	<atom:link href="http://www.bio-info-trainee.com/tag/%e6%80%a7%e5%88%ab/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bio-info-trainee.com</link>
	<description>You are welcome to join the discussion on the forum at biotrainee.com, or to follow the WeChat official account of the same name, biotrainee.</description>
	<lastBuildDate>Sat, 28 Jun 2025 14:30:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.33</generator>
	<item>
		<title>Determining sex from the coverage and sequencing depth of chrY-specific regions</title>
		<link>http://www.bio-info-trainee.com/1248.html</link>
		<comments>http://www.bio-info-trainee.com/1248.html#comments</comments>
		<pubDate>Tue, 22 Dec 2015 13:36:54 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Sex]]></category>
		<category><![CDATA[Coverage]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1248</guid>
		<description><![CDATA[This method also starts from a BAM file: it measures the coverage and sequencing depth of the chrY-specific regions. First, download the ch &#8230; <a href="http://www.bio-info-trainee.com/1248.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<div>
<div><span style="font-family: KaiTi_GB2312;">This method also starts from a BAM file: it measures the coverage and sequencing depth of the chrY-specific regions.</span></div>
<div><span style="font-family: KaiTi_GB2312;">First, download the file that lists the chrY-specific regions: <a href="https://www.familytreedna.com/documents/bigy_targets.txt">https://www.familytreedna.com/documents/bigy_targets.txt</a></span></div>
</div>
<div><span style="font-family: KaiTi_GB2312;">Then use samtools depth to collect the per-base sequencing depth: samtools  depth $i |grep 'chr[XY]'</span></div>
<div>The resulting depth file looks like this:</div>
<div>mzeng@ubuntu:/home/jmzeng/gender_determination$ head Sample3.depth<br />
chrX    60085    1<br />
chrX    60086    1<br />
chrX    60087    1<br />
chrX    60088    1<br />
chrX    60089    1<br />
chrX    60090    1<br />
chrX    60091    1<br />
chrX    60092    1<br />
chrX    60093    1<br />
chrX    60094    1</div>
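<div><span style="font-family: KaiTi_GB2312;">I then wrote a quick script that post-processes the depth file to compute the coverage and the mean sequencing depth over the target regions.</span></div>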
<div>
<div>
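<p>[perl]<br />
use strict;<br />
use warnings;<br />
# Usage: perl script.pl Sample3.depth (the output of samtools depth)<br />
my %h;<br />
my ($all, $pos, $depth) = (0, 0, 0);<br />
# Pass 1: mark every base inside a target region (columns 2 and 3 are start and end).<br />
open my $targets, '&lt;', 'bigy_targets.txt' or die &quot;Cannot open bigy_targets.txt: $!&quot;;<br />
while (&lt;$targets&gt;) {<br />
 chomp;<br />
 my @F = split;<br />
 $all += $F[2] - $F[1] + 1;<br />
 $h{$_} = 1 foreach $F[1] .. $F[2];<br />
}<br />
close $targets;<br />
# Pass 2: count the covered target bases on chrY and sum their depth.<br />
open my $fh, '&lt;', $ARGV[0] or die &quot;Cannot open depth file: $!&quot;;<br />
while (&lt;$fh&gt;) {<br />
 chomp;<br />
 my @F = split;<br />
 next unless $F[0] eq 'chrY';<br />
 if (exists $h{$F[1]}) {<br />
  $pos++;<br />
  $depth += $F[2];<br />
 }<br />
}<br />
close $fh;<br />
# Avoid division by zero when no target base is covered (e.g. a female sample).<br />
my $average = $pos ? $depth / $pos : 0;<br />
my $coverage = $all ? $pos / $all : 0;<br />
print &quot;$pos\t$average\t$coverage\n&quot;;</p>
<p>[/perl]</p>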
<p>&nbsp;</p>
</div>
<div><span style="font-family: KaiTi_GB2312;"> </span></div>
<div>Running it on the three samples gives:</div>
<div><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/clipboard1.png"><img class="alignnone size-full wp-image-1249" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/clipboard1.png" alt="clipboard" width="890" height="176" /></a></div>
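<div>Only Sample4 shows extremely low coverage, with very few covered positions recorded, so this sample is female.</div>
<div>The sequencing depth is not informative here; coverage is what separates the samples.</div>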
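<div>The coverage calculation can also be sketched in Python without storing one hash entry per target base. This is not the author's script, only a minimal sketch under stated assumptions: the function names are hypothetical, target intervals are taken as 1-based inclusive (as the Perl script treats them), and overlapping targets are merged so the denominator is not double-counted.</div>

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and merged[-1][1] + 1 >= start:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def chry_target_stats(targets, depth_records, chrom="chrY"):
    """Return (covered positions, mean depth, coverage fraction) over targets.

    targets: (start, end) pairs, assumed 1-based inclusive.
    depth_records: (chrom, pos, depth) tuples, one per samtools depth line.
    """
    merged = merge_intervals(targets)
    starts = [start for start, _ in merged]
    total_bases = sum(end - start + 1 for start, end in merged)
    covered = 0
    depth_sum = 0
    for rec_chrom, pos, depth in depth_records:
        if rec_chrom != chrom or depth == 0:
            continue  # zero-depth rows only appear with samtools depth -a
        i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
        if i >= 0 and merged[i][1] >= pos:
            covered += 1
            depth_sum += depth
    mean_depth = depth_sum / covered if covered else 0.0
    coverage = covered / total_bases if total_bases else 0.0
    return covered, mean_depth, coverage
```

<div>Looking up each position with a binary search over the merged intervals keeps memory proportional to the number of target regions rather than to the number of target bases.</div>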
<div></div>
<div></div>
<div><span style="font-family: KaiTi_GB2312;"><span style="font-family: KaiTi_GB2312;">Reference for the chrY-specific regions: <a href="https://www.familytreedna.com/learn/wp-content/uploads/2014/08/BIG_Y_WhitePager.pdf">https://www.familytreedna.com/learn/wp-content/uploads/2014/08/BIG_Y_WhitePager.pdf</a></span></span></p>
<div>The regions are listed in: <a href="https://www.familytreedna.com/documents/bigy_targets.txt">https://www.familytreedna.com/documents/bigy_targets.txt</a></div>
<div>Additional reference: <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2748900/#bib8">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2748900/#bib8</a></div>
<div></div>
<div>An interesting blog post: <a href="http://davetang.org/muse/2013/09/07/creating-a-coverage-plot-in-r/">http://davetang.org/muse/2013/09/07/creating-a-coverage-plot-in-r/</a></div>
<p>Alternatively, try the <a href="http://www.broadinstitute.org/gsa/wiki/index.php/Main_Page">GATK</a> "DepthOfCoverage" tool: <a href="http://www.broadinstitute.org/gsa/wiki/index.php/Depth_of_Coverage_v3.0">http://www.broadinstitute.org/gsa/wiki/index.php/Depth_of_Coverage_v3.0</a></p>
<p>Or run '<a href="http://samtools.sourceforge.net/samtools.shtml">samtools pileup</a>' and calculate the mean of the <a href="http://samtools.sourceforge.net/pileup.shtml">coverage column</a>.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1248.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Determining sex from the number of reads mapped to chromosomes X and Y</title>
		<link>http://www.bio-info-trainee.com/1244.html</link>
		<comments>http://www.bio-info-trainee.com/1244.html#comments</comments>
		<pubDate>Tue, 22 Dec 2015 13:32:13 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Bioinformatics Basics]]></category>
		<category><![CDATA[reads]]></category>
		<category><![CDATA[Sex]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1244</guid>
		<description><![CDATA[This applies to high-throughput sequencing data, including WGS and WES. Here I mainly cover using the chrX in a BAM file &#8230; <a href="http://www.bio-info-trainee.com/1244.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<div>
<div>
<div>
<div>
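<div>This applies to high-throughput sequencing data, including WGS and WES.</div>
<div>Here I mainly cover determining sex from the ratio of chrX reads to chrY reads in a BAM file; you can process your own data to produce the BAM file.</div>
<div>The idea is to read the BAM file and count the records mapped to chrX and to chrY:</div>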
</div>
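<div><span style="font-family: KaiTi_GB2312;">You can also look at the sequencing depth instead: samtools  depth $i |grep 'chr[XY]'</span></div>
<div><span style="font-family: KaiTi_GB2312;"><b>If the ratio of chrX reads to chrY reads exceeds some threshold, say 50-fold, the sample is called female.</b></span></div>
<div>The script is simple: it just tallies the third column of the BAM records, which holds the chromosome (reference sequence) name.</div>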
<div> samtools  view $i |  perl -alne '{$h{$F[2]}++}END{print "$_\t$h{$_}" foreach sort keys %h}' &gt;$out.chromosome.stat</div>
<div></div>
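<div>The same per-chromosome tally can be sketched in Python. This is only an illustration of the one-liner above, not the author's code; the function name is hypothetical, and the input is assumed to be the tab-separated text that samtools view prints.</div>

```python
from collections import Counter


def count_reads_per_chromosome(sam_lines):
    """Tally column 3 (RNAME) of SAM records, skipping header lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header lines appear when samtools view is run with -h
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts
```

<div>To use it on a real file, pipe samtools view into a script that passes sys.stdin to count_reads_per_chromosome. Unmapped reads are counted under the name *, exactly as in the Perl one-liner.</div>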
</div>
</div>
</div>
<div><span style="font-family: Arial;">For Sample3:<br />
</span></div>
<div><span style="font-family: Arial;">chrX 1233061</span></div>
<div><span style="font-family: Arial;">chrY 140506</span></div>
<div><span style="font-family: Arial;"> </span></div>
<div><span style="font-family: Arial;">For Sample4 the chrX/chrY ratio is about 52.7, so this sample is clearly female:<br />
</span></div>
<div><span style="font-family: Arial;">chrX 2734815</span></div>
<div><span style="font-family: Arial;">chrY 51860</span></div>
<div><span style="font-family: Arial;"> </span></div>
<div><span style="font-family: Arial;">For Sample5:</span></div>
<div><span style="font-family: Arial;">chrX 1329302</span></div>
<div><span style="font-family: Arial;">chrY 156663</span></div>
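<div>As a worked example, the 50-fold rule can be applied to the read counts listed above. This is a minimal Python sketch; the function name is hypothetical, and the threshold is only the example value from this post, not a validated cutoff.</div>

```python
def call_sex_from_read_counts(chrx_reads, chry_reads, female_ratio=50.0):
    """Return (chrX/chrY read ratio, call) using the example 50-fold threshold."""
    if chry_reads == 0:
        return float("inf"), "female"  # no chrY reads at all
    ratio = chrx_reads / chry_reads
    return ratio, ("female" if ratio > female_ratio else "male")


# Read counts quoted in the post for the three samples.
samples = {
    "Sample3": (1233061, 140506),
    "Sample4": (2734815, 51860),
    "Sample5": (1329302, 156663),
}
for name, (chrx, chry) in samples.items():
    ratio, call = call_sex_from_read_counts(chrx, chry)
    print(f"{name}\t{ratio:.1f}\t{call}")
```

<div>With these counts the ratios are roughly 8.8, 52.7 and 8.5, so only Sample4 crosses the threshold. A suitable cutoff may differ between WGS and WES data, so treat 50 as a starting point.</div>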
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1244.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Determining sex from the homozygosity rate of SNPs on chromosome X</title>
		<link>http://www.bio-info-trainee.com/1243.html</link>
		<comments>http://www.bio-info-trainee.com/1243.html#comments</comments>
		<pubDate>Tue, 22 Dec 2015 13:28:53 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Bioinformatics Basics]]></category>
		<category><![CDATA[Sex]]></category>
		<category><![CDATA[Homozygosity rate]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1243</guid>
		<description><![CDATA[This applies to high-throughput sequencing data, including WGS and WES; even SNP6 array data works. Here I mainly cover using &#8230; <a href="http://www.bio-info-trainee.com/1243.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
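				<content:encoded><![CDATA[<div><span style="font-family: KaiTi_GB2312;">This applies to high-throughput sequencing data, including WGS and WES; even SNP6 array data works.</span></div>
<div><span style="font-family: KaiTi_GB2312;">Here I mainly cover determining sex from the homozygosity rate of chrX variants in a VCF file; you can process your own data to produce the VCF file.</span></div>
<div><span style="font-family: KaiTi_GB2312;">The idea is to read the VCF file, select the records on chrX, and tally the genotypes:</span></div>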
<div><a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/clipboard.png"><img class="alignnone size-full wp-image-1245" src="http://www.bio-info-trainee.com/wp-content/uploads/2015/12/clipboard.png" alt="clipboard" width="726" height="266" /></a></div>
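<div>As an example, I use data from an earlier autism project:</div>
<div><strong>According to the data provider, samples 3, 4 and 5 are the child, the father and the mother, respectively.</strong> Computing the ratio of homozygous to heterozygous SNPs on chrX takes very little code.</div>
<div>In a VCF file, column 1 is typically the chromosome, column 6 is the quality score (QUAL), and column 10 holds the genotype together with depth-related information for the sample.</div>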
<p><span style="font-family: Arial;">cat Sample5.gatk.UG.vcf |perl -alne '{next unless $F[0] eq "chrX";next unless $F[5]&gt;30;$h{(split/:/,$F[9])[0]}++}END{print "$_\t$h{$_}" foreach keys %h}' </span></p>
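<div><span style="font-family: Arial;"><b>If homozygous SNPs outnumber heterozygous SNPs by more than some threshold, say 4-fold, the sample can be called male.</b><br />
</span></p>
<div><span style="font-family: Arial;">Sample3 is clearly a boy, because most of the chrX variants are homozygous:<br />
</span></p>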
<div><span style="font-family: Arial;">0/1 391<br />
1/1 2463<br />
2/2 1<br />
1/2 32<br />
0/2 1<br />
Sample4 is clearly the mother, <strong>which shows that the sample information I was given was wrong</strong>:<br />
1/1 3559<br />
1/2 27<br />
0/1 1835<br />
0/2 5<br />
So Sample5 must be the father:<br />
1/1 2626<br />
0/1 356<br />
1/2 22</span></div>
</div>
</div>
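<div>The 4-fold rule can be checked against the genotype counts listed above. This is a minimal Python sketch rather than the author's method; the function names are hypothetical, the threshold is only the example value from this post, and missing or haploid genotype calls are ignored as a simplifying assumption.</div>

```python
def classify_genotype(genotype):
    """Return 'hom', 'het', or None for a GT string such as 1/1, 0|1 or ./."""
    alleles = genotype.replace("|", "/").split("/")
    if len(alleles) != 2 or "." in alleles:
        return None  # haploid or missing calls are ignored in this sketch
    return "hom" if alleles[0] == alleles[1] else "het"


def call_sex_from_genotypes(genotype_counts, male_ratio=4.0):
    """Return (hom/het ratio, call) using the example 4-fold threshold."""
    hom = sum(n for gt, n in genotype_counts.items() if classify_genotype(gt) == "hom")
    het = sum(n for gt, n in genotype_counts.items() if classify_genotype(gt) == "het")
    if hom + het == 0:
        raise ValueError("no usable chrX genotypes")
    if het == 0:
        return float("inf"), "male"
    ratio = hom / het
    return ratio, ("male" if ratio > male_ratio else "female")


# chrX genotype counts quoted in the post (QUAL above 30).
samples = {
    "Sample3": {"0/1": 391, "1/1": 2463, "2/2": 1, "1/2": 32, "0/2": 1},
    "Sample4": {"1/1": 3559, "1/2": 27, "0/1": 1835, "0/2": 5},
    "Sample5": {"1/1": 2626, "0/1": 356, "1/2": 22},
}
for name, counts in samples.items():
    ratio, call = call_sex_from_genotypes(counts)
    print(f"{name}\t{ratio:.1f}\t{call}")
```

<div>With these counts the homozygous-to-heterozygous ratios are roughly 5.8, 1.9 and 6.9, so Sample3 and Sample5 are called male and Sample4 female, matching the conclusions above.</div>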
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1243.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
