<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>生信菜鸟团 &#187; bam</title>
	<atom:link href="http://www.bio-info-trainee.com/tag/bam/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bio-info-trainee.com</link>
	<description>Leave a comment on the forum at biotrainee.com to join the discussion, or follow the WeChat official account of the same name, biotrainee</description>
	<lastBuildDate>Sat, 28 Jun 2025 14:30:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.33</generator>
	<item>
		<title>Don't take bioinformatics software for granted: read the documentation and search often!</title>
		<link>http://www.bio-info-trainee.com/2338.html</link>
		<comments>http://www.bio-info-trainee.com/2338.html#comments</comments>
		<pubDate>Mon, 06 Feb 2017 02:35:42 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Miscellaneous essays]]></category>
		<category><![CDATA[bam]]></category>
		<category><![CDATA[samtools]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=2338</guid>
		<description><![CDATA[I have recently been writing a fun article: one figure that makes clear how WGS, WES, RNA-seq, ChIP- &#8230; <a href="http://www.bio-info-trainee.com/2338.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
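				<content:encoded><![CDATA[<p>I have recently been writing a fun article that uses a single figure to make clear the similarities and differences among WGS, WES, RNA-seq and ChIP-seq.</p>
<p><span style="color: #ff0000;"><strong>I need some test data, and I plan to use the roughly 49 kb region chr17:40437407-40486397 as the example, so I have to extract the BAM records for that region.</strong></span></p>
<p>I dug up public WGS, WES, RNA-seq and ChIP-seq data that I had processed before. The original BAM files are very large, especially the WGS one at 45 GB, so I can only pull out the chr17:40437407-40486397 region. In the past I had always used the -r option for mpileup and other commands, so I took it for granted that the following would work:<span id="more-2338"></span></p>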
<p>samtools view -h -r chr17:40437407-40486397 your.sorted.merge.bam |samtools view -bS - &gt;wes.bam</p>
<p>The result was wrong every time, which was really frustrating, so I searched Google and found https://www.biostars.org/p/48719/</p>
<p><a href="http://www.bio-info-trainee.com/wp-content/uploads/2017/02/1.png"><img class="alignnone size-full wp-image-2339" src="http://www.bio-info-trainee.com/wp-content/uploads/2017/02/1.png" alt="1" width="712" height="275" /></a></p>
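<p>Only then did I understand: in samtools view, -r does not specify coordinates. It selects a read group, and the region is passed as a positional argument after the input file:</p>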
<p>samtools view -h  control_1.sort.bam   "chr17:40437407-40486397"  |samtools view -bS - &gt;RNA-seq.bam</p>
<p>With the command changed this way, I got what I needed: a BAM file containing only the reads aligned to the specified region.</p>
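<p>The two-step pipe used here can probably be collapsed into a single call, since samtools view can write BAM directly with -b and -o. A minimal sketch, assuming a samtools 1.x binary on PATH and a coordinate-sorted input; the function name extract_region and the index check are illustrative and not part of the original workflow:</p>

```shell
#!/usr/bin/env bash
# Sketch: extract the reads overlapping one region into a new BAM in one step.
# Assumptions: samtools 1.x is on PATH and the input BAM is coordinate-sorted.
# extract_region is a hypothetical helper name used only for illustration.

extract_region() {
    local in_bam="$1" region="$2" out_bam="$3"
    # Region queries need an index next to the BAM; build one if none is found.
    if [ ! -e "${in_bam}.bai" ] && [ ! -e "${in_bam%.bam}.bai" ] && [ ! -e "${in_bam}.csi" ]; then
        samtools index "$in_bam" || return 1
    fi
    # -b writes BAM (the header is always included), -o names the output file.
    # The region is a positional argument after the input file, not a value of -r.
    samtools view -b -o "$out_bam" "$in_bam" "$region"
}

# Example call (commented out so that sourcing this sketch has no side effects):
# extract_region control_1.sort.bam "chr17:40437407-40486397" RNA-seq.bam
```

<p>Keeping the region as the last argument mirrors the usage line "samtools view [options] in.bam [region ...]", which is exactly the detail the -r mistake overlooked.</p>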
<p>samtools view -h</p>
<p>Usage: samtools view [options] &lt;in.bam&gt;|&lt;in.sam&gt;|&lt;in.cram&gt; [region ...]</p>
<p>Options:<br />
-b output BAM<br />
-C output CRAM (requires -T)<br />
-1 use fast BAM compression (implies -b)<br />
-u uncompressed BAM output (implies -b)<br />
-h include header in SAM output<br />
-H print SAM header only (no alignments)<br />
-c print only the count of matching records<br />
-o FILE output file name [stdout]<br />
-U FILE output reads not selected by filters to FILE [null]<br />
-t FILE FILE listing reference names and lengths (see long help) [null]<br />
-L FILE only include reads overlapping this BED FILE [null]<br />
<span style="color: #ff0000;"><strong>-r STR only include reads in read group STR [null]</strong></span><br />
<span style="color: #ff0000;"><strong> -R FILE only include reads with read group listed in FILE [null]</strong></span><br />
-q INT only include reads with mapping quality &gt;= INT [0]<br />
-l STR only include reads in library STR [null]<br />
-m INT only include reads with number of CIGAR operations consuming<br />
query sequence &gt;= INT [0]<br />
-f INT only include reads with all bits set in INT set in FLAG [0]<br />
-F INT only include reads with none of the bits set in INT set in FLAG [0]<br />
-x STR read tag to strip (repeatable) [null]<br />
-B collapse the backward CIGAR operation<br />
-s FLOAT integer part sets seed of random number generator [0];<br />
rest sets fraction of templates to subsample [no subsampling]<br />
-@, --threads INT<br />
number of BAM/CRAM compression threads [0]<br />
-? print long help, including note about region specification<br />
-S ignored (input format is auto-detected)<br />
--input-fmt-option OPT[=VAL]<br />
Specify a single input file format option in the form<br />
of OPTION or OPTION=VALUE<br />
-O, --output-fmt FORMAT[,OPT[=VAL]]...<br />
Specify output format (SAM, BAM, CRAM)<br />
--output-fmt-option OPT[=VAL]<br />
Specify a single output file format option in the form<br />
of OPTION or OPTION=VALUE<br />
-T, --reference FILE<br />
Reference sequence FASTA FILE [null]</p>
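<p>The -c option in the listing above offers a cheap sanity check on an extract: the number of records the region query selects from the original BAM should equal the number of records in the extracted file. A hedged sketch, assuming samtools 1.x on PATH and an indexed input BAM; check_extract is a made-up helper name:</p>

```shell
#!/usr/bin/env bash
# Sketch: confirm that a region extract holds exactly the records the region query selects.
# Assumptions: samtools 1.x is on PATH and the input BAM is indexed.
# check_extract is a hypothetical helper name used only for illustration.

check_extract() {
    local in_bam="$1" region="$2" out_bam="$3"
    local expected actual
    # -c prints only the number of matching records instead of the records themselves.
    expected=$(samtools view -c "$in_bam" "$region") || return 2
    actual=$(samtools view -c "$out_bam") || return 2
    if [ "$expected" = "$actual" ]; then
        echo "OK: $actual records"
    else
        echo "MISMATCH: region query gives $expected, extract holds $actual"
        return 1
    fi
}

# Example call (commented out; file names follow the commands earlier in the post):
# check_extract control_1.sort.bam "chr17:40437407-40486397" RNA-seq.bam
```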
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/2338.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bioconductor series: pasillaBamSubset</title>
		<link>http://www.bio-info-trainee.com/879.html</link>
		<comments>http://www.bio-info-trainee.com/879.html#comments</comments>
		<pubDate>Sun, 19 Jul 2015 02:46:35 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[bam]]></category>
		<category><![CDATA[bioconductor]]></category>
		<category><![CDATA[coverage]]></category>
		<category><![CDATA[pasillaBamSubset]]></category>
		<category><![CDATA[readGAlignments]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=879</guid>
		<description><![CDATA[This package mainly provides BAM files as test data > biocLite("pasillaBamSu &#8230; <a href="http://www.bio-info-trainee.com/879.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>This package mainly provides BAM files as test data.<br />
> biocLite("pasillaBamSubset")<br />
BioC_mirror: http://bioconductor.org<br />
Using Bioconductor version 3.0 (BiocInstaller 1.16.5), R version 3.1.2.<br />
Installing package(s) 'pasillaBamSubset'<br />
trying URL 'http://bioconductor.org/packages/3.0/data/experiment/bin/windows/contrib/3.1/pasillaBamSubset_0.3.1.zip'<br />
Content type 'application/zip' length 31514402 bytes (30.1 Mb)<br />
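Open the installation directory of pasillaBamSubset and you will see several BAM files inside.<br />
Several functions are available for reading BAM files into R. They are provided by GenomicAlignments (the three readGAlignments* functions) and Rsamtools (scanBam()), not by pasillaBamSubset itself:<br />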
readGAlignments()<br />
readGAlignmentPairs()<br />
readGAlignmentsList()<br />
scanBam()<br />
<a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/07/Bioconductor系列之pasillaBamSubset448.png"><img src="http://www.bio-info-trainee.com/wp-content/uploads/2015/07/Bioconductor系列之pasillaBamSubset448.png" alt="Bioconductor系列之pasillaBamSubset448" width="554" height="305" class="alignnone size-full wp-image-880" /></a><br />
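After loading the package, two functions return the paths of the bundled data files. Not everyone installs packages on the C: drive, so locating the files through these functions is safer than hard-coding a path.<br />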
> library(pasillaBamSubset)<br />
> untreated1_chr4()<br />
[1] "C:/Program Files/R/R-3.1.2/library/pasillaBamSubset/extdata/untreated1_chr4.bam"<br />
> untreated3_chr4()<br />
[1] "C:/Program Files/R/R-3.1.2/library/pasillaBamSubset/extdata/untreated3_chr4.bam"</p>
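<p>Next, let's see how to read these BAM files:<br />
library(pasillaBamSubset)<br />
un1 &lt;- untreated1_chr4()  # path to the bundled BAM (single-end reads)<br />
library(GenomicAlignments)<br />
reads1 &lt;- readGAlignments(un1)  # load the alignments as a GAlignments object<br />
cvg1 &lt;- coverage(reads1)  # per-chromosome coverage as an RleList<br />
Inspecting reads1 shows that the whole BAM file has been read into a single GAlignments object:<br />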
<a href="http://www.bio-info-trainee.com/wp-content/uploads/2015/07/Bioconductor系列之pasillaBamSubset1142.png"><img src="http://www.bio-info-trainee.com/wp-content/uploads/2015/07/Bioconductor系列之pasillaBamSubset1142.png" alt="Bioconductor系列之pasillaBamSubset1142" width="554" height="219" class="alignnone size-full wp-image-881" /></a><br />
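Many operations are available on this object. One of them, coverage(), has a method for GAlignments objects in GenomicAlignments (the generic is defined in IRanges) and computes the sequencing coverage.<br />
The alignments in this BAM file are concentrated on chromosome 4:<br />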
> cvg1$chr4<br />
integer-Rle of length 1351857 with 122061 runs<br />
  Lengths:  891   27    5   12   13   45    5   12   13 ...    5    1    1    3  10<br />
  Values :    0    1    2    3    4    5    4    3    2 ...   12   11   10    6<br />
> mean(cvg1$chr4)<br />
[1] 11.33746<br />
> max(cvg1$chr4)<br />
[1] 5627<br />
So the mean sequencing depth is about 11.3X and the maximum depth is 5627X.</p>
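<p>The mean reported here is taken over every position of chr4, so each run in the Rle contributes its value weighted by its length. A small sketch of that arithmetic in awk, using only the first six runs printed above as toy input (so its result is not the chromosome-wide mean); the function name rle_summary is made up for illustration:</p>

```shell
#!/usr/bin/env bash
# Sketch: mean and max of a run-length encoded coverage vector.
# Input format (an assumption of this sketch): one "length value" pair per line.
# rle_summary is a hypothetical helper name, not part of any package.

rle_summary() {
    awk '
        NF == 2 {
            total_len += $1          # number of positions covered by this run
            weighted  += $1 * $2     # depth summed over those positions
            if (!seen || $2 > max) { max = $2; seen = 1 }
        }
        END {
            if (total_len > 0)
                printf "positions=%d mean=%.4f max=%d\n", total_len, weighted / total_len, max
        }
    '
}

# Toy input: the first six runs shown in the printed Rle for chr4.
printf '%s\n' "891 0" "27 1" "5 2" "12 3" "13 4" "45 5" | rle_summary
```

<p>On this toy input the mean is 350 / 993, far below 11.3, because the listed runs come from the sparsely covered start of the chromosome.</p>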
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/879.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
