<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>生信菜鸟团 &#187; soapfuse</title>
	<atom:link href="http://www.bio-info-trainee.com/tag/soapfuse/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bio-info-trainee.com</link>
	<description>Join the discussion on the forum at biotrainee.com, or follow the WeChat official account of the same name, biotrainee.</description>
	<lastBuildDate>Sat, 28 Jun 2025 14:30:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.33</generator>
	<item>
		<title>Fusion gene detection software: SOAPfuse</title>
		<link>http://www.bio-info-trainee.com/1463.html</link>
		<comments>http://www.bio-info-trainee.com/1463.html#comments</comments>
		<pubDate>Tue, 15 Mar 2016 11:30:21 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Basic software]]></category>
		<category><![CDATA[Transcriptome software]]></category>
		<category><![CDATA[soap]]></category>
		<category><![CDATA[soapfuse]]></category>
		<category><![CDATA[Fusion genes]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1463</guid>
		<description><![CDATA[Developer: BGI, as part of the SOAP software suite. Function: detecting fusion genes. Strengths: among the existing tools &#8230; <a href="http://www.bio-info-trainee.com/1463.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Developer: BGI, as part of the SOAP software suite.</p>
<div>
<div>Function: detecting fusion genes</div>
<div>Strengths: arguably the best performer among the existing tools</div>
<div>Algorithm: hash index, which differs from the BWT-based approach used by other tools</div>
<div>Official site: <a href="http://soap.genomics.org.cn/soapfuse.html">http://soap.genomics.org.cn/soapfuse.html</a></div>
<div>Paper: <a href="https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-2-r12">https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-2-r12</a></div>
<div></div>
<div>Other tools include: FusionSeq [<span class=""><a href="https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-2-r12#CR21">21</a></span>], deFuse [<span class=""><a href="https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-2-r12#CR22">22</a></span>], TopHat-Fusion [<span class=""><a href="https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-2-r12#CR23">23</a></span>], FusionHunter [<span class=""><a href="https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-2-r12#CR24">24</a></span>], SnowShoes-FTD [<span class=""><a href="https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-2-r12#CR25">25</a></span>], chimerascan [<span class=""><a href="https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-2-r12#CR26">26</a></span>] and FusionMap [<span class=""><a href="https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-2-r12#CR27">27</a></span>]</div>
<div></div>
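<div>I have not studied the algorithm in detail. I simply had a need for it: I happened to have some RNA-seq data and wanted to look at gene fusions in the samples, so I tested this software. Put simply, the principle behind fusion detection is straightforward: if enough reads align partly to one gene and partly to another, that is evidence the two genes have fused. With paired-end (PE) sequencing it is even easier, because the alignments of the two mates can be taken into account as well. Enough preamble; on to the tutorial.</div>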
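<div>The principle described above can be sketched in a few lines of Python. This is only a toy illustration and not SOAPfuse's actual algorithm: it assumes every read pair has already been reduced to the names of the genes its two mates align to. The pairs list, the gene names and the min_pairs threshold are all made up for the example.</div>

```python
from collections import Counter

def candidate_fusions(pairs, min_pairs=2):
    """Count read pairs whose two mates hit different genes.

    pairs: iterable of (gene_of_mate1, gene_of_mate2) tuples (hypothetical input).
    Returns {(gene_a, gene_b): count} for gene pairs supported by at least
    min_pairs read pairs; names are sorted so A/B and B/A are merged.
    """
    support = Counter()
    for gene1, gene2 in pairs:
        if gene1 != gene2:
            support[tuple(sorted((gene1, gene2)))] += 1
    return {genes: n for genes, n in support.items() if n >= min_pairs}

# Toy data: two pairs span BCR and ABL1, one pair is within BCR,
# and a single pair spans TP53 and EGFR (below the threshold).
pairs = [("BCR", "ABL1"), ("ABL1", "BCR"), ("BCR", "BCR"), ("TP53", "EGFR")]
print(candidate_fusions(pairs))  # {('ABL1', 'BCR'): 2}
```

<div>Real tools also need split reads that cross the junction itself, plus filters for homologous genes and alignment artifacts, which this sketch leaves out.</div>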
<div></div>
<div>
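<div><span style="color: #ff0000;">1. Installing the software</span></div>
<div>
<div>Download link: <a href="https://sourceforge.net/projects/soapfuse/files/SOAPfuse_Package/SOAPfuse-v1.27.tar.gz">https://sourceforge.net/projects/soapfuse/files/SOAPfuse_Package/SOAPfuse-v1.27.tar.gz</a></div>
</div>
<div>Download the archive and unpack it; it is ready to use as is.</div>
<div>I recommend using the latest version, and reading the author's manual carefully.</div>
<div>I got confused several times and only sorted things out after contacting the author. He said he wants to move to version 2.0, which would work directly from HISAT alignment SAM files, but it is still in preparation and I am not optimistic it will happen.</div>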
<div><a href="http://www.bio-info-trainee.com/wp-content/uploads/2016/03/12.png"><img class="alignnone size-full wp-image-1465" src="http://www.bio-info-trainee.com/wp-content/uploads/2016/03/12.png" alt="1" width="655" height="177" /></a></div>
<div></div>
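<div>After unpacking you get a set of Perl programs, all under the source directory. The bin directory under source also ships several third-party tools, including bwa, blast and soap, all of which are used later.</div>
<div>One important point: the Perl modules bundled with the software must be added to Perl's environment variable (for example PERL5LIB); otherwise the Perl programs fail at run time.</div>
<div>The config file also needs editing: just fill in a few directories.</div>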
<div></div>
<div></div>
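<p><span style="color: #ff0000;">2. Preparing the input data</span></p>
<div>The most important part here is building the database.</div>
<div>The author provides a very detailed procedure, but I still found it not clear enough, so I will go through it again.</div>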
<div>
<div><a href="https://sourceforge.net/p/soapfuse/blog/2013/07/strategy-for-recurrent-transcriptname-and-genename-in-ensembl-gtf-file">https://sourceforge.net/p/soapfuse/blog/2013/07/strategy-for-recurrent-transcriptname-and-genename-in-ensembl-gtf-file</a></div>
<div>First, download five files:</div>
<div>
<blockquote>
<div>6.5K Jun 15  2009 cytoBand.txt.gz</div>
<div>3.0G Oct 12  2012 hg19.fa</div>
<div>2.5M Mar 15 10:30 HGNC_Gene_Family_dataset</div>
<div>38M Feb  8  2014 Homo_sapiens.GRCh37.75.gtf.gz</div>
<div>202 Jan 19 16:07 HumanRef_refseg_symbols_relationship.list</div>
</blockquote>
<p>The author has already provided the download locations for these files.</p>
</div>
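<div>I put all of these files in a raw sub-folder of the current folder, because I want the current folder to serve as the software's database folder.</div>
<div>Then run the command.</div>
<div>
<div>I ran it from the database folder (db_v1.27 in my case), which sits next to SOAPfuse-v1.27:</div>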
<div>perl ../SOAPfuse-v1.27/source/SOAPfuse-S00-Generate_SOAPfuse_database.pl  \</div>
<div>-wg raw/hg19.fa  -gtf raw/Homo_sapiens.GRCh37.75.gtf.gz  -cbd raw/cytoBand.txt.gz   -gf raw/HGNC_Gene_Family_dataset \</div>
<div>-rft raw/HumanRef_refseg_symbols_relationship.list \</div>
<div> -sd ../SOAPfuse-v1.27 -dd ./</div>
<p>This step takes a long time, 4 to 6 hours: it creates transcript.fa and gene.fa and then builds bwa and soap indexes for them, which is why it is slow.</p>
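<div>Because the build runs for hours, it can be worth checking the inputs before launching it. The following is a minimal sketch, not part of SOAPfuse: it only reuses the script name, flags and file names from the command above, and the helper name build_db_command is made up for the example.</div>

```python
from pathlib import Path

# File names copied from the listing above; "raw" is the sub-folder used in this post.
RAW_FILES = {
    "-wg": "hg19.fa",
    "-gtf": "Homo_sapiens.GRCh37.75.gtf.gz",
    "-cbd": "cytoBand.txt.gz",
    "-gf": "HGNC_Gene_Family_dataset",
    "-rft": "HumanRef_refseg_symbols_relationship.list",
}

def build_db_command(soapfuse_dir, db_dir):
    """Return (argv, missing): the database-build command and any missing inputs."""
    soapfuse_dir, db_dir = Path(soapfuse_dir), Path(db_dir)
    script = soapfuse_dir / "source" / "SOAPfuse-S00-Generate_SOAPfuse_database.pl"
    argv = ["perl", str(script)]
    missing = []
    for flag, name in RAW_FILES.items():
        path = db_dir / "raw" / name
        if not path.is_file():
            missing.append(str(path))
        argv += [flag, str(path)]
    argv += ["-sd", str(soapfuse_dir), "-dd", str(db_dir)]
    return argv, missing

argv, missing = build_db_command("../SOAPfuse-v1.27", ".")
if missing:
    print("Download these first:", *missing, sep="\n  ")
else:
    # Hand argv to subprocess.run(argv, check=True) to actually start the build.
    print(" ".join(argv))
```

<div>The sketch only prints the command, so nothing long-running starts by accident.</div>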
</div>
<div>When the build succeeds, it prints:</div>
</div>
</div>
</div>
<blockquote>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">Congratulations!</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">You have constructed SOAPfuse database files successfully.</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">These database files are all stored in directory you supplied:</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">/home/jmzeng/biosoft/SOAPfuse/db_v1.27/</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">They are all generated based on public data files you supplied:</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">whole_genome_fasta_file:   /home/jmzeng/biosoft/SOAPfuse/db_v1.27/raw/hg19.fa</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">gtf_annotation_file:       /home/jmzeng/biosoft/SOAPfuse/db_v1.27/raw/Homo_sapiens.GRCh37.75.gtf.gz</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">Chr_Bandregion_file:       /home/jmzeng/biosoft/SOAPfuse/db_v1.27/raw/cytoBand.txt.gz</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">HGNC_gene_family_file:     /home/jmzeng/biosoft/SOAPfuse/db_v1.27/raw/HGNC_Gene_Family_dataset</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">gtf_segname2refseg_list:   /home/jmzeng/biosoft/SOAPfuse/db_v1.27/raw/HumanRef_refseg_symbols_relationship.list</span></div>
</blockquote>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">These directories matter; they are needed when writing the config file next.</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">To use these database files, just set the 'DB_db_dir' in config file as belowed:</span></div>
<div><span style="font-family: Monaco,Consolas,Courier,Lucida Console,monospace;">DB_db_dir  =   /home/jmzeng/biosoft/SOAPfuse/db_v1.27</span></div>
<div>
<div>
<div>
<div>Five entries in the config file need to be changed:</div>
</div>
</div>
</div>
<blockquote>
<pre>DB_db_dir = /DATABASE_DIR/</pre>
<pre>PG_pg_dir = /TOOL_DIR/source/bin</pre>
<pre>PS_ps_dir = /TOOL_DIR/source</pre>
<pre>PD_all_out = /out_directory/</pre>
<pre>PA_all_fq_postfix = PostFix</pre>
</blockquote>
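<div>Editing these entries by hand works, but the change can also be scripted. The sketch below assumes the config file consists of plain KEY = value lines, as in the excerpt above. The install location and output directory are hypothetical, and fq.gz as the postfix is my guess based on the file names used later in this post, so check it against the manual.</div>

```python
def update_config(text, new_values):
    """Replace the value of each `KEY = value` line whose KEY is in new_values."""
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in new_values:
            line = f"{key} = {new_values[key]}"
        out.append(line)
    return "\n".join(out) + "\n"

# The five entries as they appear in the template config.
template = """DB_db_dir = /DATABASE_DIR/
PG_pg_dir = /TOOL_DIR/source/bin
PS_ps_dir = /TOOL_DIR/source
PD_all_out = /out_directory/
PA_all_fq_postfix = PostFix
"""

tool = "/home/jmzeng/biosoft/SOAPfuse/SOAPfuse-v1.27"  # assumed install location
print(update_config(template, {
    "DB_db_dir": "/home/jmzeng/biosoft/SOAPfuse/db_v1.27",
    "PG_pg_dir": tool + "/source/bin",
    "PS_ps_dir": tool + "/source",
    "PD_all_out": "/home/jmzeng/test_for_soapfuse/out",  # hypothetical output dir
    "PA_all_fq_postfix": "fq.gz",  # assumed from test_1.fq.gz
}))
```

<div>Lines whose key is not listed, including comments, pass through unchanged.</div>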
<div>
<div>
<div>
<div></div>
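<div>If you read the manual carefully, you will know what values to put there.</div>
<div>Finally, create the sample list file.</div>
<div>I have only one sample here, so the file is a single line:</div>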
<div>test test test 100</div>
<div>So I ended up with the two files below; I only came up with something as dull as test/test/test to satisfy the directory layout the author requires.</div>
<div>/home/jmzeng/test_for_soapfuse/test/test/test_1.fq.gz</div>
<div>/home/jmzeng/test_for_soapfuse/test/test/test_2.fq.gz</div>
<div>If you need to run several samples together, read the author's readme carefully; it makes this configuration rather complicated.</div>
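<div>The link between the sample list line and the two file paths above can be made explicit with a small sketch. It assumes the four columns are sample, library, run and read length, and that the files live at data_dir/sample/library/run_1.postfix and run_2.postfix. That matches the example in this post, but treat the column meanings as my reading and check the readme.</div>

```python
from pathlib import PurePosixPath

def expected_fastqs(data_dir, sample_line, postfix="fq.gz"):
    """Map one sample-list line to the two paired-end FASTQ paths it implies.

    Assumed columns: sample, library, run, read length.
    """
    sample, lib, run, read_len = sample_line.split()
    base = PurePosixPath(data_dir) / sample / lib
    paths = [str(base / f"{run}_{mate}.{postfix}") for mate in (1, 2)]
    return paths, int(read_len)

paths, read_len = expected_fastqs("/home/jmzeng/test_for_soapfuse", "test test test 100")
for p in paths:
    print(p)
# /home/jmzeng/test_for_soapfuse/test/test/test_1.fq.gz
# /home/jmzeng/test_for_soapfuse/test/test/test_2.fq.gz
```

<div>Checking that these paths exist before starting is cheap compared with a failed multi-hour run.</div>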
</div>
<p><span style="color: #ff0000;">3. Running the command</span></p>
<div>Once all the files are ready, the command itself is very simple.</div>
<div>
<div>
<pre>perl<span style="color: #ff00ff;"> SOAPfuse-RUN.pl</span> -c &lt;<strong>config_file</strong>&gt; -fd &lt;<strong>WHOLE_SEQ-DATA_DIR</strong>&gt; -l &lt;<strong>sample_list</strong>&gt; -o &lt;<strong>out_directory</strong>&gt; [Options]</pre>
<p>It runs very slowly.</p>
</div>
<div>That is because the reads have to be aligned all over again.</div>
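<div>For reference, here is how the placeholders in the command might be filled in for the single test sample from this post. It is a sketch only: the config and sample list file names and the output directory are hypothetical, and I am assuming that -fd points at the folder containing the per-sample directories.</div>

```python
import shlex

def soapfuse_run_argv(config, data_dir, sample_list, out_dir, script="SOAPfuse-RUN.pl"):
    """Assemble the SOAPfuse-RUN.pl call from its four required arguments."""
    return ["perl", script, "-c", config, "-fd", data_dir, "-l", sample_list, "-o", out_dir]

argv = soapfuse_run_argv(
    config="config.txt",                           # hypothetical file name
    data_dir="/home/jmzeng/test_for_soapfuse",     # folder holding test/test/test_1.fq.gz
    sample_list="sample.list",                     # hypothetical file name
    out_dir="/home/jmzeng/test_for_soapfuse/out",  # hypothetical output dir
)
print(shlex.join(argv))
# perl SOAPfuse-RUN.pl -c config.txt -fd /home/jmzeng/test_for_soapfuse -l sample.list -o /home/jmzeng/test_for_soapfuse/out
```

<div>Building the call as a list keeps paths with spaces intact if it is later passed to subprocess.run.</div>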
</div>
<p><span style="color: #ff0000;">4. Interpreting the results</span></p>
<div>The author already explains the results very clearly, so I will not repeat that here:</div>
<div>
<div><a href="http://soap.genomics.org.cn/soapfuse.html">http://soap.genomics.org.cn/soapfuse.html</a></div>
</div>
<div></div>
<div></div>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1463.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
