<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>生信菜鸟团 &#187; Self-Taught Programming</title>
	<atom:link href="http://www.bio-info-trainee.com/tag/%e8%87%aa%e5%ad%a6%e7%bc%96%e7%a8%8b/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bio-info-trainee.com</link>
	<description>Join the discussion on the forum at biotrainee.com, or follow the WeChat official account of the same name, biotrainee.</description>
	<lastBuildDate>Sat, 28 Jun 2025 14:30:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.33</generator>
	<item>
		<title>How a Bioinformatics Beginner Can Teach Themselves Programming</title>
		<link>http://www.bio-info-trainee.com/1079.html</link>
		<comments>http://www.bio-info-trainee.com/1079.html#comments</comments>
		<pubDate>Fri, 23 Oct 2015 14:43:48 +0000</pubDate>
		<dc:creator><![CDATA[ulwvfje]]></dc:creator>
				<category><![CDATA[Miscellaneous Essays]]></category>
		<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[Self-Taught Programming]]></category>

		<guid isPermaLink="false">http://www.bio-info-trainee.com/?p=1079</guid>
		<description><![CDATA[This started as a question I came across on Zhihu, so I took some time to answer it: http://www.zh &#8230; <a href="http://www.bio-info-trainee.com/1079.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
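				<content:encoded><![CDATA[<p>This started as a question I came across on Zhihu, so I took some time to answer it: <a href="http://www.zhihu.com/question/36701137/answer/68928111">http://www.zhihu.com/question/36701137/answer/68928111</a></p>
<p>First, wanting to read source code is a good sign. Well-written source code really is a shortcut to becoming a better programmer. Most of us will never get hands-on guidance from someone else, so we have to rely on working things out for ourselves.</p>
<p>I'll answer from my own experience. I started out with a pure biology background and no programming at all, and my programming is reasonably good now.</p>
<p>First, whichever language you pick (Perl, Python, R, or MATLAB), there is a pile of introductory books for it. Skim through one or two of them from cover to cover without trying to digest every detail (no book is better or worse than another, so don't ask me for titles). You do have to finish them, so that you understand the basics of programming.</p>
<p>The next step matters most: practice, and keep practicing. Applying programming to real tasks is the fastest way to learn. Otherwise, however many books you read, it all stays abstract.</p>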
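<p>The one thing I strongly recommend is a tool collection that implements many of the common operations bioinformatics needs: <a class=" wrap external" href="https://www.dna20.com/resources/bioinformatics-tools" target="_blank" rel="nofollow noreferrer">Bioinformatics Tools<i class="icon-external"></i></a><br />
It contains the following 64 tools, and the page describes clearly what each one does. They are all quite simple, but writing programs like these is a very effective way to learn.<br />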
"Combines multiple FASTA entries into a single sequence."<br />
"Returns the entire sequence contained in an EMBL file in FASTA format."<br />
"Parses the feature table of an EMBL file and returns the feature sequences."<br />
"Parses the feature table of an EMBL file and returns the protein translations."<br />
"Removes non-DNA characters from text."<br />
"Removes non-protein characters from text."<br />
"Returns the entire sequence contained in a GenBank file in FASTA format."<br />
"Parses the feature table of a GenBank file and returns the feature sequences."<br />
"Parses the feature table of a GenBank file and returns the protein translations."<br />
"Converts single letter amino acid codes to three letter codes."<br />
"Reads a list of positions and ranges and returns those parts of a DNA sequence."<br />
"Reads a list of positions and ranges and returns those parts of a protein sequence."<br />
"Determines the reverse-complement, reverse, or complement of the sequence you enter."<br />
"Separates bases according to codon position."<br />
"Converts a FASTA sequence into multiple sequences."<br />
"Converts three letter amino acid codes to one letter codes."<br />
"Returns DNA sequence segments specified by a position and window size."<br />
"Returns protein sequence segments specified by a position and window size."<br />
"Plots codon frequency (according to the codon table you enter) for each codon in a DNA sequence."<br />
"Returns a standard codon usage table."<br />
"Returns a list of potential CpG islands."<br />
"Calculates the molecular weight of DNA sequences."<br />
"Returns positions of the patterns you enter."<br />
"Returns basic sequence statistics."<br />
"Returns sequences that are identical or similar to a query sequence."<br />
"Returns sequences that are identical or similar to a query sequence."<br />
"Accepts aligned sequences in FASTA format and calculates the identity and similarity of each sequence pair."<br />
"Can be used to predict a DNA sequence in another species using a protein sequence alignment."<br />
"Finds DNA sequences that can easily be converted to a restriction site."<br />
"Determines the positions of open reading frames."<br />
"Returns the optimal global alignment for two coding DNA sequences."<br />
"Returns the optimal global alignment for two DNA sequences."<br />
"Returns the optimal global alignment for two protein sequences."<br />
"Returns a report describing PCR primer properties"<br />
"Generates PCR products from a template and two primer sequences."<br />
"Returns the grand average of hydropathy value of protein sequences."<br />
"Returns the predicted isoelectric point of protein sequences."<br />
"Calculates the molecular weight of protein sequences."<br />
"Returns positions of the patterns you enter."<br />
"Returns basic sequence statistics."<br />
"Converts the sequence you enter into restriction fragments."<br />
"Returns the number and positions of restriction sites."<br />
"Can be used to convert protein into DNA."<br />
"Returns the translation in the reading frame you specify."<br />
"Colors a sequence alignment based on sequence conservation."<br />
"Colors a protein alignment based on biochemical properties of residues."<br />
"Numbers and groups DNA according to your specifications."<br />
"Numbers and groups amino acids according to your specifications."<br />
"Shows PCR primer annealing sites, translations, and restriction sites."<br />
"Shows restriction sites and protein translations."<br />
"Shows protein translations."<br />
"Introduces random mutations into DNA sequences."<br />
"Introduces random mutations into protein sequences."<br />
"Generates a random coding sequence of the length you specify."<br />
"Generates a random DNA sequence of the length you specify."<br />
"Replaces regions of the DNA sequences you enter with random bases."<br />
"Generates a random protein sequence of the length you specify."<br />
"Replaces regions of the protein sequences you enter with random residues."<br />
"Samples bases from a DNA sequence with replacement."<br />
"Samples residues from a protein sequence with replacement."<br />
"Randomly shuffles the DNA sequences you enter."<br />
"Randomly shuffles the protein sequences you enter."<br />
"IUPAC codes for DNA and protein."<br />
"The genetic codes used in the Sequence Manipulation Suite."<br />
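Once you have implemented all of these, you will not only have learned to program, you will have learned how programming is applied in bioinformatics.<br />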
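To make that concrete, here is a rough Python sketch of two of the simplest tasks in the list above: "Removes non-DNA characters from text" and "Determines the reverse-complement, reverse, or complement of the sequence you enter". The function names and the decision to keep only A, C, G, T and N are assumptions made for illustration, not a description of how the website implements these tools.<br />

```python
# Illustrative sketch only: names and accepted alphabet are assumptions.

# Translation table mapping each base to its complement, preserving case.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def remove_non_dna(text: str) -> str:
    """Drop every character that is not A, C, G, T or N (case-insensitive)."""
    return "".join(ch for ch in text if ch.upper() in "ACGTN")


def complement(seq: str) -> str:
    """Return the complement of a DNA sequence, base by base."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complement of a DNA sequence read in the opposite direction."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    raw = "1 atg c\n>N"
    cleaned = remove_non_dna(raw)
    print(cleaned)
    print(reverse_complement("ATGC"))
```

Each function is only a line or two long, which is about the size of script the tasks in the list call for.<br />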
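Any of Perl, Python, R, or MATLAB can do it. For these tasks there is no real difference between them, so don't agonize over the choice of language.<br />
I don't recommend that beginners read source code, because production source code is too formal: variable definitions alone can run to dozens of lines, and function definitions to hundreds more, whereas people actually doing bioinformatics rarely write scripts longer than fifty lines. The 64 data-processing tasks I listed above, for example, usually take only seven or eight lines each (in Perl).<br />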
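As a rough illustration of how short such a script can be, here is a sketch of the first task in the list, "Combines multiple FASTA entries into a single sequence", written in Python rather than Perl. The function name combine_fasta and the default header text are hypothetical, and the sketch assumes the input is plain FASTA text with header lines starting with ">".<br />

```python
# Illustrative sketch only: combine_fasta and its default header are assumptions.

def combine_fasta(text: str, header: str = "combined") -> str:
    """Join the sequence lines of all FASTA entries under a single header."""
    seq_lines = [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.startswith(">")  # skip blanks and headers
    ]
    joined = "".join(seq_lines)
    return f">{header}\n{joined}\n"


if __name__ == "__main__":
    example = ">a\nATG\nCC\n>b\nGGA\n"
    print(combine_fasta(example), end="")
```

The real work is a single list comprehension, so the whole task fits comfortably within the seven or eight lines mentioned above.<br />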
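If you don't believe me, look at the code hosted in this GitHub repository: <a class=" wrap external" href="https://github.com/trinityrnaseq/trinityrnaseq/tree/master/util/misc" target="_blank" rel="nofollow noreferrer">trinityrnaseq/util/misc at master · trinityrnaseq/trinityrnaseq · GitHub <i class="icon-external"></i></a><br />
It contains a lot of Perl scripts, all doing various kinds of data conversion. They are written very formally, to the point of turning a problem that one line could solve into hundreds or even thousands of lines. Unless you want to publish your code or sell it, ordinary bioinformatics research has no need for that.<br />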
Of course, back to your original question: where can you find source code?<br />
First, you can go through a stack of books at the library. They come with a disc or a download that contains both videos and source code, or the book will tell you where to download the source, for example: <a class=" wrap external" href="https://github.com/pleac/pleac/tree/master/include/perl" target="_blank" rel="nofollow noreferrer">pleac/include/perl at master · pleac/pleac · GitHub<i class="icon-external"></i></a><br />
Next, you can look at the many bioinformatics software packages, which are generally hosted on GitHub. This link lists more than three hundred tools for transcriptome analysis: <a class=" wrap external" href="https://en.wiki2.org/wiki/List_of_RNA-Seq_bioinformatics_tools" target="_blank" rel="nofollow noreferrer">List of RNA-Seq bioinformatics tools<i class="icon-external"></i></a><br />
This link lists several hundred alignment tools used in bioinformatics: <a class=" wrap external" href="http://www.bio-info-trainee.com/?p=1051" target="_blank" rel="nofollow noreferrer">An ongoing collection of NGS alignment tools <i class="icon-external"></i></a><br />
(Remember, these tools were all published in papers and are very hard to follow. Fully understanding even one of them in a lifetime would be an achievement. I, for example, only dug into bowtie, and I still only half understand it.)<br />
Even the common bioinformatics databases publish their own source packages, for example NCBI, Ensembl, and UCSC.<br />
Here is the one for the Ensembl database, which shares all of its code, and that is really convenient: <a class=" wrap external" href="https://github.com/Ensembl" target="_blank" rel="nofollow noreferrer">Ensembl Project · GitHub<i class="icon-external"></i></a><br />
You can learn programming by following this code: <a class=" wrap external" href="https://github.com/Ensembl/ensembl-pipeline" target="_blank" rel="nofollow noreferrer">Ensembl/ensembl-pipeline · GitHub<i class="icon-external"></i></a><br />
Its official help documentation is also very detailed: <a class=" wrap external" href="http://useast.ensembl.org/info/index.html" target="_blank" rel="nofollow noreferrer">Help &amp; Documentation<i class="icon-external"></i></a><br />
Are you still short of material now?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bio-info-trainee.com/1079.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
