The posts from Beginning Python for Bioinformatics are collected here, and new ones will be added as they appear, allowing for a better experience when reading the entries.

Beginning the begin

This website uses as a premise the book:

Beginning Perl for Bioinformatics by James Tisdall

which was published in 2001. My idea here is to follow the structure of the book, analyzing each chapter and converting the Perl scripts into Python. The original book is very well written and an excellent starting point for any aspiring bioinformatician, whether you are a biologist who does not understand programming or a computer scientist who does not know much biology, or maybe even Perl.

In no way does this website/tutorial try to plagiarize the book, and I will try to include a minimum amount of Perl code, as the book is only used as a starting point (a very good one indeed) for this journey into Python. Here you will find neither explanations of biological concepts nor criticisms of Perl. Having made this clear, I will start from the beginning.

Why Python (and not Perl)?

According to the official Python website:

“Python and Perl come from a similar background (Unix scripting, which both have long outgrown) [to learn more about that check this tutorial], and sport many similar features, but have a different philosophy. Perl emphasizes support for common application-oriented tasks, e.g. by having built-in regular expressions, file scanning and report generating features. Python emphasizes support for common programming methodologies such as data structure design and object-oriented programming, and encourages programmers to write readable (and thus maintainable) code by providing an elegant but not overly cryptic notation. As a consequence, Python comes close to Perl but rarely beats it in its original application domain; however Python has an applicability well beyond Perl's niche.”

I couldn't explain it better than that. But still I have to give my take on why I prefer Python over Perl, and why I decided to use it in my day-to-day programming. First, I have to admit that I am a lousy Perl programmer (not even close to an apprentice monger) and I always get confused by its syntax. Second, I come from a Basic/Pascal/C++ background, all of them having slightly better syntax than Perl. Thus, it was natural to get on the Python bandwagon, and as the paragraph above states, Python code is “extremely” readable (the emphasis is mine); in no time you can grasp it completely. OK, I admit that it has at least one odd feature: the “mandatory” indentation. In Python you have to indent (using tabs or, as recommended, spaces) the body of loops, if clauses, functions, anything. Maybe this is the first and only hard step to get, but after a couple of hours of coding you will be satisfied with how good your code looks.
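To give an idea of what the indentation looks like in practice, here is a minimal sketch. The function name count_bases and the example sequence are made up for illustration and do not come from the book; the point is only that each nested block (function body, loop body, if/else branches) sits one level deeper, here using four spaces per level.

```python
def count_bases(sequence):
    """Count how many times each character appears in a sequence.

    Hypothetical helper, used only to show indentation levels.
    """
    counts = {}
    # The loop body is indented one level inside the function.
    for base in sequence:
        # Each branch of the if/else is indented one level further.
        if base in counts:
            counts[base] += 1
        else:
            counts[base] = 1
    # Back at function level: this runs once, after the loop ends.
    return counts


dna = "ACGTTGCA"  # made-up example sequence
print(count_bases(dna))
```

Moving the return line one level to the right, into the loop, would make the function stop after the first base, so the indentation is part of the meaning of the program and not just decoration.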

Part 1 - an introduction, and some sequence manipulation

Part 2 - flow control, finding motifs and some string manipulation

Part 3 - functions and command line arguments

Part 4 - randomization and simple sequence simulation

Part 5 - genetic code and some “DNA” manipulation

Part 6 - more motif finding and some restriction enzymes

Part 7 - GenBank files

Part 8 - Splitting a FASTA file

Part 9 - functional programming

Part 10 - “cutting” chromosomes

Part 11 - uniquifying lists

Part 12 - the Fasta module

Part 13 - More Python sets

Part 14 - Obtaining overrepresented motifs in DNA sequences

Part 15 - Creating an interface for the motif finding script

Part 16 - Managing a simple database with Python, SQLite and wxPython

beginning_python_for_bioinformatics.txt · Last modified: 2009/05/22 11:29 by nuin
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported