Differences

This shows you the differences between two versions of the page.

part11 [2009/05/14 12:29]
nuin created
part11 [2009/12/04 14:38] (current)
newacct
Line 8: Line 8:
So, a //set// is basically a collection of items, but it is unordered and not indexed, because it does not record element position or insertion order. Methods available for a //set// include //union()//, //intersection()// and //difference()//, which we will check next time. First let's see some basic set functionality.
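As a small illustration of "unordered and not indexed", the sketch below builds a //set// from a list with a repeated item and shows that membership tests work while positional access does not. The gene names are made up for the example.

```python
# Hypothetical gene IDs; "geneA" appears twice on purpose.
genes = set(["geneA", "geneB", "geneA"])

# Duplicates collapse into a single element.
print(len(genes))

# Membership tests are supported.
print("geneA" in genes)

# Positional access is not: a set has no index.
try:
    genes[0]
    indexable = True
except TypeError:
    indexable = False
print(indexable)
```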
-Differently from other commonly available Python object types, we need to import a library in order to use <code>set</code>
-<code python>from sets import Set</code>
-
-should work.
+In Python 2.4 and later, //set// is available as a built-in object type. In Python 2.3, we need to import a library in order to use <code>set</code>, like this:
+<code python>from sets import Set as set</code>
+
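One way to write a script that runs on either interpreter is to fall back to the //sets// module only when the built-in name is missing. This is a sketch of that idea, not something the page itself prescribes; the gene names are made up.

```python
try:
    set  # built-in type since Python 2.4
except NameError:
    # Python 2.3 only: fall back to the sets module under the same name
    from sets import Set as set

# Either way, the name "set" now behaves the same for basic use.
sample = set(["geneA", "geneB", "geneA"])
print(len(sample))
```

With the alias in place, the rest of the script can call <code>set(...)</code> without caring which version supplied it.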
A first use for //set// would be to uniquify a list. Let's say that you have the gene IDs of two different clusters and you want to merge these lists and keep only the unique ones, eliminating possible duplicate IDs. We could do that with a //dictionary// and a simple function (we will also check this later on) but a <code>set</code> makes our life easier.
-<code python>from sets import Set
-
-cluster1 = open(sys.argv[1]).readlines()
+<code python>cluster1 = open(sys.argv[1]).readlines()
cluster2 = open(sys.argv[2]).readlines()
allgenes = cluster1 + cluster2
-uniqueset = Set(allgenes)</code>
+uniqueset = set(allgenes)</code>
and that's all. Of course we won't have the flexibility of a //list//, but we can easily convert the //set// to a list and manipulate it as before.
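The same merge can be tried without input files. The sketch below uses in-memory lists that mimic what <code>readlines()</code> returns (each line keeps its trailing newline) and strips whitespace first, so that an ID on a final line without a newline is not counted as a different element. The helper name <code>merge_unique</code> and the gene IDs are assumptions made for this example. A script that reads <code>sys.argv</code> as above would also need <code>import sys</code> at the top.

```python
def merge_unique(ids1, ids2):
    """Return the set of distinct, non-empty IDs found in either list."""
    cleaned = [line.strip() for line in ids1 + ids2]
    return set([gene_id for gene_id in cleaned if gene_id])

# Lists shaped like the output of readlines(); the last line of
# cluster1 has no trailing newline, as often happens in real files.
cluster1 = ["gene1\n", "gene2\n", "gene3"]
cluster2 = ["gene2\n", "gene3\n", "gene4\n"]

uniqueset = merge_unique(cluster1, cluster2)
print(len(uniqueset))
```

Without the <code>strip()</code> step, <code>"gene3"</code> and <code>"gene3\n"</code> would be kept as two separate elements.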
Line 35: Line 31:
Like last time (with one small addition)
-<code python>from sets import Set
-
-cluster1 = open(sys.argv[1]).readlines()
+<code python>cluster1 = open(sys.argv[1]).readlines()
cluster2 = open(sys.argv[2]).readlines()
allgenes = cluster1 + cluster2
-uniqueset = Set(allgenes)
+uniqueset = set(allgenes)
finalist = list(uniqueset)</code>
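Because a //set// keeps no order, the list produced by <code>list(uniqueset)</code> comes back in arbitrary order. A minimal sketch, using made-up gene IDs, of converting to a list and then sorting it so that the result is reproducible and can be indexed:

```python
# Hypothetical gene IDs with one duplicate.
uniqueset = set(["gene3", "gene1", "gene2", "gene1"])

# Converting gives a real list, but in no guaranteed order.
finalist = list(uniqueset)

# Sorting in place makes the order predictable.
finalist.sort()

# Now positional access works as with any other list.
print(finalist[0])
```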
Line 93: Line 87:
For a more comprehensive test and more functions with a similar objective check this [[http://www.peterbe.com/plog/uniqifiers-benchmark | page]].
- 
 
part11.txt · Last modified: 2009/12/04 14:38 by newacct
 
Except where otherwise noted, content on this wiki is licensed under the following license:CC Attribution-Noncommercial-Share Alike 3.0 Unported