EasySVM Software package
The EasySVM package provides a set of tools, based on the Shogun toolbox, for training and testing SVMs in a simple way.
News
April 28, 2011: New version of EasySVM released.
This page is dedicated to the EasySVM package, which is also integrated in our Galaxy server.
The updated release (easysvm-0.3.3.tar.bz2) can be downloaded here. Other releases can be found here.
Installation
For a global install (requires root permissions):
python setup.py install
For a local install:
python setup.py install --prefix=$HOME
See distutils-help.txt for more details.
A simple example
In the following very simple example we generate a two-dimensional data set with two Gaussian-distributed classes (60% positive examples, width of distribution 1.3), then run model selection and cross-validation with a Gaussian kernel:
python scripts/datagen.py cloud 1000 2 0.6 1.3 cloud.arff
python scripts/easysvm.py modelsel 5 0.1,1,10 gauss 0.1,1,10 arff cloud.arff modelsel-cloud.txt
python scripts/easysvm.py cv 5 10 gauss 1 arff cloud.arff cv-cloud.txt
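To make the data-generation step more concrete, here is a minimal Python sketch of what a two-class Gaussian cloud might look like. It is not the code of datagen.py: the function name make_cloud, the class centers at +1 and -1 on every axis, and the fixed random seed are assumptions made purely for illustration.

```python
import random


def make_cloud(num_examples, dim, frac_pos, width, seed=0):
    """Return (points, labels) for two Gaussian-distributed classes.

    Hypothetical helper, not part of EasySVM.
    """
    rng = random.Random(seed)
    num_pos = int(round(num_examples * frac_pos))
    points, labels = [], []
    for i in range(num_examples):
        label = 1 if i < num_pos else -1
        # Assumption: the class means sit at +1 / -1 on every axis,
        # and 'width' is used as the standard deviation.
        center = float(label)
        points.append([rng.gauss(center, width) for _ in range(dim)])
        labels.append(label)
    return points, labels


# Same numbers as in the command above: 1000 examples, 2 dimensions,
# 60% positive examples, width 1.3.
points, labels = make_cloud(1000, 2, 0.6, 1.3)
print(len(points), labels.count(1), labels.count(-1))
```

With a width of 1.3 the two clouds overlap noticeably under these assumptions, so no classifier is expected to separate them perfectly.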
Tutorial examples
Many examples of using EasySVM are discussed in a tutorial paper. The results of the paper can be reproduced with the script tutorial_example.py. Execute it from the data directory:
cd data
python ../splicesites/tutorial_example.py
The output of this script can be downloaded here: tutorial_example.out
Galaxy interface
The following commands are what runs behind the Galaxy interface, which is available as a web service at http://galaxy.raetschlab.org/
There are three types of data creation methods:
datagen.py motif arff gattaca 10 50 10-15 0.1 tttt 100 50 15 0.1 testmotif1.arff
datagen.py cloud 100 3 0.6 1.3 testcloud1.arff
datagen.py motif arff gattaca 100 50 10-15 0.1 tttt 1000 50 15 0.1 testmotif2.arff
datagen.py cloud 1000 3 0.6 1.3 testcloud2.arff
datagen.py motif fasta gattaca 10 50 10-15 0.1 testmotifpos.fasta
datagen.py motif fasta tttt 100 50 15 0.1 testmotifneg.fasta
datagen.py motif fasta gattaca 100 50 10-15 0.1 tm1.fasta
datagen.py motif fasta tttt 1000 50 15 0.1 tm2.fasta
cat tm1.fasta tm2.fasta > testmotiftest.fasta
rm tm1.fasta tm2.fasta
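As an illustration of motif data, the sketch below plants a motif into random DNA sequences. It is not the datagen.py implementation. Reading the arguments "gattaca 10 50 10-15 0.1" as motif, number of sequences, sequence length, start-position range, and per-base mutation probability is a guess, and the function name make_motif_sequences is hypothetical.

```python
import random


def make_motif_sequences(motif, num_seqs, seq_len, pos_range, mut_prob, seed=0):
    """Return random DNA strings that each carry a (possibly mutated) motif.

    Hypothetical helper, not part of EasySVM.
    """
    rng = random.Random(seed)
    lo, hi = pos_range
    seqs = []
    for _ in range(num_seqs):
        # Uniform random background sequence.
        seq = [rng.choice("acgt") for _ in range(seq_len)]
        # Assumption: the motif start is drawn uniformly from [lo, hi].
        start = rng.randint(lo, hi)
        for offset, base in enumerate(motif):
            # Assumption: each motif base is resampled with probability mut_prob.
            if rng.random() < mut_prob:
                base = rng.choice("acgt")
            seq[start + offset] = base
        seqs.append("".join(seq))
    return seqs


for s in make_motif_sequences("gattaca", 3, 50, (10, 15), 0.1):
    print(s)
```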
Cross-validation and evaluation on an independent validation set:
easysvm.py cv 5 10 gauss 0.6 arff testcloud1.arff cv_cloud.txt
easysvm.py eval cv_cloud.txt arff testcloud1.arff cv_cloud_eval.txt roc roc_cloud_cv.png
easysvm.py cv 5 10 wd 10 2 arff testmotif1.arff cv_motif.txt dna R
easysvm.py eval cv_motif.txt arff testmotif1.arff cv_motif_eval.txt roc roc_motif_cv.png
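For readers unfamiliar with the two steps, the following sketch shows the general idea of k-fold cross-validation splits and of the area under the ROC curve. Both functions are hypothetical helpers written for this page, not EasySVM or Shogun code; the actual tools may shuffle or stratify the folds and compute the ROC differently.

```python
def kfold_indices(num_examples, num_folds):
    """Yield (train, test) index lists, one pair per fold."""
    # Assumption: examples are dealt round-robin into folds, without shuffling.
    folds = [list(range(i, num_examples, num_folds)) for i in range(num_folds)]
    for held_out in range(num_folds):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y > 0]
    neg = [s for s, y in zip(scores, labels) if y <= 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


for train, test in kfold_indices(10, 5):
    print(len(train), test)

print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, -1, -1]))
```

The pairwise formulation is quadratic in the number of examples, which is fine for a sketch but not for large data sets.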
Predict on a test set:
easysvm.py pred 10 gauss 0.6 arff testcloud1.arff testcloud2.arff pred_cloud.txt
easysvm.py pred 10 linear arff testcloud1.arff testcloud2.arff pred_cloud.txt
easysvm.py pred 10 poly 3 true true arff testcloud1.arff testcloud2.arff pred_cloud.txt
easysvm.py pred 10 wd 10 2 arff testmotif1.arff testmotif2.arff pred_motif.txt dna R
easysvm.py pred 10 localalign arff testmotif1.arff testmotif2.arff pred_motif.txt dna R
easysvm.py pred 10 localimprove 10 1 1 arff testmotif1.arff testmotif2.arff pred_motif.txt dna R
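The kernel names in these commands correspond to well-known kernel functions. The sketch below gives textbook forms of four of them in plain Python. It is illustrative only: the exact parameterization used by Shogun (for example how the Gaussian width enters, or how normalization and shifts are applied) is not reproduced here and should be checked against the Shogun documentation. All function names are hypothetical.

```python
import math


def linear_kernel(x, y):
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, y))


def poly_kernel(x, y, degree, inhomogeneous=True):
    """Polynomial kernel; the inhomogeneous variant adds a constant 1."""
    offset = 1.0 if inhomogeneous else 0.0
    return (linear_kernel(x, y) + offset) ** degree


def gauss_kernel(x, y, width):
    """Gaussian kernel, assuming the convention exp(-||x - y||^2 / width)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / width)


def wd_kernel(s, t, degree):
    """Weighted degree kernel without shifts, for equal-length strings.

    Counts substrings of length 1..degree that match at the same position,
    weighted by beta_d = 2 (degree - d + 1) / (degree (degree + 1)).
    """
    total = 0.0
    for d in range(1, degree + 1):
        beta = 2.0 * (degree - d + 1) / (degree * (degree + 1))
        matches = sum(s[l:l + d] == t[l:l + d] for l in range(len(s) - d + 1))
        total += beta * matches
    return total


print(linear_kernel([1, 2], [3, 4]))
print(poly_kernel([1, 2], [3, 4], 3))
print(gauss_kernel([0.0, 0.0], [1.0, 1.0], 0.6))
print(wd_kernel("gattaca", "gattaga", 3))
```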
For some kernels, investigate the importance of different motifs:
easysvm.py poim 10 6 wd 10 2 arff testmotif1.arff poims.png dna R
We also support the fasta format:
easysvm.py cv 5 10 wd 10 2 fasta testmotifpos.fasta testmotifneg.fasta cv_motif.txt dna R
easysvm.py eval cv_motif.txt fasta testmotifpos.fasta testmotifneg.fasta cv_motif_eval.txt roc roc_motif_cv.png
easysvm.py pred 10 wd 10 2 fasta testmotifpos.fasta testmotifneg.fasta testmotiftest.fasta pred_motif.txt dna R
easysvm.py poim 10 6 wd 10 2 fasta testmotifpos.fasta testmotifneg.fasta poims.png dna R
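In the FASTA commands, positive and negative examples come from separate files. As a rough illustration of how such input could be turned into labeled sequences, here is a minimal FASTA reader. The function read_fasta is a hypothetical helper, not the parser used by EasySVM, and it handles only the basic format (header lines starting with ">", sequence lines below).

```python
def read_fasta(lines):
    """Parse FASTA text given as an iterable of lines.

    Returns a list of (header, sequence) tuples. Hypothetical helper.
    """
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up example content; labels +1 / -1 mark the file each sequence came from.
pos = read_fasta([">p1", "acgattacag", ">p2", "ttgattacaa"])
neg = read_fasta([">n1", "acgtttttcg"])
examples = [(seq, 1) for _, seq in pos] + [(seq, -1) for _, seq in neg]
print(examples)
```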
License
All programs in this collection are free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.