Elysium Technologies Private Limited
ISO 9001:2008 A leading Research and Development Division
Madurai | Chennai | Kollam | Ramnad | Tuticorin | Singapore
#230, Church Road, Anna Nagar, Madurai 625 020, Tamil Nadu, India
Phone: +91 452-4390702, 4392702, 4390651
Website: www.elysiumtechnologies.com, www.elysiumtechnologies.info
Email: info@elysiumtechnologies.com
Abstract
Bioinformatics
2010 - 2011
01 Sparse Support Vector Machines with Lp Penalty for Biomarker Identification
The development of high-throughput technology has generated a massive amount of high-dimensional data, much of
which is discrete. Robust and efficient learning algorithms such as LASSO [1] are required for feature
selection and overfitting control. However, most feature selection algorithms are applicable only to continuous
data. In this paper, we propose a novel method for sparse support vector machines (SVMs) with Lp (p < 1)
regularization. Efficient algorithms (LpSVM) are developed for learning a classifier that is applicable to
high-dimensional data sets with both discrete and continuous data types. The regularization parameters are estimated
by maximizing the area under the ROC curve (AUC) on the cross-validation data. Experimental results on protein
sequence and SNP data attest to the accuracy, sparsity, and efficiency of the proposed algorithm. Biomarkers
identified with our methods are compared with those from other methods in the literature. A MATLAB software
package is available upon request.
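
The two ingredients named in this abstract, an Lp penalty with p < 1 and parameter selection by cross-validation AUC, can be illustrated with a minimal Python sketch. This is not the authors' LpSVM package: the function names (lp_penalty, auc, select_by_auc) and the toy numbers are hypothetical, and the cross-validation scores are assumed to have been computed elsewhere.

```python
def lp_penalty(weights, p):
    """Return sum_i |w_i|^p, the Lp penalty on a weight vector."""
    return sum(abs(w) ** p for w in weights)


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Labels equal to 1 are positives; ties between scores count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if sp > sn else 0.5 if sp == sn else 0.0
        for sp in pos
        for sn in neg
    )
    return wins / (len(pos) * len(neg))


def select_by_auc(candidates, cv_scores_for, labels):
    """Pick the candidate parameter whose cross-validation scores maximize AUC.

    cv_scores_for is a hypothetical callable mapping a candidate value to the
    classifier scores obtained on the held-out cross-validation examples.
    """
    return max(candidates, key=lambda c: auc(cv_scores_for(c), labels))


# Two weight vectors with the same L1 norm: the sparse one has the smaller
# Lp penalty when p < 1, which is why such penalties favor sparse solutions.
sparse_w = [1.0, 0.0, 0.0, 0.0]
dense_w = [0.25, 0.25, 0.25, 0.25]
print(lp_penalty(sparse_w, 0.5), lp_penalty(dense_w, 0.5))

# Toy cross-validation scores for two hypothetical regularization values.
labels = [1, 1, 0, 0]
cv_table = {0.1: [0.9, 0.2, 0.8, 0.1], 1.0: [0.9, 0.8, 0.3, 0.1]}
print(select_by_auc(cv_table.keys(), cv_table.get, labels))
```

The pairwise AUC above is quadratic in the number of examples, which is acceptable for a sketch; a rank-based computation would be the usual choice for large data sets.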
02 Sorting Genomes by Reciprocal Translocations, Insertions, and Deletions
The problem of sorting by reciprocal translocations (abbreviated as SBT) arises in comparative genomics: the
task is to find a shortest sequence of reciprocal translocations that transforms one genome into another,
with the restriction that the two genomes contain the same genes. SBT has been proved to be polynomial-time
solvable, and several po