"Parallelization of the Incremental Proximal Support Vector Machine Classifier using a Heap-based Tree Topology"
Authors: Amund Tveit and Håvard Engum
Abstract:
Support Vector Machines (SVMs) are an efficient data mining approach for classification, clustering and time series analysis. In recent years, a tremendous growth of data gathered has changed the focus of SVM classifier algorithms from providing accurate results to enabling incremental (and decremental) learning with new data (or unlearning old data) without the need for computationally costly retraining with the old data. In this paper we propose two efficient parallelized algorithms based on heaps of processing nodes for classification with the incremental proximal SVM introduced by Fung and Mangasarian.
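As an illustrative sketch only (not the authors' implementation), the idea of combining the incremental proximal SVM with a heap of processing nodes can be outlined in Python. It assumes the standard linear proximal SVM solution of Fung and Mangasarian, (I/nu + E'E)[w; gamma] = E'De with E = [A, -e], and relies on E'E and E'De being sums over data blocks. The function names, the array-backed heap layout (node i has children 2i+1 and 2i+2) and the sequential simulation of the nodes are assumptions made for this example; the paper's two algorithms may differ in detail.

```python
from typing import List, Sequence, Tuple

Stats = Tuple[List[List[float]], List[float]]  # (E^T E, E^T D e)


def local_stats(X: Sequence[Sequence[float]], y: Sequence[int]) -> Stats:
    """Compute E^T E and E^T D e for one data block, with E = [X, -1]."""
    n = len(X[0]) + 1
    ete = [[0.0] * n for _ in range(n)]
    ety = [0.0] * n
    for row, label in zip(X, y):
        e = list(row) + [-1.0]
        for i in range(n):
            ety[i] += e[i] * label
            for j in range(n):
                ete[i][j] += e[i] * e[j]
    return ete, ety


def merge(a: Stats, b: Stats) -> Stats:
    """Add two sets of statistics element-wise."""
    m = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    v = [p + q for p, q in zip(a[1], b[1])]
    return m, v


def heap_reduce(node_stats: Sequence[Stats]) -> Stats:
    """Sum statistics up an array-backed binary heap (hypothetical layout).

    Nodes are visited from the last index down to 1, so each node has
    already absorbed its children before it is merged into its parent.
    """
    acc = list(node_stats)
    for i in range(len(acc) - 1, 0, -1):
        parent = (i - 1) // 2
        acc[parent] = merge(acc[parent], acc[i])
    return acc[0]


def solve(m: List[List[float]], v: List[float]) -> List[float]:
    """Solve m x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [list(m[i]) + [v[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit(stats: Stats, nu: float = 1.0) -> Tuple[List[float], float]:
    """Solve (I/nu + E^T E) [w; gamma] = E^T D e and return (w, gamma)."""
    ete, ety = stats
    n = len(ety)
    m = [[ete[i][j] + (1.0 / nu if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    sol = solve(m, ety)
    return sol[:-1], sol[-1]


def predict(w: Sequence[float], gamma: float, x: Sequence[float]) -> int:
    """Classify x by the sign of x.w - gamma."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) - gamma >= 0 else -1
```

In this sketch only the small (n+1) x (n+1) sums travel up the tree, so adding a new data block means merging its statistics into the root and solving again, without revisiting the old data.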
Implementation of Algorithms:
Implementations are available on the SourceForge project page for Incridge - A Scalable Classification Tool.
Keywords:
Classification, Data Mining, Machine Learning and Parallel Algorithms
[PDF]
Curriculum:
This paper has been used as curriculum, recommended reading or related material for the following courses:
Professor Arvind Gupta: "CMPT 881 G1 - Introduction to Computational Biology", Simon Fraser University, Canada, Fall 2004
(Used in the list of possible project papers)
Known Citations:
Leon Bottou et al. "Large-Scale Kernel Machines", Book, ISBN 0262026252, MIT Press, USA, 2007
Hans P. Graf, Igor Durdanovic, Eric Cosatto and Vladimir Vapnik. "Spread kernel support vector machine", US Patent Application #20070094170, Class: 706015000, NEC Laboratories, USA, 2007
Jing Yang. "An Improved Cascade SVM Training Algorithm with Crossed Feedbacks", Proceedings of International Multi-Symposiums on Computer and Computational Sciences (IMSCCS 2006), Hangzhou, China, June 2006
M. R. Guarracino, C. Cifarelli, O. Seref, P. M. Pardalos. "A Parallel Classification Method for Genomic and Proteomic Problems", Proceedings of 20th International Conference on Advanced Information Networking and Applications (AINA 2006) Volume 2, IEEE Press, pp 588-592, Vienna, Austria, April 2006
Hans P. Graf, Eric Cosatto, Leon Bottou and Vladimir Vapnik. "Parallel Support Vector Method and Apparatus", US Patent Application #20060112026, NEC Laboratories, USA, 2006
Yoojin Chung, Sang-Young Cho and Sung Y. Shin. "Parallel Prediction of Protein-Protein Interactions Using Proximal SVM", Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing: 10th International Conference, RSFDGrC 2005, Lecture Notes in Computer Science (LNCS) 3642, Springer-Verlag, Regina, Canada, August 31 - September 3, 2005
Jian-Pei Zhang, Zhong-Wei Li and Jing Yang. "A Parallel SVM Training Algorithm on Large-Scale Classification Problems", Proceedings of the 4th International Conference on Machine Learning and Cybernetics, Guangzhou, China, August 2005
Shibin Qiu and Terran Lane. "Parallel Computation of RBF Kernels for Support Vector Classifiers", SIAM International Conference on Data Mining, Newport Beach, USA, April 2005
Hans Peter Graf, Eric Cosatto, Leon Bottou, Igor Durdanovic and Vladimir Vapnik. "Parallel Support Vector Machines: The Cascade SVM", Advances in Neural Information Processing Systems, Volume 17, MIT Press, 2005
Eray Ozkural. "Web-Scale Automatic Text Categorization: algorithmic metrics, support vector classification/clustering, parallel algorithms", PhD Proposal, Bilkent University, Turkey, 2005
Yoojin Chung, Sang-Young Cho and Chul-Hwan Kim. "Predicting Protein-Protein Interactions in Parallel", Proceedings of Workshop on State-of-the-Art in Scientific Computing, IMM-Technical Report-2005-09, Lyngby, Denmark, June 2004
Doina Caragea. "Learning classifiers from distributed, semantically heterogeneous, autonomous data sources", Doctor of Philosophy (PhD) Thesis, Iowa State University, USA, 2004
Shibin Qiu and Terran Lane. "Parallel Kernel Computation for High Dimensional Data and Its Application to fMRI Image Classification", University of New Mexico Technical Report TR-CS-2004-12, USA, December, 2003
Solomon Gibbs. "Data Parallelism and the Support Vector Machine", IPS Research Forum, Information Processing Systems (IPS) Laboratory, Department of Electrical Engineering, Ohio State University, Columbus, Ohio, USA, 2003
Kun Liu and Hillol Kargupta. "Distributed Data Mining Bibliography", Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, USA, 2003