Abstract of "Parallelization of the Incremental Proximal Support Vector Machine Classifier using a Heap-based Tree Topology"


"Parallelization of the Incremental Proximal Support Vector Machine Classifier using a Heap-based Tree Topology"

Authors: Amund Tveit and Håvard Engum

Support Vector Machines (SVMs) are an efficient data mining approach for classification, clustering and time series analysis. In recent years, tremendous growth in the amount of data gathered has shifted the focus of SVM classifier algorithms from providing accurate results to enabling incremental (and decremental) learning with new data (or unlearning of old data) without the need for computationally costly retraining on the old data. In this paper we propose two efficient parallelized algorithms, based on heaps of processing nodes, for classification with the incremental proximal SVM introduced by Fung and Mangasarian.
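
To make the idea concrete, here is a minimal sketch of why the proximal SVM lends itself to incremental and tree-structured computation. It is not the paper's two algorithms. It assumes the standard linear proximal SVM formulation, in which the classifier (w, gamma) solves (I/nu + E'E) z = E'De with E = [A, -e]; because E'E and E'De are sums over data rows, each node can compute them for its own partition and the results can be added up a heap-ordered binary tree. All function names (local_stats, merge, heap_reduce, fit, predict) are hypothetical, and the tree is simulated sequentially rather than run on separate processors.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Vector = List[float]
Stats = Tuple[Matrix, Vector]  # (E'E, E'De) for one partition


def local_stats(rows: Sequence[Sequence[float]], labels: Sequence[int]) -> Stats:
    """Accumulate E'E and E'De for one partition, where E = [A, -e]."""
    k = len(rows[0]) + 1
    ete = [[0.0] * k for _ in range(k)]
    etd = [0.0] * k
    for x, y in zip(rows, labels):
        e = list(x) + [-1.0]  # one row of E
        for i in range(k):
            etd[i] += y * e[i]
            for j in range(k):
                ete[i][j] += e[i] * e[j]
    return ete, etd


def merge(a: Stats, b: Stats) -> Stats:
    """Combine two partitions' statistics by plain addition."""
    ete = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    etd = [p + q for p, q in zip(a[1], b[1])]
    return ete, etd


def heap_reduce(stats: List[Stats]) -> Stats:
    """Sum statistics up an array-backed binary heap (parent of i is (i-1)//2).

    Walking indices from last to first guarantees every child has been merged
    before its parent is merged further up. In a parallel setting the nodes of
    one tree level could be merged concurrently.
    """
    nodes = list(stats)
    for i in range(len(nodes) - 1, 0, -1):
        parent = (i - 1) // 2
        nodes[parent] = merge(nodes[parent], nodes[i])
    return nodes[0]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit(partitions, nu: float = 1.0) -> Tuple[Vector, float]:
    """Train on a list of (rows, labels) partitions; returns (w, gamma)."""
    ete, etd = heap_reduce([local_stats(r, l) for r, l in partitions])
    for i in range(len(etd)):
        ete[i][i] += 1.0 / nu  # add the I/nu regularization term at the root
    *w, gamma = solve(ete, etd)
    return w, gamma


def predict(w: Vector, gamma: float, x: Sequence[float]) -> int:
    """Classify x by the sign of x'w - gamma."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) - gamma >= 0 else -1
```

Under these assumptions, new data can be learned by merging one more set of statistics into the root, without revisiting old rows; whether this matches the communication pattern of the paper's algorithms is not stated in the abstract.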

Implementation of Algorithms: Implementations can be found on the SourceForge project page for Incridge - A Scalable Classification Tool.

Keywords: Classification, Data Mining, Machine Learning and Parallel Algorithms


This paper has been used as curriculum, recommended reading or related material for the following courses:
  1. Professor Arvind Gupta: "CMPT 881 G1 - Introduction to Computational Biology", Simon Fraser University, Canada, Fall 2004
    (Used in the list of possible project papers)

Known Citations:
  1. K. Woodsend and J. Gondzio. "Hybrid MPI/OpenMP Parallel Linear Support Vector Machine Training", Journal of Machine Learning Research 10(Aug):1937--1953, 2009

  2. HH Ang, V. Gopalkrishnan and WK Ng. "Classification in P2P Networks by Bagging Cascade RSVMs", Very Large Databases (VLDB 2008) Workshop on Databases, Information Systems and Peer-to-peer Computing (DBISP2008), Auckland, New Zealand, 2008

  3. HH Ang, V. Gopalkrishnan, SCH Hoi and WK Ng. "Cascade RSVM in peer-to-peer networks", Proceedings of Principles and Practices of Knowledge Discovery in Databases (PKDD 2008), ECML/PKDD 2008, Lecture Notes in Computer Science (LNCS) 5211, Springer-Verlag, Antwerp, Belgium, 2008

  4. Leon Bottou. "Large-Scale Kernel Machines", Book, ISBN 0262026252, MIT Press, USA, 2007

  5. K. Woodsend and J. Gondzio. "Parallel Support Vector Machine Training with Nonlinear Kernels", Technical Report MS-07-007, School of Mathematics, The University of Edinburgh, November 2007

  6. Hans P. Graf, Igor Durdanovic, Eric Cosatto and Vladimir Vapnik. "Spread kernel support vector machine", US Patent Applications #20070094170, Class: 706015000, NEC Laboratories, USA, 2007

  7. Jing Yang. "An Improved Cascade SVM Training Algorithm with Crossed Feedbacks", Proceedings of International Multi-Symposiums on Computer and Computational Sciences (IMSCCS 2006), Hangzhou, China, June 2006

  8. M. R. Guarracino, C. Cifarelli, O. Seref and P. M. Pardalos. "A Parallel Classification Method for Genomic and Proteomic Problems", Proceedings of 20th International Conference on Advanced Information Networking and Applications (AINA 2006) Volume 2, IEEE Press, pp 588-592, Vienna, Austria, April 2006

  9. Hans P. Graf, Eric Cosatto, Leon Bottou and Vladimir Vapnik. "Parallel Support Vector Method and Apparatus", US Patent Applications #20060112026, NEC Laboratories, USA, 2006

  10. Yoojin Chung, Sang-Young Cho and Sung Y. Shin. "Parallel Prediction of Protein-Protein Interactions Using Proximal SVM", Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing: 10th International Conference, RSFDGrC 2005, Lecture Notes in Computer Science (LNCS) 3642, Springer-Verlag, Regina, Canada, August 31 - September 3, 2005

  11. Jian-Pei Zhang, Zhong-Wei Li and Jing Yang. "A Parallel SVM Training Algorithm on Large-Scale Classification Problems", Proceedings of the 4th International Conference on Machine Learning and Cybernetics, Guangzhou, China, August 2005

  12. Shibin Qiu and Terran Lane. "Parallel Computation of RBF Kernels for Support Vector Classifiers", SIAM International Conference on Data Mining, Newport Beach, USA, April 2005

  13. Hans Peter Graf, Eric Cosatto, Leon Bottou, Igor Durdanovic and Vladimir Vapnik. "Parallel Support Vector Machines: The Cascade SVM", Advances in Neural Information Processing Systems, Volume 17, MIT Press, 2005

  14. Eray Ozkural. "Web-Scale Automatic Text Categorization: algorithmic metrics, support vector classification/clustering, parallel algorithms", PhD Proposal, Bilkent University, Turkey, 2005

  15. Yoojin Chung, Sang-Young Cho and Chul-Hwan Kim. "Predicting Protein-Protein Interactions in Parallel", Proceedings of Workshop on State-of-the-Art in Scientific Computing, IMM-Technical Report-2005-09, Lyngby, Denmark, June 2004

  16. Doina Caragea. "Learning classifiers from distributed, semantically heterogeneous, autonomous data sources", Doctor of Philosophy (PhD) Thesis, Iowa State University, USA, 2004

  17. Shibin Qiu and Terran Lane. "Parallel Kernel Computation for High Dimensional Data and Its Application to fMRI Image Classification", University of New Mexico Technical Report TR-CS-2004-12, USA, December, 2003

  18. Solomon Gibbs. "Data Parallelism and the Support Vector Machine", IPS Research Forum, Information Processing Systems (IPS) Laboratory, Department of Electrical Engineering, Ohio State University, Columbus, Ohio, USA, 2003

  19. Kun Liu and Hillol Kargupta. "Distributed Data Mining Bibliography", Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, USA, 2003
