Multicategory Incremental Proximal Support
Vector Classifiers
Amund Tveit and Magnus Lie Hetland
Department of Computer and Information Science,
Norwegian University of Science and Technology,
N-7491 Trondheim, Norway
{amundt,mlh}@idi.ntnu.no
Abstract. Support Vector Machines (SVMs) are an efficient data mining approach for classification, clustering and time series analysis. In recent years, a tremendous growth in the amount of data gathered has changed the focus of SVM classifier algorithms from providing accurate results to enabling incremental (and decremental) learning with new data (or unlearning old data) without the need for computationally costly retraining with the old data. In this paper we propose an efficient algorithm for multicategory classification with the incremental proximal SVM introduced by Fung and Mangasarian.
1 Introduction
Support Vector Machines (SVMs) are an efficient data mining approach for classification, clustering and time series analysis [13]. In recent years, a tremendous growth in the amount of data gathered (for example, in e-commerce and intrusion detection systems) has changed the focus of SVM classifier algorithms from providing accurate results to enabling incremental (and decremental) learning with new data (or unlearning old data) without the need for computationally costly retraining with the old data. Fung and Mangasarian [4] introduced the Incremental and Decremental Linear Proximal Support Vector Machine (PSVM) for binary classification and showed that it could be trained extremely efficiently, with one billion examples (500 increments of two million examples) in two hours and twenty-six minutes on relatively low-end hardware (a 400 MHz Pentium II).
In this paper we propose an efficient memoization-based algorithm that extends the incremental PSVM to multicategory classification, as sketched below.
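To make the idea concrete, the following is a minimal sketch (not the authors' actual algorithm, whose details appear later in the paper) of how memoization can help in a one-vs-rest multicategory setting: the label-independent matrix E^T E accumulated over data increments can be computed once and shared by all binary subproblems, while only the label-dependent right-hand side differs per class. The function and parameter names (incremental_multicategory_psvm, nu, chunks) are illustrative assumptions, and the closed-form linear PSVM solution used here follows Fung and Mangasarian's formulation.

```python
import numpy as np

def incremental_multicategory_psvm(chunks, num_classes, nu=1.0):
    """Illustrative sketch: one-vs-rest linear PSVM trained incrementally.

    chunks: iterable of (A, y) pairs, where A is an (m_i, n) feature matrix
            and y holds integer class labels in {0, ..., num_classes - 1}.
    Returns one (w, gamma) pair per class.
    """
    M = None  # accumulated E^T E, label-independent and shared (memoized)
    v = None  # accumulated E^T d_c, one column per one-vs-rest problem

    for A, y in chunks:
        m, n = A.shape
        E = np.hstack([A, -np.ones((m, 1))])  # E = [A  -e]
        if M is None:
            M = np.zeros((n + 1, n + 1))
            v = np.zeros((n + 1, num_classes))
        M += E.T @ E                           # same term for every class
        for c in range(num_classes):
            d_c = np.where(y == c, 1.0, -1.0)  # one-vs-rest +/-1 labels
            v[:, c] += E.T @ d_c

    I = np.eye(M.shape[0])
    classifiers = []
    for c in range(num_classes):
        z = np.linalg.solve(I / nu + M, v[:, c])  # z = [w; gamma]
        classifiers.append((z[:-1], z[-1]))
    return classifiers

def predict(classifiers, X):
    """Assign each row of X to the class with the largest decision value."""
    scores = np.column_stack([X @ w - gamma for w, gamma in classifiers])
    return np.argmax(scores, axis=1)
```

Because new increments only add to M and v, old data never needs to be revisited, and the per-class work beyond the shared accumulation is limited to one small (n+1)-dimensional linear solve.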
2 Background Theory
The standard binary SVM classification problem with soft margin (allowing some
errors) is shown visually in Fig. 1(a) and as a constrained quadratic programming
problem in (1). Intuitively, the problem is to maximize the margin between the
solid planes and at the same time permit as few errors as possible, errors being
positive class points on the negative side (of the solid line) or vice versa.
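For reference, a standard statement of this soft-margin problem in the notation used by Fung and Mangasarian (A is the m x n matrix of training points, D the diagonal matrix of +/-1 labels, e a vector of ones, y the slack errors, and nu > 0 a trade-off parameter) is roughly the following; the exact form of (1) in the paper may differ slightly:

\[
\begin{aligned}
\min_{(w,\,\gamma,\,y)} \quad & \nu\, e^{\top} y + \tfrac{1}{2}\, w^{\top} w \\
\text{s.t.} \quad & D(Aw - e\gamma) + y \ge e, \qquad y \ge 0,
\end{aligned}
\]

where the separating plane is $x^{\top} w = \gamma$, the solid bounding planes are $x^{\top} w = \gamma \pm 1$, and the margin between them, $2/\|w\|$, is the quantity being maximized.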