I promised my readers over a month ago that I would post about YALE/RapidMiner’s LibSVM operator. Unfortunately, life got in the way, so I’m resorting to a multi-part series just to get the information out to you. Bear with me over the next few days (or weeks) as I write about this exciting, powerful, and complicated learner.
First off, I only occasionally use the LibSVM operator in YALE 3.4. I sometimes use it to fool around with data, and I rarely use it to build trading models. I prefer the Gaussian Regression, Multilayer Perceptron, or IBk learners for my time series modeling. You could use the LibSVM learner for time series data, but I have found it more useful for analyzing non-time series data.
What the heck is an SVM anyway? Wikipedia defines an SVM, or Support Vector Machine, as “a set of related supervised learning methods used for classification and regression.” Wikipedia goes on to give a decent overview of how SVMs work:
A special property of SVMs is that they simultaneously minimize the empirical classification error and maximize the geometric margin; hence they are also known as maximum margin classifiers.
Support vector machines map input vectors to a higher dimensional space where a maximal separating hyperplane is constructed. Two parallel hyperplanes are constructed on each side of the hyperplane that separates the data. The separating hyperplane is the hyperplane that maximizes the distance between the two parallel hyperplanes. An assumption is made that the larger the margin or distance between these parallel hyperplanes the better the generalisation error of the classifier will be.
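To make the margin idea in that passage a bit more concrete, here is a minimal sketch in plain Python. It assumes a linear separator written as w.x + b = 0, with the two parallel hyperplanes at w.x + b = +1 and w.x + b = -1, so the distance between them is 2 / ||w||. The toy data, the two hand-picked hyperplanes, and the helper names are all hypothetical, and this is not how LibSVM actually searches for the best hyperplane; it only compares two candidates that both separate the data.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def margin_width(w):
    """Distance between the parallel hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def separates(w, b, data):
    """True if every point lies on or outside the boundary hyperplane of its class."""
    return all(label * decision_value(w, b, x) >= 1.0 for x, label in data)

# Hypothetical toy data: (point, label) pairs with labels +1 / -1.
data = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

# Two hand-picked candidate hyperplanes (assumptions, not fitted by any solver).
narrow = ((1.0, 1.0), -2.0)
wide = ((0.5, 0.5), -1.0)

for name, (w, b) in (("narrow", narrow), ("wide", wide)):
    print(name, separates(w, b, data), round(margin_width(w), 3))
```

Both candidates classify the toy points correctly, but the second one leaves a wider gap between the classes, which is the one a maximum margin classifier would prefer.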
The LibSVM implementation was created by two researchers, Chih-Chung Chang and Chih-Jen Lin, at National Taiwan University. YALE/RapidMiner packages their LibSVM learner into the nice operator you see to the left of this paragraph.
What makes the LibSVM learner so appealing is that it can do five specialized tasks: two types of regression (epsilon-SVR, nu-SVR), two types of classification (C-SVC, nu-SVC), and something called one-class SVM.
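As a quick reference, the five types can be laid out as a small lookup table. The sketch below uses the numeric svm_type codes listed for the "-s" option of LIBSVM's own command-line tools, and the "distribution estimation" label for one-class SVM follows LIBSVM's description of itself. The dictionary and helper names are hypothetical, and I am not assuming the YALE/RapidMiner operator uses these same numbers in its parameter dialog.

```python
# svm_type codes as listed for the "-s" option of LIBSVM's command-line tools.
# The grouping into tasks mirrors the five types named above.
SVM_TYPES = {
    0: ("C-SVC", "classification"),
    1: ("nu-SVC", "classification"),
    2: ("one-class SVM", "distribution estimation"),
    3: ("epsilon-SVR", "regression"),
    4: ("nu-SVR", "regression"),
}

def types_for(task):
    """Return the SVM type names that handle the given task (hypothetical helper)."""
    return [name for name, kind in SVM_TYPES.values() if kind == task]

for task in ("classification", "regression", "distribution estimation"):
    print(task, "->", ", ".join(types_for(task)))
```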
I’ll go into greater detail about each type of specialized task in part II of my LibSVM tutorial. If you want to learn more before then, visit Chang and Lin’s website for more details.
For Time Series Applications
I wanted to share two research papers that are invaluable to anyone trying to use Support Vector Machines (SVMs) for modeling the stock market. One was written by a Korean researcher, and the other by an author well known to the Rapid-I team. I’ve used both of these papers as blueprints for some of my past stock market analysis processes.
The first one is by Kyoung-jae Kim and titled “Financial time series forecasting using support vector machines”.
The second is by Stefan Ruping (forgive the missing umlaut) and titled “SVM Kernels for Time Series Analysis”.