Is there a sign problem with the decision values when using libsvm’s SVM in MVPA-Light?

What’s wrong?

When using MVPA-Light (e.g. mv_classify) in combination with the SVM classifier as implemented in the libsvm library, the sign of the decision values may get flipped. This causes cross-validated accuracies (and other performance metrics) to be miscalculated if they are computed from the decision values (‘dval’) rather than from the predicted class labels directly.
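
To make the failure mode concrete, here is a toy sketch that needs neither MVPA-Light nor libsvm. It assumes the convention that a positive decision value indicates class 1 and a negative one class 2. The labels and decision values are made up for illustration, and the variable names are hypothetical.

```matlab
% Toy illustration with made-up numbers; no MVPA-Light or libsvm required.
% Assumed convention: dval > 0 means class 1, dval < 0 means class 2.
clabel = [1 1 1 1 1 2 2 2 2 2]';                   % true class labels
dval   = [2 1 0.5 -0.3 1.5 -2 -1 0.4 -0.5 -1.5]';  % decision values

% predicted labels, as a classifier would return them directly
predlabel = 2 - (dval > 0);                        % dval > 0 -> 1, otherwise -> 2

% accuracy computed from the dval under the assumed sign convention
acc_from_dval = @(d) (sum(d(clabel==1) > 0) + sum(d(clabel==2) < 0)) / numel(clabel);

acc_labels  = mean(predlabel == clabel);  % based on the predicted labels
acc_dval    = acc_from_dval(dval);        % based on the dval, correct sign
acc_flipped = acc_from_dval(-dval);       % based on the dval, sign flipped

fprintf('labels: %.2f, dval: %.2f, flipped dval: %.2f\n', acc_labels, acc_dval, acc_flipped);
% prints: labels: 0.80, dval: 0.80, flipped dval: 0.20
```

The label-based accuracy is not affected by the flip, which is why the problem only shows up when a metric is computed from the dval.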

How do I check whether my analyses are affected?

The following snippet of MATLAB code can be used to check whether your setup is affected by this bug.

%%

% set up the environment, only needed if you want to test from scratch,
% otherwise skip this section
tdir = tempdir;
cd(tdir);
system('git -c http.sslVerify=false clone https://github.com/treder/MVPA-Light'); % the -c is needed to allow the system call to work on our compute cluster
system('git -c http.sslVerify=false clone https://github.com/cjlin1/libsvm');

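addpath(fullfile(tdir, 'libsvm', 'matlab'));
cd(fullfile(tdir, 'libsvm', 'matlab')); % make.m refers to the source files with relative paths
make;
cd(tdir);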

addpath(fullfile(tdir, 'MVPA-Light', 'startup'));
startup_MVPA_Light;

%%

% from here on, it is assumed that MVPA-Light and libsvm are on your
% MATLAB path. To test your current installation, skip the section above.

% create some data
design = [ones(1,100) ones(1,100)+1];
dat    = randn(size(design)) + (design-1);

cfg = [];
cfg.classifier = 'libsvm';
cfg.repeat     = 1;
cfg.metric     = 'accuracy';
cfg.preprocess = {'zscore'};
cfg.feature_dimension = [];
cfg.k = 5;

rng(42);
res_svm1 = mv_classify(cfg, dat(:), design(:));

% does another classifier also work?
cfg.classifier = 'lda';

rng(42);
res_lda1 = mv_classify(cfg, dat(:), design(:));

% and what if you also want to know about the AUC?
% requesting the auc as well causes the lower-level code to use the dval,
% rather than the predicted class labels, for the accuracy estimate
cfg.metric = {'accuracy' 'auc'};

rng(42);
res_lda2 = mv_classify(cfg, dat(:), design(:));

cfg.classifier = 'libsvm';

rng(42); % same seed as above, so that the cross-validation folds are comparable
res_svm2 = mv_classify(cfg, dat(:), design(:));

% evaluate the performance metrics: if res_svm2 has values < 0.5, then
% the dval got flipped along the way, leading to a mismatch between the
% class labels and the assumed side of the data points relative to the
% decision hyperplane.
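
The final check can also be automated. The sketch below assumes that res_svm2 is the first output of mv_classify, i.e. a cell array with one entry per requested metric (or a plain number when only one metric is requested). The placeholder values are made up and are only there so that the snippet also runs on its own.

```matlab
% Sketch of an automated check. Assumes res_svm2 is the first output of
% mv_classify: a cell array with one entry per requested metric.
if ~exist('res_svm2', 'var')
  res_svm2 = {0.30, 0.25}; % made-up placeholder values so the snippet runs standalone
end

perf = res_svm2;
if ~iscell(perf), perf = {perf}; end % a single metric comes back as a plain number

% flag the result if any metric is below chance level
flipped = any(cellfun(@(p) any(p(:) < 0.5), perf));
if flipped
  disp('libsvm metrics are below 0.5: the dval is probably sign-flipped');
else
  disp('libsvm metrics are at or above 0.5: no sign flip detected');
end
```

With the simulated data above, the two classes differ by one standard deviation, so an unflipped classifier should score well above 0.5, and a value below 0.5 is unlikely to be a chance fluctuation.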

Do I need to do something about it?

If your installation of libsvm + MVPA-Light leads to accuracies (and AUCs) < 0.5, then you need to comment out a line of code in the test_libsvm.m file, which is in <your-distro-of-MVPA-Light>/models. Specifically, the line where the ‘dval’ is negated (dval = -dval) should be commented out.

Also, don’t forget to check out the associated link to the libsvm docs, which is probably related to the reason why the sign flip was implemented in the first place. However, that documentation is somewhat ambiguous, and there is no explicit record of whether, for a well-defined combination of code versions, the flip is no longer needed (in other words, whether this is a general bug in the current version of MVPA-Light that should be fixed). We have reached out to the MVPA-Light developers. Stay tuned.
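
To find the line in question, something along these lines may help. It is a minimal sketch that only uses standard MATLAB functions. It assumes that the negation is written on a line of its own, starting with dval = -dval as described above. It only reports the location and does not modify the file.

```matlab
% Locate the sign flip in test_libsvm.m (read-only, nothing is changed).
fname = which('test_libsvm'); % empty if MVPA-Light is not on the MATLAB path
if isempty(fname)
  fprintf('test_libsvm.m was not found on the MATLAB path\n');
else
  lines = regexp(fileread(fname), '\r?\n', 'split');
  % match uncommented lines that start with "dval = -dval" (assumed spelling)
  hit = find(~cellfun(@isempty, regexp(lines, '^\s*dval\s*=\s*-\s*dval', 'once')));
  if isempty(hit)
    fprintf('no active "dval = -dval" line found in %s\n', fname);
  end
  for i = hit
    fprintf('%s, line %d: %s\n', fname, i, strtrim(lines{i}));
  end
end
```

If a line is reported, open the file (e.g. with edit test_libsvm), put a % in front of that line, and rerun the check from the previous section.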