I had some trouble finding a good transductive SVM (semi-supervised support vector machine, or S3VM) implementation for Python. Eventually I found the implementation by Fabian Gieseke of Oldenburg University, Germany (code is here: https://www.ci.uni-oldenburg.de/60506.html, paper title: Fast and Simple Gradient-Based Optimization for Semi-Supervised Support Vector Machines).
I am now trying to integrate the learned model into my scikit-learn code.
1) This already works: I have a binary classification problem. I defined a new method inside the S3VM code that returns the self.__c coefficients (these are needed for the decision function of the classifier). In my own scikit-learn code, where clf is an svm.SVC classifier, I then assign these to clf.dual_coef_ and update clf.support_ accordingly (it holds the indices of the support vectors). It takes a while to get right because some places expect numpy arrays and others expect lists, but it works quite well.
2) This doesn't work at all: I now want to adapt this to a multi-class classification problem (again with an svm.SVC classifier). I have seen the scheme for the multi-class dual_coef_ in the docs at http://scikit-learn.org/stable/modules/svm.html and have tried a few things already, but I keep getting it wrong. My strategy is as follows:
for each pair of classes:
    compute the coefficients with qns3vm on the labeled training set binarized for that pair, writing 0 into the coefficient vector at the positions of labeled instances that belong to neither class of the current pair --> this gives a 1x(l+u) np.array of coefficients (l labeled plus u unlabeled instances)
vertically stack these rows to get a (n_class*(n_class-1)/2)x(l+u) matrix | I have no idea why the docs say this should have shape [n_class-1, n_SV] (with n_SV = l+u)
replace clf.dual_coef_ with this matrix
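My best guess at what the docs mean, which I have not been able to confirm against my own model: each support vector belongs to exactly one class, so it can only have a nonzero coefficient in the n_class-1 pairs that involve its own class. One column of n_class-1 entries per support vector is therefore enough, and for a pair (i, j) with i < j the support vectors of class i use row j-1 while those of class j use row i. Below is a minimal numpy sketch of that packing. The function names, the pair_coef dictionary and the toy numbers are my own inventions for illustration and are not part of scikit-learn or qns3vm. It also assumes that every support vector has a class index and that the columns are in the same order as the support vectors (grouped by class), which the unlabeled instances would first have to get from somewhere.

```python
import numpy as np


def pack_ovo_dual_coef(pair_coef, sv_class, n_classes):
    """Pack per-pair coefficients into an (n_classes - 1, n_SV) array.

    pair_coef: dict mapping (i, j) with i < j to a length-n_SV vector of
        signed coefficients for the binary problem "class i vs. class j".
    sv_class:  length-n_SV integer array, class index of each support vector.
    (Both names are hypothetical helpers, not library API.)
    """
    sv_class = np.asarray(sv_class)
    packed = np.zeros((n_classes - 1, sv_class.shape[0]))
    for (i, j), coef in pair_coef.items():
        coef = np.asarray(coef, dtype=float)
        in_i = sv_class == i
        in_j = sv_class == j
        # Support vectors of class i store their (i, j) coefficient in row j - 1 ...
        packed[j - 1, in_i] = coef[in_i]
        # ... and support vectors of class j store theirs in row i.
        packed[i, in_j] = coef[in_j]
    return packed


def pair_coef_from_packed(packed, sv_class, i, j):
    """Recover the length-n_SV coefficient vector of the pair (i, j), i < j."""
    sv_class = np.asarray(sv_class)
    coef = np.zeros(packed.shape[1])
    in_i = sv_class == i
    in_j = sv_class == j
    coef[in_i] = packed[j - 1, in_i]
    coef[in_j] = packed[i, in_j]
    return coef


# Toy example: 3 classes, 5 support vectors grouped by class, made-up numbers.
sv_class = np.array([0, 0, 1, 2, 2])
pair_coef = {
    (0, 1): np.array([0.5, 0.2, -0.7, 0.0, 0.0]),
    (0, 2): np.array([0.3, 0.0, 0.0, -0.1, -0.2]),
    (1, 2): np.array([0.0, 0.0, 0.4, -0.4, 0.0]),
}
packed = pack_ovo_dual_coef(pair_coef, sv_class, n_classes=3)
print(packed.shape)  # (2, 5)
```

If this reading is right, the (n_class*(n_class-1)/2)x(l+u) matrix from my strategy holds the same numbers as the packed array, only with the structural zeros left in. The sign convention of the coefficients may still differ between qns3vm and what SVC stores, so that would need checking separately.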
Does anybody know the right way to replace dual_coef_ in the multi-class setting? Or is there a neat piece of example code someone can recommend? Or at least a better explanation of the shape of dual_coef_ in the one-vs-one multi-class setting?
Thanks!
Damian