Friday, April 26, 2013

[Solved] R SVM test data does not match model

Hi,

Here is my solution to the error "test data does not match model !". It occurs when you try to predict on test data with an SVM model from e1071, like below:
predict(mySVMmodel, type="class", testset)
I found a hint here http://r.789695.n4.nabble.com/Levels-in-new-data-fed-to-SVM-td4654969.html , but it wasn't exactly my case. I lost a few hours, but I have a solution now.

You have to set the factor levels of ALL your columns to be exactly the same as in the training data, not only those of the class column.
So you can use something like this:
(edit: thanks to Ting Chi)

testset$foocolname <- factor(
    testset$foocolname, levels = levels(trainset$foocolname)
)
testset$goocol <- factor(
    testset$goocol, levels = levels(trainset$goocol)
)
etc...
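
With many factor columns, repeating the assignment by hand gets tedious, so the same idea can be put in a loop. This is only a minimal sketch: the toy data frames, their column names and the helper align_factor_levels() are made up for illustration and are not part of e1071. It assumes the factor columns have the same names in both data frames.

```r
# Hypothetical toy data; the column names are invented for this example.
trainset <- data.frame(
  colour = factor(c("red", "green", "blue", "red")),
  size   = factor(c("S", "M", "L", "M")),
  weight = c(1.2, 3.4, 2.2, 0.9)
)
testset <- data.frame(
  colour = factor(c("red", "blue")),  # "green" never occurs in the test data
  size   = factor(c("M", "M")),       # only one of the three sizes occurs
  weight = c(2.0, 1.1)
)

# Hypothetical helper: give every factor column of `test` the levels
# of the column with the same name in `train`.
align_factor_levels <- function(test, train) {
  factor_cols <- names(train)[sapply(train, is.factor)]
  for (col in intersect(factor_cols, names(test))) {
    test[[col]] <- factor(test[[col]], levels = levels(train[[col]]))
  }
  test
}

aligned <- align_factor_levels(testset, trainset)

# Values that never occurred in the training data become NA,
# so it is worth counting them per column before calling predict().
na_per_column <- colSums(is.na(aligned))
print(na_per_column)
```

After this, predict(mySVMmodel, aligned) would see the same levels as the training data. Rows with NA in a factor column still need separate handling, because those values were unknown to the model.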
If it helps, let me know :)

Edit: some tips
  • The error "length of 'center' must equal the number of columns of 'x'" might be connected with the same factor levels problem. I don't know exactly why, but the fix from this post solved that error for me too.
  • When you assign factor levels you might get the error "number of levels differs". It means that the column on the left-hand side has more factor levels than the column on the right-hand side, so the assignment is probably not what you want.
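
The second tip is easier to see on a small example. The sketch below uses made-up vectors (train_col, test_col, x) and compares assigning with levels(...) <- ... against rebuilding the column with factor(..., levels = ...). Direct assignment relabels levels by position, while factor() matches them by value, which is why the code earlier in this post uses factor().

```r
train_col <- factor(c("a", "b", "a"))  # levels: a, b
test_col  <- factor(c("a", "b", "c"))  # levels: a, b, c

# Assigning fewer levels than the factor already has raises an error.
res <- tryCatch(
  {
    levels(test_col) <- levels(train_col)
    "ok"
  },
  error = function(e) conditionMessage(e)
)
print(res)

# Rebuilding the factor matches by value; the unseen "c" becomes NA.
fixed <- factor(test_col, levels = levels(train_col))
print(fixed)

# With an equal number of levels the assignment succeeds, but it renames
# the levels by position: "b" turns into "a" and "c" turns into "b".
x <- factor(c("b", "c"))
relabelled <- x
levels(relabelled) <- levels(train_col)
print(relabelled)
```

So even when the direct assignment does not fail, it can silently change what the values mean. Values that turn into NA with factor() at least point to categories the model has never seen.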