Here is my solution to the error "test data does not match model !". It occurs when you try to predict test data with an SVM model from e1071, like below:
predict(mySVMmodel, type="class", testset)
I found a hint here http://r.789695.n4.nabble.com/Levels-in-new-data-fed-to-SVM-td4654969.html , but it wasn't exactly my case. I lost a few hours, but I have a solution now.
You have to set the factor levels of ALL your columns to be exactly the same as in the training data, not only those of the class column.
So you can use something like:
(edit: thanks to Ting Chi)
testset$foocolname <- factor(
testset$foocolname,levels = levels(trainset$foocolname)
)
testset$goocol <- factor(
testset$goocol,levels = levels(trainset$goocol)
)
etc...
If it helps, let me know :)
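With many factor columns, repeating the assignment by hand gets tedious. Below is a minimal sketch of a loop that does the same thing for every factor column. The helper name `align_factor_levels` and the toy data frames are assumptions made for illustration, not part of e1071, and ordered factors are not handled.

```r
# Hypothetical helper: give every factor column of `test` the levels of the
# same-named column in `train`.
align_factor_levels <- function(test, train) {
  for (col in intersect(names(test), names(train))) {
    if (is.factor(train[[col]])) {
      # as.character() keeps the original values; values that never occurred
      # in the training column become NA
      test[[col]] <- factor(as.character(test[[col]]),
                            levels = levels(train[[col]]))
    }
  }
  test
}

# Toy data, for illustration only
trainset <- data.frame(
  color = factor(c("red", "green", "blue")),
  size  = factor(c("S", "M", "L")),
  x     = c(1, 2, 3)
)
testset <- data.frame(
  color = factor(c("red", "red")),
  size  = factor(c("L", "S")),
  x     = c(4, 5)
)

testset <- align_factor_levels(testset, trainset)
identical(levels(testset$color), levels(trainset$color))
```

Non-factor columns are left untouched, so numeric predictors pass through unchanged.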
Edit: some tips
- The error "length of 'center' must equal the number of columns of 'x'" might be connected with the factor levels problem. I don't know why, but using the tips from this post solved that error for me too.
- When you assign factor levels you might get the error "number of levels differs". It means that the left-side column contains more factor levels than the right-side column, so your approach is probably wrong.
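If a test column has more levels than the training column, it can help to see which values are the extras before aligning anything. A small sketch on made-up vectors (the names `train_col` and `test_col` are illustrative):

```r
# Toy columns, for illustration only
train_col <- factor(c("a", "b", "a"))
test_col  <- factor(c("a", "c", "b"))

# Values that occur in the test column but are unknown to the training column
unseen <- setdiff(unique(as.character(test_col)), levels(train_col))
unseen

# Rebuilding the factor with the training levels turns those values into NA
aligned <- factor(as.character(test_col), levels = levels(train_col))
sum(is.na(aligned))
```

Rows that end up as NA need a decision of their own (drop them, or retrain with data that covers those values); the sketch only makes them visible.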
Hi,
I just tried your solution. But the function levels() somehow changed my
original data. For example:
> test_ke = read.csv(file.choose(),encoding = "UTF-8")
> test_ke$label[1:10]
[1] null null null null null null null null null null
Levels: attitude_predicate background_pointer null prejacent source_pointer
> levels(test_ke$label) <- levels(sample_200$label)
> test_ke$label[1:10]
[1] degree_indicator degree_indicator degree_indicator degree_indicator
degree_indicator degree_indicator degree_indicator degree_indicator
degree_indicator degree_indicator
Levels: attitude_predicate background_pointer degree_indicator
negative_marker null prejacent source_pointer
As shown in the example, all the labels in test_ke[1:10,] are "null", but
after I changed the levels, the original labels changed to other values.
Do you have any idea about why this happened?
Best,
Ting
Ting Chi, thanks for your post. It really was an error. Please check whether the new solution solves your problem and let me know here. It was all connected with reordering levels.
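A small sketch of what probably happened in the transcript above, using made-up values: assigning with `levels(x) <- ...` renames the existing levels by position, while `factor(x, levels = ...)` keeps each value and only changes the set of allowed levels.

```r
x <- factor(c("null", "null", "prejacent"))
new_levels <- c("attitude_predicate", "null", "prejacent")

# Assignment renames by position: the 1st old level ("null") takes the name
# of the 1st new level, the 2nd old level takes the name of the 2nd, etc.
relabelled <- x
levels(relabelled) <- new_levels
as.character(relabelled)

# factor(..., levels = ) matches by value, so the data stay the same
recoded <- factor(x, levels = new_levels)
as.character(recoded)
```

The first call turns every "null" into "attitude_predicate"; the second leaves the labels as they were.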
I was able to quickly resolve this issue on a large dataframe with many factors using the following:
testData$ManualIsEvil <- "test flag"
combined <- rbind(trainData, testData)
testData <- combined[combined$ManualIsEvil == "test flag" ,]
trainData <- combined[combined$ManualIsEvil != "test flag" ,]
When I used rbind to combine the test and training datasets, all of the factors were fixed. In my case I just used the training class variable to create a flag on all the test records, so I could easily separate them back out.
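A self-contained variant of the rbind idea on toy data is sketched below. It differs from the snippet above in one assumed detail: it uses a separate logical marker column (here called `is_test`, a made-up name) instead of overwriting the class variable, so the class levels are not polluted by a flag value.

```r
# Toy data, for illustration only
trainData <- data.frame(
  color = factor(c("red", "green", "blue")),
  label = factor(c("yes", "no", "yes"))
)
testData <- data.frame(
  color = factor(c("red", "green")),
  label = factor(c("no", "yes"))
)

# Hypothetical marker column, used only to split the rows again
trainData$is_test <- FALSE
testData$is_test  <- TRUE

# rbind() on data frames merges the levels of each factor column
combined <- rbind(trainData, testData)

keep <- names(combined) != "is_test"
testData  <- combined[combined$is_test,  keep]
trainData <- combined[!combined$is_test, keep]

identical(levels(testData$color), levels(trainData$color))
```

Note that the merged levels are the union of both sets, so this sketch assumes the model is fitted after the split. If the test data contain values the training data never had, a model fitted earlier would still see a mismatch.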
Thanks this worked for me.
me too, thank you so much !! :)
I had the same problem. I changed the parameters and added decision.values=TRUE after test_data ... It works for me
predicted_svm <- predict(model_svm, test_data, decision.values=TRUE, probability=TRUE)
Przemek, you just saved me an insane amount of time with this post. Thank you so much!!!! I was wondering why I kept throwing the "test data does not match model!" error.