I have a question about MLlib in Spark (using Scala).
I'm trying to understand how LogisticRegressionWithLBFGS and LogisticRegressionWithSGD work. I usually use SAS or R to run logistic regressions, but now I have to do it in Spark to analyze big data.
How is variable selection done? Does LogisticRegressionWithLBFGS or LogisticRegressionWithSGD try different combinations of variables? Is there something like a significance test for each variable one by one, or a correlation calculation against the variable of interest? Is AIC or BIC computed to choose the best model?
I ask because the model only returns weights and an intercept...
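For context, here is a minimal sketch of what I'm running (toy data, not my real dataset; the app name and feature values are made up):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

object LrExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("lr-example").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Toy training set: label (0/1) plus two numeric features
    val training = sc.parallelize(Seq(
      LabeledPoint(1.0, Vectors.dense(2.0, 1.0)),
      LabeledPoint(0.0, Vectors.dense(0.5, -1.0)),
      LabeledPoint(1.0, Vectors.dense(1.5, 0.5)),
      LabeledPoint(0.0, Vectors.dense(0.2, -0.3))
    ))

    val model = new LogisticRegressionWithLBFGS()
      .setNumClasses(2)
      .run(training)

    // All I get back are the fitted coefficients -- no p-values, no AIC/BIC
    println(model.weights)
    println(model.intercept)

    sc.stop()
  }
}
```

Compared to the output of PROC LOGISTIC in SAS or summary(glm(...)) in R, this gives me no model diagnostics at all.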
How can I understand these Spark functions and relate them to what I'm used to in SAS or R?