
Key Steps :

  1. Data Import - Did not use R for this step; SPSS was used to read the data
  2. Feature Selection - Used R in this step
  3. Data Cleaning - Treatment of categorical variables was a problem

Software used : SAS + R

Techniques used :  Gradient boosting machine (gbm package)

Rationale :

  • Handling of missing values
  • Robustness against extreme values
  • Handling categorical and continuous variables
  • Models interactions between predictors
  • Can model nonlinear dependencies
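
To make the last point concrete, here is a minimal from-scratch sketch of gradient boosting for squared-error loss, written in plain Python. It is not the gbm package and not the presenter's model: the function names (fit_stump, fit_gbm, predict_gbm, mse), the settings (200 trees, shrinkage 0.1) and the synthetic target y = x^2 are all assumptions chosen for illustration. Each round fits a one-split tree (a stump) to the current residuals and adds a shrunken copy of it to the ensemble, which is how a sum of simple trees can follow a nonlinear curve.

```python
# Minimal gradient boosting for squared-error loss with depth-1 trees (stumps).
# Illustrative only: not the gbm package; all names and data here are hypothetical.

def fit_stump(xs, residuals):
    """Return the (feature, threshold, left_mean, right_mean) split with lowest squared error."""
    best = None
    for j in range(len(xs[0])):
        values = sorted(set(row[j] for row in xs))
        # Candidate thresholds are midpoints between consecutive distinct values,
        # so both sides of every split are non-empty.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(xs, residuals) if row[j] <= t]
            right = [r for row, r in zip(xs, residuals) if row[j] > t]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lmean, rmean)
    return best[1:]


def stump_predict(stump, row):
    j, t, lmean, rmean = stump
    return lmean if row[j] <= t else rmean


def fit_gbm(xs, ys, n_trees=200, shrinkage=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    f0 = sum(ys) / len(ys)
    preds = [f0] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + shrinkage * stump_predict(stump, row)
                 for p, row in zip(preds, xs)]
    return f0, shrinkage, stumps


def predict_gbm(model, row):
    f0, shrinkage, stumps = model
    return f0 + shrinkage * sum(stump_predict(s, row) for s in stumps)


def mse(model, xs, ys):
    return sum((predict_gbm(model, r) - y) ** 2 for r, y in zip(xs, ys)) / len(ys)


# Synthetic nonlinear target: y = x^2 on a grid over [-2, 2].
xs = [[i / 10.0] for i in range(-20, 21)]
ys = [row[0] ** 2 for row in xs]

baseline = fit_gbm(xs, ys, n_trees=0)   # predicts the mean everywhere
model = fit_gbm(xs, ys, n_trees=200, shrinkage=0.1)

print("training MSE, mean only:", mse(baseline, xs, ys))
print("training MSE, boosted  :", mse(model, xs, ys))
```

Stumps split on one variable at a time, so this sketch captures nonlinearity but not interactions between predictors; modelling interactions requires deeper trees, which is what gbm's interaction.depth setting controls. The sketch also leaves out missing-value and categorical handling.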

Fitting Time :  A couple of hours on a desktop