A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
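As a rough illustration of what measuring speed and accuracy can look like, the sketch below times a training call and scores held-out predictions at two data sizes. It is not this benchmark's actual code: AUC as the accuracy metric, the synthetic data from `make_data`, and the `fit_stub`/`predict_stub` stand-in model are all assumptions made for illustration. A real run would plug in one of the implementations mentioned above.

```python
import random
import time


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum formulation; ties share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def make_data(n, seed):
    """Hypothetical synthetic data: one feature, shifted by class."""
    rng = random.Random(seed)
    y = [rng.randint(0, 1) for _ in range(n)]
    X = [rng.gauss(1.0 if label else -1.0, 1.0) for label in y]
    return X, y


def fit_stub(X, y):
    """Stand-in 'model': difference of class means, used as a single weight."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def predict_stub(w, X):
    """Score each row; higher means more likely positive."""
    return [w * x for x in X]


def benchmark(fit, predict, train, test):
    """Return (training seconds, test AUC) for one model on one train/test split."""
    X_train, y_train = train
    X_test, y_test = test
    start = time.perf_counter()
    model = fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    return elapsed, auc(y_test, predict(model, X_test))


if __name__ == "__main__":
    test = make_data(5_000, seed=1)
    for n in (1_000, 100_000):  # growing training sizes to probe scalability
        secs, score = benchmark(fit_stub, predict_stub, make_data(n, seed=0), test)
        print(f"n={n}: train {secs:.4f}s, AUC {score:.3f}")
```

Swapping `fit_stub` and `predict_stub` for wrappers around a real library keeps the timing and scoring identical across implementations, which is the point of a comparison like this.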
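The algorithms studied are random forests, gradient boosted trees and deep neural networks (among others), in various commonly used open source implementations like R packages, Python scikit-learn, H2O, xgboost and Spark MLlib.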
In summary, we are focusing on which algos/implementations can be used to train relatively accurate binary classif…