Big-Data ML Automation
From variable selection to model comparison — automated end-to-end.
The AutoGluon-based engine decides automatically between classification and regression, picks a matching evaluation metric, and trains the models. Feature importance, the confusion matrix, and the ROC curve are shown in one place.
At a glance
Common use
Prediction prototypes · Feature importance · Baseline benchmarks
Outcome
One target column: categorical (classification) or numeric (regression)
Engine
AutoGluon (lightweight ensemble of LightGBM and linear models)
Metrics
Classification: accuracy · precision · recall · f1 · roc_auc / Regression: mse · mae · rmse · mape · r²
Input data
CSV / XLSX, rows = samples, columns = variables
Plan
PREMIUM plan and above
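
For readers who want to check the numbers by hand, the metrics listed above can be sketched with the Python standard library alone. This is a minimal illustration of the standard definitions, not the engine's own code; the function names are hypothetical, and skipping zero-valued targets in MAPE is an assumption of this sketch.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return mse, mae, rmse, mape and r2 for two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when a true value is 0; this sketch skips those rows.
    pct = [abs(e / t) for e, t in zip(errors, y_true) if t != 0]
    mape = sum(pct) / len(pct) if pct else float("nan")
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - (mse * n) / ss_tot if ss_tot else float("nan")
    return {"mse": mse, "mae": mae, "rmse": math.sqrt(mse), "mape": mape, "r2": r2}

def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and f1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}
```

roc_auc needs predicted scores rather than hard labels, so it is left out of this sketch.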
Workflow
- 1. Variable cleanup + missing-value imputation + encoding
- 2. Numeric variable distribution + scatter plot (EDA)
- 3. Outlier detection + user-tuned removal threshold (Z-score / IQR)
- 4. Multicollinearity removal via correlation + VIF thresholds
- 5. Scaler selection (Standard / MinMax / None)
- 6. AutoGluon auto-training (problem type · eval metric auto-decided)
- 7. Leaderboard + feature importance + confusion matrix / ROC
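
The scaler choice in the workflow (Standard / MinMax / None) can be illustrated with a small sketch. It assumes the usual textbook definitions, z-scoring with the population standard deviation and rescaling to [0, 1]; the function name `scale_column` and the handling of constant columns are hypothetical choices, not taken from the product.

```python
import statistics

def scale_column(values, method="standard"):
    """Scale one numeric column with the chosen method."""
    if method == "none":
        return list(values)
    if method == "standard":
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        if std == 0:  # constant column: nothing to scale
            return [0.0 for _ in values]
        return [(v - mean) / std for v in values]
    if method == "minmax":
        lo, hi = min(values), max(values)
        if hi == lo:  # constant column: map everything to 0
            return [0.0 for _ in values]
        return [(v - lo) / (hi - lo) for v in values]
    raise ValueError(f"unknown scaling method: {method!r}")
```

Tree-based models such as LightGBM are largely insensitive to monotonic scaling, while linear models usually benefit from it, which is one reason a "None" option can be reasonable.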
Supported analyses
Variable EDA
Numeric distributions, scatter plots, and normality checks
Outlier detection + removal
Z-score / IQR visualisation with a user-adjustable removal threshold
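
To show why the threshold matters, here is a minimal sketch of both detection rules using only the standard library. The function names and default thresholds (3.0 for Z-score, 1.5 for the IQR multiplier) are common conventions assumed here, not values confirmed by the product.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Indices of values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]

def iqr_outliers(values, k=1.5):
    """Indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lo or v > hi]
```

In a small sample, one extreme value inflates the standard deviation enough to hide itself from a strict Z-score cut-off. For `[10, 12, 11, 13, 12, 11, 10, 12, 95]` the value 95 has a Z-score of about 2.83, so it is missed at a threshold of 3.0 but caught at 2.5, while the IQR rule flags it with the default multiplier.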
Multicollinearity removal
Auto-remove redundant variables via correlation + VIF
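
One way such a VIF filter could work is sketched below. It relies on the identity that the VIF of each variable equals the corresponding diagonal entry of the inverse correlation matrix, and it drops the worst variable repeatedly until every VIF is at or below the threshold. The helper names, the threshold of 10, and the drop-one-at-a-time strategy are assumptions for illustration; the product's actual rule is not documented here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF per column: the diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}

def drop_high_vif(columns, threshold=10.0):
    """Drop the highest-VIF column until all VIFs are <= threshold."""
    cols = dict(columns)
    while len(cols) > 1:
        scores = vif(cols)
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del cols[worst]
    return list(cols)
```

Perfectly collinear columns make the correlation matrix singular, so this sketch raises an error in that case rather than guessing which column to keep.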
AutoGluon auto-training
Lightweight ensemble (LightGBM + linear) trained in parallel, leaderboard comparison
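
A training call of this kind might look roughly like the sketch below, which uses AutoGluon's public `TabularPredictor` API. The preset name `LIGHTWEIGHT_MODELS`, the function `train_lightweight`, and the 300-second time limit are hypothetical; the model keys `"GBM"` (LightGBM) and `"LR"` (linear model) are AutoGluon's own identifiers. The product's real configuration is not published, so treat this as an approximation.

```python
# Hypothetical preset: restrict AutoGluon to LightGBM ("GBM") and linear ("LR") models.
LIGHTWEIGHT_MODELS = {"GBM": {}, "LR": {}}

def train_lightweight(train_df, label, time_limit=300):
    """Fit AutoGluon on a pandas DataFrame and return (predictor, leaderboard)."""
    # Imported lazily so the module loads without AutoGluon installed
    # (requires `pip install autogluon`).
    from autogluon.tabular import TabularPredictor

    # problem_type and eval_metric are left unset, so AutoGluon infers
    # them from the label column.
    predictor = TabularPredictor(label=label)
    predictor.fit(train_df, hyperparameters=LIGHTWEIGHT_MODELS, time_limit=time_limit)
    return predictor, predictor.leaderboard()
```

Leaving `problem_type` and `eval_metric` unset is what lets the library choose between classification and regression on its own; passing either argument explicitly overrides that inference.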
Feature importance
Per-variable contribution, quantified and visualised
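
A common way to quantify per-variable contribution is permutation importance: shuffle one column, re-score the model, and record how much the score drops. The sketch below illustrates the idea in plain Python. Whether the product uses exactly this method is an assumption, and the function name, row format (a list of dicts), and repeat count are hypothetical.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, rows, targets, score, n_repeats=5, seed=0):
    """Mean score drop per feature after shuffling that feature's values."""
    rng = random.Random(seed)
    baseline = score(targets, [predict(r) for r in rows])
    importances = {}
    for feature in rows[0]:
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[feature] for r in rows]
            rng.shuffle(shuffled)
            # Copy each row with only this feature replaced by a shuffled value.
            permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
            drops.append(baseline - score(targets, [predict(r) for r in permuted]))
        importances[feature] = sum(drops) / n_repeats
    return importances
```

A feature the model never reads gets an importance of exactly zero, because shuffling it cannot change any prediction. Strongly correlated features can share or mask each other's importance, so the ranking is best read alongside the multicollinearity step.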
Performance diagnostics
Confusion matrix · ROC curve · residual plot auto-generated
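
The two classification diagnostics can be reproduced by hand with a short sketch. It assumes binary labels coded as 0 and 1 for the ROC curve and uses the trapezoidal rule for the area under it; the function names are hypothetical and the plotting step is omitted.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def roc_points(y_true, scores):
    """(false positive rate, true positive rate) points, one per distinct score."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    pairs = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        current = pairs[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == current:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under a piecewise-linear curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

The resulting area equals the fraction of positive/negative pairs in which the positive sample receives the higher score, which gives a quick sanity check on small examples.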
Use cases
Customer churn prediction
Predict churn from behaviour + payment patterns, surface the 5 most influential variables.
House-price regression
Regress price on region · area · period, compare via MAE / RMSE.
Fraud detection
Learn fraud patterns across many transaction variables, auto-prioritise recall.
What you get
- Per-model leaderboard (classification: accuracy · F1 · AUC / regression: MAE · RMSE · R²)
- Feature importance chart + table
- Confusion matrix · ROC curve · residual plot
- Best-model download (.pkl)
- Auto-generated paper (preprocessing → modeling → results)