Tech Specs

| Model | Parameter | `fast` Mode | `industrial` Mode |
| --- | --- | --- | --- |
| Logistic | `clf__C` | [0.01, 0.1, 1, 10] | [0.01, 0.1, 1, 10] |
| | `clf__penalty` | ['l1', 'l2'] | ['l1', 'l2', 'elasticnet'] |
| | `clf__l1_ratio` | Not used | [0.1, 0.5, 0.9] |
| RandomForest | `clf__n_estimators` | [100, 200] | [300, 500, 1000] |
| | `clf__max_depth` | [5, 10] | [10, 20, None] |
| | `clf__min_samples_split` | [2, 5, 10] | [2, 5, 10] |
| XGBoost | `clf__n_estimators` | [100, 200] | [300, 500, 1000] |
| | `clf__max_depth` | [3, 5] | [3, 5, 7] |
| | `clf__learning_rate` | [0.1, 0.2] | [0.01, 0.05, 0.1] |
| | `clf__subsample` | Not used | [0.6, 0.8, 1.0] |
| LightGBM | `clf__n_estimators` | [100, 200] | [300, 500, 1000] |
| | `clf__max_depth` | [5, 10] | [10, 15, 20] |
| | `clf__learning_rate` | [0.1, 0.2] | [0.01, 0.05, 0.1] |
| | `clf__num_leaves` | Not used | [15, 31, 63] |
| CatBoost | `clf__iterations` | [200, 300] | [500, 1000] |
| | `clf__depth` | [4, 6] | [4, 6, 8] |
| | `clf__learning_rate` | [0.1, 0.2] | [0.01, 0.05, 0.1] |
| | `clf__l2_leaf_reg` | Not used | [1, 3, 5, 7] |

✅ fast → lightweight grid for quick experimentation
✅ industrial → larger, production-ready grid covering more parameter combinations

You can set this using the mode argument when calling build_model_config() or initializing GridMaster.
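To get a feel for the cost difference between the two modes, the sketch below counts the candidate combinations in the XGBoost grids listed in the table. It uses only the standard library. The `grid_size` helper is hypothetical, and the 5-fold cross-validation figure is an assumption for illustration, not a documented GridMaster default.

```python
from math import prod

# XGBoost grids copied from the table; "Not used" parameters are omitted.
xgb_fast = {
    "clf__n_estimators": [100, 200],
    "clf__max_depth": [3, 5],
    "clf__learning_rate": [0.1, 0.2],
}
xgb_industrial = {
    "clf__n_estimators": [300, 500, 1000],
    "clf__max_depth": [3, 5, 7],
    "clf__learning_rate": [0.01, 0.05, 0.1],
    "clf__subsample": [0.6, 0.8, 1.0],
}


def grid_size(param_grid):
    """Number of candidates an exhaustive grid search would evaluate."""
    return prod(len(values) for values in param_grid.values())


CV_FOLDS = 5  # assumed for illustration only

fast_fits = grid_size(xgb_fast) * CV_FOLDS
industrial_fits = grid_size(xgb_industrial) * CV_FOLDS
print(grid_size(xgb_fast), fast_fits)              # 8 candidates, 40 fits
print(grid_size(xgb_industrial), industrial_fits)  # 81 candidates, 405 fits
```

Under that assumption the industrial grid needs roughly ten times as many model fits as the fast grid, before accounting for the larger `n_estimators` values.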

| Model | Random State | Special Defaults | Recommended Use Cases |
| --- | --- | --- | --- |
| Logistic | 66 | `solver='liblinear'` if ≤10,000 samples or mode='fast'; `solver='saga'` with `penalty=['l1', 'l2', 'elasticnet']` if >10,000 samples or mode='industrial' | Best for small-to-medium datasets or when you need interpretable models; `saga` supports large-scale data and elasticnet but requires standardized inputs. |
| RandomForest | 66 | Uses sklearn `RandomForestClassifier`; adjusts trees and depth based on mode | Excellent general-purpose model, robust to overfitting, works well on tabular data with mixed feature types; fast mode for quick trials, industrial mode for robust tuning. |
| XGBoost | 66 | `eval_metric='logloss'`, `use_label_encoder=False`, `verbosity=0`; optional GPU configs allowed via `custom_estimator_params` | Highly performant on structured data, handles missing values natively; recommended for competition-grade or production tasks; supports GPU for large-scale runs. |
| LightGBM | 66 | `verbosity=-1`; optional GPU configs allowed via `custom_estimator_params` | Similar to XGBoost but faster on large datasets; works well with categorical features; recommended for fast iteration and industrial pipelines. |
| CatBoost | 66 | `verbose=0`; optional GPU configs allowed via `custom_estimator_params` | Best when working with categorical data; often requires less parameter tuning out-of-the-box; GPU acceleration improves scalability on big data. |

| Model | Smart Mode (Auto Top 2) | Expert Mode (Pre-Selected) | Custom Mode (User-Defined) |
| --- | --- | --- | --- |
| Logistic Regression | Based on top test score variation; usually `clf__C`, `clf__penalty` | Always `clf__C` | Whatever user provides in `custom_fine_params` |
| Random Forest | Based on top test score variation; usually `clf__max_depth`, `clf__min_samples_split` | Always `clf__max_depth`, `clf__min_samples_split` | Whatever user provides in `custom_fine_params` |
| XGBoost | Based on top test score variation; usually `clf__learning_rate`, `clf__max_depth` | Always `clf__learning_rate`, `clf__max_depth` | Whatever user provides in `custom_fine_params` |
| LightGBM | Based on top test score variation; usually `clf__learning_rate`, `clf__max_depth` | Always `clf__learning_rate`, `clf__max_depth` | Whatever user provides in `custom_fine_params` |
| CatBoost | Based on top test score variation; usually `clf__learning_rate`, `clf__depth` | Always `clf__learning_rate`, `clf__depth` | Whatever user provides in `custom_fine_params` |

Smart Mode detects which parameters matter most for each dataset by ranking them on test score variation and keeping the top two, which suits flexible, adaptive tuning.
Expert Mode sticks to a fixed set of known influential parameters, which reduces grid size and focuses the search.
Custom Mode gives you complete freedom but requires you to supply a meaningful, valid parameter grid yourself through custom_fine_params.
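The text does not spell out how "test score variation" is measured. One plausible reading, sketched below, is to take the spread between the best and worst mean test score across each parameter's values and keep the two parameters with the largest spread. The function name `top_params_by_score_spread` is hypothetical, and the scores are made up for illustration; the inputs mirror the `params` and `mean_test_score` entries of scikit-learn's `cv_results_`.

```python
from collections import defaultdict
from itertools import product
from statistics import mean


def top_params_by_score_spread(params_list, scores, k=2):
    """Return the k parameters whose values shift the mean test score the most."""
    spreads = {}
    names = sorted({name for params in params_list for name in params})
    for name in names:
        # Group test scores by the value this parameter took in each candidate.
        scores_by_value = defaultdict(list)
        for params, score in zip(params_list, scores):
            if name in params:
                scores_by_value[repr(params[name])].append(score)
        value_means = [mean(group) for group in scores_by_value.values()]
        spreads[name] = max(value_means) - min(value_means)
    return sorted(spreads, key=spreads.get, reverse=True)[:k]


# Made-up scores over the XGBoost fast grid, for illustration only.
params_list, scores = [], []
for n, depth, lr in product([100, 200], [3, 5], [0.1, 0.2]):
    params_list.append(
        {"clf__n_estimators": n, "clf__max_depth": depth, "clf__learning_rate": lr}
    )
    scores.append(0.80 + 0.005 * (n == 200) + 0.03 * (depth == 5) + 0.06 * (lr == 0.2))

selected = top_params_by_score_spread(params_list, scores)
print(selected)
```

With these toy scores the learning rate and tree depth dominate, so they are the two parameters a second, finer search would focus on.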

By default, GridMaster sets n_jobs to half of the detected CPU cores to balance system load and optimization speed.

You can override this by setting n_jobs explicitly.

Tip: On shared or production servers, test carefully before using all CPU cores to avoid resource contention.
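As a rough illustration of the "half of the detected cores" default, the value can be computed with the standard library as shown below. The helper name `half_of_cores` is hypothetical, and the rounding and the floor of one worker are assumptions; GridMaster's own calculation may differ.

```python
import os


def half_of_cores():
    """Half of the detected CPU cores, never less than one."""
    detected = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, detected // 2)


n_jobs = half_of_cores()
print(n_jobs)
```

For comparison, scikit-learn's `GridSearchCV` treats `n_jobs=-1` as "use all processors", which is the setting the tip above advises testing before use on shared machines.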

GPU acceleration is supported for XGBoost, LightGBM, and CatBoost.

GPU settings can be passed through the custom_estimator_params argument.

Important: GPU use requires working GPU drivers and GPU-enabled library installations; otherwise, training may silently fall back to CPU.
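For orientation, these are GPU switches the three libraries themselves accept as estimator parameters. The dictionary layout and its keys are hypothetical: how custom_estimator_params expects the values to be grouped is not shown here, so treat this as a sketch of the settings, not of the argument's format.

```python
# GPU-related estimator parameters per library. The outer keys are illustrative only.
gpu_params = {
    # XGBoost 2.0+; older releases used tree_method="gpu_hist" instead.
    "xgboost": {"tree_method": "hist", "device": "cuda"},
    # Requires a GPU-enabled LightGBM build.
    "lightgbm": {"device": "gpu"},
    # devices selects the GPU id(s) CatBoost should use.
    "catboost": {"task_type": "GPU", "devices": "0"},
}

for library, params in gpu_params.items():
    print(library, params)
```

Because a misconfigured setup may fall back to CPU, it is worth checking training logs or GPU utilization on a small run before launching a full grid.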

Logistic Regression:

Tree-based models (Random Forest, XGBoost, LightGBM, CatBoost):

Use 'passthrough' because they are scale-invariant and don’t require normalization.
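The scaling choice can be sketched with scikit-learn pipelines. This assumes Logistic Regression is paired with a scaler such as `StandardScaler` (consistent with the note that `saga` needs standardized inputs) and that the scaling step is named `scaler`; only the `clf` step name is implied by the `clf__` prefix used in the parameter grids.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# "clf" matches the clf__ prefix in the grids; "scaler" is an assumed step name.
logistic_pipe = Pipeline([
    ("scaler", StandardScaler()),  # linear models are sensitive to feature scale
    ("clf", LogisticRegression(solver="liblinear", random_state=66)),
])

forest_pipe = Pipeline([
    ("scaler", "passthrough"),  # tree splits do not depend on feature scale
    ("clf", RandomForestClassifier(random_state=66)),
])

# Grid keys such as clf__C address parameters of the "clf" step.
print("clf__C" in logistic_pipe.get_params())
print(forest_pipe.named_steps["scaler"])
```

Keeping the step present but set to `'passthrough'` means both pipelines share the same shape, so the same `clf__` grid keys work regardless of whether scaling is applied.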

Logistic Regression, Random Forest:

XGBoost:

LightGBM, CatBoost: