Hyperparameter Search Tool

Define your search space, choose a strategy (Grid, Random, or Bayesian), and visualize coverage on a 2D scatter plot — with estimated trial counts and runtime budgets.

Built by Michael Lip

Search Space

LR (log scale)

log10 range (e.g. -5 = 1e-5, -2 = 0.01)

LR grid steps

Batch size

Dropout range

Dropout steps

Epochs / trial

Sec / epoch

Strategy

Total combinations: — Trials: — Est. runtime: —
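The three readouts above can be approximated with simple arithmetic. The sketch below is an assumption about how such estimates could be computed, not the tool's actual source: the function names and field names (estimateBudget, formatDuration, lrSteps, and so on) are hypothetical and only mirror the form labels.

```javascript
// Hypothetical helper: formulas are assumed, not taken from the tool's code.
function estimateBudget({ lrSteps, batchSizes, dropoutSteps, epochsPerTrial, secPerEpoch, trials }) {
  // Grid search tries every combination, so the count is a product of axis sizes.
  const totalCombinations = lrSteps * batchSizes.length * dropoutSteps;
  // Random and Bayesian search run a fixed trial budget; grid runs everything.
  const trialCount = trials === undefined ? totalCombinations : trials;
  // Each trial is assumed to cost epochs * seconds-per-epoch, run sequentially.
  const runtimeSec = trialCount * epochsPerTrial * secPerEpoch;
  return { totalCombinations, trialCount, runtimeSec };
}

// Format a number of seconds as "Hh Mm Ss".
function formatDuration(sec) {
  const total = Math.round(sec);
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  const s = total % 60;
  return `${h}h ${m}m ${s}s`;
}

const budget = estimateBudget({
  lrSteps: 5,
  batchSizes: [16, 32, 64, 128],
  dropoutSteps: 3,
  epochsPerTrial: 10,
  secPerEpoch: 30,
});
console.log(budget.totalCombinations, formatDuration(budget.runtimeSec));
```

With 5 LR steps, 4 batch sizes, and 3 dropout steps, a full grid has 60 combinations; at 10 epochs of 30 seconds each that is 18,000 seconds, or 5 hours, assuming trials run one after another.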

Frequently Asked Questions

What is the difference between Grid Search and Random Search?

Grid search exhaustively tries every combination of the specified hyperparameter values. Random search instead draws a fixed number of trials at random from the search space. In high-dimensional spaces, random search often finds good configurations faster: a grid reuses the same few values of each hyperparameter across many trials, while random search tries a new value of every hyperparameter on every trial, which helps most when only a few hyperparameters strongly affect the result.
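The difference is easy to see by counting distinct values per dimension. The following is a minimal sketch, assuming two dimensions (log10 of LR and dropout) with illustrative ranges; cartesian, randomTrials, and distinct are hypothetical helper names, not functions from the tool.

```javascript
// Cartesian product of a list of candidate-value arrays.
function cartesian(axes) {
  return axes.reduce(
    (acc, values) => acc.flatMap((combo) => values.map((v) => [...combo, v])),
    [[]]
  );
}

// Draw n trials, sampling each dimension uniformly from its [lo, hi] range.
function randomTrials(ranges, n, rng = Math.random) {
  return Array.from({ length: n }, () =>
    ranges.map(([lo, hi]) => lo + rng() * (hi - lo))
  );
}

// Number of distinct values tried along one dimension.
const distinct = (trials, dim) => new Set(trials.map((t) => t[dim])).size;

// 3 x 3 grid: 9 trials, but only 3 distinct values per dimension.
const grid = cartesian([[-5, -3.5, -2], [0.1, 0.3, 0.5]]);
// 9 random trials: almost surely 9 distinct values per dimension.
const random = randomTrials([[-5, -2], [0.1, 0.5]], 9);

console.log("grid distinct log10(LR):", distinct(grid, 0));
console.log("random distinct log10(LR):", distinct(random, 0));
```

Both strategies spend the same nine trials, but the grid only ever evaluates three learning rates, whereas the random draw evaluates a different one each time.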

How does Bayesian hyperparameter optimization work?

Bayesian optimization builds a probabilistic surrogate model (typically a Gaussian process) of the objective function. After each trial it updates the model and uses an acquisition function, such as Expected Improvement (EI), to select the next point with the largest expected gain over the best result so far. EI balances exploitation (points with a good predicted value) against exploration (points with high uncertainty). This is typically more sample-efficient than random search when each evaluation is expensive.
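For a Gaussian prediction with mean mu and standard deviation sigma, EI has a closed form: EI = (mu - best) * Phi(z) + sigma * phi(z), with z = (mu - best) / sigma, where Phi and phi are the standard normal CDF and PDF. The sketch below assumes the objective is being maximized (for example, validation accuracy) and uses a standard polynomial approximation of erf; the function names are illustrative, not the tool's.

```javascript
// Abramowitz-Stegun approximation of erf (absolute error below about 1.5e-7).
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Standard normal density and cumulative distribution.
const normalPdf = (z) => Math.exp(-0.5 * z * z) / Math.sqrt(2 * Math.PI);
const normalCdf = (z) => 0.5 * (1 + erf(z / Math.SQRT2));

// Analytical Expected Improvement for a maximization problem (assumed here).
function expectedImprovement(mean, sigma, best) {
  // With no uncertainty, the improvement is known exactly.
  if (sigma <= 0) return Math.max(mean - best, 0);
  const z = (mean - best) / sigma;
  return (mean - best) * normalCdf(z) + sigma * normalPdf(z);
}

// Same predicted mean, different uncertainty: the less certain point scores higher.
console.log(expectedImprovement(0.8, 0.05, 0.9));
console.log(expectedImprovement(0.8, 0.2, 0.9));
```

The second call returns the larger value: when the predicted mean is below the current best, more uncertainty means more chance of a large improvement, which is how EI rewards exploration.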

What does the 2D scatter plot show?

The horizontal axis is log10(learning rate) and the vertical axis is batch size. Each dot is one trial. Grid trials form a regular lattice; random trials are scattered uniformly; Bayesian trials cluster progressively toward promising regions based on the EI acquisition function.

How is "Coverage %" calculated?

The 2D space (LR × batch size) is divided into a 10×10 grid of cells. Coverage is the percentage of cells that contain at least one trial point. Grid search always achieves 100% on its own axes. Random and Bayesian coverage depends on the number of trials relative to the 100 cells: n trials can occupy at most n cells, so fewer than 100 trials can never reach 100%.
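The cell-counting step can be sketched as follows. This is an illustration under assumptions: both axes are binned linearly over the given ranges (log10 of LR on x, batch size on y), and coveragePercent is a hypothetical name rather than the tool's function.

```javascript
// Percentage of cells in a cells x cells grid that contain at least one point.
// points: array of [x, y]; xRange and yRange: [min, max] of each axis.
function coveragePercent(points, xRange, yRange, cells = 10) {
  // Map a value to a cell index, clamping so the upper bound lands in the last cell.
  const toIndex = (v, [lo, hi]) => {
    const i = Math.floor(((v - lo) / (hi - lo)) * cells);
    return Math.min(cells - 1, Math.max(0, i));
  };
  const occupied = new Set();
  for (const [x, y] of points) {
    occupied.add(toIndex(x, xRange) * cells + toIndex(y, yRange));
  }
  return (100 * occupied.size) / (cells * cells);
}

// Three trials, two of which fall in the same cell: 2 of 100 cells are covered.
const trials = [
  [-5, 16],
  [-4.9, 18],
  [-2, 128],
];
console.log(coveragePercent(trials, [-5, -2], [16, 128]));
```

Because each trial can occupy only one cell, this measure saturates slowly: clustered Bayesian trials will usually score lower than the same number of random trials.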

Should I use log scale for learning rate?

Yes. Learning rates typically span several orders of magnitude (e.g., 1e-5 to 1e-1). Sampling uniformly on a log scale gives equal probability to each order of magnitude, which means random search explores the space proportionally rather than being dominated by large values.
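To make the effect concrete, the sketch below compares log-uniform and linear-uniform sampling over 1e-5 to 1e-1 and measures how many samples land in the top decade (1e-2 and above). The helper names are hypothetical, and the ranges are only the example values from the answer above.

```javascript
// Sample uniformly in log10 space, then map back to a learning rate.
function sampleLogUniform(log10Min, log10Max, rng = Math.random) {
  const exponent = log10Min + rng() * (log10Max - log10Min);
  return Math.pow(10, exponent);
}

// Sample uniformly on the raw (linear) scale, for comparison.
function sampleLinear(min, max, rng = Math.random) {
  return min + rng() * (max - min);
}

// Fraction of samples at or above a threshold.
function fractionAtLeast(samples, threshold) {
  return samples.filter((v) => v >= threshold).length / samples.length;
}

const n = 10000;
const logSamples = Array.from({ length: n }, () => sampleLogUniform(-5, -1));
const linSamples = Array.from({ length: n }, () => sampleLinear(1e-5, 1e-1));

// Log-uniform: about one quarter of samples fall in the top decade (1 of 4 decades).
console.log("log-uniform >= 1e-2:", fractionAtLeast(logSamples, 1e-2));
// Linear: about nine tenths of samples fall in the top decade.
console.log("linear >= 1e-2:", fractionAtLeast(linSamples, 1e-2));
```

On the linear scale roughly 90% of trials would use a learning rate between 0.01 and 0.1, leaving the three smaller decades almost unexplored.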

How It Works

All computation runs client-side in vanilla JavaScript. Grid points are generated as the Cartesian product of the defined values. Random points use Math.random() with log-uniform sampling on LR. Bayesian points use a simplified Gaussian process surrogate with a squared-exponential kernel and analytical EI, re-fitted after each trial.
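A surrogate of this kind can be sketched in a few dozen lines. The version below is an illustration, not the tool's code: it assumes a 1D input (log10 of LR), a zero-mean prior, unit kernel variance, a small noise term for numerical stability, and plain Gaussian elimination for the linear solves. The names seKernel, solve, and gpPosterior are hypothetical.

```javascript
// Squared-exponential kernel with unit variance.
function seKernel(a, b, lengthScale = 1) {
  const d = a - b;
  return Math.exp(-(d * d) / (2 * lengthScale * lengthScale));
}

// Solve A x = b by Gaussian elimination with partial pivoting.
function solve(A, b) {
  const n = b.length;
  const M = A.map((row, i) => [...row, b[i]]); // augmented copy
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    const tmp = M[col];
    M[col] = M[pivot];
    M[pivot] = tmp;
    for (let r = col + 1; r < n; r++) {
      const f = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= f * M[col][c];
    }
  }
  const x = new Array(n).fill(0);
  for (let i = n - 1; i >= 0; i--) {
    let s = M[i][n];
    for (let c = i + 1; c < n; c++) s -= M[i][c] * x[c];
    x[i] = s / M[i][i];
  }
  return x;
}

// Posterior mean and standard deviation at xStar given observations (xs, ys).
function gpPosterior(xs, ys, xStar, { lengthScale = 1, noise = 1e-6 } = {}) {
  const K = xs.map((xi, i) =>
    xs.map((xj, j) => seKernel(xi, xj, lengthScale) + (i === j ? noise : 0))
  );
  const kStar = xs.map((xi) => seKernel(xi, xStar, lengthScale));
  const alpha = solve(K, ys); // K^-1 y
  const v = solve(K, kStar); // K^-1 k*
  const mean = kStar.reduce((s, k, i) => s + k * alpha[i], 0);
  const variance = 1 - kStar.reduce((s, k, i) => s + k * v[i], 0);
  return { mean, sigma: Math.sqrt(Math.max(0, variance)) };
}

// Three observed trials: log10(LR) against a score to maximize.
const xs = [-4, -3, -2];
const ys = [0.2, 0.8, 0.5];
console.log(gpPosterior(xs, ys, -3)); // at an observed point: mean near 0.8, sigma near 0
console.log(gpPosterior(xs, ys, -2.5)); // between points: larger sigma
```

Re-fitting after each trial then amounts to appending the new observation to xs and ys and calling gpPosterior again for every candidate point; the resulting mean and sigma are what an EI calculation would consume.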

The acquisition function canvas renders EI over the 1D LR axis to illustrate the exploration-exploitation trade-off visually.
