Quickstart
This guide walks through the core MICE workflow with three examples of increasing complexity.
Minimal Example
Start with the simplest possible setup, minimizing the expected squared distance from a standard normal random variable:
import numpy as np
from mice import MICE
from mice.policy import DropRestartClipPolicy


def gradient(x, thetas):
    return x - thetas


def sampler(n):
    return np.random.randn(n, 1)


estimator = MICE(
    grad=gradient,
    sampler=sampler,
    eps=0.577,  # relative error tolerance (1/√3)
    min_batch=10,
    policy=DropRestartClipPolicy(
        drop_param=0.5,
        restart_param=0.0,
        max_hierarchy_size=100,
    ),
    max_cost=10_000,  # maximum gradient evaluations
    stop_crit_norm=1e-6,  # stopping criterion
)

x = np.array([10.0])
for iteration in range(100):
    g = estimator(x)
    if estimator.terminate:
        print(f"Terminated early: {estimator.terminate_reason}")
        break
    x = x - 0.1 * g
    print(f"Iteration {iteration}: x = {x[0]:.6f}")
Key points:
- MICE is imported from mice (top-level re-export).
- DropRestartClipPolicy controls index-set operations: Add, Drop, Restart, and Clip behavior.
- eps controls the relative error tolerance. Smaller values mean tighter error control but more gradient evaluations.
- max_cost bounds total gradient evaluations; the estimator sets terminate = True when exhausted.
- stop_crit_norm triggers early termination when the estimated gradient norm drops below the square root of this threshold.
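To make the eps trade-off concrete, here is a minimal sketch based on the textbook single-level Monte Carlo relation. It is not MICE's own sample-size rule, which this guide does not spell out. For a mean-gradient estimate built from n independent samples, the relative statistical error is roughly sqrt(total variance / n) / ||mean||, so keeping it below eps needs about n >= total variance / (eps^2 * ||mean||^2) samples. The helper name required_samples is hypothetical.

```python
import numpy as np


def required_samples(grad_samples, eps):
    """Plain Monte Carlo sample size for a relative error tolerance eps.

    Assumption: textbook single-level rule, not MICE's actual formula.
    grad_samples has shape (n, d): one gradient sample per row.
    """
    mean = grad_samples.mean(axis=0)
    # Sum of per-coordinate sample variances (trace of the covariance).
    total_var = grad_samples.var(axis=0, ddof=1).sum()
    return int(np.ceil(total_var / (eps**2 * np.dot(mean, mean))))


rng = np.random.default_rng(0)
x = np.array([10.0])
thetas = rng.standard_normal((1000, 1))
pilot = x - thetas  # same gradient as in the minimal example

for eps in (0.577, 0.1, 0.01):
    print(f"eps={eps}: about {required_samples(pilot, eps)} samples")
```

Under this rule, halving eps roughly quadruples the sample count, and the count also grows as the iterate approaches the optimum, where the mean gradient shrinks.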
Finite-Sum Problems
For finite datasets (e.g., empirical risk minimization on a fixed training
set), pass the data array directly as the sampler argument instead of a
callable:
import numpy as np
from mice import MICE
# Example: linear regression on a fixed dataset
rng = np.random.default_rng(42)
n_samples, n_features = 500, 5
X = rng.normal(size=(n_samples, n_features))
true_w = rng.normal(size=n_features)
y = X @ true_w + 0.1 * rng.normal(size=n_samples)
data = np.column_stack([y, X])
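def grad(x, thetas):
    """Per-sample gradients; their mean is the gradient of ||y - Xw||^2 / (2n)."""
    y_batch = thetas[:, 0]
    X_batch = thetas[:, 1:]
    residuals = X_batch @ x - y_batch
    # One row per sample. The estimator averages the rows, so no extra
    # division by n_samples is applied here.
    return X_batch * residuals[:, None]
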
estimator = MICE(
    grad=grad,
    sampler=data,  # pass array, not callable
    eps=0.577,
    min_batch=10,
    max_cost=50_000,
    stop_crit_norm=1e-6,
)

x = np.zeros(n_features)
for _ in range(200):
    g = estimator(x)
    if estimator.terminate:
        break
    x = x - 0.05 * g
When sampler is an array, MICE automatically detects the finite-sum
setting and uses without-replacement sampling with optimized sample-size
formulas that account for the finite population correction.
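The finite population correction mentioned above can be illustrated with the standard survey-sampling formula for the variance of a mean drawn without replacement. This sketch is independent of MICE's internals, whose sample-size formulas are not shown in this guide; the function name is hypothetical. It checks the formula against an exact enumeration of every subset of a tiny population.

```python
from itertools import combinations

import numpy as np


def mean_variance_without_replacement(population, n):
    """Variance of the mean of n items drawn without replacement."""
    big_n = len(population)
    sigma2 = np.var(population)  # population variance (ddof=0)
    fpc = (big_n - n) / (big_n - 1)  # finite population correction
    return sigma2 / n * fpc


population = np.array([1.0, 2.0, 4.0, 7.0, 11.0])

for n in range(1, len(population) + 1):
    # Exact check: enumerate every size-n subset, all equally likely.
    means = [np.mean(subset) for subset in combinations(population, n)]
    exact = np.var(means)
    formula = mean_variance_without_replacement(population, n)
    print(f"n={n}: formula={formula:.4f}, enumerated={exact:.4f}")
```

The variance falls to zero once n equals the population size: evaluating every data point gives the exact mean, so sampling beyond that ceiling buys nothing.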
Policy Configuration
Fine-tune index-set management with DropRestartClipPolicy:
from mice.policy import DropRestartClipPolicy

policy = DropRestartClipPolicy(
    drop_param=0.5,  # threshold for dropping last iterate
    restart_param=0.0,  # threshold for restarting hierarchy
    max_hierarchy_size=100,  # maximum |L_k|
    clip_type="full",  # "full", "all", or None (disabled)
    aggr_cost=0.1,  # aggregation cost factor
)
estimator = MICE(grad=gradient, sampler=sampler, policy=policy)
Each parameter:
- drop_param (float, default 0.5): Controls how aggressively the Drop operator removes near-redundant iterates. Higher values make dropping more likely.
- restart_param (float, default 0.0): Controls Restart sensitivity. Non-zero values allow restarts even when the cost improvement is minor.
- max_hierarchy_size (int, default 1000): Caps the number of retained iterates to bound memory and computation.
- clip_type (str or None, default None): "full" clips when a level reaches the finite-data ceiling; "all" evaluates all possible clip points and picks the cheapest; None disables clipping.
- aggr_cost (float, default 0.1): Penalty per level in cost computations, encouraging smaller hierarchies.
Resampling-Based Norm Estimation
Resampling gives a more robust gradient-norm estimate for sample sizing and for the stopping rule. It is enabled by default and can be tuned as follows:
estimator = MICE(
    grad=gradient,
    sampler=sampler,
    use_resampling=True,  # enabled by default
    re_part=5,  # number of jackknife partitions
    re_quantile=0.05,  # quantile for the error tolerance
    re_tot_cost=0.2,  # resampling cost budget fraction
    stop_crit_prob=0.05,  # probability threshold for stopping rule
)
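As a rough illustration of the idea, and not a reproduction of MICE's implementation, the sketch below computes delete-one-group jackknife replicates of a gradient-norm estimate and takes a low quantile of them as a conservative norm. The function jackknife_norms and the way parts and the quantile are used here are assumptions modelled on the parameter names re_part and re_quantile.

```python
import numpy as np


def jackknife_norms(grad_samples, parts):
    """Leave-one-group-out estimates of the mean-gradient norm.

    Hypothetical helper: splits the rows into `parts` groups and, for each
    group, recomputes the norm of the mean gradient with that group removed.
    """
    n = len(grad_samples)
    groups = np.array_split(np.arange(n), parts)
    total = grad_samples.sum(axis=0)
    norms = []
    for idx in groups:
        removed = grad_samples[idx].sum(axis=0)
        loo_mean = (total - removed) / (n - len(idx))
        norms.append(np.linalg.norm(loo_mean))
    return np.array(norms)


rng = np.random.default_rng(1)
x = np.array([0.5])
grads = x - rng.standard_normal((200, 1))  # gradient samples, minimal example

norms = jackknife_norms(grads, parts=5)
conservative_norm = np.quantile(norms, 0.05)  # low quantile of the replicates
print(f"replicate norms: {np.round(norms, 3)}")
print(f"5% quantile: {conservative_norm:.3f}")
```

Because a relative tolerance is measured against the gradient norm, using a low quantile of the replicates rather than the single full-sample norm errs on the side of drawing more samples.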