pykoop.RandomBinningKernelApprox
- class RandomBinningKernelApprox(kernel_or_ddot='laplacian', n_components=100, shape=1, encoder_kw=None, random_state=None)
Bases: KernelApproximation
Kernel approximation with random binning.
Highly experimental! For more details, see [RR07].
- Parameters:
kernel_or_ddot (str | rv_continuous) –
n_components (int) –
shape (float) –
random_state (int | RandomState | None) –
- n_features_out_
Number of features output. This attribute is not available in estimators from sklearn.kernel_approximation.
- Type:
- ddot_
Probability distribution corresponding to \(\delta \ddot{k}(\delta)\).
- pitches_
Grid pitches for each component.
- Type:
np.ndarray, shape (n_features, n_components)
- shifts_
Grid shifts for each component.
- Type:
np.ndarray, shape (n_features, n_components)
- encoder_
One-hot encoder used for hashing sample coordinates for each component.
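The roles of the pitches and shifts above can be illustrated with a small standard-library sketch of random binning in the spirit of [RR07]. It is not pykoop's implementation: the function names and the `gamma` parameter (for the kernel exp(-gamma * |x - y|_1)) are assumptions made for this illustration, and `gamma` is not claimed to match how `shape` is scaled. Each component draws one grid per feature, every sample is hashed to a tuple of bin indices, and the fraction of components in which two samples share a bin estimates the kernel.

```python
import math
import random


def draw_grids(n_features, n_components, gamma=1.0, seed=0):
    """Draw one random grid per (feature, component) pair.

    Pitches follow Gamma(shape=2, scale=1/gamma), the distribution
    delta * k''(delta) for k(delta) = exp(-gamma * delta).
    Shifts are uniform on [0, pitch]. ``gamma`` is a hypothetical
    parameter of this sketch, not pykoop's ``shape``.
    """
    rng = random.Random(seed)
    pitches = [
        [rng.gammavariate(2.0, 1.0 / gamma) for _ in range(n_components)]
        for _ in range(n_features)
    ]
    shifts = [[rng.uniform(0.0, pitch) for pitch in row] for row in pitches]
    return pitches, shifts


def bin_ids(x, pitches, shifts):
    """Hash a sample to one tuple of integer bin indices per component."""
    n_features, n_components = len(pitches), len(pitches[0])
    return [
        tuple(
            math.floor((x[d] - shifts[d][p]) / pitches[d][p])
            for d in range(n_features)
        )
        for p in range(n_components)
    ]


def approx_kernel(x, y, pitches, shifts):
    """Fraction of components in which x and y fall in the same bin."""
    bx, by = bin_ids(x, pitches, shifts), bin_ids(y, pitches, shifts)
    return sum(a == b for a, b in zip(bx, by)) / len(bx)


# Compare the binning estimate with the exact Laplacian kernel value.
pitches, shifts = draw_grids(n_features=2, n_components=20000)
x, y = [0.0, 0.0], [0.3, -0.2]
estimate = approx_kernel(x, y, pitches, shifts)
exact = math.exp(-(abs(x[0] - y[0]) + abs(x[1] - y[1])))
```

In this sketch the nested lists have the same (n_features, n_components) layout as pitches_ and shifts_, and `estimate` should approach `exact` as the number of components grows.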
Examples
Generate randomly binned features from a Laplacian kernel
>>> ka = pykoop.RandomBinningKernelApprox(
...     kernel_or_ddot='laplacian',
...     n_components=10,
...     shape=1,
...     random_state=1234,
... )
>>> ka.fit(X_msd[:, 1:])  # Remove episode feature
RandomBinningKernelApprox(n_components=10, random_state=1234)
>>> ka.transform(X_msd[:, 1:])
array([...])
- __init__(kernel_or_ddot='laplacian', n_components=100, shape=1, encoder_kw=None, random_state=None)
Instantiate RandomBinningKernelApprox.
- Parameters:
kernel_or_ddot (Union[str, scipy.stats.rv_continuous]) –
Kernel to approximate. Possible options are
'laplacian' – Laplacian kernel, with \(\delta \ddot{k}(\delta)\) being scipy.stats.gamma with shape parameter a=2 (default).
Alternatively, a separable, positive, shift-invariant kernel can be implicitly specified by providing \(\delta \ddot{k}(\delta)\) as a univariate probability distribution subclassing scipy.stats.rv_continuous.
n_components (int) – Number of random samples used to generate features. More components produce more features. Because unoccupied bins are eliminated, the exact number of features cannot be known before fitting.
shape (float) – Shape parameter. Must be greater than zero. Larger values correspond to “sharper” kernels. Scaled to be consistent with gamma from sklearn.kernel_approximation.RBFSampler. This can lead to a mysterious factor of sqrt(2) in other kernels. Default is 1.
encoder_kw (Optional[Dict[str, Any]]) – Extra keyword arguments for the internal sklearn.preprocessing.OneHotEncoder. For experimental use only. The wrong arguments can break everything. Overrides defaults.
random_state (Union[int, np.random.RandomState, None]) – Random seed.
- Return type:
None
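The statement that the 'laplacian' option corresponds to a gamma distribution with a=2 can be checked by hand. The short derivation below assumes the unit-scale, one-dimensional Laplacian kernel k(δ) = e^{-δ} for δ ≥ 0; it does not cover how the shape parameter rescales δ.

```latex
k(\delta) = e^{-\delta}
\quad\Longrightarrow\quad
\ddot{k}(\delta) = e^{-\delta}
\quad\Longrightarrow\quad
\delta\,\ddot{k}(\delta) = \delta e^{-\delta}
= \left.\frac{\delta^{a-1} e^{-\delta}}{\Gamma(a)}\right|_{a=2},
\qquad
\int_0^\infty \delta e^{-\delta}\,\mathrm{d}\delta = 1 .
```

Since the expression is non-negative and integrates to one, it is a valid probability density, namely the gamma density with shape parameter a=2 and unit scale.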
Methods
__init__([kernel_or_ddot, n_components, ...]) – Instantiate RandomBinningKernelApprox.
fit(X[, y]) – Fit kernel approximation.
fit_transform(X[, y]) – Fit to data, then transform it.
get_metadata_routing() – Get metadata routing of this object.
get_params([deep]) – Get parameters for this estimator.
set_output(*[, transform]) – Set output container.
set_params(**params) – Set the parameters of this estimator.
transform(X) – Transform data.
- fit(X, y=None)
Fit kernel approximation.
- Parameters:
X (np.ndarray) – Data matrix.
y (Optional[np.ndarray]) – Ignored.
- Returns:
Instance of itself.
- Return type:
RandomBinningKernelApprox
- Raises:
ValueError – If any of the constructor parameters are incorrect.
- fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters.
- Returns:
X_new – Transformed array.
- Return type:
ndarray of shape (n_samples, n_features_new)
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_params(deep=True)
Get parameters for this estimator.
- set_output(*, transform=None)
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
transform ({"default", "pandas"}, default=None) –
Configure output of transform and fit_transform.
"default": Default output format of a transformer
"pandas": DataFrame output
None: Transform configuration is unchanged
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- transform(X)
Transform data.
- Parameters:
X (np.ndarray) – Data matrix.
- Returns:
Transformed data matrix.
- Return type:
np.ndarray
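To illustrate why the number of output features is only known after fitting, here is a minimal standard-library sketch of one-hot encoding occupied bins. It assumes bin identifiers have already been computed for each sample and component. The helper names, the 1/sqrt(n_components) scaling, and the choice to drop bins not seen during fitting are all assumptions of this sketch, not documented pykoop behaviour.

```python
import math


def fit_bins(train_bins):
    """Assign a feature column to every occupied (component, bin) pair.

    ``train_bins[i][p]`` is the hashable bin id of sample ``i`` under
    component ``p``. Unoccupied bins never receive a column.
    """
    columns = {}
    for sample in train_bins:
        for p, b in enumerate(sample):
            columns.setdefault((p, b), len(columns))
    return columns


def transform_bins(bins, columns, n_components):
    """One-hot encode bin ids, scaled so dot products average over components."""
    scale = 1.0 / math.sqrt(n_components)
    rows = []
    for sample in bins:
        row = [0.0] * len(columns)
        for p, b in enumerate(sample):
            col = columns.get((p, b))
            if col is not None:  # Assumption: unseen bins are ignored
                row[col] = scale
        rows.append(row)
    return rows


def dot(a, b):
    """Plain inner product of two equal-length lists."""
    return sum(u * v for u, v in zip(a, b))


# Three samples, two components; bin ids are made-up illustrative values.
train_bins = [
    [(0, 1), (2, 0)],
    [(0, 1), (3, 0)],
    [(1, 1), (3, 0)],
]
columns = fit_bins(train_bins)
Z = transform_bins(train_bins, columns, n_components=2)
```

Here four distinct (component, bin) pairs are occupied, so the sketch produces four features. The inner product of two rows equals the fraction of components in which the two samples share a bin.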