pyuoi.datasets
Dataset utility functions for the pyuoi package.
Testing Utilities
- pyuoi.datasets.make_classification(n_samples=100, n_features=20, n_informative=2, n_classes=2, shared_support=False, random_state=None, w_scale=1.0, include_intercept=False)[source]
Make a linear classification dataset.
- Parameters
n_samples (int) – The number of samples to make.
n_features (int) – The number of features to use.
n_informative (int) – The number of features with non-zero weights.
n_classes (int) – The number of classes.
shared_support (bool) – If True, all classes share the same random support. If False, each class has its own randomly chosen support.
random_state (int, np.random.RandomState instance, or None) – Random number seed or state.
w_scale (float) – The model parameter matrix, w, will be drawn from a normal distribution with std=w_scale.
include_intercept (bool) – If True, includes an intercept in the model. If False, the intercept is set to 0.
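The parameters above describe a sparse linear classifier whose weights are non-zero only on a chosen support. The following is a minimal NumPy sketch of that idea, not pyuoi's implementation: the helper name `sketch_classification`, the `seed` argument, the standard-normal design matrix, the softmax sampling of labels, and the way the intercept is drawn are all assumptions made for illustration.

```python
import numpy as np


def sketch_classification(n_samples=100, n_features=20, n_informative=2,
                          n_classes=2, shared_support=False, w_scale=1.0,
                          include_intercept=False, seed=None):
    """Hypothetical helper: labels sampled from a sparse softmax model."""
    rng = np.random.RandomState(seed)

    # Assumption: features are standard normal.
    X = rng.randn(n_samples, n_features)

    # One weight row per class, drawn from a normal with std=w_scale.
    w = rng.normal(scale=w_scale, size=(n_classes, n_features))

    # Build the support mask: shared across classes, or one per class.
    mask = np.zeros((n_classes, n_features), dtype=bool)
    if shared_support:
        support = rng.choice(n_features, size=n_informative, replace=False)
        mask[:, support] = True
    else:
        for k in range(n_classes):
            support = rng.choice(n_features, size=n_informative,
                                 replace=False)
            mask[k, support] = True
    w = w * mask  # weights outside the support become zero

    # Assumption: the intercept, when requested, is drawn like the weights.
    if include_intercept:
        intercept = rng.normal(scale=w_scale, size=n_classes)
    else:
        intercept = np.zeros(n_classes)

    # Softmax probabilities, computed stably, then one label per sample.
    logits = X @ w.T + intercept
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    p = p / p.sum(axis=1, keepdims=True)
    y = np.array([rng.choice(n_classes, p=row) for row in p])
    return X, y, w, intercept
```

With `shared_support=True`, every row of `w` is non-zero in the same `n_informative` columns; with `shared_support=False`, each row gets its own columns.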
- pyuoi.datasets.make_linear_regression(n_samples=100, n_features=5, n_informative=2, X_loc=3.0, X_scale=1.0, snr=5.0, beta=None, beta_low=1.0, beta_high=3.0, include_intercept=False, random_state=None)[source]
Make a linear regression dataset.
- Parameters
n_samples (int) – The number of samples to make.
n_features (int) – The number of features to use.
n_informative (int) – The number of features with non-zero weights.
X_loc (float) – The mean of the features in the design matrix.
X_scale (float) – The standard deviation of the features in the design matrix.
snr (float) – The signal-to-noise ratio, which informs the variance of the noise term.
beta (np.ndarray or None) – The beta values to use. If None, beta values will be drawn from a uniform distribution.
beta_low (float) – The lower bound for the beta values.
beta_high (float) – The upper bound for the beta values.
include_intercept (bool) – If True, includes an intercept in the model. If False, the intercept is set to 0.
random_state (int, np.random.RandomState instance, or None) – Random number seed or state.
- Returns
X (ndarray, shape (n_samples, n_features)) – The design matrix.
y (ndarray, shape (n_samples,)) – The response vector.
beta (ndarray, shape (n_features,)) – The feature coefficients.
intercept (float) – The intercept. If include_intercept is False, then intercept is zero.
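To make the roles of X_loc, X_scale, beta_low, beta_high, and snr concrete, here is a minimal NumPy sketch of a generator with the same parameters and return shapes. It is an illustration rather than pyuoi's code: the helper name `sketch_linear_regression`, the `seed` argument, the intercept distribution, and in particular the convention that snr equals the variance of the noiseless signal divided by the noise variance are assumptions.

```python
import numpy as np


def sketch_linear_regression(n_samples=100, n_features=5, n_informative=2,
                             X_loc=3.0, X_scale=1.0, snr=5.0,
                             beta_low=1.0, beta_high=3.0,
                             include_intercept=False, seed=None):
    """Hypothetical helper: sparse linear model with Gaussian noise."""
    rng = np.random.RandomState(seed)

    # Design matrix with mean X_loc and standard deviation X_scale.
    X = rng.normal(loc=X_loc, scale=X_scale, size=(n_samples, n_features))

    # Sparse coefficients: uniform on [beta_low, beta_high) for the
    # informative features, zero elsewhere.
    beta = np.zeros(n_features)
    support = rng.choice(n_features, size=n_informative, replace=False)
    beta[support] = rng.uniform(beta_low, beta_high, size=n_informative)

    # Assumption: the intercept is drawn from the same uniform range.
    intercept = rng.uniform(beta_low, beta_high) if include_intercept else 0.0

    # Assumption: snr = var(signal) / var(noise).
    signal = X @ beta + intercept
    noise_std = np.sqrt(np.var(signal) / snr)
    y = signal + rng.normal(scale=noise_std, size=n_samples)
    return X, y, beta, intercept


X, y, beta, intercept = sketch_linear_regression(seed=0)
```

Under this convention, a larger snr shrinks the noise standard deviation, so y tracks the noiseless signal more closely.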
- pyuoi.datasets.make_poisson_regression(n_samples=100, n_features=5, n_informative=2, X_loc=0.0, X_scale=0.125, beta=None, beta_shape=1.0, beta_scale=3.0, include_intercept=False, random_state=None)[source]
Make a Poisson regression dataset.
- Parameters
n_samples (int) – The number of samples to make.
n_features (int) – The number of features to use.
n_informative (int) – The number of features with non-zero weights.
X_loc (float) – The mean of the features in the design matrix.
X_scale (float) – The standard deviation of the features in the design matrix.
beta (np.ndarray or None) – The beta values to use. If None, beta values will be drawn from a gamma distribution.
beta_shape (float) – The shape parameter for the beta values.
beta_scale (float) – The scale parameter for the beta values.
include_intercept (bool) – If True, includes an intercept in the model. If False, the intercept is set to 0.
random_state (int, np.random.RandomState instance, or None) – Random number seed or state.
- Returns
X (ndarray, shape (n_samples, n_features)) – The design matrix.
y (ndarray, shape (n_samples,)) – The response vector.
beta (ndarray, shape (n_features,)) – The feature coefficients.
intercept (float) – The intercept. If include_intercept is False, then intercept is zero.
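The Poisson generator differs from the linear one mainly in how coefficients and responses are drawn. The NumPy sketch below illustrates one plausible reading of the parameters and is not pyuoi's implementation: the helper name `sketch_poisson_regression`, the `seed` argument, the log link (rate = exp(X beta + intercept)), and the intercept distribution are assumptions.

```python
import numpy as np


def sketch_poisson_regression(n_samples=100, n_features=5, n_informative=2,
                              X_loc=0.0, X_scale=0.125,
                              beta_shape=1.0, beta_scale=3.0,
                              include_intercept=False, seed=None):
    """Hypothetical helper: sparse Poisson model with a log link."""
    rng = np.random.RandomState(seed)

    # Design matrix with mean X_loc and standard deviation X_scale.
    X = rng.normal(loc=X_loc, scale=X_scale, size=(n_samples, n_features))

    # Sparse coefficients: gamma-distributed on the informative features.
    beta = np.zeros(n_features)
    support = rng.choice(n_features, size=n_informative, replace=False)
    beta[support] = rng.gamma(shape=beta_shape, scale=beta_scale,
                              size=n_informative)

    # Assumption: the intercept is drawn from the same gamma distribution.
    if include_intercept:
        intercept = rng.gamma(shape=beta_shape, scale=beta_scale)
    else:
        intercept = 0.0

    # Assumption: log link, so the Poisson rate is exp(linear predictor).
    rate = np.exp(X @ beta + intercept)
    y = rng.poisson(rate)
    return X, y, beta, intercept


X, y, beta, intercept = sketch_poisson_regression(seed=0)
```

The small default X_scale keeps the linear predictor, and therefore the exponentiated rate, in a moderate range; with a log link, widely spread features would produce very large counts.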