grama.fit package
Submodules
grama.fit.fit_lolo module
grama.fit.fit_lolo.fit_lolo
Fit a random forest
Fit a random forest to given data. Specify inputs and outputs, or inherit from an existing model.
Parameters: - df (DataFrame) – Data for function fitting
- md (gr.Model) – Model from which to inherit metadata
- var (list(str) or None) – List of features or None for all except outputs
- out (list(str)) – List of outputs to fit
- domain (gr.Domain) – Domain for new model
- density (gr.Density) – Density for new model
- seed (int or None) – Random seed for fitting process
- return_std (bool) – Return predictive standard deviations?
- suppress_warnings (bool) – Suppress warnings when fitting?
Keyword Arguments: - num_trees (int) –
- use_jackknife (bool) –
- leaf_learner –
- () –
- subset_strategy (str) –
- min_leaf_instances (int) –
- max_depth (int) –
- uncertainty_calibration (bool) –
- randomize_pivot_location (bool) –
- randomly_rotate_features (bool) –
Returns: A grama model with fitted function(s)
Return type: gr.Model
Notes
- Wrapper for lolopy.learners.RandomForestRegressor
grama.fit.fit_scikitlearn module
grama.fit.fit_scikitlearn.fit_gp
Fit a Gaussian process
Fit a Gaussian process to given data. Specify var and out, or inherit from an existing model.
Note that the new model will have two outputs y_mean, y_sd for each original output y. The quantity y_mean is the best-fit value, while y_sd is a measure of predictive uncertainty.
Parameters: - df (DataFrame) – Data for function fitting
- md (gr.Model) – Model from which to inherit metadata
- var (list(str) or None) – List of features or None for all except outputs
- out (list(str)) – List of outputs to fit
- domain (gr.Domain) – Domain for new model
- density (gr.Density) – Density for new model
- seed (int or None) – Random seed for fitting process
- kernels (sklearn.gaussian_process.kernels.Kernel or dict or None) – Kernel for GP
- n_restart (int) – Restarts for optimization
- alpha (float or iterable) – Value added to diagonal of kernel matrix
- suppress_warnings (bool) – Suppress warnings when fitting?
Returns: A grama model with fitted function(s)
Return type: gr.Model
Notes
- Wrapper for sklearn.gaussian_process.GaussianProcessRegressor
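Example (sketch): the snippet below calls the wrapped scikit-learn class directly to illustrate where a mean and a standard deviation per output come from. It is not grama's own implementation. The synthetic data, the kernel choice, and the mapping of n_restart and seed onto n_restarts_optimizer and random_state are assumptions made for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Synthetic training data (illustrative only)
X_train = np.linspace(0, 1, 8).reshape(-1, 1)
y_train = np.sin(2 * np.pi * X_train).ravel()

# Kernel choice is an assumption for this sketch
kernel = ConstantKernel(1.0) * RBF(length_scale=0.2)

gp = GaussianProcessRegressor(
    kernel=kernel,
    n_restarts_optimizer=5,  # assumed analogue of n_restart
    alpha=1e-10,             # value added to the diagonal of the kernel matrix
    random_state=101,        # assumed analogue of seed
)
gp.fit(X_train, y_train)

# Predict at new points, requesting the predictive standard deviation
X_new = np.linspace(0, 1, 5).reshape(-1, 1)
y_mean, y_sd = gp.predict(X_new, return_std=True)
```

Here y_mean and y_sd play the roles of the two outputs described above: one array of best-fit values and one array of predictive standard deviations, both with one entry per row of X_new.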
grama.fit.fit_scikitlearn.fit_lm
Fit a linear model
Fit a linear model to given data. Specify inputs and outputs, or inherit from an existing model.
Parameters: - df (DataFrame) – Data for function fitting
- md (gr.Model) – Model from which to inherit metadata
- var (list(str) or None) – List of features or None for all except outputs
- out (list(str)) – List of outputs to fit
- domain (gr.Domain) – Domain for new model
- density (gr.Density) – Density for new model
- seed (int or None) – Random seed for fitting process
- suppress_warnings (bool) – Suppress warnings when fitting?
Returns: A grama model with fitted function(s)
Return type: gr.Model
Notes
- Wrapper for sklearn.linear_model.LinearRegression
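Example (sketch): a linear fit of this kind, written with scikit-learn's LinearRegression. That this is the underlying estimator is an assumption, and the snippet is not grama's own code. The DataFrame and the column names x1, x2, y are hypothetical. The lists var and out mirror the parameters of the same name.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data: y is an exact linear function of x1 and x2
df = pd.DataFrame({"x1": [0.0, 1.0, 2.0, 3.0], "x2": [1.0, 0.0, 1.0, 0.0]})
df["y"] = 2.0 * df["x1"] - 1.0 * df["x2"] + 0.5

var = ["x1", "x2"]  # features
out = ["y"]         # outputs to fit

lm = LinearRegression().fit(df[var], df[out])
coef = lm.coef_            # one row of coefficients per output
intercept = lm.intercept_  # one intercept per output
```

Because the data are noise-free, the fit should recover the coefficients 2.0 and -1.0 and the intercept 0.5 to numerical precision.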
grama.fit.fit_scikitlearn.fit_rf
Fit a random forest
Fit a random forest to given data. Specify inputs and outputs, or inherit from an existing model.
Parameters: - df (DataFrame) – Data for function fitting
- md (gr.Model) – Model from which to inherit metadata
- var (list(str) or None) – List of features or None for all except outputs
- out (list(str)) – List of outputs to fit
- domain (gr.Domain) – Domain for new model
- density (gr.Density) – Density for new model
- seed (int or None) – Random seed for fitting process
- suppress_warnings (bool) – Suppress warnings when fitting?
Keyword Arguments: - n_estimators (int) –
- criterion (str) –
- max_depth (int or None) –
- min_samples_split (int, float) –
- min_samples_leaf (int, float) –
- min_weight_fraction_leaf (float) –
- max_features (int, float, string) –
- max_leaf_nodes (int or None) –
- min_impurity_decrease (float) –
- min_impurity_split (float) –
- bootstrap (bool) –
- oob_score (bool) –
- n_jobs (int or None) –
- random_state (int) –
Returns: A grama model with fitted function(s)
Return type: gr.Model
Notes
- Wrapper for sklearn.ensemble.RandomForestRegressor
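Example (sketch): the keyword arguments listed above share their names with parameters of the wrapped scikit-learn class, so a direct call to that class gives a rough picture of what they control. This is not grama's own code, and the data and column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical training data
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.uniform(size=50), "x2": rng.uniform(size=50)})
df["y"] = df["x1"] ** 2 + df["x2"]

var = ["x1", "x2"]

rf = RandomForestRegressor(
    n_estimators=20,     # number of trees in the forest
    max_depth=4,         # cap on the depth of each tree
    min_samples_leaf=2,  # minimum samples required at a leaf
    random_state=101,    # fixes the bootstrap and feature sampling
)
rf.fit(df[var], df["y"])
y_pred = rf.predict(df[var])
```

Note that min_impurity_split was removed from scikit-learn in version 1.0, so passing it may fail with newer scikit-learn releases.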
grama.fit.fit_scikitlearn.fit_kmeans
K-means cluster a dataset
Create a cluster-labeling model on a dataset using the K-means algorithm.
Parameters: - df (DataFrame) – Dataset to cluster
- var (list or None) – Variables in df on which to cluster. Use None to cluster on all variables.
- colname (string) – Name of cluster id; will be output in cluster model.
- seed (int) – Random seed for kmeans clustering
Keyword Arguments: - n_clusters (int) – Number of clusters to fit
- random_state (int or None) –
Returns: Model that labels input data
Return type: gr.Model
Notes
- A wrapper for sklearn.cluster.KMeans
References
Scikit-learn: Machine Learning in Python, Pedregosa et al. JMLR 12, pp. 2825-2830, 2011.
Examples:
import grama as gr
from grama.data import df_stang
from grama.fit import ft_kmeans
DF = gr.Intention()

md_cluster = (
    df_stang
    >> ft_kmeans(var=["E", "mu"], n_clusters=2)
)

(
    md_cluster
    >> gr.ev_df(df_stang)
    >> gr.tf_group_by(DF.cluster_id)
    >> gr.tf_summarize(
        thick_mean=gr.mean(DF.thick),
        thick_sd=gr.sd(DF.thick),
        n=gr.n(),
    )
)
grama.fit.fit_statsmodels module
grama.fit.fit_statsmodels.fit_ols
Fit a function via Ordinary Least Squares
Fit a function via ordinary least squares. Specify features via statsmodels formula.
Parameters: - df (DataFrame) – Data for function fitting
- formulae (list(str)) – List of statsmodels formulae
- domain (gr.Domain) – Domain for new model
- density (gr.Density) – Density for new model
Returns: A grama model with fitted function(s)
Return type: gr.Model
Preconditions: - domain is not None
- len(formulae) == len(domain.inputs)
Notes
- Wrapper for statsmodels.formula.api.ols
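Example (sketch): each entry of formulae is a statsmodels formula string of the form "output ~ terms". The snippet below calls the wrapped statsmodels function directly on a list of such strings. It is not grama's own code, and the data and the column names x and y are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: y is an exact linear function of x
df = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0, 4.0]})
df["y"] = 1.0 + 3.0 * df["x"]

# One formula per fitted output
formulae = ["y ~ x"]
fits = [smf.ols(f, data=df).fit() for f in formulae]

# Fitted coefficients, indexed by term name ("Intercept", "x")
params = fits[0].params
```

With noise-free data the fitted parameters should match the generating values, an intercept of 1.0 and a slope of 3.0.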