grama package¶

Subpackages¶

Submodules¶

grama.comp_building module¶

grama.comp_building.comp_freeze¶

Freeze inputs to a model

Composition. Remove inputs from a model by “freezing” them to fixed values.

Parameters:	model (gr.Model) – Model to compose df (pd.DataFrame) – DataFrame of values for freeze var (dict) – Dictionary of inputs to freeze (keys) to specific values (value) Provide each key/value pair as a keyword argument
Returns:	New model with frozen inputs
Return type:	gr.Model

Examples::: import grama as gr

grama.comp_building.comp_function¶

Add a function to a model

Composition. Add a (non-vectorized) function to an existing model. See gr.comp_vec_function() to add a function that is vectorized over DataFrames.

Parameters:	model (gr.Model) – Model to compose fun (function) – Function taking at least one real input and returns at least one real output (fun: R^d -> R^r). Each input to fun must be a scalar (See examples below). var (list(string)) – List of variable names or number of inputs out (list(string)) – List of output names or number of outputs runtime (numeric) – Estimated single-eval runtime (in seconds)
Returns:	New model with added function
Return type:	gr.Model

@pre (len(var) == d) | (var == d) @pre (len(out) == r) | (var == r)

Examples:

import grama as gr
## Simple example
md = (
    gr.Model("test")
    >> gr.cp_function(
        fun=lambda x: x,
        var=["x"],
        out=["y"],
        name="identity"
    )
)

## Providing a function with multiple inputs
md2 = (
    gr.Model("test 2")
    >> gr.cp_function(
        fun=lambda x, y: x + y,
        var=["x", "y"],
        out=["f"],
    )
)

## Providing a function with multiple inputs and multiple outputs
md3 = (
    gr.Model("test 3")
    >> gr.cp_function(
        fun=lambda x, y: [x + y, x - y],
        var=["x", "y"],
        out=["f", "g"],
    )
)

grama.comp_building.comp_vec_function¶

Add a vectorized function to a model

Composition. Add a function to an existing model. Function must be vectorized over DataFrames, and must add new columns matching its out set. See gr.cp_function() to add a non-vectorized function.

Notes

The helper function gr.df_make() is useful for constructing a vectorized lambda function (see Examples below).

Parameters:	model (gr.model) – Model to compose fun (function) – Function taking R^d -> R^r; must be vectorized over DataFrames; it must take a DataFrame as input and return a new DataFrame var (list(string)) – List of variable names or number of inputs out (list(string)) – List of output names or number of outputs runtime (numeric) – Estimated single-eval runtime (in seconds)
Returns:	New model with added function
Return type:	gr.model

@pre (len(var) == d) | (var == d) @pre (len(out) == r) | (var == r)

Examples:

import grama as gr
## Simple example
md = (
    gr.Model("Test")
    >> gr.cp_vec_function(
        fun=lambda df: gr.df_make(y=1 + 0.5 * df.x),
        var=["x"],
        out=["y"],
        name="Simple linear function",
    )
)

grama.comp_building.comp_md_det¶

Add a Model with deterministic evaluation

Composition. Add a model as function to an existing model. Evaluate the model deterministically (ignore any random variables).

Parameters:	model (gr.model) – Model to compose md (gr.model) – Model to add as function
Returns:	New model with added function
Return type:	gr.model

Examples:

import grama as gr
from grama.models import make_cantilever_beam
## Use functions from beam model, but introduce new marginals
md_plate = (
    gr.Model("New beam model")
    >> gr.cp_md_det(md=make_cantilever_beam())
    >> gr.cp_marginals(
        H=gr.marg_mom("norm", mean=1000, cov=0.1),
        V=gr.marg_mom("norm", mean=500, cov=0.1),
    )
    >> gr.cp_copula_independence()
)

grama.comp_building.comp_md_sample¶

Add a Model with sampled evaluation

Composition. Add a model as function to an existing model. Evaluate the model via sampling (one sample per evaluation). Use param to turn model parameters into variables of new model. Random variables of composed model are turned into outputs of new model.

Parameters:	model (gr.model) – Model to compose md (gr.model) – Model to add as function param (dict) – Parameters in md to treat as var; entries must be of the form “var”: (“param1”, “param2”, …) rand2out (bool) – Add model’s var_rand to outputs (to track values)
Returns:	New model with added function
Return type:	gr.model

Examples:

grama.comp_building.comp_bounds¶

Add variable bounds to a model

Composition. Add variable bounds to an existing model. Bounds are specified by iterable; the model variable name is specified by the keyword argument name.

Parameters:	model (gr.model) – Model to modify

Kwargs:: var (iterable): Bound information; keyword argument name is targeted variable, value should be a length 2 iterable of the form (lower_bound, upper_bound)

Returns:	Model with new marginals
Return type:	gr.model

@pre len(var) >= 2

Examples:

import grama as gr
md = (
    gr.Model("Simple Model")
    >> gr.cp_function(
        lambda x: x[0] + x[1],
        var=["x0", "x1"],
        out=1
    )
    >> gr.cp_bounds(
        x0=(-1, 1),    # Finite bounds
        x1=(0, np.inf) # Semi-infinite bounds
    )
)

grama.comp_building.comp_copula_independence¶

Add an independence copula to model

Composition. Add an independence copula to an existing model.

NOTE: Independence of random variables is a very strong assumption! Recommend using comp_copula_gaussian instead.

Parameters:	model (gr.model) – Model to modify
Returns:	Model with independence copula
Return type:	gr.model

Examples:

import grama as gr
md = (
    gr.Model()
    >> gr.cp_marginals(
        x0=gr.marg_mom("norm", mean=0, sd=1),
        x1=gr.marg_mom("beta", mean=0, sd=1, skew=0, kurt=2),
    )
    >> gr.cp_copula_independence()
)

grama.comp_building.comp_copula_gaussian¶

Add a Gaussian copula to model

Composition. Add a gaussian copula to an existing model.

Parameters:	model (gr.model) – Model to modify df_corr (DataFrame) – Correlation information df_data (DataFrame) – Data for automated fitting
Returns:	Model with Gaussian copula
Return type:	gr.model

Examples:

import grama as gr
## Manual assignment
md_manual = (gr.Model()
    >> gr.cp_marginals(
        x0=gr.marg_mom("norm", mean=0, sd=1),
        x1=gr.marg_mom("uniform", mean=0, sd=1),
    )
    >> gr.cp_copula_gaussian(
        # Specify correlation structure explicitly
        df_corr=gr.df_make(var1="x0", var2="x1", corr=0.5)
    )
)
## Automated fitting
from grama.data import df_stang
md_auto = (
    gr.Model()
    >> gr.cp_marginals(
        E=gr.marg_fit("norm", df_stang.E),
        mu=gr.marg_fit("beta", df_stang.mu),
        thick=gr.marg_fit("norm", df_stang.thick)
    )
    >> gr.cp_copula_gaussian(df_data=df_stang)
)

grama.comp_building.comp_marginals¶

Add marginals to a model

Composition. Add marginals to an existing model. Marginals are specified either by dictionary entries or by gr.Marginal() object. The model variable name is specified by the keyword argument name.

Notes

Several helper functions are available to fit marginal distributions

gr.marg_fit() fits a distribution using a dataset (via maximum likelihood estimation)
gr.marg_mom() fits a distribution using moments (via the method of moments)
gr.marg_gkde() fits a gaussian kernel density using a dataset

Parameters:	model (gr.model) – Model to modify var (dict OR gr.Marginal) – Marginal information
Returns:	Model with new marginals
Return type:	gr.model

Examples:

import grama as gr
## Print all of the grama-supported distributions
print(gr.valid_dist.keys())
## Construct a simple example model
md = (
    gr.Model()
    >> gr.cp_function(
        lambda x: x[0] + x[1],
        var=["x0", "x1"],
        out=["y"],
    )
    >> gr.cp_marginals(
        x0=gr.marg_mom("norm", mean=0, sd=1),
    )
)

grama.comp_building.getvars(f)¶

Get a function’s variable names

Convenience function for extracting a function’s variable names. Intended for use with gr.cp_function().

Parameters:	f (function) – Function whose variable names are desired
Returns:	Variable names
Return type:	tuple

Examples::

import grama as gr

def fun(x, y, z):

return x + y + z

md = (

gr.Model(“Test model”) >> gr.cp_function(

fun=fun, var=gr.getvars(fun), out=[“w”],

)

grama.comp_metamodels module¶

grama.comp_metamodels.comp_metamodel¶

Create a metamodel

Composition: Create a metamodel from an existing model. This convenience function essentially applies a recipe of Evaluation followed by Fitting. Default methods are Latin Hypercube Evaluation and Ordinary Least Squares Fitting with linear features.

Parameters:	model (gr.model) – Original model, to be evaluated and fit n (numeric) – Number of samples to draw ev (gr.eval_) – Evaluation strategy, default eval_lhs ft (gr.fit_) – Fitting strategy, default fit_ols w/ linear features seed (int) – Random seed, default None
Returns:	Metamodel
Return type:	gr.model

grama.core module¶

class grama.core.CopulaIndependence(var_rand)¶

Bases: grama.core.Copula

copy()¶

Copy

Returns:	Copy of present copula
Return type:	gr.CopulaIndependence

d(u)¶

Density function

Parameters:	u (array-like) –
Returns:	Copula density values
Return type:	array

dudz(z)¶

Jacobian

Parameters:	z (array-like) –
Returns:
Return type:	array

sample(n=1, seed=None)¶

Draw samples from copula

Parameters:	n (int) – Number of samples seed (int) – Random seed
Returns:	Independent samples
Return type:	DataFrame

summary()¶

u2z(u)¶

Transform to standard-normal space

Parameters:	u (array-like) –
Returns:
Return type:	array

z2u(z)¶

Transform to uniform-marginal space

Parameters:	z (array-like) –
Returns:
Return type:	array

class grama.core.CopulaGaussian(var_rand, df_corr)¶

Bases: grama.core.Copula

copy()¶

Copy

Parameters:	self (gr.CopulaGaussian) –
Returns:
Return type:	gr.CopulaGaussian

d(u)¶

Copula density function

Parameters:	u (array-like) –
Returns:
Return type:	array

dudz(z)¶

Jacobian

Parameters:	z (array-like) –
Returns:
Return type:	array

sample(n=1, seed=None)¶

Draw samples from copula

Draw samples according to gaussian copula dependence structure.

Parameters:	self (gr.CopulaGaussian) – n (int) – Number of samples to draw
Returns:	Copula samples
Return type:	array

summary()¶

u2z(u)¶

Transform to standard-normal space

Parameters:	u (array-like) –
Returns:
Return type:	array

z2u(z)¶

Transform to uniform-marginal space

Parameters:	z (array-like) –
Returns:
Return type:	array

class grama.core.Domain(bounds=None, feasible=None)¶

Bases: object

Parent class for input domains

The domain defines constraints on the variables. Together with a model’s functions, it defines the mathematical domain of a model.

bound_summary(var)¶

copy()¶

get_bound(var)¶

get_nominal(var)¶

get_width(var)¶

update()¶

class grama.core.Density(marginals=None, copula=None)¶

Bases: object

Parent class for joint densities

The density is defined for all the random variables; therefore it explicitly defines the list of random variables, and together implicitly defines the deterministic variables via

domain.var + [functions.var] - density.marginals.keys()

copy()¶

d(df)¶

Evaluate PDF

Evaluate the PDF of the density.

Parameters:	df (DataFrame) – Values

pr2sample(df_prval)¶

Convert CDF probabilities to samples

Convert random variable CDF probabilities to random variable samples. Ignores dependence structure.

Parameters:	df_prval (DataFrame) – Values in [0,1]
Returns:	Variable samples (quantiles)
Return type:	DataFrame

@pre df_prval.shape[1] == len(self.var_rand) @post result.shape[1] == len(self.var_rand)

sample(n=1, seed=None)¶

Draw samples from joint density

Draw samples according to joint density using marginal and copula information.

Parameters:	n (int) – Number of samples to draw seed (int) – random seed to use
Returns:	Joint density samples
Return type:	DataFrame

sample2pr(df_sample)¶

Convert samples to CDF probabilities

Convert random variable samples to CDF probabilities. Ignores dependence structure.

Parameters:	df_sample (DataFrame) – Values in [0,1]
Returns:	Variable samples (quantiles)
Return type:	DataFrame

@pre df_sample.shape[1] == len(self.var_rand) @post result.shape[1] == len(self.var_rand)

summary_copula()¶

summary_marginal(var)¶

class grama.core.Function(func, var, out, name, runtime)¶

Bases: object

Parent class for functions.

A function specifies its own inputs and outputs; these are subsets of the full model’s inputs and outputs.

copy()¶: Make a copy

eval(df)¶

Evaluate function

Evaluate a grama function; loops over dataframe rows. Intended for internal use.

Args: df (DataFrame): Input values to evaluate

Returns:	Result values
Return type:	DataFrame

summary()¶: Returns a summary string

class grama.core.FunctionModel(md, ev=None, var=None, out=None)¶

Bases: grama.core.Function

gr.Model as gr.Function

copy()¶: Make a copy

eval(df)¶

Evaluate function; DataFrame vectorized

Evaluate grama model as a function. Modify the parameters before

Parameters:	df (DataFrame) – Input values to evaluate
Returns:	Result values
Return type:	DataFrame

class grama.core.FunctionVectorized(func, var, out, name, runtime)¶

Bases: grama.core.Function

copy()¶: Make a copy

eval(df)¶

Evaluate function; DataFrame vectorized

Evaluate grama function. Assumes function is vectorized over dataframes.

Parameters:	df (DataFrame) – Input values to evaluate
Returns:	Result values
Return type:	DataFrame

class grama.core.Model(name=None, functions=None, domain=None, density=None)¶

Bases: object

Parent class for grama models.

copy()¶: Make a copy of this model

det_nom()¶

Return nominal conditions for deterministic variables

Returns:	Nominal values for deterministic variables
Return type:	DataFrame

dxdz(z)¶

Inverse transform jacobian

Compute jacobian of the inverse transform X = phi^{-1}(Z)

Parameters:	z (array) – Single vector of standard normal values. Order of entries must match self.var_rand
Returns:	Jacobian of inverse transform
Return type:	2d array

evaluate_df(df)¶

Evaluate function using an input dataframe

Parameters:	df (DataFrame) – Variable values at which to evaluate model functions
Returns:	Output results
Return type:	DataFrame

make_dag(expand={})¶: Generate a DAG for the model

name_corr()¶: Name the correlation elements

norm2rand(df)¶

Transform standard normal samples to model random variable space

Transform a DataFrame of standard normal samples to model random variable space.

Parameters:	df (DataFrame) – Random variable samples; must have columns for all of self.var_rand.
Returns:	Samples in standard-normal space
Return type:	DataFrame

printpretty()¶: Formatted print of model attributes

rand2norm(df)¶

Transform random samples to standard normal space

Transform a DataFrame of random variable samples to standard normal space

Parameters:	df (DataFrame) – Random variable samples; must have columns for all of self.var_rand.
Returns:	Samples in standard-normal space
Return type:	DataFrame

runtime(n)¶

Estimate runtime

Estimate the total runtime to evaluate n observations.

Parameters:	self (gr.Model) – n (int) – Number of observations
Returns:	Estimated runtime, in seconds
Return type:	float

runtime_message(df)¶

Runtime message

Estimate total runtime based on proposed DataFrame, prepare a message for console print.

Parameters:	self (gr.Model) – df (DataFrame) – Data to evaluate
Returns:	Runtime message
Return type:	str

show_dag(expand={})¶: Generate and show a DAG for the model

string_rep()¶

update()¶

Update model public attributes based on functions, domain, and density.

The variables and parameters are implicitly defined by the model attributes. For internal use.

self.functions defines the full list of inputs
self.domain defines the constraints on the model’s domain
self.density defines the random variables

var_outer(df_rand, df_det=None)¶

Outer product of random and deterministic samples

Parameters:	df_rand (DataFrame) – df_det (DataFrame) – set to “nom” for nominal evaluation
Returns:	Outer product of samples
Return type:	DataFrame

x2z(x)¶

Transform to standard normal space

Transform a single vector of random variable values to standard normal space.

Parameters:	x (array) – Single vector of values in var_rand. Order of entries must match self.var_rand
Returns:	Single vector of values transformed to standard normal space
Return type:	array

z2x(z)¶

Transform to random variable space

Transform a single vector of normal values to the model’s random variable space.

Parameters:	z (array) – Single vector of standard normal values. Order of entries must match self.var_rand
Returns:	Single vector of values transformed to model random variable space
Return type:	array

grama.dataframe module¶

grama.dataframe.df_equal(df1, df2, close=False, precision=3)¶

Check DataFrame equality

Check that two dataframes have the same columns and values. Allows column order to differ.

Parameters:	df1 (DataFrame) – Comparison input 1 df2 (DataFrame) – Comparison input 2
Returns:	Result of comparison
Return type:	bool

grama.dataframe.df_make(**kwargs)¶

Construct a DataFrame

Helper function to construct a DataFrame. A common use-case is to use df_make() to pass values to the df (and related) keyword arguments succinctly.

Kwargs:: varname (iterable): Column for constructed dataframe; column name inferred from variable name.

Returns:	Constructed DataFrame
Return type:	DataFrame

Preconditions:: All provided iterables must have identical length or be of length one. All provided variable names (keyword arguments) must be distinct.

Examples:

import grama as gr
from models import make_test
md = make_test()
(
    md
    >> gr.ev_sample(
        n=1e3,
        df_det=gr.df_make(x2=[1, 2])
    )
)

grama.dataframe.df_grid(**kwargs)¶

Construct a DataFrame as outer-product

Helper function to construct a DataFrame as an outer-product of the given columns.

Kwargs:: varname (iterable): Column for constructed dataframe; column name inferred from variable name.

Returns:	Constructed DataFrame
Return type:	DataFrame

Preconditions:: All provided variable names (keyword arguments) must be distinct.

Examples:

import grama as gr
## Make an empty DataFrame
gr.df_grid()
## Create a row for every pair of values (6 rows total)
gr.df_grid(x=["A", "B"], y=[1, 2, 3])

grama.eval_contour module¶

grama.eval_contour.eval_contour¶

Generate contours from a model

Generates contours from a model. Evaluates the model on a dense grid, then runs marching squares to generate contours. Supports targeting multiple outputs and handling auxiliary inputs not included in the contour map.

Parameters:	model (gr.Model) – Model to evaluate. var (list of str) – Model inputs to target; must provide exactly two inputs, and both must have finite domain width. out (list of str) – Model output(s) for contour generation. df (DataFrame or None) – Levels for model variables not included in var (auxiliary inputs). If provided var and model.var contain the same values, then df may equal None. levels (dict) – Specific output levels for contour generation; overrides n_levels. n_side (int) – Side resolution for grid; n_side2 total evaluations. n_levels** (int) – Number of contour levels.
Returns:	Points along contours, organized by output and auxiliary variable levels.
Return type:	DataFrame

Examples:

import grama as gr
## Multiple outputs
(
    gr.Model()
    >> gr.cp_vec_function(
        fun=lambda df: gr.df_make(
            f=df.x**2 + df.y**2,
            g=df.x + df.y,
        ),
        var=["x", "y"],
        out=["f", "g"],
    )
    >> gr.cp_bounds(
        x=(-1, +1),
        y=(-1, +1),
    )
    >> gr.ev_contour(
        var=["x", "y"],
        out=["f", "g"],
    )
    # Contours with no auxiliary variables can autoplot
    >> gr.pt_auto()
)

## Auxiliary inputs
(
    gr.Model()
    >> gr.cp_vec_function(
        fun=lambda df: gr.df_make(
            f=df.c * df.x + (1 - df.c) * df.y,
        ),
        var=["x", "y"],
        out=["f", "g"],
    )
    >> gr.cp_bounds(
        x=(-1, +1),
        y=(-1, +1),
    )
    >> gr.ev_contour(
        var=["x", "y"],
        out=["f"],
        df=gr.df_make(c=[0, 1])
    )

    # Contours with auxiliary variables should be manually plotted
    >> gr.ggplot(gr.aes("x", "y"))
    + gr.geom_segment(gr.aes(xend="x_end", yend="y_end", group="level", color="c"))
)

grama.eval_defaults module¶

grama.eval_defaults.eval_df¶

Evaluate model at given values

Evaluates a given model at a given dataframe.

Parameters:	model (gr.Model) – Model to evaluate df (DataFrame) – Input dataframe to evaluate append (bool) – Append results to original dataframe?
Returns:	Results of model evaluation
Return type:	DataFrame

Examples:

import grama as gr
from grama.models import make_test
md = make_test()
df = gr.df_make(x0=0, x1=1, x2=2)
md >> gr.ev_df(df=df)

grama.eval_defaults.eval_linup¶

Linear uncertainty propagation

Approximates the variance of output models using a linearization of functions—linear uncertainty propagation. Optionally decomposes the output variance according to additive factors from each input.

Parameters:	model (gr.Model) – Model to evaluate df_base (DataFrame or None) – Base levels for evaluation; use “nom” for nominal levels. append (bool) – Append results to nominal inputs? decomp (bool) – Decompose the fractional variances according to each input? decimals (int) – Decimals to report for fractional variances n (float) – Monte Carlo sample size, for estimating covariance matrix seed (int or None) – Monte Carlo seed
Returns:	Output variances at each deterministic level
Return type:	DataFrame

Examples:

import grama as gr
from grama.models import make_test
md = make_test()
## Set manual levels for deterministic inputs; nominal levels for random inputs
md >> gr.ev_linup(df_det=gr.df_make(x2=[0, 1, 2])
## Use nominal deterministic levels
md >> gr.ev_linup(df_base="nom")

grama.eval_defaults.eval_nominal¶

Evaluate model at nominal values

Evaluates a given model at a model nominal conditions (median) of random inputs. Optionally set nominal values for the deterministic inputs.

Parameters:	model (gr.Model) – Model to evaluate df_det (DataFrame or None) – Deterministic levels for evaluation; use “nom” for nominal deterministic levels. If provided model has no deterministic variables (model.n_var_det == 0), then df_det may equal None. append (bool) – Append results to nominal inputs? skip (bool) – Skip evaluation of the functions?
Returns:	Results of nominal model evaluation or unevaluated design
Return type:	DataFrame

Examples:

import grama as gr
from grama.models import make_test
md = make_test()
## Set manual levels for deterministic inputs; nominal levels for random inputs
md >> gr.ev_nominal(df_det=gr.df_make(x2=[0, 1, 2])
## Use nominal deterministic levels
md >> gr.ev_nominal(df_det="nom")

grama.eval_defaults.eval_grad_fd¶

Finite-difference gradient approximation

Evaluates a given model with a central-difference stencil to approximate the gradient.

Parameters:	model (gr.Model) – Model to differentiate h (numeric) – finite difference stepsize, single (scalar): or per-input (array) df_base (DataFrame) – Base-points for gradient calculations var (list(str) or string) – list of variables to differentiate, or flag; “rand” for var_rand, “det” for var_det append (bool) – Append results to base point inputs? skip (bool) – Skip evaluation of the functions?
Returns:	Gradient approximation or unevaluated design
Return type:	DataFrame

@pre (not isinstance(h, collections.Sequence)) |: (h.shape[0] == df_base.shape[1])

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md = make_cantilever_beam()
# Select base point(s)
df_nom = md >> gr.ev_nominal(df_det="nom")
# Approximate the gradient
df_grad = md >> gr.ev_grad_fd(df_base=df_nom)

grama.eval_defaults.eval_sample¶

Draw a random sample

Evaluates a model with a random sample of the random model inputs. Generates outer product with deterministic levels (common random numbers) OR generates a sample fully-independent of deterministic levels (non-common random numbers).

For more expensive models, it can be helpful to tune n to achieve a reasonable runtime. An even more effective approach is to use skip evaluation along with tran_sp() to evaluate a small, representative sample. (See examples below.)

Parameters:	model (gr.Model) – Model to evaluate n (numeric) – number of observations to draw df_det (DataFrame or None) – Deterministic levels for evaluation; use “nom” for nominal deterministic levels. If provided model has no deterministic variables (model.n_var_det == 0), then df_det may equal None. seed (int) – random seed to use append (bool) – Append results to input values? skip (bool) – Skip evaluation of the functions? comm (bool) – Use common random numbers (CRN) across deterministic levels? CRN will tend to aid in the comparison of statistics across deterministic levels and enhance the convergence of stochastic optimization. ind_comm (str or None) – Name of realization index column; not added if None
Returns:	Results of evaluation or unevaluated design
Return type:	DataFrame

Examples:

import grama as gr
from grama.models import make_test
DF = gr.Intention()

# Simple random sample evaluation
md = make_test()
df = md >> gr.ev_sample(n=1e2, df_det="nom")
df.describe()

## Use autoplot to visualize results
(
    md
    >> gr.ev_sample(n=1e2, df_det="nom")
    >> gr.pt_auto()
)

## Cantilever beam examples
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()

## Use the realization index to facilitate plotting
# Try running this without the `group` aesthetic in `geom_line()`;
# without the group the plot will not have multiple lines.
(
    md_beam
    >> gr.ev_sample(
        n=20,
        df_det=gr.df_make(w=3, t=gr.linspace(2, 4, 100)),
        ind_comm="idx",
    )

    >> gr.ggplot(gr.aes("t", "g_stress"))
    + gr.geom_line(gr.aes(color="w", group="idx"))
)

## Use iocorr to generate input/output correlation tile plot
(
    md_beam
    >> gr.ev_sample(n=1e3, df_det="nom", skip=True)
    # Generate input/output correlation summary
    >> gr.tf_iocorr()
    # Visualize
    >> gr.pt_auto()
)

## Use support points to reduce model runtime
(
    md_beam
    # Generate large input sample but don't evaluate outputs
    >> gr.ev_sample(n=1e5, df_det="nom", skip=True)
    # Reduce to a smaller---but representative---sample
    >> gr.tf_sp(n=50)
    # Evaluate the outputs
    >> gr.tf_md(md_beam)
)

## Estimate probabilities
(
    md_beam
    # Generate large
    >> gr.ev_sample(n=1e5, df_det="nom")
    # Estimate probabilities of failure
    >> gr.tf_summarize(
        pof_stress=gr.mean(DF.g_stress <= 0),
        pof_disp=gr.mean(DF.g_disp <= 0),
    )
)

grama.eval_defaults.eval_conservative¶

Evaluates a given model at conservative input quantiles

Uses model specifications to determine the “conservative” direction for each input, and evaluates the model at the desired quantile. Provided primarily for comparing UQ against pseudo-deterministic design criteria (del Rosario et al.; 2021).

Note that if there is no conservative direction for the given input, the given quantile will be ignored and the median will automatically be selected.

Parameters:	model (gr.Model) – Model to evaluate quantiles (numeric) – lower quantile value(s) for conservative evaluation; can be single value for all inputs, array of values for each random variable, or None for default 0.01. values in [0, 0.5] df_det (DataFrame or None) – Deterministic levels for evaluation; use “nom” for nominal deterministic levels. If provided model has no deterministic variables (model.n_var_det == 0), then df_det may equal None. append (bool) – Append results to conservative inputs? skip (bool) – Skip evaluation of the functions?
Returns:	Conservative evaluation or unevaluated design
Return type:	DataFrame

References

del Rosario, Zachary, Richard W. Fenrich, and Gianluca Iaccarino. “When Are Allowables Conservative?.” AIAA Journal 59.5 (2021): 1760-1772.

Examples:

import grama as gr
from grama.models import make_plate_buckle
md = make_plate_buckle()
# Evaluate at conservative input values
md >> gr.ev_conservative(df_det="nom")

grama.eval_opt module¶

grama.eval_opt.eval_nls¶

Estimate with Nonlinear Least Squares (NLS)

Estimate best-fit variable levels with nonlinear least squares (NLS).

Parameters:	model (gr.Model) – Model to analyze. All model variables selected for fitting must be bounded or random. Deterministic variables may have semi-infinite bounds. df_data (DataFrame) – Data for estimating parameters. Variables not found in df_data optimized in fitting. out (list or None) – Output contributions to consider in computing MSE. Assumed to be model.out if left as None. var_fix (list or None) – Variables to fix to nominal levels. Note that variables with domain width zero will automatically be fixed. df_init (DataFrame or None) – Initial guesses for parameters; overrides n_restart (int) – append (bool) – Append metadata? (Initial guess, MSE, optimizer status) tol (float) – Optimizer convergence tolerance n_maxiter (int) – Optimizer maximum iterations n_restart – Number of restarts; beyond n_restart=1 random restarts are used. seed (int OR None) – Random seed for restarts verbose (bool) – Print messages to console?
Returns:	Results of estimation
Return type:	DataFrame

Examples:

import grama as gr
from grama.data import df_trajectory_full
from grama.models import make_trajectory_linear

md_trajectory = make_trajectory_linear()

df_fit = (
    md_trajectory
    >> gr.ev_nls(df_data=df_trajectory_full)
)

print(df_fit)

grama.eval_opt.eval_min¶

Constrained minimization using functions from a model

Perform constrained minimization using functions from a model. Model must have deterministic variables only.

Wrapper for scipy.optimize.minimize

Parameters:	model (gr.Model) – Model to analyze. All model variables must be deterministic. out_min (str) – Output to use as minimization objective. out_geq (None OR list of str) – Outputs to use as geq constraints; out >= 0 out_leq (None OR list of str) – Outputs to use as leq constraints; out <= 0 out_eq (None OR list of str) – Outputs to use as equality constraints; out == 0 method (str) – Optimization method; see the documentation for scipy.optimize.minimize for options. tol (float) – Optimization objective convergence tolerance n_restart (int) – Number of restarts; beyond n_restart=1 random restarts are used. df_start (None or DataFrame) – Specific starting values to use; overrides n_restart if non None provided.
Returns:	Results of optimization
Return type:	DataFrame

Examples:

import grama as gr
## Define a model with objective and constraints
md = (
    gr.Model("Constrained Rosenbrock")
    >> gr.cp_function(
        fun=lambda x: (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2,
        var=["x", "y"],
        out=["c"],
    )
    >> gr.cp_function(
        fun=lambda x: (x[0] - 1)**3 - x[1] + 1,
        var=["x", "y"],
        out=["g1"],
    )
    >> gr.cp_function(
        fun=lambda x: x[0] + x[1] - 2,
        var=["x", "y"],
        out=["g2"],
    )
    >> gr.cp_bounds(
        x=(-1.5, +1.5),
        y=(-0.5, +2.5),
    )
)

## Run the optimizer
md >> gr.ev_min(
    out_min="c",
    out_leq=["g1", "g2"]
)

grama.eval_pnd module¶

grama.eval_pnd.approx_pnd(X_pred, X_cov, X_train, signs, n=10000, seed=None)¶

Approximate the PND via mixture importance sampling

Approximate the probability non-dominated (PND) for a set of predictive points using a mixture importance sampling approach. Predictive points are assumed to have predictive gaussian distributions (with specified mean and covariance matrix).

Parameters:	X_pred (2d numpy array) – Predictive values X_cov (iterable of 2d numpy arrays) – Predictive covariance matrices X_train (2d numpy array) – Training values, used to determine existing Pareto frontier signs (numpy array of +/-1 values) – Array of optimization signs: {-1: Minimize, +1 Maximize}

Kwargs:: n (int): Number of draws for importance sampler seed (int): Seed for random state

Returns:	Estimated PND values var_values (array): Estimated variance values
Return type:	pr_scores (array)

References

Owen Monte Carlo theory, methods and examples (2013)

grama.eval_pnd.choice(a, size=None, replace=True, p=None)¶

Generates a random sample from a given 1-D array

New in version 1.7.0.

Note

New code should use the choice method of a default_rng() instance instead; please see the random-quick-start.

Parameters:	a (1-D array-like or int) – If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if it were `np.arange(a)` size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., `(m, n, k)`, then `m * n * k` samples are drawn. Default is None, in which case a single value is returned. replace (boolean, optional) – Whether the sample is with or without replacement. Default is True, meaning that a value of `a` can be selected multiple times. p (1-D array-like, optional) – The probabilities associated with each entry in a. If not given, the sample assumes a uniform distribution over all entries in `a`.
Returns:	samples – The generated random samples
Return type:	single item or ndarray
Raises:	`ValueError` – If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size

See also

randint(), shuffle(), permutation()

Generator.choice(): which should be used in new code

Notes

Setting user-specified probabilities through p uses a more general but less efficient sampler than the default. The general sampler produces a different sample than the optimized sampler even if each element of p is 1 / len(a).

Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.

Examples

Generate a uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)

Generate a non-uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random

Generate a uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]

Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random

Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:

>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')

grama.eval_pnd.dprop(X, Sigma, X_means)¶

Evaluate the PDF of a mixture proposal distribution

Evaluate the PDF of a gaussian mixture distribution with a common covariance matrix and different means.

Parameters:	X (2d numpy array) – Observations for which to evaluate the density Sigma (2d numpy array) – Common covariance matrix for mixture distribution X_means (2d numpy array) – Means for mixture distribution

Preconditions:: X.shape[1] == X_means.shape[1] Sigma.shape[0] == X_means.shape[1] Sigma.shape[1] == X_means.shape[1]

Returns:	Density values
Return type:	numpy array

grama.eval_pnd.eval_pnd¶

Approximate the probability non-dominated (PND)

Approximates the probability non-dominated (PND) for a set of training points given a fitted probabilistic model. Used to rank a set of candidates in the context of multiobjective optimization.

Parameters:	model (gr.model) – predictive model to evaluate df_train (DataFrame) – dataframe with training data df_test (DataFrame) – dataframe with test data signs (dict) – dict with the variables you would like to use and minimization or maximization parameter for each append (bool) – Append df_test to pnd algorithm outputs

Kwargs:: n (int): Number of draws for importance sampler seed (int): declarble seed value for reproducibility

Returns:	Results of predictive model going through a PND algorithm. Conatians both values and their scores.
Return type:	DataFrame

References

del Rosario, Zachary, et al. “Assessing the frontier: Active learning, model accuracy, and multi-objective candidate discovery and optimization.” The Journal of Chemical Physics 153.2 (2020): 024112.

Examples:

import grama as gr

## Define a ground-truth model
md_true = gr.make_pareto_random()
df_data = (
    md_true
    >> gr.ev_sample(n=2e3, seed=101, df_det="nom")
)
## Generate test/train data
df_train = (
    df_data
    >> gr.tf_sample(n=10)
)

df_test = (
    df_data
    >> gr.anti_join(
        df_train,
        by = ["x1","x2"]
    )
    >> gr.tf_sample(n=200)
)
## Fit a model to training data
md_fit = (
    df_train
    >> gr.ft_gp(
        var=["x1","x2"]
        out=["y1","y2"]
    )
)
## Rank training points by PND algorithm
df_pnd = (
    md_fit
    >> gr.ev_pnd(
        df_train,
        df_test,
        signs = {"y1":1, "y2":1},
        seed = 101
    )
    >> gr.tf_arrange(gr.desc(DF.pr_scores))
)

grama.eval_pnd.floor_sig(sig, sig_min=1e-08)¶: Floor an array of variances

grama.eval_pnd.make_proposal_sigma(X, idx_pareto, X_cov)¶

Estimate parameters for PND importance distribution

Parameters:	X (2d numpy array) – Full set of response values idx_pareto (boolean array) – Whether each response value is a Pareto point X_cov (iterable of 2d numpy array) – Predictive covariance matrices

Preconditions:: X.shape[0] == len(idx_pareto) X_cov[i].shape[0] == X.shape[1] for all valid i X_cov[i].shape[1] == X.shape[1] for all valid i

Returns:	Common covariance matrix for PND importance distribution
Return type:	2d numpy array

grama.eval_pnd.multivariate_normal(mean, cov, size=None, check_valid='warn', tol=1e-8)¶

Draw random samples from a multivariate normal distribution.

The multivariate normal, multinormal or Gaussian distribution is a generalization of the one-dimensional normal distribution to higher dimensions. Such a distribution is specified by its mean and covariance matrix. These parameters are analogous to the mean (average or “center”) and variance (standard deviation, or “width,” squared) of the one-dimensional normal distribution.

Note

New code should use the multivariate_normal method of a default_rng() instance instead; please see the random-quick-start.

Parameters:

mean (1-D array_like, of length N) – Mean of the N-dimensional distribution.
cov (2-D array_like, of shape (N, N)) – Covariance matrix of the distribution. It must be symmetric and positive-semidefinite for proper sampling.
size (int or tuple of ints, optional) – Given a shape of, for example, (m,n,k), m*n*k samples are generated, and packed in an m-by-n-by-k arrangement. Because each sample is N-dimensional, the output shape is (m,n,k,N). If no shape is specified, a single (N-D) sample is returned.
check_valid ({ 'warn', 'raise', 'ignore' }, optional) – Behavior when the covariance matrix is not positive semidefinite.
tol (float, optional) – Tolerance when checking the singular values in covariance matrix. cov is cast to double before the check.

Returns:

out – The drawn samples, of shape size, if that was provided. If not, the shape is (N,).

In other words, each entry out[i,j,...,:] is an N-dimensional value drawn from the distribution.

Return type:

ndarray

See also

Generator.multivariate_normal(): which should be used for new code.

Notes

The mean is a coordinate in N-dimensional space, which represents the location where samples are most likely to be generated. This is analogous to the peak of the bell curve for the one-dimensional or univariate normal distribution.

Covariance indicates the level to which two variables vary together. From the multivariate normal distribution, we draw N-dimensional samples, $X = [x_1, x_2, ... x_N]$ . The covariance matrix element $C_{ij}$ is the covariance of $x_i$ and $x_j$ . The element $C_{ii}$ is the variance of $x_i$ (i.e. its “spread”).

Instead of specifying the full covariance matrix, popular approximations include:

Spherical covariance (cov is a multiple of the identity matrix)

Diagonal covariance (cov has non-negative elements, and only on the diagonal)

This geometrical property can be seen in two dimensions by plotting generated data-points:

>>> mean = [0, 0]
>>> cov = [[1, 0], [0, 100]]  # diagonal covariance

Diagonal covariance means that points are oriented along x or y-axis:

>>> import matplotlib.pyplot as plt
>>> x, y = np.random.multivariate_normal(mean, cov, 5000).T
>>> plt.plot(x, y, 'x')
>>> plt.axis('equal')
>>> plt.show()

Note that the covariance matrix must be positive semidefinite (a.k.a. nonnegative-definite). Otherwise, the behavior of this method is undefined and backwards compatibility is not guaranteed.

References

[1]	Papoulis, A., “Probability, Random Variables, and Stochastic Processes,” 3rd ed., New York: McGraw-Hill, 1991.

[2]	Duda, R. O., Hart, P. E., and Stork, D. G., “Pattern Classification,” 2nd ed., New York: Wiley, 2001.

Examples

>>> mean = (1, 2)
>>> cov = [[1, 0], [0, 1]]
>>> x = np.random.multivariate_normal(mean, cov, (3, 3))
>>> x.shape
(3, 3, 2)

The following is probably true, given that 0.6 is roughly twice the standard deviation:

>>> list((x[0,0,:] - mean) < 0.6)
[True, True] # random

grama.eval_pnd.pareto_min_rel(X_test, X_base=None)¶

Determine if rows in X_test are optimal, compared to X_base

Finds the Pareto-efficient test-points that minimize the column values, relative to a given set of base-points.

Parameters:	X_test (2d numpy array) – Test point observations; rows are observations, columns are features X_base (2d numpy array) – Base point observations; rows are observations, columns are features
Returns:	Indicates if test observation is Pareto-efficient, relative to base points
Return type:	array of boolean values

References

Owen Monte Carlo theory, methods and examples (2013)

grama.eval_pnd.rprop(n, Sigma, X_means, seed=None)¶

Draw a sample from gaussian mixture distribution

Draw a sample from a gaussian mixture distribution with a common covariance matrix and different means.

Parameters:	n (int) – Number of observations in sample Sigma (2d numpy array) – Common covariance matrix for mixture distribution X_means (2d numpy array) – Means for mixture distribution

Preconditions:: Sigma.shape[0] == X_means.shape[1] Sigma.shape[1] == X_means.shape[1]

Returns:	Sample of observations
Return type:	2d numpy array

grama.eval_pnd.set_seed()¶

seed(self, seed=None)

Reseed a legacy MT19937 BitGenerator

Notes

This is a convenience, legacy function.

The best practice is to not reseed a BitGenerator, rather to recreate a new one. This method is here for legacy reasons. This example demonstrates best practice.

>>> from numpy.random import MT19937
>>> from numpy.random import RandomState, SeedSequence
>>> rs = RandomState(MT19937(SeedSequence(123456789)))
# Later, you want to restart the stream
>>> rs = RandomState(MT19937(SeedSequence(987654321)))

grama.eval_random module¶

grama.eval_random.eval_sinews¶

Sweep study

Perform coordinate sweeps over each model random variable (“sinew” design). Use random starting points drawn from the joint density. Optionally sweep the deterministic variables.

For more expensive models, it can be helpful to tune n_density and n_sweeps to achieve a reasonable runtime.

Use gr.plot_auto() to construct a quick visualization of the output dataframe. Use skip version to visualize the design, and non-skipped version to visualize the results.

Parameters:	model (gr.Model) – Model to evaluate n_density (numeric) – Number of points along each sweep n_sweeps (numeric) – Number of sweeps per-random variable seed (int) – Random seed to use df_det (DataFrame) – Deterministic levels for evaluation; use “nom” for nominal deterministic levels, use “swp” to sweep deterministic variables varname (str) – Column name to give for sweep variable; default=”sweep_var” indname (str) – Column name to give for sweep index; default=”sweep_ind” append (bool) – Append results to conservative inputs? skip (bool) – Skip evaluation of the functions?
Returns:	Results of evaluation or unevaluated design
Return type:	DataFrame

Examples:

import grama as gr
md = gr.make_cantilever_beam()
# Skip evaluation, used to visualize the design (input points)
df_design = md >> gr.ev_sinews(df_det="nom", skip=True)
df_design >> gr.pt_auto()
# Visualize the input-to-output relationships of the model
df_sinew = md >> gr.ev_sinews(df_det="nom")
df_sinew >> gr.pt_auto()

grama.eval_random.eval_hybrid¶

Hybrid points for Sobol’ indices

Use the “hybrid point” design (Sobol’, 1999) to support estimating Sobol’ indices. Use gr.tran_sobol() to post-process the results and compute estimates.

Parameters:	model (gr.Model) – Model to evaluate; must have CopulaIndependence n (numeric) – Number of points along each sweep plan (str) – Sobol’ index to compute; plan={“first”, “total”} seed (int) – Random seed to use df_det (DataFrame) – Deterministic levels for evaluation; use “nom” for nominal deterministic levels. varname (str) – Column name to give for sweep variable; default=”hybrid_var” append (bool) – Append results to conservative inputs? skip (bool) – Skip evaluation of the functions?
Returns:	Results of evaluation or unevaluated design
Return type:	DataFrame

References

I.M. Sobol’, “Sensitivity Estimates for Nonlinear Mathematical Models” (1999) MMCE, Vol 1.

Examples:

import grama as gr
md = gr.make_cantilever_beam()
## Compute the first-order indices
df_first = md >> gr.ev_hybrid(df_det="nom", plan="first")
df_first >> gr.tf_sobol()
## Compute the total-order indices
df_total = md >> gr.ev_hybrid(df_det="nom", plan="total")
df_total >> gr.tf_sobol()

grama.eval_tail module¶

grama.eval_tail.eval_form_pma¶

Tail quantile via FORM PMA

Approximate the desired tail quantiles using the performance measure approach (PMA) of the first-order reliability method (FORM) [1]. Select limit states to minimize at desired quantile with one of: betas (reliability indices), rels (reliability values), or pofs (probabilities of failure). Provide confidence levels cons and estimator covariance df_corr to compute with margin in beta [2].

Note that under the performance measure approach, the optimized limit state value g is sought to be non-negative $g geq 0$. This is usually included as a constraint in optimization, which can be accomplished in by using ``gr.eval_form_pnd()` within a model definition—see the Examples below for more details.

Parameters:	model (gr.Model) – Model to analyze df_det (DataFrame) – Deterministic levels for evaluation; use “nom” for nominal deterministic levels. betas (dict) – Target reliability indices; key = limit state name; must be in model.out value = reliability index; beta = Phi^{-1}(reliability) rels (dict) – Target reliability values; key = limit state name; must be in model.out value = reliability pofs (dict) – Target probabilities of failure (POFs); key = limit state name; must be in model.out value = probability of failure; pof = 1 - reliability cons (dict or None) – Target confidence levels; key = limit state name; must be in model.out value = confidence level, in (0, 1) df_corr (DataFrame or None) – Sampling distribution covariance entries; parameters with no information assumed to be known exactly. n_maxiter (int) – Maximum iterations for each optimization run n_restart (int) – Number of restarts (== number of optimization runs) append (bool) – Append MPP results for random values? verbose (bool) – Print optimization results?
Returns:	Results of MPP search
Return type:	DataFrame

Notes

Since FORM PMA relies on optimization over the limit state, it is often beneficial to scale your limit state to keep values near unity.

References

Tu, Choi, and Park, “A new study on reliability-based design optimization,” Journal of Mechanical Design, 1999 del Rosario, Fenrich, and Iaccarino, “Fast precision margin with the first-order reliability method,” AIAA Journal, 2019

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()
## Evaluate the reliability of specified designs
(
    md_beam
    >> gr.ev_form_pma(
        # Specify target reliability
        betas=dict(g_stress=3, g_disp=3),
        # Analyze three different thicknesses
        df_det=gr.df_make(t=[2, 3, 4], w=3)
    )
)

## Specify reliability in POF form
(
    md_beam
    >> gr.ev_form_pma(
        # Specify target reliability
        pofs=dict(g_stress=1e-3, g_disp=1e-3),
        # Analyze three different thicknesses
        df_det=gr.df_make(t=[2, 3, 4], w=3)
    )
)

## Build a nested model for optimization under uncertainty
md_opt = (
    gr.Model("Beam Optimization")
    >> gr.cp_vec_function(
        fun=lambda df: gr.df_make(c_area=df.w * df.t),
        var=["w", "t"],
        out=["c_area"],
        name="Area objective",
    )
    >> gr.cp_vec_function(
        fun=lambda df: gr.eval_form_pma(
            md_beam,
            betas=dict(g_stress=3, g_disp=3),
            df_det=df,
            append=False,
        )
        var=["w", "t"],
        out=["g_stress", "g_disp"],
        name="Reliability constraints",
    )
    >> gr.cp_bounds(w=(2, 4), t=(2, 4))
)
# Run the optimization
(
    md_opt
    >> gr.ev_min(
        out_min="c_area",
        out_geq=["g_stress", "g_disp"],
    )
)

grama.eval_tail.eval_form_ria¶

Tail reliability via FORM RIA

Approximate the desired tail probability using the reliability index approach (RIA) of the first-order reliability method (FORM) [1]. Select limit states to analyze with list input limits. Choose output type using format argument (betas for reliability indices, rels for realibility values, pofs for probabilify of failure values). Provide confidence levels cons and estimator covariance df_corr to compute with margin in beta [2].

Note that the reliability index approach (RIA) is generally less stable than the performance measure approach (PMA). Consider using gr.eval_form_pma() instead, particularly when using FORM to optimize a design.

Parameters:	model (gr.Model) – Model to analyze limits (list) – Target limit states; must be in model.out; limit state assumed to be critical at g == 0. format (str) – One of (“betas”, “rels”, “pofs”). Format for computed reliability information cons (dict or None) – Target confidence levels; key = limit state name; must be in model.out value = confidence level, in (0, 1) df_corr (DataFrame or None) – Sampling distribution covariance entries; parameters with no information assumed to be known exactly. df_det (DataFrame) – Deterministic levels for evaluation; use “nom” for nominal deterministic levels. n_maxiter (int) – Maximum iterations for each optimization run n_restart (int) – Number of restarts (== number of optimization runs) append (bool) – Append MPP results for random values? verbose (bool) – Print optimization results?
Returns:	Results of MPP search
Return type:	DataFrame

Notes

Since FORM RIA relies on optimization over the limit state, it is often beneficial to scale your limit state to keep values near unity.

References

[1] Tu, Choi, and Park, “A new study on reliability-based design optimization,” Journal of Mechanical Design, 1999 [2] del Rosario, Fenrich, and Iaccarino, “Fast precision margin with the first-order reliability method,” AIAA Journal, 2019

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()
## Evaluate the reliability of specified designs
(
    md_beam
    >> gr.ev_form_ria(
        # Specify limit states to analyze
        limits=("g_stress", "g_disp"),
        # Analyze three different thicknesses
        df_det=gr.df_make(t=[2, 3, 4], w=3)
    )
)

grama.fit_synonyms module¶

grama.fit_synonyms.fit_nls¶

Fit a model with Nonlinear Least Squares (NLS)

Estimate best-fit variable levels with nonlinear least squares (NLS), and return an executable model with those frozen best-fit levels. Optionally, fit a distribution on the parameters to quantify parametric uncertainty.

Note that it is highly likely you will need to adjust either the bounds of your model or the initial guess of the optimizer (via df_init). Use these to guide the optimizer towards reasonable input values, otherwise you are likely to get nonsensical results.

Note: This is a synonym for eval_nls(); see the documentation for eval_nls() for keyword argument options available beyond those listed here.

Parameters:

df_data (DataFrame) – Data for estimating best-fit variable levels. Variables not found in df_data optimized for fitting.
md (gr.Model) – Model to analyze. All model variables selected for fitting must be bounded or random. Deterministic variables may have semi-infinite bounds.
var_fix (list or None) – Variables to fix to nominal levels. Note that variables with domain width zero will automatically be fixed.
df_init (DataFrame) – Initial guesses for parameters; overrides n_restart
n_restart (int) – Number of restarts to try; the first try is at the nominal conditions of the model. Returned model will use the least-error parameter set among restarts tested.
n_maxiter (int) – Optimizer maximum iterations
verbose (bool) – Print best-fit parameters to console?
uq_method (str OR None) – If string, select method to quantify parameter uncertainties. If None, provide best-fit values only. Methods: uq_method = “linpool”: assume normal errors; linearly approximate parameter effects; equally pool variance matrices for each output

Returns:

Model for evaluation with best-fit variables frozen to: optimized levels.

Return type:

gr.Model

Examples:

import grama as gr
from grama.data import df_trajectory_windowed
from grama.models import make_trajectory_linear
DF = gr.Intention()

md_trajectory = make_trajectory_linear()

## Fit
md_fitted = (
    df_trajectory_windowed
    >> gr.ft_nls(md=md_trajectory)
)

## Fit, stabilize with initial guess
md_fitted = (
    df_trajectory_windowed
    >> gr.ft_nls(
        md=md_trajectory,
        df_init=gr.df_make(u0=18, v0=18, tau=16),
    )
)

## Fit, and return input uncertainties
md_fitted_uq = (
    df_trajectory_windowed
    >> gr.ft_nls(
        md=md_trajectory,
        df_init=gr.df_make(u0=18, v0=18, tau=16),
        uq_method="linpool",
    )
)

grama.marginals module¶

grama.marginals.marg_gkde(data, sign=None)¶

Fit a gaussian KDE to data

Fits a gaussian kernel density estimate (KDE) to data.

Parameters:	data (iterable) – Data for fit sign (bool) – Include sign? (Optional)
Returns:	Marginal distribution
Return type:	gr.MarginalGKDE

Examples:

import grama as gr
from grama.data import df_stang
md = (
    gr.Model("Marginal Example")
    >> gr.cp_marginals(
        E=gr.marg_gkde(df_stang.E),
        mu=gr.marg_gkde(df_stang.mu),
    )
)
md

grama.marginals.marg_fit(dist, data, name=True, sign=None, **kwargs)¶

Fit scipy.stats continuous distirbution

Fits a scipy.stats continuous distribution. Intended to be used to define a marginal distribution from data.

Parameters:	dist (str) – Distribution to fit data (iterable) – Data for fit name (bool) – Include distribution name? sign (bool) – Include sign? (Optional) loc (float) – Initial guess for location loc parameter (Optional) scale (float) – Initial guess for scale scale parameter (Optional) floc (float) – Value to fix the location loc parameter (Optional) Note that for distributions such as “lognorm” or “weibull_min”, setting floc=0 selects the 2-parameter version of the distribution. fscale (float) – Value to fix the location scale parameter (Optional) f* (float) – Value to fix the specified shape parameter (Optional) e.g. give fc to fix the c parameter
Returns:	Distribution
Return type:	gr.MarginalNamed

Examples:

import grama as gr
from grama.data import df_shewhart
# Fit normal distribution
mg_normal = gr.marg_named(
    "norm",
    df_shewhart.tensile_strength,
)
# Fit two-parameter Weibull distribution
mg_weibull2 = gr.marg_named(
    "weibull_min",
    df_shewhart.tensile_strength,
    floc=0,        # 2-parameter has frozen loc == 0
)
# Fit three-parameter Weibull distribution
mg_weibull3 = gr.marg_named(
    "weibull_min",
    df_shewhart.tensile_strength,
    loc=0,        # 3-parameter fit tends to be unstable;
                  # an inital guess helps stabilize fit
)
# Inspect fits with QQ plot
(
    df_shewhart
    >> gr.tf_mutate(
        q_normal=gr.qqvals(DF.tensile_strength, marg=mg_normal),
        q_weibull2=gr.qqvals(DF.tensile_strength, marg=mg_weibull2),
    )
    >> gr.tf_pivot_longer(
        columns=[
            "q_normal",
            "q_weibull2",
        ],
        names_to=[".value", "Distribution"],
        names_sep="_"
    )

    >> gr.ggplot(gr.aes("q", "tensile_strength"))
    + gr.geom_abline(intercept=0, slope=1, linetype="dashed")
    + gr.geom_point(gr.aes(color="Distribution"))
)

grama.marginals.marg_mom(dist, mean=None, sd=None, cov=None, var=None, skew=None, kurt=None, kurt_excess=None, floc=None, sign=0, dict_x0=None)¶

Fit scipy.stats continuous distribution via moments

Fit a continuous distribution using the method of moments. Select a distribution shape and provide numerical values for a convenient set of common moments.

This routine uses a vector-output root finding routine to match the moments. You may set an optional initial guess for the distribution parameters using the dict_x0 argument.

Parameters:	dist (str) – Name of distribution to fit

Kwargs:

mean (float): Mean of distribution sd (float): Standard deviation of distribution cov (float): Coefficient of Variation of distribution (sd / mean) var (float): Variance of distribution; only one of sd and var can be provided. skew (float): Skewness of distribution kurt (float): Kurtosis of distribution kurt_excess (float): Excess kurtosis of distribution; kurt_excess = kurt - 3.

Only one of kurt and kurt_excess can be provided.

floc (float or None): Frozen value for location parameter: Note that for distributions such as “lognorm” or “weibull_min”, setting floc=0 selects the 2-parameter version of the distribution.

sign (-1, 0, +1): Sign dict_x0 (dict): Dictionary of initial parameter guesses

Returns:	Distribution
Return type:	gr.MarginalNamed

Examples:

import grama as gr
## Fit a normal distribution
mg_norm = gr.marg_mom("norm", mean=0, sd=1)
## Fit a (3-parameter) lognormal distribution
mg_lognorm = gr.marg_mom("lognorm", mean=1, sd=1, skew=1)
## Fit a lognormal, controlling kurtosis instead
mg_lognorm = gr.marg_mom("lognorm", mean=1, sd=1, kurt=1)
## Fit a 2-parameter lognormal; no skewness or kurtosis needed
mg_lognorm = gr.marg_mom("lognorm", mean=1, sd=1, floc=0)

## Not all moment combinations are feasible; this will fail
gr.marg_mom("beta", mean=1, sd=1, skew=0, kurt=4)
## Skewness and kurtosis are related for the beta distribution;
## a different combination is feasible
gr.marg_mom("beta", mean=1, sd=1, skew=0, kurt=2)

class grama.marginals.Marginal(sign=0)¶

Bases: abc.ABC

Parent class for marginal distributions

copy()¶

d(x)¶

fit(data)¶

p(x)¶

q(p)¶

r(n)¶

summary()¶

class grama.marginals.MarginalNamed(d_name=None, d_param=None, **kw)¶

Bases: grama.marginals.Marginal

Marginal using a named distribution from gr.valid_dist

copy()¶

d(**kwargs)¶

fit(data, **kwargs)¶

p(**kwargs)¶

q(**kwargs)¶

summary(dig=2)¶

class grama.marginals.MarginalGKDE(kde, atol=1e-06, **kw)¶

Bases: grama.marginals.Marginal

Marginal using scipy.stats.gaussian_kde

copy()¶

d(**kwargs)¶

fit(data)¶

p(**kwargs)¶

q(**kwargs)¶

summary()¶

grama.mutate_helpers module¶

grama.mutate_helpers.abs(*args, **kwargs)¶: Absolute value

grama.mutate_helpers.sin(*args, **kwargs)¶: Sine

grama.mutate_helpers.cos(*args, **kwargs)¶: Cosine

grama.mutate_helpers.tan(*args, **kwargs)¶: Tangent

grama.mutate_helpers.log(*args, **kwargs)¶: (Natural) log

grama.mutate_helpers.exp(*args, **kwargs)¶: Exponential (e-base)

grama.mutate_helpers.sqrt(*args, **kwargs)¶: Square-root

grama.mutate_helpers.pow(*args, **kwargs)¶

Power

Usage:: q = pow(x, p) := x ^ p

Parameters:	= base (x) – = exponent (p) –

grama.mutate_helpers.floor(*args, **kwargs)¶: Round downwards to nearest integer

grama.mutate_helpers.ceil(*args, **kwargs)¶: Round upwards to nearest integer

grama.mutate_helpers.round(*args, **kwargs)¶

Round value to desired precision

Parameters:	x (float or iterable of floats) – Value(s) to round decimals (int) – Number of decimal places for rounding; decimals=0 rounds to integers

grama.mutate_helpers.as_int(*args, **kwargs)¶: Cast to integer

grama.mutate_helpers.as_float(*args, **kwargs)¶: Cast to float

grama.mutate_helpers.as_str(*args, **kwargs)¶: Cast to string

grama.mutate_helpers.as_factor(*args, **kwargs)¶

Cast to factor

Parameters:	x (pd.Series) – Column to convert categories (list) – Categories (levels) of factor (Optional) ordered (boolean) – Order the factor?

grama.mutate_helpers.as_numeric(*args, **kwargs)¶: Cast to numeric

grama.mutate_helpers.fct_reorder(*args, **kwargs)¶

Reorder a factor on another variable

Parameters:	f (iterable OR DataFrame column) – factor to reorder x (iterable OR DataFrame column) – variable on which to reorder; specify aggregation method with fun fun (function) – aggregation function for reordering, default=median
Returns:	Iterable with levels sorted according to x
Return type:	Categorical

Examples:

import grama as gr
from grama.data import df_diamonds
DF = gr.Intention()
(
    df_diamonds
    >> gr.tf_mutate(cut=gr.fct_reorder(DF.cut, DF.price))
    >> gr.tf_group_by(DF.cut)
    >> gr.tf_summarize(max=gr.colmax(DF.price), mean=gr.mean(DF.price))
)

grama.mutate_helpers.fillna(*args, **kwargs)¶

Wrapper for pandas Series.fillna

(See below for Pandas documentation)

Examples:

import grama as gr
X = gr.Intention()
df = gr.df_make(x=[1, gr.NaN], y=[2, 3])
df_filled = (
    df
    >> gr.tf_mutate(x=gr.fillna(X.x, 0))
)

Fill NA/NaN values using the specified method.

Parameters:	value (scalar, dict, Series, or DataFrame) – Value to use to fill holes (e.g. 0), alternately a dict/Series/DataFrame of values specifying which value to use for each index (for a Series) or column (for a DataFrame). Values not in the dict/Series/DataFrame will not be filled. This value cannot be a list. method ({'backfill', 'bfill', 'pad', 'ffill', None}, default None) – Method to use for filling holes in reindexed Series pad / ffill: propagate last valid observation forward to next valid backfill / bfill: use next valid observation to fill gap. axis ({0 or 'index'}) – Axis along which to fill missing values. inplace (bool, default False) – If True, fill in-place. Note: this will modify any other views on this object (e.g., a no-copy slice for a column in a DataFrame). limit (int, default None) – If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled. If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled. Must be greater than 0 if not None. downcast (dict, default is None) – A dict of item->dtype of what to downcast if possible, or the string ‘infer’ which will try to downcast to an appropriate equal type (e.g. float64 to int64 if possible).
Returns:	Object with missing values filled or None if `inplace=True`.
Return type:	Series or None

See also

interpolate(): Fill NaN values using interpolation.
reindex(): Conform object to new index.
asfreq(): Convert TimeSeries to specified frequency.

Examples

>>> df = pd.DataFrame([[np.nan, 2, np.nan, 0],
...                    [3, 4, np.nan, 1],
...                    [np.nan, np.nan, np.nan, 5],
...                    [np.nan, 3, np.nan, 4]],
...                   columns=list("ABCD"))
>>> df
     A    B   C  D
0  NaN  2.0 NaN  0
1  3.0  4.0 NaN  1
2  NaN  NaN NaN  5
3  NaN  3.0 NaN  4

Replace all NaN elements with 0s.

>>> df.fillna(0)
    A   B   C   D
0   0.0 2.0 0.0 0
1   3.0 4.0 0.0 1
2   0.0 0.0 0.0 5
3   0.0 3.0 0.0 4

We can also propagate non-null values forward or backward.

>>> df.fillna(method="ffill")
    A   B   C   D
0   NaN 2.0 NaN 0
1   3.0 4.0 NaN 1
2   3.0 4.0 NaN 5
3   3.0 3.0 NaN 4

Replace all NaN elements in column ‘A’, ‘B’, ‘C’, and ‘D’, with 0, 1, 2, and 3 respectively.

>>> values = {"A": 0, "B": 1, "C": 2, "D": 3}
>>> df.fillna(value=values)
    A   B   C   D
0   0.0 2.0 2.0 0
1   3.0 4.0 2.0 1
2   0.0 1.0 2.0 5
3   0.0 3.0 2.0 4

Only replace the first NaN element.

>>> df.fillna(value=values, limit=1)
    A   B   C   D
0   0.0 2.0 2.0 0
1   3.0 4.0 NaN 1
2   NaN 1.0 NaN 5
3   NaN 3.0 NaN 4

When filling using a DataFrame, replacement happens along the same column names and same indices

>>> df2 = pd.DataFrame(np.zeros((4, 4)), columns=list("ABCE"))
>>> df.fillna(df2)
    A   B   C   D
0   0.0 2.0 0.0 0
1   3.0 4.0 0.0 1
2   0.0 0.0 0.0 5
3   0.0 3.0 0.0 4

grama.mutate_helpers.qnorm(*args, **kwargs)¶: Normal quantile function (inverse CDF)

grama.mutate_helpers.pnorm(*args, **kwargs)¶: Normal cumulative distribution function (CDF)

grama.mutate_helpers.dnorm(*args, **kwargs)¶: Normal probability density function (PDF)

grama.mutate_helpers.pareto_min(*args, **kwargs)¶

Determine if observation is a Pareto point

Find the Pareto-efficient points that minimize the provided features.

Parameters:	xi (iterable OR gr.Intention()) – Feature to minimize; use -X to maximize
Returns:	Indicates if observation is Pareto-efficient
Return type:	np.array of boolean

grama.mutate_helpers.stratum_min(*args, **kwargs)¶

Compute Pareto stratum number

Compute the Pareto stratum number for a given dataset.

Parameters:	xi (iterable OR gr.Intention()) – Feature to minimize; use -X to maximize max_depth (int) – Maximum depth for recursive computation; stratum numbers exceeding this value will not be computed and will be flagged as NaN.
Returns:	Pareto stratum number
Return type:	np.array of floats

References

del Rosario, Rupp, Kim, Antono, and Ling “Assessing the frontier: Active learning, model accuracy, and multi-objective candidate discovery and optimization” (2020) J. Chem. Phys.

grama.mutate_helpers.qqvals(*args, **kwargs)¶

Generate theoretical quantiles

Generate theoretical quantiles for a Q-Q plot. Can provide either a pre-defined Marginal object or the name of a distribution to fit.

Keyword Arguments:
Parameters:	x (array-like or gr.Intention()) – Target observations
	dist (str or None) – Name of scipy distribution to fit; see gr.valid_dist for list of valid distributions marg (gr.Marginal() or None) – Pre-fitted marginal
Returns:	Theoretical quantiles, matched in order with target observations
Return type:	Series

References

Filliben, J. J., “The Probability Plot Correlation Coefficient Test for Normality” (1975) Technometrics. DOI: 10.1080/00401706.1975.10489279

Examples:

import grama as gr
from grama.data import df_shewhart
DF = gr.Intention()

## Make a Q-Q plot
(
    df_shewhart
    >> gr.tf_mutate(q=gr.qqvals(DF.tensile_strength, dist="norm"))
    >> gr.ggplot(gr.aes("q", "tensile_strength"))
    + gr.geom_abline(intercept=0, slope=1, linetype="dashed")
    + gr.geom_point()
)

## Fit a marginal, use in Q-Q plot
mg_tys = gr.marg_fit("lognorm", df_shewhart.tensile_strength, floc=0)
# Plot to assess the fit
(
    df_shewhart
    >> gr.tf_mutate(q=gr.qqvals(DF.tensile_strength, marg=mg_tys))
    >> gr.ggplot(gr.aes("q", "tensile_strength"))
    + gr.geom_abline(intercept=0, slope=1, linetype="dashed")
    + gr.geom_point()
)

grama.mutate_helpers.linspace(*args, **kwargs)¶

Linearly-spaced values

Create an array of linearly-spaced values. Accepts keyword arguments for numpy.linspace.

Parameters:	a (numeric) – Smallest value b (numeric) – Largest value n (int) – Number of points
Returns:	Array of requested values
Return type:	numpy array

Notes

This is a symbolic alias for np.linspace(); you can use this in pipe-enabled functions.

Examples:

import grama as gr
from grama.data import df_stang
DF = gr.Intention()
(
    df_stang
    >> gr.tf_mutate(c=gr.linspace(0, 1, gr.n(DF.index)))
)

grama.mutate_helpers.logspace(*args, **kwargs)¶

Logarithmically-spaced values

Create an array of logarithmically-spaced values. Accepts keyword arguments for numpy.logspace.

Parameters:	a (numeric) – Smallest value b (numeric) – Largest value n (int) – Number of points
Returns:	Array of requested values
Return type:	numpy array

Notes

This is a symbolic alias for np.logspace(); you can use this in pipe-enabled functions.

Examples:

import grama as gr
from grama.data import df_stang
DF = gr.Intention()
(
    df_stang
    >> gr.tf_mutate(c=gr.logspace(0, 1, gr.n(DF.index)))
)

grama.mutate_helpers.consec(*args, **kwargs)¶

Flag consecutive runs of True values

Flag as True all entries associated with a “run” of True values. A “run” is defined as a user-defined count (given by i) of consecutive True values.

Parameters:	series (pd.Series) – Series of boolean values i (int) – User-defined count to define a run
Returns:	Boolean series flagging consecutive True values
Return type:	series

grama.plot_auto module¶

grama.plot_auto.plot_contour¶

Plot 2d contours

Plot contours. Usually called as a dispatch from plot_auto().

Parameters:	var (array of str) – Variables for plot axes out (str) – Name of output identifier column level (str) – Name of level identifier column aux (bool) – Auxillary variables present? color (str) – Color mode; options: “full”, “bw”
Returns:	Contour image
Return type:	ggplot

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()
(
    md_beam
    >> gr.ev_contour(
        var=["w", "t"],
        out=["g_stress"],
        # Set auxilliary inputs to nominal levels
        df=gr.eval_nominal(md_beam, df_det="nom"),
    )
    >> gr.pt_auto()
)

grama.plot_auto.plot_corrtile¶

grama.plot_auto.plot_scattermat¶

Create a scatterplot matrix

Create a scatterplot matrix. Often used to visualize a design (set of inputs points) before evaluating the functions.

Usually called as a dispatch from plot_auto().

Parameters:	var (list of strings) – Variables to plot color (str) – Color mode; value ignored as default vis is black & white
Returns:	Scatterplot matrix
Return type:	ggplot

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()
## Dispatch from autoplotter
(
    md_beam
    >> gr.ev_sample(n=100, df_det="nom", skip=True)
    >> gr.pt_auto()
)
## Re-create plot without metadata
(
    md_beam
    >> gr.ev_sample(n=100, df_det="nom")
    >> gr.pt_scattermat(var=md.var)
)

grama.plot_auto.plot_hists¶

Construct histograms

Create a set of histograms. Often used to visualize the results of random sampling for multiple outputs.

Usually called as a dispatch from plot_auto().

Parameters:	out (list of strings) – Variables to plot color (str) – Color mode; value ignored as default vis is black & white
Returns:	Seaborn histogram plot

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()
## Dispatch from autoplotter
(
    md_beam
    >> gr.ev_sample(n=100, df_det="nom")
    >> gr.pt_auto()
)
## Re-create without metadata
(
    md_beam
    >> gr.ev_sample(n=100, df_det="nom")
    >> gr.pt_hists(out=md.out)
)

grama.plot_auto.plot_sinew_inputs¶

Inspect a sinew design

Create a scatterplot matrix with hues. Often used to visualize a sinew design before evaluating the model functions.

Usually called as a dispatch from plot_auto().

Parameters:	df (Pandas DataFrame) – Input design data var (list of strings) – Variables to plot sweep_ind (string) – Sweep index column in df color (str) – Color mode; options: “full”, “bw”
Returns:	Seaborn scatterplot matrix

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()
## Dispatch from autoplotter
(
    md_beam
    >> gr.ev_sinews(df_det="swp", skip=True)
    >> gr.pt_auto()
)
## Re-create without metadata
(
    md_beam
    >> gr.ev_sinews(df_det="swp")
    >> gr.pt_sinew_inputs(var=md.var)
)

grama.plot_auto.plot_sinew_outputs¶

Construct sinew plot

Create a relational lineplot with hues for each sweep. Often used to visualize the outputs of a sinew design.

Usually called as a dispatch from plot_auto().

Parameters:	df (Pandas DataFrame) – Input design data with output results var (list of strings) – Variables to plot out (list of strings) – Outputs to plot sweep_ind (string) – Sweep index column in df sweep_var (string) – Swept variable column in df color (str) – Color mode; options: “full”, “bw”
Returns:	Relational lineplot

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()
## Dispatch from autoplotter
(
    md_beam
    >> gr.ev_sinews(df_det="swp")
    >> gr.pt_auto()
)
## Re-create without metadata
(
    md_beam
    >> gr.ev_sinews(df_det="swp")
    >> gr.pt_sinew_inputs(var=md.var, out=md.out)
)

grama.plot_auto.plot_auto¶

Automagic plotting

Convenience tool for various grama outputs. Prints delegated plotting function, which can be called manually with different arguments for more tailored plotting.

Parameters:	df (DataFrame) – Data output from appropriate grama routine. See gr.plot_list.keys() for list of supported methods. color (str) – Color mode; options: “full”, “bw”
Returns:	Plot results

grama.spc module¶

grama.spc.c_sd(n)¶

Anti-biasing constant for aggregate standard deviation

Returns the anti-biasing constant for aggregated standard deviation estimates. If the average of $k$ samples each size $n$ are averaged to produce $overline{S} = (1/k) sum_{i=1}^k S_i$, then the de-biased standard deviation is:

$$hat{sigma} = overline{S} / c(n)$$

Parameters:	n (int) – Sample (batch) size
Returns:	anti-biasing constant
Return type:	float

References

Kenett and Zacks, Modern Industrial Statistics (2014) 2nd Ed

grama.spc.B3(n)¶

Lower Control Limit constant for standard deviation

Returns the Lower Control Limit (LCL) constant for aggregated standard deviation estimates. If the average of $k$ samples each size $n$ are averaged to produce $overline{S} = (1/k) sum_{i=1}^k S_i$, then the LCL is:

$$LCL = B_3 overline{S}$$

Parameters:	n (int) – Sample (batch) size
Returns:	LCL constant
Return type:	float

References

Kenett and Zacks, Modern Industrial Statistics (2014) 2nd Ed, Equation (8.22)

grama.spc.B4(n)¶

Upper Control Limit constant for standard deviation

Returns the Upper Control Limit (UCL) constant for aggregated standard deviation estimates. If the average of $k$ samples each size $n$ are averaged to produce $overline{S} = (1/k) sum_{i=1}^k S_i$, then the UCL is:

$$UCL = B_4 overline{S}$$

Parameters:	n (int) – Sample (batch) size
Returns:	UCL constant
Return type:	float

References

Kenett and Zacks, Modern Industrial Statistics (2014) 2nd Ed, Equation (8.22)

grama.spc.plot_xbs(df, group, var, n_side=9, n_delta=6, color='full')¶

Construct Xbar and S chart

Construct an Xbar and S chart to assess the state of statistical control of a dataset.

Keyword Arguments:
Parameters:	df (DataFrame) – Data to analyze group (str) – Variable for grouping var (str) – Variable to study
	n_side (int) – Number of consecutive runs above/below centerline to flag n_delta (int) – Number of consecutive runs increasing/decreasing to flag color (str) – Color mode; “full” for full color, “bw” for greyscale
Returns:	Xbar and S chart
Return type:	plotnine object

Examples:

import grama as gr
DF = gr.Intention()

from grama.data import df_shewhart
(
    df_shewhart
    >> gr.tf_mutate(idx=DF.index // 10)
    >> gr.pt_xbs("idx", "tensile_strength")
)

grama.string_helpers module¶

grama.string_helpers.str_c(*args, **kwargs)¶: Concatenate strings

grama.string_helpers.str_count(*args, **kwargs)¶: Count the number of matches in a string.

grama.string_helpers.str_detect(*args, **kwargs)¶: Detect the presence of a pattern match in a string.

grama.string_helpers.str_extract(*args, **kwargs)¶: Extract the first regex pattern match

grama.string_helpers.str_locate(*args, **kwargs)¶: Find the indices of all pattern matches.

grama.string_helpers.str_replace(*args, **kwargs)¶: Replace the first matched pattern in each string.

grama.string_helpers.str_replace_all(*args, **kwargs)¶: Replace all occurences of pattern in each string.

grama.string_helpers.str_sub(*args, **kwargs)¶: Extract substrings

grama.string_helpers.str_split(*args, **kwargs)¶

Split string into list on pattern

Parameters:	string (str or iterable[str]) – String(s) to split pattern (str) – Regex pattern on which to split

Kwargs:: maxsplit (int): Maximum number of splits, or 0 for unlimited
Returns: str or iterable[str]: List (of lists) of strings

grama.string_helpers.str_which(*args, **kwargs)¶: Find the index of the first pattern match.

grama.string_helpers.str_to_lower(*args, **kwargs)¶

Make string lower-case

Parameters:	string (str or iterable[str]) – String(s) to modify

Returns: str or iterable[str]: List (of lists) of strings

grama.string_helpers.str_to_upper(*args, **kwargs)¶

Make string upper-case

Parameters:	string (str or iterable[str]) – String(s) to modify

Returns: str or iterable[str]: List (of lists) of strings

grama.string_helpers.str_to_snake(*args, **kwargs)¶

Make string snake case

Parameters:	string (str or iterable[str]) – String(s) to modify

Returns: str or iterable[str]: List (of lists) of strings

grama.support module¶

grama.support.tran_sp¶

Compact a dataset with support points

Parameters:	df (DataFrame) – dataset to compact n (int) – number of samples for compacted dataset var (list of str) – list of variables to compact, must all be numeric n_maxiter (int) – maximum number of iterations for support point algorithm tol (float) – convergence tolerance verbose (bool) – print messages to the console? standardize (bool) – standardize columns before running sp? (Restores after sp)
Returns:	dataset compacted with support points
Return type:	DataFrame

References

Mak and Joseph, “Support Points” (2018) The Annals of Statistics

Examples:

import grama as gr
# Compact an existing dataset
from grama.data import df_diamonds
df_sp = gr.tran_sp(df_diamonds, n=50, var=["price", "carat"])

# Use support points to reduce model runtime
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()
(
    md_beam
    ## Generate input sample but don't evaluate outputs
    >> gr.ev_sample(n=1e4, df_det="nom", skip=True)
    ## Reduce to a smaller---but representative---sample
    >> gr.tf_sp(n=50)
    ## Evaluate the outputs
    >> gr.tf_md(md_beam)
)

grama.tools module¶

grama.tools.add_pipe(fun)¶

grama.tools.copy_meta(df_source, df_target)¶

Internal metadata copy tool

Parameters:	df_source (DataFrame) – Original dataframe df_target (DataFrame) – Target dataframe; receives metadata
Returns:	df_target with copied metadata
Return type:	DataFrame

grama.tools.custom_formatwarning(msg, *args, **kwargs)¶

grama.tools.lookup(df, row_labels, col_labels)¶

2D lookup function for a dataframe (Old Pandas Lookup Method)

Parameters:	df (DataFrame) – DataFrame for lookup row_labels (List) – Row labels to use for lookup. col_labels (List) – Column labels to use for lookup.
Returns:	Found values
Return type:	numpy.ndarray

grama.tools.hide_traceback()¶: Configure Jupyter to hide error traceback

class grama.tools.pipe(function)¶: Bases: object

grama.tools.tran_outer¶

Outer merge

Perform an outer-merge on two dataframes.

Parameters:	df (DataFrame) – Data to merge df_outer (DataFrame) – Data to merge; outer
Returns:	Merged data
Return type:	DataFrame

Examples:

import grama as gr
import pandas as pd
df = pd.DataFrame(dict(x=[1,2]))
df_outer = pd.DataFrame(dict(y=[3,4]))
df_res = gr.tran_outer(df, df_outer)
df_res
     x  y
  0  1  3
  1  2  3
  2  1  4
  3  2  4

grama.tran_is module¶

grama.tran_is.tran_reweight¶

Reweight a sample using likelihood ratio

Reweight is a tool to facilitate “What If?” Monte Carlo simulation; specifically, to make testing a models with the same function(s) but different distributions more computationally efficient.

This tool automates calulation of the likelihood ratio between the distributions of two given models. Using the resulting weights to scale (elementwise multiply) output values and compute summaries is called importance sampling, enabling “What If?” testing. Use of this tool enables one to generate a single Monte Carlo sample, rather than multiple samples for each “What If?” scenario (avoiding extraneous function evaluations).

Let y be a generic output of the scenario. The importance sampling procedure is as follows:

Create a base scenario represented by md_base, and a desired number of alternative “What If?” scenarios represented my other models.
Use gr.eval_monte_carlo to generate a single sample df_base of size n using the base scenario md_base. Compute statistics of interest on the output y for this base scenario.
For each alternative scenario md_new, use gr.tran_reweight to generate weights weight, and use the tool n_e = gr.neff_is(DF.weight) to compute the effective sample size. If n_e << n, then importance sampling is unlikely to give an accurate estimate for this scenario.
For each alternative scenario md_new, use the relevant weights weight to scale the output value y_new = y * weight, and compute statistics of interest of the weighted output values.

Parameters:	df_base (DataFrame) – Monte Carlo results from md_base. md_base (Model) – Model used to generate df_base. md_new (Model) – Model defining a new “What If?” scenario. var_weight (string) – Name to give new column of weights. append (bool) – Append results to original DataFrame?
Returns:	Original df_base with added column of weights.
Return type:	DataFrame

Notes

The base scenario md_base should have fatter tails than any of the scenarios considered as df_new. See Owen (2013) Chapter 9 for more details.

References

A.B. Owen, “Monte Carlo theory, methods and examples” (2013)

Examples:

import grama as gr
from grama.models import make_cantilever_beam
DF = gr.Intention()

md_base = make_cantilever_beam()
md_new = (
    md_base
    >> gr.cp_marginals(
        H=dict(dist="norm", loc=500.0, scale=50.0),
    )
)

## Assess safety via simple Monte Carlo
df_base = gr.eval_monte_carlo(md_base, df_det="nom", n=1e3)
print(
    df_base
    >> gr.tf_summarize(
        pof_stress=gr.mean(DF.g_stress <= 0),
        pof_disp=gr.mean(DF.g_disp <= 0),
    )
)

## Re-use samples to test another scenario
print(
    df_base
    >> gr.tf_reweight(md_base=md_base, md_new=md_new)
    >> gr.tf_summarize(
        pof_stress=gr.mean((DF.g_stress <= 0) * DF.weight),
        pof_disp=gr.mean((DF.g_disp <= 0) * DF.weight),
        n_eff=gr.neff_is(DF.weight),
    )
)

## It is unsafe to study new scenarios with wider uncertainty than the base scenario
md_poor = (
    md_base
    >> gr.cp_marginals(
        H=dict(dist="norm", loc=500.0, scale=400.0),
    )
)
## Note the tiny effective sample size in this case
print(
    md_base
    >> gr.ev_monte_carlo(n=1e3, df_det="nom")
    >> gr.tf_reweight(md_base=md_base, md_new=md_poor)
    >> gr.tf_summarize(
        pof_stress=gr.mean((DF.g_stress <= 0) * DF.weight),
        pof_disp=gr.mean((DF.g_disp <= 0) * DF.weight),
        n_eff=gr.neff_is(DF.weight),
    )
)

grama.tran_pivot module¶

grama.tran_pivot.tran_pivot_longer(df, columns, index_to=None, names_to=None, names_sep=None, names_pattern=None, values_to=None)¶

Lengthen a dataset

“Lengthens” data by increasing the number of rows and decreasing the number of columns.

Parameters:

df (DataFrame) – DataFrame passed through
columns (str) – Label of column(s) to pivot into longer format
index_to (str) – str name to create a new representation index of observations; Optional.
names_to (str) –
name to use for the ‘variable’ column, if None frame.columns.name is used or ‘variable’
- .value indicates that component of the name defines the name of the column containing the cell values, overriding values_to
names_sep (str OR list of int) –
delimter to seperate the values of the argument(s) from the ‘columns’ parameter into 2 new columns with those values split by that delimeter
- Regex expression is a valid input for names_sep
names_pattern (str) – Regular expression with capture groups to define targets for names_to.
values_to (str) – name to use for the ‘value’ column; overridden if “.value” is provided in names_to argument.

Notes

Only one of names_sep OR names_pattern may be given.

Returns:	result of being pivoted into a longer format
Return type:	DataFrame

Examples:

import grama as gr
## Simple example
(
    gr.df_make(
        A=[1, 2, 3],
        B=[4, 5, 6],
        C=[7, 8, 9],
    )
    >> gr.tf_pivot_longer(
        columns=["A", "B", "C"],
        names_to="variable",
        values_to="value",
    )
)

## Matching columns on patterns
(
    gr.df_make(
        x1=[1, 2, 3],
        x2=[4, 5, 6],
        x3=[7, 8, 9],
    )
    >> gr.tf_pivot_longer(
        columns=gr.matches("\d+"),
        names_to="variable",
        values_to="value",
    )
)

## Separating column names and data on a names_pattern
(
    gr.df_make(
        E00=[1, 2, 3],
        E45=[4, 5, 6],
        E90=[7, 8, 9],
    )
    >> gr.tf_pivot_longer(
        columns=gr.matches("\d+"),
        names_to=[".value", "angle"],
        names_pattern="(E)(\d+)",
    )
)

grama.tran_pivot.tran_pivot_wider(df, names_from, indexes_from=None, values_from=None)¶

Widen a dataset

“Widens” data by increasing the number of columns and decreasing the number of rows.

Parameters:	df (DataFrame) – DataFrame passed through names_from (str) – Column(s) name(s) to use to make new columns indexes_from (str) – Column(s) to use to make new index, if None will preserve and use unspecified columns as index values_from (str) – Column(s) to use as new values column(s)
Returns:	result of being pivoted wider
Return type:	DataFrame

Examples:

import grama as gr
## Simple example
(
    gr.df_make(var=["x", "x", "y", "y"], value=[0, 1, 2, 3])
    >> gr.tf_pivot_wider(
        names_from="var",
        values_from="value",
    )
(

grama.tran_shapley module¶

grama.tran_shapley.tran_shapley_cohort¶

Compute cohort shapley values

Assess the impact of each variable on selected observations via cohort shapley [1]. Shapley values are a game-theoretic way to assess the importance of input variables (var) on each of a set of outputs (out). Since values are computed on each observation, cohort shapley can distinguish cases where a variable has a positive impact on one observation, and a negative impact on a different observation.

Note that cohort shapley is combinatorialy expensive in the number of variables, and this expense is multiplied by the number of observations. Use with caution in cases of high dimensionality. Consider using the inds argument to analyze a small subset of your observations.

Parameters:	df (DataFrame) – Variable and output data to analyze var (list of strings) – Input variables out (list of strings) – Outputs variables bins (integer) – Number of “bins” to define coordinate refinement distance inds (iterable of indices or None) – Indices of rows to analyze

References

[1] Mase, Owen, and Seiler, “Explaining black box decisions by Shapley cohort refinement” (2019) Arxiv

Examples:

import grama as gr
from grama.data import df_stang
DF = gr.Intention()
# Analyze all observations
(
    gr.tran_shapley_cohort(
        df_stang,
        var=["thick", "ang"],
        out=["E"],
    )
    >> gr.tf_bind_cols(df_stang)
    >> gr.tf_filter(DF.E_thick < 0)
)
# Compute subset of values
(
    gr.tran_shapley_cohort(
        df_stang,
        var=["thick", "ang"],
        out=["E"],
        inds=(
            df_stang
            >> gr.tf_filter(DF.thick > 0.08)
        ).index
    )
    >> gr.tf_bind_cols(df_stang)
)

grama.tran_summaries module¶

grama.tran_summaries.tran_asub¶

Active subspace estimator

Compute principal directions and eigenvalues for all outputs based on output of ev_grad_fd() to estimate the /active subspace/ (Constantine, 2015).

See also gr.tran_polyridge() for a gradient-free approach to approximating the active subspace.

Parameters:	df (DataFrame) – Gradient evaluations prefix (str) – Column name prefix; default=”D” outvar (str) – Name to give output id column; default=”output” lambvar (str) – Name to give eigenvalue column; default=”lam”
Returns:	Active subspace directions and eigenvalues
Return type:	DataFrame

References

Constantine, “Active Subspaces” (2015) SIAM

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md = make_cantilever_beam()
df_base = md >> gr.ev_sample(n=1e2, df_det="nom", skip=True)
df_grad = md >> gr.ev_grad_fd(df_base=df_base)
df_as = df_grad >> gr.tf_asub()

grama.tran_summaries.tran_describe¶

Describe a dataframe

Synonym for Pandas df.describe()

Parameters:	df (DataFrame) – Data to describe
Returns:	Printed summary

grama.tran_summaries.tran_inner¶

Inner products

Compute inner product of target df with weights defined by df_weights.

Parameters:	df (DataFrame) – Data to compute inner products against df_weights (DataFrame) – Weights for inner prodcuts prefix (str) – Name prefix for resulting inner product columns; default=”dot” name (str) – Name of identity column in df_weights or None append (bool) – Append new data to original DataFrame?
Returns:	Results of inner products
Return type:	DataFrame

Examples:

## Setup
import grama as gr
DF = gr.Intention()

## PCA example
from grama.data import df_diamonds
# Compute PCA weights
df_weights = gr.tran_pca(
    df_diamonds
    >> gr.tf_sample(1000),
    var=["x", "y", "z", "carat"],
)
# Visualize
(
    df_diamonds
    >> gr.tf_inner(df_weights=df_weights)
    >> gr.ggplot(gr.aes("dot0", "dot1"))
    + gr.geom_point()
)

## Active subspace example
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()
# Approximate the active subspace
df_data = gr.ev_sample(md_beam, n=1e2, df_det="nom")
df_weights = gr.tran_polyridge(
    df_data,
    var=md_beam.var_rand, # Use meaningful predictors only
    out="g_disp",         # Target g_disp for reduction
    n_degree=2,
    n_degree=1,
)
# Construct shadow plot; use tran_inner to calculate active variable
(
    df_data
    >> gr.tf_inner(df_weights=df_weights)
    >> gr.ggplot(gr.aes("dot", "g_disp"))
    + gr.geom_point()
)

grama.tran_summaries.tran_pca¶

Principal Component Analysis

Compute principal directions and eigenvalues for a dataset. Can specify columns to analyze, or just analyze all numerical columns.

Parameters:	df (DataFrame) – Data to analyze var (list of str or None) – List of columns to analyze lambvar (str) – Name to give eigenvalue column; default=”lam” standardize (bool) – Standardize columns? default=False
Returns:	principal directions and eigenvalues
Return type:	DataFrame

References

TODO

Examples:

import grama as gr
from grama.data import df_stang
df_pca = df_stang >> gr.tf_pca()

grama.tran_summaries.tran_sobol¶

Post-process results from gr.eval_hybrid()

Estimate Sobol’ indices based on hybrid point evaluations (Sobol’, 1999). Intended as post-processor for gr.eval_hybrid().

Parameters:	df (DataFrame) – Hybrid point results from gr.eval_hybrid() typename (str) – Name to give index type column in results digits (int) – Number of digits for rounding reported results full (bool) – Return un-normalized indices and variance?
Returns:	Sobol’ indices
Return type:	DataFrame

Notes

Index type [“first”, “total”] is inferred from input df._meta; this is assigned by gr.eval_hybrid().
Index normalization coded in the “ind” column; S: Normalized index T: Un-normalized index var: Total variance

References

I.M. Sobol’, “Sensitivity Estimates for Nonlinear Mathematical Models” (1999) MMCE, Vol 1.

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md = make_cantilever_beam()
df_first = md >> gr.ev_hybrid(df_det="nom", plan="first")
df_first >> gr.tf_sobol()

df_total = md >> gr.ev_hybrid(df_det="nom", plan="total")
df_total >> gr.tf_sobol()

grama.tran_summaries.tran_iocorr¶

Compute input-output correlations

Compute all correlations between input and output quantities.

Parameters:	df (DataFrame) – Data to summarize var (iterable of strings) – Column names for inputs out (iterable of strings) – Column names for outputs method (str) – Method for correlation; one of “pearson” or “spearman”
Returns:	Correlations between each input and output
Return type:	DataFrame

grama.tran_tools module¶

grama.tran_tools.tran_angles(df, df2)¶

Subspace angles

Compute the subspace angles between two matrices. A wrapper for scipy.linalg.subspace_angles that corrects for column ordering. Row ordering is assumed.

Parameters:	df (DataFrame) – First matrix to compare df2 (DataFrame) – Second matrix to compare
Returns:	Array of angles (in radians)
Return type:	array

Examples:

import grama as gr
import pandas as pd
df = pd.DataFrame(dict(v=[+1, +1]))
df_v1 = pd.DataFrame(dict(w=[+1, -1]))
df_v2 = pd.DataFrame(dict(w=[+1, +1]))
theta1 = angles(df, df_v1)
theta2 = angles(df, df_v2)

grama.tran_tools.tran_bootstrap¶

Estimate bootstrap confidence intervals

Estimate bootstrap confidence intervals for a given transform. Uses the “bootstrap-t” procedure discussed in Efron and Tibshirani (1993).

Parameters:	df (DataFrame) – Data to bootstrap tran (grama tran_ function) – Transform procedure which generates statistic n_boot (numeric) – Monte Carlo resamples for bootstrap n_sub (numeric) – Nested resamples to estimate SE con (float) – Confidence level col_sel (list(string)) – Columns to include in bootstrap calculation
Returns:	Results of tran(df), plus _lo and _up columns for numeric columns
Return type:	DataFrame

References and notes:: Efron and Tibshirani (1993) “The bootstrap-t procedure… is particularly applicable to location statistics like the sample mean…. The bootstrap-t method, at least in its simple form, cannot be trusted for more general problems, like setting a confidence interval for a correlation coefficient.”

Examples:

grama.tran_tools.tran_copula_corr(df, model=None, density=None)¶

Compute Gaussian copula correlations from data

Convenience function to fit a Gaussian copula (correlations) based on data and pre-fitted marginals. Intended for use with gr.comp_copula_gaussian(). Must provide either model or density.

Note: This is called automatically when you provide a dataset to gr.comp_copula_gaussian().

Parameters:	df (DataFrame) – Matrix of data for correlation estimation model (gr.Model) – Model with defined marginals density (gr.Density) – Density with defined marginals
Returns:	Correlation data ready for use with `gr.comp_copula_gaussian()`
Return type:	DataFrame

Examples:

import grama as gr
from grama.data import df_stang
## Verbose, manual approach
md = (
    gr.Model()
    >> gr.cp_marginals(
        E=gr.marg_named(df_stang.E, "norm"),
        mu=gr.marg_named(df_stang.mu, "beta"),
        thick=gr.marg_named(df_stang.thick, "norm"),
    )
)
df_corr = gr.tran_copula_corr(df_stang, model=md)
md = gr.comp_copula_gaussian(md, df_corr=df_corr)

## Automatic approach
md = (
    gr.Model()
    >> gr.cp_marginals(
        E=gr.marg_named(df_stang.E, "norm"),
        mu=gr.marg_named(df_stang.mu, "beta"),
        thick=gr.marg_named(df_stang.thick, "norm"),
    )
    >> gr.cp_copula_gaussian(df_data=df_stang)
)

grama.tran_tools.tran_kfolds¶

Perform k-fold CV

Perform k-fold cross-validation (CV) using a given fitting procedure (ft). Optionally provide a fold identifier column, or (randomly) assign folds.

Parameters:

df (DataFrame) – Data to pass to given fitting procedure
ft (gr.ft_) – Partially-evaluated grama fit function; defines model fitting procedure and outputs to aggregate
tf (gr.tf_) – Partially-evaluated grama transform function; evaluation of fitted model will be passed to tf and provided with keyword arguments from summaries
out (list or None) – Outputs for which to compute summaries; None uses ft.out
var_fold (str or None) – Column to treat as fold identifier; overrides k
suffix (str) – Suffix for predicted value; used to distinguish between predicted and actual
summaries (dict of functions) – Summary functions to pass to tf; will be evaluated for outputs of ft. Each summary must have signature summary(f_pred, f_meas). Grama includes builtin options: gr.mse, gr.rmse, gr.rel_mse, gr.rsq, gr.ndme
k (int) – Number of folds; k=5 to k=10 recommended [1]
shuffle (bool) – Shuffle the data before CV? True recommended [1]

Notes

Many grama functions support partial evaluation; this allows one to specify things like hyperparameters in fitting functions without providing data and executing the fit. You can take advantage of this functionality to easly do hyperparameter studies.

Returns:	Aggregated results within each of k-folds using given model and summary transform
Return type:	DataFrame

References

[1] James, Witten, Hastie, and Tibshirani, “An introduction to statistical learning” (2017), Chapter 5. Resampling Methods

Examples:

import grama as gr
from grama.data import df_stang
df_kfolds = (
    df_stang
    >> gr.tf_kfolds(
        k=5,
        ft=gr.ft_rf(out=["thick"], var=["E", "mu"]),
    )
)

grama.tran_tools.tran_md¶

Model as transform

Use a model to transform data; useful when pre-processing data to evaluate a model.

Parameters:	df (DataFrame) – Data to merge md (gr.Model) – Model to use as transform
Returns:	Output of evaluated model
Return type:	DataFrame

Examples:

import grama as gr
from grama.models import make_cantilever_beam
md_beam = make_cantilever_beam()
## Use support points to generate a smaller---but representative---sample
df_res = (
    md_beam
    >> gr.ev_sample(n=1e3, df_det="nom", skip=True, seed=101)
    >> gr.tf_sp(n=100)
    >> gr.tf_md(md=md_beam)
)

Module contents¶

grama.choice(a, size=None, replace=True, p=None)¶

Generates a random sample from a given 1-D array

New in version 1.7.0.

Note

New code should use the choice method of a default_rng() instance instead; please see the random-quick-start.

Parameters:	a (1-D array-like or int) – If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if it were `np.arange(a)` size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., `(m, n, k)`, then `m * n * k` samples are drawn. Default is None, in which case a single value is returned. replace (boolean, optional) – Whether the sample is with or without replacement. Default is True, meaning that a value of `a` can be selected multiple times. p (1-D array-like, optional) – The probabilities associated with each entry in a. If not given, the sample assumes a uniform distribution over all entries in `a`.
Returns:	samples – The generated random samples
Return type:	single item or ndarray
Raises:	`ValueError` – If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size

See also

randint(), shuffle(), permutation()

Generator.choice(): which should be used in new code

Notes

Setting user-specified probabilities through p uses a more general but less efficient sampler than the default. The general sampler produces a different sample than the optimized sampler even if each element of p is 1 / len(a).

Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.

Examples

Generate a uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)

Generate a non-uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random

Generate a uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]

Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random

Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:

>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')

grama.multivariate_normal(mean, cov, size=None, check_valid='warn', tol=1e-8)¶

Draw random samples from a multivariate normal distribution.

The multivariate normal, multinormal or Gaussian distribution is a generalization of the one-dimensional normal distribution to higher dimensions. Such a distribution is specified by its mean and covariance matrix. These parameters are analogous to the mean (average or “center”) and variance (standard deviation, or “width,” squared) of the one-dimensional normal distribution.

Note

New code should use the multivariate_normal method of a default_rng() instance instead; please see the random-quick-start.

Parameters:

mean (1-D array_like, of length N) – Mean of the N-dimensional distribution.
cov (2-D array_like, of shape (N, N)) – Covariance matrix of the distribution. It must be symmetric and positive-semidefinite for proper sampling.
size (int or tuple of ints, optional) – Given a shape of, for example, (m,n,k), m*n*k samples are generated, and packed in an m-by-n-by-k arrangement. Because each sample is N-dimensional, the output shape is (m,n,k,N). If no shape is specified, a single (N-D) sample is returned.
check_valid ({ 'warn', 'raise', 'ignore' }, optional) – Behavior when the covariance matrix is not positive semidefinite.
tol (float, optional) – Tolerance when checking the singular values in covariance matrix. cov is cast to double before the check.

Returns:

out – The drawn samples, of shape size, if that was provided. If not, the shape is (N,).

In other words, each entry out[i,j,...,:] is an N-dimensional value drawn from the distribution.

Return type:

ndarray

See also

Generator.multivariate_normal(): which should be used for new code.

Notes

The mean is a coordinate in N-dimensional space, which represents the location where samples are most likely to be generated. This is analogous to the peak of the bell curve for the one-dimensional or univariate normal distribution.

Covariance indicates the level to which two variables vary together. From the multivariate normal distribution, we draw N-dimensional samples, $X = [x_1, x_2, ... x_N]$ . The covariance matrix element $C_{ij}$ is the covariance of $x_i$ and $x_j$ . The element $C_{ii}$ is the variance of $x_i$ (i.e. its “spread”).

Instead of specifying the full covariance matrix, popular approximations include:

Spherical covariance (cov is a multiple of the identity matrix)

Diagonal covariance (cov has non-negative elements, and only on the diagonal)

This geometrical property can be seen in two dimensions by plotting generated data-points:

>>> mean = [0, 0]
>>> cov = [[1, 0], [0, 100]]  # diagonal covariance

Diagonal covariance means that points are oriented along x or y-axis:

>>> import matplotlib.pyplot as plt
>>> x, y = np.random.multivariate_normal(mean, cov, 5000).T
>>> plt.plot(x, y, 'x')
>>> plt.axis('equal')
>>> plt.show()

Note that the covariance matrix must be positive semidefinite (a.k.a. nonnegative-definite). Otherwise, the behavior of this method is undefined and backwards compatibility is not guaranteed.

References

[1]	Papoulis, A., “Probability, Random Variables, and Stochastic Processes,” 3rd ed., New York: McGraw-Hill, 1991.

[2]	Duda, R. O., Hart, P. E., and Stork, D. G., “Pattern Classification,” 2nd ed., New York: Wiley, 2001.

Examples

>>> mean = (1, 2)
>>> cov = [[1, 0], [0, 1]]
>>> x = np.random.multivariate_normal(mean, cov, (3, 3))
>>> x.shape
(3, 3, 2)

The following is probably true, given that 0.6 is roughly twice the standard deviation:

>>> list((x[0,0,:] - mean) < 0.6)
[True, True] # random

grama.set_seed()¶

seed(self, seed=None)

Reseed a legacy MT19937 BitGenerator

Notes

This is a convenience, legacy function.

The best practice is to not reseed a BitGenerator, rather to recreate a new one. This method is here for legacy reasons. This example demonstrates best practice.

>>> from numpy.random import MT19937
>>> from numpy.random import RandomState, SeedSequence
>>> rs = RandomState(MT19937(SeedSequence(123456789)))
# Later, you want to restart the stream
>>> rs = RandomState(MT19937(SeedSequence(987654321)))