atm.metrics module

Functions

cross_validate_pipeline(pipeline, X, y[, …])

Compute metrics for each of n_folds folds of the training data in (X, y).

get_metrics_binary(y_true, y_pred, y_pred_probs)

get_metrics_multiclass(y_true, y_pred, …)

get_per_class_matrix(y[, classes])

Create a (num_classes x num_examples) binary matrix representation of the true and predicted y values.

get_pr_roc_curves(y_true, y_pred_probs)

Compute precision/recall and receiver operating characteristic metrics for a binary class label.

rank_n_accuracy(y_true, y_prob_mat[, n])

Compute how often the true label is one of the top n predicted classes for each training example.

test_pipeline(pipeline, X, y, binary, **kwargs)

atm.metrics.cross_validate_pipeline(pipeline, X, y, binary=True, n_folds=10, **kwargs)[source]

Compute metrics for each of n_folds folds of the training data in (X, y).

pipeline: the sklearn Pipeline to train and test.
X: feature matrix.
y: series of labels corresponding to the rows of X.
binary: whether the label is binary or multiclass.
n_folds: number of non-overlapping “folds” of the data to make for cross-validation.
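A minimal sketch of what n-fold cross-validation over a pipeline looks like, assuming sklearn's StratifiedKFold for the folds and F1 as the per-fold score. The helper name cross_validate_sketch and the choice of metric are illustrative, not ATM's internals.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def cross_validate_sketch(pipeline, X, y, n_folds=10):
    """Train/test the pipeline on each fold; collect one score per fold."""
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=0)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        pipeline.fit(X[train_idx], y[train_idx])
        y_pred = pipeline.predict(X[test_idx])
        scores.append(f1_score(y[test_idx], y_pred))
    return scores

X, y = make_classification(n_samples=200, random_state=0)
pipe = Pipeline([("scale", StandardScaler()),
                 ("clf", LogisticRegression())])
fold_scores = cross_validate_sketch(pipe, X, y, n_folds=10)
print(len(fold_scores))  # one score per fold
```

ATM's version returns richer per-fold metric objects; the fold/train/test loop is the shared idea.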

atm.metrics.get_metrics_binary(y_true, y_pred, y_pred_probs, include_curves=False)[source]
atm.metrics.get_metrics_multiclass(y_true, y_pred, y_pred_probs, include_per_class=False, include_curves=False)[source]
atm.metrics.get_per_class_matrix(y, classes=None)[source]

Create a (num_classes x num_examples) binary matrix representation of the true and predicted y values. If classes is None, class values are extracted from y. Classes that are not present in y at all do not receive a column; this allows per-class roc_auc scores to be computed without error.
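The shape described above can be sketched with a plain NumPy indicator matrix: one row of 0/1 flags per class. The helper name per_class_matrix_sketch is an assumption for illustration; ATM's actual function also handles the classes argument for aligning true and predicted labels.

```python
import numpy as np

def per_class_matrix_sketch(y, classes=None):
    """Binary (num_classes x num_examples) indicator matrix for labels y."""
    y = np.asarray(y)
    if classes is None:
        # only classes actually present in y get a row
        classes = np.unique(y)
    return np.stack([(y == c).astype(int) for c in classes])

mat = per_class_matrix_sketch([0, 1, 2, 1])
print(mat.shape)  # (3, 4): one row per class, one column per example
```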

atm.metrics.get_pr_roc_curves(y_true, y_pred_probs)[source]

Compute precision/recall and receiver operating characteristic metrics for a binary class label.

y_true: series of true class labels (only 1 or 0).
y_pred_probs: series of probabilities generated by the model for class 1.
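Both curves are standard sklearn computations; a minimal sketch of what this function wraps, with toy labels and probabilities:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, roc_curve

y_true = np.array([0, 0, 1, 1])
y_pred_probs = np.array([0.1, 0.4, 0.35, 0.8])  # P(class == 1) per example

# precision/recall pairs at each probability threshold
precision, recall, pr_thresholds = precision_recall_curve(y_true, y_pred_probs)

# false-positive / true-positive rates at each threshold
fpr, tpr, roc_thresholds = roc_curve(y_true, y_pred_probs)
```

The exact shape of the object ATM returns (tuples, dicts, or named results) is not specified here; the sketch shows only the underlying curve computation.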

atm.metrics.rank_n_accuracy(y_true, y_prob_mat, n=0.33)[source]

Compute how often the true label is one of the top n predicted classes for each training example. If n is an integer, consider the top n predictions for each example. If n is a float, it represents a proportion of the top predictions. This metric is most useful when the total number of classes is large.
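A sketch of the top-n logic, assuming y_prob_mat has one row per example and one column per class, and interpreting a float n as a proportion of the number of classes (one plausible reading of the docstring; the helper name is made up):

```python
import numpy as np

def rank_n_accuracy_sketch(y_true, y_prob_mat, n=0.33):
    """Fraction of examples whose true label is among the top-n classes."""
    y_true = np.asarray(y_true)
    y_prob_mat = np.asarray(y_prob_mat)
    num_classes = y_prob_mat.shape[1]
    if isinstance(n, float):
        # assumed interpretation: proportion of classes, rounded up
        n = int(np.ceil(n * num_classes))
    # column indices of the n highest-probability classes per row
    top_n = np.argsort(y_prob_mat, axis=1)[:, -n:]
    hits = (top_n == y_true[:, None]).any(axis=1)
    return hits.mean()

acc = rank_n_accuracy_sketch([0, 2],
                             [[0.5, 0.3, 0.2],
                              [0.1, 0.2, 0.7]], n=1)
print(acc)  # 1.0 — each true label is the single top prediction
```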

atm.metrics.test_pipeline(pipeline, X, y, binary, **kwargs)[source]