atm.config module

Configuration Module.

Classes

AWSConfig(args[, path])

Stores configuration for AWS S3 connections

Config(args[, path])

Class which stores configuration for one aspect of ATM.

DatasetConfig(args[, path])

Stores configuration of a Dataset

LogConfig(args[, path])

RunConfig(args[, path])

Stores configuration for Dataset and Datarun setup.

SQLConfig(args[, path])

Stores configuration for SQL database setup & connection

class atm.config.AWSConfig(args, path=None)

Bases: atm.config.Config

Stores configuration for AWS S3 connections

Attributes

access_key

AWS access key

s3_bucket

AWS S3 bucket to store data

s3_folder

Folder in AWS S3 bucket in which to store data

secret_key

AWS secret key

access_key = 'AWS access key'
s3_bucket = 'AWS S3 bucket to store data'
s3_folder = 'Folder in AWS S3 bucket in which to store data'
secret_key = 'AWS secret key'

class atm.config.Config(args, path=None)

Bases: object

Class which stores configuration for one aspect of ATM. Subclasses of Config should define the list of all configurable parameters and any default values for those parameters other than None (in PARAMETERS and DEFAULTS, respectively). The object can be initialized with any number of keyword arguments; only kwargs that are in PARAMETERS will be used. This means you can (relatively) safely do things like

    args = parser.parse_args()
    conf = Config(**vars(args))

and only relevant parameters will be set.

Subclasses do not need to define __init__ or any other methods.
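
The keyword-filtering behaviour described above can be illustrated with a minimal sketch. It follows the keyword-argument description rather than the `(args, path=None)` signature, and the names `SketchConfig` and `SketchSQLConfig` are hypothetical stand-ins, not ATM's actual implementation.

```python
class SketchConfig:
    """Hypothetical stand-in for the pattern; not ATM's real Config class."""

    PARAMETERS = []  # names of all configurable parameters
    DEFAULTS = {}    # default values other than None

    def __init__(self, **kwargs):
        for name in self.PARAMETERS:
            # Use an explicitly passed, non-None value if there is one,
            # otherwise fall back to the declared default (or None).
            value = kwargs.get(name)
            if value is None:
                value = self.DEFAULTS.get(name)
            setattr(self, name, value)
        # Any kwargs not listed in PARAMETERS are silently ignored.

    def to_dict(self):
        """Return the configured parameters as a plain dict."""
        return {name: getattr(self, name) for name in self.PARAMETERS}


class SketchSQLConfig(SketchConfig):
    # A subclass only declares data; it defines no methods of its own.
    PARAMETERS = ["dialect", "database", "host"]
    DEFAULTS = {"dialect": "sqlite", "database": "atm.db"}


conf = SketchSQLConfig(host="localhost", unrelated_flag=True)
print(conf.to_dict())
```

Because unknown keyword arguments are dropped, the same parsed argument namespace could be passed to several config classes, each picking out only its own parameters.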

Methods

get_parser()

Get an ArgumentParser for this config.

to_dict()

Get a dict representation of this configuration.

classmethod get_parser()

Get an ArgumentParser for this config.
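
The class attributes on this page appear in three shapes: a plain help string, a tuple, and a dict of argparse-style keywords. A rough sketch of how such declarations could be turned into an `ArgumentParser` follows. The helper `build_parser`, the reading of a tuple as `(help, default)`, and the `--flag-name` spelling are all assumptions made for illustration; the real `get_parser()` may differ.

```python
import argparse


def build_parser(declarations):
    """Build an ArgumentParser from {name: declaration} pairs.

    Hypothetical helper, not part of ATM.
    """
    parser = argparse.ArgumentParser()
    for name, decl in declarations.items():
        if isinstance(decl, str):
            # Plain string: treated as the help text only.
            kwargs = {"help": decl}
        elif isinstance(decl, tuple):
            # Tuple: assumed to be (help text, default value).
            kwargs = {"help": decl[0], "default": decl[1]}
        else:
            # Dict: passed straight through as add_argument keywords.
            kwargs = dict(decl)
        parser.add_argument("--" + name.replace("_", "-"), **kwargs)
    return parser


declarations = {
    "host": "Hostname for database machine",
    "database": ("Name of, or path to, SQL database", "atm.db"),
    "dialect": {"choices": ["sqlite", "mysql"], "default": "sqlite",
                "help": "Dialect of SQL to use"},
}
args = build_parser(declarations).parse_args(["--dialect", "mysql"])
print(vars(args))
```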

to_dict()

Get a dict representation of this configuration.

class atm.config.DatasetConfig(args, path=None)

Bases: atm.config.Config

Stores configuration of a Dataset

Attributes

class_column

Name of the class column in the input data

description

Description of dataset

name

Given name for this dataset

test_path

Path to raw test data (if applicable)

train_path

Path to raw training data (required)

class_column = ('Name of the class column in the input data', 'class')
description = 'Description of dataset'
name = 'Given name for this dataset.'
test_path = 'Path to raw test data (if applicable)'
train_path = {'help': 'Path to raw training data', 'required': True}

class atm.config.LogConfig(args, path=None)

Bases: atm.config.Config

Attributes

metrics_dir

Directory where model metrics will be saved

models_dir

Directory where computed models will be saved

verbose_metrics

If set, compute full ROC and PR curves and per-label metrics for each classifier

metrics_dir = ('Directory where model metrics will be saved', 'metrics')
models_dir = ('Directory where computed models will be saved', 'models')
verbose_metrics = {'action': 'store_true', 'default': False, 'help': 'If set, compute full ROC and PR curves and per-label metrics for each classifier'}

class atm.config.RunConfig(args, path=None)

Bases: atm.config.Config

Stores configuration for Dataset and Datarun setup.

Attributes

budget

Value of the budget, either in classifiers or minutes

budget_type

Type of budget to use

dataset_id

ID of dataset, if it is already in the database

deadline

Deadline for datarun completion

gridding

Gridding factor (0: no gridding)

k_window

Number of previous scores considered by -k selector methods

methods

Method or list of methods to use for classification

metric

Metric by which ATM should evaluate classifiers

priority

Priority of the datarun (higher = more important)

r_minimum

Number of random runs to perform before tuning can occur

run_per_partition

If true, generate a new datarun for each hyperpartition

score_target

Determines which judgment metric will be used to search the hyperparameter space

selector

Type of BTB selector to use

tuner

Type of BTB tuner to use

budget = {'default': 100, 'help': 'Value of the budget, either in classifiers or minutes', 'type': <class 'int'>}
budget_type = {'choices': ['none', 'classifier', 'walltime'], 'default': 'classifier', 'help': 'Type of budget to use'}
dataset_id = {'help': 'ID of dataset, if it is already in the database', 'type': <class 'int'>}
deadline = 'Deadline for datarun completion. If provided, this overrides the configured walltime budget.\nFormat: %%Y-%%m-%%d %%H:%%M'
gridding = {'default': 0, 'help': 'gridding factor (0: no gridding)', 'type': <class 'int'>}
k_window = {'default': 3, 'help': 'number of previous scores considered by -k selector methods', 'type': <class 'int'>}
methods = {'default': ['logreg', 'dt', 'knn'], 'help': 'Method or list of methods to use for classification. Each method can either be one of the pre-defined method codes listed below or a path to a JSON file defining a custom method.\n\nOptions: [logreg, svm, sgd, dt, et, rf, gnb, mnb, bnb, gp, pa, knn, mlp, ada]', 'nargs': '+', 'type': <function _option_or_path.<locals>.type_check>}
metric = {'choices': ['ap', 'f1_macro', 'f1', 'f1_micro', 'cohen_kappa', 'roc_auc', 'mcc', 'roc_auc_macro', 'roc_auc_micro', 'accuracy', 'rank_accuracy'], 'default': 'f1', 'help': 'Metric by which ATM should evaluate classifiers. The metric function specified here will be used to compute the "judgment metric" for each classifier.'}
priority = {'default': 1, 'help': 'Priority of the datarun (higher = more important)', 'type': <class 'int'>}
r_minimum = {'default': 2, 'help': 'number of random runs to perform before tuning can occur', 'type': <class 'int'>}
run_per_partition = {'action': 'store_true', 'default': False, 'help': 'if true, generate a new datarun for each hyperpartition'}
score_target = {'choices': ['cv', 'test', 'mu_sigma'], 'default': 'cv', 'help': 'Determines which judgment metric will be used to search the hyperparameter space. "cv" will use the mean cross-validated performance, "test" will use the performance on a test dataset, and "mu_sigma" will use the lower confidence bound on the CV performance.'}
selector = {'default': 'uniform', 'help': 'Type of BTB selector to use. Can either be one of the pre-configured selectors listed below or a path to a custom selector in the form "/path/to/selector.py:ClassName".\n\nOptions: [uniform, ucb1, bestk, bestkvel, purebestkvel, recentk, recentkvel, hieralg]', 'type': <function _option_or_path.<locals>.type_check>}
tuner = {'default': 'uniform', 'help': 'Type of BTB tuner to use. Can either be one of the pre-configured tuners listed below or a path to a custom tuner in the form "/path/to/tuner.py:ClassName".\n\nOptions: [uniform, gp, gp_ei, gp_eivel]', 'type': <function _option_or_path.<locals>.type_check>}
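
The `deadline` help text above writes its format with doubled percent signs because argparse help strings treat `%` as a formatting character; the format a user would type is `%Y-%m-%d %H:%M`. The sketch below parses such a value with the standard library. The helpers `parse_deadline` and `minutes_remaining` are hypothetical, and how ATM itself converts a deadline into a walltime budget is not specified on this page.

```python
from datetime import datetime

# Format documented for `deadline`, with the doubled percent signs unescaped.
DEADLINE_FORMAT = "%Y-%m-%d %H:%M"


def parse_deadline(text):
    """Parse a deadline string; raises ValueError if it is malformed."""
    return datetime.strptime(text, DEADLINE_FORMAT)


def minutes_remaining(deadline, now):
    """Whole minutes from `now` until `deadline` (negative if already past)."""
    return int((deadline - now).total_seconds() // 60)


deadline = parse_deadline("2018-06-01 17:30")
left = minutes_remaining(deadline, datetime(2018, 6, 1, 16, 0))
print(deadline, left)
```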

class atm.config.SQLConfig(args, path=None)

Bases: atm.config.Config

Stores configuration for SQL database setup & connection

Attributes

database

Name of, or path to, SQL database

dialect

Dialect of SQL to use

host

Hostname for database machine

password

Password for SQL database

port

Port used to connect to database

query

Specify extra login details

username

Username for SQL database

database = ('Name of, or path to, SQL database', 'atm.db')
dialect = {'choices': ['sqlite', 'mysql'], 'default': 'sqlite', 'help': 'Dialect of SQL to use'}
host = 'Hostname for database machine'
password = 'Password for SQL database'
port = 'Port used to connect to database'
query = 'Specify extra login details'
username = 'Username for SQL database'
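
As an illustration of how these fields fit together, the sketch below assembles a connection URL in the `dialect://username:password@host:port/database` form used by SQLAlchemy. This page does not say how ATM builds its connection, so `build_db_url` is a hypothetical helper, and treating `query` as a raw query string is an assumption.

```python
from urllib.parse import quote_plus


def build_db_url(dialect="sqlite", database="atm.db", username=None,
                 password=None, host=None, port=None, query=None):
    """Assemble a SQLAlchemy-style URL string (hypothetical helper)."""
    if dialect == "sqlite":
        # SQLite is file-based, so credentials and host do not apply.
        return "sqlite:///" + database
    credentials = ""
    if username:
        # Percent-encode credentials so characters like "@" stay unambiguous.
        credentials = quote_plus(username)
        if password:
            credentials += ":" + quote_plus(password)
        credentials += "@"
    location = host or ""
    if port:
        location += ":" + str(port)
    url = "{}://{}{}/{}".format(dialect, credentials, location, database)
    if query:
        url += "?" + query
    return url


sqlite_url = build_db_url()
mysql_url = build_db_url(dialect="mysql", database="atm", username="user",
                         password="p@ss", host="localhost", port=3306)
print(sqlite_url)
print(mysql_url)
```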