featurehub package
Subpackages
Submodules
featurehub.util module
featurehub.util.compute_dataset_hash(dataset)
Return hash value of dataset contents.
Uses the xxhash.xxh64 hash algorithm for performance; this algorithm should not be considered cryptographically secure.
dataset : dict mapping str to pd.DataFrame
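The idea of hashing a dataset's contents can be sketched with the standard library alone. This is not featurehub's implementation: `hashlib.blake2b` stands in for `xxhash.xxh64`, plain lists of rows stand in for `pd.DataFrame` values, and iterating over sorted keys is an assumption made here so that the result does not depend on insertion order. The name `compute_dataset_hash_sketch` is hypothetical.

```python
import hashlib


def compute_dataset_hash_sketch(dataset):
    """Hash a dict mapping table names to lists of rows (illustrative stand-in)."""
    # 8-byte digest mimics the 64-bit width of xxh64; blake2b is a stand-in.
    hasher = hashlib.blake2b(digest_size=8)
    # Assumption: sort the keys so insertion order does not affect the hash.
    for name in sorted(dataset):
        hasher.update(name.encode("utf-8"))
        hasher.update(b"\0")  # separator between table name and contents
        hasher.update(repr(dataset[name]).encode("utf-8"))
        hasher.update(b"\0")  # separator between tables
    return hasher.hexdigest()


same_a = {"users": [(1, "ann"), (2, "bob")], "orders": [(10, 1)]}
same_b = {"orders": [(10, 1)], "users": [(1, "ann"), (2, "bob")]}
changed = {"users": [(1, "ann"), (2, "bo")], "orders": [(10, 1)]}

hash_a = compute_dataset_hash_sketch(same_a)
hash_b = compute_dataset_hash_sketch(same_b)
hash_changed = compute_dataset_hash_sketch(changed)
```

Here `hash_a` and `hash_b` agree because the contents are identical, while `hash_changed` differs because one cell was edited.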
featurehub.util.get_function(source)
Return a function from given source code.
This function is usually called on source code that was in turn produced by get_source. Note that the source code produced by get_source includes the source for the top-level function as well as any other local functions it calls. Here, we return the top-level function directly.
source : str or bytes
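A minimal sketch of turning source code back into a function, assuming the source is executed into a fresh namespace with `exec`. The helper `get_function_sketch` and its explicit `name` parameter are hypothetical; the documented function takes only `source` and determines the top-level function on its own.

```python
def get_function_sketch(source, name):
    """Execute source in a fresh namespace and return the function called `name`."""
    if isinstance(source, bytes):
        source = source.decode("utf-8")
    namespace = {}
    exec(source, namespace)  # defines every function contained in the source
    return namespace[name]


SOURCE = b"""
def square(x):
    return x * x

def sum_of_squares(a, b):
    return square(a) + square(b)
"""

sum_of_squares = get_function_sketch(SOURCE, "sum_of_squares")
result = sum_of_squares(3, 4)
```

The helper `square` lives in the same namespace as `sum_of_squares`, so the returned function can still call it.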
featurehub.util.get_function2(source)
Return a function from given source code.
This function is usually called on source code that was in turn produced by get_source. It differs from get_function in its method: it writes the source code to a file and then imports that file as a new module. Note that the source code produced by get_source includes the source for the top-level function as well as any other local functions it calls. Here, we return the top-level function directly.
Caveat: As currently implemented, this does not solve the problem of being able to re-extract source from the returned function.
source : str or bytes
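The write-to-file-then-import approach might look like the sketch below, which uses `importlib.util` from the standard library. The helper name, the explicit `name` parameter, and the module name `featurehub_tmp_module` are all hypothetical; how featurehub names or stores the file is not stated here.

```python
import importlib.util
import os
import tempfile


def get_function_via_module(source, name, module_name="featurehub_tmp_module"):
    """Write source to a temporary file, import it as a module, return `name`."""
    if isinstance(source, bytes):
        source = source.decode("utf-8")
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, module_name + ".py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(source)
        # Build a module object from the file and execute it.
        spec = importlib.util.spec_from_file_location(module_name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    return getattr(module, name)


SOURCE = """
def square(x):
    return x * x

def sum_of_squares(a, b):
    return square(a) + square(b)
"""

sum_of_squares = get_function_via_module(SOURCE, "sum_of_squares")
result = sum_of_squares(3, 4)
```

In this sketch the temporary file is deleted once the module has been loaded, so the returned function keeps working but its source file no longer exists on disk.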
featurehub.util.get_source(function)
Extract the source code from a given function.
Recursively extracts the source code for all local functions called by the given function. The resulting source code is encoded in utf-8.
Limitations: get_source cannot be used on functions defined interactively in a normal Python terminal. Functions defined interactively in IPython are still okay.
function : function
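One way to picture the recursive extraction is the sketch below, which assumes `inspect.getsource` is used to read each function and `co_names` is used to discover which other functions it references. This is an illustration under those assumptions, not featurehub's code; `get_source_sketch` and the `demo_features` module are hypothetical. The demo functions are written to a temporary file first because `inspect.getsource` needs a source file to read from.

```python
import importlib.util
import inspect
import os
import tempfile
import types


def get_source_sketch(function):
    """Collect the source of `function` and of same-module functions it references."""
    namespace = function.__globals__
    collected = {}  # function name -> source text, in discovery order
    pending = [function]
    while pending:
        func = pending.pop()
        if func.__name__ in collected:
            continue
        collected[func.__name__] = inspect.getsource(func)
        # co_names lists the global and attribute names the bytecode refers to.
        for name in func.__code__.co_names:
            candidate = namespace.get(name)
            if (isinstance(candidate, types.FunctionType)
                    and candidate.__module__ == function.__module__):
                pending.append(candidate)
    return "\n".join(collected.values()).encode("utf-8")


MODULE_SOURCE = (
    "def square(x):\n"
    "    return x * x\n"
    "\n"
    "def sum_of_squares(a, b):\n"
    "    return square(a) + square(b)\n"
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo_features.py")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(MODULE_SOURCE)
    spec = importlib.util.spec_from_file_location("demo_features", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    extracted = get_source_sketch(module.sum_of_squares)
```

The result is a utf-8 encoded byte string containing both `sum_of_squares` and the helper `square` that it calls.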
featurehub.util.get_top_level_function_name(namespace, remove_names=['__builtins__'])
Figure out which is the top-level function in a namespace.
The top-level function is defined as the function whose name does not appear in the co_names of any other function in the namespace, where co_names is the tuple of global and attribute names referenced by a function's code object. This could be made more efficient by using constant-time name lookups, stopping when only one name is left, and confirming that this name is not called by any other function. However, it is hard to anticipate a situation where a user defines a function chain long enough for this efficiency to be required.
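The definition above can be sketched as follows: gather the functions in the namespace, collect every name they reference through `co_names`, and keep the one function that nobody else references. The helper name and the `ValueError` raised for ambiguous namespaces are assumptions of this sketch; the documented function's behavior in that case is not stated.

```python
import types


def get_top_level_function_name_sketch(namespace, remove_names=("__builtins__",)):
    """Return the name of the one function that no other function references."""
    functions = {
        name: obj
        for name, obj in namespace.items()
        if name not in remove_names and isinstance(obj, types.FunctionType)
    }
    referenced = set()
    for name, func in functions.items():
        # Ignore self-references so a recursive function can still be top-level.
        referenced.update(n for n in func.__code__.co_names if n != name)
    candidates = [name for name in functions if name not in referenced]
    if len(candidates) != 1:
        # Assumption: treat zero or several candidates as an error.
        raise ValueError("expected exactly one top-level function, got %r" % candidates)
    return candidates[0]


SOURCE = """
def square(x):
    return x * x

def add_squares(a, b):
    return square(a) + square(b)

def feature(a, b):
    return add_squares(a, b) + 1
"""

namespace = {}
exec(SOURCE, namespace)
top_level = get_top_level_function_name_sketch(namespace)
```

In this chain `square` is referenced by `add_squares` and `add_squares` by `feature`, so `feature` is the only unreferenced function.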
featurehub.util.possibly_talking_action(action, verbose=True)
Wrap statements with description of their action.
Simply prints action before executing statement, without a trailing newline, and prints ‘done’ afterwards.
action : str
    description of action
verbose : bool, optional (default=True)
    whether to print anything at all
>>> with possibly_talking_action("Calling foo...", True):
...     foo()
Calling foo...done
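A context manager with this behavior could be written with `contextlib.contextmanager`, as in the sketch below. It is an illustration rather than featurehub's source: the name `possibly_talking_action_sketch` is hypothetical, and printing 'done' only when the wrapped statements succeed is an assumption.

```python
import contextlib
import io


@contextlib.contextmanager
def possibly_talking_action_sketch(action, verbose=True):
    """Print `action` before the wrapped statements and 'done' after them."""
    if verbose:
        # No trailing newline, so 'done' lands on the same line.
        print(action, end="", flush=True)
    yield
    if verbose:
        # Assumption: only reached when the wrapped statements do not raise.
        print("done")


# Capture stdout so the printed text can be inspected.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    with possibly_talking_action_sketch("Calling foo..."):
        total = 1 + 1
    with possibly_talking_action_sketch("Quiet step...", verbose=False):
        total += 1
output = buffer.getvalue()
```

With `verbose=False` the wrapped statements still run, but nothing is printed.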