lineapy.utils package



lineapy.utils.benchmarks module

Utilities for computing statistics on benchmark data.

Translated from which was originally added in

lineapy.utils.benchmarks.distribution_change(old_measures: List[float], new_measures: List[float], confidence_interval: float = 0.95) lineapy.utils.benchmarks.DistributionChange[source]

Compute the performance change based on a number of old and new measurements.

Based on the work by Tomas Kalibera and Richard Jones. See their paper “Quantifying Performance Changes with Effect Size Confidence Intervals”, section 6.2, formula “Quantifying Performance Change”.

Note: The two measurement lists must have the same length. As a fallback, you can truncate both lists to the length of the shorter one.

  • old_measures – The list of timings from the old system

  • new_measures – The list of timings from the new system

  • confidence_interval – The confidence level for the results. The default is a 95% confidence interval (95% of the time, the true mean will lie within the resulting mean ± the resulting CI)
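The fallback described in the note can be sketched as follows. The helper name `truncate_to_common_length` is hypothetical and not part of lineapy; it only trims both lists to the shorter length so they can be passed on as equally sized inputs.

```python
from typing import List, Tuple


def truncate_to_common_length(
    old_measures: List[float], new_measures: List[float]
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: trim both lists to the length of the shorter one."""
    common = min(len(old_measures), len(new_measures))
    return old_measures[:common], new_measures[:common]


old, new = truncate_to_common_length([10.0, 12.3, 10.3], [8.8, 6.3])
```

Note that truncation discards measurements, so collecting equally sized samples in the first place is preferable where possible.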

# Test against the example in the paper, from Table V, on pages 18-19

>>> from statistics import mean
>>> from lineapy.utils.benchmarks import distribution_change
>>> res = distribution_change(
...     old_measures=[
...         round(mean([9, 11, 5, 6]), 1),
...         round(mean([16, 13, 12, 8]), 1),
...         round(mean([15, 7, 10, 14]), 1),
...     ],
...     new_measures=[
...         round(mean([10, 12, 6, 7]), 1),
...         round(mean([9, 1, 11, 4]), 1),
...         round(mean([8, 5, 3, 2]), 1),
...     ],
...     confidence_interval=0.95
... )
>>> from math import isclose
>>> assert isclose(res.mean, 68.3 / 74.5, rel_tol=0.05)
>>> assert isclose(res.confidence_interval, 60.2 / 74.5, rel_tol=0.05)

lineapy.utils.config module

class lineapy.utils.config.lineapy_config(home_dir='/home/docs/.lineapy', database_url=None, artifact_storage_dir=None, customized_annotation_folder=None, do_not_track=False, logging_level='INFO', logging_file=None, storage_options=None, is_demo=False)[source]

LineaPy Configuration

A dataclass that holds configuration items and sets them as environment variables. All items are first initialized with their default values. These are then replaced with values from the configuration file, if one is available (the file in LINEAPY_HOME_DIR is used if it exists; otherwise the home directory is searched). Next, they are replaced with values from environment variables where those are set. Finally, all resulting values are written back to environment variables.

  • home_dir – home directory of LineaPy (must be local)

  • database_url – database connection string for LineaPy database

  • artifact_storage_dir – directory for storing artifacts

  • customized_annotation_folder – directory for storing customized annotations

  • do_not_track – opt out of user analytics

  • logging_level – logging level

  • logging_file – logging file location (only local paths are supported at this time)

  • storage_options – a dictionary for artifact storage configuration (same as storage_options in pandas, Dask and fsspec)

get(key: str) Any[source]

Get LineaPy config field

set(key: str, value: Any, verbose=True) None[source]

Set LineaPy config field
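The resolution order described above (defaults, then configuration file, then environment variables) can be illustrated with a small standalone sketch. This is not lineapy's implementation: `ConfigSketch`, `resolve_config`, and the `LINEAPY_` + upper-cased field name mapping are assumptions made for illustration, and only three fields are modeled.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for a config dataclass with defaults."""

    home_dir: str = "~/.lineapy"
    logging_level: str = "INFO"
    database_url: Optional[str] = None

    def get(self, key: str) -> Any:
        if key not in {f.name for f in fields(self)}:
            raise KeyError(key)
        return getattr(self, key)

    def set(self, key: str, value: Any) -> None:
        if key not in {f.name for f in fields(self)}:
            raise KeyError(key)
        setattr(self, key, value)


def resolve_config(
    file_values: Mapping[str, Any],
    environ: Mapping[str, str],
    prefix: str = "LINEAPY_",  # assumed naming scheme
) -> ConfigSketch:
    config = ConfigSketch()  # step 1: defaults
    for f in fields(config):
        if f.name in file_values:  # step 2: configuration file
            config.set(f.name, file_values[f.name])
        env_key = prefix + f.name.upper()
        if env_key in environ:  # step 3: environment wins
            config.set(f.name, environ[env_key])
    return config


cfg = resolve_config(
    file_values={"logging_level": "DEBUG", "database_url": "sqlite:///file.db"},
    environ={"LINEAPY_LOGGING_LEVEL": "ERROR"},
)
```

Here the environment value overrides the file value for `logging_level`, the file value is kept for `database_url`, and `home_dir` keeps its default.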

lineapy.utils.constants module

lineapy.utils.constants.VERSION_DATE_STRING = '%Y-%m-%dT%H:%M:%S'

sqlalchemy defaults to a type of Optional[str] even when a column is set to be not nullable. This is per their documentation. One option is to add type:ignore for python objects that should not be nulls and are mapped to sqlalchemy ORM objects. Alternately, as is here, we can add a placeholder. This will be used like = or placeholder. This should separate out the ORM objects and their policy of setting all columns to be Optional vs app objects that should reflect the app’s expectation of not allowing nulls. The app object’s property does not get set to None and the ORM object doesn’t need to worry about knowing what the app is doing.
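The placeholder pattern described here can be sketched in isolation. All names below (`OrmRow`, `AppRecord`, `PLACEHOLDER_SKETCH`, `to_app_record`) are hypothetical and do not come from lineapy; the sketch only shows how an `Optional[str]` ORM attribute can be mapped to a non-optional application attribute with `or`.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical placeholder value, used only in this sketch.
PLACEHOLDER_SKETCH = "unknown"


@dataclass
class OrmRow:
    """Stands in for an ORM-mapped object whose column is typed Optional[str]."""

    version: Optional[str]


@dataclass
class AppRecord:
    """Application-side object that never holds None."""

    version: str


def to_app_record(row: OrmRow) -> AppRecord:
    # `or` substitutes the placeholder when the ORM value is None (or empty),
    # so the app object's type stays `str` without a `# type: ignore`.
    return AppRecord(version=row.version or PLACEHOLDER_SKETCH)
```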

lineapy.utils.deprecation_utils module

lineapy.utils.deprecation_utils.get_source_segment(source, node, padded=False)[source]

Get source code segment of the source that generated node.


This is a polyfill for the ast.get_source_segment function, which was introduced in Python 3.8.

If some location information (lineno, end_lineno, col_offset, or end_col_offset) is missing, return None.

If padded is True, the first line of a multi-line statement will be padded with spaces to match its original position.
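The behavior can be tried out with the standard library function that this polyfill stands in for. The sketch below uses `ast.get_source_segment` from Python 3.8+ directly, on the assumption that the polyfill mirrors it; the sample source string is made up for illustration.

```python
import ast

source = "x = 1\ny = foo(\n    2,\n)\n"
tree = ast.parse(source)

# The call node `foo(2,)` spans lines 2-4 of the source.
call_node = tree.body[1].value

segment = ast.get_source_segment(source, call_node)
padded_segment = ast.get_source_segment(source, call_node, padded=True)

# A node built by hand carries no location information.
detached = ast.Name(id="x", ctx=ast.Load())
missing = ast.get_source_segment(source, detached)

print(repr(segment))         # 'foo(\n    2,\n)'
print(repr(padded_segment))  # '    foo(\n    2,\n)'
print(missing)               # None
```

With `padded=True`, the text before the node on its first line (`y = `) is replaced by four spaces, so the columns of the segment line up with the original source.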

class lineapy.utils.deprecation_utils.singledispatchmethod(func)[source]

Single-dispatch generic method descriptor.

Supports wrapping existing descriptors and handles non-descriptor callables as instance methods.

register(cls, func) func[source]

Registers a new implementation for the given cls on a generic_method.
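A short usage sketch, written against `functools.singledispatchmethod` from the Python 3.8+ standard library on the assumption that this class offers the same interface. The `Formatter` class is a made-up example.

```python
from functools import singledispatchmethod


class Formatter:
    @singledispatchmethod
    def fmt(self, value):
        # Fallback used when no more specific type is registered.
        return f"object:{value!r}"

    @fmt.register(int)
    def _(self, value):
        return f"int:{value}"

    @fmt.register(str)
    def _(self, value):
        return f"str:{value}"


formatter = Formatter()
print(formatter.fmt(3))    # int:3
print(formatter.fmt("a"))  # str:a
print(formatter.fmt(1.5))  # object:1.5
```

Dispatch happens on the type of the first argument after `self`.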

lineapy.utils.lineabuiltins module

lineapy.utils.lineabuiltins.l_dict(*keys_and_values: Union[Tuple[lineapy.utils.lineabuiltins.K, lineapy.utils.lineabuiltins.V], Tuple[lineapy.utils.lineabuiltins._DictKwargsSentinel, Mapping[lineapy.utils.lineabuiltins.K, lineapy.utils.lineabuiltins.V]]]) Dict[lineapy.utils.lineabuiltins.K, lineapy.utils.lineabuiltins.V][source]

Build a dict from a number of key value pairs.

There is a special case for dictionary unpacking. In this case, the key will be an instance of _DictKwargsSentinel.

For example, if the user creates a dict like {1: 2, **d, 3: 4}, then it will create a call like:

l_dict((1, 2), (l_dict_kwargs_sentinel(), d), (3, 4))

We use a sentinel value instead of None, because None can be a valid dictionary key.
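The sentinel approach can be sketched independently of lineapy. `UnpackSentinel` and `build_dict` below are hypothetical names that mimic the described behavior; they are not the library's own code.

```python
from typing import Any, Dict, Tuple


class UnpackSentinel:
    """Hypothetical marker: the paired value is a mapping to unpack."""


def build_dict(*keys_and_values: Tuple[Any, Any]) -> Dict[Any, Any]:
    result: Dict[Any, Any] = {}
    for key, value in keys_and_values:
        if isinstance(key, UnpackSentinel):
            # Corresponds to `**value` inside a dict display.
            result.update(value)
        else:
            result[key] = value
    return result


d = {2: "x", 1: "override"}
built = build_dict((1, 2), (UnpackSentinel(), d), (3, 4))
```

Processing the pairs in order preserves the semantics of `{1: 2, **d, 3: 4}`, including later entries overriding earlier ones. A pair such as `(None, "value")` is stored as a regular key, which is why `None` cannot serve as the marker.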

lineapy.utils.lineabuiltins.l_exec_expr(code: str) object[source]

Executes code expressions. These typically are ast nodes that inherit from ast.expr. Examples include ast.ListComp and ast.Lambda.

Executes the code with input_locals set as locals, and returns a list of the output_locals pulled from the environment.

Returns the result as well as the last argument.

lineapy.utils.lineabuiltins.l_exec_statement(code: str) None[source]

Executes code statements. These typically are ast nodes that inherit from ast.stmt. Examples include ast.ClassDef, ast.If, ast.For, ast.FunctionDef, ast.While, ast.Try, and ast.With.

Executes the code with input_locals set as locals, and returns a list of the output_locals pulled from the environment.

Returns None. Since the code is a statement, it does not return anything.
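The distinction between expressions and statements can be illustrated with the built-in `eval` and `exec`. This is a simplified sketch under stated assumptions: `run_expression` and `run_statement` are hypothetical helpers that take an explicit environment dict, and they do not reproduce how lineapy passes variables in and out.

```python
import ast
from typing import Any, Dict


def run_expression(code: str, env: Dict[str, Any]) -> Any:
    # mode="eval" only accepts a single expression (an ast.expr).
    tree = ast.parse(code, mode="eval")
    return eval(compile(tree, "<expr>", "eval"), env)


def run_statement(code: str, env: Dict[str, Any]) -> None:
    # mode="exec" accepts statements (ast.stmt); nothing is returned,
    # and any effects are visible through `env` afterwards.
    tree = ast.parse(code, mode="exec")
    exec(compile(tree, "<stmt>", "exec"), env)


env: Dict[str, Any] = {"xs": [1, 2, 3]}
doubled = run_expression("[x * 2 for x in xs]", env)
outcome = run_statement("def double(v):\n    return v * 2", env)
```

After the calls, `doubled` holds the value of the list comprehension, `outcome` is `None`, and the function `double` is available in `env`.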

lineapy.utils.lineabuiltins.l_import(name: str, base_module: Optional[module] = None) module[source]

Imports and returns a module. If the base_module is provided, the module will be a submodule of the base.

If a base_module is provided, the base_module will be flagged as ‘mutated’ by our annotations.

lineapy.utils.lineabuiltins.l_unpack_ex(xs: Iterable[lineapy.utils.lineabuiltins.T], before: int, after: int) List[Union[lineapy.utils.lineabuiltins.T, List[lineapy.utils.lineabuiltins.T]]][source]

Splits the iterable xs into three pieces and then joins them as [*first, middle, *last]. The first piece has length before, the last piece has length after, and the middle piece holds whatever remains.

Modeled after the UNPACK_EX bytecode to be used in unpacking.
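A minimal sketch of this splitting, with the hypothetical name `unpack_ex_sketch`. It is checked against Python's own starred assignment and is not lineapy's implementation; the `ValueError` for short inputs is an assumption.

```python
from typing import Any, Iterable, List


def unpack_ex_sketch(xs: Iterable[Any], before: int, after: int) -> List[Any]:
    items = list(xs)
    if len(items) < before + after:
        raise ValueError(
            f"not enough values to unpack (expected at least {before + after}, "
            f"got {len(items)})"
        )
    middle_end = len(items) - after
    # The middle stays a nested list; the outer pieces are flattened around it.
    return [*items[:before], items[before:middle_end], *items[middle_end:]]


# Equivalent to: a, *b, c, d = range(5)
result = unpack_ex_sketch(range(5), before=1, after=2)
print(result)  # [0, [1, 2], 3, 4]
```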

lineapy.utils.lineabuiltins.l_unpack_sequence(xs: Iterable[lineapy.utils.lineabuiltins.T], n: int) List[lineapy.utils.lineabuiltins.T][source]

Asserts the iterable xs is of length n and turns it into a list.

The same as l_list, but asserts the length. This was modeled after the UNPACK_SEQUENCE bytecode, to be used in unpacking.

The result should be a view of the input.
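The length check can be sketched as below. `unpack_sequence_sketch` is a hypothetical name, and raising `ValueError` (as Python's own unpacking does) rather than using an assertion is an assumption of this sketch.

```python
from typing import Any, Iterable, List


def unpack_sequence_sketch(xs: Iterable[Any], n: int) -> List[Any]:
    items = list(xs)
    if len(items) != n:
        raise ValueError(f"expected {n} values to unpack, got {len(items)}")
    return items


# Equivalent to the right-hand side of: a, b, c = "abc"
letters = unpack_sequence_sketch("abc", 3)
```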

lineapy.utils.logging_config module

Setup logging config for CLI and debugging.

We don’t do this in our init because, when LineaPy is imported as a library, we don’t want to interfere with other packages’ logging configuration.

lineapy.utils.logging_config.configure_logging(level=None, LOG_SQL=False)[source]

Configure logging for Lineapy.

The logging level is read first from the function parameter, then from the lineapy_config options, and otherwise defaults to INFO.

This function should be idempotent.
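The level precedence and the idempotency requirement can be sketched with the standard `logging` module. `configure_logging_sketch`, its `configured_level` parameter, and the logger name are hypothetical; the real function may configure handlers differently.

```python
import logging
from typing import Optional


def configure_logging_sketch(
    level: Optional[str] = None, configured_level: Optional[str] = None
) -> logging.Logger:
    # Precedence: explicit argument, then configured option, then INFO.
    resolved = level or configured_level or "INFO"
    logger = logging.getLogger("lineapy_sketch")
    logger.setLevel(resolved)
    # Only attach a handler once, so repeated calls do not duplicate output.
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler())
    return logger


first = configure_logging_sketch()
second = configure_logging_sketch(configured_level="WARNING")
third = configure_logging_sketch(level="DEBUG", configured_level="WARNING")
```

All three calls return the same logger object with a single handler; only the level changes between calls.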

lineapy.utils.migration module

lineapy.utils.tree_logger module

Logging util for outputting function calls as trees!

This is currently exposed through the --tree-log pytest option, which will log each test case to stdout.

We log every method of the CLASSES, so to change what is logged, modify that list. Also, we color the statements, based on the class, using the CLASS_TO_COLOR mapping.

lineapy.utils.tree_logger.print_tree_log() None[source]

Print the tree log with rich.

lineapy.utils.tree_logger.start_tree_log(label: str) None[source]

Starts logging, by overriding the classes, and also sets the top level label for the tree.

lineapy.utils.utils module

lineapy.utils.utils.get_value_type(val: Any) Optional[][source]

This is a little hacky in order to avoid a dependency on external libraries. The current method is to check whether the dependent library has already been imported; if it has, we can reference it.

Note: Watch out for errors here if the Executor tests fail. TODO: We currently just silently ignore cases we can’t handle.

lineapy.utils.utils.listify(fn: lineapy.utils.utils.CALLABLE) lineapy.utils.utils.CALLABLE[source]

TODO: Once we switch to Python 3.10, we can type this properly

lineapy.utils.utils.remove_duplicates(xs: Iterable[lineapy.utils.utils.T]) Iterable[lineapy.utils.utils.T][source]

Remove all duplicate items, maintaining order.

lineapy.utils.utils.remove_value(xs: Iterable[lineapy.utils.utils.T], x: lineapy.utils.utils.T) Iterable[lineapy.utils.utils.T][source]

Remove all items equal to x.
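Both helpers can be sketched as generators. The `_sketch` names are hypothetical, and the duplicate check assumes hashable items; lineapy's own implementations may differ.

```python
from typing import Hashable, Iterable, Iterator, TypeVar

T = TypeVar("T", bound=Hashable)


def remove_duplicates_sketch(xs: Iterable[T]) -> Iterator[T]:
    seen = set()
    for x in xs:
        # Yield only the first occurrence, which keeps the original order.
        if x not in seen:
            seen.add(x)
            yield x


def remove_value_sketch(xs: Iterable[T], x: T) -> Iterator[T]:
    # Keep every item that does not compare equal to x.
    return (item for item in xs if item != x)


unique = list(remove_duplicates_sketch([3, 1, 3, 2, 1]))
without_ones = list(remove_value_sketch([1, 2, 1, 3], 1))
print(unique)        # [3, 1, 2]
print(without_ones)  # [2, 3]
```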

lineapy.utils.validate_annotation_spec module

Validate the annotations.yaml files in the instrumentation directory.

lineapy.utils.validate_annotation_spec.validate_spec(spec_file: pathlib.Path) List[Any][source]

Validate all ‘.annotations.yaml’ files at path and return all invalid items.

Raises yaml.YAMLError.

lineapy.utils.version module

Module contents