larch.DataTree.setup_flow

DataTree.setup_flow(*args, **kwargs)

Set up a new Flow for analysis using the structure of this DataTree.

Parameters
  • definition_spec (Dict[str,str]) – Gives the names and expressions that define the variables to create in this new Flow.

  • cache_dir (Path-like, optional) – A location to write out generated Python and numba code. If not provided, a unique temporary directory is created.

  • name (str, optional) – The name of this Flow, used for writing out cached files. If not provided, a unique name is generated. If cache_dir is given, be sure to avoid name conflicts with other flows in the same directory.

  • dtype (str, default "float32") – The name of the numpy dtype that will be used for the output.

  • boundscheck (bool, default False) – If True, enables bounds checking for array indices, so that out-of-bounds accesses raise IndexError. The default is to skip bounds checking, which is faster but can produce garbage results or segfaults if there are problems. Try turning this on for debugging if you are getting unexplained errors or crashes.

  • error_model ({'numpy', 'python'}, default 'numpy') – Controls the divide-by-zero behavior. Setting it to ‘python’ causes divide-by-zero to raise an exception, as in CPython. Setting it to ‘numpy’ causes divide-by-zero to set the result to +/-inf or nan.

  • nopython (bool, default True) – Compile using numba’s nopython mode. The option to turn this off is provided for debugging only; there is little point in doing so for production code, as all the speed benefits of sharrow will be lost.

  • fastmath (bool, default True) – If True, enables the use of “fast” floating point transforms, which can improve performance but can result in tiny distortions in results. See the numba docs for details.

  • parallel (bool, default True) – Enable or disable parallel computation for certain functions.

  • readme (str, optional) – A string to inject as a comment at the top of the flow Python file.

  • flow_library (Mapping[str,Flow], optional) – An in-memory cache of precompiled Flow objects. Using this can result in performance improvements when repeatedly using the same definitions.

  • extra_hash_data (Tuple[Hashable], optional) – Additional data used for generating the flow hash. Useful to prevent conflicts when using a flow_library with multiple similar flows.

  • write_hash_audit (bool, default True) – Writes a hash audit log into a comment in the flow Python file, for debugging purposes.

  • hashing_level (int, default 1) – Level of detail to write into flow hashes. Increase detail to avoid hash conflicts for similar flows. Level 2 adds information about names used in expressions and digital encodings to the flow hash, which prevents conflicts but requires more pre-computation to generate the hash.

  • dim_exclude (Collection[str], optional) – Exclude these root dataset dimensions from this flow.
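
The two error_model settings described above can be illustrated in plain Python. This is only a sketch of the documented semantics: the divide helper below is hypothetical, is not part of larch, sharrow, or numba, and does not show how the compiled code is implemented.

```python
import math


def divide(a: float, b: float, error_model: str = "numpy") -> float:
    """Hypothetical helper that mimics the two documented error models."""
    if error_model == "python":
        # CPython semantics: dividing by zero raises ZeroDivisionError.
        return a / b
    if error_model != "numpy":
        raise ValueError("error_model must be 'numpy' or 'python'")
    if b != 0:
        return a / b
    # numpy-style semantics: no exception, the result is nan or +/-inf.
    if a == 0 or math.isnan(a):
        return math.nan
    # The sign of the infinity follows the signs of both operands.
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    return math.copysign(math.inf, sign)


positive = divide(1.0, 0.0)       # numpy model: +inf
negative = divide(-1.0, 0.0)      # numpy model: -inf
undefined = divide(0.0, 0.0)      # numpy model: nan

try:
    divide(1.0, 0.0, error_model="python")
    raised = False
except ZeroDivisionError:
    raised = True                 # python model: an exception is raised
```

With error_model='numpy', bad values can pass silently into the output, so checking results for inf or nan may be worthwhile when a divisor can be zero.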

Returns

Flow
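
As a rough sketch of how the arguments above might be assembled for a debugging run, the snippet below builds a definition_spec and a set of keyword options using only the parameter names documented here. The variable names, the expressions, the flow name, and the tree object are all hypothetical, and switching off fastmath and parallel for debugging is an assumption rather than a recommendation from the library.

```python
# Names and expressions for the variables the new Flow should create.
# These names and expressions are made up for illustration.
definition_spec = {
    "cost_per_mile": "cost / distance",
    "income_thousands": "income * 0.001",
}


def debug_options() -> dict:
    """Hypothetical helper: keyword options that favor diagnosability over speed."""
    return {
        "boundscheck": True,      # out-of-bounds accesses raise IndexError
        "error_model": "python",  # divide-by-zero raises an exception
        "fastmath": False,        # assumption: avoid tiny floating point distortions
        "parallel": False,        # assumption: simpler to trace when debugging
    }


flow_kwargs = {"name": "example_flow", "dtype": "float32", **debug_options()}

# With an existing DataTree bound to the hypothetical name `tree`,
# the call would then look like:
#     flow = tree.setup_flow(definition_spec, **flow_kwargs)
```

Once the unexplained errors are resolved, dropping the debug options returns the flow to the faster defaults listed in the parameter descriptions.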