202: Exampville Mode Choice Logsums

202: Exampville Mode Choice Logsums

Welcome to Exampville, the best simulated town in this here part of the internet!

Exampville is a demonstration provided with Larch that walks through some of the data and tools that a transportation planner might use when building a travel model.

import os
import numpy as np
import xarray as xr
from addicty import Dict
from larch import P, X
import larch.numba as lx

In this example notebook, we will walk through the creation of logsums from an existing tour mode choice model. First, let’s load the data files from our example.

hh, pp, tour, skims = lx.example(200, ['hh', 'pp', 'tour', 'skims'])

We’ll also load the saved model from the mode choice estimation.

exampville_mode_choice_file = lx.example(201, output_file='/tmp/exampville_mode_choice.html')
m = lx.read_metadata(exampville_mode_choice_file)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [4], in <module>
----> 1 exampville_mode_choice_file = lx.example(201, output_file='/tmp/exampville_mode_choice.html')
      2 m = lx.read_metadata(exampville_mode_choice_file)

File ~/work/larch/larch/larch/larch/numba/__init__.py:15, in example(*args, **kwargs)
     13 import importlib
     14 kwargs['larch'] = importlib.import_module(__name__)
---> 15 return _example(*args, **kwargs)

File ~/work/larch/larch/larch/larch/examples/__init__.py:163, in example(n, extract, echo, output_file, larch, legacy)
    161 if os.path.exists(output_file):
    162 	return output_file
--> 163 _exec_example_n(n, extract=None, echo=echo, larch=larch, legacy=legacy)
    164 if os.path.exists(output_file):
    165 	return output_file

File ~/work/larch/larch/larch/larch/examples/__init__.py:135, in _exec_example_n(n, echo, larch, legacy, *arg, **kwarg)
    133 def _exec_example_n(n, *arg, echo=False, larch=None, legacy=False, **kwarg):
    134 	if not legacy and n in ex:
--> 135 		return ex[n](*arg, **kwarg)
    136 	f = os.path.join(exampledocdir, get_examplefile(n))
    137 	return _exec_example(f, *arg, echo=echo, larch=larch, **kwarg)

File ~/work/larch/larch/larch/larch/examples/generated/_201_exville_mode_choice.py:44, in example(extract, estimate)
     41 m = lx.Model(datatree = dt_work)
     42 m.title = "Exampville Work Tour Mode Choice v1"
---> 44 m.utility_co[Mode.DA] = (
     45         + P.InVehTime * X.AUTO_TIME
     46         + P.Cost * X.AUTO_COST # dollars per mile
     47 )
     49 m.utility_co[Mode.SR] = (
     50         + P.ASC_SR
     51         + P.InVehTime * X.AUTO_TIME
     52         + P.Cost * (X.AUTO_COST * 0.5) # dollars per mile, half share
     53         + P("LogIncome:SR") * X("log(INCOME)")
     54 )
     56 m.utility_co[Mode.Walk] = (
     57         + P.ASC_Walk
     58         + P.NonMotorTime * X.WALK_TIME
     59         + P("LogIncome:Walk") * X("log(INCOME)")
     60 )

AttributeError: 'NoneType' object has no attribute 'DA'

We’ll replicate the pre-processing used in the mode choice estimation, to merge the household and person characteristics into the tours data, add the index values for the home TAZ’s, filter to include only work tours, and merge with the level of service skims. (If this pre-processing was computationally expensive, it would probably have been better to save the results to disk and reload them as needed, but for this model these commands will run almost instantaneously.)

Mode = Dict(
    DA = 1,
    SR = 2,
    Walk = 3,
    Bike = 4,
    Transit = 5,
).freeze()

tour_dataset = lx.Dataset.from_idco(tour.set_index('TOURID'), alts=Mode)
od_skims = lx.Dataset.from_omx(skims)

dt = lx.DataTree(
    tour=tour_dataset.query_cases("TOURPURP == 1"),
    hh=hh.set_index('HHID'),
    person=pp.set_index('PERSONID'),
    od=od_skims,
    do=od_skims,
    relationships=(
        "tours.HHID @ hh.HHID",
        "tours.PERSONID @ person.PERSONID",
        "hh.HOMETAZ @ od.otaz",
        "tours.DTAZ @ od.dtaz",
        "hh.HOMETAZ @ do.dtaz",
        "tours.DTAZ @ do.otaz",
    ),
)

Then we bundle the raw data into the larch.DataFrames structure, as we did for estimation, and attach this structure to the model as its dataservice.

We’ll also initialize a DataArray to hold the computed logsums. This data will have one row for each case in our source data, and a column for each possible destination zone.

logsums = lx.DataArray.zeros(
    dt.caseids(),
    skims.TAZ_ID,
    name='logsums',
)

The logsums from a Model can be computed using the Model.logsums method. However, if we want the logsums for each possible destination, we’ll need to replace the part of our data that depends on the destination zone, writing in the appropriate values for each. We can simply iterate over the zones, dropping in the zone as the destination and computing the logsums.

for i,dtaz in enumerate(logsums.TAZ_ID):
    m.datatree = dt.replace_datasets(
        tour=dt.root_dataset.assign(
            DTAZ=xr.full_like(dt['DTAZ'], dtaz)
        ),
    )
    logsums.loc[dict(TAZ_ID=dtaz)] = m.logsums()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [7], in <module>
      1 for i,dtaz in enumerate(logsums.TAZ_ID):
----> 2     m.datatree = dt.replace_datasets(
      3         tour=dt.root_dataset.assign(
      4             DTAZ=xr.full_like(dt['DTAZ'], dtaz)
      5         ),
      6     )
      7     logsums.loc[dict(TAZ_ID=dtaz)] = m.logsums()

NameError: name 'm' is not defined

Then we can persist the logsums dataframe to disk, for use in the next example, where we will estimate a destination choice model.

logsums.to_zarr('/tmp/logsums.zarr.zip')

To recover the DataArray later, we can read it using the DataArray.from_zarr method.

lx.DataArray.from_zarr('/tmp/logsums.zarr.zip')
<larch.DataArray 'logsums' (TOURID: 7564, TAZ_ID: 40)>
dask.array<open_dataset-55c5e06d829e382dc4a5aa307f7fc5e9logsums, shape=(7564, 40), dtype=float64, chunksize=(1891, 20), chunktype=numpy.ndarray>
Coordinates:
  * TAZ_ID   (TAZ_ID) int64 1 2 3 4 5 6 7 8 9 10 ... 32 33 34 35 36 37 38 39 40
  * TOURID   (TOURID) int64 0 1 3 7 10 13 ... 20728 20729 20731 20734 20736