301: Itinerary Choice using MNL

[1]:
import larch
[2]:
from larch.log.verbose import log

This example is an itinerary choice model built using the example itinerary choice dataset included with Larch. We’ll begin by loading that example data.

[3]:
from larch.examples import example
d = example(300, 'd')
[2019-01-11 12:08:11,905] L4.WARNING: converting data_co to <class 'numpy.float64'>
[2019-01-11 12:08:11,907] L4.WARNING: converting data_ce to <class 'numpy.float64'>
[2019-01-11 12:08:11,911] L4.WARNING: rescaled array of weights by a factor of 2239.980952380952

Now let’s make our model. We’ll use a few variables to define our linear-in-parameters utility function.

[4]:
m = larch.Model(dataservice=d)

v = [
    "timeperiod==2",
    "timeperiod==3",
    "timeperiod==4",
    "timeperiod==5",
    "timeperiod==6",
    "timeperiod==7",
    "timeperiod==8",
    "timeperiod==9",
    "carrier==2",
    "carrier==3",
    "carrier==4",
    "carrier==5",
    "equipment==2",
    "fare_hy",
    "fare_ly",
    "elapsed_time",
    "nb_cnxs",
]

The larch.roles module defines a few convenient classes for declaring data and parameter. One we will use here is PX which creates a linear-in-parameter term that represents one data element (a column from our data, or an expression that can be evaluated on the data alone) multiplied by a parameter with the same name.

[5]:
from larch.roles import PX
m.utility_ca = sum(PX(i) for i in v)
[6]:
m.choice_ca_var = 'choice'

Since we are estimating just an MNL model in this example, this is all we need to do to build our model, and we’re ready to go. To estimate the likelihood maximizing parameters, we give:

[7]:
m.load_data()
[2019-01-11 12:08:12,272] L4.WARNING: req_data does not request weight_co but it is set and being provided
[8]:
m.maximize_loglike()

Iteration 011 [Converged]

LL = -777770.0688722526

value initvalue nullvalue minimum maximum holdfast note best
carrier==2 0.117200 0.0 0.0 -inf inf 0 0.117200
carrier==3 0.638554 0.0 0.0 -inf inf 0 0.638554
carrier==4 0.565252 0.0 0.0 -inf inf 0 0.565252
carrier==5 -0.624022 0.0 0.0 -inf inf 0 -0.624022
elapsed_time -0.006087 0.0 0.0 -inf inf 0 -0.006087
equipment==2 0.466305 0.0 0.0 -inf inf 0 0.466305
fare_hy -0.001175 0.0 0.0 -inf inf 0 -0.001175
fare_ly -0.001177 0.0 0.0 -inf inf 0 -0.001177
nb_cnxs -2.947153 0.0 0.0 -inf inf 0 -2.947153
timeperiod==2 0.095949 0.0 0.0 -inf inf 0 0.095949
timeperiod==3 0.126533 0.0 0.0 -inf inf 0 0.126533
timeperiod==4 0.060552 0.0 0.0 -inf inf 0 0.060552
timeperiod==5 0.140963 0.0 0.0 -inf inf 0 0.140963
timeperiod==6 0.238254 0.0 0.0 -inf inf 0 0.238254
timeperiod==7 0.351391 0.0 0.0 -inf inf 0 0.351391
timeperiod==8 0.353302 0.0 0.0 -inf inf 0 0.353302
timeperiod==9 -0.010309 0.0 0.0 -inf inf 0 -0.010309
[8]:
┣          loglike: -777770.0688722526
┣                x: carrier==2       0.117200
┃                   carrier==3       0.638554
┃                   carrier==4       0.565252
┃                   carrier==5      -0.624022
┃                   elapsed_time    -0.006087
┃                   equipment==2     0.466305
┃                   fare_hy         -0.001175
┃                   fare_ly         -0.001177
┃                   nb_cnxs         -2.947153
┃                   timeperiod==2    0.095949
┃                   timeperiod==3    0.126533
┃                   timeperiod==4    0.060552
┃                   timeperiod==5    0.140963
┃                   timeperiod==6    0.238254
┃                   timeperiod==7    0.351391
┃                   timeperiod==8    0.353302
┃                   timeperiod==9   -0.010309
┃                   dtype: float64
┣        tolerance: 1.3256993599095283e-06
┣            steps: array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
┣          message: 'Optimization terminated successfully.'
┣     elapsed_time: datetime.timedelta(microseconds=233897)
┣           method: 'bhhh'
┣          n_cases: 105
┣ iteration_number: 11
┣          logloss: 7407.333989259549
[ ]: