301: Itinerary Choice using MNL
301: Itinerary Choice using MNL¶
import pandas as pd
import larch
larch.__version__
'5.7.0'
This example is an itinerary choice model built using the example itinerary choice dataset included with Larch. We’ll begin by loading that example data.
from larch.data_warehouse import example_file
itin = pd.read_csv(example_file("arc"), index_col=['id_case','id_alt'])
d = larch.DataFrames(itin, ch='choice', crack=True, autoscale_weights=True)
rescaled array of weights by a factor of 2239.980952380952
Now let’s make our model. We’ll use a few variables to define our linear-in-parameters utility function.
m = larch.Model(dataservice=d)
v = [
"timeperiod==2",
"timeperiod==3",
"timeperiod==4",
"timeperiod==5",
"timeperiod==6",
"timeperiod==7",
"timeperiod==8",
"timeperiod==9",
"carrier==2",
"carrier==3",
"carrier==4",
"carrier==5",
"equipment==2",
"fare_hy",
"fare_ly",
"elapsed_time",
"nb_cnxs",
]
The larch.roles
module defines a few convenient classes for declaring data and parameter.
One we will use here is PX
which creates a linear-in-parameter term that represents one data
element (a column from our data, or an expression that can be evaluated on the data alone) multiplied
by a parameter with the same name.
from larch.roles import PX
m.utility_ca = sum(PX(i) for i in v)
m.choice_ca_var = 'choice'
Since we are estimating just an MNL model in this example, this is all we need to do to build our model, and we’re ready to go. To estimate the likelihood maximizing parameters, we give:
m.load_data()
m.maximize_loglike()
req_data does not request weight_co but it is set and being provided
req_data does not request avail_ca or avail_co but it is set and being provided
converting data_ce to <class 'numpy.float64'>
Iteration 011 [Optimization terminated successfully.]
Best LL = -777770.0688722524
value | initvalue | nullvalue | minimum | maximum | holdfast | note | best | |
---|---|---|---|---|---|---|---|---|
carrier==2 | 0.117200 | 0.0 | 0.0 | -inf | inf | 0 | 0.117200 | |
carrier==3 | 0.638554 | 0.0 | 0.0 | -inf | inf | 0 | 0.638554 | |
carrier==4 | 0.565252 | 0.0 | 0.0 | -inf | inf | 0 | 0.565252 | |
carrier==5 | -0.624022 | 0.0 | 0.0 | -inf | inf | 0 | -0.624022 | |
elapsed_time | -0.006087 | 0.0 | 0.0 | -inf | inf | 0 | -0.006087 | |
equipment==2 | 0.466305 | 0.0 | 0.0 | -inf | inf | 0 | 0.466305 | |
fare_hy | -0.001175 | 0.0 | 0.0 | -inf | inf | 0 | -0.001175 | |
fare_ly | -0.001177 | 0.0 | 0.0 | -inf | inf | 0 | -0.001177 | |
nb_cnxs | -2.947153 | 0.0 | 0.0 | -inf | inf | 0 | -2.947153 | |
timeperiod==2 | 0.095949 | 0.0 | 0.0 | -inf | inf | 0 | 0.095949 | |
timeperiod==3 | 0.126533 | 0.0 | 0.0 | -inf | inf | 0 | 0.126533 | |
timeperiod==4 | 0.060552 | 0.0 | 0.0 | -inf | inf | 0 | 0.060552 | |
timeperiod==5 | 0.140963 | 0.0 | 0.0 | -inf | inf | 0 | 0.140963 | |
timeperiod==6 | 0.238254 | 0.0 | 0.0 | -inf | inf | 0 | 0.238254 | |
timeperiod==7 | 0.351391 | 0.0 | 0.0 | -inf | inf | 0 | 0.351391 | |
timeperiod==8 | 0.353302 | 0.0 | 0.0 | -inf | inf | 0 | 0.353302 | |
timeperiod==9 | -0.010309 | 0.0 | 0.0 | -inf | inf | 0 | -0.010309 |
key | value | ||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
loglike | -777770.0688722524 | ||||||||||||||||||||||||||||||||||||
x |
| ||||||||||||||||||||||||||||||||||||
tolerance | 1.3256993570743925e-06 | ||||||||||||||||||||||||||||||||||||
steps | array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]) | ||||||||||||||||||||||||||||||||||||
message | 'Optimization terminated successfully.' | ||||||||||||||||||||||||||||||||||||
elapsed_time | 0:00:00.248166 | ||||||||||||||||||||||||||||||||||||
method | 'bhhh' | ||||||||||||||||||||||||||||||||||||
n_cases | 105 | ||||||||||||||||||||||||||||||||||||
iteration_number | 11 | ||||||||||||||||||||||||||||||||||||
logloss | 3.3068736505933396 |