109: Swissmetro Nested Logit Mode Choice
109: Swissmetro Nested Logit Mode Choice¶
import pandas as pd
import larch.numba as lx
This example is a mode choice model built using the Swissmetro example dataset. First we create the Dataset and Model objects.
raw_data = pd.read_csv(lx.example_file('swissmetro.csv.gz')).rename_axis(index='CASEID')
data = lx.Dataset.construct.from_idco(raw_data, alts={1:'Train', 2:'SM', 3:'Car'})
m = lx.Model(data.dc.query_cases("PURPOSE in (1,3) and CHOICE != 0"))
We can attach a title to the model. The title does not affect the calculations as all; it is merely used in various output report styles.
m.title = "swissmetro example 09 (nested logit)"
We need to identify the availability and choice variables.
m.availability_co_vars = {
1: "TRAIN_AV * (SP!=0)",
2: "SM_AV",
3: "CAR_AV * (SP!=0)",
}
m.choice_co_code = 'CHOICE'
The swissmetro dataset, as with all Biogeme data, is only in co
format.
from larch.roles import P,X
m.utility_co[1] = ( P.ASC_TRAIN
+ P.B_TIME * X.TRAIN_TT
+ P.B_COST * X("TRAIN_CO*(GA==0)") )
m.utility_co[2] = ( P.B_TIME * X.SM_TT
+ P.B_COST * X("SM_CO*(GA==0)") )
m.utility_co[3] = ( P.ASC_CAR
+ P.B_TIME * X.CAR_TT
+ P.B_COST * X("CAR_CO") )
To create a new nest, we can use the graph.new_node
command.
For this example, we want to nest together the Train and Car modes into a “existing” modes nest.
Those are modes 1 and 3, so we can use the graph.new_node
command like this:
m.graph.new_node(parameter="existing", children=[1,3], name='Existing')
4
In a Jupyter notebook, we can verify the nesting structure visually if we like.
m.graph
Larch will find all the parameters in the model, but we’d like to output them in a rational order. We can use the ordering method to do this:
m.ordering = [
("ASCs", 'ASC.*',),
("LOS", 'B_.*',),
("LogSums", 'existing'),
]
We can estimate the models and check the results match up with those given by Biogeme:
m.set_cap(15)
m.maximize_loglike()
key | value | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
x |
| ||||||||||||
loglike | -5236.900013608125 | ||||||||||||
d_loglike |
| ||||||||||||
nit | 17 | ||||||||||||
nfev | 37 | ||||||||||||
njev | 17 | ||||||||||||
status | 0 | ||||||||||||
message | 'Optimization terminated successfully' | ||||||||||||
success | True | ||||||||||||
elapsed_time | 0:00:00.290973 | ||||||||||||
method | 'slsqp' | ||||||||||||
n_cases | 6768 | ||||||||||||
iteration_number | 0 | ||||||||||||
logloss | 0.7737736426725953 | ||||||||||||
__verbose_repr__ | True |
m.calculate_parameter_covariance()
m.parameter_summary()
Value | Std Err | t Stat | Signif | Null Value | ||
---|---|---|---|---|---|---|
Category | Parameter | |||||
ASCs | ASC_CAR | -0.167 | 0.0371 | -4.50 | *** | 0.00 |
ASC_TRAIN | -0.512 | 0.0452 | -11.33 | *** | 0.00 | |
LOS | B_COST | -0.00857 | 0.000463 | -18.51 | *** | 0.00 |
B_TIME | -0.00899 | 0.000570 | -15.77 | *** | 0.00 | |
LogSums | existing | 0.487 | 0.0279 | -18.39 | *** | 1.00 |
The pfo
(parameter frame, ordered) has other useful information,
including things like the robust standard errors and t statistics.
print(m.pfo()[['value','std_err','t_stat','robust_std_err','robust_t_stat']])
value std_err t_stat robust_std_err robust_t_stat
Category Parameter
ASCs ASC_CAR -0.167 3.714e-02 -4.501 5.453e-02 -3.065
ASC_TRAIN -0.512 4.518e-02 -11.331 7.911e-02 -6.471
LOS B_COST -0.009 4.627e-04 -18.513 6.004e-04 -14.269
B_TIME -0.009 5.699e-04 -15.769 1.071e-03 -8.390
LogSums existing 0.487 2.790e-02 -18.394 3.892e-02 -13.186