102: Swissmetro Weighted MNL Mode Choice
102: Swissmetro Weighted MNL Mode Choice¶
import pandas as pd
import larch.numba as lx
This example is a mode choice model built using the Swissmetro example dataset. First we create the Dataset and Model objects:
raw_data = pd.read_csv(lx.example_file('swissmetro.csv.gz')).rename_axis(index='CASEID')
data = lx.Dataset.construct.from_idco(raw_data, alts={1:'Train', 2:'SM', 3:'Car'})
data
<xarray.Dataset> Dimensions: (CASEID: 10728, _altid_: 3) Coordinates: * CASEID (CASEID) int64 0 1 2 3 4 5 ... 10723 10724 10725 10726 10727 * _altid_ (_altid_) int64 1 2 3 alt_names (_altid_) <U5 'Train' 'SM' 'Car' Data variables: (12/28) GROUP (CASEID) int64 2 2 2 2 2 2 2 2 2 2 2 2 ... 3 3 3 3 3 3 3 3 3 3 3 SURVEY (CASEID) int64 0 0 0 0 0 0 0 0 0 0 0 0 ... 1 1 1 1 1 1 1 1 1 1 1 SP (CASEID) int64 1 1 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1 1 1 ID (CASEID) int64 1 1 1 1 1 1 1 ... 1192 1192 1192 1192 1192 1192 PURPOSE (CASEID) int64 1 1 1 1 1 1 1 1 1 1 1 1 ... 4 4 4 4 4 4 4 4 4 4 4 FIRST (CASEID) int64 0 0 0 0 0 0 0 0 0 0 0 0 ... 1 1 1 1 1 1 1 1 1 1 1 ... ... SM_CO (CASEID) int64 52 49 58 52 42 49 58 43 ... 21 21 17 16 16 17 21 SM_HE (CASEID) int64 20 10 30 20 20 10 10 30 ... 20 30 30 10 20 30 30 SM_SEATS (CASEID) int64 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 CAR_TT (CASEID) int64 117 117 117 72 90 90 72 ... 96 156 96 96 96 120 CAR_CO (CASEID) int64 65 84 52 52 84 52 65 65 ... 70 91 56 70 56 91 70 CHOICE (CASEID) int64 2 2 2 2 2 2 2 1 2 2 2 2 ... 3 3 3 2 2 3 2 3 3 2 3 Attributes: _caseid_: CASEID _altid_: _altid_
The swissmetro example models exclude some observations. We can use the
Dataset.query_cases
method to identify the observations we would like to keep.
m = lx.Model(data.dc.query_cases("PURPOSE in (1,3) and CHOICE != 0"))
We can attach a title to the model. The title does not affect the calculations as all; it is merely used in various output report styles.
m.title = "swissmetro example 02 (weighted logit)"
We need to identify the availability and choice variables.
m.availability_co_vars = {
1: "TRAIN_AV * (SP!=0)",
2: "SM_AV",
3: "CAR_AV * (SP!=0)",
}
m.choice_co_code = 'CHOICE'
This model adds a weighting factor.
m.weight_co_var = "1.0*(GROUP==2)+1.2*(GROUP==3)"
The swissmetro dataset, as with all Biogeme data, is only in co
format.
from larch.roles import P,X
m.utility_co[1] = P("ASC_TRAIN")
m.utility_co[2] = 0
m.utility_co[3] = P("ASC_CAR")
m.utility_co[1] += X("TRAIN_TT") * P("B_TIME")
m.utility_co[2] += X("SM_TT") * P("B_TIME")
m.utility_co[3] += X("CAR_TT") * P("B_TIME")
m.utility_co[1] += X("TRAIN_CO*(GA==0)") * P("B_COST")
m.utility_co[2] += X("SM_CO*(GA==0)") * P("B_COST")
m.utility_co[3] += X("CAR_CO") * P("B_COST")
Larch will find all the parameters in the model, but we’d like to output them in a rational order. We can use the ordering method to do this:
m.ordering = [
("ASCs", 'ASC.*',),
("LOS", 'B_.*',),
]
We can estimate the models and check the results match up with those given by Biogeme:
m.set_cap(15)
m.maximize_loglike(method='SLSQP')
key | value | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
x |
| ||||||||||
loglike | -5931.557677745116 | ||||||||||
d_loglike |
| ||||||||||
nit | 11 | ||||||||||
nfev | 26 | ||||||||||
njev | 11 | ||||||||||
status | 0 | ||||||||||
message | 'Optimization terminated successfully' | ||||||||||
success | True | ||||||||||
elapsed_time | 0:00:00.178762 | ||||||||||
method | 'SLSQP' | ||||||||||
n_cases | 6768 | ||||||||||
iteration_number | 0 | ||||||||||
logloss | 0.7792172667225135 | ||||||||||
__verbose_repr__ | True |
m.calculate_parameter_covariance()
m.parameter_summary()
Value | Std Err | t Stat | Signif | Null Value | ||
---|---|---|---|---|---|---|
Category | Parameter | |||||
ASCs | ASC_CAR | -0.114 | 0.0407 | -2.81 | ** | 0.00 |
ASC_TRAIN | -0.757 | 0.0528 | -14.32 | *** | 0.00 | |
LOS | B_COST | -0.0112 | 0.000490 | -22.83 | *** | 0.00 |
B_TIME | -0.0132 | 0.000537 | -24.62 | *** | 0.00 |
Looks good!