302: Itinerary Choice using Simple Nested Logit

import pandas as pd
import larch
larch.__version__

'5.7.0'

This example is an itinerary choice model built using the example itinerary choice dataset included with Larch. See example 300 for details.

from larch.data_warehouse import example_file
itin = pd.read_csv(example_file("arc"), index_col=['id_case','id_alt'])
d = larch.DataFrames(itin, ch='choice', crack=True, autoscale_weights=True)

rescaled array of weights by a factor of 2239.980952380952
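The message above reports the factor by which the case weights were divided. As a rough illustration of the idea (a sketch, not Larch's actual implementation), the snippet below rescales a list of weights so that they average 1.0, which makes the total weight equal to the number of cases. The function name `autoscale` and the weight values are invented for this illustration.

```python
def autoscale(weights):
    """Rescale weights so their mean is 1.0; return (scaled weights, factor)."""
    factor = sum(weights) / len(weights)   # the mean raw weight
    scaled = [w / factor for w in weights]
    return scaled, factor

# Hypothetical case weights, invented for this sketch.
raw_weights = [1500.0, 3000.0, 2250.0, 2250.0]
scaled_weights, scale_factor = autoscale(raw_weights)

print(scale_factor)   # 2250.0
```

Read this way, the factor of 2239.98 reported for the 105 cases in this dataset corresponds to a total raw weight of about 235198.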

We will build a nested logit model, but to do so we first need to rationalize the alternative numbers. As given, the raw itinerary choice data has many alternatives, but they are not ordered or numbered in a regular way: each elemental alternative has an arbitrary code number assigned to it, and the code numbers for one case are not comparable to those for another case. We need to renumber the alternatives in a manner better suited to our application, so that we can programmatically extract from the code number the relevant features of each alternative that we want to use in building the nested logit model. In this example we want to test a model that has nests based on level of service, measured here by the number of connections (nb_cnxs). To renumber, we first define the relevant categories and values, and establish a numbering system using a special object:

d1 = d.new_systematic_alternatives(
    groupby='nb_cnxs',
    name='alternative_code',
    padding_levels=4,
    groupby_prefixes=['Cnx'],
    overwrite=False,
    complete_features_list={'nb_cnxs':[0,1,2]},
)
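To make the idea of a code number that carries features more concrete, here is a minimal sketch of one possible numbering scheme. It is an assumption for illustration only, not a description of what new_systematic_alternatives does internally: the leading digit identifies the nb_cnxs group, and the trailing digits are a running counter within that group for each case. The helper names `make_code` and `decode_code` and the itinerary data are hypothetical.

```python
PADDING_LEVELS = 4          # digits reserved for the within-group counter (assumed meaning)
GROUP_VALUES = [0, 1, 2]    # nb_cnxs categories, mirroring complete_features_list

def make_code(nb_cnxs, seq):
    """Build a code: 1-based group digit followed by a zero-padded counter."""
    group_digit = GROUP_VALUES.index(nb_cnxs) + 1
    return group_digit * 10 ** PADDING_LEVELS + seq

def decode_code(code):
    """Recover (nb_cnxs, seq) from a code built by make_code."""
    group_digit, seq = divmod(code, 10 ** PADDING_LEVELS)
    return GROUP_VALUES[group_digit - 1], seq

# Hypothetical itineraries for one case: (arbitrary original id, nb_cnxs).
case_alts = [(57, 1), (12, 0), (98, 1), (33, 2)]
counters = {}
codes = []
for _orig_id, nb in case_alts:
    counters[nb] = counters.get(nb, 0) + 1    # next counter within this group
    codes.append(make_code(nb, counters[nb]))

print(codes)  # [20001, 10001, 20002, 30001]
```

Under a scheme like this, the number of distinct codes is the sum over groups of the largest within-case count, which can exceed the largest number of alternatives observed in any single case.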

If we compare the new data with the old data, we’ll see that we have created a few more alternatives (n_alts rises from 127 to 134).

d.info()

larch.DataFrames:  (not computation-ready)
  n_cases: 105
  n_alts: 127
  data_ce: 8 variables, 6023 rows
  data_co: 3 variables
  data_av: <populated>
  data_ch: choice
  data_wt: computed_weight (/ 2239.980952380952)

d1.info()

larch.DataFrames:  (not computation-ready)
  n_cases: 105
  n_alts: 134
  data_ce: 9 variables, 6023 rows
  data_co: 3 variables
  data_av: <populated>
  data_ch: choice
  data_wt: computed_weight (/ 2239.98095703125)

Now let’s make our model. The utility function we will use is the same as the one we used for the MNL version of the model.

m = larch.Model(dataservice=d1)

v = [
    "timeperiod==2",
    "timeperiod==3",
    "timeperiod==4",
    "timeperiod==5",
    "timeperiod==6",
    "timeperiod==7",
    "timeperiod==8",
    "timeperiod==9",
    "carrier==2",
    "carrier==3",
    "carrier==4",
    "carrier==5",
    "equipment==2",
    "fare_hy",
    "fare_ly",
    "elapsed_time",
    "nb_cnxs",
]
from larch.roles import PX
m.utility_ca = sum(PX(i) for i in v)

m.choice_ca_var = 'choice'
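The expression `sum(PX(i) for i in v)` builds a linear-in-parameters utility in which each term pairs a data expression with a parameter of the same name. The sketch below evaluates such a sum by hand for a single itinerary, using plain Python rather than Larch. The attribute values and coefficients are invented for illustration, and only a subset of the terms in `v` is shown.

```python
# Hypothetical attributes of a single itinerary (invented values).
itinerary = {"timeperiod": 3, "carrier": 2, "equipment": 1,
             "fare_hy": 0.0, "fare_ly": 400.0,
             "elapsed_time": 300.0, "nb_cnxs": 1}

# Each term is (name, function computing the data value from the attributes).
terms = [
    ("timeperiod==3", lambda a: float(a["timeperiod"] == 3)),
    ("carrier==2",    lambda a: float(a["carrier"] == 2)),
    ("equipment==2",  lambda a: float(a["equipment"] == 2)),
    ("fare_ly",       lambda a: a["fare_ly"]),
    ("elapsed_time",  lambda a: a["elapsed_time"]),
    ("nb_cnxs",       lambda a: float(a["nb_cnxs"])),
]

# Hypothetical coefficients, one per term, keyed by the term's name.
beta = {"timeperiod==3": 0.1, "carrier==2": 0.1, "equipment==2": 0.3,
        "fare_ly": -0.001, "elapsed_time": -0.004, "nb_cnxs": -3.0}

# Utility = sum over terms of (parameter value) * (data value).
utility = sum(beta[name] * x(itinerary) for name, x in terms)
print(round(utility, 3))
```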

If we end our model specification here, we will have a plain MNL model. To change to a nested logit model, all we need to do is add the nests. We can do this easily using the magic_nesting method, which uses the structure of the data that we defined above.

m.magic_nesting()
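Adding nests changes how utilities are turned into probabilities. As a minimal sketch of a two-level nested logit with one logsum parameter shared by all nests (standard textbook formulas with the root level normalized to 1, not Larch's internal code), the function below computes choice probabilities for alternatives grouped into nests. The function name, utilities and nest labels are invented for illustration.

```python
import math

def nested_logit_probs(utilities, nests, mu):
    """Two-level nested logit; `nests` maps a nest name to its alternative keys."""
    logsums = {}
    for nest, alts in nests.items():
        # Nest logsum: mu * log(sum(exp(V / mu))) over the nest's alternatives.
        logsums[nest] = mu * math.log(sum(math.exp(utilities[a] / mu) for a in alts))
    denom = sum(math.exp(s) for s in logsums.values())
    probs = {}
    for nest, alts in nests.items():
        p_nest = math.exp(logsums[nest]) / denom            # P(nest)
        within = sum(math.exp(utilities[a] / mu) for a in alts)
        for a in alts:
            # P(alt) = P(nest) * P(alt | nest)
            probs[a] = p_nest * math.exp(utilities[a] / mu) / within
    return probs

# Hypothetical utilities for four itineraries in two level-of-service nests.
V = {"a": -1.0, "b": -1.2, "c": -3.5, "d": -3.9}
nests = {"Cnx0": ["a", "b"], "Cnx1": ["c", "d"]}

p_nl = nested_logit_probs(V, nests, mu=0.7)
p_mnl = nested_logit_probs(V, nests, mu=1.0)   # mu = 1 collapses to plain MNL
```

With `mu` below 1, alternatives within the same nest are treated as closer substitutes for one another than alternatives in different nests.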

m.load_data()

req_data does not request weight_co but it is set and being provided

req_data does not request avail_ca or avail_co but it is set and being provided

converting data_ce to <class 'numpy.float64'>

m.maximize_loglike()

Iteration 042 [Optimization terminated successfully]

Best LL = -777705.7732910335

	value	initvalue	nullvalue	minimum	maximum	best
MU_nb_cnxs	0.691151	1.0	1.0	0.001	1.0	0.691151
carrier==2	0.079567	0.0	0.0	-inf	inf	0.079567
carrier==3	0.440537	0.0	0.0	-inf	inf	0.440537
carrier==4	0.397000	0.0	0.0	-inf	inf	0.397000
carrier==5	-0.439005	0.0	0.0	-inf	inf	-0.439005
elapsed_time	-0.004229	0.0	0.0	-inf	inf	-0.004229
equipment==2	0.326813	0.0	0.0	-inf	inf	0.326813
fare_hy	-0.000847	0.0	0.0	-inf	inf	-0.000847
fare_ly	-0.000857	0.0	0.0	-inf	inf	-0.000857
nb_cnxs	-3.156922	0.0	0.0	-inf	inf	-3.156922
timeperiod==2	0.065438	0.0	0.0	-inf	inf	0.065438
timeperiod==3	0.087974	0.0	0.0	-inf	inf	0.087974
timeperiod==4	0.042816	0.0	0.0	-inf	inf	0.042816
timeperiod==5	0.096447	0.0	0.0	-inf	inf	0.096447
timeperiod==6	0.164563	0.0	0.0	-inf	inf	0.164563
timeperiod==7	0.243778	0.0	0.0	-inf	inf	0.243778
timeperiod==8	0.245030	0.0	0.0	-inf	inf	0.245030
timeperiod==9	-0.006025	0.0	0.0	-inf	inf	-0.006025

/home/runner/work/larch/larch/larch/larch/model/optimization.py:308: UserWarning: slsqp may not play nicely with unbounded parameters
if you get poor results, consider setting global bounds with model.set_cap()
  warnings.warn( # infinite bounds # )

x:

	0
MU_nb_cnxs	0.691151
carrier==2	0.079567
carrier==3	0.440537
carrier==4	0.397000
carrier==5	-0.439005
elapsed_time	-0.004229
equipment==2	0.326813
fare_hy	-0.000847
fare_ly	-0.000857
nb_cnxs	-3.156922
timeperiod==2	0.065438
timeperiod==3	0.087974
timeperiod==4	0.042816
timeperiod==5	0.096447
timeperiod==6	0.164563
timeperiod==7	0.243778
timeperiod==8	0.245030
timeperiod==9	-0.006025

loglike: -777705.7732910335

d_loglike:

	0
MU_nb_cnxs	0.019040
carrier==2	-0.099388
carrier==3	-0.042668
carrier==4	0.079734
carrier==5	0.092789
elapsed_time	4.260928
equipment==2	0.067728
fare_hy	-9.444235
fare_ly	-3.831194
nb_cnxs	0.001616
timeperiod==2	0.036795
timeperiod==3	0.021406
timeperiod==4	0.049108
timeperiod==5	-0.011497
timeperiod==6	-0.020032
timeperiod==7	0.012699
timeperiod==8	0.000712
timeperiod==9	-0.016040

nit: 42
nfev: 137
njev: 42
status: 0
message: 'Optimization terminated successfully'
success: True
elapsed_time: 0:00:00.453657
method: 'slsqp'
n_cases: 105
iteration_number: 42
logloss: 3.3066002758377295
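The reported logloss appears to be the negative log likelihood divided by the total unscaled weight, where the total weight is the number of cases times the rescaling factor printed when the data was loaded. This reading is inferred from the numbers in the output, not taken from the Larch documentation; the short check below reproduces the arithmetic.

```python
n_cases = 105
weight_scale = 2239.980952380952       # factor reported when the weights were rescaled
best_ll = -777705.7732910335           # "Best LL" from the optimizer output
reported_logloss = 3.3066002758377295  # "logloss" from the result

total_weight = n_cases * weight_scale        # about 235198
implied_logloss = -best_ll / total_weight    # negative log likelihood per unit of weight

print(total_weight, implied_logloss)
```

The implied value agrees with the reported one to about eight significant digits; the small remaining gap is consistent with the weights being stored at reduced precision, as the slightly different factor shown by d1.info() suggests.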
