Logit model for computing the preferred mode of travel, taking into account factors such as cost, distance, and time.

Let us have a variable E(i), where “i” ranges over the various modes of transportation.

For each mode, E(i) is computed as:

E(i) = Cost(i)/Time(i)

Here Cost(i) and Time(i) represent the amount of money that mode “i” costs and the time that mode “i” takes to get from origin to destination.

This computation is the same for every mode of transportation except “Walking”, where E is:

E(Walk) = Time(Walk)/BusStop

Here, Time(Walk) is the walking time to the destination and BusStop is the distance from the origin to the bus stop. So E(Walk) compares the walk to the destination with the walk to the bus stop.

For every other mode, the Cost/Time computation gives us a good idea of what we call the “utility” of that mode of travel: the number of dollars spent per minute of travel time.
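
The two definitions of E can be sketched in a few lines of Python. This is only an illustration: the function names, the units, and the sample inputs are our assumptions and are not taken from the survey.

```python
def mode_score(cost: float, time_minutes: float) -> float:
    """E(i) = Cost(i) / Time(i) for any mode other than walking."""
    if time_minutes <= 0:
        raise ValueError("travel time must be positive")
    return cost / time_minutes


def walk_score(walk_time: float, bus_stop_distance: float) -> float:
    """E(Walk) = Time(Walk) / BusStop."""
    if bus_stop_distance <= 0:
        raise ValueError("bus stop distance must be positive")
    return walk_time / bus_stop_distance


# Made-up inputs for illustration only (not survey data).
print(mode_score(cost=3.0, time_minutes=20.0))
print(walk_score(walk_time=30.0, bus_stop_distance=0.5))
```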

When designing our logit model, the next step was to take the exponential of each E(i) value. This puts the values in the form that the rest of our calculations need, since our logit model is a multinomial model of travel mode choice. We hope to capture most of the variables that affect the utility, or benefit, of choosing a particular mode for the school trip in question.
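
As a small sketch of this step, the snippet below exponentiates one set of E values. The inputs 10, 0.1 and 5 are not quoted from the survey; they are simply the natural logs of the exponentials reported for the second survey row (22026.46579, 1.105171 and 148.4131591). The dictionary layout is our assumption.

```python
import math

# E values inferred by taking natural logs of the second row of exponentials.
e_values = {"Car": 10.0, "Bus": 0.1, "Walk": 5.0}

# Exp(E(i)) for every mode.
exp_values = {mode: math.exp(e) for mode, e in e_values.items()}

for mode, value in exp_values.items():
    print(mode, value)
```

Note that math.exp raises OverflowError once its argument exceeds roughly 709, so very large E values would need different handling.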

Taking the exponential of every E(i) gives the table shown below. This table has the actual data so far with all the necessary computations.

Exp(E(Car))    Exp(E(Bus))    Exp(E(Walk))
1808.042414    1.068939       1.14201E+26
22026.46579    1.105171       148.4131591
5.29449005     1.022471       1.48938E+78
485165195.4    1              3269017.372
28.03162489    1              5.18471E+21
2.718281828    1.013423       9.424E+138
1600320.19     1.221403       22026.46579
22026.46579    1.051271       1.06865E+13
148.4131591    1              1.06865E+13
148.4131591    1              3.49343E+19
485165195.4    1.105171       22026.46579
518.0128247    1.105171       22026.46579
11789.91755    1              22026.46579
22026.46579    1.068939       22026.46579
22026.46579    1              3269017.372
72004899337    1              148.4131591
1265.037624    1.105171       485165195.4
268337.2865    1.221403       148.4131591
22026.46579    1              485165195.4
1808.042414    1.051271       3.49343E+19

Now let us take a closer look at the data. For our model to compute a probability, we need a method that puts the values on a 0-1 scale. To find the probability of taking travel mode “i2”, we compute:

Probability(i2) = 1 - Exp(E(i2)) / (Exp(E(i1)) + Exp(E(i2)) + Exp(E(i3))), assuming that there are only three modes of travel.
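
The formula can be checked against the data with a short sketch. The function name and dictionary layout are our assumptions; the inputs are the exponentials of the second survey row, for which the computed values should agree with the reported 0.006742, 0.999950 and 0.993307 to the precision shown.

```python
def mode_probabilities(exp_values):
    """Return 1 - Exp(E(i)) / sum of all Exp(E(j)) for every mode i."""
    total = sum(exp_values.values())
    return {mode: 1.0 - value / total for mode, value in exp_values.items()}


# Exponentials of the second survey row.
row2 = {"Car": 22026.46579, "Bus": 1.105171, "Walk": 148.4131591}
probs = mode_probabilities(row2)

for mode, p in probs.items():
    print(mode, p)
```

Because each value is one minus a share of the total, the three values for a row add up to 2 rather than 1.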

Now let us review the computed data we have so far.

PDrive         PBus           PWalk
1              1              0
0.006742354    0.999950164    0.993307
1              1              0
0.006692853    0.999999998    0.993307
1              1              0
1              1              0
0.01357766     0.999999247    0.986423
0.999999998    1              2.06E-09
1              1              1.4E-11
1              1              0
4.54001E-05    0.999999998    0.999955
0.977023756    0.999950981    0.023025
0.651365174    0.999970429    0.348664
0.500012132    0.999975736    0.500012
0.993307151    0.999999696    0.006693
2.07504E-09    1              1
0.999997393    0.999999998    2.61E-06
0.000557325    0.999995451    0.999447
0.999954602    0.999999998    4.54E-05
1              1              0

Here, Pi represents the probability of taking travel mode “i”. The closer this probability is to 1, the more likely the student is to choose that mode of travel. In our particular case, however, we must go a step further to improve the accuracy of the model.

The first thing we do is consider the two modes with the highest probabilities. We do this because taking only the utility of a travel mode into account is not sufficient: we need a way to explicitly confirm a travel mode choice. By “confirm” we mean that one mode is clearly preferred over the other. As the table above shows, there are quite a few rows in which more than one mode has a probability of 1. On closer inspection, this is only because our model has narrowed the travel mode choice down to two candidates. This choice will always be between walking and taking the bus, or between taking the bus and driving, since walking and driving are the two extremes.
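
Picking the two candidates can be sketched as follows. The function name is our assumption, and so is the tie handling: the text does not say what to do when two modes have equal probabilities, so this sketch simply keeps the order in which the modes are listed.

```python
def top_two_modes(probabilities):
    """Return the two modes with the highest probabilities.

    sorted() is stable, so tied modes keep their listed order (an assumption).
    """
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    return ranked[0], ranked[1]


# Probabilities of the fourth survey row.
row4 = {"Car": 0.006692853, "Bus": 0.999999998, "Walk": 0.993307}
print(top_two_modes(row4))  # ('Bus', 'Walk')
```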

So we have two different sets of calculations, depending on which two modes have the highest probabilities. If they are walking and taking the bus, we compare the time it takes to travel by bus with the time it takes to walk. If walking takes longer than taking the bus, the chosen mode will most probably be the bus. If taking the bus takes longer, walking is the most probable choice.
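
A minimal sketch of the walking-versus-bus comparison is given below. The function name and the sample times are hypothetical. The case of equal times is not covered in the text, so falling back to the bus there is purely our assumption.

```python
def choose_walk_or_bus(walk_time: float, bus_time: float) -> str:
    """Pick between walking and the bus by comparing travel times."""
    # Walking wins only when the bus takes strictly longer.
    # Equal times fall back to "Bus" (an assumption, not from the survey).
    return "Walk" if bus_time > walk_time else "Bus"


# Made-up travel times in minutes.
print(choose_walk_or_bus(walk_time=30.0, bus_time=15.0))  # Bus
print(choose_walk_or_bus(walk_time=10.0, bus_time=25.0))  # Walk
```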

If the two modes with the highest probabilities in the table above are taking the bus and driving, the first thing we check is whether the person in question has a car at all. If not, he or she will probably take the bus. We could have eliminated driving earlier, which would have saved some calculations, but keeping it simulates the condition where the person “could” have a car. If the person has a car, we compute the ratio of the difference between the bus and car times to the difference between their costs. The formula is simply:

Ratio = Absolute Value of (Bus Time - Car Time) / Absolute Value of (Bus Cost - Car Cost)

Through considerable analysis, we have come to the conclusion that if this ratio is higher than 0.1, driving will be the preferred mode of travel. If the ratio is 0.1 or lower, the person would probably take the bus.
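
The bus-versus-driving rule can be sketched like this. The function name, parameter names and sample inputs are hypothetical. A ratio of exactly 0.1 is treated as Bus, which matches the rows in our data whose ratio is 0.1. Equal bus and car costs would make the ratio undefined; the text does not cover that case, so the sketch raises an error.

```python
def choose_bus_or_drive(has_car, bus_time, car_time, bus_cost, car_cost,
                        threshold=0.1):
    """Pick between the bus and driving using the time/cost ratio."""
    if not has_car:
        return "Bus"
    cost_gap = abs(bus_cost - car_cost)
    if cost_gap == 0:
        # Not covered in the text; refusing to guess is our assumption.
        raise ValueError("ratio is undefined when the costs are equal")
    ratio = abs(bus_time - car_time) / cost_gap
    return "Drive" if ratio > threshold else "Bus"


# Made-up inputs: |30 - 10| / |2 - 52| = 0.4, which is above 0.1.
print(choose_bus_or_drive(True, bus_time=30, car_time=10,
                          bus_cost=2, car_cost=52))  # Drive
```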

Now let us take a look at our data. For clarity, we have included all the columns of computation.

Pcar           Pbus           Pwalk       Mode(Walk/Bus)    Ratio       Mode(Bus/Drive)
1              1              0           Bus               0.067568    Bus
0.006742354    0.999950164    0.993307    Walk              0.102041    Drive
1              1              0           Bus               0.306122    Drive
0.006692853    0.999999998    0.993307    Bus               0.02        Bus
1              1              0           Bus               0.1         Bus
1              1              0           Bus               0.510204    Drive
0.01357766     0.999999247    0.986423    Bus               0.020202    Bus
0.999999998    1              2.06E-09    Bus               0.10101     Drive
1              1              1.4E-11     Bus               0.1         Bus
1              1              0           Bus               0.066667    Bus
4.54001E-05    0.999999998    0.999955    Bus               0.050505    Bus
0.977023756    0.999950981    0.023025    Bus               0.040816    Bus
0.651365174    0.999970429    0.348664    Bus               0.026667    Bus
0.500012132    0.999975736    0.500012    Walk              0.204082    Drive
0.993307151    0.999999696    0.006693    Bus               0.2         Drive
2.07504E-09    1              1           Walk              0.16        Drive
0.999997393    0.999999998    2.61E-06    Bus               0.061224    Bus
0.000557325    0.999995451    0.999447    Bus               0.020408    Bus
0.999954602    0.999999998    4.54E-05    Bus               0.2         Drive
1              1              0           Bus               0.135135    Drive

To illustrate our computations so far, consider the first row of data in the table above. The two modes with the highest probabilities are “Car” and “Bus”, so we look at our original data to see whether “Person 1” has a car. He does, so we consider the column named “Mode(Bus/Drive)”. The ratio for this row is 0.067568, which is below 0.1, so the person is more likely to take the bus.

Percentage Error and Accuracy

Since our survey includes actual travel mode choices, we can compare our results to them directly. We first count the number of entries (people) in the survey. We then compare the predicted travel mode for each person with the mode actually chosen and count the errors. For example, every time we predict “Walk” but the mode chosen in reality is “Bus”, we add one to the count. The percentage accuracy of our model is the number of people minus the number of “false predictions”, divided by the number of people and multiplied by 100. This is:

((n-x)/n)*100, where n is the number of entries or people in the survey and x is the number of predictions that did not match reality.
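
The accuracy formula can be sketched as below. The function name and the short lists of labels are made up for illustration and are not the survey responses.

```python
def percentage_accuracy(predicted, actual):
    """Return ((n - x) / n) * 100, where x counts mismatched predictions."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    n = len(predicted)
    x = sum(p != a for p, a in zip(predicted, actual))
    return (n - x) / n * 100


# Made-up labels: one of five predictions is wrong.
predicted = ["Bus", "Walk", "Drive", "Bus", "Bus"]
actual = ["Bus", "Bus", "Drive", "Bus", "Bus"]
print(percentage_accuracy(predicted, actual))
```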

RESULTS

To summarize, the results (predicted) are:

#     Predicted
1     Bus
2     Walk
3     Drive
4     Bus
5     Bus
6     Drive
7     Bus
8     Drive
9     Bus
10    Bus
11    Bus
12    Bus
13    Bus
14    Drive
15    Drive
16    Walk
17    Bus
18    Bus
19    Drive
20    Drive

The full table of data considered and calculated is given below. The Car, Bus, and Walk columns hold the Exp(E(i)) values for each mode.

#   Car         Bus         Walk           Pcar        Pbus           Pwalk       Mode(Walk/Bus)    Ratio       Mode(Bus/Drive)
1   1808.042    1.0689391   1.14201E+26    1           1              0           Bus               0.067568    Bus
2   22026.47    1.1051709   148.4131591    0.006742    0.999950164    0.993307    Walk              0.102041    Drive
3   5.29449     1.022471    1.48938E+78    1           1              0           Bus               0.306122    Drive
4   4.85E+08    1           3269017.372    0.006693    0.999999998    0.993307    Bus               0.02        Bus
5   28.03162    1           5.18471E+21    1           1              0           Bus               0.1         Bus
6   2.718282    1.0134226   9.424E+138     1           1              0           Bus               0.510204    Drive
7   1600320     1.2214028   22026.46579    0.013578    0.999999247    0.986423    Bus               0.020202    Bus
8   22026.47    1.0512711   1.06865E+13    1           1              2.06E-09    Bus               0.10101     Drive
9   148.4132    1           1.06865E+13    1           1              1.4E-11     Bus               0.1         Bus
10  148.4132    1           3.49343E+19    1           1              0           Bus               0.066667    Bus
11  4.85E+08    1.1051709   22026.46579    4.54E-05    0.999999998    0.999955    Bus               0.050505    Bus
12  518.0128    1.1051709   22026.46579    0.977024    0.999950981    0.023025    Bus               0.040816    Bus
13  11789.92    1           22026.46579    0.651365    0.999970429    0.348664    Bus               0.026667    Bus
14  22026.47    1.0689391   22026.46579    0.500012    0.999975736    0.500012    Walk              0.204082    Drive
15  22026.47    1           3269017.372    0.993307    0.999999696    0.006693    Bus               0.2         Drive
16  7.2E+10     1           148.4131591    2.08E-09    1              1           Walk              0.16        Drive
17  1265.038    1.1051709   485165195.4    0.999997    0.999999998    2.61E-06    Bus               0.061224    Bus
18  268337.3    1.2214028   148.4131591    0.000557    0.999995451    0.999447    Bus               0.020408    Bus
19  22026.47    1           485165195.4    0.999955    0.999999998    4.54E-05    Bus               0.2         Drive
20  1808.042    1.0512711   3.49343E+19    1           1              0           Bus               0.135135    Drive

Our percentage accuracy is:

((20-4)/20) * 100 = 80%

Thus our model predicted the correct travel mode for 16 of the 20 people in our survey, an accuracy of 80%.