Logit model for computing the preferred mode of travel, taking into account factors such as cost, distance, and time.
Let E(i) be a variable, where “i” ranges over the various modes of transportation.
The computation that occurs for E(i) is basically:
E(i)=Cost(i)/Time(i)
Here Cost(i) and Time(i) represent the amount of money that mode “i” costs and the time mode “i” takes to get from origin to destination.
This computation of E is the same for all modes of transportation except “Walking”, where E is:
E(Walk)=Time(Walk)/BusStop
Here, Time(Walk) is the amount of walking time and BusStop is the distance from the origin to the bus stop. In other words, we compare the walk to the destination with the walk to the bus stop.
For everything else, the Cost/Time computation gives us a good idea of what we call the “utility” of that mode of travel: the number of dollars spent per minute of travel time.
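As a minimal sketch of the two utility formulas above, the Python functions below compute E(i) for a priced mode and for walking. The function names, the input checks, and the sample numbers are illustrative assumptions and are not taken from the survey data.

```python
def utility(cost, time):
    """E(i) = Cost(i) / Time(i): dollars spent per minute of travel."""
    if time <= 0:
        # Assumption: a non-positive travel time is treated as invalid input.
        raise ValueError("time must be positive")
    return cost / time


def walk_utility(walk_time, bus_stop_distance):
    """E(Walk) = Time(Walk) / BusStop."""
    if bus_stop_distance <= 0:
        # Assumption: a non-positive distance is treated as invalid input.
        raise ValueError("bus_stop_distance must be positive")
    return walk_time / bus_stop_distance


# Hypothetical inputs, chosen only to show the arithmetic.
e_car = utility(cost=5.0, time=20.0)                          # 0.25
e_walk = walk_utility(walk_time=30.0, bus_stop_distance=0.5)  # 60.0
```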
When designing our logit model, the next step we implemented was taking the exponential of all the E(i) values. This makes every value positive and easier to compare in the rest of our calculations, since our logit model is a multinomial model of travel mode choice. We hope to capture most of the variables that affect the utility, or benefit, of choosing a particular mode for the school trip in question.
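The exponentiation step can be sketched in Python as follows. The utilities 10, 0.1 and 5 are hypothetical inputs used only for illustration, and the dictionary layout is an assumption.

```python
import math

# Hypothetical utilities E(i), one per travel mode (illustrative values only).
utilities = {"Car": 10.0, "Bus": 0.1, "Walk": 5.0}

# Exp(E(i)) for every mode.
exp_utilities = {mode: math.exp(e) for mode, e in utilities.items()}
# exp(10) is about 22026.46579 and exp(5) is about 148.4131591.
```

Note that `math.exp` raises `OverflowError` for arguments above roughly 709, so very large utilities would need rescaling before this step.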
Taking the exponentials of all the E(i) values, we get the table shown below. It contains the actual data so far with all the necessary computations.
| Exp(E(Car)) | Exp(E(Bus)) | Exp(E(Walk)) |
|---|---|---|
| 1808.042414 | 1.068939 | 1.14201E+26 |
| 22026.46579 | 1.105171 | 148.4131591 |
| 5.29449005 | 1.022471 | 1.48938E+78 |
| 485165195.4 | 1 | 3269017.372 |
| 28.03162489 | 1 | 5.18471E+21 |
| 2.718281828 | 1.013423 | 9.424E+138 |
| 1600320.19 | 1.221403 | 22026.46579 |
| 22026.46579 | 1.051271 | 1.06865E+13 |
| 148.4131591 | 1 | 1.06865E+13 |
| 148.4131591 | 1 | 3.49343E+19 |
| 485165195.4 | 1.105171 | 22026.46579 |
| 518.0128247 | 1.105171 | 22026.46579 |
| 11789.91755 | 1 | 22026.46579 |
| 22026.46579 | 1.068939 | 22026.46579 |
| 22026.46579 | 1 | 3269017.372 |
| 72004899337 | 1 | 148.4131591 |
| 1265.037624 | 1.105171 | 485165195.4 |
| 268337.2865 | 1.221403 | 148.4131591 |
| 22026.46579 | 1 | 485165195.4 |
| 1808.042414 | 1.051271 | 3.49343E+19 |
Now let us take a closer look at the data. For our model to be able to compute a probability, we need a method that puts the values on a 0-1 scale. To find the probability of taking mode “i2” of travel, we compute:
Probability(i2) = 1-(Exp(E(i2))/(Exp(E(i1))+Exp(E(i2))+Exp(E(i3)))), assuming that there are only three modes of travel.
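A minimal Python sketch of this probability step is shown below. The function name and the dictionary layout are assumptions. The sample utilities 10, 0.1 and 5 are chosen because their exponentials (about 22026.47, 1.105171 and 148.4132) match one row of the exponential table.

```python
import math


def mode_probabilities(utilities):
    """Apply P(i) = 1 - Exp(E(i)) / sum of Exp(E(j)) over all modes j."""
    exp_values = {mode: math.exp(e) for mode, e in utilities.items()}
    total = sum(exp_values.values())
    return {mode: 1.0 - value / total for mode, value in exp_values.items()}


# Illustrative utilities for three modes of travel.
probs = mode_probabilities({"Car": 10.0, "Bus": 0.1, "Walk": 5.0})
# Car is about 0.006742, Bus about 0.999950, Walk about 0.993307.
```

With three modes, the three values produced by this formula always add up to 2 rather than 1, which is why two modes can be close to 1 at the same time.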
Now let us review the computed data we have so far.
| PDrive | PBus | PWalk |
|---|---|---|
| 1 | 1 | 0 |
| 0.006742354 | 0.999950164 | 0.993307 |
| 1 | 1 | 0 |
| 0.006692853 | 0.999999998 | 0.993307 |
| 1 | 1 | 0 |
| 1 | 1 | 0 |
| 0.01357766 | 0.999999247 | 0.986423 |
| 0.999999998 | 1 | 2.06E-09 |
| 1 | 1 | 1.4E-11 |
| 1 | 1 | 0 |
| 4.54001E-05 | 0.999999998 | 0.999955 |
| 0.977023756 | 0.999950981 | 0.023025 |
| 0.651365174 | 0.999970429 | 0.348664 |
| 0.500012132 | 0.999975736 | 0.500012 |
| 0.993307151 | 0.999999696 | 0.006693 |
| 2.07504E-09 | 1 | 1 |
| 0.999997393 | 0.999999998 | 2.61E-06 |
| 0.000557325 | 0.999995451 | 0.999447 |
| 0.999954602 | 0.999999998 | 4.54E-05 |
| 1 | 1 | 0 |
Here, Pi represents the probability of taking mode of travel “i”. The closer this probability is to 1, the more likely the student is to choose that particular mode “i” of travel. But in our particular case, we must go a step further to enhance the accuracy of the model.
The first thing we do is consider the two modes “i” with the highest probabilities. We do this because taking only the utility of a travel mode into account is not sufficient. We need a way to explicitly confirm a travel mode choice. By “confirm” we mean that one mode clearly has a higher preference than the other. As you can see from the data in the table above, there are quite a few rows in which more than one “i” has a probability of 1. But when we evaluate this, we realize that this is only because our model has now narrowed the travel mode choice down to two choices. This choice will always be between Walking and Taking the Bus, or between Taking the Bus and Driving, since Walking and Driving are two extremes.
So we have two different sets of calculations, depending on the two modes with the highest probability. If the modes with the highest probability are Walking and Taking the Bus, then we compare the time it takes to travel on the bus with the time it takes to walk. If walking takes longer than taking the bus, then the chosen mode will most probably be the Bus. However, if taking the bus takes longer, then Walking will be the most probable choice.
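The Walking versus Bus comparison described above could be sketched in Python like this. The function name is hypothetical, and the handling of equal times is an assumption, since the text does not cover that case.

```python
def walk_or_bus(walk_time, bus_time):
    """Pick between Walk and Bus by comparing their travel times."""
    if walk_time > bus_time:
        return "Bus"   # walking takes longer, so the bus is preferred
    if bus_time > walk_time:
        return "Walk"  # the bus takes longer, so walking is preferred
    # Assumption: equal times default to the bus; the text leaves this open.
    return "Bus"


# Hypothetical times in minutes.
choice_far = walk_or_bus(walk_time=30, bus_time=10)
choice_near = walk_or_bus(walk_time=5, bus_time=15)
```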
Now, if the two highest probabilities in the table above belong to Taking the Bus and Driving, the first thing we check is whether the person in question has a car in the first place. If not, he or she will probably take the bus. We could have eliminated this mode earlier, which would have trimmed some calculations, but keeping it simulates the condition where the person “could” have a car. If the person has a car, we compute the ratio of the difference between car and bus time to the difference between their costs:
Ratio = Absolute Value of (Bus Time - Car Time) / Absolute Value of (Bus Cost - Car Cost)
Through considerable analysis, we have come to the conclusion that if this ratio is higher than 0.1, then Driving will be the preferred mode of travel. Otherwise, if the ratio is 0.1 or lower, the person would probably take the bus.
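The Bus versus Driving rule can be sketched in Python as follows. The function name, the argument names, and the error raised for equal costs are assumptions; the text does not say what happens when the two costs are identical. The sample inputs are hypothetical.

```python
def bus_or_drive(has_car, bus_time, car_time, bus_cost, car_cost, threshold=0.1):
    """Return (mode, ratio) for the Bus versus Drive comparison."""
    if not has_car:
        return "Bus", None  # without a car, the bus is the only option here
    cost_gap = abs(bus_cost - car_cost)
    if cost_gap == 0:
        # Assumption: the ratio is undefined for equal costs, so we refuse to guess.
        raise ValueError("bus and car costs are equal; ratio is undefined")
    ratio = abs(bus_time - car_time) / cost_gap
    mode = "Drive" if ratio > threshold else "Bus"
    return mode, ratio


# Hypothetical inputs, chosen only to show both outcomes.
mode_a, ratio_a = bus_or_drive(True, bus_time=30, car_time=10, bus_cost=2, car_cost=102)
mode_b, ratio_b = bus_or_drive(True, bus_time=15, car_time=10, bus_cost=2, car_cost=102)
mode_c, ratio_c = bus_or_drive(False, bus_time=30, car_time=10, bus_cost=2, car_cost=102)
```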
Now let us take a look at our data. For the sake of better understanding and simplicity, we have included all the columns of computation.
| Pcar | Pbus | Pwalk | Mode(Walk/Bus) | Ratio | Mode(Bus/Drive) |
|---|---|---|---|---|---|
| 1 | 1 | 0 | Bus | 0.067568 | Bus |
| 0.006742354 | 0.999950164 | 0.993307 | Walk | 0.102041 | Drive |
| 1 | 1 | 0 | Bus | 0.306122 | Drive |
| 0.006692853 | 0.999999998 | 0.993307 | Bus | 0.02 | Bus |
| 1 | 1 | 0 | Bus | 0.1 | Bus |
| 1 | 1 | 0 | Bus | 0.510204 | Drive |
| 0.01357766 | 0.999999247 | 0.986423 | Bus | 0.020202 | Bus |
| 0.999999998 | 1 | 2.06E-09 | Bus | 0.10101 | Drive |
| 1 | 1 | 1.4E-11 | Bus | 0.1 | Bus |
| 1 | 1 | 0 | Bus | 0.066667 | Bus |
| 4.54001E-05 | 0.999999998 | 0.999955 | Bus | 0.050505 | Bus |
| 0.977023756 | 0.999950981 | 0.023025 | Bus | 0.040816 | Bus |
| 0.651365174 | 0.999970429 | 0.348664 | Bus | 0.026667 | Bus |
| 0.500012132 | 0.999975736 | 0.500012 | Walk | 0.204082 | Drive |
| 0.993307151 | 0.999999696 | 0.006693 | Bus | 0.2 | Drive |
| 2.07504E-09 | 1 | 1 | Walk | 0.16 | Drive |
| 0.999997393 | 0.999999998 | 2.61E-06 | Bus | 0.061224 | Bus |
| 0.000557325 | 0.999995451 | 0.999447 | Bus | 0.020408 | Bus |
| 0.999954602 | 0.999999998 | 4.54E-05 | Bus | 0.2 | Drive |
| 1 | 1 | 0 | Bus | 0.135135 | Drive |
To cite an example of our computations so far, let us take a look at the first row of data in the table above. The two modes with the highest probability are “Car” and “Bus”. So we now look at our original data to see whether “Person 1” has a car. He does, so we consider the column in our table named “Mode(Bus/Drive)”. The ratio for this row is 0.067568, which is below 0.1, so the person is more likely to take the Bus.
Percentage Error and Accuracy
Since our survey includes actual travel mode choices, we can simply compare our results to these actual choices. First we count the number of entries, or people, in our survey. Then we compare the predicted travel mode choices to the survey and count the number of errors in our predictions. For example, every time we predict the travel mode “Walk” but the travel mode chosen in reality is “Bus”, we add one to our count. The percentage accuracy of our model is the number of people minus the number of false predictions, divided by the number of people and multiplied by 100. This is:
((n-x)/n)*100, where n is the number of entries or people in the survey and x is the number of predictions that did not match reality.
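The accuracy formula above can be sketched in Python as follows. The function name and the two short lists are hypothetical; they are not the survey's actual choices.

```python
def percentage_accuracy(predicted, actual):
    """((n - x) / n) * 100, where x counts predictions that differ from reality."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equally long")
    n = len(predicted)
    x = sum(1 for p, a in zip(predicted, actual) if p != a)
    return (n - x) / n * 100


# Hypothetical example: one wrong prediction out of five.
predicted = ["Bus", "Walk", "Drive", "Bus", "Bus"]
actual = ["Bus", "Bus", "Drive", "Bus", "Bus"]
accuracy = percentage_accuracy(predicted, actual)  # about 80
```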
RESULTS
To summarize, the results (predicted) are:
| # | Predicted |
|---|---|
| 1 | Bus |
| 2 | Walk |
| 3 | Drive |
| 4 | Bus |
| 5 | Bus |
| 6 | Drive |
| 7 | Bus |
| 8 | Drive |
| 9 | Bus |
| 10 | Bus |
| 11 | Bus |
| 12 | Bus |
| 13 | Bus |
| 14 | Drive |
| 15 | Drive |
| 16 | Walk |
| 17 | Bus |
| 18 | Bus |
| 19 | Drive |
| 20 | Drive |
The full table of data considered and calculated is given below:
| # | Exp(E(Car)) | Exp(E(Bus)) | Exp(E(Walk)) | Pcar | Pbus | Pwalk | Mode(Walk/Bus) | Ratio | Mode(Bus/Drive) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1808.042 | 1.0689391 | 1.14201E+26 | 1 | 1 | 0 | Bus | 0.067568 | Bus |
| 2 | 22026.47 | 1.1051709 | 148.4131591 | 0.006742 | 0.999950164 | 0.993307 | Walk | 0.102041 | Drive |
| 3 | 5.29449 | 1.022471 | 1.48938E+78 | 1 | 1 | 0 | Bus | 0.306122 | Drive |
| 4 | 4.85E+08 | 1 | 3269017.372 | 0.006693 | 0.999999998 | 0.993307 | Bus | 0.02 | Bus |
| 5 | 28.03162 | 1 | 5.18471E+21 | 1 | 1 | 0 | Bus | 0.1 | Bus |
| 6 | 2.718282 | 1.0134226 | 9.424E+138 | 1 | 1 | 0 | Bus | 0.510204 | Drive |
| 7 | 1600320 | 1.2214028 | 22026.46579 | 0.013578 | 0.999999247 | 0.986423 | Bus | 0.020202 | Bus |
| 8 | 22026.47 | 1.0512711 | 1.06865E+13 | 1 | 1 | 2.06E-09 | Bus | 0.10101 | Drive |
| 9 | 148.4132 | 1 | 1.06865E+13 | 1 | 1 | 1.4E-11 | Bus | 0.1 | Bus |
| 10 | 148.4132 | 1 | 3.49343E+19 | 1 | 1 | 0 | Bus | 0.066667 | Bus |
| 11 | 4.85E+08 | 1.1051709 | 22026.46579 | 4.54E-05 | 0.999999998 | 0.999955 | Bus | 0.050505 | Bus |
| 12 | 518.0128 | 1.1051709 | 22026.46579 | 0.977024 | 0.999950981 | 0.023025 | Bus | 0.040816 | Bus |
| 13 | 11789.92 | 1 | 22026.46579 | 0.651365 | 0.999970429 | 0.348664 | Bus | 0.026667 | Bus |
| 14 | 22026.47 | 1.0689391 | 22026.46579 | 0.500012 | 0.999975736 | 0.500012 | Walk | 0.204082 | Drive |
| 15 | 22026.47 | 1 | 3269017.372 | 0.993307 | 0.999999696 | 0.006693 | Bus | 0.2 | Drive |
| 16 | 7.2E+10 | 1 | 148.4131591 | 2.08E-09 | 1 | 1 | Walk | 0.16 | Drive |
| 17 | 1265.038 | 1.1051709 | 485165195.4 | 0.999997 | 0.999999998 | 2.61E-06 | Bus | 0.061224 | Bus |
| 18 | 268337.3 | 1.2214028 | 148.4131591 | 0.000557 | 0.999995451 | 0.999447 | Bus | 0.020408 | Bus |
| 19 | 22026.47 | 1 | 485165195.4 | 0.999955 | 0.999999998 | 4.54E-05 | Bus | 0.2 | Drive |
| 20 | 1808.042 | 1.0512711 | 3.49343E+19 | 1 | 1 | 0 | Bus | 0.135135 | Drive |
Our percentage accuracy is:
((20-4)/20)*100 = 80%
Thus our model predicted the correct travel mode for 80% of the people in our survey.