Posted on Fri, Oct 29, 2010 : 9:50 a.m.

FOIA Friday: Tracking on-time performance of the bus system

By Edward Vielmetti

The board of the Ann Arbor Transportation Authority regularly gets reports on the on-time performance of the system. At the latest meeting, board member David Nacht noted that on-time performance was "abysmal," with only 83.3 percent of trips reported on time in the July-to-September period.

If a system's entire performance is boiled down to one number, it's hard for management to suggest anything other than "work harder." Here's a look at tracking system performance from the outside, and some musings on whether going through the FOIA process actually gives you any advantage over working from publicly available data.

The limitations of FOIA for analysis of complex data sets

Organizations that collect real-time data collect a lot of it. With multiple vehicles on each route reporting their location via GPS every minute, you might reasonably expect millions of individual records to be generated.

Making sense of all of this data can be difficult, even for the organizations that collect it. To produce a report from it, you need not only the raw location of every vehicle, but also all of the schedule information and the details of which buses are leaving their routes to return to the depot. A hypothetical FOIA request that asked for every location of every bus in the system at all times still would not be enough to tell you why the buses are running late.

FOIA entitles you to copies of existing records, but it does not compel the organization to create a new record on your behalf. If there were an existing report that illustrated the information that you wanted to gather, you could ask for it, and if it had been prepared already you would be entitled to a copy. But if the information you want is hiding in a database and it would take hours of effort to tease it out, no request will compel anyone to do that work for you.

Approximating the analysis from the outside

Fortunately, in the case of the AATA, there is a publicly available external source of on-time performance data. The Ridetrak system shows the current state of each of the routes in the system, and Mobile Ridetrak, the version formatted for mobile phones, is relatively uncomplicated to parse.

I collected about three hours of data on on-time performance of five routes in the AATA system this afternoon, and wrote some relatively simplistic code to determine performance for each bus in the collection. All in all, the data covers 1,735 bus time reports, collected between 1:30 p.m. and 4 p.m.
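For readers who want to try something similar, here is a minimal sketch of what "performance for each bus" might look like in Python. It is an illustration only, not the code used for this column: the record layout (route, bus identifier, time observed, minutes late) and the sample values are assumptions made up for the example, and they do not reflect the actual Ridetrak page format.

```python
# Hypothetical observations: (route, bus_id, time_observed, minutes_late).
# Both the layout and the values are invented for illustration.
observations = [
    ("2", "bus-A", "14:00", 4),
    ("2", "bus-A", "14:01", 7),
    ("2", "bus-B", "14:00", 0),
    ("4", "bus-C", "14:00", 3),
    ("4", "bus-C", "14:01", 2),
]


def worst_delay_per_bus(rows):
    """Return {(route, bus_id): largest minutes_late seen for that bus}."""
    worst = {}
    for route, bus_id, _observed, minutes_late in rows:
        key = (route, bus_id)
        # Keep the largest delay reported for each bus.
        if key not in worst or minutes_late > worst[key]:
            worst[key] = minutes_late
    return worst


summary = worst_delay_per_bus(observations)
for (route, bus_id), delay in sorted(summary.items()):
    print(f"Route {route} {bus_id}: worst delay {delay} min")
```

Taking the worst delay per bus is only one possible summary; an average or a count of late readings would be just as easy to compute from the same rows.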

With any data set like this, you worry a little bit about quality. A spot check for sanity showed several routes with on time reports that occasionally did not make sense, e.g. a bus reported to be nine minutes late one minute, on time the next, and nine minutes late again a few minutes later. This did not appear to be a problem that repeated for every route frequently enough to throw broad conclusions off, but it does suggest that errors may creep in that would be cleaned up by a more careful analysis.
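A spot check of this kind can be roughed out in code. The sketch below is one possible approach, not the check performed for this column: it flags a reading that jumps away from both of its neighbors while the neighbors agree with each other. The function name and the `jump` and `tolerance` thresholds are assumptions chosen for illustration.

```python
def flag_flicker(delays, jump=5, tolerance=1):
    """Return indexes of readings that look like one-off glitches.

    `delays` is a list of minutes-late readings for one bus, in time order.
    A reading is suspect when the readings before and after it agree
    (within `tolerance` minutes) but it differs from both by at least
    `jump` minutes.
    """
    suspects = []
    for i in range(1, len(delays) - 1):
        before, here, after = delays[i - 1], delays[i], delays[i + 1]
        neighbours_agree = abs(before - after) <= tolerance
        jumps_away = abs(here - before) >= jump and abs(here - after) >= jump
        if neighbours_agree and jumps_away:
            suspects.append(i)
    return suspects


# Nine minutes late, then "on time", then nine minutes late again.
print(flag_flicker([9, 9, 0, 9, 10]))  # → [2]
```

A flagged reading is not necessarily wrong, so it makes sense to review suspects rather than delete them automatically.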

Simple conclusions, complicated questions


The distribution of delays for 1,735 bus arrivals collected during the afternoon. 350 vehicles, representing 20 percent of the sample, were more than five minutes delayed.

Edward Vielmetti |

The results of this survey, which include some known sampling errors, are depicted at right. This afternoon's sample showed about 20 percent of the buses running more than five minutes late, with a maximum delay of 19 minutes reported on a bus serving Route 2 (Plymouth Road) at 3:11 p.m.
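The 20 percent figure is simple arithmetic: 350 late readings out of 1,735 is about 20.2 percent. As a rough sketch of how a delay distribution and a late share could be tallied in Python, assuming a plain list of minutes-late values (the sample list below is invented, and the five-minute threshold follows the cutoff used above):

```python
from collections import Counter


def share_late(delays, threshold=5):
    """Fraction of readings more than `threshold` minutes late."""
    late = sum(1 for d in delays if d > threshold)
    return late / len(delays)


def histogram(delays, width=5):
    """Count readings in buckets `width` minutes wide, keyed by bucket start."""
    return Counter((d // width) * width for d in delays)


# The figures reported in this column: 350 of 1,735 readings.
print(round(350 / 1735 * 100, 1))  # → 20.2

# Invented sample values, for illustration only.
sample = [0, 1, 2, 4, 5, 6, 7, 12, 19, 0]
print(share_late(sample))  # → 0.4
for start, count in sorted(histogram(sample).items()):
    print(f"{start:>2}-{start + 4} min: {'#' * count}")
```

Printing the buckets as rows of `#` marks gives a crude text histogram, which is often enough to see the shape of the distribution before reaching for a charting tool.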

If you had all of the data in the entire system over the entire reporting period, you could start to answer more complicated questions. Are some times of day worse for on-time performance than others? Do some routes perpetually run late? Is there some systematic explanation, like a snowstorm, that causes all routes to be late all day long?
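Questions like these come down to grouping the same readings in different ways. The following is a minimal sketch, assuming each reading is a dictionary with hypothetical `route`, `hour` and `minutes_late` fields; the field names and sample rows are invented for illustration.

```python
from collections import defaultdict


def late_share_by(rows, key, threshold=5):
    """Return {group: fraction of readings more than `threshold` min late}.

    `key` is a function that picks the grouping value out of each row,
    for example the route or the hour of day.
    """
    totals = defaultdict(int)
    late = defaultdict(int)
    for row in rows:
        group = key(row)
        totals[group] += 1
        if row["minutes_late"] > threshold:
            late[group] += 1
    return {group: late[group] / totals[group] for group in totals}


# Invented readings, for illustration only.
rows = [
    {"route": "2", "hour": 14, "minutes_late": 8},
    {"route": "2", "hour": 15, "minutes_late": 12},
    {"route": "4", "hour": 14, "minutes_late": 1},
    {"route": "4", "hour": 15, "minutes_late": 6},
]

by_route = late_share_by(rows, key=lambda r: r["route"])
by_hour = late_share_by(rows, key=lambda r: r["hour"])
print(by_route)  # → {'2': 1.0, '4': 0.5}
print(by_hour)   # → {14: 0.5, 15: 1.0}
```

Passing the grouping in as a function keeps the tally the same whether the question is about routes, hours of the day, or calendar dates.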

Don't start with FOIA

FOIA is a relatively blunt instrument for requesting detailed system analysis. You may find that it takes a long time and a lot of money to get detailed data that you want, and you might not even be able to understand what you get.

Reports drawn from publicly available data, though incomplete, can raise questions more complicated than you are able to answer yourself. By putting together a prototype, you arrive with part of the answer already in hand when you ask questions of people who have access to detailed reporting tools and all of the data.

Remember, though, that FOIA does not compel anyone to explore the data for you. If you want to answer questions about a system, it can often be most practical to collect the data you need by yourself, and only then go back to the agency with your prototype in hand to say, "I did this, can you do better?"

Edward Vielmetti rides the bus for Contact him at 



Fri, Oct 29, 2010 : 12:45 p.m.

I would really like to see some labels on the axes of that graph. Just to be picky, it should be a histogram, not a line graph. I agree that the AA buses are actually rather reliable compared to any "big city" transportation system. In Chicago, the bus schedule is basically a running joke. I think that it might make most sense to look at how long it takes a route from start to finish (which can be calculated from your data), and also how long it takes over a whole day (i.e., repeated measures from the same route). I think this might help with some of the issues relating to data quality. It may be that some routes are always late, which you would see with the first analysis, or it may be that all routes are late during certain times of the day. (It might also make sense to link this information to drivers, to see if there are certain drivers that are always late, etc.) Both of these things make sense to me and also suggest "easy" fixes. Ed, I would be interested in looking at your data and trying some of this out.


Fri, Oct 29, 2010 : 10:56 a.m.

Compared to the buses I used to take in Detroit, Ann Arbor's are great! They arrive at least close to on time, give web updates as to their status, and don't leak inside when it rains. It looks like most of the late buses are still within 10 minutes of their scheduled arrival (long live Michigan time), which I think is pretty reasonable, especially since early buses are an even bigger inconvenience than late ones. Honestly, with the never-ending purgatory that is Ann Arbor construction, I'm surprised more buses aren't more seriously delayed.


Fri, Oct 29, 2010 : 10:26 a.m.

There's road construction all over this area so I'm not surprised that the AATA buses have been late. Perhaps AATA can revise their route times when they know a particular route has construction slow-downs for several weeks/months. Finally, I wonder if the buses start out late, leaving the Blake Transit Center? That whole area is a MESS, with 5th Avenue closed and the combination of buses, cars, and pedestrians all sharing space on 4th Avenue.


Fri, Oct 29, 2010 : 9:26 a.m.

With the construction on Plymouth Road this summer, I am not surprised there were a lot of late afternoon buses.

5c0++ H4d13y

Fri, Oct 29, 2010 : 9:05 a.m.

Kinda looks like a Poisson distribution. That would make sense considering the buses specifically drive to not be early: they pull over and wait, but if they are late then there's little they can do.