You are viewing this article in the AnnArbor.com archives. For the latest breaking news and updates in Ann Arbor and the surrounding area, see MLive.com/ann-arbor
Posted on Fri, Jun 11, 2010 : 11 a.m.

How to read a lot of restaurant inspection reports

By Edward Vielmetti

I've been reading restaurant inspection reports.

In Washtenaw County, restaurants are inspected every six months. The county uploads the inspection results to a server run by Sword Solutions, a company based in Dewitt, Mich., which specializes in "mobile inspection software" for organizations like county health departments to handle inspections of campgrounds, mobile home parks, restaurants and temporary food service establishments, migrant labor camps, vending machines, swimming pools, and well and septic systems.

The restaurant inspection reports from Washtenaw County list a set of violation codes for each individual restaurant. The interface provided to the media is the same as the one provided to the public. You can look at each individual restaurant from the Sword Solutions interface, but there's no evident way to pull a report to see how things are across the whole county.

As I think about how I might cobble together something to automate the process of analyzing this entire data set, here are some notes and links on what I'm looking for.

Getting the data

Restaurants in the database can be found by name or by license number. License numbers appear to be sequential, so it should be possible to write a program to work my way through them systematically one by one and fetch all of them with a relatively simple script. That would give me a complete duplicate of the county's database, updatable at any time and searchable with the tools that I have and not what the vendor provides.

It's worthwhile noting that if you want to get a complete dump of a government vendor's database, either the vendor or its sponsor within the government agency may put up roadblocks. A little more than a year ago, I was informed by the Ann Arbor DDA that a project that relied on automated downloads of parking structure occupancy data for downtown garages was not considered an appropriate use. The Ann Arbor News reported at the time that "DDA officials said they understand the information is public, but the computer program raised concerns about the security of the DDA Web site." Similarly, the Ann Arbor Transporation Authority's automated ride information throws an error message if it gets a burst of back-to-back requests.

Automated data downloads are key to any project like this. You want to spend your time talking to people, not futzing with downloading data while you are on a deadline. This means you really have two different approaches to follow: You can ask for permission and wait for official blessing before proceeding, or you can retrieve the data very thoroughly and very slowly in such a way that the agency never notices any unusual behavior.

Only a few dozen restaurants are inspected every few weeks, so the total number of downloads is relatively small.

Parsing the data

What you get back from a system like this is a Web page, designed to be read by people. You need to turn it back into data that can be read by a machine.

A typical inspection report, like this one for Senor Lopez which closed in March after receiving a particularly bad inspection, has a number of sections.

At the top of the page you have reference information: the date of the inspection, the restaurant's license number, its address, name, and notes. There is no field in the inspection report for the name of the inspector.

Each inspection in turn has a number of records, each of which has a number of fields. Here's a sample:

Violation (Critical): Safe/Unadult./Honestly Present (3-101.11) Items: Food Problems: Adulterated/contaminated Corrections: Discard. Comments: OBSERVED A COUPLE ROTTING ONIONS AND SEVERAL ROTTING POTATOES ON THE SHELF IN THE BACK DRY STORAGE ROOM. THE SMALL FLIES IN THE FACILITY ARE ATTRACTED TO THIS ROTTING PRODUCE. CORRECTED BY DISCARDING THESE ITEMS. IN THE FUTURE, ENSURE STAFF ARE INSPECTING THESE FOODS AND THROWING AWAY BAD ONES.

The data is presented as HTML; you'll need an HTML parser, hopefully one that is tolerant of Web data, to make sense of it. I have recommended Beautiful Soup to others who have faced this problem before, since it's straightforward to use it in your programs.

You may also be able to read the data into a Google spreadsheet automatically, using either the ImportData or ImportXML functions. Start with this function

=importdata("http://www.swordsolutions.com/inspections/pgeSearchResults.asp?Name=SENOR%20LOPEZ%20TAQUERIA%20MEXICAN%20RESTAURANT&County=28&License=047178&Hit=DirectSearch&CountHit=28&")

and iterate depending on your relative skills with XML vs. Excel functions.

Decoding the codes

The interesting piece of this, which will make the bigger project easier, is the code in the violation line. A search for 3.101.11 on the Michigan.GOV site unearthed this absolute treasure trove of practical information: the Procedures for Field Standardization Of Trainers, published by the Michigan Department of Agriculture, Food and Dairy Division, Food Service Program, Revised August 2008.

This document is a guide for those who are training people to become food safety inspectors. It has checklists, documentation for the codes (which come from the FDA), and some narrative about what to look for. From its introduction:

The procedure is not intended to provide basic training to individuals; rather it is designed to evaluate and attest that an individual who is well prepared meets a standard of performance in food service evaluations. The procedure examines the candidate’s knowledge, understanding, and practical application of risk-based inspection techniques, Michigan’s Food Law and interpretations, HACCP principles as a risk-based food safety evaluation tool, and the ability to work with operators to develop risk control plans.

Next steps

Armed with this information, what do you do next? More importantly, to my editor, how is this news?

If I am reading all of this information right, it should be possible to take any given inspection and turn it into a set of rows in a spreadsheet. The columns would include license number, restaurant name, inspection date, address, notes, violation code, violation name, a flag to mark if the violation was critical, the item inspected, problems, corrections, and comments. A complete dump of the entire database would have one row for each observation in every inspection.

Once you have all of this, you could start to do reporting across the whole system. Are restaurants in better or worse shape than before? What are the most frequent violations? What are the critical violations that show up over and over again? Are there some parts of the county or the city with a better track record than others?

All those questions, which I'd love to be able to answer, are unanswerable to me now just by looking at links. The key discovery in this process was the train-the-trainers manual, which told me not only what all of the codes mean, but also that Michigan's health inspection of restaurants is based on a risk model which I don't know anything about. That's for another day.

Edward Vielmetti is an omnivore for AnnArbor.com .