A big city is a perfect case study for Big Data. It generates lots of data and has lots of problems that need to be solved. Progress or regression on many of the common big-city issues can be felt immediately: from fluctuations in the number of murders in a neighborhood to the amount of time spent waiting for a bus. This makes it easy to see whether a deployed data-analytics method works.

Chicago fits the profile. It is large, diverse and has lots of problems. Its unemployment rate has been hovering somewhere around 10% and its 2012 murder rate has so far been higher than that of many of the world’s largest metropolises, including Sao Paulo, Moscow, Mexico City, Los Angeles and New York, according to a recent estimate by the NBC Chicago news service.

With its new CIO Brett Goldstein in office, Chicago may just be the city to watch if you want to learn whether Big Data analytics can improve quality of life. And if anyone has the attributes to make it happen, it is Goldstein: he is smart, educated and has the passion and the drive for data science, public safety and government.

Analyzing crime

Prior to landing his first City Hall job as Chief Data Officer, Goldstein worked for the Chicago Police Department, where he founded and oversaw the department’s Predictive Analytics Group, which applied Big Data analytics to understanding and fighting crime in the city. He established the group in 2010 after receiving a grant from the National Institute of Justice to operationalize analytics in policing.

Goldstein had conducted a research project that looked at data collected by the police department to see if Big Data analytics could be applied in police work. “We found that we were able to do exactly that,” he says of the results. “We could in fact leverage data science across police administrative data and use it as a framework to use predictive data to prevent violence.”

They could, for example, identify areas at highest risk for near-term increases in violence and use that information to deploy police units. “It went from a lab project to being a part of the operation,” Goldstein says. One of the methods leveraged statistical relationships between 911 calls and shooting statistics in an attempt to predict a rise in shootings in a particular area.
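The article does not describe the group’s actual models, but the general idea of relating 911 calls to later shootings can be sketched as a lagged correlation. The example below is only an illustration under stated assumptions: the weekly counts are invented, and the function names (pearson, lagged_correlation) are hypothetical, not anything the Chicago Police Department is known to have used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(calls, shootings, lag):
    """Correlate calls in week t with shootings in week t + lag."""
    if len(calls) != len(shootings):
        raise ValueError("series must cover the same weeks")
    if lag < 0 or lag > len(calls) - 2:
        raise ValueError("lag leaves too few overlapping weeks")
    if lag == 0:
        return pearson(calls, shootings)
    # Drop the last `lag` weeks of calls and the first `lag` weeks of shootings
    # so that each call count lines up with the shootings `lag` weeks later.
    return pearson(calls[:-lag], shootings[lag:])


# Invented weekly counts for one hypothetical area (not real Chicago data).
calls = [40, 42, 55, 61, 48, 44, 70, 75, 52, 47, 43, 66]
shootings = [3, 4, 4, 6, 6, 5, 4, 7, 8, 5, 5, 4]

for lag in range(3):
    r = lagged_correlation(calls, shootings, lag)
    print(f"lag {lag} week(s): r = {r:.2f}")
```

In this made-up series, call volume lines up far better with shootings one week later than with shootings in the same week, which is the kind of leading signal an analyst would look for. As Goldstein himself cautions, a correlation like this says nothing about causality.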

It is hard to measure the group’s success or failure, however. “In 2010 [when he was running the analytics group at the police department], we had a decrease in crime,” Goldstein says. He is quick to point out, though, that you cannot really pinpoint a specific reason for that decrease. “I don’t have the hubris to sort of speak to any sort of causality,” he says. “We made a good contribution as part of a broader strategy.”

Big hopes for big data

In May 2011, Goldstein left the police department to become Chicago’s Chief Data Officer, and in June 2012, Mayor Rahm Emanuel appointed him to be the city’s CIO, replacing Jason DeHaan, who had left city government. The analytics concepts he applied at the police department are extensible across the board, and he fully intends to apply them in his new job. He is currently interested in 311, a number Chicago residents can dial to access information about any of the city’s programs and services.

Last year, the call center received about 3.9m calls, according to the city’s website. “In many cases we’re providing a city service,” Goldstein says about 311. “Whether you call to report a pothole, graffiti or a lights-out.” The city’s CIO wants to examine relationships between patterns of certain types of calls to the center and use that information to foresee issues. There may, for example, be a statistical relationship between lights going out in an alley and garbage cans disappearing. Maybe, next time a street light is out, it is time to deliver a replacement garbage can to the neighborhood, he says.
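One simple way to look for the kind of relationship Goldstein describes is to measure how much more often two request types turn up together than chance alone would suggest, a ratio usually called lift. The sketch below is hypothetical: the records, the request-type labels and the lift function are invented for illustration and do not reflect the city’s actual 311 data or tooling.

```python
def lift(records, type_a, type_b):
    """Ratio of observed co-occurrence to what independence would predict.

    `records` is a list of sets; each set holds the request types logged
    for one block in one week. A value near 1 suggests no relationship;
    a value well above 1 suggests the two types tend to appear together.
    """
    n = len(records)
    count_a = sum(1 for r in records if type_a in r)
    count_b = sum(1 for r in records if type_b in r)
    count_both = sum(1 for r in records if type_a in r and type_b in r)
    if n == 0 or count_a == 0 or count_b == 0:
        raise ValueError("both request types must appear at least once")
    return (count_both / n) / ((count_a / n) * (count_b / n))


# Invented block-week records (not real 311 data).
records = [
    {"light_out", "cart_missing"},
    {"light_out", "cart_missing", "pothole"},
    {"light_out", "cart_missing"},
    {"light_out", "pothole"},
    {"cart_missing", "graffiti"},
    {"pothole"},
    {"pothole", "graffiti"},
    {"graffiti"},
    {"pothole"},
    {"graffiti"},
]

light_vs_cart = lift(records, "light_out", "cart_missing")
light_vs_pothole = lift(records, "light_out", "pothole")
print(f"light_out / cart_missing lift: {light_vs_cart:.3f}")
print(f"light_out / pothole lift:      {light_vs_pothole:.3f}")
```

In this toy data, outages and missing garbage cans appear together almost twice as often as independence would predict, while outages and potholes show no association. On real data, a pattern like the first would be a prompt for further investigation, not proof that one causes the other.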

And there is no need to reinvent the wheel. “The techniques are there,” Goldstein says. “The data is there. It’s just [a matter of] adding data science to our chops as a government.”

A lot of the analytics systems his team has built use a combination of Python and C. More recently, however, he has been adopting R, an open-source language and software environment for statistical computing. He says he is an “open-source” guy, which is all the better, since “government can’t afford large commercial analytics software implementations.” At the moment, the city uses MongoDB, an open-source NoSQL database system, and Hadoop is on the research slate for the next two years, as the team looks for an appropriate mix.

Infrastructure at a pivot point

The city has multiple data centers and overhauling this infrastructure is a priority among Goldstein’s multiple 12-month projects. The core data center is within his department, the Department of Innovation and Technology. It hosts more than 400 servers, some virtualized and some not. This is the city’s biggest data center, with several more scattered around the city.

Goldstein and his department are currently at a crossroads, or a “pivot point”, as he calls it. They need to decide whether to outsource the data center function altogether or to consolidate while leaving the data center capability in-house. Leaving it in-house would mean building out a facility with proper power and cooling infrastructure. “I can be happy with a location that will be safe and sustainable” – a non-trivial task. In addition, the city would have to create a proper disaster-recovery site.

The team has to calculate the cost of making it all happen while also working out what the next generation of hardware is going to be and making sure the facility that is built is able to support it.

“Or, do I actually consider getting out of the business entirely,” he says, explaining that moving into a colocation data center is very much on the table. The city’s technological needs are huge, the resiliency requirements of its IT infrastructure are high, and the economic climate is tough. If someone else provides power, cooling and network, “we can focus on the layers above that”.

One of the primary IT systems that needs an overhaul is the city’s database environment. “We have a database diaspora,” Goldstein says. “We need to do it correctly in a consolidated environment and start to offer it as [a] service.”

Another service getting a hard look is email. “What businesses should I be in? What businesses should I get out of?” Goldstein asks. If he were to start his own company today, he says, “I can promise you I would not be doing my own email”.

Whatever decisions he and his team make now will have long-lasting effects on the city, so this is not something they take lightly.

While most cities operate their own data centers, there are examples of successful moves to commercial colocation facilities. One of the most recent is San Francisco. The city recently consolidated its IT systems and moved its primary data center from an old space that was never really designed to house one to a facility operated by UnitedLayer at a Digital Realty Trust-owned campus at 200 Paul Ave.

Government doesn’t have to suck

Goldstein came to city government from the private sector, where he had a successful career. Prior to coming to work for the Chicago Police Department, he was one of the first employees at OpenTable, a popular online-reservation service for restaurants, where he served as IT director. “I’m very proud of my time at OpenTable and I think we did some very good work,” he says, but at some point he decided he would get more satisfaction out of “working on problems that really change people’s lives,” which is what made public safety and city government attractive to him.

He has a master’s degree in criminal justice from Suffolk University and another master’s in computer science from the University of Chicago. He is also currently pursuing a doctoral degree in criminology, law and justice at the University of Illinois at Chicago. “I still have passion behind the idea of understanding crime patterns,” he says. But many factors beyond crime statistics influence the well-being of a city. There is census, financial and education data. He wants to use this information to create simulations that show how changing ‘X’ can cause outcome ‘Y’. “If we embrace data science we’ll be able to do some really smart work,” Goldstein says.

A successful crossover from the startup world to the public sector, he does not subscribe to the popular belief that government is just incapable of being innovative. “One of the things I have remarkably little patience for is people who say the government is always going to be behind the private sector,” he says. His job is to provide top-level service and “my commitment to the mayor is making that real”.

This article was originally published in the DatacenterDynamics FOCUS magazine, Issue 25.