Blogs

Links: Squishy crime numbers, FEC data blog

The Dallas Morning News ran a probing article that examines how the Dallas Police Department classifies what most people would think of as burglaries. The newspaper found that the police department often called it vandalism if someone broke into a home but didn’t take anything.

That data state of mind

I’ve come to the realization in the past couple of years that teaching computer-assisted reporting requires first teaching how to approach stories with a "data state of mind" — a term that I’ve co-opted from the famous Barlett and Steele phrase, "document state of mind."

Felon hunters story is off the table in Minnesota

This spring, the Minnesota Legislature made it impossible for news organizations here to do the classic "felons with hunting licenses" story that many newspapers (including mine) have done in the past.

Lawmakers passed a bill that made key information in the Department of Natural Resources' license database private. The database covers everyone who has obtained a license for fishing, hunting, trail use and other activities. The person's name, address, driver's license number and date of birth are now private.

EveryBlock Goes Open Source

By now you may have read that EveryBlock, a Knight Foundation-funded project, has released its source code to the public (here's a browsable version). Getting a chance to look under the hood is a great opportunity to see how other folks tackle some of the tasks we all face, or are likely to.

A Tale of Two Values: Using Logistic Regression

Linear regression is a great tool when your outcome variable is test scores or loan amounts or another continuous variable. But sometimes, your output is a Yes or a No. That type of outcome is known as dichotomous.

You can still do something similar to linear regression because some super smart stats dude a while back came up with a way to mimic linear regression with a dichotomous outcome variable.

To do logistic regression in SPSS, you need to have the “regression models” add-on program. You also have to understand your data and do a little prep work on it.
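
If you don't have SPSS handy, here is a rough sketch of the same idea in plain Python. This is not the SPSS procedure described above. It is a bare-bones illustration that fits a one-predictor logistic model by gradient ascent. The function names, the settings and the study-hours data are all made up for the example.

```python
import math

def sigmoid(z):
    """Squash any real number into the 0-1 range so it reads as a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit P(y=1) = sigmoid(intercept + slope * x) by gradient ascent
    on the log-likelihood. Hypothetical helper, not a library function."""
    intercept, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_intercept = 0.0
        grad_slope = 0.0
        for x, y in zip(xs, ys):
            # Gap between the observed 0/1 outcome and the predicted probability.
            error = y - sigmoid(intercept + slope * x)
            grad_intercept += error
            grad_slope += error * x
        intercept += learning_rate * grad_intercept / n
        slope += learning_rate * grad_slope / n
    return intercept, slope

# Made-up data: hours of study (x) and whether the student passed (1) or not (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

intercept, slope = fit_logistic(hours, passed)

# Predicted probability of a Yes for a low and a high value of the predictor.
prob_low = sigmoid(intercept + slope * 1.0)
prob_high = sigmoid(intercept + slope * 4.5)
print(f"intercept={intercept:.2f} slope={slope:.2f}")
print(f"P(pass | 1.0 hours)={prob_low:.2f}  P(pass | 4.5 hours)={prob_high:.2f}")
```

The point to notice is that the model never predicts a raw Yes or No. It predicts a probability between 0 and 1, and the slope tells you whether that probability rises or falls as the predictor increases. A stats package would also report standard errors and significance tests, which this sketch leaves out.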

Links: Our CIO and Tufte

It sounds like Vivek Kundra, the federal government’s first chief information officer, gets the importance of releasing unadulterated data.

In an in-depth interview with Wired, Kundra says, “The core principles are using open standards, presenting raw data, and distributing it in as many formats as possible. Public policy decisions are made using the data anyway, but the raw data is important because if it is massaged too much, you can lose the big issues.”

Comparing people finders

This is the first of a three-part look at nearly 20 free or nearly free search sites aimed specifically at finding information on people by mining not only the Web but also social networking sites, archives and, to some degree, public records sites.

In this post, we'll look at 123people, CVGadget, iSearch and LinkedIn.

The bottom line: No single site offers everything available on a person, but taken together they paint a detailed portrait of an individual. Add more traditional resources, such as property and court records, and even more details emerge.

Where we've been

Future Tools: Some thoughts on the future of CAR

I had the privilege of speaking on a panel with Sarah Cohen and Steve Doig last week in Baltimore about the future of computer-assisted reporting. Whoever thought I even belonged in the same room as those two gave me way more credit than I deserved.

But in preparing for that panel, I got to thinking: What skills and software tools are we going to be using in 10 years? What skills should we start learning now if we want to be prepared for the future? Or better yet: What types of problems in newsgathering and investigations could technology best help solve?

Links: Data.gov and credit union health

The federal government launched Data.gov a little less than a month ago with raw databases, data extraction tools and widgets and pledged to bring "unprecedented access to government information." The White House said the site would allow "unfiltered access to government data streams in machine-readable formats."

Yahoo! Placemaker

The process of geolocating information isn't new to journalists; producing maps has long been a key part of what we do. But when it comes to our stories, extracting mappable entities like cities from text is a relatively new concept.

There are commercial services that do this task, and researchers have created software for academic pursuits as well. Widespread free availability of geolocation services, however, was mostly wishful thinking until last month.

The National Institute for Computer-Assisted Reporting is a joint program of
Investigative Reporters and Editors, Inc., and the Missouri School of Journalism.

141 Neff Annex, Missouri School of Journalism, Columbia MO, 65211, Tel. 573-882-2042, Fax 573-884-5544

All Rights Reserved