Putting the Data Science into Journalism

By Keith Kirkpatrick

Communications of the ACM, Vol. 58 No. 5, Pages 15-17

The key attributes journalists must have (the ability to separate fact from opinion, a willingness to find and develop strong sources, and the curiosity to ask probing, intelligent questions) are still relevant in today's 140-character-or-less, ADHD-esque society. Yet journalists covering technical topics in science and technology are increasingly turning to tools that were once solely the province of data analysts and computer scientists.

News organizations are using data mining, Web scraping, classification of unstructured data, and complex data visualizations to uncover data that would be impossible to compile manually. Finding the average price of a particular item across all Web retailers, for example, is out of reach for a reporter on deadline, given the vast number of potential sites to visit and the limited time available. These tools can also dig up or present data in ways that help journalists generate story ideas, and they can present complex information to readers in ways they have not seen before.
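To make the price-comparison example concrete, here is a minimal sketch of the scraping-and-averaging step in Python. It is an illustration only, not a method described by any news organization. The retailer names, the HTML snippets, and the assumption that each price sits in a `<span class="price">` element are all hypothetical; a real scraper would also have to fetch the pages, respect each site's terms of use, and cope with markup that differs from retailer to retailer.

```python
import re
from html.parser import HTMLParser
from statistics import mean

# Hypothetical page snippets standing in for pages fetched from retailers.
SAMPLE_PAGES = {
    "shop-a.example": '<div><span class="price">$19.99</span></div>',
    "shop-b.example": '<p>Now only <span class="price">$24.50</span>!</p>',
    "shop-c.example": '<span class="price">USD 21.01</span>',
}

# Matches the first number in a string, with an optional decimal part.
PRICE_PATTERN = re.compile(r"(\d+(?:\.\d{1,2})?)")


class PriceParser(HTMLParser):
    """Collects numbers found inside <span class="price"> elements.

    The class name "price" is an assumption made for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "price") in attrs:
            self.in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_price = False

    def handle_data(self, data):
        if self.in_price:
            match = PRICE_PATTERN.search(data)
            if match:
                self.prices.append(float(match.group(1)))


def extract_prices(html):
    """Return every price found in one page's HTML."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


def average_price(pages):
    """Average all prices across pages; None if no price was found."""
    prices = [p for html in pages.values() for p in extract_prices(html)]
    return mean(prices) if prices else None


if __name__ == "__main__":
    print(f"Average price: {average_price(SAMPLE_PAGES):.2f}")
```

Even in this toy form, the sketch shows why the task suits software rather than a reporter on deadline: once the extraction rule is written, adding another thousand pages costs almost nothing.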


