
A computational journalism reading list

[Last updated: 18 April 2011 -- added statistical NLP book link]

There is something extraordinarily rich in the intersection of computer science and journalism. It feels like there’s a nascent field in the making, tied to the rise of the internet. I’d like to propose a working definition of computational journalism as the application of computer science to the problems of public information, knowledge, and belief, by practitioners who see their mission as outside of both commerce and government. “Computational journalism” has no textbooks yet.

Data journalism
Data journalism is obtaining, reporting on, curating and publishing data in the public interest.

Visualization
Big data requires powerful exploration and storytelling tools, and increasingly that means visualization. Tamara Munzner’s chapter on visualization is the essential primer.

Computational linguistics
Data is more than numbers.

Communications technology and free speech
Code is law.

Leaked Labour email: lay off Murdoch An email, forwarded on behalf of Ed Miliband's director of strategy, Tom Baldwin, to all shadow cabinet teams warns Labour spokespeople to avoid linking hacking with the BSkyB bid, to accept ministerial assurances that meetings with Rupert Murdoch are not influencing that process, and to ensure that complaints about tapping are made in a personal, not shadow ministerial, capacity. The circular, sent by a Labour press officer on 27 January, states: "Tom Baldwin has requested that any front-bench spokespeople use the following line when questioned on phone-hacking. BSkyB bid and phone-tapping . . . these issues should not be linked. One is a competition issue, the other an allegation of criminal activity." It goes on: "Downing Street says that Cameron's dinners with Murdoch will not affect Hunt's judgement. We have to take them at their word." Referring separately to the phone-hacking allegations, the memo states: "We believe the police should thoroughly investigate all allegations.

September | 2012 | Frontiers of Computational Journalism In this week’s class, we discussed clustering algorithms and their application to journalism. As an example, we built a distance metric to measure the similarity of the voting history between two members of the UK House of Lords, and used it with multi-dimensional scaling to visualize the voting blocs. The data comes from The Public Whip, an independent site that scrapes the British parliamentary proceedings (the “hansard”) and extracts the voting record into a database. The converted data files plus the scripts I used in class are up on GitHub. Then start R, and enter source("lords-votes.R"). You should see the resulting plot. And voila! Let’s break down how we made this plot, which will also illuminate how to interpret it, and how much editorial choice went into its construction. The first section of lords-votes.R just loads in the data, and converts the weird encoding (2=aye, 4=nay, -9=not present, etc.) into a matrix of 1 for aye, -1 for nay, and 0 for did not vote.
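The recoding, distance metric and multi-dimensional scaling steps described above can be sketched as follows. This is an independent Python illustration, not the class's R script: the excerpt does not give the exact distance metric, so `vote_distance` (the share of shared divisions on which two members disagreed), the toy `raw` data and all function names are assumptions made for the example.

```python
import math

def recode(raw):
    """Map the raw codes named in the text (2 = aye, 4 = nay) to 1 / -1.
    Every other code, including -9 (not present), becomes 0 (assumption)."""
    return {2: 1, 4: -1}.get(raw, 0)

def vote_distance(a, b):
    """Assumed metric: share of divisions where both voted and disagreed."""
    both = [(x, y) for x, y in zip(a, b) if x != 0 and y != 0]
    if not both:
        return 1.0  # no shared votes: an arbitrary editorial choice
    return sum(1 for x, y in both if x != y) / len(both)

def distance_matrix(rows):
    n = len(rows)
    return [[0.0 if i == j else vote_distance(rows[i], rows[j])
             for j in range(n)] for i in range(n)]

def classical_mds(dist, dims=2, iters=200):
    """Classical (Torgerson) MDS using power iteration with deflation."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    grand = sum(row_mean) / n
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        v = [math.sin(i + 1 + k) for i in range(n)]  # fixed start vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        if lam > 1e-12:  # non-positive eigenvalues leave the coordinate at 0
            for i in range(n):
                coords[i][k] = v[i] * math.sqrt(lam)
        # Deflate so the next pass finds the next eigenvector
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords

# Hypothetical toy data: four members, four divisions, in the raw encoding.
raw = {"A": [2, 2, 4, 2], "B": [2, 2, 4, 2], "C": [4, 4, 2, 4], "D": [2, -9, 4, 4]}
votes = [[recode(code) for code in row] for row in raw.values()]
dist = distance_matrix(votes)
coords = classical_mds(dist)
for name, point in zip(raw, coords):
    print(name, [round(x, 3) for x in point])
```

Members who always vote together land on the same point, and members who always oppose each other land far apart. How absences and members with no shared votes are treated is exactly the kind of editorial choice the post mentions, and a different rule would move the points.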

Big Data: Making sense at scale From a recent trip to Silicon Valley (thanks to friends at the Orange Institute), I have come back with one conviction: everything we knew about the web is going to change again with the phenomenon of big data. It raises anew, on a different footing, almost all of the questions tied to the digital transformation. In 2008, humanity poured 480 billion gigabytes onto the Internet. In 2010, it was 800 billion gigabytes, which is, as Eric Schmidt once said, more than the total of what humanity had written, printed, engraved, filmed or recorded from its birth until 2003. These data are not all creative works. Besides blogs, texts, videos (35 million are watched on YouTube every minute) or music sharing, there are now microconversations, geolocated applications, the production of personal data, the publication of public data online, the interactions of the Internet of Things... The web was broadly transactional.

The Witch Hunt Against Assange Is Turning into an Extremely Dangerous Assault on Journalism Itself | News & Politics December 17, 2010 | Whatever the unusual aspects of the case, the Obama administration’s reported plan to indict WikiLeaks founder Julian Assange for conspiring with Army Pvt. That’s because the process for reporters obtaining classified information about crimes of state most often involves a journalist persuading some government official to break the law either by turning over classified documents or at least by talking about the secret information. Contrary to what some outsiders might believe, it’s actually quite uncommon for sensitive material to simply arrive “over the transom” unsolicited. In most cases, I played some role – either large or small – in locating the classified information or convincing some government official to divulge some secrets. Other times, I was sneaky in liberating some newsworthy classified information from government control. A Nixon Precedent

Journalist, ask yourself what a good Transparency Law can do for you From day one, the aim of Tuderechoasaber.es (as well as of Access Info Europe and Fundación Civio) has been to facilitate, spark curiosity about and spread the practice of requesting information among all citizens, not only information professionals. Here is an example. Nevertheless, we need news media that are aware of the importance of having a good Transparency Law and a fully recognized and guaranteed right of access to information. And, perhaps, media less distracted by the tug-of-war and the partisan, self-interested statements accompanying the passage of the bill. The right of access to public information, backed by laws that truly protect it, is a rich seam of news for the media in other countries. There is no great mystery: the clearer, more specific and more ambitious the law, the greater the administration's duty to take the data out of its saddlebag, and the stronger the safeguards that force it to comply.

Real-Time Data And A More Personalized Web - Smashing Magazine As Web designers, we face a daily struggle to keep pace with advances in technology, new standards and new user expectations. We spend a large part of our working life dipping in and out of recent developments in an attempt to stay both relevant and competitive, and while this is what makes our industry so exciting to be a part of, it often becomes all too easy to get caught up in the finer details. Responsive Web design, improved semantics and rich Web typography have all seen their fair share of the limelight over the last year, but two developments in particular mark true milestones in the maturation of the Web: “real-time data” and a more “personalized Web.” Since the arrival of the new Web, we’ve been enraptured by social media. Web gurus and industry analysts are simultaneously arriving at the same conclusion: we are entering a new chapter in the evolution of the Web. Welcome to the new era. Real-Time Data Real-time data is making waves in Web analytics.

Your Life Torn Open, essay 1: Sharing is a trap This article was taken from the March 2011 issue of Wired magazine. The author of The Cult Of The Amateur argues that if we lose our privacy we sacrifice a fundamental part of our humanity. Every so often, when I'm in Amsterdam, I visit the Rijksmuseum to remind myself about the history of privacy. Today, as social media continues radically to transform how we communicate and interact, I can't help thinking with a heavy heart about The Woman in Blue. On this future network, we will all know what everyone is doing all the time. Every so often, when I'm in London, I visit University College to remind myself about the future of privacy. Unfortunately, Bentham's panopticon was a dark premonition. Yet nobody in the industrial era actually wanted to become artefacts in this collective exhibition.

What is data journalism? (Data journalism course 1/10) Starting today and over the coming weeks, we will publish on Irekia a series of tutorials and videos produced during the data journalism course that Mar Cabra and David Cabo taught last June in the three Basque capitals, attended by more than 70 journalists. These materials, available to the general public, aim to explore this branch of journalism in depth and to encourage the use, processing and analysis of public data released as open data (Open Data Euskadi in the Basque case), so that journalists or any citizen can produce their own reports. Indeed, workshops focused on data journalism and open data are being held today at the UPV summer courses, organized by the Basque Government's Directorate for Open Government and Internet Communication, as part of the first summer course on Open Government. Tutorials and videos

Tools to help bring data to your journalism « Michelle Minkoff NOTE: This entry was modified on the evening of 11/9/10 to deal with typos and missing words, resulting from posting this too late the previous night. Sleep deprivation isn’t always a good thing, although it allows one to do things more fun than sleep. Like play with data. Many of the stories we do every day, across beats, could benefit from a data component. So, here’s a roundup of some tools you can use to rapidly produce data pieces without programming knowledge. Prepping tables Tableizer: Copy and paste cells from your Excel spreadsheet into this tool, and it’ll spit back a formatted HTML table that you can copy and paste into a CMS of your choice. Interactive viz – no programming Many Eyes: A fantastic tool for presentation, built by some of the best data visualizers in the business, who now work for Google. Static viz Use programming to make custom charts
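For readers curious about what a table-prepping tool of this kind does under the hood, here is a minimal Python sketch of the general idea: cells copied from a spreadsheet arrive as tab-separated text, and each one is wrapped in table markup. It is not Tableizer's actual implementation; the function name `cells_to_html_table` and the sample data are hypothetical.

```python
from html import escape

def cells_to_html_table(pasted, header=True):
    """Turn tab-separated rows (as copied from a spreadsheet) into an HTML table."""
    rows = [line.split("\t") for line in pasted.splitlines() if line.strip()]
    out = ["<table>"]
    for i, row in enumerate(rows):
        # Treat the first row as column headings when header=True
        tag = "th" if header and i == 0 else "td"
        # escape() keeps characters such as & and < from breaking the markup
        cells = "".join(f"<{tag}>{escape(cell.strip())}</{tag}>" for cell in row)
        out.append(f"  <tr>{cells}</tr>")
    out.append("</table>")
    return "\n".join(out)

# Hypothetical sample: what the clipboard holds after copying three spreadsheet rows.
sample = "Name\tVotes\nSmith & Co\t12\nJones\t7"
table_html = cells_to_html_table(sample)
print(table_html)
```

The resulting string can be pasted into a CMS in the same way as the tool's output.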

WikiLeaks has created a new media landscape | Clay Shirky WikiLeaks affects one of the key tensions in democracies: the government needs to be able to keep secrets, but citizens need to know what is being done in our name. These requirements are fundamental and incompatible; like the trade-offs between privacy and security, or liberty and equality, different countries in different eras find different ways to negotiate those competing needs. In the case of state secrets v citizen oversight, however, there is one constant risk: since deciding what is a secret is itself a secret, there is always a risk that the government will simply hide an increasing amount of material of public concern. One response to this risk is the leaker, someone who believes that key elements of political life are being wrongly kept from public view, and who circulates that material on his or her own. This transformation is under-appreciated. The press often covers WikiLeaks as a series of unfortunate events, one crisis or scandal after another. Until WikiLeaks.
