Monday, September 6, 2010

Flying Bombs on London, Summer of 1944

Beginning in June 1944, London was the target of attacks by V-1 "flying bombs" launched primarily from France. The attacks continued through the summer, ending as the launch sites were overrun by advancing Allied forces.

In a short, one-page article published in the Journal of the Institute of Actuaries in 1946, R. D. Clarke presented an analysis of the distribution of V-1 impacts on London. (A PDF of the article is available here.) He showed that the pattern fit a Poisson distribution extraordinarily well, refuting claims that the bombs tended to cluster geographically. Subsequently, this Poisson pattern was mentioned in Thomas Pynchon's novel Gravity's Rainbow (see the reference to page 54 here) and in Feller's classic text, An Introduction to Probability Theory and Its Applications, in 1957.
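For readers who want to see what "fits a Poisson distribution" means in practice, here is a minimal Python sketch of the comparison. The counts are the table as it is commonly reprinted from Clarke's article (576 squares, 537 impacts, with a final "5 and over" class); they should be checked against the article itself. The variable names and the chi-square summary are my own choices for illustration, not a reproduction of Clarke's exact procedure.

```python
import math

# Number of squares receiving k impacts, as commonly reprinted from
# Clarke (1946). Key 5 stands for the "5 and over" class (assumption:
# verify these counts against the original article).
observed = {0: 229, 1: 211, 2: 93, 3: 35, 4: 7, 5: 1}

n_squares = sum(observed.values())  # total number of squares
total_hits = 537                    # total impacts reported for the area
lam = total_hits / n_squares        # mean impacts per square


def poisson_pmf(k, rate):
    """Probability of exactly k events under a Poisson law with this rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


# Expected number of squares for k = 0..4; the last class takes the tail.
expected = {k: n_squares * poisson_pmf(k, lam) for k in range(5)}
expected[5] = n_squares - sum(expected.values())

# Pearson chi-square statistic comparing observed and expected counts.
chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

for k in observed:
    label = "5+" if k == 5 else str(k)
    print(f"{label:>2}  observed {observed[k]:4d}  expected {expected[k]:7.2f}")
print(f"lambda = {lam:.3f}, chi-square = {chi_sq:.2f}")
```

With six classes and one estimated parameter, the statistic is compared to a chi-square distribution with 4 degrees of freedom. A value that is small relative to the degrees of freedom is what a good Poisson fit looks like.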

While the data table from Clarke's article has often been reprinted and used in examples for statistics classes or lectures, the raw data of individual impacts seem not to have been available. (The paper is often misrepresented on web sites, by the way, with fanciful "stories" behind the analysis and frequent misidentification of the V-1 as the later V-2 rocket. I was surprised by how much "legend" has grown up unchecked about this paper and the data behind it. Who knew the internet contained incorrect information?)

I have reconstructed the data from the original maps in the British Archives in Kew and present above the density distribution of impacts for June through August of 1944. In the plot above, the view is from the north-east. The higher the density, the greater the number of impacts. The density falls off very sharply north of the Thames, while the greatest density is slightly south and east of central London.

Below is the same density but viewed from the south-west.

It is also useful to consider a contour map of the density. The numbers on the contours are the estimated number of impacts per square kilometer. (Clarke's analysis was based on counts of impacts in squares of a quarter of a square kilometer each.)

Clarke's analysis focused on the central area of higher density shown here, using finer geographic coordinates. Within that area his analysis found no evidence of clustering that cannot be accounted for by a Poisson process. From the wider geographic perspective presented here, the greatest density of the attack is well defined, with impacts rising and then falling sharply along a south-east to north-west axis, roughly the flight path of the V-1s coming from France, and falling off a bit more slowly along the south-west to north-east axis. This suggests that the range of the V-1 was somewhat more accurately controlled than was the "windage" from left to right of the flight path.

The statistician F. N. David also analyzed these impact data during the war, applying a bivariate normal distribution to the data. In his unpublished war-time analysis of 1944, Clarke critiques her modeling and suggests that a mixture of normals would be a better representation, because the launch points and the aiming points in London changed frequently.

Sunday, November 29, 2009

Visualizing empires decline

Visualizing empires decline on Vimeo

This is done with Processing, the software we'll use in PS919 this spring.

His blog is interesting:

Discuss: 1) Does geographic area make sense as the measure of size? 2) Do the empires shrink proportionately, or seem to? 3) Should the empires jiggle when they bump each other? 4) Should they bump into one another in the first place? 5) Could space be used better?

Saturday, November 14, 2009

Color: The Cinderella of dataviz

Color: The Cinderella of dataviz : Dataspora Blog

Color is so important but so hard to get right. Here is a great introduction with excellent reference links at the end.

Discuss: 1) Why is color so powerful? 2) Why use color when journals only publish grayscale?

Exercise: Convert a good color graphic into an equally good grayscale one.
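One reason the exercise is harder than it sounds: a mechanical conversion keeps only brightness, so colors that differ mainly in hue can collapse into the same gray. The Python sketch below illustrates this under stated assumptions. It uses the Rec. 601 luma weights (one common choice among several), a hypothetical four-color palette, and an arbitrary threshold of 10 gray levels for "too close to tell apart".

```python
from itertools import combinations


def luminance(rgb):
    """Rec. 601 luma of an 8-bit (r, g, b) triple, on a 0-255 gray scale."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


# Hypothetical palette for illustration only.
palette = {
    "red": (255, 0, 0),
    "green": (0, 130, 0),
    "blue": (0, 0, 255),
    "orange": (255, 165, 0),
}


def clashing_pairs(colors, min_gap=10.0):
    """Return pairs of color names whose gray levels differ by less than min_gap."""
    pairs = []
    for (name_a, rgb_a), (name_b, rgb_b) in combinations(colors.items(), 2):
        if abs(luminance(rgb_a) - luminance(rgb_b)) < min_gap:
            pairs.append((name_a, name_b))
    return pairs


for name, rgb in palette.items():
    print(f"{name:>6}: gray level {luminance(rgb):6.1f}")
print("Hard to distinguish in grayscale:", clashing_pairs(palette))
```

Pairs flagged this way need some other encoding in the grayscale version, such as line style, marker shape, direct labels, or a redesigned palette with distinct lightness steps.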

Sexy Data Geeks

The Three Sexy Skills of Data Geeks : Dataspora Blog


2008 Newspaper Endorsements

Map of Newspaper Endorsements for 2008 US Election

The data are on the page.

Challenge: Do better with these data.

Mentionmap, Another Twitter Vis

Mentionmap - A Twitter Visualization

Twitter vis is becoming a major category!

Discuss: 1) What does this accomplish? 2) How does it compare to other Twitter visualizations? (See topic "twitter vis" in the right column.)

Weather Down Under w/ Processing

Flink Labs | Data Visualisation | Flink Weather

Here is an example of what you can do with Processing, one of our foci for the Visualizing Data class.

Discuss: 1) How intuitive is the grid over the outline map? 2) Does the grid make more sense after watching it for a while? 3) How do we balance simple graphics that are instantly clear against complex graphics that repay close study with richer understanding? 4) Note the use of the weather dashboard at the bottom, an example of "sparklines", animation, and multivariate data display.

Conflict History

Conflict History | 1901-1913

All the wars, choose your years.

Discuss: 1) As a visual display of a database, how does this do? 2) What are the advantages of linking it to Google Maps? 3) What are the disadvantages of the ubiquitous use of Google Maps? 4) Is this a case where geography is the right abstraction for visualization? a) it locates conflict in familiar space b) non-spatial locations (time, magnitude of conflict) might be more revealing?

CNN on Data Vis

When it comes to making data sexy, you can't be too graphic -

And who can forget the election 2008 holographic correspondents? It would be more convincing if we saw CNN and other outlets striving to improve their graphics the way the NYT has done.

Discuss: Watch NBC, ABC, CBS, CNN, Fox main news shows and critique the graphics they use. What are informative elements? What are distractions?

Visible Tweets

Visible Tweets – Twitter Visualisations. Now with added prettiness!

Another Twitter visualization.

Discuss: 1) What is value added? 2) How is animation related to the goal of the visualization? 3) How would you do better?

How to Make a Thematic Map

How to Make a US County Thematic Map Using Free Tools | FlowingData

This is a great worked example of both scraping data and turning it into a nice visualization.

These are very important tools you should know.

Discuss: 1) Why learn Python when you could just type the data into a spreadsheet? 2) R does maps too. Which is better? 3) GIS? See GRASS for an open-source GIS.

SOUNDS like a good idea

Hearing impaired renders you shortsighted | Me, myself and BI

Now and then folks try to integrate sound with visualization. Here is an example.

Discuss: 1) What does the sound add? 2) Regardless of this attempt, what could sound add to a visualization? 3) Does sound produce information overload or provide a new dimension?

Picturing Unemployment

The Jobless Rate for People Like You - Interactive Graphic -

The NYT did this very interesting interactive chart allowing you to see the unemployment rates for various combinations of race, sex, age and education (72 groups in all).

The Wall Street Journal also had an interesting look at unemployment since 1948, but apparently the graphic is no longer available at the WSJ site. Here is a post that features part of the static graphic and offers a few comments.

Discuss: 1) What else might the NYT chart offer as options? 2) Given the CPS data used by NYT, what else might you do to visualize unemployment? 3) The WSJ graph is unusual because it converts a timeseries into a rectangular grid. What is gained and lost by this?



Many beautiful visualizations here. Some of the off-site links are also amazing.

Web Trend Maps's Top Trending Links - Web Trend Map

Check out some of the Featured Maps from the top menu, one of which is InfoVis.

Discuss: 1) What determines the spatial layout here? How would you do it? 2) Is Twitter's sequential display of the timeline a flaw? Would two (or more) dimensions help organize the timeline?

Vis and Africa

Infostate of Africa

Data outside the OECD is ripe for the picking. Find. Visualize. Publish!

20 Useful Visualization Libraries

20 Useful Visualization Libraries | A Beautiful WWW

Some of these are linked under Vis Resources here, but others are not and browsing through them provides lots of stimulation as well as potential tools you might want to know and use.

NYT: Charles M. Blow

Charles M. Blow - Columns - The New York Times

Charles Blow is the NYT's "Visual Op-Ed Columnist". His work appears every other Saturday. The pieces combine a graphic with text (sounds vaguely familiar). Blow's bio is also interesting as an example of a career path.

There is also a web Q&A with Blow here.

Discuss: 1) Do text and graphic work together? Are they both integral to the point? Could you get the point from either (in which case, why text or why graphic)? 2) Do Blow's graphics provide new insights into the data? 3) What do you think of the aspect ratio used in most of the graphics (tall and skinny)?

NYT: Naming Names

Naming Names - Interactive Graphic -

The NYT offers this visualization of the use of names in the presidential primary debates prior to the Iowa caucus.

This is an example of a Circos plot, a visualization tool originally developed for genomic data. The Circos website has numerous examples and details.

The basic idea has now been applied to many kinds of relational data. One example uses computer security data and also offers a technical discussion of "how did he do that".

Discuss: 1) How effective is this for directed paths? 2) Does the density of paths clearly illustrate who is a primary target? 3) Why does the NYT pick the quotes it does and why not others? 4) What would you add as features of this graphic?

NYT Graphics Director Steve Duenes

Steve Duenes -- Talk to the Newsroom -- The New York Times -- Reader Questions and Answers - New York Times

Newspaper graphics have often been the target of derision, with USA Today providing rich fodder for what not to do. The NYT has taken an aggressive role in improving its graphics content.

This online Q&A with the Times' graphics director is quite interesting.

Discuss: 1) How do NYT graphics measure up these days? 2) What proportion of their graphics are data visualizations and what proportion are merely illustrations? 3) How are they leveraging animation or interactive graphs on the web? 4) Does this translate to print? (Compare the same story on web and in print.) 5) Which graphics present "analysis" and which are merely "visual description"? Is there a difference?