Hive plots and hairballs

Posted on 02/05/2011 by


Earlier, I covered Hierarchical Edge Bundles as a neat way of simplifying the presentation of network relations.  Another way of simplifying the visualization of large social networks, which I found today, is the Hive Plot by Martin Krzywinski.  The problem he addresses is that representations of social networks tend to get bogged down in uselessly complex tangles of relationships:

Conventional network visualization is unsuitable for visual analytics of large networks. So-called hairballs earn their moniker by becoming impenetrably complex as your network grows. They are least effective when visualization is most needed — for large networks…More critically, data in conventional visualizations is subordinate to layout — node and edge positions and lengths depend as much on the layout algorithm, as on the data.

(The problem)

His solution is to position nodes based only on the meaningful properties of the network that you (rather than a random layout algorithm) decide:

If the layout shows a pattern, you can be sure it is due to structure in the underlying data and not on the layout algorithm’s interpretation of how the data should best be shown.

You start with this:

and get this:

Like parallel coordinates plots, there is a periodic element to the graph; however, as Krzywinski points out, the Hive Plot wraps around (and I think it represents data much more effectively in certain contexts).
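To make the layout rule concrete, here is a minimal sketch of hive-plot node placement. This is my own illustration, not Krzywinski's implementation: the choice of three axes, the degree-class axis assignment, and the radial scaling are all assumptions made for the example. Each node goes on an axis chosen by one structural property (here, a degree class) and sits at a radial position along that axis determined by another (here, the degree itself), so any pattern you see comes from the data, not from a layout algorithm.

```python
import math

# Toy network: a list of undirected edges.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]

# Compute each node's degree.
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

def axis_for(deg):
    """Assign an axis (0, 1, or 2) by degree class: low, medium, high.
    The class boundaries here are arbitrary, chosen for the toy data."""
    if deg <= 1:
        return 0
    if deg == 2:
        return 1
    return 2

max_deg = max(degree.values())

def position(node):
    """Place a node on its axis: angle from the axis index (three axes,
    120 degrees apart), radial distance scaled by the node's degree."""
    axis = axis_for(degree[node])
    angle = axis * 2 * math.pi / 3
    r = 1 + 2 * degree[node] / max_deg
    return (r * math.cos(angle), r * math.sin(angle))

for node in sorted(degree):
    x, y = position(node)
    print(f"{node}: axis {axis_for(degree[node])}, pos ({x:.2f}, {y:.2f})")
```

Edges would then be drawn as curves between these fixed positions; the point of the technique is that the positions themselves never move when you re-run the layout.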

(Via Kovcomp)

It is elegant, simple, and efficient.  A variation is presented below using World Economic Forum data:

(A variation of the Hive Plot using World Economic Forum data. Via

I actually see the problem with network representations as their tendency to get so complex that you have to resort to reading their accompanying texts.  Although Krzywinski provides an accompanying key, you don't have to consult the data table unless you want a more detailed explanation.  Otherwise, you can draw out lots of relevant information directly.  I was thinking about it this way: in a univariate OLS regression plot, you see directionality and slope, and intuitively draw conclusions about correlation and causation (don't lie to yourself, yes you do).  Although the type of information being presented here is different, it offers an example of what I mean by visual data compression: just by thinking about the various ways of working with shapes, you get a more efficient encoding that conveys more information, faster.  Thanks Krzywinski!