- Lose the base, connect the dot, and confuse the message
- More power brings more responsibility
- Light entertainment: others' idea of fun
- Rotating circle, loose ends, on-line dashboards and charts
Reader Jack S. sent over this chart (link):

The first problem readers encounter with this image is "What is MMI?" I like to think of any presentation as a set of tearout pages. Even if the image is part of a book, or part of a deck of slides, once it is published, the writer should expect readers to tear a sheet out and pass it along. In fact, you'd love to have people pass along your work. This means that when creating a plot such as this, the designer must explain what MMI is in the footnote. Yes, on every chart, even if every chart in the report deals with MMI. MMI, I'm told, is some kind of metric of health care cost.

***

What a mess. They are trying to use the metaphor of "measuring one's temperature", which I suppose is cute because MMI measures health care costs.

Next, the designer chose to plot the index against the national average as opposed to the dollar amount of MMI. This presents a challenge since the thermometer does not have a natural baseline number. This is especially true on the Fahrenheit scale used in the U.S.

Then, a map is introduced to place the major cities. The bulb of each thermometer now doubles as a dot on the map. This step is mind-boggling because the city labels aren't even on the map. So if you know where these cities are, you don't need the map for guidance, but if you don't know the locations, you're as hopeless as before.

How the data gets onto this complex picture requires some deconstruction. First, start with a bar chart of the relative index (the third column of the table shown above). Then, chop off the parts below 85 (colored gray). Next, identify the cities that are below the national average (i.e. index < 100) and color them blue. You can see this by focusing only on the chart above the map. In other words, this part:

To get from here to the version published, add a guiding line from each bar to the dot on the map for the corresponding city.
Notice that a constant-length portion of each bar has been chopped off, and now each bar is augmented by some additional length that varies with the distance of the bar chart from the geographical location of the city as shown on the map below. For instance, Miami, which is furthest south, has the biggest distortion.

***

The choice of 85 as a cutoff is arbitrary and inexplicable. If we really want to create a "cutoff" of sorts, we can use 100, which represents the national average. By plotting the gap between the city index and the national index (effectively, the percent difference), we can also use the sign of the difference to indicate above/below the national average, thus saving a color.

***

One of the most telling signs of a failed chart is the appearance of the entire data set next to the chart. That's the essence of the self-sufficiency test.
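The idea of plotting the gap from the national average, rather than a bar chopped at 85, can be sketched in a few lines of Python. This is only an illustration: the city names and index values below are invented, not taken from the MMI table, and `gap_from_national`, `sorted_gaps` and `render` are hypothetical helper names.

```python
# Hypothetical index values relative to a national average of 100.
# City names and numbers are invented for illustration.
NATIONAL_AVERAGE = 100.0
city_index = {"City A": 112.0, "City B": 97.5, "City C": 100.0, "City D": 88.0}


def gap_from_national(index, baseline=NATIONAL_AVERAGE):
    """Percent difference between a city's index and the baseline."""
    return (index - baseline) / baseline * 100.0


def sorted_gaps(indices):
    """Return (city, gap) pairs, from furthest above to furthest below average."""
    gaps = [(city, gap_from_national(value)) for city, value in indices.items()]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


def render(gaps, half_width=15):
    """Draw a text diverging bar chart centered on the national average."""
    lines = []
    for city, gap in gaps:
        bar = "#" * min(half_width, round(abs(gap)))
        # Bars below average extend left of the axis, bars above extend right.
        left = bar.rjust(half_width) if gap < 0 else " " * half_width
        right = bar.ljust(half_width) if gap > 0 else " " * half_width
        lines.append(f"{city:<8}{left}|{right} {gap:+.1f}%")
    return lines


if __name__ == "__main__":
    for line in render(sorted_gaps(city_index)):
        print(line)
```

Because the direction of each bar already tells the reader whether a city is above or below the national average, no second color is needed, and every bar starts from the same meaningful baseline.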
Nick C. on Twitter sent us to the following chart of salaries in Major League Soccer. (link)

This chart is hosted at Tableau, which is one of the modern visualization software suites. It appears to be a user submission. Alas, more power did not bring more responsibility.

Sorting the bars by total salary would be a start.

The colors and subsections of the bars were intended to unpack the composition of the total salaries, namely, which positions took how much of the money. I'm at a loss to explain why those rectangles don't seem to be drawn to scale, or what it means to have rectangles stacked on top of each other. Perhaps it's because I don't know much about how the cap works.

Combined with the smaller chart (shown below), the story seems to be that while all teams have similar cap numbers, the actual salaries being paid could differ by multiples.

***

This is the standard stacked bar chart showing the distribution of salary cap usage by team:

I have never understood the appeal of stacking data. It's not easy to compare the middle segments.

After quite a bit of work, I arrived at the following:

The MLS teams are divided into five groups based on how they used the salary cap. Salary cap figures are converted into proportions of the total cap. For example, the first cluster includes Chicago, Los Angeles, New York, Seattle and Toronto, and these teams spread the wealth among the D, F, and M players while not spending much on goalie and "others". On the other hand, Groups 2 and 3, especially Group 3, allocated 30-45% of the cap to the midfield.

Three teams form their own clusters. CLB spends more of its cap on "others" than any other team (others are mostly hyphenated positions like D-F, F-M, etc.). DAL and VAN spend a lot less on midfield players than other teams. VAN spends a lot on defense.

My version has many fewer data points (although the underlying data set is the same) but it's easier to interpret.

***

I tried various chart types like bar charts, and even pie charts.
I still like the profile (line) charts best. In modern software (I'm using JMP's Graph Builder here), it's only one click to go from line to bar, and one click to go to pie.
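The conversion of salary cap figures into proportions of each team's total cap, which underlies the profile charts described above, might look like the following Python sketch. The team names and dollar amounts are invented, and `to_proportions` and `profile_distance` are hypothetical helpers. The post does not say how the five groups were formed, so the distance measure here is only one assumed way of comparing two profiles.

```python
POSITIONS = ["D", "F", "G", "M", "Other"]

# Hypothetical cap dollars by position; teams and amounts are invented.
cap_by_team = {
    "Team A": {"D": 800_000, "F": 900_000, "G": 150_000, "M": 850_000, "Other": 100_000},
    "Team B": {"D": 600_000, "F": 500_000, "G": 200_000, "M": 1_200_000, "Other": 100_000},
}


def to_proportions(cap):
    """Convert dollar amounts by position into shares of the team's total cap."""
    total = sum(cap.values())
    return {pos: cap.get(pos, 0) / total for pos in POSITIONS}


def profile_distance(p, q):
    """Largest gap in share across positions (an assumed similarity measure)."""
    return max(abs(p[pos] - q[pos]) for pos in POSITIONS)


profiles = {team: to_proportions(cap) for team, cap in cap_by_team.items()}

if __name__ == "__main__":
    for team, profile in profiles.items():
        shares = "  ".join(f"{pos}: {share:5.1%}" for pos, share in profile.items())
        print(f"{team}  {shares}")
    print(f"distance: {profile_distance(profiles['Team A'], profiles['Team B']):.3f}")
```

Working with shares rather than dollars puts every team on the same 0-100% scale, so teams with similar allocation patterns trace similar lines regardless of how much they spend in total.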
This chart (link): I think it's a line chart, not an area chart.
There is a tendency when producing dashboards to go for the cutesy-cutesy. Reader Daniel L. came across an attempt by Facebook to document its data center metrics (link). They chose this circular, spiraling design:

Notice that the lines of equal distance on a circular plot are the concentric circles. Thus, when they connect different points in a continuous way, as if it were a standard line chart, the line segments between data points are distorted. The diagram below shows the problem:

One potential advantage (although not worthwhile) of wrapping the data into a circle is that the 24 hours become a continuous line. Except that it isn't the case here! Weirdly, the purple and blue lines show a huge discontinuity at the ray that points vertically upwards from the origin.

This leads to an even more fascinating find. The circle actually rotates! It's like a rotating restaurant. The time shown vertically pointing upwards keeps changing as I write this post. This makes the discontinuity even more baffling. You'd think the previous data point just shifts anti-clockwise but apparently not. If any of you can figure this out, please leave a comment.

***

As Daniel pointed out, the traditional line charts shown in the bottom half of the page would have done the job with less fuss. Not as eye-catching, but not as baffling either.

***

One innovation of on-line charts is the replacement of axis labels with mouse-over effects. Mousing over the chart here produces the underlying data values. This is elegance.

One horrible trend with on-line charts is the horrendous choice of scale. Look at the top two charts, especially the orange line chart about power usage. It makes no sense to choose a scale that completely annihilates the underlying fluctuations. I have found the same problems with many Google charts. It looks as if nothing is happening, but when you look more closely, you learn that a tiny distance represents a big percentage shift in the underlying data.
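The distortion of straight line segments on the circular plot, discussed earlier in this post, can be checked numerically. The Python sketch below assumes that the data value is encoded as distance from the center and that time is encoded as angle on a 24-hour dial. `polar_to_xy` and `radius_along_chord` are hypothetical helper names, and the two readings are invented.

```python
import math


def polar_to_xy(radius, angle):
    """Convert a polar point (radius, angle in radians) to x, y coordinates."""
    return radius * math.cos(angle), radius * math.sin(angle)


def radius_along_chord(r1, a1, r2, a2, t):
    """Distance from the center of the point a fraction t along the straight
    segment joining two polar points."""
    x1, y1 = polar_to_xy(r1, a1)
    x2, y2 = polar_to_xy(r2, a2)
    x = x1 + t * (x2 - x1)
    y = y1 + t * (y2 - y1)
    return math.hypot(x, y)


# Two invented readings with the same value (radius 1.0),
# taken 3 hours apart on a 24-hour dial.
angle_gap = 2 * math.pi * 3 / 24
midpoint_radius = radius_along_chord(1.0, 0.0, 1.0, angle_gap, 0.5)

if __name__ == "__main__":
    print(f"value at both readings:     {1.0:.4f}")
    print(f"apparent value at midpoint: {midpoint_radius:.4f}")
```

Even though the two readings are identical, the straight segment between them dips toward the center: by elementary geometry, the midpoint of the chord sits at radius r·cos(Δθ/2) rather than r. A faithful rendering of "no change" would follow the arc of the concentric circle, not the chord.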