Tuesday, October 14, 2008

Visualizing trees

We recently had a community summit at NESCent on directions for the future. As part of the biodiversity and phylogenetics breakout group (see notes here), one thing that came up was a need for better ways of visualizing trees [one good idea was inviting Ben Fry to NESCent to work on this]. Rod Page suggested putting tree visualization in TreeTapper as a category, which is a great idea, as there are dozens of programs (see partial list at Felsenstein's site). But why, with so many programs, do people feel that so much more work is needed?

Well, based on behavior, it's obvious that current solutions don't work. NESCent has a fairly sophisticated set of users and builders of trees. When they need to view a large tree (hundreds to thousands of taxa), they don't use any sort of cool tree stretching or zooming program -- they find an old Mac, open Paup in Classic, and print out a tree over multiple pages, which they then assemble using tape and scissors. I think the reason they do this comes down to the resolution of paper versus monitors (see some of Edward Tufte's books for a more general and informed discussion of this). My 1920 x 1200 pixel giant Apple Cinema display monitor (a perk loaned to all NESCent postdocs) can display at most 600 distinguishable horizontal lines (one pixel thick with one pixel between them). Our laser printer has a resolution of ~1200 DPI, suggesting that it could print this many lines in about one inch (I might be slightly off if there is some sort of constraint on dot geometry, but the basic idea still holds). By this calculation, my entire monitor display can be reproduced pixel by pixel in less than two square inches (1.6 x 1 inches). Plus, a printed tree can be arbitrarily large (Michael Donoghue described one several feet in diameter). The speed of visually parsing such a tree is related to how quickly the eye can move around a page: focus closely on one section to read the taxon names, jump to look at the overall picture, etc. On a screen, one would have to move the mouse around and wait for the screen to update. Even with tremendous zoom, there are only 1200 vertical positions my cursor can occupy (assuming cursor resolution == screen resolution) -- with a tree larger than this, there's no way to display even a cartoon of the whole tree on one screen in such a way that moving the mouse can select just one taxon (other than a nesting of zooms within zooms).
In contrast, even on an 11" high piece of paper, with half-inch margins and default line spacing, I can print a column of 155 taxon names (using 4 point Times font, about the size used on insect labels), all easily readable. Just a few pieces of paper provide much more resolution than is possible on a costly monitor. It's similar to the comparison between a paper map and a GPS navigation system (or Google Maps/Earth): in a given area, the paper map has much more detail. The advantage of the navigation system (besides the whole navigation bit) is that it allows you to zoom in and out for an unlimited amount of information. For looking at a tree, though, as for visualizing a trip, seeing the entire thing in great detail can be much easier than constantly zooming in and out.
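
For anyone who wants to check the arithmetic, here is a rough sketch in Python. The function names are made up for illustration; the numbers (1920 x 1200 pixels, ~1200 DPI, 11" page, half-inch margins, 4 point type, 155 names) come from the text above, and the only outside fact used is that there are 72 points in an inch. It ignores any constraints on printer dot geometry.

```python
POINTS_PER_INCH = 72  # standard typographic point


def max_separated_lines(pixel_rows: int) -> int:
    """Upper bound on one-pixel lines with a one-pixel gap between them."""
    return pixel_rows // 2


def printed_size_inches(pixels: int, dpi: int) -> float:
    """Length needed to reproduce `pixels` dots at `dpi` dots per inch."""
    return pixels / dpi


def printable_height_points(page_height_in: float, margin_in: float) -> float:
    """Usable column height in points after top and bottom margins."""
    return (page_height_in - 2 * margin_in) * POINTS_PER_INCH


# Monitor: 1200 pixel rows give at most 600 separated lines.
screen_lines = max_separated_lines(1200)

# Printing the whole 1920 x 1200 screen dot-for-dot at ~1200 DPI.
width_in = printed_size_inches(1920, 1200)
height_in = printed_size_inches(1200, 1200)
area_sq_in = width_in * height_in

# Paper: 155 names in a 10-inch column implies this line pitch.
column_pt = printable_height_points(11, 0.5)
implied_pitch_pt = column_pt / 155

print(f"separated lines on screen: {screen_lines}")
print(f"screen printed at 1200 DPI: {width_in} x {height_in} in ({area_sq_in} sq in)")
print(f"line pitch for 155 names: {implied_pitch_pt:.2f} pt "
      f"({implied_pitch_pt / 4:.2f}x the 4 pt font size)")
```

The implied pitch works out to a bit over 4.6 points per line, which is in the range one would expect for default single spacing of 4 point type.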

One solution is to have even higher resolution displays (see Mike Sanderson's wall of monitors [taken from his web page] at the end of this post, for example). I think that there will be limits to this, and we might have to wait for other fields to advance first. Instead, it might be worthwhile to work on better ways to print out large trees on tiled pieces of paper. Imagine software that takes a set of bootstrap trees and creates a PDF you can print out and stitch together, showing support values and branch lengths, perhaps with reconstructions of characters in color at the nodes, too. It seems appallingly low-tech and yet rather useful. As far as I know, Classic Paup is the only program that can do this, though Mesquite has some options for tree printing that might allow this [this will be much easier to know once this section of TreeTapper's database has been filled in]. The downside is that this doesn't do much to help with the issue of displaying trees in papers, but there, perhaps some summary graphic would work better.
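
To give a feel for how many sheets such a tiled printout would need, here is a minimal sketch in Python. It is not how Paup or Mesquite do it; the function names, the US Letter page size (8.5" x 11"), the 5 point line pitch, and the 20 inch tree width in the example are all assumptions for illustration. It only works out the page grid and which tips land on which row of pages; actually drawing the tree is left out.

```python
import math
from typing import NamedTuple

POINTS_PER_INCH = 72


class TileGrid(NamedTuple):
    rows: int  # pages stacked top to bottom
    cols: int  # pages side by side


def taxa_per_page(line_pitch_pt: float, page_h_in: float = 11.0,
                  margin_in: float = 0.5) -> int:
    """Whole taxon labels that fit in one page's printable height."""
    printable_pt = (page_h_in - 2 * margin_in) * POINTS_PER_INCH
    return int(printable_pt // line_pitch_pt)


def tile_grid(n_taxa: int, line_pitch_pt: float, tree_width_in: float,
              page_w_in: float = 8.5, page_h_in: float = 11.0,
              margin_in: float = 0.5) -> TileGrid:
    """Page grid needed for one tip per line and a tree of the given width."""
    per_page = taxa_per_page(line_pitch_pt, page_h_in, margin_in)
    rows = math.ceil(n_taxa / per_page)
    cols = math.ceil(tree_width_in / (page_w_in - 2 * margin_in))
    return TileGrid(rows, cols)


def taxa_on_row(row: int, n_taxa: int, line_pitch_pt: float,
                page_h_in: float = 11.0, margin_in: float = 0.5) -> range:
    """Indices (0-based, top to bottom) of the tips printed on a page row."""
    per_page = taxa_per_page(line_pitch_pt, page_h_in, margin_in)
    start = row * per_page
    return range(start, min(start + per_page, n_taxa))


# Example: 1000 taxa, 5 pt between labels, tree drawn 20 inches wide.
grid = tile_grid(n_taxa=1000, line_pitch_pt=5.0, tree_width_in=20.0)
print(f"{grid.rows} x {grid.cols} pages = {grid.rows * grid.cols} sheets")
last_row = taxa_on_row(grid.rows - 1, 1000, 5.0)
print(f"last page row holds tips {last_row.start} to {last_row.stop - 1}")
```

A real tool would also want some overlap between neighboring sheets so they are easier to line up with tape, which would reduce the usable area per page a little.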

image from http://loco.biosci.arizona.edu


MyTerciopelo said...

I don't know if you are already familiar with this, but there is a method of printing called rasterbation (I know, it's a weird name), and it can basically take any picture and distribute it over an n x n grid of pages. Here is a link if you're interested: http://homokaasu.org/rasterbator/
I'm not sure how well this would actually work, because it usually pixelates the image. However, that may be adjustable.

Brian O'Meara said...

Thanks for the link. Looks like it might be a nice solution.