Teaching visualization

This is Jessica. I’ve been teaching visualization for years, to computer science students, informatics students, and occasionally journalism students, and I recently overhauled the curriculum, in part to focus a little more on ‘visualizations as model checks’ and on visualization for decision making.

Previously I had been working from an outline of topics I’d gotten from teaching with my postdoc mentor Maneesh Agrawala, which had originated from Pat Hanrahan’s course at Stanford and I think included some materials from Jeff Heer. It was a very solid basis, so beyond adding some topics that weren’t previously represented (visualizing uncertainty, visualization to convey a narrative), I hadn’t really messed with it. But there was one issue, which seems common to lots of visualization course designs I’ve seen: around midway through the quarter we’d leave the foundational material like the grammar of graphics and effectiveness criteria behind and start traversing the various subareas of visualization focused on different data types (networks, time series, hierarchies, uncertainty, etc.). Every time I taught it, I’d watch the excitement students felt when they realized there are more systematic ways to think about what makes a visualization effective fade as the second half of the course devolved into a grab bag of techniques for different types of data.

The new plan came about when I was approached through my engineering school to work with an online company to develop a short (8-week) online course on visualization, which is now open and running multiple times a year. I agreed, and suddenly I had professional learning designers interested in trying to identify a good curriculum, based on my guidance, my existing course materials, and other research I sent their way. Sort of a power trip for a faculty member, as there is nothing quite like having a team of people go off and find the appropriate evidence to back up your statements with references you’d forgotten about.

Anyway, I am pretty happy with the result in terms of progression of topics: 

The purpose of visualization.  Course introduction. Covers essential functions of visualizations as laid out in work on graph comprehension in cognitive psych (affordances like facilitating search, offloading cognition to perception/freeing working memory), as well as an overview of the types of challenges (technical, cognitive, perceptual) that arise. I have them read a paper by Mary Hegarty and the Anscombe paper where he uses plots to show how regression coefficients can be misleading.
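
A minimal sketch of the Anscombe point in Python (using seaborn’s bundled copy of the quartet; not part of the course materials): the four datasets share nearly identical summary statistics and regression lines, but plotting them makes the differences obvious.

```python
# Anscombe's quartet: identical summary statistics, very different plots.
import seaborn as sns
import matplotlib.pyplot as plt

df = sns.load_dataset("anscombe")  # columns: dataset, x, y

# Nearly identical numeric summaries across the four datasets...
for name, group in df.groupby("dataset"):
    slope = group["x"].cov(group["y"]) / group["x"].var()
    intercept = group["y"].mean() - slope * group["x"].mean()
    print(f"{name}: mean_y={group['y'].mean():.2f}, "
          f"r={group['x'].corr(group['y']):.3f}, "
          f"fit: y = {intercept:.2f} + {slope:.2f}x")

# ...but plotting immediately reveals how different they are.
sns.lmplot(data=df, x="x", y="y", col="dataset", col_wrap=2, ci=None, height=2.5)
plt.show()
```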

Data fundamentals and the visualization process. How visualization can fit into an analytical workflow to inform what questions get posed, what new data get collected, etc., and the more specific “visualization pipeline”, i.e., the process by which you translate some domain-specific question into a search for the right visual representation. Also levels of measurement (how do we classify data abstractly in preparation for visualizing data) and basic tasks by data types, i.e., what types of questions can we ask when we visualize data of different dimensions/types.

Visualization as a language. Introduction to Jacques Bertin’s view, where visual representations are rearrangeable, with different arrangements exposing different types of relationships, not unlike how words can be. Characterizing visual or “image space” in terms of marks and encoding functions that map from data to visual variables like position, size, lightness, hue, etc. Semantics of visual variables (e.g., color hue is better for associating groups of data, position naturally expresses continuous data). The grammar of graphics as a notational system for formalizing what constitutes a visualization and the space of possible visualizations. The idea of, and toolkits for, declarative visualization and why this is advantageous over programmatic specifications.
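
To make the declarative idea concrete, here is a minimal sketch using Altair, a Python binding for Vega-Lite (the cars example dataset comes from the vega_datasets package; the course itself uses other tools): the chart is specified as data plus a mark plus encodings from fields to visual variables, and the toolkit works out scales, axes, and legends.

```python
# A declarative specification: mark + encodings, no drawing code.
import altair as alt
from vega_datasets import data

cars = data.cars()  # example dataset bundled with vega_datasets

chart = alt.Chart(cars).mark_point().encode(
    x="Horsepower:Q",        # position encodes a quantitative field
    y="Miles_per_Gallon:Q",  # position on the other axis
    color="Origin:N",        # hue associates nominal groups
)
chart.save("cars.html")  # or display inline in a notebook
```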

Principles for automated design. The problem of finding the best visualization for a set of n variables where we know the data types and relative importance of them, but must choose a particular set of encodings from the huge space of possibilities. Heuristics/canonical criteria for pruning and ranking alternative visualizations for a given dataset (taken from Jock Mackinlay’s canonical work on automated vis): expressiveness (do the encoding choices express the data and only the data?), effectiveness (how accurately can someone read the data from those encodings?), importance ordering (are the more accurate visual channels like position reserved for the most important data?). Graphical perception experiments to learn about effectiveness of different visual variables for certain tasks (e.g., proportion judgment), pre-attentiveness, how visual variables can interact to make things harder/easier, types of color scales and how to use them.
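
A toy sketch of the search framing (not Mackinlay’s actual APT system; the channel ordering and variables below are made up for illustration): enumerate assignments of variables to channels, filter out inexpressive ones, and rank the rest so that the most effective channels go to the most important variables.

```python
# Toy sketch of automated design as search: assign variables to channels,
# drop inexpressive assignments, rank the rest by an effectiveness ordering.
from itertools import permutations

# Channels ranked by effectiveness for quantitative data, best first
# (roughly following the Cleveland/McGill ordering).
CHANNELS = ["position_x", "position_y", "length", "angle", "area", "lightness"]

# Hypothetical variables, listed in order of importance.
variables = [
    {"name": "price", "type": "quantitative"},
    {"name": "weight", "type": "quantitative"},
    {"name": "rating", "type": "quantitative"},
]

def expressive(var, channel):
    # Toy expressiveness rule: these channels can all show order/magnitude,
    # so any quantitative variable passes; a real system checks much more.
    return var["type"] == "quantitative"

def score(assignment):
    # Lower is better: important variables (early in the list) should land
    # on channels that appear early in the effectiveness ranking.
    return sum(
        (len(variables) - i) * CHANNELS.index(ch)
        for i, (_, ch) in enumerate(assignment)
    )

candidates = [
    list(zip(variables, combo))
    for combo in permutations(CHANNELS, len(variables))
    if all(expressive(v, c) for v, c in zip(variables, combo))
]

for var, ch in min(candidates, key=score):
    print(f"{var['name']} -> {ch}")
# prints: price -> position_x, weight -> position_y, rating -> length
```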

Multidimensional data analysis and visualization. The challenges of visually analyzing or communicating big multidimensional datasets. Scatterplot matrices, glyphs, parallel coordinate plots, hierarchical data, space-filling/space-efficient techniques, visualizing trees, time series (ok it gets a little grab baggy here).   
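
As a minimal sketch of two of these techniques, here is pandas’ built-in plotting (the iris data is just a stand-in for a data frame with several quantitative measures and one categorical column):

```python
# Minimal sketch: scatterplot matrix (SPLOM) and parallel coordinates
# using pandas' built-in plotting helpers.
import matplotlib.pyplot as plt
import seaborn as sns
from pandas.plotting import scatter_matrix, parallel_coordinates

iris = sns.load_dataset("iris")  # four quantitative measures + species

# SPLOM: every pairwise projection of the measures.
scatter_matrix(iris.drop(columns="species"), diagonal="kde", figsize=(6, 6))

# Parallel coordinates: each case is a polyline across the measure axes,
# colored by category; good for spotting clusters across many measures.
plt.figure(figsize=(6, 4))
parallel_coordinates(iris, class_column="species")
plt.show()
```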

Exploratory data analysis. Returns to the idea of visualization in a larger analysis workflow to discuss stages of exploratory analysis, iterative nature (including breadth and depth first aspects), relation to confirmatory analysis and potential for “overfitting” in visual analysis. Statistical graphics for examining distribution, statistical graphics for judging model fit. The relationship between visualizations and statistical models (the students do a close read of Becker, Cleveland, and Shyu). 
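
A minimal sketch of the “graphics for judging model fit” idea, assuming an ordinary least-squares fit to simulated data: plot residuals against fitted values and look for structure the model missed.

```python
# Minimal sketch: residuals vs. fitted values as a visual model check.
# A straight-line fit to mildly curved data; the residual plot makes the
# systematic structure the line missed immediately visible.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 200)
y = 1.0 + 0.5 * x + 0.15 * (x - 5) ** 2 + rng.normal(0, 0.5, x.size)

slope, intercept = np.polyfit(x, y, 1)  # fit a straight line anyway
fitted = intercept + slope * x
residuals = y - fitted

fig, axes = plt.subplots(1, 2, figsize=(9, 3.5))
axes[0].scatter(x, y, s=10)
axes[0].plot(x, fitted, color="black")
axes[0].set_title("Data with linear fit")

axes[1].scatter(fitted, residuals, s=10)
axes[1].axhline(0, color="black", linewidth=1)
axes[1].set_title("Residuals vs. fitted: curvature = misfit")
plt.show()
```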

Communication and persuasion. Designing visualizations for communication. Narrative tactics (highlighting, annotation, guided interactivity, etc.) and genres (poster-style, step-through, start guided and end with exploration, etc.). Visualization design for communication as an editorial process with rhetorical goals. The influence of how you scale axes and other encodings, and how you transform/aggregate data, on the message the audience takes away (there are lots of good examples here, from climate change axis-scaling debates, maps used to persuade, etc.).
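
A minimal sketch of the axis-scaling point, with made-up numbers purely to show the effect: the same small change looks modest or dramatic depending on where the y-axis starts.

```python
# Minimal sketch: same data, two y-axis baselines, very different "messages".
import matplotlib.pyplot as plt

years = [2018, 2019, 2020, 2021, 2022]
values = [61.2, 61.5, 61.9, 62.4, 62.8]  # hypothetical slowly rising series

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3.5))

ax1.plot(years, values, marker="o")
ax1.set_ylim(0, 70)   # baseline at zero: the change looks modest
ax1.set_title("y-axis from 0")

ax2.plot(years, values, marker="o")
ax2.set_ylim(61, 63)  # truncated axis: the same change looks dramatic
ax2.set_title("y-axis truncated")

plt.show()
```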

Judgment and decision making. Sources of uncertainty (in data collection, model selection, estimation of parameters, rendering the visualization) and various arguments for why it’s important to be transparent about them. Ways to define and measure uncertainty. Heuristics as tactics for suppressing uncertainty when making decisions from data. Techniques for visualizing uncertainty, including glyph-based approaches (e.g. error bars, box plots), mappings of probability to visual variables (e.g., density plots, gradient plots), and outcome or frequency-based encodings (animated hypothetical outcomes, quantile dotplots, ensemble visualization). Evaluating the effectiveness of an uncertainty visualization.  
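
A minimal sketch of one frequency-based encoding, the quantile dotplot (a toy implementation assuming a normal predictive distribution): take a small number of evenly spaced quantiles and stack them as dots, so probability can be read off as a count of discrete outcomes.

```python
# Minimal sketch: a quantile dotplot for an assumed normal predictive
# distribution. 50 evenly spaced quantiles are binned and stacked, so
# "how likely is a value below x?" becomes "count the dots left of x".
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

dist = stats.norm(loc=30, scale=8)      # assumed predictive distribution
probs = (np.arange(50) + 0.5) / 50      # 50 evenly spaced probabilities
quantiles = dist.ppf(probs)             # the corresponding outcomes

bin_edges = np.linspace(quantiles.min(), quantiles.max(), 21)
centers = (bin_edges[:-1] + bin_edges[1:]) / 2
which_bin = np.clip(np.digitize(quantiles, bin_edges) - 1, 0, len(centers) - 1)

xs, ys = [], []
for b in np.unique(which_bin):
    count = int((which_bin == b).sum())
    xs.extend([centers[b]] * count)   # dots share the bin's x position
    ys.extend(range(1, count + 1))    # and stack upward within the bin

plt.scatter(xs, ys, s=60)
plt.xlabel("Predicted value")
plt.ylabel("Dot stack (out of 50 quantiles)")
plt.title("Quantile dotplot (toy sketch)")
plt.show()
```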

Naturally there are activities throughout, and some applied stuff with Tableau, which I didn’t really have a choice about, but I like Tableau among GUI visualization tools so that was fine for the purposes of this online version (I’ll probably still be teaching D3.js or maybe Observable in my CS course). Also there are some interviews with guest experts, including Steve Franconeri on perceptual and cognitive challenges, Dominik Moritz on technical challenges, Danielle Szafir on perception, and Jen Christiansen on conveying a narrative while communicating uncertainty.

I’m especially happy I was able to work in content that isn’t as typical to cover in visualization courses, at least in the computer science/informatics paradigm I’m used to. This includes talking more about the relationships between statistical models and visualization, the need for awareness of where we are in the larger (inferential) data analysis process when we use visualization, and the way axis scaling influences judgments of effect size. And decision making under uncertainty has become a relatively big portion as well, along with a more explicit than usual discussion of visualization as a language and the grammar of graphics.

There are some major omissions – since this course is aimed more at analysts than the typical computer science student and I had a limited amount of space, the idea of interaction and how it augments visual analysis is implied in specific examples but not directly discussed as a first order concern. I usually talk about color perception more. And then there’s the omission of the laundry list of data types/domains: text, networks, maps, but I certainly won’t miss them. 

There are ways it could be further improved I think. When I talk about communication it would be good to bring in more observations from people who are experts on this, like data journalists. I recall Amanda Cox and others occasionally talking about how interactivity can fail, how mobile devices have killed certain techniques, etc. Relatedly, more on responsive visualization for different display sizes could be useful. 

Also, I would love to connect the model check idea more specifically to the communication piece. I did this in a paper once, where I basically concluded that if you see a visualization as a model check, then you can communicate better if you design it to specify the model rather than making the audience work hard to figure that out. But I think there’s a lot more to say about the subtle ways graphs convey models and influence what someone concludes by suggesting, for instance, what a plausible or important effect is. This is implicit in some materials but could be discussed more directly.

PS: This course is publicly available (it’s distinct from the course I teach in computer science at Northwestern, enrollment info is here).

35 thoughts on “Teaching visualization”

  1. 1. Don’t read the pretentious men like Tufte; read Naomi Robbins’s wonderful little book instead:
    https://www.nbr-graphs.com/

    2. If you have time series data, or a similar plot with > 3 variables, coding the variables by color is bad; use solid lines, dashed lines, etc.

    3. Cartograms at the state or county level: e.g., when talking about US elections, if you’re not using a cartogram you’re not doing it right.

    4. Shouldn’t have to say this, but not just units on the axis: clear, simple, easy-to-understand units.

    • Ezra:

      Naomi Robbins is great but no need to put down Tufte! I don’t agree with everything he’s written, but he’s done some great things, not least to put beautiful statistical graphs on the agenda.

      • +1
        I realized that data visualization is a serious business after attending Tufte’s (paid) seminar around 2008. It was a great presentation! I later disregarded some of the stuff he wrote but he really got me interested in data viz much like Taleb got me interested in probability (after I read Fooled by Randomness).

    • On criticizing Tufte, I think some people miss some of his more valuable teachings by focusing on specifics. Let’s take his experiments with maximizing data ink. I frequently see criticism of the results of the bar chart experiments in his original text, for example, claims that his final version is not easier to read. I believe it doesn’t matter whether that particular graph was improved or not, and that’s not what he’s really trying to say. The value is in understanding that we can, and should, experiment.

      • Ssp:

        It’s complicated. Tufte has written some beautiful books and had important insights with lots of good advice. Some of his advice, such as the data-ink thing, has been overstated. But it could be that Tufte’s ultra-confident presentation has been important in persuading people to think seriously about graphics. That is, it might be hard to separate the effectiveness of his approach from the occasional overconfidence.

        I feel about Tufte kind of like how I feel about Strunk and White: they’re all great; we should value them for their strengths and their huge positive impact. We should also recognize their weaknesses but not let that overwhelm our judgment of their more important strengths.

        • 1. I think people tend to ignore that Tufte intentionally included “within reason” in Visual Display. It’s true that most people find his final revision of the standard plots kind of obscurantist, but that’s fine–his concept of “within reason” is different than most people’s. The point is that by exploring along the lines of the general principles, you’ll find improvements until you don’t. With each plot that he redesigned, there was some midpoint along his series of revisions that I found to be a massive improvement over the original standard.

          2. I also think a lot of people ignore the context in which he wrote. The 80s and 90s were the heyday of corporate nonsense visualization, with 3D bar charts and animated scene transitions in every PowerPoint presentation. There’s still a lot of bad, but a lot has gotten better, and I think pressing the point of interrogating for every glyph “why draw this” helped us to get here. Modern computer graphics have also changed a lot–pure white on a screen is not the same “neutral” or “blank” as blank space on a white page, so literal “data ink ratio” doesn’t make sense anymore. Visual Display came out 24 years before the iPhone; if you situate the writing that way and understand what he was getting at, you can get a lot more out of it.

  2. You said the part on multidimensional data viz gets grab-baggy. I’m curious: Why don’t you approach that with the same focus on principles and visualization-as-language that you do for 1D/2D data? Is it that there’s less research on this, or is it just that some things just need to get covered? (Or something else, of course!)

    • That’s a good point. It’s not entirely unprincipled, i.e., much of that module is organized around different ways of addressing challenges that you run into when you have higher dimensional or otherwise complex data: use layout more creatively (SPLOMs or parallel coordinate plots), switch to space-filling approaches, etc. But it’s not as principled, I guess, which is why I called it grab baggy. There’s a certain usefulness to making students aware of various techniques that are less widely used in the public sphere but can go a long way for certain datasets, even if the only way to relate them to each other is, ‘this technique makes this task easier’. Parallel coordinate plots are a good example – they can seem scary to people at first, but for exploring data with lots of measures (in interactive form) they can be the best thing.

      • I would love to see/be pointed to some examples of plots of high-dimensional (esp. categorical) data that carefully explain *what new insights we gained from this particular approach to plotting the data*. In my experience many of the examples basically arrive at “we can plot Titanic/wine/olive oil data with (parallel coordinates/mosaic/Chernoff faces/etc.), see, it looks cool”. I think I would except ordination techniques (PCA etc.), but that’s really just dimension *reduction* …

        • Yes, that section of the course involves a lot of emphasis on ‘you should not use this if your data doesn’t have these properties.’ Treemaps are a good example – they should be mostly avoided as far as I’m concerned unless you really care about seeing proportion within a reasonably complex hierarchy.

          For some of these it was interesting to try to figure out how to make them in Tableau… some of the online tutorials on how to make things like parallel coordinate plots or treemaps or horizon charts in Tableau use datasets that I would not personally endorse these techniques for.

          As an aside, among visualization researchers there’s sort of an understanding that the field went through some periods where the hot thing was to create complicated new (often circular layout) ways to show data, many of which are of marginal usefulness. I think/hope the field has moved past that.

        • > see, it looks cool
          I once pointed out to students the possible confounding between what is considered a good visualization and what is a good story. To me, that does seem to be a concern not to be overlooked.

      • Can’t each element of the grab bag be motivated by the earlier principles? For example, for a multidimensional dataset, we want to

        1. Explore a potential variate-covariate regression model
        2. Minimize the visual distortion inherent to projections onto a 2D plane
        3. Be able to express and evaluate many potential relationships quickly or simultaneously

        This leads naturally to the matrix of scatterplots, which itself motivates the grammar of graphics’ “frame” concept.

        This reminds me of when I was taught about data structures and algorithms. We were taught to care about

        1. Average asymptotic runtime complexity
        2. Worst case asymptotic runtime complexity
        3. Cache locality
        4. Memory consumption

        Then looking at various sorting algorithms, we could motivate each in terms of their relative strengths and weaknesses. Heapsort does well at 1, 2, and 4 but has terrible cache locality; quicksort does well at 1, 3, and 4 but has a bad worst case; merge sort does well at 1-3 but uses extra memory. They would also each motivate the data structures we had been previously taught.

  3. Great post! I think maybe the most valuable thing about my own teaching of visualisation is to put aside a lot of time to let the students do graphs on data without too detailed a specification of what to do, and then to discuss what they come up with; ideally the students also contribute to discussing other people’s visualisations and learn to look critically and to learn from what others do, even if it is problematic. I think they learn most when they see how others misinterpret their graphs, or when they realise why an idea that somebody else has used is either great or doesn’t work for their own understanding of the data.

    • Yes, definitely. My first assignment is to give them a dataset that presents some typical challenges – data on different/skewed scales, multiple quantitative and categorical variables, some hierarchy – and ask them to create a single visualization that captures what they think is the most important structure or ‘takeaways’ in the data (and it needs to be something that someone else can interpret without having them there to explain it). They have to discuss them in small groups, submit the feedback they got along with any critiques they want to give themselves in retrospect, and I give them a chance to revise to regain points. It’s a great way to make things real early in the class.

      Sadly for this online version I couldn’t control the assignments as much as I’d like. I got the feeling that for this type of online course (which I won’t actually run myself) the learning designers felt there needed to be more hand-holding with the assignments. That was a little hard for me to take, in that I think being put on the spot to create something, even if you end up feeling like you failed, is the best way to learn.

  4. Jessica – why is the course being taught in Tableau? As a regular reader of this blog, I was expecting something like R, Python, or similar.

    Just curious.

    • That was out of my control – the online learning company that asked me to do this had determined from their research that a visualization course that involved Tableau would sell best among the professional continuing education crowd.

      In the visualization course I teach at Northwestern, I’ve traditionally always taught D3 because it’s a lot more expressive than R especially for interactive vis. I have been thinking about switching to Observable though.

  5. Both of the links I clicked on ask for registration.

    I would be interested in seeing what you have done with Jacques Bertin.

    Prior to enrolling in a biostatistics MSc I had been involved in semiotics. My advisor asked me to read Jacques Bertin’s book, as John Tukey had told him it was important for understanding graphical methods. I read it but could not discern its value.

    A year or so later, I met Tukey and asked him about it. He said no, that was not what he meant: Bertin is good for choices of visual variables like position, size, lightness, hue, etc. (or something like that), not for understanding graphical methods.

    • > Bertin is good for choices of visual variables like position, size, lightness, hue, etc.

      I agree with that. In a series of exercises, after I introduce those ideas through Bertin I juxtapose the visual variables with various graphs and have students deconstruct and construct those visuals using Bertin’s language.

    • Yes, it’s MOOC style, so to access everything for the online course you have to register, though I’m happy to share specific sets of notes on request.

      Bertin is nuanced and sometimes gets lost in translation. He is best known for theorizing about visual variables and making observations about how they interact with cognition, which led him to group visual variables by ‘levels of perception.’ All this prescribes how to choose them based on properties of the data you want to visualize. It’s remarkable how little the foundation he laid out has been altered over decades of research since then. But he also had some vision around aspects of graphics, like the role of interaction … he talked about the “mobility of the image” and was creating paper-based reorderable matrices (Bertin matrices), which involved thinking about slightly more than just visual variables, like how to scale/normalize variables to best see patterns.
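
      A toy sketch of the reorderable-matrix idea (made-up data and a very simple seriation, just sorting rows and columns by their totals): the same matrix, reordered, lets the hidden structure pop out.

```python
# Toy sketch of a Bertin-style reorderable matrix: the same made-up binary
# matrix before and after reordering rows and columns by their totals.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
n = 12
# Hidden "staircase" structure (think: sites ordered by how many features
# they have), then shuffled so the structure is invisible.
staircase = (np.add.outer(np.arange(n), np.arange(n)) >= n).astype(int)
matrix = staircase[rng.permutation(n)][:, rng.permutation(n)]

# Simple seriation: reorder rows and columns by their totals.
reordered = matrix[np.argsort(matrix.sum(axis=1))][:, np.argsort(matrix.sum(axis=0))]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
ax1.imshow(matrix, cmap="Greys")
ax1.set_title("As given")
ax2.imshow(reordered, cmap="Greys")
ax2.set_title("Rows/columns reordered")
plt.show()
```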

  6. > I’ll probably still be teaching D3.js or maybe observable in my CS course

    I think there’s really nothing like D3.js for low-level flexibility and motivated grammar-of-graphics design, but I have a general problem fitting JS-based visualization into my statistical workflow. They don’t play very nicely with R, Python, or Julia, making it hard to really compose with statistical modeling beyond OLS, and in particular trying to pipe data in without uploading to some FTP server somewhere is a huge pain. I’m also not comfortable working with potentially sensitive data in Observable due to its tight integration with a web platform, but doing everything in D3 results in a lot of boilerplate. Have you found a way to work nicely with D3.js in R Markdown or Jupyter notebooks or is that unnecessary?

    • I haven’t, and there’s a disconnect there… they use R or Tableau or Python, for instance, in my CS course to do exploratory data analysis but then use D3 to create interactive visualizations. The interactive visualizations are meant to be more like a product they create after exploring some multidimensional data and thinking about what sorts of questions/comparisons a domain expert might want to have access to in an interactive visualization.

      • There is a lot of good and thought-provoking material in your post. Thanks!

        You write that your CS students “use R or Tableau or python…to do exploratory data analysis”. Why not use a dedicated interactive graphics tool like Martin Theus’s Mondrian? The advantages are that it is fast, flexible, efficient, and simple to learn—since it is only interactive. The disadvantage is that it is a separate piece of software. But EDA is a quite different activity from other parts of the analysis workflow, so maybe it is not such a bad idea to keep it distinct and separate.

        Confession: Writing my book “Graphical Data Analysis with R” I generally used Mondrian first to decide which graphics to draw for particular datasets.

        • I know of Mondrian but had forgotten about it; it’s too bad some of the proper statistical graphics tools haven’t been as widely adopted as Tableau, which some of the students come in having already used. If nothing else I could see it being useful to incorporate an exercise in my undergrad course where students use Mondrian and compare to Tableau, since I expect that would help them recognize more easily some of Tableau’s weak spots for certain types of statistical graphics (it shouldn’t be so tedious to generate a histogram!) and interactions (brushing and linking is possible but since plots by default end up in different worksheets it’s not automatic).

      • > use D3 to create interactive visualizations

        I’ve seen D3 stuff and worked with it a little and it’s neat, but I don’t think of D3 as something particularly accessible. For instance, it’s not even clear to me how to load files into D3 without running a webserver and D3 being just a part of a bigger pipeline.

        Is there a particular type of workspace that you’re using here to make it more accessible or is it down to elbow grease?

  7. Jessica,

    This is a great post. I teach data visualization at a bootcamp, so I have a professional interest in this content, and I have a couple of questions about your syllabus. I know this post has been up for a few days before I got around to reading it, so I hope I’m not too late to ask them.

    1. Can you explain why you think principles for automation is an important topic for these students to learn? I can see why it would be something to cover in a university-level CS course, but what do you think the value is for the audience of this course?

    2. Do you think it makes sense to cover exploratory data analysis as a topic on its own? I know why it has historically been considered separately, but I haven’t found that distinction to be so useful myself. I try to emphasize a process that uses questions to guide design choices in the visualization. In the exploratory phase of a project, you have different questions of course, but I do not think it calls for a different process. But I could be wrong about this.

    • On your first question, I believe that the best way to understand how to approach some problems as humans is to think about automating them. Visualization design is one such problem. The principles for automation module is essentially about instilling in the students that visualization design can be formalized as the problem of searching through a design space composed of all possible combinations of visual encodings that could be applied to the set of n variables you want to plot (where you know the rank of variables by importance) for the best visualization. Mackinlay’s expressiveness and effectiveness criteria (from his 1986 paper on APT) are relatively easy to understand ways of operationalizing ‘best visualization.’ Teaching students to think about vis design this way, while not complete (i.e., not all good design boils down to choosing the most effective encodings), is probably the most valuable thing I can teach them. Automation implies thinking systematically.

      On 2, I would honestly probably prefer to teach a class on exploratory data analysis, and have visualization be a subpart of it! I realize that’s atypical thinking among people who teach visualization, but this comes from my view that ‘visualization’ should really be thought of more expansively as any type of interface or summary that helps augment reasoning about data. Also that visualizations have to be designed with some acknowledgment of where in a larger analysis process they come in (e.g., a visualization for communicating something to others can be very different than one you make for yourself to understand data, although as I suggested in the post, I think that the idea of visualizations as model checks can be somewhat of a unifier between analytical and communicative uses). The idea that visualizations are for spotting patterns/identifying the unexpected/generating hypotheses can be pretty pervasive, so I like providing some of the history of where it came from. Also I think it’s important for students to understand that there’s some murkiness when it comes to how well EDA and CDA can really be distinguished in many cases. I got asked by a student in my CS course once, “Can you look at data too much?” She had read some things related to the replication crisis and was confused about how to think about visually exploring data as a result. So I try to give students some framework for thinking about why someone might worry about that.

      • Thanks for those considered responses.

        Now I see where you are coming from on 1. I do not frame it as automation, but I also draw upon Mackinlay’s work as providing an approach for reducing the search space when making visualizations, so that makes a lot of sense to me.
