Week 04 | Interactive Parallel Coordinates

htmlwidgets News This Week

@kbroman starts exploring htmlwidgets.

-@kbroman - test htmlwidgets conversion of the scatterplot from his qtlcharts

@rich-iannone

-@rich-iannone - DiagrammeR hits version 0.4 with full shiny support and graphviz, gets a webpage (thanks JJ Allaire), and becomes the first htmlwidget to surpass 100 stars on Github.

This Week’s Widget - parcoords | Interactive d3.js Parallel Coordinates


Real quickly before I get to this week’s widget, I would like to strongly encourage anyone who likes any of the widgets to let me know through stars/forks on Github, twitter, comments on blog, or posts/examples/rpubs using the widget. It helps me prioritize which to keep alive and improve and also motivates me to continue building more.

By far, one of my favorite reusable d3 charts is the parallel-coordinates project by @syntagmatic. Its unique combination of SVG and canvas quickly digests even large (big for me) datasets while still maintaining easy but powerful interactivity for exploring the data at a high level, inspecting more granular visible features, and determining additional areas for inquiry. I also just think there is something really cool and compelling about it.

What Does it Do?

This paper Ranking Visualization of Correlation using Weber’s Law presented at InfoVis 2014 compares multiple chart types for their effectiveness in conveying correlation. As seen in the image from the paper below, parallel coordinates charts rate very highly, especially in negative correlation relationships.

image from parallel coordinates paper

Ok, enough discussion let’s move on to the examples, since I think most speak for themselves.

Simple Example

# not on CRAN, so install with devtools
#  devtools::install_github("timelyportfolio/parcoords")
library(parcoords)

data(mtcars)

# just plain old stock parallel coordinates
#   don't worry its easy to make a whole lot better
parcoords( mtcars )

If you are familiar with MASS function parcoord, I’m sure you are not that impressed, so we’ll turn on some interactivity in the next example.

Brush On and Reorder

Two of interactive behaviors provided by parallel-coordinates are a brush to filter data and reorderable axes. Try both of these in this revised version of the above example.

parcoords(
  mtcars
  , brush = "1d-axes" # 2d-strums are really neat
  , reorderable = TRUE
)

Queue Render for Bigger Data and Some Color

parallel-coordinates handles bigger data well since the lines are drawn on the canvas. Set queue = T to progressively render the chart on build and interaction. Here is one example and another from the parallel-coordinates site. The diamonds dataset from ggplot2 will give us big enough data to test the progressive render and prove that we are in R. I’ll sample the diamonds to reduce the size of the HTML, but feel free to remove the sample and use the entire dataset on your local machine.

data( diamonds, package = "ggplot2" )
parcoords(
  diamonds[sample(1:nrow(diamonds),1000),]
  , rownames = F # turn off rownames from the data.frame
  , brushMode = "2D-strums"
  , reorderable = T
  , queue = T
  , color = RColorBrewer::brewer.pal(4,"BuPu")[4]
)

I hope you noticed the fancy new brush type. I would improve the chart by using color to represent the data. Let’s try a little more complicated color specification using the color helper from d3.scale.category10(). For even more fun let’s put it in a dplyr chain.

library(dplyr)
diamonds[sample(1:nrow(diamonds),1000),] %>%
  mutate( carat = cut(carat, breaks=c(0,1,2,3,4,5), right = T)) %>%
  select( carat, color, cut, clarity, depth, table, price,  x, y, z) %>%
  parcoords(
    rownames = F # turn off rownames from the data.frame
    , brushMode = "2D-strums"
    , reorderable = T
    , queue = T
    , color = list(
      colorBy = "carat"
      ,colorScale = htmlwidgets::JS("d3.scale.category10()")
    )    
  )

While we are on the topic of color, I wanted to highlight the fine work by Gregor Aisch with chroma.js and palettes. We could use Jeroen Oom’s new V8 for this, but for now, let’s stick with what we have in R and d3.js using a d3.scale.threshold with colors supplied by RColorBrewer::brewer.pal.

diamonds[sample(1:nrow(diamonds),1000),] %>%
  select( carat, color, cut, clarity, depth, table, price,  x, y, z) %>%
  parcoords(
    rownames = F # turn off rownames from the data.frame
    , brushMode = "2D-strums"
    , reorderable = T
    , queue = T
    , color = list(
      colorBy = "carat"
      ,colorScale = htmlwidgets::JS(sprintf('
        d3.scale.threshold()
          .domain(%s)
          .range(%s)
        '
        ,jsonlite::toJSON(seq(0,round(max(diamonds$carat))))
        ,jsonlite::toJSON(RColorBrewer::brewer.pal(6,"PuBuGn"))
      ))
    )
  )

Unfinished

Well, I hope that gives you enough to get started and gauge your interest. This is unfinished with lots of potential for helpful refinements. Please let me know if you would like me to push ahead.

Thanks

Thanks so much for all the work by