Week 47 | vegaliteR

This Week’s Widget - vegaliteR

When vega was first released, I spent a lot of time exploring its potential. Then ggvis largely eliminated the need to learn the internals of vega. Then ggvis stalled, vega gained lots of new features (mainly FRP-based interactivity), and the team at the University of Washington Interactive Data Lab went hyper-productive, releasing the vega derivatives voyager, polestar, compass, and vega-lite. After listening to Arvind Satyanarayan on the PolicyViz podcast, I decided that building a vega-lite htmlwidget would help me get caught up.

This week’s widget, vegaliteR, is the result of that exercise. Currently, it doesn’t come close to fulfilling the checklist for a good htmlwidget. It is a very literal interpretation, so using it requires learning the (fairly easy) vega-lite schema. With some iteration, I hope to make it much more R-like.

Installation

This is not on CRAN, so to install we will need some help from devtools::install_github.

devtools::install_github("timelyportfolio/vegaliteR")

data.frame into a list of lists

Most good htmlwidgets handle this bit for us, but as I said in the introduction, this first release is a very literal interpretation, so the code might look a little strange, with nested lists and data as an array of objects (dataframe="rows" in jsonlite). Those familiar with rCharts will know that we deliberately avoided all this there, since it can be a little bewildering for an R user. However, there is certainly no harm in straddling the line between R and JavaScript. Plus, learning some new techniques with purrr and rlist might even be fun.
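To make the "array of objects" shape concrete, here is a minimal sketch that calls jsonlite directly on a toy data.frame. It assumes jsonlite is installed, and it only illustrates the two layouts; how vegaliteR serializes data internally may differ.

```r
library(jsonlite)

# a tiny data.frame to keep the output readable
df <- data.frame(
  name = c("a", "b"),
  value = c(1, 2),
  stringsAsFactors = FALSE
)

# dataframe = "rows": one JSON object per row (an array of objects)
rows_json <- toJSON(df, dataframe = "rows")
rows_json

# dataframe = "columns": one JSON array per column
cols_json <- toJSON(df, dataframe = "columns")
cols_json
```

The "rows" layout is the one that matches an R list of lists, with one inner list per row.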

Let’s start by looking at three ways in R to turn a data.frame into a list of lists (an array of objects, in JavaScript terms).

with apply

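apply seems easy, but it doesn’t really work in general: apply coerces the data.frame to a matrix, so the columns lose their individual classes. The call below only behaves because every column of mtcars is numeric. We can make it work for mixed types, but I’m lazy, so let’s skip it.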

unname(apply(mtcars, MARGIN=1, as.list))
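A small sketch of the coercion problem, using a made-up two-column data.frame (not from the post) with one character and one numeric column. The row-by-row lapply is just one possible base R workaround, not something vegaliteR requires.

```r
# a mixed-type data.frame: character and numeric columns
df <- data.frame(
  name = c("a", "b"),
  value = c(1.5, 2.5),
  stringsAsFactors = FALSE
)

# apply() goes through as.matrix(), so every value becomes character
via_apply <- unname(apply(df, MARGIN = 1, as.list))
class(via_apply[[1]]$value)

# indexing one row at a time keeps each column's own type
via_lapply <- lapply(
  seq_len(nrow(df)),
  function(i) as.list(df[i, , drop = FALSE])
)
class(via_lapply[[1]]$value)
```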

with rlist

rlist has lots of list helpers, including list.parse.

library(rlist)

# without rownames
unname(list.parse(mtcars))

# with rownames
unname(list.parse(
  data.frame(name=rownames(mtcars),mtcars,stringsAsFactors = FALSE)
))

with purrr

purrr provides by_row. Let’s see how we can use it.

library(purrr)

# without rownames
by_row(mtcars,as.list)$.out

# with rownames
by_row(
  data.frame(name=rownames(mtcars),mtcars,stringsAsFactors = FALSE),
  as.list
)$.out
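One caveat: by_row was later moved out of purrr into the separate purrrlyr package, so the calls above may not run with a current purrr. As a hedged alternative, the sketch below uses purrr::pmap, which is not what the post used but produces the same list-of-lists shape.

```r
library(purrr)

# add the rownames as a regular column, as in the examples above
cars_df <- data.frame(
  name = rownames(mtcars),
  mtcars,
  stringsAsFactors = FALSE
)

# pmap() walks the columns in parallel, i.e. one row at a time;
# list() collects each row's values into a named list
cars_list <- pmap(cars_df, list)

str(cars_list[[1]])
```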

Examples

Now that we know three different ways of converting a data.frame to a list of lists, let’s make some vega-lite charts. To make it easy, we’ll store our converted data.

# remove . from colnames
colnames(swiss) <- gsub(
  x = colnames(swiss),
  pattern = "\\.",
  replacement = ""
)
swiss_list <- unname(rlist::list.parse(
  data.frame(village=rownames(swiss),swiss,stringsAsFactors = FALSE)
))

Scatter Plot

#devtools::install_github("timelyportfolio/vegaliteR")
library(vegaliteR)

vegalite(
  list(
    data = list(values = swiss_list),
    marktype = "point",
    encoding = list(
      x = list(field = "Fertility", type = "Q"),
      y = list(field = "InfantMortality", type = "Q")
    )
  )
)
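Since the nested list is a literal mirror of a vega-lite spec, it can help to look at the JSON it corresponds to. The sketch below builds the same spec for just two rows of swiss and prints it with jsonlite. Using toJSON with auto_unbox = TRUE here is an assumption for illustration; the widget's own serialization may differ in detail.

```r
library(jsonlite)

# two rows of swiss, with the same column renaming as above
swiss2 <- head(swiss, 2)
colnames(swiss2) <- gsub("\\.", "", colnames(swiss2))
rows <- lapply(
  seq_len(nrow(swiss2)),
  function(i) as.list(swiss2[i, c("Fertility", "InfantMortality")])
)

# the same spec structure as the scatter plot
spec <- list(
  data = list(values = rows),
  marktype = "point",
  encoding = list(
    x = list(field = "Fertility", type = "Q"),
    y = list(field = "InfantMortality", type = "Q")
  )
)

# auto_unbox = TRUE writes length-1 vectors as scalars, not arrays
spec_json <- toJSON(spec, auto_unbox = TRUE, pretty = TRUE)
spec_json
```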

vega-lite also makes aggregation easy.

#devtools::install_github("timelyportfolio/vegaliteR")
library(vegaliteR)
library(pipeR)

lapply(
  swiss_list,
  function(x){
    c(
      majority_catholic = if(x[["Catholic"]]>50) "yes" else "no",
      x
    )
  }
) %>>%
  (
    list(
      data = list(values = .),
      marktype = "square",
      encoding = list(
        y = list(field="Fertility",type="Q",aggregate="mean"),
        x = list(field="majority_catholic",type="nominal"),
        color = list(field="majority_catholic",type="nominal")
      ),
      config = list(
        largeBandWidth = 60
      )
    )
  ) %>>%
  vegalite()
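To sanity-check what the chart encodes, the same aggregation can be computed in base R. This is only a cross-check sketch on the built-in swiss data, not part of vegaliteR.

```r
# group villages the same way the chart does
majority_catholic <- ifelse(swiss$Catholic > 50, "yes", "no")

# mean Fertility per group, the values the chart should place on y
fertility_means <- tapply(swiss$Fertility, majority_catholic, mean)
fertility_means
```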

Line Chart

For a line chart, we simply set marktype = "line". As a not-quite-best-practice example, we can plot Agriculture as a function of Education.

#devtools::install_github("timelyportfolio/vegaliteR")
library(vegaliteR)
library(pipeR)

swiss_list %>>%
  (
    list(
      data = list(values = .),
      marktype = "line",
      encoding = list(
        y = list(field="Agriculture",type="Q"),
        x = list(field="Education",type="Q")
      )
    )
  ) %>>%
  vegalite()

Bar Chart

We can easily turn our aggregated scatter example into a bar chart by changing the marktype to "bar".

#devtools::install_github("timelyportfolio/vegaliteR")
library(vegaliteR)
library(pipeR)

lapply(
  swiss_list,
  function(x){
    c(
      majority_catholic = if(x[["Catholic"]]>50) "yes" else "no",
      x
    )
  }
) %>>%
  (
    list(
      data = list(values = .),
      marktype = "bar",
      encoding = list(
        y = list(field="Fertility",type="Q",aggregate="mean"),
        x = list(field="majority_catholic",type="nominal"),
        color = list(field="majority_catholic",type="nominal")
      ),
      config = list(
        largeBandWidth = 60
      )
    )
  ) %>>%
  vegalite()

Stacked Bar Chart

For a stacked bar, we can recreate the vega-lite stacked bar example, with yield summed per variety and colored by site.

#devtools::install_github("timelyportfolio/vegaliteR")
library(vegaliteR)
library(pipeR)
library(purrr)

data(barley, package="lattice")

by_row(barley,as.list)$.out %>>%
  (
    list(
      data = list(values = .),
      marktype = "bar",
      encoding = list(
        x = list(field="yield",type="Q",aggregate="sum"),
        y = list(field="variety",type="nominal"),
        color = list(field="site",type="nominal")
      )
    )
  ) %>>%
  vegalite()

More

That’s probably enough for this post. I’ll try to make vegaliteR more R-like as I get time and the API stabilizes. For more vega-lite examples, see the vega-editor.

Happy Thanksgiving to those in the US.

Thanks

Thanks to the University of Washington Interactive Data Lab for their hyper-productive efforts with vega and all its derivatives.

As always, thanks to

  • Ramnath Vaidyanathan and RStudio for htmlwidgets
  • all the contributors to R and JavaScript