7 min read

Week03 - More Networks

Week03 - More Networks

htmlwidgets News This Week

@smartinsightsfromdata has started two new widgets over the past couple of weeks.

-rd3pie - d3pie wrapper. This should make interactive pie charts much easier than my experiment from a couple months ago with d3pie + rCharts.

-rpivotTable - pivottable.js wrapper. pivottable.js from @nicolaskruchten is both nicely done and very active. I had pivottable.js at the top of my list for a widget of the week. I look forward to helping @smartinsightsfromdata push this into a full-fledged and documented htmlwidget assuming he accepts my offer.

@rich-iannone

@rich-iannone is unstoppable. He continues to do amazing things with DiagrammeR. Be sure to check out all the new documentaion, examples, and functionality.

This Week’s Widget - hierNetwork for networkD3

Unfortunately, this week’s widget is still alpha. hierNetwork is not a standalone widget. Rather, it aims to add some new visualization types and interactivity to networkD3. Given the popularity of htmlwidgets, I think packaging/curating multiple widgets in a single R package will be very valuable. These are my early thoughts on packaging htmlwidgets.

  1. type of visualization
  2. field of study or domain of application
  3. similar Javascript dependencies
  4. symbiotic features
  5. type of parameters/inputs, such as diagram in DiagrammeR for both mermaid and graphviz
  6. interest of the package author in support and ongoing maintenance

networkD3 from @christophergandrud and @jjallaire started back in August just one month after the first commit of htmlwidgets. It is clearly one of the first ever htmlwidgets.

hierNetwork all started with this issue requesting a phylogram implementation in networkD3 from existing code. I thought a phylogram would be fun, so I started down that path and got an early working version. However, a phylogram is really a specialized dendrogram, so I thought additional dendrogram-type functionality in networkD3 might be more generally useful.

I still plan to work on phylograms, so if anyone is interested, please let me know. For now, let’s have a quick look at networkD3 and the new hierNetwork().

What Does it Do?

The visualizations from networkD3 offer some really beautiful chart options for all sorts of network data. It even has a as.treeNetwork function for easy conversion of hclust and dendrogram class to the very un-R-like nested data structure preferred by d3. I’ll prove how easy it is with two lines of code.

Two Lines of Code - Existing Niceness

library(networkD3)

# use this simple hclust example from
#  http://rpubs.com/gaston/dendrograms
hc = hclust(dist(mtcars))

treeNetwork( 
  as.treeNetwork(hc, "mtcars")
)

It doesn’t get much easier than that, but d3 provides multiple hierarchical layout, and I’m sure each of you has an opinion on which is your favorite. Also, a little pan/zoom and collapse interactivity makes us happy. Since the hard part of data conversion is mostly already done by networkD3 I went to work and started building this from scratch (well copy/paste from basically everywhere). Fortunately, before I got too far along, I discovered this reusable d3.chart hierarchy layout d3.chart.layout. It is dang near perfect for this and gives us a whole lot.

Let’s take the same hclust that we named hc from above and look at each of the diagram types.

More Layouts with hierNetwork

Since this is not yet officially part of networkD3, to reproduce you will need to use devtools::install_github.

devtools::install_github("timelyportfolio/networkD3@feature/d3.chart.layout")
library(htmltools) 

tagList(
  lapply(
    c("tree.cartesian"
      ,"tree.radial"
      ,"cluster.cartesian"
      ,"cluster.radial"
    )
    ,function(chartType){
      hierNetwork( as.treeNetwork(hc), type = chartType, zoomable = T, collapsible = T )
    }
  )
)

tree.cartesian

tree.radial

cluster.cartesian

cluster.radial

For those of you wondering if there is a difference between cluster and tree, collapse some of the nodes. If you still don’t see it, cluster extends to the outermost level ( much like hang in R when plotting dendrograms ) while tree does not. Also, I hope you enjoyed the pan/zoom functionality, which is enabled by zoomable=T and collapsible=T. I admit it is hard to know that this interaction is available.

Glitzier Layouts

Now on to some “glitzier” layouts that unlike the above layouts require a size or value. These layouts in d3.chart.layout hold the most promise, but also have the most room for improvement. I’ll increase the complexity of the code by visualizing a rpart on the diamonds dataset from ggplot2. If you only care about the visualizations, you can safely skip the data preparation below. If you care about the rpart/partykit piece of this, please let me know since I am very interested in testing, improving, and iterating this, but need some help.

library(rpart)
library(partykit)
library(rlist)
library(pipeR)

#set up a little rpart as an example
# using data from ggplot2 diamonds dataset

data("diamonds",package="ggplot2")

rp <- rpart(
  price ~ carat + cut + color + clarity + depth + table + x + y + z
  ,method = "anova"
  ,data = diamonds
  ,control = rpart.control(minsplit = 2)
)

rpk <- as.party(rp)

## get meta information
rpk.text <- capture.output( print(rpk) ) %>>%
  ( .[grep( x = ., pattern = "(\\[)([0-9]*)(\\])")] ) %>>%
  strsplit( "[\\[\\|\\]]" , perl = T) %>>%
  list.map(
    tail(.,2) %>>%
      (
        data.frame(
          "id" = as.numeric(.[1])
          , description = .[2]
          , stringsAsFactors = F )
      )
  ) %>>% list.stack

# binding the node names from rpk with more of the relevant meta data from rp
# i don't think that partykit imports this automatically for the inner nodes, so i did it manually
rpk.text <- cbind(rpk.text, rp$frame)

# rounding the mean DV value
rpk.text$yval <- round(rpk.text$yval, 2)

# terminal nodes have descriptive stats in their names, so I stripped these out
# so the final plot wouldn't have duplicate data
rpk.text$description <- sapply(strsplit(rpk.text[,2], ":"), "[", 1)


dat = rapply(rpk$node,unclass,how="replace")
#fill in information at the root level for now
#that might be nice to provide to our interactive graph
dat$info = rapply(
  unclass(rpk$data)[-1]
  ,function(l){
    l = unclass(l)
    if( class(l) %in% c("terms","formula","call")) {
      l = paste0(as.character(l)[-1],collapse=as.character(l)[1])
    }          
    attributes(l) <- NULL
    return(l)
  }
  ,how="replace"
)

dat = jsonlite::toJSON(
  dat
  ,auto_unbox = T
)

# replace kids with children to ease d3
dat = gsub( x=dat, pattern = "kids", replacement="children")
# change id to node to ease d3; will replace with name later
dat = gsub ( x=dat, pattern = '"id":([0-9]*)', replacement = '"name":"node\\1","size":nodesize\\1' )
# calling the root node by the dataset name, but it might make more sense to call it
# "root" so that the code can be generalized
dat = sub (x = dat, pattern = "node1", replacement = "diamonds")
# replacing the node names from node1, node2, etc., with the extracted node names and metadata from
# rpk.text, and rp$table. 
for (i in 2:nrow(rpk.text)) {
  dat = sub (
    x = dat
    , pattern = paste("node", i, sep = "")
    , replacement = paste(
      rpk.text[i,2]
      , ", mean = ", rpk.text[i,7]
      , ", n = ", rpk.text[i,4]
      , sep = ""
    )
    , fixed = T
  )
  dat = sub (
    x = dat
    , pattern = paste("nodesize", i, sep = "")
    , replacement = rpk.text[i,4]
    , fixed = T
  )
}

# replace size of root or node1
dat = sub (
  x = dat
  , pattern = "nodesize1"
  , replacement = rpk.text[1,4]
  , fixed = T
)

You can probably see why data conversion might possibly be the most valuable part of an htmlwidget. Now let’s visualize with our other layouts from d3.chart.layout. I intentionally did not use the common flare.json dataset that drives nearly all examples of this type. If you want to see these layouts with this dataset, check out the d3.chart.layout examples. Also, I intentionally left these charts in a raw form, so please don’t dismiss these because the labelling isn’t right.

hN <- hierNetwork( jsonlite::fromJSON(dat), zoomable = T, collapsible = T )
# fromJSON does not translate well so manual override
hN$x$root = dat
lapply(
  c("pack.nested"
    ,"pack.flattened"
    ,"partition.arc"
    ,"partition.rectangle"
    ,"treemap"
  )
  ,function(chartType){
    hN$x$options$type = chartType
    return(hN) 
  }
)

pack.nested

pack.flattened

partition.arc

partition.rectangle

treemap

Direction I’d Like to Head

As a demonstration of what we can achieve with these type layouts with a little more work, please see these examples.

  1. pack.nested from Bill White example
  1. pack.flattened from Mike Bostock example
  1. partition.arc from Kerry Rodden example
  1. treemap from Solomon Kahn on the U.S. budget example

Thanks

Thanks so much for all the work by