Recursion in R

Working in d3.js often requires recursion with nested data. Recursion in R can be tricky.

Flaw? in rapply

Much of the difficulty of recursion in R stems from a “flaw” in rapply.

# make some simple nested data

(nested <- list(
  id = "A",
  children  = list(
    list(
      id="A.1",
      children=list(
        list(id="A.1.a",size=10),
        list(id="A.1.b",size=5)
      )
    ),
    list(
      id="A.2",
      children=list(id="A.2.a",size=200)
    )
  )
))
## $id
## [1] "A"
## 
## $children
## $children[[1]]
## $children[[1]]$id
## [1] "A.1"
## 
## $children[[1]]$children
## $children[[1]]$children[[1]]
## $children[[1]]$children[[1]]$id
## [1] "A.1.a"
## 
## $children[[1]]$children[[1]]$size
## [1] 10
## 
## 
## $children[[1]]$children[[2]]
## $children[[1]]$children[[2]]$id
## [1] "A.1.b"
## 
## $children[[1]]$children[[2]]$size
## [1] 5
## 
## 
## 
## 
## $children[[2]]
## $children[[2]]$id
## [1] "A.2"
## 
## $children[[2]]$children
## $children[[2]]$children$id
## [1] "A.2.a"
## 
## $children[[2]]$children$size
## [1] 200

When we rapply (recursive lapply), we quickly discover the “flaw”.

rapply(nested, function(x){str(x);NULL})
##  chr "A"
##  chr "A.1"
##  chr "A.1.a"
##  num 10
##  chr "A.1.b"
##  num 5
##  chr "A.2"
##  chr "A.2.a"
##  num 200
## NULL

We lose all attributes, such as names and class. Without attributes, selectively choosing elements is impossible. For instance, what if we wanted to collect and sum size?

Solution?

For fuller featured tree functionality, please check out data.tree and GeneralTree. Often though we might not want to fool with a full-featured tree structure.

As I worked on the d3r package, the best solution I found is to lapply within rapply as shown below in the recurse function.

recurse <- function(l, func, ...) {
  l <- func(l, ...)
  if(is.list(l) && length(l)>0){
    lapply(
      l,
      function(ll){
        recurse(ll, func, ...)
      }
    )
  } else {
    l
  }
}

If we wanted to perform our collect size task, now we can do it like this.

# assumes only leaves contain size
#   or otherwise we would have to change traversal order
sum_size <- function(l){
  if(is.list(l) && length(l)>0 && "children" %in% names(l)){
    # unlist
    ul <- unlist(l$children)
    sum_size <- sum(as.numeric(unlist(
      ul[grep(x=names(ul),pattern="size")],
      use.names=FALSE
    )))
    l$size <- sum_size
    l
  } else {
    l
  }
}

Another Difficult Task

For one more example, what if we wanted to rename an element in a nested list? partykit gives us kids instead of the generally expected-by-d3.js children. These d3r lines use the recurse function to accomplish this renaming.

rename_children <- function(l, old_name="kids", new_name="children") {
  if(length(names(l))>0) {
    names(l)[which(names(l)==old_name)] <- new_name
  }
  l
}

Correct Me

I understand my ignorance is greater than my knowledge. Please correct and improve anything from above.