---
title: "vignette_pkggraph"
author: "Srikanth KS"
date: "`r Sys.Date()`"
output:
  html_document:
    toc: true
    toc_float:
      collapsed: false
      smooth_scroll: false
    number_sections: true
vignette: >
  %\VignetteIndexEntry{vignette_pkggraph}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
abstract: The package [`pkggraph`](https://cran.r-project.org/package=pkggraph) is meant to interactively explore various dependencies of a package(s) (on CRAN like repositories) and perform analysis using [tidy](http://tidyverse.org/) philosophy. Most of the functions return a [`tibble`](https://cran.r-project.org/package=tibble) object (enhancement of `dataframe`) which can be used for further analysis. The package offers functions to produce [`network`](https://cran.r-project.org/package=network) and [`igraph`](https://cran.r-project.org/package=igraph) dependency graphs. The `plot` method produces a static plot based on [`ggnetwork`](https://cran.r-project.org/package=ggnetwork) and `plotd3` function produces an interactive D3 plot based on [`networkD3`](https://cran.r-project.org/package=networkD3).
---

# Quickstart

```{R}
suppressPackageStartupMessages(library("dplyr"))          # for tidy data manipulations
suppressPackageStartupMessages(library("magrittr"))       # for friendly piping
suppressPackageStartupMessages(library("network"))        # for plotting
suppressPackageStartupMessages(library("sna"))            # for plotting
suppressPackageStartupMessages(library("statnet.common")) # for plotting
suppressPackageStartupMessages(library("networkD3"))      # for plotting
suppressPackageStartupMessages(library("igraph"))         # for graph computations
suppressPackageStartupMessages(library("pkggraph"))       # attach the package
suppressMessages(init(local = TRUE))                      # initiate the package
```

```{R, eval = TRUE }
get_neighborhood("mlr") # a tibble, every row indicates a dependency
# observe only 'Imports' and reverse 'Imports'
neighborhood_graph("mlr", relation = "Imports") %>% 
  plot()
# observe the neighborhood of 'tidytext' package
get_neighborhood("tidytext") %>% 
  make_neighborhood_graph() %>% 
  plot()
# interact with the neighborhood of 'tm' package
# legend does not appear in the vignette, but it appears directly
neighborhood_graph("tm") %>% 
  plotd3(700, 700)
# which packages work as 'hubs' or 'authorities' in the above graph
neighborhood_graph("tidytext", type = "igraph") %>% 
  extract2(1) %>% 
  authority_score() %>% 
  extract2("vector") %>% 
  tibble(package = names(.), score = .) %>% 
  top_n(10, score) %>% 
  ggplot(aes(reorder(package, score), score)) + 
    geom_bar(stat = "identity") +
    xlab("package") +
    ylab("score") +
    coord_flip()
```

# Introduction

> The package `pkggraph` aims to provide a consistent and intuitive platform to explore the dependencies of packages in CRAN like repositories.

The package attempts to strike a balance between two aspects:

 - Understanding characteristics of the repository, at repository level (relating to 'forest')
 - Discover relevant packages and their contribution (relating to 'trees')

So that, we do not *see trees for the forest* nor *see only a forest* !

# Important Features

The important features  of `pkggraph` are:

- Most functions return a three column `tibble` (`pkg_1`, `relation`, `pkg_2`). The first row in the table below indicates that `dplyr` package 'Imports' `assertthat` package.

```{R, eval = TRUE}
get_imports(c("dplyr", "tidyr"))
```

- There are three function families:
    - **get** family: These functions return a `tibble`. ex: `get_reverse_depends`
    - **neighborhood** family: These functions return a `pkggraph` object containing a `network` or a `igraph` object. ex: `neighborhood_graph`
    - **relies** family: These functions capture recursive dependencies.
  
- `plot` method which uses `ggnetwork` package to generate a static plot.

- `plotd3` function uses `networkD3` to produce a interactive D3 plot.

The five different types of dependencies a package can have over another are: `Depends`, `Imports`, `LinkingTo`, `Suggests` and `Enhances`.

# `init`

Always, begin with `init()`. This creates two variables `deptable` and `packmeta` in the environment where it is called. The variables are created using local copy or computed after downloading from internet (when `local = FALSE`, the default value). It is suggested to use `init(local = FALSE)` to get up to date dependencies.

```{R, eval = FALSE}
library("pkggraph")
init(local = FALSE)
```

The `repository` argument takes CRAN, bioconductor and omegahat repositories. For other CRAN-like repositories not listed in `repository`, an additional argument named `repos` is required.

# `get` family

- These functions return a `tibble`
- All of them take `packages` as their first argument.
- All of them take `level` argument (Default value is 1).

```{R, eval = TRUE}
get_imports("ggplot2")
```

Lets observe packages that 'Suggest' `knitr`.

```{R, eval = TRUE}
get_reverse_suggests("knitr", level = 1)
```

By setting `level = 2`, observe that packages from first level (first column of the previous table) and their suggestors are captured.

```{R}
get_reverse_suggests("knitr", level = 2)
```

> What if we required to capture dependencies of more than one type, say both `Depends` and `Imports`?

## `get_all_dependencies` and `get_all_reverse_dependencies`

These functions capture direct and reverse dependencies until the suggested level for any subset of dependency type.

```{R}
get_all_dependencies("mlr", relation = c("Depends", "Imports"))
get_all_dependencies("mlr", relation = c("Depends", "Imports"), level = 2)
```

Observe that `ada` 'Depends' on `rpart`.

Sometimes, we would like to capture only specified dependencies recursively. In this case, at second level, say we would like to capture only 'Depends'  and 'Imports' of packages which were dependents/imports of `mlr`. Then, set `strict = TRUE`.

```{R, eval = TRUE}
get_all_dependencies("mlr"
                     , relation = c("Depends", "Imports")
                     , level    = 2
                     , strict   = TRUE)
```

Notice that `ada` was 'Suggest'ed by `mlr`. That is why, it appeared when `strict` was `FALSE`(default).

> What if we required to capture both dependencies and reverse dependencies until a specified level?

## `get_neighborhood`

This function captures both dependencies and reverse dependencies until a specified level for a given subset of dependency type.

```{R, eval = TRUE }
get_neighborhood("hash", level = 2)

get_neighborhood("hash", level = 2) %>% 
  make_neighborhood_graph %>% 
  plot()
```

Observe that `testthat` family appears due to `Suggests`. Lets look at `Depends` and `Imports` only:

```{R, eval = TRUE }
get_neighborhood("hash"
                 , level = 2
                 , relation = c("Imports", "Depends")
                 , strict = TRUE) %>% 
  make_neighborhood_graph %>% 
  plot()
```

Observe that the graph below captures the fact: `parallelMap` 'Imports' `BBmisc`

```{R, eval = TRUE }
get_neighborhood("mlr", relation = "Imports") %>% 
  make_neighborhood_graph() %>% 
  plot()
```

`get_neighborhood` looks if any packages until the specified level have a dependency on each other at one level higher. This can be done turned off by setting `interconnect = FALSE`.

```{R, eval = TRUE }
get_neighborhood("mlr", relation = "Imports", interconnect = FALSE) %>% 
  make_neighborhood_graph() %>% 
  plot()
```

# `neighborhood_graph` and `make_neighborhood_graph`

- `neighborhood_graph` creates a graph object of a set of packages of class `pkggraph`. This takes same arguments as `get_neighborhood` and additionally `type`. Argument `type` defaults to `igraph`. The alternative is `network`.

```{R, eval = TRUE }
neighborhood_graph("caret", relation = "Imports") %>% 
  plot()
```

`make_neighborhood_graph` accepts the output of any `get_*` as input and produces a graph object.

> Essentially, you can get the information from `get_` function after some trial and error, then create a graph object for further analysis or plotting.

```{R, eval = TRUE}
get_all_reverse_dependencies("rpart", relation = "Imports") %>% 
make_neighborhood_graph() %>% 
  plot()
```

# Checking dependencies and `relies`

For quick dependency checks, one could use infix operators: `%depends%`, `%imports%`, `%linkingto%`, `%suggests%`, `%enhances%`.

```{R, eval = TRUE}
"dplyr" %imports% "tibble"
```

A package `A` is said to *rely* on package `B` if `A` either 'Depends', 'Imports' or 'LinkingTo' `B`, *recursively*. `relies` function captures this.

```{R, eval = TRUE}
relies("glmnet")[[1]]
# level 1 dependencies of "glmnet" are:
get_all_dependencies("glmnet", relation = c("Imports", "Depends", "LinkingTo"))[[3]]
"glmnet" %relies% "grid"
reverse_relies("tokenizers")[[1]]
```

# `plot` and its handles

`plot` produces a static plot from a `pkggraph` object. The available handles are:

- The default: The node size is based on the number of 'in' and 'out' degree.

```{R, eval = TRUE}
pkggraph::neighborhood_graph("hash") %>%
  plot()
```

- Let node size depend on 'in' degree alone and white 'background':
```{R, eval = TRUE}
pkggraph::neighborhood_graph("hash") %>%
  plot(nodeImportance = "in", background = "white")
```

- Without variable node size and white 'background':
```{R, eval = TRUE}
pkggraph::neighborhood_graph("hash") %>%
  plot(nodeImportance = "none", background = "white")
```

# `plotd3`

For interactive exploration of large graphs, `plotd3` might be better than static plots. Note that,

- By holding the mouse over a vertex, highlights all related nodes and edges.
- By clicking a vertex and dragging it, changes the way graph looks and gives a better view of related 'cluster'.

```{R, eval = TRUE}
# legend does not appear in the vignette, but it appears directly
plotd3(neighborhood_graph("tibble"), height = 1000, width = 1000)
```

# Acknowledgement

> Package authors Srikanth KS and Nikhil Singh would like to thank `R` core, Hadley Wickham for tidyverse framework and the fantastic `R` community! 

- Please write to the maintainer (gmail at sri.teach) for suggesting new ideas!
- Bug reports go here: https://github.com/talegari/pkggraph/issues

----
