% \documentclass[article]{jss}
\documentclass[nojss]{jss}

%\VignetteIndexEntry{An R package for Flexible Graphical Assessment of Experimental Designs}
%\VignetteDepends{vdg, AlgDesign, rsm}
%\VignetteKeyword{experimental design, fraction of design space, variance dispersion graph, scaled prediction variance}
%\VignetteEngine{knitr::knitr}
%\VignettePackage{vdg}

\usepackage{setspace}
\usepackage{amsmath, amsthm}
\usepackage{amsfonts}
\usepackage[font={rm,md,it}]{subfig}
%\usepackage{algpseudocode}
% \usepackage[authoryear,round,longnamesfirst]{natbib}
% \usepackage{Sweave}
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{graphicx}
\newcommand{\trace}{\mathrm{tr}}
\newcommand{\VR}{VR}
\newcommand{\FDS}{FDS}

\emergencystretch=1em

\DefineVerbatimEnvironment{Sinput}{Verbatim}{fontshape=sl,xleftmargin=2em}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{xleftmargin=2em}
\DefineVerbatimEnvironment{Scode}{Verbatim}{fontshape=sl,xleftmargin=2em}
\fvset{listparameters={\setlength{\topsep}{0pt}}}
\renewenvironment{Schunk}{\vspace{\topsep}}{\vspace{\topsep}}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%% almost as usual
\author{Pieter C. Schoonees\\Rotterdam School of Management, \\Erasmus University \And
        Ni\"el J. Le Roux\\University of \\Stellenbosch
        \And Roelof L. J. Coetzer\\Group Technology,\\Sasol South Africa }
\title{Flexible Graphical Assessment of Experimental Designs in \proglang{R}: The \pkg{vdg} Package}

%% for pretty printing and a nice hypersummary also set:
\Plainauthor{Pieter Schoonees, Ni\"el le Roux, Roelof Coetzer} %%comma-separated
\Plaintitle{Flexible Graphical Assessment of Experimental Designs in R: The vdg
Package}
%% without formatting
\Shorttitle{The \pkg{vdg} package} %% a short title (if necessary)

%% an abstract and keywords
\Abstract{
This is an adapted version of a paper submitted to the \emph{Journal of Statistical Software} \citep{schooneesJSS}.\\
The \proglang{R} package \pkg{vdg} provides a flexible interface for producing various graphical summaries of the prediction variance associated with specific linear model specifications and experimental designs. These methods include variance dispersion graphs, fraction of design space plots and quantile plots which can assist in choosing between a catalogue of candidate experimental designs. Instead of restrictive optimization methods used in traditional software to explore design regions, \pkg{vdg} utilizes sampling methods to introduce more flexibility. The package takes advantage of \proglang{R}'s modern graphical abilities via \pkg{ggplot2} \citep{ggplot2}, adds facilities for using a variety of distance methods, allows for more flexible model specifications and incorporates quantile regressions to help with model comparison.
}
\Keywords{experimental design, fraction of design space plot, variance dispersion graph, scaled prediction variance}
\Plainkeywords{experimental design, fraction of design space plot, variance dispersion graph, scaled prediction variance} %% without formatting
%% at least one keyword must be supplied

%% publication information
%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
%% \Volume{13}
%% \Issue{9}
%% \Month{September}
%% \Year{2004}
%% \Submitdate{2004-09-29}
%% \Acceptdate{2004-09-29}

%% The address of (at least) one author should be given
%% in the following format:
\Address{
    Pieter C. Schoonees\\
    Department of Marketing Management\\
    Rotterdam School of Management, Erasmus University\\
    3062PA Rotterdam, The Netherlands\\
    E-mail: \email{schoonees@rsm.nl}\\
    URL: \url{http://www.rsm.nl/}\\
    
    Ni\"el J. le Roux\\
    Department of Statistics and Actuarial Science\\
    University of Stellenbosch\\
    Stellenbosch, South Africa\\
    E-mail: \email{njlr@sun.ac.za}\\
    URL: \url{http://academic.sun.ac.za/statistics}\\
    %URL: \url{}
    
    Roelof L.J. Coetzer\\
    Group Technology, Sasol South Africa\\
    Sasolburg, South Africa\\
    E-mail: \email{roelof.coetzer@sasol.com}\\
}
%% It is also possible to add a telephone and fax number
%% before the e-mail in the following format:
%% Telephone: +43/1/31336-5053
%% Fax: +43/1/31336-734

%% for those who use Sweave please include the following line (with % symbols):
%% need no \usepackage{Sweave.sty}

%% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}
<<knitr-setup, include=FALSE,cache=FALSE,echo=FALSE>>=
library(knitr)
opts_chunk$set(comment = NA, size = 'normalsize', prompt = TRUE, highlight = FALSE, 
               cache = TRUE, crop = FALSE, concordance = FALSE, fig.align='center',
               fig.path='paper-figures/Paper-', out.width="0.4\\textwidth", fig.lp = "F:",
               background = "#FFFFFF")
opts_knit$set(out.format = "latex")
knit_hooks$set(crop = hook_pdfcrop)
options(width = 70, prompt = "R> ", continue = "+  ", digits = 3, useFancyQuotes = FALSE)

thm <- knit_theme$get('default')

# Set default font colour (fgcolor) to black
thm$highlight <- "\\definecolor{fgcolor}{rgb}{0, 0, 0}\n
\\newcommand{\\hlnum}[1]{\\textcolor[rgb]{0.686,0.059,0.569}{#1}}%\n
\\newcommand{\\hlstr}[1]{\\textcolor[rgb]{0.192,0.494,0.8}{#1}}%\n
\\newcommand{\\hlcom}[1]{\\textcolor[rgb]{0.678,0.584,0.686}{\\textit{#1}}}%\n
\\newcommand{\\hlopt}[1]{\\textcolor[rgb]{0,0,0}{#1}}%\n
\\newcommand{\\hlstd}[1]{\\textcolor[rgb]{0.345,0.345,0.345}{#1}}%\n
\\newcommand{\\hlkwa}[1]{\\textcolor[rgb]{0.161,0.373,0.58}{\\textbf{#1}}}%\n\\newcommand{\\hlkwb}[1]{\\textcolor[rgb]{0.69,0.353,0.396}{#1}}%\n
\\newcommand{\\hlkwc}[1]{\\textcolor[rgb]{0.333,0.667,0.333}{#1}}%\n
\\newcommand{\\hlkwd}[1]{\\textcolor[rgb]{0.737,0.353,0.396}{\\textbf{#1}}}%"

thm$background <- "#FFFFFF"

thm$fontstyle <- "italic"

knit_theme$set(thm)

## Specific to vignette
options(cl.cores = 1)
@
%% include your article here, just as usual
%% Note that you should use the \pkg{}, \proglang{} and \code{} commands.
%\onehalfspace
\section[Introduction]{Introduction}\label{S:Intro}
A challenging problem in experimental design is that of choosing the most
appropriate design from a catalogue of possibilities, given a specific model 
form. The designs available will most likely originate from either traditional
design theory, such as well-known (fractional) factorial or Box-Behnken 
designs, or as algorithmically constructed designs according to optimal design
theory. Optimal designs have a more tailor-made character and are 
constructed by choosing the design points so that the selected design criterion
is optimized. Such optimality criteria are usually referred to by a letter 
of the alphabet, such as in D- or I-optimality for ``determinant'' and
``integrated variance'' optimality respectively \cite[see][for further 
details]{Atk2007}.

Whereas traditional experimental design theory depends mostly on qualitative
aspects of designs to choose the most suitable design, comparisons based on 
the value of the optimality criterion are comparatively simple and convenient.
This is often justifiable because many standard designs are also optimal for 
specific models and design criteria, but in nonstandard design settings this
method can be dangerous. Since an optimality criterion is a single number 
summary of a multidimensional concept, it supplies limited information. For
example, the D-criterion minimizes the generalized variance of the parameter 
estimates for a linear regression model but conveys nothing about the estimation
accuracy of the individual parameters.

Consequently, graphical procedures for design assessment have been developed to
enable researchers to study designs more thoroughly. The main types of 
procedures are the variance dispersion graph (VDG) \citep{Gio1989}, the related quantile
plot (QP) \citep{Khu1996} and the more recent fraction of design space 
(FDS) plot \citep{Zah2003}. These graphs provide a less contrived summary of the
prediction variance properties of an experimental design for a given 
model. The prediction variance is especially important where models are used for
process optimization, such as in response surface methodology (RSM) 
\citep{Mye2009}.

There are currently two similar packages available for \proglang{R} on CRAN, namely \pkg{Vdgraph} \citep{Vdgraph} and \pkg{VdgRsm} \citep{VdgRsm}. \pkg{Vdgraph} provides a front-end for the original \proglang{Fortran} program of 
\cite{Vin1993a,Vin1993b} for constructing VDGs, rewritten and simplified to a subroutine. This package allows only full second-order models to be investigated over hyperspherical design regions, using an optimization strategy which combines a Nelder-Mead simplex search with a grid search. In contrast, \pkg{VdgRsm} employs random sampling to explore design regions. \pkg{VdgRsm} also includes FDS plots. However, \pkg{VdgRsm} allows only for full second-order models, and the user has limited control over graphical output.

In this paper we present and discuss the new \proglang{R} \citep{R} package \pkg{vdg} \citep{vdg}, of which version 1.2.2 is available on the Comprehensive \proglang{R} Archive Network (CRAN). The aims of this package are threefold. First, we aim to offer an implementation of VDGs and FDS plots which harness the modern graphical capabilities of \proglang{R} by utilizing the popular \pkg{ggplot2} plotting system \citep{ggplot2}. Furthermore, \pkg{vdg} offers generalizations of VDGs by borrowing ideas from QPs \citep{Khu1996} and increased flexibility by harnessing the well-known \proglang{R} modelling infrastructure. The latter use of formulae and model matrices greatly increase the scope of models that can be explored. Finally, \pkg{vdg} uses random sampling as a design philosophy for exploring design regions, in contrast to optimization methods employed in traditional VDG software. This allows novel enhancements of VDGs such as allowing for flexible design regions, while also alleviating convergence problems and lack of flexibility often characteristic of traditional software tools in this area.

In the subsequent section the theory of these graphical procedures is briefly
reviewed, and in particular some extensions are offered. This is followed by 
a succinct overview of sampling algorithms available in our package. Finally, the 
\pkg{vdg} package is introduced together with a discussion of its main
features, and several examples are considered.

\section[Graphical design assessment techniques]{Graphical design assessment
techniques and extensions}\label{S:Graph}

Let $\xi_{n}$ denote an $n$-run exact experimental design, and suppose $m$
design variables are being considered. The design runs are gathered in the $n 
\times m$ matrix $\mathbf{X}$. Consider a regression model linear in $p$
parameters that requires the expansion of each design run $\mathbf{x}_{i}\!:m 
\times 1$ into a model vector $\mathbf{f} ( \mathbf{x}_{i} )\!: p \times 1$, and
define $\mathbf{F}\!: n \times p$ as the design matrix with these vectors 
as rows. For example, when  $m = 2$ and $p = 4$, consider the model
\begin{equation*}
\hat{y} = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \beta_{12} x_{1}
x_{2}.
\end{equation*}
 \noindent This model requires the expansion of $\mathbf{x} = (x_{1},
x_{2})^{\top}$ into $\mathbf{f(x)} = (1, x_{1}, x_{2}, x_{1} x_{2})^{\top}$
where 
 $\mathbf{x}^{\top}$ denotes the transpose of the column vector $\mathbf{x}$ . 
The Fisher information matrix $\mathbf{M} (\xi_{n})$ is given by
\begin{equation}\label{E:InfoMat}
    \mathbf{M} \left( \xi_{n} \right) = n^{-1} \mathbf{F}^{\top} \mathbf{F}.
\end{equation}

Letting $\sigma^{2}$ denote the unknown error variance of the regression model,
the variance of the estimated response at some $\mathbf{x}$ in the design 
space $\mathcal{X}$ under ordinary least squares estimation (OLS) is
\begin{equation}\label{E:var}
\VAR( \hat{y} ( \mathbf{x} )) = \sigma^{2} \, \mathbf{f(x)}^{\top} \left(
\mathbf{F}^{\top} \mathbf{F} \right) ^{-1}  \mathbf{f(x)}.
\end{equation}

Furthermore, define the scaled prediction variance (SPV) of the design $\xi_{n}$
for a given model as
\begin{equation}\label{E:spv}
    d(\mathbf{x}, \xi_{n} ) = \sigma^{-2} \cdot n  (\VAR( \hat{y} ( \mathbf{x}
))) = \mathbf{f(x)}^{\top} \mathbf{M}^{-1}( \xi_{n} ) \mathbf{f(x)}.
\end{equation}

Note that in Equations~\ref{E:InfoMat}~--~\ref{E:spv} only regression models that are
linear in the parameters are considered.  Models non-linear in the 
parameters have the property that the Fisher information matrix, introduced
below, is a function of the unknown parameters. In such models the design 
properties also depend on these unknown parameters, and designs for these models
are consequently often called ``locally optimal'' in the optimal design context. In such cases a 
Bayesian approach to optimal design construction, as surveyed in \cite{Cha1995}, is more
natural. This falls outside the scope of \pkg{vdg}. %The SPV is influenced by $n$ and the unknown error variance.

\subsection[Variance dispersion graphs]{Variance dispersion
graphs}\label{SS:VDG}
The VDG of \cite{Gio1989} displays a summary of the SPV of the design $\xi_{n}$ over the design region $\mathcal{X}$. In the 
original formulation, the summary is achieved by concentric hyperspheres
spanning $\mathcal{X}$ and usually centred at the origin. The VDG then
summarizes the SPV distribution by plotting the estimated minimum, average and maximum SPV
on the surface of each hypersphere against the radius of the hypersphere. 
In this way the high-dimensional SPV profile can always be summarized in a
two-dimensional plot of the SPV values against the radius. An example is given in Figure~\ref{F:spv-example}, where $\mathcal{X}$ is spherical with  maximum radius $\sqrt{3}$. Here the solid line represents the mean SPV, and the long and short dashed lines the maximum and minimum SPV respectively. The mean of the SPV distribution over the surface of a hypersphere is also known as the \emph{spherical SPV}. 

<<spv-example, echo=FALSE, fig.width=5, fig.height=5, fig.cap = "An example of a variance dispersion graph.",out.width="0.35\\textwidth", message=FALSE>>=
library("vdg")
data("D310")
set.seed(1)
vdgex <- spv(n = 100000, design = D310, formula = ~.^2, at = TRUE)
plot(vdgex, which = "vdgquantile", tau = c(0, 1))[[1]] + theme(legend.position = "none")
@

Due to the uncertainty regarding where in the design space a prediction will be
required, a stable SPV profile is desirable. A stable SPV is associated 
with a relatively flat VDG and cases where the maximum and minimum SPV curves do
not deviate markedly from the average SPV. This is not the case for the design in Figure~\ref{F:spv-example}, where the SPV increases greatly towards the boundary of $\mathcal{X}$. The minimum and maximum SPV 
allow the analyst to judge the dispersion of the SPV about its average. Note
that for rotatable designs the spherical SPV is constant for any given radius 
$r$, in which case the minimum and maximum SPV curves coincide with the average
SPV curve. Quite clearly, the design in Figure~\ref{F:spv-example} is not rotatable for the specified model. Rotatability can be advantageous as it is often the case that 
large deviations between minimum and maximum SPV occur near the perimeter of the experimental region. This is also the case in our example. Another
obvious advantage of rotatability is that in such cases only the spherical SPV has to be 
computed. For non-rotatable designs the minimum and maximum SPV curves must be
estimated by point-wise numerical optimization schemes, a task whose 
computational intensiveness can be aggravated by the presence of local optima.

The choice of concentric hyperspheres (i.e., radius) to summarize the SPV is related to the choice of Euclidean distance as a metric for summarizing the remoteness of hypersphere from the origin (as in Figure~\ref{F:spv-example}). In practice any appropriate metric can be used. For hypercuboidal design regions using concentric hypercubes are indeed more natural. This relates to the choice of the $L_{\infty}$-norm as metric (also known as supremum distance). We will see in Section~\ref{S:Exa} how \pkg{vdg} relies on the \pkg{proxy} package \citep{proxy} to incorporate a variety of metrics into the construction of VDGs.

Subsequently, the traditional VDG theory is discussed, assuming a hyperspherical design region.

\subsubsection[VDG theory]{VDG theory for hyperspherical design regions}\label{SSS:VDGTh} % NB: assume a spherical  design region - state this somewhere!!
Denote by $U_{r}=\{\mathbf{x} \in \mathcal{X}:\mathbf{x}^{\top}\mathbf{x} =
r^{2}\}$ the \emph{surface} of the hypersphere of radius $r$ centred at the 
origin of the spherical design space $\mathcal{X}$. \cite{Gio1989} define the
spherical SPV at $r$, i.e., the mean SPV over $U_{r}$, as:
\begin{align}\label{E:sphSPV}
    V^{r}   &= \frac{n \Theta}{\sigma^2} \int_{U_{r}} \VAR(\hat{y} ( \mathbf{x}
)) \, d \mathbf{x} \notag\\
            &= n\Theta \trace \{ (\mathbf{F}^{\top} \mathbf{F})^{-1}
\int_{U_{r}} \mathbf{f}(\mathbf{x}) \mathbf{f} (\mathbf{x})^{\top} \, d
\mathbf{x}\} 
            \notag\\
            &= n \trace \{ (\mathbf{F}^{\top} \mathbf{F})^{-1} \mathbf{S} \}.
\end{align}

Here $\Theta^{-1} = \int_{U_{r}} d \textbf{x}$ is the surface area of $U_{r}$
and $\textbf{S}:p \times p$ is the matrix of spherical region moments for 
$U_{r}$. Specifically, the elements of $\textbf{S}$ have the form
\citep{Gio1989}:
\begin{equation}\label{E:sphMom}
    \sigma ( \boldsymbol \delta ) =  \Theta \int_{U_{r}} x_{1}^{\delta_{1}}
x_{2}^{\delta_{2}} \dots x_{m}^{\delta_{m}} d \mathbf{x},
\end{equation}
where $\boldsymbol \delta  = (\delta_{1},\dots, \delta_{m})^{\top}$.

Due to the symmetry of the surface $U_{r}$, $\sigma ( \boldsymbol \delta )$ is
zero whenever any of the $\{ \delta_{i} \}$ are odd. Depending on the model, this 
property often leads to $\textbf{S}$ being quite sparse. Often only a few
elements of $\boldsymbol \delta$ are nonzero, and since the symmetry implies
that 
it does not matter which elements these are, it is notationally more convenient
to characterize the spherical region moments in Equation~\ref{E:sphMom} in terms of 
the values of the nonzero elements. As an example, the two-variable first-order
model $y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \varepsilon$ has 
a spherical region moment matrix of the form \citep{Gio1989}:
\begin{equation}\label{E:2varS}
\left(
  \begin{array}{ccc}
    1 & 0           & 0 \\
    0 & \sigma_{2}  & 0 \\
    0   & 0         & \sigma_{2} \\
  \end{array}
\right)
\end{equation}
where
\begin{equation}\label{E:sig2}
    \sigma_{2} = \frac{r^{2}}{m}.
\end{equation}

For the model linear in the parameters, explicit equations for the spherical SPV
function (as well as for the minimum and maximum SPV curves) exist 
\cite[see][]{Gio1989}. Note also that the second-order model in addition
requires the following spherical region moments:
\begin{align}
    \sigma_{4}    &= \frac{3r^{4}}{m(m+2)} \label{E:sig4} \\
    \sigma_{22}   &= \frac{r^{4}}{m(m+2)}. \label{E:sig22}
\end{align}

In this case the form of $\textbf{S}$ is more involved \citep[for further
details, see][]{Gio1989}.

\subsubsection[Generalizing the VDG]{Generalizing to more flexible
models}\label{SSS:VDGextend}
However, to consider more general models, or models containing only specific
terms, more general spherical region moments are required. For this purpose the VDG 
theory can be extended as follows. The formula for a general spherical region moment
with all elements of $\boldsymbol \delta$ even is:

\begin{equation}\label{E:spvmom}
    \sigma \left( \boldsymbol \delta \right) = \frac{r^{\mathbf{1}^{\top}
\boldsymbol \delta} \Gamma \left( \frac{m}{2} \right) \prod\limits_{i=1}^{m} 
    \Gamma \left( \frac{\delta_{i} + 1}{2} \right) }{\pi^{\frac{m}{2}} \Gamma
\left( \frac{\mathbf{1}^{\top} \boldsymbol \delta + m}{2} \right) }.
\end{equation}

This formula can be derived from the evaluation of Equation~\ref{E:sphMom} after using
a transformation to polar coordinates \citep[see for example][pp. 
55-56]{Mui1982} together with the properties of beta integrals. Considering the
result in Equation~\ref{E:spvmom}, it is apparent that the spherical moments are 
functions of the radius $r$ and the number of design variables $m$ only. The
higher-order moments dominate the average SPV at large radii since the factor 
$r^{\mathbf{1}^{\top} \boldsymbol \delta}$ in the formula is monotonically increasing. Equation~\ref{E:spvmom} has been implemented in \pkg{vdg} to calculate the mean spherical SPV for any model formula (albeit subject to some restrictions -- see \code{?meanspv} for details).

\subsubsection[Advantages of the random sampling approach to VDGs]{Advantages of the random sampling approach to VDGs}
Note that although Equation~\ref{E:spvmom} facilitates the analytical calculation of
the average spherical SPV curve, the minimum and maximum SPV must still be found by 
pointwise numerical optimization techniques \citep[except for first-order models
-- see][]{Gio1989}. This can be a tedious exercise as typically optimization needs to be done over a grid of radii spanning the design region $\mathcal{X}$, and local optima may be problematic. The \proglang{Fortran} program of \cite{Vin1993a,Vin1993b} uses a combination of grid search and the Nelder-Mead simplex method, which can have erratic behaviour. 

Although \proglang{R} provides a wide variety of other optimization possibilities -- see for example the Optimization Task View on CRAN \citep{ctv} -- using the random sampling approach to explore $\mathcal{X}$ with the computational power available holds many advantages. In fact we argue that there is no need to use the exact minimum or maximum to assess the spread of the SPV.  Instead of using pointwise optimization, \pkg{vdg} can use quantile regression through many randomly sampled points to give the user an impression of how e.g., the 5th and 95th percentiles evolve for different radii. For this we use the excellent \pkg{quantreg} package \citep{quantreg}. At the same time the spread of the SPV at the randomly sampled points still gives an indication of the minimum or maximum SPV.

The form of the design region $\mathcal{X}$ is another important aspect to
consider in the calculation of VDGs. If $\mathcal{X}$ is non-spherical, for example 
cuboidal, Equation~\ref{E:spvmom} does not hold for larger radii $r$ for which only
part of the hypersphere is contained in $\mathcal{X}$. In such a case only the 
part of the concentric hyperspheres contained in $\mathcal{X}$ should contribute
to the minimum, average and maximum SPV curves, and truncation will be 
necessary. The ease with which the sampling approach can allow for these and other nonstandard design regions is an important advantage over the optimization approach.

Furthermore, it is important to note that the maximum and minimum SPV values plotted on the
VDG only give an indication of the range of the spherical SPV. It is quite 
possible that two designs could have similar VDGs but quite different SPV
distributions over the hyperspheres. In such a case, additional knowledge of
the spherical SPV distribution is required, which is readily available when using the random sampling approach. A traditional approach of this flavour is the QPs described in the next Section.

\subsection[Quantile plots]{Quantile plots}\label{SS:QP}
Related to the VDG is the QP of \cite{Khu1996}. QPs are based on the same
principle as VDGs, but present more information about the distribution of the
SPV over the hyperspheres. These plots use the quantiles of the SPV over the
hyperspheres to display more information and can therefore be used to
distinguish between designs with similar VDGs but with different SPV distributions.

<<qp-ex, echo = FALSE, fig.width = 6, fig.height=4.5, fig.cap = "An example of a quantile plot, corresponding to the example in Figure~\\ref{F:spv-example}.",out.width="0.35\\textwidth">>=
set.seed(1)
qpex <- spv(n = 50000, design = D310, formula = ~.^2, at = TRUE, nr.rad = 6)
my_ecdf <- function(x) {
  xs <- sort(x)
  xun <- unique(xs)
  n <- length(xun)
  prop <- rep(NA, n)
  for (i in seq_along(xun))
    prop[i] <- sum(xs <= xun[i]) / n
  return(cbind(x = xun, y = prop))
}
ds <- formatC(sqrt(rowSums(qpex$sample^2)), format = "f", digits = 3)
lst <- split(qpex$spv[-1], f = factor(ds[-1]))
ecd <- lapply(lst, my_ecdf)
df <- as.data.frame(do.call(rbind, ecd))
df$Radius <- factor(ds[-1])
df$Distance <- sqrt(rowSums(qpex$sample^2))[-1]
ggplot(data = df, mapping = aes(y = x, x = y, group = Radius, colour = Radius)) + geom_line(size = 1) + ylab("SPV Quantile") + xlab("Proportion")
@

In its original formulation, the basic idea of the QP is to graph the empirical
cumulative distribution functions (or ecdfs) of the SPV over each hypersphere 
for a number of concentric hyperspheres with radii spanning the design region.
These are estimated by randomly sampling a large number of points, say 
$n^{*}$, on the surface of each hypersphere. At each sampled point $i$ the SPV
$d_{i}^{r}$ is calculated and the ecdf of $\{d_{i}^{r}\}$ is computed. The 
QP then consists of all the ecdf's. If the sampled points adequately cover the experimental region, the resulting graphs give detailed information about the distribution 
of the SPV over the experimental region. An example of such a QP is shown in Figure~\ref{F:qp-ex}, where the quantiles of the SPV is plotted on the vertical axis. The lines in the plot represent the ecdf's for five concentric hyperspheres, which is a more refined way of representing the information in Figure~\ref{F:spv-example}.

An alternative version of the QP displays boxplots side by side to summarize the SPV distribution at different radii. This leads to a display similar to the
corresponding VDG. In \pkg{vdg} we typically combine the information on display in a VDG and QP in a single graph (or a series of related graphs).

\subsection[Fraction of design space plots]{Fraction of design space plots}
The FDS plot, introduced by \cite{Zah2003}, is based
on the argument that accurate prediction over the entire $\mathcal{X}$ must 
take into account the fraction of the volume of $\mathcal{X}$ that is associated
with the various values of the SPV. In contrast, the VDG and QP give equal 
weight to the SPV for all radii $r$, although the fraction of the volume of the
design region represented by the hypersphere of each radius differs 
substantially. This means that a small SPV at a radius close to the origin does
not give as much assurance of good prediction as a similar SPV value at a 
larger radius.

The FDS plot is a way of correcting for this weighting problem which can make
the use of the VDG and/or QP in isolation problematic. For example, it might 
occur that one design is far superior (in terms of the SPV) compared to another
for a small to moderate $r$, but only somewhat inferior at large $r$. From 
the VDG it will be tempting to choose the first design since it is superior for
most radii, but for higher dimensional design problems the design 
performance at the boundaries of the design region becomes
exponentially more important. Therefore the second design may be preferable.

\subsubsection{FDS theory}\label{SSS:FDSTheory}
For a fixed value $\nu$ of the SPV $d(\mathbf{x}, \xi_{n} )$, the FDS criterion
is defined as
\begin{equation}
\FDS \left( \nu; \xi_{n} \right) = \Phi^{-1} \int_{A} d\mathbf{x},
\end{equation}

\noindent where $A = \{ \mathbf{x} \in \mathcal{X}: d \left( \mathbf{x}, \xi_{n}
\right) < \nu \}$ and $\Phi$ is the volume of $\mathcal{X}$. The criterion 
therefore expresses the volume of the set $A$ containing all points in
$\mathcal{X}$ with SPV lower than $\nu$ as a proportion of the total design
space 
volume.

For each specified SPV value $\nu$, the FDS plot graphs $\nu$ against
FDS($\nu$). This in essence amounts to calculating the cumulative distribution 
function (cdf) for the SPV values and representing it graphically, albeit with a
reversal of the usual horizontal and vertical axes \citep{Gol2004}. The 
latter reversal is motivated by the need to make the FDS plots directly
comparable to VDGs.

Since calculating the FDS criterion analytically becomes cumbersome in higher
dimensions, an ecdf is used in practice to estimate the cdf described above. 
This is done through Monte Carlo simulation. First, a large number of points,
say $n^{*}$, throughout $\mathcal{X}$ is generated. For each of these points 
the SPV is computed and finally the ecdf is calculated. A design where the SPV
values are relatively small and does not vary much over the design region is 
desirable.

<<fds-example, echo = FALSE, fig.width = 5.5, fig.height=5.5, out.width="0.35\\textwidth", fig.cap = "An example of and FDS plot, corresponding to Figures~\\ref{F:spv-example} and \\ref{F:qp-ex}.",results='hide'>>=
set.seed(1)
fdsex <- spv(n = 50000, design = D310, formula = ~.^2)
plot(fdsex, which = "fds", np = 100)
@

Figure~\ref{F:fds-example} is an example of an FDS plot for the running example in Figures~\ref{F:spv-example} and \ref{F:qp-ex}. We see that for this design and model, the SPV is below 5 for 60\% of the design region. The SPV exceeds 7.5 for roughly 12.5\% of $\mathcal{X}$, with less that 5\% of $\mathcal{X}$ having an SPV value in excess of 10.

\subsubsection[Variants of FDS plots]{Variants of FDS plots}

Several variations of FDS plots exist. One variation is to use the so-called
unscaled prediction variance (or UPV) instead of the SPV of Equation~\ref{E:spv}. The UPV $d^{*} (\mathbf{x}, \xi_{n} ) $ is defined as
\begin{equation}
d^{*} (\mathbf{x}, \xi_{n} ) = n \cdot d(\mathbf{x}, \xi_{n} ) = \sigma^{-2} \,
\VAR(\hat{y}( \mathbf{x} )).
\end{equation}

The UPV is appropriate when the size and the related cost of two or more designs
are not considered to be important. This can occur when the cost of 
individual runs are insignificant.

Another version of FDS plots is the scaled FDS plot which focuses on the
stability of the prediction variance over the design region $\mathcal{X}$ 
\citep{Zah2002}. Here the normal FDS plot is adjusted by using the scaled SPV,
which can be expressed as
\begin{equation}\label{E:scaledSPV}
\frac{d (\mathbf{x}, \xi_{n} )}{\min_{\mathbf{x} \in \mathcal{X}} d (
\mathbf{x}, \xi_{n} )}.
\end{equation}

A design with a steeper scaled SPV curve compared to another has a SPV which is
less stable over the design region. This version of the FDS plot also 
allows the analyst to read off the ratio of the maximum to the minimum SPV.

One further variant is the variance ratio FDS plot \citep[or VRFDS,][]{Rod2010}.
This plot is especially useful when a number of candidate designs are 
being compared to a reference design. To construct such a plot, the SPV is
calculated for a number of different designs at the same randomly simulated 
points over the design region $\mathcal{X}$. The VRFDS plot is then constructed
by replacing the SPV with the natural logarithm of the ratio of the SPV of 
each design to the SPV of the reference design. Supposing that two designs,
$\xi_{n_{1}}^{1}$ and $\xi_{n_{2}}^{2}$ are of interest, the VRFDS plot is 
constructed by calculating the log variance ratio
\begin{equation}\label{E:logVR}
\log \VR \left( \mathbf{x}^{*}, \xi_{n_{1}}^{1};\xi_{n_{2}}^{2} \right) = \log
\frac{d \left( \mathbf{x}^{*}, \xi_{n_{1}}^{1} \right)}{d \left(
\mathbf{x}^{*}, \xi_{n_{2}}^{2} \right)}
\end{equation}
for each simulated point $\mathbf{x}^{*}$.

In Equation~\ref{E:logVR}, $\xi_{n_{2}}^{2}$ is the reference  design and is
represented by a constant log variance ratio of zero in the plot. If the log
variance 
ratio for a design is negative, it has a lower SPV than the reference design and
is therefore the preferred design. Similarly, when the log variance ratio 
is positive it has a larger SPV than the reference design and is therefore not
preferred. A design which leads to better predictions over much of the 
experimental region will have a curve largely below the horizontal line
representing the reference design. These VRFDS plots can be useful for
eliminating 
designs which are not admissible in the sense that they perform worse compared
to any other candidate design over most of the experimental region.

\section[Simulation algorithms and guidelines]{Simulation algorithms and guidelines }\label{S:Alg}
In this section the sampling algorithms used in the \pkg{vdg} package are
introduced for exploring the design space $\mathcal{X}$.  In addition, 
recommendations are provided regarding the number of samples required.

\subsection{Simulation algorithms}\label{SS:SimAlgs}
It is important to note that random sampling of $\mathcal{X}$ has advantages
compared to using a grid of points covering the design region. \cite{Bor2003} 
shows that using a grid of points places too much emphasis on the boundary
regions of $\mathcal{X}$, which leads to inaccurate SPV estimates since the SPV 
is likely to be large at the perimeter of $\mathcal{X}$. Although similar
accuracy can be achieved for a fine grid of points, the relative efficiency 
remains lower.

In the design literature generally cuboidal or spherical design regions are
commonly employed. Therefore these two cases will receive special attention 
here. Note however that using random sampling to explore the design space
implies that any type of design region can be considered as long as it 
is possible to sample uniformly over it.

\subsubsection{Sampling in hypercubes}\label{SSS:SampCube}
A trivial way to generate uniform samples in an $m$-dimensional hypercube is to
generate each of the $m$ coordinates independently as uniform random 
numbers. For $\mathcal{X} = [ -  a, \,  a ]^{m}$, the $m$-dimensional hypercube
with sides of length $2a$, each coordinate is simply generated 
independently as uniform variates on  $[ -  a, \,  a ]$.

Space-filling designs originate from the literature on computer experiments and
provide an alternative to this method \citep{Fan2006}. Simulation studies 
for FDS graphs have shown that Latin hypercube sampling (LHS) can be more
efficient for large design problems \citep{Sch2010}. LHS was first introduced
by 
\cite{Mck1979}. A Latin hypercube design is  a design where each column of the
$n \times m$ matrix $\mathbf{V}$ is a random permutation of the column 
levels $1,2,\ldots,n$. This matrix is then transformed into the $n \times m$
design matrix $\mathbf{X}$, by adding a random uniform value to each column. 
LHS amounts to first dividing  $\mathcal{X}$ into a grid of equal sized
hypercuboidal blocks. Then exactly one block is selected from every dimension. 
Finally, a random uniform value is added to each selected block. An example of
a LHS is given in Figure~\ref{F:lhs} and can be constructed as follows:
<<lhs,fig.height=5.5,fig.width=5.5,fig.cap="An example of an LHS of 10 points in a two-dimensional design space.",results='hide'>>=
library("vdg")
set.seed(8745)
samp <- LHS(n = 10, m = 2, lim = c(-1, 1))
plot(samp, main = "", pty = "s", pch = 16, ylim = c(-1, 1), 
asp = 1, xlab = expression(X[1]), ylab = expression(X[2]))
abline(h = seq(-1, 1, length.out = 10), 
v = seq(-1, 1, length.out = 10), lty = 3, col = "grey")
@

Latin hypercube sampling thus provides an alternative to the grid procedure and
the simple uniform random sampling outlined previously. It can be seen as 
an extension to stratified sampling as it ensures that every portion of the
range of each design factor is represented. \cite{Fan2006} show that LHS 
produces samples with a smaller variance of the sample mean than simple random
sampling. The interested reader should consult their book for a detailed 
discussion of the subject. 

Algorithm~\ref{A:LHS} provides an overview of the method. Four different variants are implemented in \pkg{vdg} -- see \code{?LHS} and the examples for more details. It is also possible to interface to other space-filling design implementations, such as those available in for example \pkg{lhs} \citep{lhs} and \pkg{DiceDesign} \citep{DiceDesign}. An example is given in \code{?sampler}.
\begin{algorithm}
\caption{Latin hypercube sampling algorithm on $[-a,a]^{m}$
\citep[after][]{Fan2006}}\label{A:LHS}
\begin{itemize}
\item Calculate $\mathbf{b} = [-a + \frac{i}{n}], i = 1, 2, \ldots, n$.
\item For $j = 1 \to m$:
\begin{itemize}
\item Permute the elements of $\mathbf{b}$ to give $\mathbf{v}_{j}$.
\item Sample a vector of random numbers $\mathbf{u}_{j}$.
\item Calculate $\mathbf{x}_{j}$ as $\mathbf{v}_{j} - \frac{2a}{n} \cdot
\mathbf{u}_{j}$.
\end{itemize}
\item Form the matrix of samples as $\mathbf{X} =  \begin{pmatrix}
                        \mathbf{x}_{1} &
                        \mathbf{x}_{2} &
                        \cdots &
                        \mathbf{x}_{m} \\
                    \end{pmatrix}$.
\end{itemize}
\end{algorithm}
\subsubsection{Sampling in hyperspheres}\label{SSS:SampSphere}
The obvious method for obtaining a uniform random sample inside a hypersphere is
the rejection method -- sample uniformly from the smallest hypercube 
containing the hypersphere and reject all samples falling outside the
hypersphere. For higher dimensions and large samples this method is however 
inefficient. It is easy to show that the proportion of the volume of a
hypercube, contained in the largest hypersphere, rapidly approaches zero as the 
number of dimensions $m$ increases. It is therefore of interest to use
alternative methods to generate uniform random samples within hyperspheres.

This can conveniently be achieved in two independent steps. The first requires a
uniform sample on the surface of the hypersphere with unit radius, after 
which each point is shrunken or extendend to the interior of the hypersphere by
sampling radii from a particular distribution. The theory of spherical 
distributions, which includes the (multivariate) normal distribution, can be
used to show that a point on the surface of the unit hypersphere can be found 
by sampling from a spherical distribution and rescaling the resulting vector to
unit length \citep{Sch2010,Mui1982}. This is readily achieved by 
concatenating $m$ univariate samples from e.g., the normal distribution
(\code{rnorm()} in \proglang{R}) and rescaling.

Furthermore, it can be shown that in order to ensure uniformity over the
hypersphere, the cdf of the radius $r$ on $[0,\,R]$ is given by:
\begin{equation}\label{E:cdfradius}
    F(r) = \frac{\Gamma (m/2 +1)}{m \pi^{m/2} R^{m}} 2^{(m-1)(m-2)/2+1} r^{m}
\prod_{i=1}^{m-2} B ( \frac{m-i}{2},  \frac{m-i}{2} )
\end{equation}
\noindent where $B(\cdot,\cdot)$ denotes the Beta function. The inverse cdf
method is used in \pkg{vdg} to sample the required radii.

\subsection{Simulation size}\label{SS:SampSize}
\cite{Ozo2004} recommends using at least 10,000 randomly sampled points for an
FDS graph of up to eight factors. In practice, it is not harmful in terms of 
computation time to use more. It should be clear from the plots produced by
\pkg{vdg} whether or not a sufficient number of samples had been used.
%simulating on and over hyperspheres and hypercubes. parallelisation?
%\section[Existing Software]{Existing Software}\label{S:Exi}
%Hier of vroe\"er al? Klaar genoem vir VDG, maar dalk meer detail? Of illustreer saam met pakket?

\section[The vdg package]{The \pkg{vdg} package}\label{S:Pac}
%Basiese beskrywing vd filosofie/metodes
The design of \pkg{vdg} revolves around the \code{spv()} function, which creates objects of S3 classes \code{spv}, \code{spvlist}, \code{spvforlist} and \code{spvlistforlist}. These classes differ with respect to the number of designs and model formulae passed to them. For example, an object of class \code{spvlistforlist} results from a call where both arguments \code{design} and \code{formula} are lists of designs and formulae repectively. 

Objects of these classes contain the SPV (or unscaled prediction variance if \code{unscaled = TRUE}) evaluated for all designs and formulae at a set of $n$ points sampled throughout $\mathcal{X}$. The SPV is calculated by a simple \proglang{Fortran} subroutine, and parallelization over multiple designs or formulae are facilitated by the built-in \pkg{parallel} package. 

The sample can be passed explicitly by the user via the \code{sample} argument, but usually is constructed automatically by the \code{sampler()} function. The latter is a simple wrapper for the sampling algorithms built into the package (see Section~\ref{S:Alg}), and automatically handles spherical and cuboidal design regions (via argument \code{type}). The user can request sampling on the surface of concentric hyperspheres or hypercubes by setting \code{at = TRUE} in a call to \code{spv()}. In such cases accurate FDS plots cannot be constructed however and hence will not be produced. Rejection sampling for nonstandard design regions can be achieved by passing an appropriate function as the \code{keepfun} argument to \code{spv()}. In such cases sampling will continue until a sample of at least the requested size is obtained.

Besides simple \code{print()} methods, the power of \pkg{vdg} lies in its \code{plot()} methods, which produce a variety of graphs returned as a list of \pkg{ggplot2} objects. Use is made of facets to produce different panels for different designs and/or formulae. These graphical objects can subsequently be manipulated further, notably by using the theme functions in \pkg{ggplot2}. Which plots are produced depend on the input class as well as a variety of arguments (including \code{which = 1:5}). Only the most important can be mentioned here. 

The \code{alpha} parameter determines the transparency of the plotted points, which helps to build a picture of the density of the prediction variance. It is often worth fine tuning this parameter, although a default is provided. The vector \code{tau} of values between zero and one determines which quantile regressions are included in the VDGs. A nonnull value for the \code{radii} argument implies that the mean spherical prediction variance will be added to the VDGs according to Equation~\ref{E:spvmom}. It is advisable to read the help page of \code{meanspv()} when using this facility -- there are limits to what types of \code{formula}s can be handled safely. 

Finally, the optional \code{method} argument is passed to \code{proxy::dist()} and determines how the distance between a sampled point and the origin of $\mathcal{X}$ is determined. Several metrics are available, as described in \code{summary(pr_DB)} in \pkg{proxy}. If unset, the \code{type} argument will determine an appropriate value, namely Euclidean and supremum distance for spherical and cuboidal design regions respectively.

A starting point for exploring the package is \code{?'vdg-package'}. The call 
<<vign, eval=FALSE>>=
vignette(topic = "vdg", package = "vdg")
@
will display a version of this paper which also serves as package vignette.

\section[Examples]{Examples}\label{S:Exa}
\subsection{Comparing four-factor hybrid designs}
As a first example, we compare Roquemore's \citep{Roq1976} near-saturated 16-run hybrid designs \code{D416B} and \code{D416C} for a full second-order model in four factors. These designs are available in \pkg{vdg} (and was taken from \pkg{Vdgraph}) as

<<load-roq>>=
data("D416B")
data("D416C")
@
We can construct a VDG for these designs with \pkg{vdg} simply by
<<vdgroq,fig.width=9, fig.height=5.5, results='hide', fig.cap="A VDG for Roquemore's hybrid designs D416B and D416C for a full quadratic model.",out.width="0.7\\textwidth">>=
quad4 <- formula( ~ (x1 + x2 + x3 + x4)^2 +  I(x1^2) + I(x2^2) + 
I(x3^2) + I(x4^2))
set.seed(1234)
spv1 <- spv(n = 5000, design = list(D416B = D416B, 
D416C = D416C), formula = quad4)
plot(spv1, which = "vdgboth")
@
Of course \code{quad4} could also have been specified more compactly as
<<quad4, eval=FALSE>>=
quad4 <- formula( ~ .^2 +  I(x1^2) + I(x2^2) + I(x3^2) + I(x4^2))
@

Figure~\ref{F:vdgroq} shows the resulting VDG for these two designs, where the 5th and 95th percentiles are shown as quantile regression fits, together with the mean spherical SPV. The individual sampled points are overplotted, which gives an indication of the distribution of the SPV over the design region. It is evident that the SPV profiles for the hyperspherical design region $\mathcal{X}$ are very similar for both designs. D416B has a slightly higher SPV in proximity of the origin, but performs better near the perimeter of the design space. Hence it can be expected that D416B is preferable to D416C for this model and design space. This is confirmed by the FDS plots constructed by
<<ex1-bothroqfds,fig.width=6, fig.height=5, results='hide',fig.cap="A standard and variance ratio FDS plot for Roquemore's hybrid designs D416B and D416C for a full quadratic model.",fig.show='hold'>>=
plot(spv1, which = "fds")
plot(spv1, which = "fds", VRFDS = TRUE, np = 100)
@
as shown in Figure~\ref{F:ex1-bothroqfds}. The right panel shows that D416B is slightly superior over the majority of the design region.

Additional options to the \code{which} argument are outlined in \code{?plot.spv}. Multiple plots can be produced in a single call by passing a vector to \code{which}. Note that the plots are by default returned as a list of \pkg{ggplot2} graphical objects \citep[see e.g.,][]{murrell2011}. The plots are rendered when they are \code{print()}ed, which implies that on some graphics devices only the last plot will be visible. This can be dealt with by either storing the resulting list of graphical objects and rendering them one by one, or by specifying \code{arrange = TRUE} in the \code{plot()} call. The latter will arrange all plots in a single device by using \code{grid.arrange()} from the \pkg{gridExtra} package \citep{gridExtra}. Yet another option is to set \code{par(ask = TRUE)} before creating the plots, which will prompt the user before advancing to the next plot.

An advantage of storing the plots as graphical objects is that we can make post-hoc changes before rendering the plot. For example, we might not like the default \pkg{ggplot2} theme used for Figure~\ref{F:vdgroq}. We can easily change the theme to something more traditional with
<<vdgroq-theme,fig.width=7, fig.height=4, results='hide', fig.cap="A second version of Figure~\\ref{F:vdgroq}.",out.width="0.5\\textwidth">>=
p <- plot(spv1, which = "vdgboth")
p$vdgboth + theme_bw() + theme(panel.grid = element_blank())
@
The result is shown in Figure~\ref{F:vdgroq-theme}. Much more refined approaches are also possible; see for example \code{?ggplot2::theme} and \citet{ggplot2}.

\subsection{Central composite, D- and A-optimal designs}
In this example we consider optimal design alternatives to central composite designs (CCDs) for three design factors. A hyperspherical design region is assumed, and the axial distance for the CCD is assumed to be $\alpha = \sqrt{3}$. This spherical CCD is based on a full factorial design, augmented with four center runs and the usual six axial runs. The CCD hence contains 22 runs, and we can construct it using \pkg{rsm} \citep{rsm} as
<<make-ccd3>>=
library("rsm") 
ccd3 <- as.data.frame(ccd(basis = 3, n0 = 4, 
alpha = "spherical", oneblock = TRUE))[, 3:5]
@
We also construct a D- and A-optimal design with \code{optFederov()} from \pkg{AlgDesign} \citep{AlgDesign}. For this purpose a candidate list of 10 000 randomly sampled points is constructed within the sphere of radius $\alpha$, with \code{runif_sphere()} in \pkg{vdg}:
<<algdes-cand,results='hide'>>=
set.seed(8619) 
cand <- runif_sphere(n = 10000, m = 3)
colnames(cand) <- colnames(ccd3)
@
The algorithm then attempts to select the 22 runs from these candidates which optimize the requested criterion. D-optimality seeks to maximize the determinant of the information matrix in Equation~\ref{E:InfoMat}, which implies minimizing the volume of the confidence ellipsoid of the regression parameters. A-optimality seeks to minimize the trace of the inverse information matrix, which minimizes the average variance of the regression parameters \citep[for more details, see for example][]{Mye2009}. The D- and A-optimal designs can be found by executing
<<algdes-AD>>=
quad3 <- formula( ~ (x1 + x2 + x3)^2 + I(x1^2) + I(x2^2) + I(x3^2))
library("AlgDesign") 
set.seed(3476)
desD <- optFederov(quad3, data = cand, nTrials = 22, criterion = "D")
desA <- optFederov(quad3, data = cand, nTrials = 22, criterion = "A")
@
All the information needed for the plots is obtained by calling \code{spv()} as 
<<ex2-spv,results='hide',fig.show='hide'>>=
spv2 <- spv(n = 10000, formula = quad3, 
design = list(CCD = ccd3, D = desD$design, A = desA$design))
plot(spv2, which = 2:3) 
@
<<ccdfds,include=FALSE,echo=FALSE,fig.height=5,fig.width=6,fig.show='hide'>>=
plot(spv2, which = 2)[[1]] + theme(plot.title = element_text(size = 16), legend.position = "none") 
@
<<ccdvdg,include=FALSE,echo=FALSE,fig.height=5,fig.width=8,fig.show='hide'>>=
plot(spv2, which = 3)[[1]] + theme(plot.title = element_text(size = 16)) 
@
\begin{figure}
\centering
\captionsetup[subfloat]{labelfont=up}
    \subfloat[]{\label{F:ccdfds}
        \includegraphics[width=0.4071415\textwidth]{paper-figures/Paper-ccdfds-1}} %0.42857
    \subfloat[]{\label{F:ccdvdgindv}
        \includegraphics[width=0.5428585\textwidth]{paper-figures/Paper-ccdvdg-1}} \\
\caption{A FDS plot and VDG for the CCD, D- and A-optimal designs, for a full quadratic model in three factors on a spherical design region.}\label{F:ccdvdg}
\end{figure}
The VDG and FDS plots are shown in Figure~\ref{F:ccdvdg}. From the VDG it is apparent that the CCD and A-optimal designs have similar SPV profiles over $\mathcal{X}$, where the SPV is very low near the origin but becomes increasingly worse towards the perimeter. The D-optimal design does worse in the interior but at the same time have better SPV near the perimeter of $\mathcal{X}$. 

The FDS plot shows that the D-optimal design has the most stable SPV. Although the CCD and A-optimal designs achieve lower SPV values near the origin, this comes at the cost of a significantly higher SPV near the perimeter of $\mathcal{X}$. The FDS plot places more emphasis on the perimeter since this is where the majority of the volume of $\mathcal{X}$ is located. Hence the D-optimal design provides an alternative to the CCD which features better prediction over the majority of the design region.

\subsection{Cubic model with restricted design region}
Chapter 5 of \cite{Goo2011} describes an experimental problem in two variables, namely time (seconds) and temperature (degrees Kelvin), for a chemical reaction in an industrial setting. The aim was to refine the current operating conditions to optimize the yield of the chemical process. However, based on prior knowledge the experimental region $\mathcal{X}$ has a restricted shape, as shown in Figure~\ref{F:GJdesreg}. Outside this region the combination of design factors was deemed not worth exploring by researchers with intimate knowledge of the process. 
<<GJdesreg,echo=FALSE,fig.width=5,fig.height=5,fig.cap="The design region and D-optimal design of \\cite{Goo2011}. Some runs are replicated.">>=
df <- data.frame(Time = c(360, 420, 720, 720, 660, 360, 360), Temperature = c(550, 550, 523, 520, 520, 529, 550))
p <- ggplot(data = df, aes(x = Time, y = Temperature)) + geom_path() + geom_point(data = GJ54)
p
@

Furthermore, a full cubic model was decided on since a quadratic model was deemed inadequate to capture the nonlinearity of the response process. A 15-run design was feasible -- the D-optimal design employed by \citet[][Table 5.4]{Goo2011} and shown in Figure~\ref{F:GJdesreg} is included in \pkg{vdg} as \code{GJ54} (not all design points lie strictly within $\mathcal{X}$). In order to construct a VDG and FDS plot for this design, we need to be able to sample from $\mathcal{X}$. Translating to standardized coordinates on $[-1,1]$ in both time and termperature (using e.g., \code{stdrange()} in \pkg{vdg}), the equations of the linear restrictions on $\mathcal{X}$ are:
\begin{align}
y &= -1.08 x + 0.28 \notag \\
y &= -0.36 x - 0.76.\notag
\end{align}
We can use this to construct a function \code{keepfun()} which takes a data matrix \code{x} and returns a logical vector indicating which of the samples in the rows of \code{x} lie within $\mathcal{X}$:
<<keepfun>>=
keepfun <- function(x) apply(x >= -1 & x <= 1, 1, all) & 
(x[, 2] <= -1.08 * x[, 1] + 0.28) & (x[, 2] >= -0.36 * x[, 1] - 0.76) 
@
We compare the SPV for the full cubic model, for which the design was constructed, to this model without the cubic term for Time and the  interaction between Temperature and the quadratic term in Time. This latter model was the final model after removing the two insignificant effects -- see \citet{Goo2011}. The formula for this model is obtained as
<<ex3-for>>=
cube2 <- formula( ~ (Time + Temperature)^2 + I(Time^2) + 
I(Temperature^2) + I(Time^3) + I(Temperature^3) + 
Time:I(Temperature^2) + I(Time^2):Temperature)
GJmod <- update(cube2, ~ . - I(Time^3) - I(Time^2):Temperature) 
@
To sample uniformly over $\mathcal{X}$, we can use \code{keepfun()} to construct an \code{spv} object for the standardized design. The algorithm will automatically keep sampling until at least the specified number of samples have been obtained. We now pass a list of model formulae to \code{spv()}, and use Latin hypercube sampling for illustration:
<<ex3-spv>>=
spv3 <- spv(n = 10000, design = stdrange(GJ54), type = "lhs", 
formula = list(Cubic = cube2, GoosJones = GJmod), 
keepfun = keepfun) 
@
\vskip-20pt
<<vdggj-noplot,eval=FALSE>>=
plot(spv3, which = 1, points.size = 2) 
@
<<vdggj,fig.width=8,fig.height=4,results='hide',echo=FALSE,fig.cap="VDG for the D-optimal design of \\citet{Goo2011}, for the two models.",out.width="0.5\\textwidth">>=
plot(spv3, which = 1, points.size = 2)[[1]] + theme(plot.title = element_text(size = 16)) 
@
%<<fdsgj,fig=TRUE,include=TRUE>>=
%plot(out2, which = 2)
%@
Of course the SPV surface for the second model is much favoured to the full cubic model, but the unimportance of the dropped terms is difficult if not impossible to establish before constructing a design. Note in Figure~\ref{F:vdggj} that supremum distance is automatically selected to summarize the SPV since the argument \code{type = "cuboidal"} was specified. This can be changed to e.g., Manhattan distance by passing \code{method = "Manhattan"} to the call to \code{plot()}. Users are encouraged to experiment with the option \code{hexbin = TRUE} which uses \pkg{ggplot2}'s hexagonal binning feature as an alternative to overplotting.

Since this example has only two variables, the SPV surface can also be inspected by contour plots, or in three dimensions by e.g., \pkg{rgl} \citep[plot not shown;][]{rgl}:
<<rgl-code,eval=FALSE>>=
library("rgl") 
with(spv3$Cubic, plot3d(x = sample[, "Time"], 
y = sample[, "Temperature"], z = spv))
@
\section[Conclusions]{Conclusions}\label{S:Con}
The \pkg{vdg} package provides a modern and flexible \proglang{R} interface to important graphical procedures in experimental design, producing VDGs, FDS and related plots using random sampling. Multiple designs and/or multiple model formulae are seamlessly allowed for, while using \proglang{R}'s \code{formula} interface allows for flexibility in model specification. Plots are produced with the \pkg{ggplot2} package, which allows for post hoc manipulation of graphical elements. 

% Flexibility are allowed for with respect to irregular design regions, sampling strategies, model specification and different variants of the standard plots, such as the VRFDS plot, are provided for. Furthermore, plot method features not illustrated in the examples include specifiying specific distance measures from the \code{proxy} package via the \code{method} argument and  using hexagonal binning via \code{hexbin = TRUE} instead of overplotting. These capabilities make \pkg{vdg} a valuable tool in the 

The flexibility of \pkg{vdg} empowers users to investigate irregular design regions; to use different sampling schemes; to consider complicated model specifications; to construct different variants of the standard plots, such as the VRFDS plot, as well as to enhance plots with novelties like quantile regressions. In addition to its capabilities illustrated in the examples given above, the plot method of \pkg{vdg} includes features like specifying specific distance measures from the \pkg{proxy} package via the \code{method} argument and using hexagonal binning via \code{hexbin = TRUE} instead of overplotting. Thus, \pkg{vdg}  not only plays a complimentary role to the collection of existing \proglang{R} packages available to the practitioner in the field of experimental designs, but adds worthwhile capabilities for use in this field.


%% Note: If there is markup in \(sub)section, then it has to be escape as above.

\bibliography{vdg}

%\section*{Appendix A: Designs and Matrices}\label{A:AppA}

% \appendix
% \section{Designs and Matrices}\label{S:AppA}

\end{document}