---
title: "Command Line Technical Specifications"
author: "Decision Patterns"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Command Line Technical Specifications}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

This document provides a language-agnostic specification of how applications are invoked from an interactive terminal or a text-based (non-GUI, batch) system. It is intended for developers who wish to understand the details and considerations of command-line parsing. It is unlikely to be of much value to end users looking to add command-line parsing to their own applications. For the implementation in *optigrab*, please see the accompanying *Using Optigrab* document.


## Anatomy of the command-line

An invocation of a command-line application commonly has this structure:

    [env] [interpreter] script  [verb] [options] [targets]
                               <-------- arguments ------->
    <------------------- command-line -------------------->
    
(Components in [brackets] are optional.)

The sections below detail and discuss the considerations for parsing each of the components.
    
    
### Environment Variable

See [Wikipedia: Environment Variable](https://en.wikipedia.org/wiki/Environment_variable).

Things to note about environment variables:

- inherited by child processes
- implemented by the OS
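
As a small illustration of inheritance, the sketch below sets a variable in the running R session and reads it back from a child R process. The variable name `DEMO_GREETING` is made up for this example, and the sketch assumes that an `Rscript` executable is present in `R.home("bin")`.

```r
# Set a variable in the current process (DEMO_GREETING is a hypothetical name).
Sys.setenv(DEMO_GREETING = "hello")

# Start a child R process; it receives a copy of the parent's environment.
rscript <- file.path(R.home("bin"), "Rscript")
child_value <- system2(
  rscript,
  args = c("-e", shQuote('cat(Sys.getenv("DEMO_GREETING"))')),
  stdout = TRUE
)

# child_value now holds whatever the child process printed.
print(child_value)
```

Changes the child makes to its own environment are not visible to the parent, which is why environment variables are suited to passing settings down to a script but not back up.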
                
                         
### Interpreter

The *interpreter* is the application that executes the *script*. The interpreter may be specified on the command line or in the script itself. There are numerous ways the interpreter may appear, for example `Rscript`, `R CMD BATCH` or `r` (little-r). On Unix-like systems the *interpreter* is optional on the command line and may instead be specified in the script through the `#!` (shebang) syntax.


### Script

The *script* is the (path to the) program file/code to be run. 
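
When a script is run through `Rscript`, the full `commandArgs()` vector typically starts with the interpreter, followed by interpreter options including `--file=<script>`, and then `--args` followed by the script's own arguments. The sketch below separates these parts. The helper name `split_invocation` is hypothetical, and the input vector is hand-written to resemble a Unix-like system; the exact interpreter options vary by platform and R version.

```r
# Split a full commandArgs()-style vector into interpreter, script and
# script arguments. split_invocation() is a hypothetical helper.
split_invocation <- function(args) {
  file_arg <- grep("^--file=", args, value = TRUE)
  script <- if (length(file_arg)) sub("^--file=", "", file_arg[1]) else NA_character_

  # Everything after the first "--args" belongs to the script.
  marker <- match("--args", args)
  script_args <- if (is.na(marker)) character(0) else args[seq_along(args) > marker]

  list(interpreter = args[1], script = script, arguments = script_args)
}

# A hand-written vector resembling what Rscript passes along (assumed values).
example <- c("/usr/lib/R/bin/exec/R", "--no-echo", "--no-restore",
             "--file=script.r", "--args", "--foo", "bar")
parts <- split_invocation(example)
str(parts)
```

In a real script, `commandArgs()` would take the place of `example`.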


### Arguments

Collectively, [arguments](#arguments) refers to the [verb](#verb), [options](#options) and [target(s)](#targets).  Each type of argument is handled in its own way.

####  Verb

A *verb* is a single optional command. (Not all applications use verbs; *git* is one popular application that does). The verb is the first value after the script and before any options. 

The constraint that the verb is the first argument, before any options, is not universally followed. Some applications allow the verb to be placed anywhere so long as it cannot be interpreted as part of an option. This introduces ambiguity in applications that allow multiple values to be bound to a name. This is inherently the case in R, since R's atomic objects are vectors and not scalars.

Typically, applications allow only one verb. A simple definition is that verbs are all values that occur between the name of the script and the first flag (if any). If there is more than one verb in the arguments array, the first is taken as the verb and a warning is issued.
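
The rule above can be sketched as follows. This is only an illustration: `extract_verb` is a hypothetical helper, and it assumes that flags start with `-` (the `/` style is left out because it would collide with file paths).

```r
# Treat every value before the first flag as a verb candidate and keep the
# first one. extract_verb() is a hypothetical helper; flags are assumed to
# start with "-".
extract_verb <- function(args) {
  is_flag    <- grepl("^-", args)
  first_flag <- match(TRUE, is_flag)
  n_leading  <- if (is.na(first_flag)) length(args) else first_flag - 1L
  candidates <- args[seq_len(n_leading)]

  if (length(candidates) > 1L) {
    warning("More than one verb found; using the first: ", candidates[1])
  }
  verb <- if (length(candidates)) candidates[1] else NA_character_

  # The remaining arguments start at the first flag.
  list(verb = verb, rest = args[seq_along(args) > n_leading])
}

single <- extract_verb(c("commit", "--message", "fix typo"))
str(single)
```

Note that under this definition a command line without any flags treats all of its values as verb candidates, which is one reason targets need separate handling.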


#### Options

     [flag[=][value [value]] [flag[=][value [value]] [...]]]
     
The *options* are the name-value(s) pairs that can be parsed into variables. Names are identified by a flag, usually a slight syntactic variation on the name, such as preceding it with one or two characters: `-` (Java-style), `--` (GNU-style) or `/` (Microsoft-style).

    -name
    --name
    /name

Associated with each name can be zero, one or several values. If no values are provided, the type of the variable is assumed to be boolean/logical and the value is taken as TRUE (1).
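
A minimal sketch of this grouping is shown below. The helper `parse_options` is hypothetical and is not the *optigrab* implementation. It assumes that a flag is `-`, `--` or `/` followed by a letter, so a value such as `/tmp/x` would be mistaken for a flag, and it leaves all values as character strings.

```r
# Group values under the flag that precedes them; a flag without values
# becomes TRUE. parse_options() is a hypothetical helper.
parse_options <- function(opts) {
  is_flag <- grepl("^(--?|/)[A-Za-z]", opts)
  # Strip the flag prefix to obtain the option name.
  opt_names <- sub("^(--?|/)", "", opts[is_flag])
  # cumsum() gives every element the index of the most recent flag.
  group <- cumsum(is_flag)

  result <- lapply(seq_along(opt_names), function(i) {
    values <- opts[group == i & !is_flag]
    if (length(values) == 0L) TRUE else values
  })
  names(result) <- opt_names
  result
}

parsed <- parse_options(c("--verbose", "-n", "10", "/files", "a.txt", "b.txt"))
str(parsed)
```

Here `verbose` has no values and becomes `TRUE`, `n` has one value and `files` has two.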

Parsing and interpreting options is the focus of this library. Options occur after the *script* and any *verbs* and before any targets. Translating command-line options into variables in the program requires several steps. The following conventions are used to describe them. As an example, consider the following simple invocation:

    script --foo bar
    
 
Component       Example                 Description
--------------  ----------------------  ------------------------------------------------------------------------------------------
command-line    `script.r --foo bar`    the complete command-line: script, verb, etc.
arguments       `verb options targets`  the script arguments as they appear on the command line; by default they have no type
opt_string      `'--foo bar'`           string; options as they are typed at the command-line
opts            `c('--foo', 'bar')`     character vector; how options appear in `commandArgs`
flag            `--foo`                 how the flags can appear on the command line; not needed by the developer
name            `foo`                   the name used for the option or variable
value           `bar`                   the value of the variable
    
   

#### Targets

The *targets* are typically one or more files or other resources that are affected by the command. Targets are similar to the [verbs](#verb) in that they do not require much interpretation. Sometimes the expressions are globs.

Targets can be ambiguous since an *option* can have an indefinite number of values. Consider the following command-line:

    script.r --foo 1 --bar my-file-1 my-file-2 

In this example, it is unclear whether `my-file-1` and `my-file-2` should be associated with `--bar`, treated as targets, or a combination of the two.
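
One possible way to resolve the ambiguity, offered here only as an assumption and not as a rule stated by this specification, is to declare how many values each flag takes. The sketch below uses a hypothetical helper `split_targets` and a hypothetical named vector `n_values` to show how the same command line yields different targets under two declarations.

```r
# Separate targets from options, given the number of values each flag
# consumes. split_targets() and n_values are hypothetical.
split_targets <- function(args, n_values) {
  options <- list()
  targets <- character(0)
  i <- 1L
  while (i <= length(args)) {
    arg <- args[i]
    if (arg %in% names(n_values)) {
      # Never read past the end of the arguments.
      n_take <- min(n_values[[arg]], length(args) - i)
      values <- args[i + seq_len(n_take)]
      options[[arg]] <- if (length(values)) values else TRUE
      i <- i + n_take + 1L
    } else {
      targets <- c(targets, arg)
      i <- i + 1L
    }
  }
  list(options = options, targets = targets)
}

args <- c("--foo", "1", "--bar", "my-file-1", "my-file-2")

# If --bar takes one value, my-file-2 is left over as a target.
one_value  <- split_targets(args, c("--foo" = 1L, "--bar" = 1L))
# If --bar takes two values, no targets remain.
two_values <- split_targets(args, c("--foo" = 1L, "--bar" = 2L))
```

The trade-off is that the developer has to state the expected number of values up front, which gives up some of the flexibility of binding an indefinite number of values to a name.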


## Parsing

Parsing command-line arguments is trickier than it may appear. Parsing involves several steps:

1. Retrieve `commandArgs()`.
2. Pre-process the command-line according to the style.
3. Parse the command-line: `script [verb] [options] [targets]`
    * Identify the script (`this_file`).
    * Identify the verb (if it exists).
    * Parse the options (according to specifications).
        * Differentiate flags from values.
        * Define variables: assign names and values.
    * Disambiguate target(s) from options.
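
As an illustration of the pre-processing step, the sketch below normalizes the `flag=value` form into separate `flag` and `value` elements so that later steps only have to deal with one form. The helper `preprocess_args` is hypothetical and handles only flags that start with `-` or `--`; in a script, the input would come from `commandArgs(trailingOnly = TRUE)` instead of the hand-written vector used here.

```r
# Normalize "--name=value" into separate "--name" and "value" elements.
# preprocess_args() is a hypothetical helper; only "-" and "--" flags are handled.
preprocess_args <- function(args) {
  pieces <- lapply(args, function(arg) {
    if (grepl("^--?[A-Za-z][^=]*=", arg)) {
      # Split at the first "=" only, so that values may contain "=".
      c(sub("=.*$", "", arg), sub("^[^=]*=", "", arg))
    } else {
      arg
    }
  })
  as.character(unlist(pieces))
}

raw  <- c("--foo=bar", "--expr=a=b", "--verbose", "file.txt")
opts <- preprocess_args(raw)
print(opts)
```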


## opt_fill: Filling recursive structures

`opt_fill` gets values from the command-line using an existing recursive object as a template. Values in the existing recursive structure may be clobbered (overwritten) or created. There are several use cases:


* `clobber == TRUE`: a command-line application in which `opts` overwrite application defaults.

* `clobber == TRUE && create == TRUE`: same as above, but also get other missing commands.

Recursive structures have several options for filling based on `opts`.

* Only fill existing names in `x`

Name exists in `x` (rows) / `opts` (columns):

x / opts  TRUE               FALSE
--------  -----------------  -----------------
TRUE      Clobber(?) or not  Keep x as default
FALSE     Create(?)          Keep x as default

There appear to be two use cases:

1. `x` is defaults; values should be clobbered and set.
2. `x` is set; alternative values should be set

`.clobber` and `.create`

*clobber*: whether a value that already exists in `x` is overwritten.

*create*: whether a name that is missing from `x` is created.

* `clobber & create`: clobber existing values and create new ones. Use case: overwriting defaults (commando).

* `clobber & !create`: only clobber existing values. Use case: `x` holds all the values that are needed, so only those are retrieved; for example, commando with multiple commands and one config file.

* `!clobber & create`: only get new values (e.g. don't change defaults). Use case (rare): keep old values while adding new ones.

* `!clobber & !create`: no-op (with a warning). Use case: none.
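
The four combinations can be sketched for the simplest case of a flat named list. The function `fill_flat` below is a hypothetical, non-recursive stand-in written for this document; it is not the `opt_fill` implementation, and its argument names are assumptions.

```r
# Fill a flat named list x from opts under the clobber/create rules.
# fill_flat() is a hypothetical stand-in, not optigrab's opt_fill.
fill_flat <- function(x, opts, clobber = TRUE, create = FALSE) {
  if (!clobber && !create) {
    warning("Neither clobber nor create is set; returning x unchanged.")
    return(x)
  }
  for (name in names(opts)) {
    exists_in_x <- name %in% names(x)
    # Overwrite existing names only when clobbering; add new names only when creating.
    if ((exists_in_x && clobber) || (!exists_in_x && create)) {
      x[[name]] <- opts[[name]]
    }
  }
  x
}

defaults <- list(host = "localhost", port = 8080)
opts     <- list(port = 9090, debug = TRUE)

both         <- fill_flat(defaults, opts, clobber = TRUE,  create = TRUE)
clobber_only <- fill_flat(defaults, opts, clobber = TRUE,  create = FALSE)
create_only  <- fill_flat(defaults, opts, clobber = FALSE, create = TRUE)
```

With these inputs, `both` changes `port` and gains `debug`, `clobber_only` changes `port` only, and `create_only` keeps the default `port` while gaining `debug`.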


## References

[Wikipedia: Environment Variable](https://en.wikipedia.org/wiki/Environment_variable)


## Appendix
