---
title: "Extracting Features with communication"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{extracting-features}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(communication)
```

This is an early version of a project in active development. Over the
next few months, the package will be greatly extended to include
additional options for featurization and a number of helper functions
for model fitting. In its present version, the package contains the
necessary functions to replicate the analysis conducted in "A Dynamic
Model of Speech for the Social Sciences" (Knox and Lucas,
forthcoming).

In this vignette, we briefly demonstrate how to do feature extraction
with communication. To extract features from audio files in a list,
first collect all filenames, then extract, as 

```{r, eval = FALSE}
## extract features
wav.fnames = list.files(file.path('PATH_TO_YOUR_DIRECTORY'),
                        pattern = 'wav$',
                        recursive = TRUE,
                        full.names = TRUE
                        )
audio <- extractAudioFeatures(wav.fnames = wav.fnames,
                              derivatives = 0
                              )
```

After feature extraction, you most likely want to standardize
features, and can do so as follows:

```{r, eval = FALSE}
## standardize full training set together
audio$data <- standardizeFeatures(
    lapply(audio$data, function(x) na.omit(x))
    )
```

Lastly, you can estimate hidden Markov models with the following
function, which wraps a fast C++ implementation.

```{r, eval = FALSE}
mod <- hmm(audio$data,
    nstates = 2,
    control = list(verbose = TRUE)
    )
```