---
title: "The labelled_spss_survey class"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{The labelled_spss_survey class}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(retroharmonize)
```

Use the `labelled_spss_survey()` helper function to create vectors of class *retroharmonize_labelled_spss_survey*.

```{r}
sl1 <- labelled_spss_survey(
  x = c(1, 1, 0, 8, 8, 8),
  labels = c("yes" = 1,
             "no" = 0,
             "declined" = 8),
  label = "Do you agree?",
  na_values = 8,
  id = "survey1")

print(sl1)
```
You can check the class of the vector with `is.labelled_spss_survey()`:

```{r}
is.labelled_spss_survey(sl1)
```
The `labelled_spss_survey` class inherits some properties from `haven::labelled()`, so it can be manipulated with the `labelled` package (see in particular the vignette *Introduction to labelled* by Joseph Larmarange).

```{r}
haven::is.labelled(sl1)
```
```{r}
labelled::val_labels(sl1)
```
```{r}
labelled::na_values(sl1)
```
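For comparison, the inherited behaviour can be seen on the parent class itself. The following is a minimal sketch that builds the same data with `haven::labelled_spss()`, which has no survey `id`, and queries it with the same accessors. The name `h1` is introduced here only for illustration.

```{r}
# The same data as a plain haven labelled_spss vector (no survey id)
h1 <- haven::labelled_spss(
  x = c(1, 1, 0, 8, 8, 8),
  labels = c("yes" = 1, "no" = 0, "declined" = 8),
  na_values = 8,
  label = "Do you agree?"
)

labelled::val_labels(h1)
labelled::na_values(h1)
# haven reports user-defined missing values as missing in is.na()
is.na(h1)
```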
It can also be subset:

```{r}
sl1[3:4]
```

When used in a `tibble::tibble()`, the modernized version of *data.frame*, the variable is printed with an informative summary of its content.

```{r}
df <- tibble::tibble(v1 = sl1)
## Use tibble instead of data.frame(v1=sl1) ...
print(df)
## ... which inherits the methods of a data.frame 
subset(df, v1 == 1)
```

## Coercion rules and type casting

To avoid any confusion with mis-labelled surveys, combining a labelled survey vector with a double or integer vector results in a plain double or integer vector. Using `vctrs::vec_c()` is generally safer than base R `c()`.

```{r}
#double
c(sl1, 1/7)
vctrs::vec_c(sl1, 1/7)
```
```{r integer}
c(sl1, 1:3)
```
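One reason why `vctrs::vec_c()` is considered safer is that it refuses to combine incompatible types, whereas base R `c()` silently coerces them. The sketch below shows this general `vctrs` behaviour on plain vectors, not on the survey class; the object `res` and the fallback string are illustrative only.

```{r}
# Base R silently coerces the integer to character
c(1L, "a")

# vctrs raises an error instead; it is caught here so that the chunk still runs
res <- tryCatch(
  vctrs::vec_c(1L, "a"),
  error = function(e) "incompatible types"
)
res
```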
Conversion to character works as expected:

```{r character}
as.character(sl1)
```
The base `as.factor` converts to integer and uses the integers as levels, because base R factors are integers with a `levels` attribute.

```{r as.factor}
as.factor(sl1)
```
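The structure of a base R factor can be inspected directly. This short sketch uses only base R and hypothetical example objects (`f1`, `f2`) to show the integer codes behind a factor, and why the value labels are lost when numeric codes are passed to `as.factor()`.

```{r}
f1 <- factor(c("yes", "yes", "no"))
typeof(f1)      # the storage type of a factor
levels(f1)      # the levels attribute, sorted alphabetically by default
as.integer(f1)  # the integer codes that index the levels

# With numeric codes, the codes themselves become the levels
f2 <- as.factor(c(1, 1, 0, 8, 8, 8))
levels(f2)
```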

Conversion to factor with `as_factor` converts the value labels to factor levels:

```{r as_factor}
as_factor(sl1)
```
Similarly, when converting to numeric types, the user-defined missing values must be converted to the `NA` values that R uses. For numerical analysis, convert with `as_numeric()` rather than with base `as.numeric()`.

```{r numerics}
as.numeric(sl1)
as_numeric(sl1)
```
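Conceptually, this conversion replaces every code that is declared as a user-defined missing value with `NA`. The following base R sketch illustrates the idea on the raw codes; it is not the package's implementation, and the names `x_raw`, `user_na` and `x_clean` are assumptions made for this example.

```{r}
x_raw   <- c(1, 1, 0, 8, 8, 8)  # raw numeric codes
user_na <- 8                    # code declared as missing in the survey

# Replace the user-defined missing code with R's own NA
x_clean <- replace(x_raw, x_raw %in% user_na, NA_real_)
x_clean
```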
## Arithmetic

Only a few arithmetic methods are implemented. They return correct results, for example for the median, because user-defined missing values are removed from the calculation. The implemented methods include:

* median()

```{r}
median(as.numeric(sl1))
median(sl1)
```

* quantile()

```{r}
quantile(as.numeric(sl1), 0.9)
quantile(sl1, 0.9)
```

* mean()

```{r}
mean(as.numeric(sl1))
mean(sl1)
mean(sl1, na.rm = TRUE)
```

* weighted.mean() - always removes NA values.

```{r}
set.seed(2020)  # fix the seed so that the random weights are reproducible
weights1 <- runif(n = 6, min = 0, max = 1)
weighted.mean(as.numeric(sl1), weights1)
weighted.mean(sl1, weights1)
```

* sum()

```{r}
sum(as.numeric(sl1))
sum(sl1, na.rm = TRUE)
```
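The effect of removing the user-defined missing values can be reproduced with base R alone. The sketch below is only an illustration of the principle, not the package's code: `codes`, `valid` and the fixed weights `w_fixed` are hypothetical objects, and the missing code 8 is recoded to `NA` by hand.

```{r}
codes   <- c(1, 1, 0, 8, 8, 8)            # raw codes, 8 means "declined"
valid   <- replace(codes, codes == 8, NA) # missing code recoded to NA
w_fixed <- c(0.5, 1, 1.5, 1, 1, 1)        # hypothetical fixed weights

median(codes)                                # distorted by the code 8
median(valid, na.rm = TRUE)                  # based on valid answers only
mean(valid, na.rm = TRUE)
weighted.mean(valid, w_fixed, na.rm = TRUE)  # weights of NA values are dropped too
```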

The result of the conversion to numeric can be used with other mathematical or statistical functions.

```{r}
as_numeric(sl1)
min(as_numeric(sl1))
min(as_numeric(sl1), na.rm = TRUE)
```
