---
title: "Pediatric Complex Chronic Conditions"
output:
 rmarkdown::html_vignette:
   toc: true
   number_sections: false
bibliography: references.bib
vignette: >
 %\VignetteIndexEntry{Pediatric Complex Chronic Conditions}
 %\VignetteEngine{knitr::rmarkdown}
 %\VignetteEncoding{UTF-8}
---

```{r, label = "setup", include = FALSE}
# IMPORTANT SYNTAX NOTE:
#
# DO NOT USE the pipeOp `|>`
#
# While convenient, that is a R 4.1.0 feature at a minimum. Notable improvements
# to the pipeOp come in 4.2.0 and 4.2.1.  To keep this package dependent on R >=
# 3.5.0 do not use the pipeOp.

library(kableExtra)
options(qwraps2_markup = "markdown")
options(knitr.kable.NA = '')
knitr::opts_chunk$set(collapse = TRUE, fig.align = "center")
```

```{r, label = 'medicalcoder-namespace'}
library(medicalcoder)
packageVersion("medicalcoder")
```

# Introduction

The Pediatric Complex Chronic Condition (PCCC) coding schema version 2 was
published in 2014 [@feudtner2014pediatric] and updated to version 3 in 2024
[@feinstein2024pediatric]. Both versions identify 11 conditions, each with
multiple subconditions.

```{r, label = "tbl-syntactically-valid-conditions", echo = FALSE, results = "asis"}
CNDS <- subset(get_pccc_conditions(), select = c("condition", "condition_label"))
data.table::setDT(CNDS)
data.table::setkey(CNDS, condition)
CNDS <- unique(CNDS)

kbl(CNDS,
    caption = "Syntactically valid names for complex chronic conditions",
    row.names = TRUE)
```

The PCCC system provides a standardized approach to identifying children with
complex chronic conditions using administrative data. This has several important
implications:

- **Research comparability**: Standard definitions allow findings to be compared
  across studies and institutions.
- **Policy and planning**: Accurate identification of populations with high
  medical needs supports resource allocation and health policy decisions.
- **Clinical insight**: Coding consistency helps clinicians and researchers
  understand disease burden and outcomes in pediatric populations.

Without a common framework such as the PCCC, studies of chronic pediatric
conditions would be fragmented, limiting their impact on both research and
practice.

# PCCC Version 2.0 vs PCCC Version 3.0

Versions 2 and 3 differ mainly in how technology dependence is treated. Many ICD
codes map to both a primary condition and either technology dependence or
transplant.

In both versions, transplant-related codes indicate organ system failure. A
patient with such a code is flagged as having both a transplant and the related
condition.

Technology dependence, however, diverges between versions. In version 2, the
presence of a technology dependence code classifies the patient as having both
the associated condition and technology dependence. For example, ICD-10 Z46.81
is both a metabolic and technology dependence code. A patient with this code is
classified as having a metabolic condition and technology dependence.

Version 3 refines this rule: technology dependence codes are assessed
conditionally, recognizing that many do not reflect chronic conditions.

Example: ICD-10 Z46.81
(`r subset(get_icd_codes(with.descriptions = TRUE), code == "Z4681")$desc`) is a
metabolic and technology dependence code.  Under version 2, a patient with this
code in their medical record is flagged as having a metabolic condition and
technology dependence.  Under version 3, the patient is flagged with a
metabolic condition and technology dependence only if at least one
non-technology condition is also flagged.
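
The difference between the two rules can be sketched in a few lines of base R.
This is a simplified illustration under stated assumptions, not the
`medicalcoder` implementation: `matched`, `flag_v2()`, and `flag_v3()` are
hypothetical names, and each row of `matched` stands for one PCCC code found in
a single patient's record.

```r
# Hypothetical sketch of the version 2 and version 3 rules (not package code).
# One row per matched PCCC code for one patient.
matched <- data.frame(
  condition     = "metabolic",
  tech_dep_flag = 1L,          # 1 = technology dependence code
  stringsAsFactors = FALSE
)

# Version 2: every matched code flags its condition.
flag_v2 <- function(m) unique(m$condition)

# Version 3: technology dependence codes count only when at least one
# non-technology code has also matched.
flag_v3 <- function(m) {
  if (any(m$tech_dep_flag == 0L)) unique(m$condition) else character(0)
}

flag_v2(matched)
flag_v3(matched)

# Adding a non-technology match lets the technology code count under version 3.
matched2 <- rbind(
  matched,
  data.frame(condition = "neuromusc", tech_dep_flag = 0L,
             stringsAsFactors = FALSE)
)
flag_v3(matched2)
```

With only the technology dependence code in the record, `flag_v2()` reports the
metabolic condition while `flag_v3()` reports nothing.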

## ICD Codes for PCCC
Let's look at the codes that are in the PCCC schema.  Calling `get_pccc_codes`
returns a data.frame.
```{r, label = "get-pccc-codes"}
pccc_codes <- get_pccc_codes()
str(pccc_codes)
```

The columns are:

* `icdv`: integer, ICD version
* `dx`: 1 for diagnostic (ICD-9-CM or ICD-10-CM) codes, 0 for procedure
  (ICD-9-CM Volume 3 or ICD-10-PCS) codes.
* `full_code`: character, the ICD code retaining any applicable decimal point.
* `code`: character, the compact ICD code; any applicable decimal point omitted.
  Examples: ICD-9-CM full code 553.3 is represented as 5533 as a compact code.
  ICD-10-CM full code C96.9 is represented as compact code C969.
* `condition`: character, the PCCC condition with syntactically valid names.
* `subcondition`: character, the PCCC subcondition with syntactically valid
  names.
* `transplant_flag`: integer, 1L if the ICD code is associated with a
  transplant, 0L otherwise.
* `tech_dep_flag`: integer, 1L if the ICD code is associated with technology
  dependence, 0L otherwise.
* `pccc_vX.Y`: integer, 1L if the code is part of variant `X.Y`, 0L otherwise.
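
The relationship between `full_code` and `code` can be illustrated with a
one-line helper.  The function name `to_compact()` is hypothetical and is not
part of `medicalcoder`; it only shows what "omitting the decimal point" means
for the two examples given above.

```r
# Hypothetical helper: drop the decimal point from a full ICD code.
# fixed = TRUE treats "." as a literal character, not a regular expression.
to_compact <- function(full_code) gsub(".", "", full_code, fixed = TRUE)

to_compact(c("553.3", "C96.9"))
```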

## Examples

Example: Consider a patient with the following four diagnostic and two
procedure ICD-9 codes:
```{r, label = "define-pat1"}
pat1 <-
  data.frame(dx = c(1, 1, 1, 1, 0, 0),
             icdv = 9L,
             code = c("34590", "78065", "3432", "78065", "9929", "8606"))
```

An inner join between the `pccc_codes` and `pat1` will yield the conditions
this patient has.

```{r, label = "inner-join-pat1-pccc-codes"}
merge(x = pccc_codes, y = pat1, all = FALSE, by = c("icdv", "dx", "code"))
```

For all PCCC variants, one diagnostic code matches: 343.2, infantile cerebral
palsy, which maps to a neuromuscular condition.  The procedure code 86.06 maps
to a technology dependent metabolic condition.

Under version 2.0 of PCCC (variants `pccc_v2.0` and `pccc_v2.1`), this patient
has two conditions: neuromuscular and metabolic.  This patient also has a flag
for device and technology use.
```{r, label = "pat1-pccc-v2"}
pat1_pccc_v2.0 <-
  comorbidities(
       data = pat1,
       icd.codes = "code",
       dx.var = "dx",
       icdv = 9,
       method = "pccc_v2.0",
       flag.method = "current", # default
       poa = 1                  # default for flag.method = 'current'
  )

pat1_pccc_v2.1 <-
  comorbidities(
       data = pat1,
       icd.codes = "code",
       dx.var = "dx",
       icdv = 9,
       method = "pccc_v2.1",
       flag.method = "current",
       poa = 1
  )

all.equal(pat1_pccc_v2.0, pat1_pccc_v2.1, check.attributes = FALSE)
pat1_pccc_v2.0
```

Under version 3 of the PCCC, this patient has two conditions: neuromuscular and
metabolic.  The technology dependence flags are also 1 for this patient, but are
not counted in the total number of conditions.

```{r, label = "pat1-pccc-v3"}
pat1_pccc_v3.0 <-
  comorbidities(data = pat1,
       icd.codes = "code",
       dx.var = "dx",
       icdv = 9,
       method = "pccc_v3.0",
       flag.method = 'current',
       poa = 1
  )

pat1_pccc_v3.1 <-
  comorbidities(data = pat1,
       icd.codes = "code",
       dx.var = "dx",
       icdv = 9,
       method = "pccc_v3.1",
       flag.method = 'current',
       poa = 1
  )

all.equal(pat1_pccc_v3.0, pat1_pccc_v3.1, check.attributes = FALSE)

# retain the needed columns, there are four columns for each condition in v3
pat1_pccc_v3.0[, grep("^(cmrb_flag|num_cmrb|neuromus|metabolic|tech_dep_flag)", names(pat1_pccc_v3.0))]
```

In the output from version 3, we have four 0/1 indicator columns for each of the
conditions.

* `<condition>_dxpr_only`: the `<condition>` has been flagged due to diagnostic
  (dx) or procedure (pr) codes only.

* `<condition>_tech_only`: the `<condition>` has been flagged due to the
  presence of a technology dependence code _and_ at least one other condition
  has been flagged by dx or pr codes.

* `<condition>_dxpr_and_tech`: the `<condition>` has been flagged due to the
  presence of a dx or pr code that is not associated with a technology
  dependence _and_ another dx or pr code which is associated with technology
  dependence.

* `<condition>_dxpr_or_tech`: the `<condition>` has been flagged.  **These
  columns answer the question "does `<condition>` exist for this
  patient/encounter."**
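
One way to read the four definitions above is as a function of three yes/no
facts about a condition.  The sketch below is an interpretation under stated
assumptions, not the package's implementation: `pccc_v3_columns()` and its
arguments are hypothetical names, and the `NA` handling for records with no
conditions is not modelled.

```r
# Hypothetical sketch of the four version 3 indicators for one condition.
#   dxpr       : a non-technology dx/pr code for this condition is present
#   tech       : a technology dependence code for this condition is present
#   other_dxpr : some other condition is flagged by a non-technology code
pccc_v3_columns <- function(dxpr, tech, other_dxpr) {
  dxpr_only     <- dxpr & !tech
  dxpr_and_tech <- dxpr & tech
  tech_only     <- !dxpr & tech & other_dxpr
  data.frame(
    dxpr_or_tech  = as.integer(dxpr_only | tech_only | dxpr_and_tech),
    dxpr_only     = as.integer(dxpr_only),
    tech_only     = as.integer(tech_only),
    dxpr_and_tech = as.integer(dxpr_and_tech)
  )
}

# A technology code alone does not flag the condition ...
pccc_v3_columns(dxpr = FALSE, tech = TRUE, other_dxpr = FALSE)
# ... but it does once another condition is flagged by a dx/pr code.
pccc_v3_columns(dxpr = FALSE, tech = TRUE, other_dxpr = TRUE)
```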

The details in the list above might be easier to understand in a tabular form of
possible sets.  In the case of no conditions, only the
`<condition>_dxpr_or_tech` columns are flagged as 0/1 with the
`<condition>_dxpr_only`, `<condition>_tech_only`, and
`<condition>_dxpr_and_tech` columns set to `NA`.  When at least one condition
is flagged, all the columns will be populated as 0/1.

 | `cmrb_flag` | `num_cmrb` | `<condition>_dxpr_or_tech` | `<condition>_dxpr_only` | `<condition>_tech_only` | `<condition>_dxpr_and_tech` | `<other condition(s)>_dxpr_or_tech` |
 | :---:       | :---:      | :---:                      | :---:                   | :---:                  | :---:                     | :---:                              |
 | 0           | 0          | 0                          | 0                       | 0                      | 0                         | 0                                  |
 | 1           | 1          | 0                          | 0                       | 0                      | 0                         | 1                                  |
 | 1           | 1          | 1                          | 1                       | 0                      | 0                         | 0                                  |
 | 1           | 1          | 1                          | 1                       | 1                      | 1                         | 0                                  |
 | 1           | >1         | 0                          | 0                       | 0                      | 0                         | 1                                  |
 | 1           | >1         | 1                          | 0                       | 1                      | 0                         | 1                                  |
 | 1           | >1         | 1                          | 1                       | 0                      | 0                         | 1                                  |
 | 1           | >1         | 1                          | 1                       | 1                      | 1                         | 1                                  |

Now, consider another patient, pat2, with the same codes as pat1 except for
3432, the code mapping to a neuromuscular condition.
```{r, label = 'define-pat2'}
pat2 <- subset(pat1, code != "3432")
```
Under version 2 of the PCCC this patient will still have metabolic and
technology dependence conditions because the code 86.06 is in the record, but
will not have the neuromuscular condition.
```{r, label = "pat2-pccc-v2"}
pat2_pccc_v2.1 <-
  comorbidities(
    data = pat2,
    icd.codes = "code",
    dx.var = "dx",
    icdv = 9,
    method = "pccc_v2.1",
    flag.method = 'current',
    poa = 1
  )
Filter(f = function(x) x > 0, pat2_pccc_v2.1)
```

Under version 3 of the PCCC, this patient will have no conditions. This is
because no condition was identified based on non-technology dependent codes and
thus the one technology dependent code is ignored.
```{r, label = "pat2-pccc-v3"}
pat2_pccc_v3.1 <-
  comorbidities(
    data = pat2,
    icd.codes = "code",
    dx.var = "dx",
    icdv = 9,
    method = "pccc_v3.1",
    flag.method = 'current',
    poa = 1
  )
Filter(f = function(x) x > 0, pat2_pccc_v3.1)
```

# Expected Data Structures

The expected input data format for `comorbidities` is a "long" format.  The only
mandatory column is one column of ICD codes.  These codes can be full codes
(including the decimal point) or compact codes (omitting the decimal point).
Additionally, column(s) for identifying patients, encounters, and any other
important groups are encouraged.  A column to indicate the ICD version
(9 or 10), and another column for identifying the code as a diagnostic or
procedure code are also encouraged.

The example `mdcr` data set has four columns:

* `patid`: patient id,
* `icdv`: ICD version,
* `code`: compact ICD code, and
* `dx`: diagnostic/procedure indicator (1 for diagnostic, 0 for procedure).

```{r, label = "mdcr-data"}
head(mdcr)
str(mdcr)
```

Applying `pccc_v2.1` and `pccc_v3.1` methods to `mdcr` could be as simple as:
```{r, label = "mdcr-results-01"}
mdcr_results_v2.1_01 <-
  comorbidities(data = mdcr,
       icd.codes = "code",
       id.vars = "patid",
       poa = 1,
       flag.method = 'current',
       method = "pccc_v2.1")

mdcr_results_v3.1_01 <-
  comorbidities(data = mdcr,
       icd.codes = "code",
       id.vars = "patid",
       poa = 1,
       flag.method = 'current',
       method = "pccc_v3.1")
```
A useful summary of the object returned from `comorbidities` is obtained by
calling `summary()`. The return is a `data.frame` with columns for the counts
and percentages.  For `pccc_v2.0` and `pccc_v2.1` the condition, label, count,
and percentage are reported.  For `pccc_v3.0` and `pccc_v3.1` the columns are
extended to provide the counts and percentages for `dxpr_or_tech`, `dxpr_only`,
`tech_only`, and `dxpr_and_tech`.

```{r, label = "comorbidities-summary-table-str"}
str(summary(mdcr_results_v2.1_01))
str(summary(mdcr_results_v3.1_01))
```

The summary tables are `data.frame`s and can be manipulated by the end user for
reporting as needed; the following table is one example.

```{r, label = "mdcr-results-01-summary"}
x <-
  merge(
    summary(mdcr_results_v2.1_01),
    summary(mdcr_results_v3.1_01),
    all = TRUE,
    by = c("condition", "label"),
    sort = FALSE
  )
x[["condition"]] <- NULL
```

```{r, label = "mdcr-results-01-summary-kable", echo = FALSE, results = "asis"}
tab <-
  kableExtra::kbl(
    x,
    digits = 1,
    col.names = c("", rep(c("count", "%"), times = 5)),
    caption = "Summary Table for `mdcr_results_v2.1_01` and `mdcr_results_v3.1_01`."
  )
tab <-
  kableExtra::pack_rows(tab, group_label = c("Conditions"), start_row = 1, end_row = 11)
tab <-
  kableExtra::pack_rows(tab, group_label = c("Flags"), start_row = 12, end_row = 13)
tab <-
  kableExtra::pack_rows(tab, group_label = c("Total Conditions"), start_row = 14, end_row = 24)
tab <-
  kableExtra::add_header_above(tab, c("", "v2.1" = 2, "dxpr or tech" = 2, "dxpr only" = 2, "tech only" = 2, "dxpr and tech" = 2))
tab <-
  kableExtra::add_header_above(tab, c("", "", "", "v3.1" = 8))
tab
```

There are additional details to consider with respect to the ICD codes: the ICD
version (9 or 10) and whether the code is a diagnostic or a procedure code.
For example, ICD-9 diagnostic code 332.1 and ICD-9 procedure code 33.21 share
the compact code 3321.  In the `mdcr` data, where we have only compact codes,
distinguishing between ICD-9 diagnostic and ICD-9 procedure codes is critically
important.  In the `mdcr` data the code 3321 does appear as both diagnostic and
procedure.

```{r, label = "note-dx-pr-in-mdcr"}
pccc_codes[pccc_codes$code == "3321", ]
table(mdcr[mdcr$code == "3321", "dx"])
```
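
The collision described above can be reproduced with a toy lookup table.  This
is a minimal sketch using only base R; `lookup` and `record` are hypothetical
objects built for illustration and are not the PCCC lookup table.

```r
# Hypothetical two-row lookup: one ICD-9 diagnostic and one ICD-9 procedure
# code that share the same compact form.
lookup <- data.frame(
  full_code = c("332.1", "33.21"),
  dx        = c(1L, 0L),
  stringsAsFactors = FALSE
)
lookup$code <- gsub(".", "", lookup$full_code, fixed = TRUE)

# A single diagnostic code as it would appear in a compact-code data set.
record <- data.frame(code = "3321", dx = 1L, stringsAsFactors = FALSE)

# Joining on the compact code alone matches both lookup rows.
by_code <- merge(record, lookup, by = "code")

# Adding the diagnostic/procedure indicator to the key keeps only the
# diagnostic match.
by_code_dx <- merge(record, lookup, by = c("code", "dx"))

nrow(by_code)
nrow(by_code_dx)
by_code_dx$full_code
```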

To account for the diagnostic or procedure status of the codes, specify a value
for the `dx.var` argument.
```{r, label = "mdcr-results-02"}
mdcr_results_v2.1_02 <-
  comorbidities(
    data = mdcr,
    id.vars = "patid",
    icd.codes = "code",
    dx.var = "dx",
    flag.method = 'current',
    poa = 1,
    method = "pccc_v2.1"
  )

mdcr_results_v3.1_02 <-
  comorbidities(
    data = mdcr,
    id.vars = "patid",
    icd.codes = "code",
    dx.var = "dx",
    flag.method = 'current',
    poa = 1,
    method = "pccc_v3.1"
  )
```

Specificity is increased by using the diagnostic/procedure flag.
Using `pccc_v2.1` there are `r sum(mdcr_results_v2.1_02$cmrb_flag != mdcr_results_v2.1_01$cmrb_flag)`
false positive flags when the diagnostic/procedure flag is omitted from the
`comorbidities` call.
Using `pccc_v3.1` there are `r sum(mdcr_results_v3.1_02$cmrb_flag != mdcr_results_v3.1_01$cmrb_flag)`
false positive flags when the diagnostic/procedure flag is omitted from the
`comorbidities` call.

```{r}
# verify that the cmrb_flag and number of conditions is the same or less after
# accounting for the diagnostic/procedure flag in the comorbidities call
stopifnot(all(mdcr_results_v2.1_02$cmrb_flag <= mdcr_results_v2.1_01$cmrb_flag))
stopifnot(all(mdcr_results_v2.1_02$num_cmrb  <= mdcr_results_v2.1_01$num_cmrb))

sum(mdcr_results_v2.1_02$cmrb_flag != mdcr_results_v2.1_01$cmrb_flag)
sum(mdcr_results_v2.1_02$num_cmrb  != mdcr_results_v2.1_01$num_cmrb)

stopifnot(all(mdcr_results_v3.1_02$cmrb_flag <= mdcr_results_v3.1_01$cmrb_flag))
stopifnot(all(mdcr_results_v3.1_02$num_cmrb  <= mdcr_results_v3.1_01$num_cmrb))

sum(mdcr_results_v3.1_02$cmrb_flag != mdcr_results_v3.1_01$cmrb_flag)
sum(mdcr_results_v3.1_02$num_cmrb  != mdcr_results_v3.1_01$num_cmrb)
```

Let's explore the record for patient 87420.

```{r, include = FALSE}
# this chunk is not included, just test that the results for patient 87420 has
# not changed.
#
# NOTE: vignettes are built in one R session.  As a result, in a prior version
# of this package, mdcr was modified in the icd.Rmd script. Those changes
# persisted into this vignette and resulted in an error.  The fix was to use an
# object in the icd.Rmd vignette called mdcr_copy.
subset(mdcr, patid %in% mdcr[mdcr$code == "5641" & mdcr$dx == 1, "patid"])
pat87420 <- subset(mdcr, patid == 87420)
stopifnot(
  isTRUE(
    all.equal(
      pat87420,
      structure(list(patid = c(87420L, 87420L), icdv = c(9L, 9L), code = c("78321", "5641"), dx = c(1L, 1L)), row.names = 4073:4074, class = "data.frame"),
      check.attributes = FALSE
    )
  )
)
```

```{r}
subset(mdcr, patid == "87420")
subset(get_pccc_codes(), code %in% c("78321", "5641"))
```

```{r}
subset(mdcr_results_v2.1_01, patid == "87420", select = c("cmrb_flag", "renal"))
subset(mdcr_results_v2.1_02, patid == "87420", select = c("cmrb_flag", "renal"))

subset(mdcr_results_v3.1_01, patid == "87420", select = c("cmrb_flag", "renal_dxpr_or_tech"))
subset(mdcr_results_v3.1_02, patid == "87420", select = c("cmrb_flag", "renal_dxpr_or_tech"))

subset(get_icd_codes(with.descriptions = TRUE), full_code %in% c("56.41", "564.1"))
```

In the above, the compact code "5641" matches procedure code 56.41 for a renal
condition.  In `mdcr_results_v2.1_01` and `mdcr_results_v3.1_01` where no
distinction was made between diagnostic and procedure codes this patient was
flagged as having a renal condition. However, when reviewing the patient
record, the compact code "5641" is listed as a diagnostic code and the full
code 564.1 is for
`r subset(get_icd_codes(with.descriptions = TRUE), full_code == "564.1" & desc_end == 2015)$desc`.
This is an example of where discriminating between diagnostic and procedure
codes is critically important when looking for complex chronic conditions.

If we explicitly look at an inner join between this patient's data and the pccc
lookup table we see that the code 5641 matches the procedure code in the pccc
lookup table.  By not accounting for diagnostic and procedure codes, the
overlaps between the two coding structures can lead to false positives.

```{r}
merge(x = subset(mdcr, patid == "87420"),
      y = pccc_codes,
      by.x = c("code"),
      by.y = c("code"),
      suffixes = c(".mdcr", ".pccc_codes")
)
```

Using full codes can prevent false positives too.  Here are several different
ways that `comorbidities()` could be called resulting in different outcomes.

Note: this is a good example of how `medicalcoder` can handle full and compact
codes within a single record.

```{r}
DF <- data.frame(id = c("full dx", "full pr", "compact dx", "compact pr"),
                 code = c("564.1", "56.41", "5641", "5641"),
                 dx = c(1, 0, 1, 0))

# ideal: using the dx/pr status and matching on full and compact codes.
comorbidities(
  data = DF,
  id.vars = "id",
  dx.var = "dx",
  icd.codes = "code",
  poa = 1,
  method = "pccc_v3.1"
)[, c("id", "cmrb_flag", "renal_dxpr_or_tech")]

# false positive for the compact dx
comorbidities(
  data = DF,
  id.vars = "id",
  icd.codes = "code",
  poa = 1,
  method = "pccc_v3.1"
)[, c("id", "cmrb_flag", "renal_dxpr_or_tech")]

# false negative for compact pr
comorbidities(
  data = DF,
  id.vars = "id",
  icd.codes = "code",
  poa = 1,
  full.codes = TRUE,
  compact.codes = FALSE,
  method = "pccc_v3.1"
)[, c("id", "cmrb_flag", "renal_dxpr_or_tech")]

# false positive for compact dx
comorbidities(
  data = DF,
  id.vars = "id",
  icd.codes = "code",
  poa = 1,
  full.codes = FALSE,
  compact.codes = TRUE,
  method = "pccc_v3.1"
)[, c("id", "cmrb_flag", "renal_dxpr_or_tech")]

# false negatives for compact and full pr
comorbidities(
  data = DF,
  id.vars = "id",
  icd.codes = "code",
  dx.var = "dx",
  poa = 1,
  full.codes = FALSE,
  compact.codes = TRUE,
  method = "pccc_v3.1"
)[, c("id", "cmrb_flag", "renal_dxpr_or_tech")]
```

Another consideration is the version of ICD, 9 or 10.

The record for patid 95471 is a great example of the problem that a compact code
can cause.  "E030" matches the ICD-9 dx code E030 (its compact and full forms
are identical because there is no decimal point) and the compact form of the
ICD-10 dx code E03.0; only the ICD-10 code is related to a complex chronic
condition.

The ICD version inputs to the `comorbidities()` call affect the output.  When
the ICD version is supplied through a variable (`icdv.var`), rows with an `NA`
version are not joined against the lookup table; those codes are ignored and no
condition is flagged for them.  If we know that all codes should be compared
against only ICD-9 or only ICD-10, the `icdv` argument simplifies the call.  In
this case, no condition is flagged when the codes are treated as ICD-9 and a
condition is flagged when they are treated as ICD-10.

```{r, label = "patid95471"}
subset(mdcr, patid == "95471")

# no flag because icdv = 9 which treats all input codes as ICD-9
comorbidities(
  data = subset(mdcr, patid == "95471"),
  icd.codes = "code",
  id.vars = 'patid',
  dx.var = "dx",
  icdv = 9L,
  poa = 1,
  method = "pccc_v3.1"
)[, c('patid', 'cmrb_flag')]

# flag because icdv = 10 - same as using `icdv.var = "icdv"`
comorbidities(
  data = subset(mdcr, patid == "95471"),
  icd.codes = "code",
  id.vars = 'patid',
  dx.var = "dx",
  icdv = 10L,
  poa = 1,
  method = "pccc_v3.1"
)[, c('patid', 'cmrb_flag')]

comorbidities(
  data = subset(mdcr, patid == "95471"),
  icd.codes = "code",
  id.vars = 'patid',
  dx.var = "dx",
  icdv.var = "icdv",
  poa = 1,
  method = "pccc_v3.0"
)[, c('patid', 'cmrb_flag')]
```
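
The effect of a missing ICD version can be sketched with a toy join.  This is
an illustration under stated assumptions, not how `medicalcoder` performs its
lookup internally: `lookup` and `record` are hypothetical objects, and the
single lookup row stands for the ICD-10 meaning of compact code E030.

```r
# Hypothetical lookup: only the ICD-10 reading of "E030" is of interest.
lookup <- data.frame(icdv = 10L, code = "E030", stringsAsFactors = FALSE)

# The same compact code recorded with ICD version 9, 10, and missing.
record <- data.frame(
  row_id = 1:3,
  icdv   = c(9L, 10L, NA),
  code   = "E030",
  stringsAsFactors = FALSE
)

# With the version in the join key only the ICD-10 row matches; the row with
# a missing version finds no partner in the lookup and drops out.
matched_with_version <- merge(record, lookup, by = c("icdv", "code"))

# Without the version in the key every row matches.
matched_without_version <- merge(record, lookup, by = "code")

matched_with_version$row_id
nrow(matched_without_version)
```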

Lastly, it should be noted that a lot of the ambiguity resulting from compact
codes can be avoided when full codes are available.  `medicalcoder` can handle
both forms. In the example below we again use "E030" and assess it against
all full and compact codes (default), against only full codes, and lastly
against only compact codes.  Note in this example that we are not specifying the
ICD version nor the diagnostic/procedure status of the code.

```{r}
lookup_icd_codes("E030")
data <- data.frame(id = c("Ambiguous compact code", "Full ICD-9 code", "Full ICD-10 code"),
                   code  = c("E030", "E030", "E03.0"))
data

args <-
  list(
    data = data,
    id.vars = "id",
    icd.codes = "code",
    poa = 1,
    method = "pccc_v3.1"
  )

default <-
  do.call(comorbidities, c(args, list(full.codes = TRUE,  compact.codes = TRUE )))
full_only <-
  do.call(comorbidities, c(args, list(full.codes = TRUE,  compact.codes = FALSE)))
compact_only <-
  do.call(comorbidities, c(args, list(full.codes = FALSE, compact.codes = TRUE )))

default[,       c("id", "cmrb_flag")]
full_only[,     c("id", "cmrb_flag")]
compact_only[,  c("id", "cmrb_flag")]
```

With no information about whether "E030" is ICD-9 or ICD-10, full or compact
(it can only be a diagnostic code in either ICD-9 or ICD-10), we get different
flags.  The default, the most liberal approach, flags this example patient as
having a condition in all cases.  When only full codes are considered, only the
ICD-10 version matches.  When only compact codes are considered, the flag is
true for the ambiguous version and the ICD-9 full version since ICD-9 E030 is a
full code with the same compact form.
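
The three outcomes above follow from which form of the lookup code the input is
compared against.  A minimal base R sketch, assuming a one-row lookup for the
ICD-10 code E03.0 and a hypothetical helper `is_flagged()` (not a
`medicalcoder` function):

```r
# Hypothetical lookup entry: ICD-10 E03.0 in full and compact form.
pccc_full    <- "E03.0"
pccc_compact <- "E030"

# Compare input codes against the full form, the compact form, or both.
is_flagged <- function(x, full = TRUE, compact = TRUE) {
  hit_full    <- full    & (x %in% pccc_full)
  hit_compact <- compact & (x %in% pccc_compact)
  hit_full | hit_compact
}

# ambiguous compact code, full ICD-9 code, full ICD-10 code
input <- c("E030", "E030", "E03.0")

is_flagged(input)                   # full and compact
is_flagged(input, compact = FALSE)  # full only
is_flagged(input, full = FALSE)     # compact only
```

Because the sketch carries no ICD version, the ICD-9 full code "E030" is
indistinguishable from the ICD-10 compact code whenever compact matching is
enabled.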


# Longitudinal Conditions

The `medicalcoder` package includes the example data set, `mdcr_longitudinal`,
with ICD-9 and ICD-10 codes for `r length(unique(mdcr_longitudinal$patid))`
synthetic patients with multiple encounters. Each row has a date (encounter) for
when the ICD code was reported.

```{r}
head(mdcr_longitudinal)
```

Let's look at the `pccc_v2.1` flags for each patient, using all the information
from all the encounters.  This can easily be done by specifying
`id.vars = c("patid")` such that the `comorbidities` method considers all codes
as occurring on one encounter.

```{r, results = 'asis'}
longitudinal_v2_patid <-
  comorbidities(data = mdcr_longitudinal,
       icd.codes = "code",
       id.vars = c("patid"),
       icdv.var = "icdv",
       method = "pccc_v2.1",
       flag.method = "current",
       poa = 1
  )
kableExtra::kbl(longitudinal_v2_patid)
```

We can look at the conditions flagged at each encounter by specifying the
`id.vars = c("patid", "date")`.

```{r, results = 'asis'}
longitudinal_v2_patid_date <-
  comorbidities(data = mdcr_longitudinal,
       icd.codes = "code",
       id.vars = c("patid", "date"),
       icdv.var = "icdv",
       method = "pccc_v2.1",
       flag.method = "current",
       poa = 1)
kableExtra::kbl(
  subset(longitudinal_v2_patid_date, patid == "9663901"),
  row.names = FALSE
)
```

Looking at patid 9663901 at an encounter level we see that the conditions occur
at different moments in time and the conditions the patient has change over
time.  Because these are chronic conditions, once the condition is observed, it
should be considered to exist in perpetuity.

For `pccc_v2.1` a simple carry-forward method can be applied to the data set
to mark the presence of a condition at the time of reporting and thereafter.


```{r}
longitudinal_v2_patid_date_cumulative_poa0 <-
  comorbidities(
    data = mdcr_longitudinal,
    icd.codes = "code",
    id.vars = c("patid", "date"),
    icdv.var = "icdv",
    method = "pccc_v2.1",
    flag.method = "cumulative",
    poa = 0
  )

kableExtra::kbl(
  subset(longitudinal_v2_patid_date_cumulative_poa0, patid == "9663901"),
  row.names = FALSE
)
```

```{r, results="asis"}
longitudinal_v2_patid_date_cumulative_poa1 <-
  comorbidities(
    data = mdcr_longitudinal,
    icd.codes = "code",
    id.vars = c("patid", "date"),
    icdv.var = "icdv",
    method = "pccc_v2.1",
    flag.method = "cumulative",
    poa = 1
  )
kableExtra::kbl(
  subset(longitudinal_v2_patid_date_cumulative_poa1, patid == "9663901"),
  row.names = FALSE
)
```
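
The carry-forward idea itself can be sketched in base R with `ave()` and
`cummax()`.  This is an illustration on made-up data, not the package's
implementation; the `enc` data frame and its `renal` flag are hypothetical, and
the role of `poa` is not modelled.

```r
# Hypothetical per-encounter flags for two patients.
enc <- data.frame(
  patid     = c(1, 1, 1, 1, 2, 2, 2),
  encounter = c(1, 2, 3, 4, 1, 2, 3),
  renal     = c(0, 1, 0, 0, 0, 0, 1)
)

# Order by patient and encounter, then carry the flag forward within each
# patient: once the flag is 1 it remains 1 on all later encounters.
enc <- enc[order(enc$patid, enc$encounter), ]
enc$renal_cumulative <- ave(enc$renal, enc$patid, FUN = cummax)

enc
```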

For `pccc_v3.0` and `pccc_v3.1` a simple carry-forward method would not be easy
to use as information about technology dependent codes is omitted when
non-technology dependent codes do not exist.

Let's use three ICD-10 diagnostic codes for this example and we will explore
all six possible permutations of the codes.  We'll generate a data set with
seven encounters and one code appearing on each of encounters 2, 4, and 6.

The codes we'll use are:

* H49.811: metabolic (other metabolic disorders),
* J84.111: respiratory (chronic respiratory diseases), and
* Z96.41:  metabolic (device and technology use).

```{r}
codes <- c("H49.811", "J84.111", "Z96.41")
subset(get_pccc_codes(), full_code %in% codes)
```
The constructed data and permutations are:
```{r}
permutations <-
  data.table::data.table(
    permutation = rep(1:6, each = 7),
    encounter_id = rep(1:7, times = 6),
    code =
      codes[c(NA, 1, NA, 2, NA, 3, NA,
              NA, 1, NA, 3, NA, 2, NA,
              NA, 2, NA, 1, NA, 3, NA,
              NA, 2, NA, 3, NA, 1, NA,
              NA, 3, NA, 1, NA, 2, NA,
              NA, 3, NA, 2, NA, 1, NA)]
  )

permutations[, plabel := paste(na.omit(code), collapse = ", "), by = .(permutation)]
permutations[, plabel := paste0("Permutation ", permutation, ": ", plabel)]
str(permutations, vec.len = 1)
```

```{r, echo = FALSE, results = "asis"}
cat(paste("*", permutations[, unique(plabel)]), sep = "\n")
```

We'll apply the `pccc_v3.1` to this code set with `flag.method = "cumulative"`
and all codes considered to be present-on-admission.
```{r}
rtn <-
  comorbidities(
    data = permutations,
    icd.codes = "code",
    id.vars = c("permutation", "plabel", "encounter_id"),
    icdv = 10L,
    compact.codes = FALSE,
    method = "pccc_v3.1",
    flag.method = "cumulative",
    poa = 1
  )
```

```{r, label = "setup-rtn-for-discussion", include = FALSE}
rtn_wide <-
  data.table::dcast(
    encounter_id ~ plabel,
    data = rtn,
    value.var = c("metabolic_dxpr_or_tech", "metabolic_dxpr_only",
                  "metabolic_tech_only", "metabolic_dxpr_and_tech",
                  "respiratory_dxpr_or_tech", "respiratory_dxpr_only",
                  "respiratory_tech_only", "respiratory_dxpr_and_tech",
                  "cmrb_flag", "num_cmrb")
  )
```

```{r, label = "define-pkbl", include = FALSE}
pkbl <- function(permutation = 1) {
  stopifnot(length(permutation) == 1L)

  x <- rtn_wide[
        ,
        .SD,
        .SDcols = c("encounter_id",
                    grep(paste0("Permutation ", permutation), names(rtn_wide), value = TRUE))
       ]

  pl <- rtn[["plabel"]][rtn$permutation == permutation][1]

  tab <-
    kableExtra::kbl(
      x,
      col.names = c("encounter_id", rep(c("dxpr or tech", "dxpr only", "tech only", "dxpr and tech"), times = 2), "ccc flag", "num ccc")
    )
  tab <- kableExtra::add_header_above(kable_input = tab, header = c("", c("Metabolic" = 4, "Respiratory" = 4), "", ""))
  tab <- kableExtra::add_header_above(kable_input = tab, header = c("", setNames(10, pl)))
  tab
}
```

Let's walk through the results for each permutation.

**Permutation 1**

```{r, echo = FALSE, results = "asis"}
pkbl(1)
```

The first code to appear in this permutation is H49.811, metabolic (other).
This is a diagnostic code and will flag the metabolic condition for encounters 2
through 7 as `dxpr_or_tech = 1`.  The Z96.41 code, metabolic (tech), appears on
encounter 6.  Thus, for encounters 2 through 5 metabolic should be flagged as
`dxpr_or_tech = 1`, `dxpr_only = 1`, `tech_only = 0`, and `dxpr_and_tech = 0`.
Encounters 6 and 7 then have `dxpr_only = 0` and `tech_only = 0` with
`dxpr_and_tech = 1`.
The J84.111 code for respiratory is a non-tech code appearing on encounter 4
and should flag as `dxpr_or_tech = 1`, `dxpr_only = 1`, `tech_only = 0`, and
`dxpr_and_tech = 0` for encounters 4 through 7.

**Permutation 2**
```{r, echo = FALSE, results = "asis"}
pkbl(2)
```

As with permutation 1, having the non-tech dependent metabolic code H49.811
appearing on encounter 2 means that metabolic is flagged for encounters 2
through 7.  What should differ is that `dxpr_only` is 1 for encounters 2 and 3,
with `dxpr_and_tech` flagging for encounters 4 through 7.  Lastly, the non-tech
code J84.111 flags the respiratory condition as `dxpr_or_tech = dxpr_only = 1`
for encounters 6 and 7.

**Permutation 3**
```{r, echo = FALSE, results = "asis"}
pkbl(3)
```
Permutation 3 has respiratory flagged for encounters 2 through 7.  The
non-tech metabolic code on encounter 4 results in the flagging of metabolic for
encounters 4 through 7.

**Permutation 4**
```{r, echo = FALSE, results = "asis"}
pkbl(4)
```

Permutation 4 is notable because the presence of the respiratory condition on
encounters 2 through 7 means that when the technology dependent metabolic code
appears on encounter 4, a metabolic condition is flagged for encounters 4
through 7.  Compare this with permutations 5 and 6.

**Permutation 5**
```{r, echo = FALSE, results = "asis"}
pkbl(5)
```

For permutation 5 the first code is a technology-dependent metabolic code on
encounter 2.  Because the _only_ code for flagging a condition is a
technology-dependent code, the PCCC version 3 algorithm results in _no_
condition being flagged for encounters 2 and 3.  On encounter 4, when the
non-tech metabolic code appears, the metabolic condition is flagged and the
history of the technology-dependent code persists.

**Permutation 6**
```{r, echo = FALSE, results = "asis"}
pkbl(6)
```

As with permutation 5, since the only code in the record for encounters 2 and 3
is the technology-dependent metabolic code, there is no flagged condition.  On
encounter 4, when the dxpr code for a respiratory condition is reported, the
respiratory condition is flagged _and_ the metabolic condition is flagged as
technology dependent.  Note that technology-only conditions are flagged only if
at least one other condition is flagged.
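
The behavior described for permutations 4 through 6 can be sketched in base R.
This is a toy reading of the rule as stated in the text, not the package's
implementation.  The helper `toy_flag_metabolic()` and its arguments are
hypothetical, and the input vectors assume seven encounters with the codes
placed on the encounters named above.

```{r}
# Hypothetical helper, NOT part of medicalcoder.  A tech code contributes to
# the metabolic flag only once some non-tech code (for metabolic or for
# another condition) is on record.
toy_flag_metabolic <- function(metabolic_dxpr, metabolic_tech, other_dxpr) {
  dxpr_seen  <- cummax(metabolic_dxpr)  # non-tech metabolic code seen so far
  tech_seen  <- cummax(metabolic_tech)  # tech metabolic code seen so far
  other_seen <- cummax(other_dxpr)      # non-tech code for another condition
  tech_counts <- tech_seen == 1 & (dxpr_seen == 1 | other_seen == 1)
  as.integer(dxpr_seen == 1 | tech_counts)
}

# Permutation 4: respiratory code on encounter 2, tech metabolic on encounter 4
toy_p4 <- toy_flag_metabolic(
  metabolic_dxpr = c(0, 0, 0, 0, 0, 0, 0),
  metabolic_tech = c(0, 0, 0, 1, 0, 0, 0),
  other_dxpr     = c(0, 1, 0, 0, 0, 0, 0)
)

# Permutation 5: tech metabolic on encounter 2, non-tech metabolic on encounter 4
toy_p5 <- toy_flag_metabolic(
  metabolic_dxpr = c(0, 0, 0, 1, 0, 0, 0),
  metabolic_tech = c(0, 1, 0, 0, 0, 0, 0),
  other_dxpr     = c(0, 0, 0, 0, 0, 0, 0)
)

# Permutation 6: tech metabolic on encounter 2, respiratory code on encounter 4
toy_p6 <- toy_flag_metabolic(
  metabolic_dxpr = c(0, 0, 0, 0, 0, 0, 0),
  metabolic_tech = c(0, 1, 0, 0, 0, 0, 0),
  other_dxpr     = c(0, 0, 0, 1, 0, 0, 0)
)

# Rows are permutations, columns are encounters 1 through 7
rbind(permutation_4 = toy_p4, permutation_5 = toy_p5, permutation_6 = toy_p6)
```

In all three toy cases the metabolic flag first appears on encounter 4, which
matches the narrative above: the tech code alone on encounters 2 and 3 of
permutations 5 and 6 is not enough.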

# Subconditions

The documentation for PCCC version 2 [@feudtner2014pediatric] and version 3
[@feinstein2024pediatric] includes subconditions under each of the major
conditions.  However, to our knowledge, no software prior to `medicalcoder`
implemented flagging for the subconditions.

The subconditions for each condition are shown in the next table.

```{r, label = "tbl-syntactically-valid-subconditions", echo = FALSE, results="asis"}
SCNDS <- get_pccc_conditions()
data.table::setDT(SCNDS)
data.table::setkey(SCNDS, condition, subcondition)
SCNDS[, condition := paste(condition, condition_label, sep = ": ")]

tab <-
  kableExtra::kbl(SCNDS[, .(subcondition, subcondition_label)],
      caption = "Syntactically valid names for subconditions",
      row.names = FALSE
  )
tab <-
  kableExtra::pack_rows(tab, index = table(SCNDS$condition))
tab
```

To get the subconditions, set `subconditions = TRUE` in the `comorbidities()`
call.  For this example we will apply `pccc_v3.1` with and without
subconditions.

```{r}
without_subconditions <-
  comorbidities(
    data = mdcr,
    id.vars = "patid",
    icd.codes = "code",
    icdv.var = "icdv",
    dx.var = "dx",
    poa = 1,
    method = "pccc_v3.1",
    subconditions = FALSE
  )

with_subconditions <-
  comorbidities(
    data = mdcr,
    id.vars = "patid",
    icd.codes = "code",
    icdv.var = "icdv",
    dx.var = "dx",
    poa = 1,
    method = "pccc_v3.1",
    subconditions = TRUE
  )
```

The structure of the return object `with_subconditions` is a list with two
elements.  The first element, `conditions`, is identical to the results
of calling `comorbidities()` with `subconditions = FALSE`.

```{r}
with_subconditions

all.equal(with_subconditions$conditions,
          without_subconditions,
          check.attributes = FALSE)
```

The second element of `with_subconditions` is a list of `data.frame`s, one for
each condition, with indicators for only those with the condition.

A quick and easy way to get a summary of the subconditions is to call
`summary()`.

```{r}
str(
  summary(with_subconditions)
)
```
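
To make the summary columns concrete, the following base R sketch computes a
count and two percentages by hand on a small mock object.  The mock data are
hypothetical and only mimic the shape described above.  The denominators are
our reading of the column names `percent_of_cohort` and
`percent_of_those_with_condition`; this is an assumption for illustration, not
a statement of how `summary()` is implemented.

```{r}
# Mock object (hypothetical data) shaped like the return described above:
# a `conditions` data.frame and a list of per-condition data.frames.
toy <- list(
  conditions = data.frame(
    patid = 1:8,
    metabolic_dxpr_or_tech = c(1, 1, 1, 1, 0, 0, 0, 0)
  ),
  subconditions = list(
    metabolic = data.frame(
      patid = 1:4,  # only the patients flagged for metabolic
      device_and_technology_use = c(1, 0, 0, 1),
      other_metabolic_disorders = c(0, 1, 1, 1)
    )
  )
)

toy_sub    <- toy$subconditions$metabolic
toy_counts <- colSums(toy_sub[, -1])  # drop the id column, count the flags

toy_summary <- data.frame(
  subcondition = names(toy_counts),
  count = as.integer(toy_counts),
  # assumed denominator: everyone in the cohort
  percent_of_cohort = unname(100 * toy_counts / nrow(toy$conditions)),
  # assumed denominator: only those with the primary condition
  percent_of_those_with_condition = unname(100 * toy_counts / nrow(toy_sub)),
  row.names = NULL
)
toy_summary
```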

The subconditions are available for all PCCC variants.  A summary is presented
in the following table.

```{r, include = FALSE}
args <-
  list(
    data = mdcr,
    id.vars = "patid",
    icd.codes = "code",
    icdv.var = "icdv",
    dx.var = "dx",
    poa = 1,
    subconditions = TRUE
  )
with_subconditions_v2.0 <- do.call(comorbidities, c(args, list(method = "pccc_v2.0")))
with_subconditions_v2.1 <- do.call(comorbidities, c(args, list(method = "pccc_v2.1")))
with_subconditions_v3.0 <- do.call(comorbidities, c(args, list(method = "pccc_v3.0")))
with_subconditions_v3.1 <- do.call(comorbidities, c(args, list(method = "pccc_v3.1")))
```

```{r, echo = FALSE, results = "asis"}
rslts <-
  merge(
    merge(
      summary(with_subconditions_v2.0),
      summary(with_subconditions_v2.1),
      by = c("condition", "subcondition"),
      suffixes = c("_v2.0", "_v2.1"),
      sort = FALSE
    ),
    merge(
      summary(with_subconditions_v3.0),
      summary(with_subconditions_v3.1),
      by = c("condition", "subcondition"),
      suffixes = c("_v3.0", "_v3.1"),
      sort = FALSE
    ),
    by = c("condition", "subcondition"),
    sort = FALSE
  )

rslts$idx <- seq_len(nrow(rslts))

rslts <-
  merge(rslts,
        unique(get_pccc_conditions()[c("condition", "condition_label")]),
        all = TRUE,
        by = "condition",
        sort = FALSE)
rslts <-
  merge(rslts,
        unique(get_pccc_conditions()[c("subcondition", "subcondition_label")]),
        all = TRUE,
        by = "subcondition",
        sort = FALSE)
rslts$lab <- rslts$subcondition_label
rslts$lab[is.na(rslts$subcondition)] <- rslts$condition_label[is.na(rslts$subcondition)]
rslts <- rslts[order(rslts$idx), ]

tab <-
  rslts[,
    c(
      "lab",
      "count_v2.0",
      "percent_of_cohort_v2.0",
      "percent_of_those_with_condition_v2.0",
      "count_v2.1",
      "percent_of_cohort_v2.1",
      "percent_of_those_with_condition_v2.1",
      "count_v3.0",
      "percent_of_cohort_v3.0",
      "percent_of_those_with_condition_v3.0",
      "count_v3.1",
      "percent_of_cohort_v3.1",
      "percent_of_those_with_condition_v3.1"
    )
  ]

tab <-
  kableExtra::kbl(
    tab,
    col.names = c("", rep(c("count", "% of cohort", "% of those with condition"), 4)),
    row.names = FALSE,
    digits = 1
  )
tab <- kableExtra::column_spec(tab, column = 1, bold = is.na(rslts$subcondition))
tab <- kableExtra::kable_styling(tab, "striped")
tab <- kableExtra::add_indent(tab, which(!is.na(rslts$subcondition)))
tab <- kableExtra::add_header_above(tab, c("", "v2.0" = 3, "v2.1" = 3, "v3.0" = 3, "v3.1" = 3))

tab
```

The longitudinal assessment for subconditions works as well.  Using the same
`permutations` data set from above we will look at the metabolic and respiratory
conditions and subconditions.

```{r}
rslts <-
  comorbidities(
    data = permutations,
    icd.codes = "code",
    id.vars = c("permutation", "plabel", "encounter_id"),
    icdv = 10L,
    compact.codes = FALSE,
    method = "pccc_v3.1",
    flag.method = "cumulative",
    poa = 1,
    subconditions = TRUE
  )
```

Let's start by looking at the respiratory results.  The only subcondition that
should be, and is, flagged is chronic respiratory diseases.  A reminder: the
`data.frame` for a subcondition reports rows only for encounters where the
primary condition was flagged.  The following shows that the encounters where
the chronic respiratory diseases subcondition is flagged are consistent with
the encounters where the primary respiratory condition is flagged.

```{r}
all(rslts$subconditions$respiratory$chronic_respiratory_diseases == 1)
sapply(rslts$subconditions$respiratory[, -(1:3)], max)

# which encounters flag for primary condition respiratory?
cnd <-
  rslts$conditions[
    respiratory_dxpr_or_tech == 1,
    .(cencid = paste(encounter_id, collapse = ", ")),
    by = .(plabel)
  ]


# which encounters flag for the subcondition chronic_respiratory_diseases?
scnd <-
  rslts$subconditions$respiratory[
    ,
    .(sencid = paste(encounter_id, collapse = ", ")),
    by = .(plabel)
  ]
```
```{r, echo = FALSE, results = "asis"}
tab <-
  kableExtra::kbl(
    merge(cnd, scnd, all = TRUE, by = "plabel"),
    caption = "Encounters flagging for respiratory condition and the chronic respiratory disease subcondition.",
    col.names = c("", "Condition", "Subcondition")
  )
tab <- kableExtra::add_header_above(tab, c("", "Encounters" = 2))
tab
```

For the metabolic condition we have two subconditions to look at, 1) device and
technology use, and 2) other metabolic disorders.

```{r}
# which encounters flag for primary condition metabolic?
cnd <-
  rslts$conditions[
    metabolic_dxpr_or_tech == 1,
    .(cencid = paste(encounter_id, collapse = ", ")),
    by = .(plabel)
  ]

# which encounters flag for the subconditions?
scnd <-
  data.table::melt(
    rslts$subconditions$metabolic,
    id.vars = c("plabel", "encounter_id"),
    measure.vars = c("device_and_technology_use", "other_metabolic_disorders"),
    variable.factor = FALSE,
    variable.name = "subcondition"
  )
scnd <- scnd[value == 1]
scnd <-
  scnd[
    ,
    .(sencid = paste(encounter_id, collapse = ", ")),
    by = .(plabel, subcondition)
  ]

scnd <-
  data.table::dcast(
    scnd,
    plabel ~ subcondition,
    value.var = "sencid"
  )
```

```{r, echo = FALSE, results = "asis"}
tab <-
  kableExtra::kbl(
    x = merge(cnd, scnd, all = TRUE, by = "plabel"),
    caption = "Encounters flagging for a metabolic condition and the encounters flagging for subconditions device and technology use and/or other metabolic disorders.",
    col.names = c("", "Condition", "Device and Technology Use", "Other Metabolic Disorders")
  )
tab <-
  kableExtra::add_header_above(tab, c("", "Encounters" = 3))
tab
```


# References

<!-- ----------------------------------------------------------------------- -->
<!--                              End of File                                -->
<!-- ----------------------------------------------------------------------- -->
