About Rtables

rtables is an R package developed by Gabriel Becker and Adrian Waddell. It is sponsored by the Roche Group and is available as open source at github.com/Roche/rtables. The rtables package was initially designed by Adrian Waddell as a proof of concept and was used within Roche for tabulating clinical trials data. In 2019, Gabriel Becker joined the rtables project and together we redesigned the package with a focus on a more involved data structure and a more powerful tabulation framework. The redesign allows more general accessors and modifiers using pathing, better pagination, and future features such as titles, footnotes, and a cell referencing system. Further, the redesign improved the tabulation speed.

This document includes two vignettes, introduction and clinical_trials, from the rtables package at commit 83f653080cd5b21bc87b74c5701f664d474c1d74. These vignettes are intended to give a good overview of the capabilities of rtables. There are other vignettes available in the package if one wants to get a deeper understanding of the rtables framework. You can install the version used for this document with:

devtools::install_github("Roche/rtables", ref = "83f653080cd5b21bc87b74c5701f664d474c1d74")

The rtables package is currently not published on CRAN because we are still refining some design details (mainly around pathing and visualizing the tree data structure) before submitting it.

rtables outputs tables to ASCII with the toString function and to HTML with the as_html function. Note that it is currently not possible to add row gaps (empty rows or white space) when outputting the table. Row gaps are a feature that is essential neither for tabulation nor for designing the table data structure. Instead, most often the row gaps can be determined from the underlying table data structure by the outputting algorithm. However, specifying row gaps is a feature that is on our roadmap.

Introduction to rtables

Overview

The rtables R package provides a framework to create, tabulate and output tables in R. Most of the design requirements for rtables have their origin in studying tables that are commonly used to report analyses from clinical trials; however, we were careful to keep rtables a general purpose toolkit.

There are a number of other table frameworks available in R, such as gt from RStudio, xtable, tableone, and tables, to name a few. There are a number of reasons to implement rtables (yet another tables R package):

output tables in ASCII to text files
rtables has two powerful tabulation frameworks: rtabulate and the layouting based tabulation framework
table view (ASCII, HTML, etc.) is separate from the data model. Hence, one always has access to the non-rounded/non-formatted numbers.
pagination to meet the health authority submission requirements
cell, row, column, table reference system (to be implemented)
titles and footnotes (to be implemented)
path based access to cell content which will be useful for automated content generation

In the remainder of this section, we give a short introduction to rtables and to tabulating a table. The content is based on the useR 2020 presentation by Gabriel Becker.

The packages used for this section are rtables and dplyr:

library(rtables)
library(dplyr)

Data

The data used in this section is made up using random number generators. The data content is relatively simple: one row per imaginary person and one column per measurement: study arm, country of origin, gender, handedness, age, and weight.

n <- 400

set.seed(1)

df <- tibble(
  arm = factor(sample(c("Arm A", "Arm B"), n, replace = TRUE), levels = c("Arm A", "Arm B")),
  country = factor(sample(c("CAN", "USA"), n, replace = TRUE, prob = c(.55, .45)), levels = c("CAN", "USA")),
  gender = factor(sample(c("Female", "Male"), n, replace = TRUE), levels = c("Female", "Male")),
  handed = factor(sample(c("Left", "Right"), n, prob = c(.6, .4), replace = TRUE), levels = c("Left", "Right")),
  age = rchisq(n, 30) + 10
) %>% mutate(
  weight = 35 * rnorm(n, sd = .5) + ifelse(gender == "Female", 140, 180)
) 

head(df)

# A tibble: 6 x 6
  arm   country gender handed   age weight
  <fct> <fct>   <fct>  <fct>  <dbl>  <dbl>
1 Arm A USA     Female Left    31.3   139.
2 Arm B CAN     Female Right   50.5   116.
3 Arm A USA     Male   Right   32.4   186.
4 Arm A USA     Male   Right   34.6   169.
5 Arm B USA     Female Right   43.0   160.
6 Arm A USA     Female Right   43.2   126.

Note that we use factor variables so that the level order is reflected in the row or column order when we tabulate the information in df below.
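The effect of factor levels on ordering can be checked with base R alone. The following minimal sketch uses a small made-up vector (handed_chr and handed_fct are hypothetical names, not part of df) and table() to show that a character vector is tabulated alphabetically, whereas a factor is tabulated in the order of its levels:

```r
# Hypothetical toy vector, not part of the df defined above
handed_chr <- c("Right", "Left", "Right", "Right")

# A character vector is tabulated in alphabetical order
names(table(handed_chr))

# A factor is tabulated in the order of its levels
handed_fct <- factor(handed_chr, levels = c("Right", "Left"))
names(table(handed_fct))
```

The first call lists Left before Right, the second lists Right before Left. This is the ordering behavior that the factor variables in df rely on.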

Building a Table

The aim of this section is to build the following table step by step:

                    Arm A                     Arm B         
             Female        Male        Female        Male   
             (N=96)      (N=105)       (N=92)      (N=107)  
------------------------------------------------------------
CAN        45 (46.9%)    64 (61%)     46 (50%)    62 (57.9%)
  Left     32 (33.3%)    42 (40%)    26 (28.3%)   37 (34.6%)
    mean      38.9         40.4         40.3         37.7   
  Right    13 (13.5%)    22 (21%)    20 (21.7%)   25 (23.4%)
    mean      36.6         40.2         40.2         40.6   
USA        51 (53.1%)    41 (39%)     46 (50%)    45 (42.1%)
  Left     34 (35.4%)   19 (18.1%)   25 (27.2%)   25 (23.4%)
    mean      40.4         39.7         39.2         40.1   
  Right    17 (17.7%)    22 (21%)    21 (22.8%)   20 (18.7%)
    mean      36.9         39.8         38.5          39

Starting Simple

In rtables a basic table is defined to have 0 rows and one column representing all data. Analyzing a variable is one way of adding a row:

l <- basic_table() %>%
  analyze("age", mean, format = "xx.x")

build_table(l, df)

       all obs
--------------
mean    39.4

Layout Instructions

In the code above we first described the table and assigned that description to a variable l. We then built the table using the actual data with build_table. The description of a table is called a table layout. basic_table is the start of every table layout and contains the information that we have one column representing all data. The analyze instruction adds to the layout that the age variable should be analyzed with the mean analysis function and the result should be rounded to 1 decimal place.

Hence, a layout is “pre-data”, that is, it’s a description of how to build a table once we get data. We can look at the layout in isolation by printing l:

A Pre-data Table Layout

Column-Split Structure:
( () ->  () -> ) () 

Row-Split Structure:
age (** analysis **)
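Because a layout is defined before any data is seen, one layout can be applied to several data sets. The following is a minimal sketch under stated assumptions: the data frame toy and the names lyt, tbl_can and tbl_usa are made up for illustration, and the calls are written as nested function calls rather than with the pipe so that only rtables is needed. Function arguments may differ between rtables versions.

```r
library(rtables)

# Hypothetical toy data, only for illustration
toy <- data.frame(
  country = factor(c("CAN", "CAN", "USA", "USA")),
  age = c(30, 40, 50, 70)
)

# The layout is defined once, without reference to any data
# (equivalent to basic_table() piped into analyze())
lyt <- analyze(basic_table(), "age", mean, format = "xx.x")

# The same layout is then applied to two different data sets
tbl_can <- build_table(lyt, toy[toy$country == "CAN", ])
tbl_usa <- build_table(lyt, toy[toy$country == "USA", ])

tbl_can
tbl_usa
```

Both tables share the same structure (one column, one mean row) and differ only in the cell value derived from the data passed to build_table.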

The general layouting instructions are summarized below:

basic_table is a layout representing a table with zero rows and one column
Nested splitting
- in row space: split_rows_by, split_rows_by_multivar, split_rows_by_cuts, split_rows_by_cutfun, split_rows_by_quartiles
- in column space: split_cols_by, split_cols_by_cuts, split_cols_by_cutfun, split_cols_by_quartiles
Summarizing Groups: summarize_row_groups
Analyzing Variables: analyze, analyze_against_baseline, analyze_colvars, analyze_row_groups

Using these functions it is possible to create a wide variety of tables, as we will show in this document.

Adding Column Structure

We will now add more structure to the columns by adding a column split based on the factor variable arm:

l <- basic_table() %>%
  split_cols_by("arm") %>%
  analyze("age", afun = mean, format = "xx.x")

build_table(l, df)

       Arm A   Arm B
--------------------
mean   39.5    39.4

The resulting table has one column per factor level of arm. So the data represented by the first column is df[df$arm == "Arm A", ]. Hence, split_cols_by partitions the data among the columns by default.

Column splitting can be done in a recursive/nested manner by adding sequential split_cols_by layout instructions. It’s also possible to add a non-nested split. Here we split each arm further by gender:

l <- basic_table() %>%
  split_cols_by("arm") %>%
  split_cols_by("gender") %>%
  analyze("age", afun = mean, format = "xx.x")

build_table(l, df)

           Arm A           Arm B    
       Female   Male   Female   Male
------------------------------------
mean    38.8    40.1    39.6    39.2

The first column represents the data in df where df$arm == "Arm A" & df$gender == "Female", the second column the data in df where df$arm == "Arm A" & df$gender == "Male", and so on.
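The partitioning performed by nested column splits can be mimicked in base R. The sketch below is not how rtables computes the table; it uses a small made-up data frame (toy is a hypothetical name) and tapply to obtain one mean per arm and gender combination, and then confirms one of the values by explicit subsetting:

```r
# Hypothetical toy data, only for illustration
toy <- data.frame(
  arm    = factor(c("Arm A", "Arm A", "Arm B", "Arm B", "Arm A", "Arm B")),
  gender = factor(c("Female", "Male", "Female", "Male", "Female", "Male")),
  age    = c(30, 40, 50, 60, 34, 64)
)

# One mean per arm/gender combination, mirroring the four table columns
col_means <- tapply(toy$age, list(toy$arm, toy$gender), mean)
col_means

# The same value as in the first column, obtained by explicit subsetting
first_col <- mean(toy$age[toy$arm == "Arm A" & toy$gender == "Female"])
first_col

# Every observation falls into exactly one column
sum(table(toy$arm, toy$gender)) == nrow(toy)
```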

Adding Row Structure

So far, we have created layouts with analysis and column splitting instructions, i.e. analyze and split_cols_by, respectively. This resulted in a table with multiple columns and one data row. We will add more row structure by stratifying the mean analysis by country (i.e. adding a split in the row space):

l <- basic_table() %>%
  split_cols_by("arm") %>%
  split_cols_by("gender") %>%
  split_rows_by("country") %>%
  analyze("age", afun = mean, format = "xx.x")

build_table(l, df)

             Arm A           Arm B    
         Female   Male   Female   Male
--------------------------------------
CAN                                   
  mean    38.2    40.3    40.3    38.9
USA                                   
  mean    39.2    39.7    38.9    39.6

In this table the data used to derive the first data cell (average age of female Canadians in Arm A) is the subset where df$country == "CAN" & df$arm == "Arm A" & df$gender == "Female". This cell value can also be calculated manually:

mean(df$age[df$country == "CAN" & df$arm == "Arm A" & df$gender == "Female"])

[1] 38.22447

Adding Group Information

When adding row splits we get by default label rows for each split level, for example CAN and USA in the table above. Besides the column space subsetting, we have now further subsetted the data for each cell. When defining a row split it is often useful to display information about each row group. In rtables this is referred to as content information. For example, the mean on row 2 is a descendant of CAN (visible via the indentation; the table has an underlying tree structure that is not important for this section). To add content information and turn the CAN label row into a content row, the summarize_row_groups function is required. By default, the count (nrows) and the percentage of data relative to the column-associated data are calculated:

l <- basic_table() %>%
  split_cols_by("arm") %>%
  split_cols_by("gender") %>%
  split_rows_by("country") %>%
  summarize_row_groups() %>%
  analyze("age", afun = mean, format = "xx.x")

build_table(l, df)

                 Arm A                   Arm B        
           Female       Male      Female       Male   
------------------------------------------------------
CAN      45 (46.9%)   64 (61%)   46 (50%)   62 (57.9%)
  mean      38.2        40.3       40.3        38.9   
USA      51 (53.1%)   41 (39%)   46 (50%)   45 (42.1%)
  mean      39.2        39.7       38.9        39.6

The count and relative percentage for female Canadians in Arm A (the first content cell) are calculated as follows:

df_cell <- subset(df, df$country == "CAN" & df$arm == "Arm A" & df$gender == "Female")
df_col_1 <- subset(df, df$arm == "Arm A" & df$gender == "Female")

c(count = nrow(df_cell), percentage = nrow(df_cell)/nrow(df_col_1))

     count percentage 
  45.00000    0.46875

so the group percentages per row split sum up to 1 for each column.
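The claim that the group percentages add up to 1 within each column can be illustrated with base R. This is a minimal sketch on made-up data (toy, counts and pct are hypothetical names), using prop.table with margin = 2 so that each count is divided by its column total:

```r
# Hypothetical toy data, only for illustration
toy <- data.frame(
  country = factor(c("CAN", "USA", "CAN", "CAN", "USA", "USA", "CAN")),
  arm     = factor(c("Arm A", "Arm A", "Arm A", "Arm B", "Arm B", "Arm B", "Arm B"))
)

# Counts per country (rows) and arm (columns)
counts <- table(toy$country, toy$arm)
counts

# margin = 2 divides each count by its column total
pct <- prop.table(counts, margin = 2)
pct

# Each column sums to 1
colSums(pct)
```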

We can further split the row space by dividing each country by handedness:

l <- basic_table() %>%
  split_cols_by("arm") %>%
  split_cols_by("gender") %>%
  split_rows_by("country") %>%
  summarize_row_groups() %>%
  split_rows_by("handed") %>%
  analyze("age", afun = mean, format = "xx.x")

build_table(l, df)

                   Arm A                   Arm B        
             Female       Male      Female       Male   
--------------------------------------------------------
CAN        45 (46.9%)   64 (61%)   46 (50%)   62 (57.9%)
  Left                                                  
    mean      38.9        40.4       40.3        37.7   
  Right                                                 
    mean      36.6        40.2       40.2        40.6   
USA        51 (53.1%)   41 (39%)   46 (50%)   45 (42.1%)
  Left                                                  
    mean      40.4        39.7       39.2        40.1   
  Right                                                 
    mean      36.9        39.8       38.5         39

Next, we further add a count and percentage summary for handedness within each country:

l <- basic_table() %>%
  split_cols_by("arm") %>%
  split_cols_by("gender") %>%
  split_rows_by("country") %>%
  summarize_row_groups() %>%
  split_rows_by("handed") %>%
  summarize_row_groups() %>%
  analyze("age", afun = mean, format = "xx.x")

build_table(l, df)

                    Arm A                     Arm B         
             Female        Male        Female        Male   
------------------------------------------------------------
CAN        45 (46.9%)    64 (61%)     46 (50%)    62 (57.9%)
  Left     32 (33.3%)    42 (40%)    26 (28.3%)   37 (34.6%)
    mean      38.9         40.4         40.3         37.7   
  Right    13 (13.5%)    22 (21%)    20 (21.7%)   25 (23.4%)
    mean      36.6         40.2         40.2         40.6   
USA        51 (53.1%)    41 (39%)     46 (50%)    45 (42.1%)
  Left     34 (35.4%)   19 (18.1%)   25 (27.2%)   25 (23.4%)
    mean      40.4         39.7         39.2         40.1   
  Right    17 (17.7%)    22 (21%)    21 (22.8%)   20 (18.7%)
    mean      36.9         39.8         38.5          39
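Reading the first column of the table above suggests that the handedness percentages use the column total as denominator, not the count of the enclosing country. The following check uses only numbers copied from the tables in this section (column total 96, CAN count 45, Left 32, Right 13); the variable names are made up for illustration:

```r
n_col  <- 96                        # Arm A / Female column total
n_can  <- 45                        # count shown in the CAN content row
n_hand <- c(Left = 32, Right = 13)  # counts shown below CAN

# Relative to the column total: matches the 33.3% and 13.5% in the table
pct_of_col <- round(100 * n_hand / n_col, 1)
pct_of_col

# Relative to the country count: does not match the table
pct_of_can <- round(100 * n_hand / n_can, 1)
pct_of_can

# The handedness counts add up to the country count
sum(n_hand) == n_can
```

Under this reading, the Left and Right percentages within a country add up to the country percentage (up to rounding) rather than to 100%.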

Tables used in Clinical Trials

Overview

In this section we create a

demographic table
adverse event table
response table

using the rtables layout facility. That is, we demonstrate how the layout based tabulation framework can specify the structure and relations that are commonly found when analyzing clinical trials data.

Note that all the data is created using random number generators. All ex_* data sets currently attached to the rtables package were created using random.cdisc.data, another R package that we intend to release as open source soon.

The packages used in this section are:

library(rtables)
library(tibble)
library(dplyr)

Demographic Table

Demographic tables summarize the content of variables for different population subsets (encoded in the columns).

One feature of analyze that we have not introduced in the previous section is that the analysis function afun can specify multiple rows with the in_rows function:

ADSL <- ex_adsl  # Example ADSL dataset

basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(vars = "AGE", afun = function(x) {
    in_rows(
      "Mean (sd)" = rcell(c(mean(x), sd(x)), format = "xx.xx (xx.xx)"),
      "Range" = rcell(range(x), format = "xx.xx - xx.xx")
    )
  }) %>%
  build_table(ADSL)

             A: Drug X     B: Placebo    C: Combination
-------------------------------------------------------
Mean (sd)   33.77 (6.55)   35.43 (7.9)    35.43 (7.72) 
Range         21 - 50        21 - 62        20 - 69

Multiple variables can be analyzed in one analyze call:

basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(vars = c("AGE", "BMRKR1"), afun = function(x) {
    in_rows(
      "Mean (sd)" = rcell(c(mean(x), sd(x)), format = "xx.xx (xx.xx)"),
      "Range" = rcell(range(x), format = "xx.xx - xx.xx")
    )
  }) %>%
  build_table(ADSL)

               A: Drug X      B: Placebo    C: Combination
----------------------------------------------------------
AGE                                                       
  Mean (sd)   33.77 (6.55)   35.43 (7.9)     35.43 (7.72) 
  Range         21 - 50        21 - 62         20 - 69    
BMRKR1                                                    
  Mean (sd)   5.97 (3.55)     5.7 (3.31)     5.62 (3.49)  
  Range       0.41 - 17.67   0.65 - 14.24    0.17 - 21.39

Hence, if afun can process different data vector types (i.e. variables selected from the data) then we are fairly close to a standard demographic table. Here is a function that either creates a count table or some number summary if the argument x is a factor or numeric, respectively:

s_summary <- function(x) {
  if (is.numeric(x)) {
    in_rows(
      "n" = rcell(sum(!is.na(x)), format = "xx"),
      "Mean (sd)" = rcell(c(mean(x, na.rm = TRUE), sd(x, na.rm = TRUE)), format = "xx.xx (xx.xx)"),
      "IQR" = rcell(IQR(x, na.rm = TRUE), format = "xx.xx"),
      "min - max" = rcell(range(x, na.rm = TRUE), format = "xx.xx - xx.xx")
    )
  } else if (is.factor(x)) {
    
    vs <- as.list(table(x))
    do.call(in_rows, lapply(vs, rcell, format = "xx"))
    
  } else {
    stop("type not supported")
  }
}

Note that we use rcell to wrap the results in order to add formatting instructions for rtables. We can use s_summary outside the context of tabulation:

s_summary(ADSL$AGE)

in_rows object print method:
----------------------------
   row_name formatted_cell indent_mod row_label
1         n            400          0         n
2 Mean (sd)   34.88 (7.44)          0 Mean (sd)
3       IQR             10          0       IQR
4 min - max        20 - 69          0 min - max

and

s_summary(ADSL$SEX)

in_rows object print method:
----------------------------
          row_name formatted_cell indent_mod        row_label
1                F            222          0                F
2                M            166          0                M
3                U              9          0                U
4 UNDIFFERENTIATED              3          0 UNDIFFERENTIATED

We can now create a commonly used variant of the demographic table:

lyt <- basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  analyze(c("AGE", "SEX"), afun = s_summary) 

tbl <- build_table(lyt, ADSL)
tbl

                      A: Drug X     B: Placebo    C: Combination
----------------------------------------------------------------
AGE                                                             
  n                      134            134            132      
  Mean (sd)          33.77 (6.55)   35.43 (7.9)    35.43 (7.72) 
  IQR                     11            10              10      
  min - max            21 - 50        21 - 62        20 - 69    
SEX                                                             
  F                       79            77              66      
  M                       51            55              60      
  U                       3              2              4       
  UNDIFFERENTIATED        1              0              2

Note that analyze can also be called multiple times in sequence:

tbl2 <- basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  analyze("AGE", s_summary) %>%
  analyze("SEX", s_summary) %>%
  build_table(ADSL) 

tbl2

                      A: Drug X     B: Placebo    C: Combination
----------------------------------------------------------------
AGE                                                             
  n                      134            134            132      
  Mean (sd)          33.77 (6.55)   35.43 (7.9)    35.43 (7.72) 
  IQR                     11            10              10      
  min - max            21 - 50        21 - 62        20 - 69    
SEX                                                             
  F                       79            77              66      
  M                       51            55              60      
  U                       3              2              4       
  UNDIFFERENTIATED        1              0              2

which leads to the identical table as tbl:

identical(tbl, tbl2)

[1] TRUE

In clinical trials analyses the number of patients per column is often referred to as N (rather than the overall population which outside of clinical trials is commonly referred to as N). Column Ns are added using the add_colcounts function:

basic_table() %>% 
  split_cols_by(var = "ARMCD") %>%
  add_colcounts() %>%
  analyze(c("AGE", "SEX"), s_summary) %>%
  build_table(ADSL)

                        ARM A          ARM B         ARM C    
                       (N=134)        (N=134)       (N=132)   
--------------------------------------------------------------
AGE                                                           
  n                      134            134           132     
  Mean (sd)          33.77 (6.55)   35.43 (7.9)   35.43 (7.72)
  IQR                     11            10             10     
  min - max            21 - 50        21 - 62       20 - 69   
SEX                                                           
  F                       79            77             66     
  M                       51            55             60     
  U                       3              2             4      
  UNDIFFERENTIATED        1              0             2
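The column N and the n row of s_summary measure different things: N is taken here to be the number of rows that fall into a column (an assumption consistent with the output above), while n counts the non-missing values of the analyzed variable. They coincide in the table above only because AGE has no missing values. A base R sketch on made-up data (toy, N and n are hypothetical names) shows how they can differ:

```r
# Hypothetical toy data with missing AGE values, only for illustration
toy <- data.frame(
  ARMCD = factor(c("ARM A", "ARM A", "ARM B", "ARM B", "ARM B")),
  AGE   = c(31, NA, 45, 38, NA)
)

# Rows per column, comparable to the (N=...) header
N <- table(toy$ARMCD)
N

# Non-missing AGE values per column, comparable to the n row of s_summary
n <- tapply(!is.na(toy$AGE), toy$ARMCD, sum)
n
```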

Variations on the demographic table

We will now show a couple of variations of the demographic table that we developed above. These variations are in structure and not in analysis, hence they don’t require a modification to the s_summary function.

We will start with a standard table analyzing the variables AGE and BMRKR2:

basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  add_colcounts() %>%
  analyze(c("AGE", "BMRKR2"), s_summary) %>%
  build_table(ADSL)

               A: Drug X     B: Placebo    C: Combination
                (N=134)        (N=134)        (N=132)    
---------------------------------------------------------
AGE                                                      
  n               134            134            132      
  Mean (sd)   33.77 (6.55)   35.43 (7.9)    35.43 (7.72) 
  IQR              11            10              10      
  min - max     21 - 50        21 - 62        20 - 69    
BMRKR2                                                   
  LOW              50            45              40      
  MEDIUM           37            56              42      
  HIGH             47            33              50

Assume we would like to have this analysis carried out per gender encoded in the row space:

basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  add_colcounts() %>%
  split_rows_by("SEX") %>%
  analyze(c("AGE", "BMRKR2"), s_summary) %>%
  build_table(ADSL)

                    A: Drug X      B: Placebo    C: Combination
                     (N=134)        (N=134)         (N=132)    
---------------------------------------------------------------
F                                                              
  AGE                                                          
    n                   79             77              66      
    Mean (sd)      32.76 (6.09)   34.12 (7.06)    35.2 (7.43)  
    IQR                 9              8              6.75     
    min - max        21 - 47        23 - 58         21 - 64    
  BMRKR2                                                       
    LOW                 26             21              26      
    MEDIUM              21             38              17      
    HIGH                32             18              23      
M                                                              
  AGE                                                          
    n                   51             55              60      
    Mean (sd)      35.57 (7.08)   37.44 (8.69)    35.38 (8.24) 
    IQR                 11             9               11      
    min - max        23 - 50        21 - 62         20 - 69    
  BMRKR2                                                       
    LOW                 21             23              11      
    MEDIUM              15             18              23      
    HIGH                15             14              26      
U                                                              
  AGE                                                          
    n                   3              2               4       
    Mean (sd)      31.67 (3.21)    31 (5.66)      35.25 (3.1)  
    IQR                 3              4              3.25     
    min - max        28 - 34        27 - 35         31 - 38    
  BMRKR2                                                       
    LOW                 2              1               1       
    MEDIUM              1              0               2       
    HIGH                0              1               1       
UNDIFFERENTIATED                                               
  AGE                                                          
    n                   1              0               2       
    Mean (sd)        28 (NA)        NaN (NA)       45 (1.41)   
    IQR                 0              NA              1       
    min - max        28 - 28       Inf - -Inf       44 - 46    
  BMRKR2                                                       
    LOW                 1              0               2       
    MEDIUM              0              0               0       
    HIGH                0              0               0
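The NaN, NA and Inf - -Inf entries in the B: Placebo column of the UNDIFFERENTIATED group arise because the summary statistics are computed on a zero-length vector. The following base R sketch reproduces these values. The helper safe_range is hypothetical and not part of rtables; it shows one possible way to return NA instead of infinite values.

```r
x <- numeric(0)  # what the analysis function receives for an empty cell

mean(x, na.rm = TRUE)                     # NaN
sd(x, na.rm = TRUE)                       # NA
IQR(x, na.rm = TRUE)                      # NA
suppressWarnings(range(x, na.rm = TRUE))  # Inf -Inf, normally with warnings

# Hypothetical helper (not part of rtables): NA instead of Inf / -Inf
safe_range <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) c(NA_real_, NA_real_) else range(x)
}

safe_range(x)
safe_range(c(3, 1, NA))
```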

We will now subset ADSL to include only males and females in the analysis in order to reduce the number of rows in the table:

ADSL_M_F <- filter(ADSL, SEX %in% c("M", "F"))

basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  add_colcounts() %>%
  split_rows_by("SEX") %>%
  analyze(c("AGE", "BMRKR2"), s_summary) %>%
  build_table(ADSL_M_F)

                    A: Drug X      B: Placebo    C: Combination
                     (N=130)        (N=132)         (N=126)    
---------------------------------------------------------------
F                                                              
  AGE                                                          
    n                   79             77              66      
    Mean (sd)      32.76 (6.09)   34.12 (7.06)    35.2 (7.43)  
    IQR                 9              8              6.75     
    min - max        21 - 47        23 - 58         21 - 64    
  BMRKR2                                                       
    LOW                 26             21              26      
    MEDIUM              21             38              17      
    HIGH                32             18              23      
M                                                              
  AGE                                                          
    n                   51             55              60      
    Mean (sd)      35.57 (7.08)   37.44 (8.69)    35.38 (8.24) 
    IQR                 11             9               11      
    min - max        23 - 50        21 - 62         20 - 69    
  BMRKR2                                                       
    LOW                 21             23              11      
    MEDIUM              15             18              23      
    HIGH                15             14              26      
U                                                              
  AGE                                                          
    n                   0              0               0       
    Mean (sd)        NaN (NA)       NaN (NA)        NaN (NA)   
    IQR                 NA             NA              NA      
    min - max       Inf - -Inf     Inf - -Inf      Inf - -Inf  
  BMRKR2                                                       
    LOW                 0              0               0       
    MEDIUM              0              0               0       
    HIGH                0              0               0       
UNDIFFERENTIATED                                               
  AGE                                                          
    n                   0              0               0       
    Mean (sd)        NaN (NA)       NaN (NA)        NaN (NA)   
    IQR                 NA             NA              NA      
    min - max       Inf - -Inf     Inf - -Inf      Inf - -Inf  
  BMRKR2                                                       
    LOW                 0              0               0       
    MEDIUM              0              0               0       
    HIGH                0              0               0

Note that the UNDIFFERENTIATED and U levels still show up in the table. This is because tabulation respects the factor levels and level order, exactly as the split and table functions do. If empty levels should be dropped, rtables needs to know that at splitting time via the split_fun argument in split_rows_by. There are a number of predefined split functions. For this example drop_split_levels is required to drop the empty levels at splitting time. Splitting is a big topic and will eventually be addressed in a dedicated package vignette.

basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  add_colcounts() %>%
  split_rows_by("SEX", split_fun = drop_split_levels, child_labels = "visible") %>%
  analyze(c("AGE", "BMRKR2"), s_summary) %>%
  build_table(ADSL_M_F)

                 A: Drug X      B: Placebo    C: Combination
                  (N=130)        (N=132)         (N=126)    
------------------------------------------------------------
F                                                           
  AGE                                                       
    n                79             77              66      
    Mean (sd)   32.76 (6.09)   34.12 (7.06)    35.2 (7.43)  
    IQR              9              8              6.75     
    min - max     21 - 47        23 - 58         21 - 64    
  BMRKR2                                                    
    LOW              26             21              26      
    MEDIUM           21             38              17      
    HIGH             32             18              23      
M                                                           
  AGE                                                       
    n                51             55              60      
    Mean (sd)   35.57 (7.08)   37.44 (8.69)    35.38 (8.24) 
    IQR              11             9               11      
    min - max     23 - 50        21 - 62         20 - 69    
  BMRKR2                                                    
    LOW              21             23              11      
    MEDIUM           15             18              23      
    HIGH             15             14              26
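The way base R's table and split keep empty factor levels can be seen on a small made-up factor (sex is a hypothetical toy vector, not the ADSL variable). In this sketch droplevels plays a role comparable to drop_split_levels, although the two are different functions:

```r
# Hypothetical toy factor with an unused level "U"
sex <- factor(c("F", "M", "F"), levels = c("F", "M", "U"))

# table() reports the empty level with a count of 0
table(sex)

# split() also creates an (empty) group for the unused level
names(split(seq_along(sex), sex))

# After dropping unused levels, only F and M remain
table(droplevels(sex))
```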

In the table above the labels M and F are not very descriptive. You can add the full label as follows:

ADSL_M_F_l <- ADSL_M_F %>% 
  mutate(lbl_sex = case_when(
    SEX == "M" ~ "Male",
    SEX == "F" ~ "Female",
    SEX == "U" ~ "Unknown",
    SEX == "UNDIFFERENTIATED" ~ "Undifferentiated"
  ))

basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  add_colcounts() %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "visible") %>%
  analyze(c("AGE", "BMRKR2"), s_summary) %>%
  build_table(ADSL_M_F_l)

                 A: Drug X      B: Placebo    C: Combination
                  (N=130)        (N=132)         (N=126)    
------------------------------------------------------------
Female                                                      
  AGE                                                       
    n                79             77              66      
    Mean (sd)   32.76 (6.09)   34.12 (7.06)    35.2 (7.43)  
    IQR              9              8              6.75     
    min - max     21 - 47        23 - 58         21 - 64    
  BMRKR2                                                    
    LOW              26             21              26      
    MEDIUM           21             38              17      
    HIGH             32             18              23      
Male                                                        
  AGE                                                       
    n                51             55              60      
    Mean (sd)   35.57 (7.08)   37.44 (8.69)    35.38 (8.24) 
    IQR              11             9               11      
    min - max     23 - 50        21 - 62         20 - 69    
  BMRKR2                                                    
    LOW              21             23              11      
    MEDIUM           15             18              23      
    HIGH             15             14              26

For the next table variation we stratify by sex only for the AGE analysis. To do this, the nested argument has to be set to FALSE in the analyze call for BMRKR2:

basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  add_colcounts() %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "visible") %>%
  analyze("AGE", s_summary, show_labels = "visible") %>%
  analyze("BMRKR2", s_summary, nested = FALSE,  show_labels = "visible") %>%
  build_table(ADSL_M_F_l)

                 A: Drug X      B: Placebo    C: Combination
                  (N=130)        (N=132)         (N=126)    
------------------------------------------------------------
Female                                                      
  AGE                                                       
    n                79             77              66      
    Mean (sd)   32.76 (6.09)   34.12 (7.06)    35.2 (7.43)  
    IQR              9              8              6.75     
    min - max     21 - 47        23 - 58         21 - 64    
Male                                                        
  AGE                                                       
    n                51             55              60      
    Mean (sd)   35.57 (7.08)   37.44 (8.69)    35.38 (8.24) 
    IQR              11             9               11      
    min - max     23 - 50        21 - 62         20 - 69    
BMRKR2                                                      
  LOW                47             44              37      
  MEDIUM             36             56              40      
  HIGH               47             32              49

Once we split the rows into groups (Male and Female here), one might want to summarize the groups, usually by showing counts and column percentages. This is especially important if we have missing data. For example, suppose we create the above table but add missing data to the AGE variable:

insert_NAs <- function(x) {
  x[sample(c(TRUE, FALSE), length(x), TRUE, prob = c(0.2, 0.8))] <- NA
  x
}

set.seed(1)
ADSL_NA <- ADSL_M_F_l %>% 
  mutate(AGE = insert_NAs(AGE))

basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  add_colcounts() %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "visible") %>%
  analyze("AGE", s_summary) %>%
  analyze("BMRKR2", s_summary, nested = FALSE,  show_labels = "visible") %>%
  build_table(filter(ADSL_NA, SEX %in% c("M", "F")))

               A: Drug X      B: Placebo    C: Combination
                (N=130)        (N=132)         (N=126)    
----------------------------------------------------------
Female                                                    
  n                65             61              54      
  Mean (sd)   32.71 (6.07)   34.33 (7.31)    34.61 (6.78) 
  IQR              9              10             6.75     
  min - max     21 - 47        23 - 58         21 - 54    
Male                                                      
  n                44             44              50      
  Mean (sd)   35.66 (6.78)   36.93 (8.18)    35.64 (8.42) 
  IQR             10.5           8.25           10.75     
  min - max     24 - 48        21 - 58         20 - 69    
BMRKR2                                                    
  LOW              47             44              37      
  MEDIUM           36             56              40      
  HIGH             47             32              49

Here it is not easy to see how many females and males there are in each arm as n represents the number of non-missing data elements in the variables. Groups within rows that are defined by splitting can be summarized with summarize_row_groups, for example:

basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  add_colcounts() %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels) %>%
  summarize_row_groups()  %>% 
  analyze("AGE", s_summary) %>%
  analyze("BMRKR2", afun = s_summary, nested = FALSE,  show_labels = "visible") %>%
  build_table(filter(ADSL_NA, SEX %in% c("M", "F")))

               A: Drug X      B: Placebo    C: Combination
                (N=130)        (N=132)         (N=126)    
----------------------------------------------------------
Female         79 (60.8%)     77 (58.3%)      66 (52.4%)  
  n                65             61              54      
  Mean (sd)   32.71 (6.07)   34.33 (7.31)    34.61 (6.78) 
  IQR              9              10             6.75     
  min - max     21 - 47        23 - 58         21 - 54    
Male           51 (39.2%)     55 (41.7%)      60 (47.6%)  
  n                44             44              50      
  Mean (sd)   35.66 (6.78)   36.93 (8.18)    35.64 (8.42) 
  IQR             10.5           8.25           10.75     
  min - max     24 - 48        21 - 58         20 - 69    
BMRKR2                                                    
  LOW              47             44              37      
  MEDIUM           36             56              40      
  HIGH             47             32              49

There are a couple of things to note here.

Group summaries produce “content” rows. Visually it’s impossible to distinguish data rows from content rows. The distinction still matters (and it’s an important design decision) because when we paginate tables the content rows are by default repeated if a group gets divided via pagination.
Conceptually the content rows summarize the patient population which is analyzed and hence often consist of the group count and percentage (the default behavior of summarize_row_groups).
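As a quick arithmetic check of the content rows shown above, the count and percentage for the first arm can be reproduced in base R. This is a minimal sketch using the numbers printed in the table (79 females and 51 males out of N=130); the variable names are hypothetical and it does not call rtables itself:

```r
# Group sizes in arm "A: Drug X", read off the table above
n_group <- c(Female = 79L, Male = 51L)
N_col <- 130L  # column count shown as (N=130)

# Count and percentage of the column total, one decimal place
content <- sprintf("%d (%.1f%%)", n_group, 100 * n_group / N_col)
names(content) <- names(n_group)
content
```

Note that the group counts include patients whose AGE is missing, which is why they differ from the n rows beneath them.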

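We can recreate this default behavior (count and percentage) by defining a cfun. This is for illustrative purposes only: apart from the two-decimal percentage format (and BEP01FL replacing BMRKR2 in the last analysis), it results in the same table as above: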

basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  add_colcounts() %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels) %>%
  summarize_row_groups(cfun = function(df, labelstr, .N_col, ...) {
    in_rows(
      rcell(nrow(df) * c(1, 1/.N_col), format = "xx (xx.xx%)"),
      .labels = labelstr
    )
  })  %>% 
  analyze("AGE", s_summary) %>%
  analyze("BEP01FL", afun = s_summary, nested = FALSE,  show_labels = "visible") %>%
  build_table(filter(ADSL_NA, SEX %in% c("M", "F")))

               A: Drug X      B: Placebo    C: Combination
                (N=130)        (N=132)         (N=126)    
----------------------------------------------------------
Female        79 (60.77%)    77 (58.33%)     66 (52.38%)  
  n                65             61              54      
  Mean (sd)   32.71 (6.07)   34.33 (7.31)    34.61 (6.78) 
  IQR              9              10             6.75     
  min - max     21 - 47        23 - 58         21 - 54    
Male          51 (39.23%)    55 (41.67%)     60 (47.62%)  
  n                44             44              50      
  Mean (sd)   35.66 (6.78)   36.93 (8.18)    35.64 (8.42) 
  IQR             10.5           8.25           10.75     
  min - max     24 - 48        21 - 58         20 - 69    
BEP01FL                                                   
  Y                67             63              65      
  N                63             69              61

Note that cfun differs from afun (which is used in analyze) in that cfun does not operate on variables but rather on data.frames or tibbles, which are passed via the df argument (afun can optionally request df too). Further, cfun receives the default group label (the factor level from splitting) via the labelstr argument, so the label can be modified:

basic_table() %>% 
  split_cols_by(var = "ARM") %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "hidden") %>%
  summarize_row_groups(cfun = function(df, labelstr, .N_col, ...) {
    in_rows(
       rcell(nrow(df) * c(1, 1/.N_col), format = "xx (xx.xx%)"),
       .labels = paste0(labelstr, ": count (perc.)")
    )
  })  %>% 
  analyze("AGE", s_summary) %>%
  analyze("BEP01FL", s_summary, nested = FALSE, show_labels = "visible") %>%
  build_table(filter(ADSL_NA, SEX %in% c("M", "F")))

                         A: Drug X      B: Placebo    C: Combination
--------------------------------------------------------------------
Female: count (perc.)   79 (60.77%)    77 (58.33%)     66 (52.38%)  
  n                          65             61              54      
  Mean (sd)             32.71 (6.07)   34.33 (7.31)    34.61 (6.78) 
  IQR                        9              10             6.75     
  min - max               21 - 47        23 - 58         21 - 54    
Male: count (perc.)     51 (39.23%)    55 (41.67%)     60 (47.62%)  
  n                          44             44              50      
  Mean (sd)             35.66 (6.78)   36.93 (8.18)    35.64 (8.42) 
  IQR                       10.5           8.25           10.75     
  min - max               24 - 48        21 - 58         20 - 69    
BEP01FL                                                             
  Y                          67             63              65      
  N                          63             69              61

Using Layouts

Layouts have a couple of advantages over tabulating the tables directly:

the creation of layouts requires the analyst to describe the problem in an abstract way, i.e. layouts separate the description of the analysis from the actual data
referencing variable names happens via strings (no non-standard evaluation (NSE) is needed, though this is arguably either a feature or a shortcoming)
layouts can be reused

Here is an example that demonstrates the reusability of layouts:

lyt <- NULL %>% 
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  analyze(c("AGE", "SEX"), afun = s_summary)

lyt

A Pre-data Table Layout

Column-Split Structure:
ARM (lvls) 

Row-Split Structure:
( (** multivar analysis **) -> AGE, SEX (** multivar analysis **) -> ) (** multivar analysis **)

We can now build a table for ADSL:

build_table(lyt, ADSL)

                      A: Drug X     B: Placebo    C: Combination
                       (N=134)        (N=134)        (N=132)    
----------------------------------------------------------------
AGE                                                             
  n                      134            134            132      
  Mean (sd)          33.77 (6.55)   35.43 (7.9)    35.43 (7.72) 
  IQR                     11            10              10      
  min - max            21 - 50        21 - 62        20 - 69    
SEX                                                             
  F                       79            77              66      
  M                       51            55              60      
  U                       3              2              4       
  UNDIFFERENTIATED        1              0              2

or for all patients that are older than 18 (the minimum age in ADSL is 20, so the result is unchanged here):

build_table(lyt, ADSL %>% filter(AGE > 18))

                      A: Drug X     B: Placebo    C: Combination
                       (N=134)        (N=134)        (N=132)    
----------------------------------------------------------------
AGE                                                             
  n                      134            134            132      
  Mean (sd)          33.77 (6.55)   35.43 (7.9)    35.43 (7.72) 
  IQR                     11            10              10      
  min - max            21 - 50        21 - 62        20 - 69    
SEX                                                             
  F                       79            77              66      
  M                       51            55              60      
  U                       3              2              4       
  UNDIFFERENTIATED        1              0              2

Adverse Events

There are a number of different adverse event tables. We will now present two of them: adverse events by ID, and adverse events by grade and by ID.

This time we won’t use the ADAE dataset from random.cdisc.data but rather generate a dataset on the fly (see Adrian’s 2016 Phuse paper):

set.seed(1)

lookup <- tribble(
  ~AEDECOD,                          ~AEBODSYS,                                         ~AETOXGR,
  'HEADACHE',                        "NERVOUS SYSTEM DISORDERS",                        "5",
  'BACK PAIN',                       "MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS", "2",
  'GINGIVAL BLEEDING',               "GASTROINTESTINAL DISORDERS",                      "1",
  'HYPOTENSION',                     "VASCULAR DISORDERS",                              "3",
  'FAECES SOFT',                     "GASTROINTESTINAL DISORDERS",                      "2",
  'ABDOMINAL DISCOMFORT',            "GASTROINTESTINAL DISORDERS",                      "1",
  'DIARRHEA',                        "GASTROINTESTINAL DISORDERS",                      "1",
  'ABDOMINAL FULLNESS DUE TO GAS',   "GASTROINTESTINAL DISORDERS",                      "1",
  'NAUSEA (INTERMITTENT)',           "GASTROINTESTINAL DISORDERS",                      "2",
  'WEAKNESS',                        "MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS", "3",
  'ORTHOSTATIC HYPOTENSION',         "VASCULAR DISORDERS",                              "4"
)

normalize <- function(x) x/sum(x)
weightsA <- normalize(c(0.1, dlnorm(seq(0, 5, length.out = 25), meanlog = 3)))
weightsB <- normalize(c(0.2, dlnorm(seq(0, 5, length.out = 25))))

N_pop <- 300
ADSL2 <- data.frame(
  USUBJID = seq(1, N_pop, by = 1),
  ARM = sample(c('ARM A', 'ARM B'), N_pop, TRUE),
  SEX = sample(c('F', 'M'), N_pop, TRUE),
  AGE = 20 + rbinom(N_pop, size=40, prob=0.7)
)
                                      
l.adae <- mapply(ADSL2$USUBJID, ADSL2$ARM, ADSL2$SEX, ADSL2$AGE, FUN = function(id, arm, sex, age) {
  n_ae <- sample(0:25, 1, prob = if (arm == "ARM A") weightsA else weightsB)
  i <- sample(1:nrow(lookup), size = n_ae, replace = TRUE, prob = c(6, rep(1, 10))/16)
  lookup[i, ] %>% 
    mutate(
      AESEQ = seq_len(n()),
      USUBJID = id, ARM = arm, SEX = sex, AGE = age
    )
}, SIMPLIFY = FALSE)

ADAE2 <- do.call(rbind, l.adae)
ADAE2 <- ADAE2 %>% 
  mutate(
    ARM = factor(ARM, levels = c("ARM A", "ARM B")),
    AEDECOD = as.factor(AEDECOD),
    AEBODSYS = as.factor(AEBODSYS), 
    AETOXGR = factor(AETOXGR, levels = as.character(1:5))
  ) %>% 
  select(USUBJID, ARM, AGE, SEX, AESEQ, AEDECOD, AEBODSYS, AETOXGR)
  
ADAE2

# A tibble: 3,118 x 8
   USUBJID ARM     AGE SEX   AESEQ AEDECOD           AEBODSYS            AETOXGR
     <dbl> <fct> <dbl> <chr> <int> <fct>             <fct>               <fct>  
 1       1 ARM A    45 F         1 NAUSEA (INTERMIT… GASTROINTESTINAL D… 2      
 2       1 ARM A    45 F         2 HEADACHE          NERVOUS SYSTEM DIS… 5      
 3       1 ARM A    45 F         3 HEADACHE          NERVOUS SYSTEM DIS… 5      
 4       1 ARM A    45 F         4 HEADACHE          NERVOUS SYSTEM DIS… 5      
 5       1 ARM A    45 F         5 HEADACHE          NERVOUS SYSTEM DIS… 5      
 6       1 ARM A    45 F         6 HEADACHE          NERVOUS SYSTEM DIS… 5      
 7       1 ARM A    45 F         7 HEADACHE          NERVOUS SYSTEM DIS… 5      
 8       1 ARM A    45 F         8 HEADACHE          NERVOUS SYSTEM DIS… 5      
 9       1 ARM A    45 F         9 HEADACHE          NERVOUS SYSTEM DIS… 5      
10       1 ARM A    45 F        10 FAECES SOFT       GASTROINTESTINAL D… 2      
# … with 3,108 more rows

Adverse Events By ID

We start by defining an events summary function:

s_events_patients <- function(x, labelstr, .N_col) {
  in_rows(
    "Total number of patients with at least one event" = 
      rcell(length(unique(x)) * c(1, 1/.N_col), format = "xx (xx.xx%)"),
    
    "Total number of events" = rcell(length(x), format = "xx")
  )
}

So, for a population of 5 patients where

one patient has 2 AEs
one patient has 1 AE
three patients have no AEs

we would get the following summary:

s_events_patients(x = c("id 1", "id 1", "id 2"), .N_col = 5)

in_rows object print method:
----------------------------
                                          row_name formatted_cell indent_mod
1 Total number of patients with at least one event        2 (40%)          0
2                           Total number of events              3          0
                                         row_label
1 Total number of patients with at least one event
2                           Total number of events

The .N_col argument is a special keyword argument through which build_table passes the population size of the respective column. For a list of the keyword arguments available to functions passed as afun to analyze, refer to the documentation with ?analyze.
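The expression `length(unique(x)) * c(1, 1/.N_col)` used in s_events_patients produces two values at once, a count and a fraction, which the "xx (xx.xx%)" format then renders as one cell. The base R sketch below spells this out for the five-patient example; `N_col` is an ordinary variable standing in for the .N_col value that build_table would supply:

```r
x <- c("id 1", "id 1", "id 2")  # one entry per adverse event
N_col <- 5                       # patients in the column (assumed here)

n_patients <- length(unique(x))       # patients with at least one event
vals <- n_patients * c(1, 1 / N_col)  # c(count, fraction of column)
n_events <- length(x)                 # total number of events

vals
n_events
```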

We now use the s_events_patients summary function in a tabulation:

basic_table() %>% 
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  analyze("USUBJID", s_events_patients) %>%
  build_table(ADAE2)

                                                      ARM A         ARM B    
                                                    (N=2060)       (N=1058)  
-----------------------------------------------------------------------------
Total number of patients with at least one event   114 (5.53%)   150 (14.18%)
Total number of events                                2060           1058

Note that the column N’s are wrong as by default they are set to the number of rows per group (i.e. number of AEs per arm here). This also affects the percentages. For this table we are interested in the number of patients per column/arm which is usually taken from ADSL (variable ADSL2 here):

N_per_arm <- table(ADSL2$ARM)
N_per_arm


ARM A ARM B 
  146   154

Since this information is not “pre-data” it needs to go to the table creation function build_table:

basic_table() %>% 
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  analyze("USUBJID", s_events_patients) %>%
  build_table(ADAE2, col_counts = N_per_arm)

                                                      ARM A          ARM B   
                                                     (N=146)        (N=154)  
-----------------------------------------------------------------------------
Total number of patients with at least one event   114 (78.08%)   150 (97.4%)
Total number of events                                 2060          1058
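The effect of the denominator can be checked by hand. The following base R sketch recomputes both sets of percentages from the numbers printed in the two tables above; the variable names are hypothetical:

```r
# Values read off the tables above
n_patients <- c("ARM A" = 114, "ARM B" = 150)   # patients with >= 1 event
n_events   <- c("ARM A" = 2060, "ARM B" = 1058) # rows of ADAE2 per arm
n_arm      <- c("ARM A" = 146, "ARM B" = 154)   # patients per arm (ADSL2)

# Default column counts: the number of AE records is the denominator
pct_events_denom <- round(100 * n_patients / n_events, 2)

# Supplying col_counts: the number of patients is the denominator
pct_patient_denom <- round(100 * n_patients / n_arm, 2)

pct_events_denom
pct_patient_denom
```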

We next calculate this information per system organ class:

l <- basic_table() %>% 
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  analyze("USUBJID", s_events_patients) %>%
  split_rows_by("AEBODSYS", child_labels = "visible", nested = FALSE)  %>%
  summarize_row_groups("USUBJID", cfun = s_events_patients)

build_table(l, ADAE2, col_counts = N_per_arm)

                                                        ARM A          ARM B    
                                                       (N=146)        (N=154)   
--------------------------------------------------------------------------------
Total number of patients with at least one event     114 (78.08%)   150 (97.4%) 
Total number of events                                   2060           1058    
GASTROINTESTINAL DISORDERS                                                      
  Total number of patients with at least one event   114 (78.08%)   130 (84.42%)
  Total number of events                                 760            374     
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                                 
  Total number of patients with at least one event   98 (67.12%)     81 (52.6%) 
  Total number of events                                 273            142     
NERVOUS SYSTEM DISORDERS                                                        
  Total number of patients with at least one event   113 (77.4%)    133 (86.36%)
  Total number of events                                 787            420     
VASCULAR DISORDERS                                                              
  Total number of patients with at least one event    93 (63.7%)     75 (48.7%) 
  Total number of events                                 240            122

We now have to add a count table of AEDECOD for each AEBODSYS. The default analyze behavior for a factor is to create the count table per level (using rtab_inner):

tbl1 <- basic_table() %>% 
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  split_rows_by("AEBODSYS", child_labels = "visible", indent_mod = 1)  %>%
  summarize_row_groups("USUBJID", cfun = s_events_patients) %>%
  analyze("AEDECOD", indent_mod = -1) %>%
  build_table(ADAE2, col_counts = N_per_arm)

tbl1

                                                        ARM A          ARM B    
                                                       (N=146)        (N=154)   
--------------------------------------------------------------------------------
GASTROINTESTINAL DISORDERS                                                      
  Total number of patients with at least one event   114 (78.08%)   130 (84.42%)
  Total number of events                                 760            374     
  ABDOMINAL DISCOMFORT                                   113             65     
  ABDOMINAL FULLNESS DUE TO GAS                          119             65     
  BACK PAIN                                               0              0      
  DIARRHEA                                               107             53     
  FAECES SOFT                                            122             58     
  GINGIVAL BLEEDING                                      147             71     
  HEADACHE                                                0              0      
  HYPOTENSION                                             0              0      
  NAUSEA (INTERMITTENT)                                  152             62     
  ORTHOSTATIC HYPOTENSION                                 0              0      
  WEAKNESS                                                0              0      
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                                 
  Total number of patients with at least one event   98 (67.12%)     81 (52.6%) 
  Total number of events                                 273            142     
  ABDOMINAL DISCOMFORT                                    0              0      
  ABDOMINAL FULLNESS DUE TO GAS                           0              0      
  BACK PAIN                                              135             75     
  DIARRHEA                                                0              0      
  FAECES SOFT                                             0              0      
  GINGIVAL BLEEDING                                       0              0      
  HEADACHE                                                0              0      
  HYPOTENSION                                             0              0      
  NAUSEA (INTERMITTENT)                                   0              0      
  ORTHOSTATIC HYPOTENSION                                 0              0      
  WEAKNESS                                               138             67     
NERVOUS SYSTEM DISORDERS                                                        
  Total number of patients with at least one event   113 (77.4%)    133 (86.36%)
  Total number of events                                 787            420     
  ABDOMINAL DISCOMFORT                                    0              0      
  ABDOMINAL FULLNESS DUE TO GAS                           0              0      
  BACK PAIN                                               0              0      
  DIARRHEA                                                0              0      
  FAECES SOFT                                             0              0      
  GINGIVAL BLEEDING                                       0              0      
  HEADACHE                                               787            420     
  HYPOTENSION                                             0              0      
  NAUSEA (INTERMITTENT)                                   0              0      
  ORTHOSTATIC HYPOTENSION                                 0              0      
  WEAKNESS                                                0              0      
VASCULAR DISORDERS                                                              
  Total number of patients with at least one event    93 (63.7%)     75 (48.7%) 
  Total number of events                                 240            122     
  ABDOMINAL DISCOMFORT                                    0              0      
  ABDOMINAL FULLNESS DUE TO GAS                           0              0      
  BACK PAIN                                               0              0      
  DIARRHEA                                                0              0      
  FAECES SOFT                                             0              0      
  GINGIVAL BLEEDING                                       0              0      
  HEADACHE                                                0              0      
  HYPOTENSION                                            104             58     
  NAUSEA (INTERMITTENT)                                   0              0      
  ORTHOSTATIC HYPOTENSION                                136             64     
  WEAKNESS                                                0              0

The indent_mod argument enables relative indenting changes if the tree structure of the table does not result in the desired indentation by default.

This table, however, is not yet the usual adverse event table, as it counts the total number of events and not the number of subjects with one or more events for a particular term. To get the correct table we need to write a custom analysis function:

table_count_once_per_id <- function(df, termvar = "AEDECOD", idvar = "USUBJID") {

  x <- df[[termvar]]
  id <- df[[idvar]]
 
  counts <- table(x[!duplicated(id)])
  
  in_rows(
    .list = as.vector(counts),
    .labels = names(counts)
  )
}

table_count_once_per_id(ADAE2)

in_rows object print method:
----------------------------
                        row_name formatted_cell indent_mod
1           ABDOMINAL DISCOMFORT             23          0
2  ABDOMINAL FULLNESS DUE TO GAS             21          0
3                      BACK PAIN             20          0
4                       DIARRHEA              7          0
5                    FAECES SOFT             11          0
6              GINGIVAL BLEEDING             15          0
7                       HEADACHE            100          0
8                    HYPOTENSION             16          0
9          NAUSEA (INTERMITTENT)             21          0
10       ORTHOSTATIC HYPOTENSION             14          0
11                      WEAKNESS             16          0
                       row_label
1           ABDOMINAL DISCOMFORT
2  ABDOMINAL FULLNESS DUE TO GAS
3                      BACK PAIN
4                       DIARRHEA
5                    FAECES SOFT
6              GINGIVAL BLEEDING
7                       HEADACHE
8                    HYPOTENSION
9          NAUSEA (INTERMITTENT)
10       ORTHOSTATIC HYPOTENSION
11                      WEAKNESS
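It is worth looking at what the `!duplicated(id)` step does on a small example. It keeps the first record of each subject, so a subject with several different terms contributes only to the term that appears first. The toy data below is made up for illustration, and the second computation is a hypothetical variant (not the vignette's function) that de-duplicates on the (subject, term) pair instead:

```r
# Toy data: subject 1 has HEADACHE twice and NAUSEA once
toy <- data.frame(
  USUBJID = c(1, 1, 1, 2, 2, 3),
  AEDECOD = factor(c("HEADACHE", "HEADACHE", "NAUSEA",
                     "NAUSEA", "NAUSEA", "HEADACHE"))
)

# As in table_count_once_per_id: only the first record per subject is kept
first_per_id <- table(toy$AEDECOD[!duplicated(toy$USUBJID)])

# Hypothetical variant: each subject is counted once per distinct term
keep <- !duplicated(toy[c("USUBJID", "AEDECOD")])
once_per_term <- table(toy$AEDECOD[keep])

first_per_id
once_per_term
```

In the toy data, subject 1 is counted under NAUSEA only in the second version. Which behavior is wanted depends on the table specification.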

So the desired AE table is:

basic_table() %>%
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  split_rows_by("AEBODSYS", child_labels = "visible", indent_mod = 1)  %>%
  summarize_row_groups("USUBJID", cfun = s_events_patients) %>%
  analyze("AEDECOD", afun = table_count_once_per_id, show_labels = "hidden", indent_mod = -1) %>%
  build_table(ADAE2, col_counts = N_per_arm)

                                                        ARM A          ARM B    
                                                       (N=146)        (N=154)   
--------------------------------------------------------------------------------
GASTROINTESTINAL DISORDERS                                                      
  Total number of patients with at least one event   114 (78.08%)   130 (84.42%)
  Total number of events                                 760            374     
  ABDOMINAL DISCOMFORT                                    24             28     
  ABDOMINAL FULLNESS DUE TO GAS                           18             26     
  BACK PAIN                                               0              0      
  DIARRHEA                                                17             17     
  FAECES SOFT                                             17             14     
  GINGIVAL BLEEDING                                       18             25     
  HEADACHE                                                0              0      
  HYPOTENSION                                             0              0      
  NAUSEA (INTERMITTENT)                                   20             20     
  ORTHOSTATIC HYPOTENSION                                 0              0      
  WEAKNESS                                                0              0      
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                                 
  Total number of patients with at least one event   98 (67.12%)     81 (52.6%) 
  Total number of events                                 273            142     
  ABDOMINAL DISCOMFORT                                    0              0      
  ABDOMINAL FULLNESS DUE TO GAS                           0              0      
  BACK PAIN                                               58             45     
  DIARRHEA                                                0              0      
  FAECES SOFT                                             0              0      
  GINGIVAL BLEEDING                                       0              0      
  HEADACHE                                                0              0      
  HYPOTENSION                                             0              0      
  NAUSEA (INTERMITTENT)                                   0              0      
  ORTHOSTATIC HYPOTENSION                                 0              0      
  WEAKNESS                                                40             36     
NERVOUS SYSTEM DISORDERS                                                        
  Total number of patients with at least one event   113 (77.4%)    133 (86.36%)
  Total number of events                                 787            420     
  ABDOMINAL DISCOMFORT                                    0              0      
  ABDOMINAL FULLNESS DUE TO GAS                           0              0      
  BACK PAIN                                               0              0      
  DIARRHEA                                                0              0      
  FAECES SOFT                                             0              0      
  GINGIVAL BLEEDING                                       0              0      
  HEADACHE                                               113            133     
  HYPOTENSION                                             0              0      
  NAUSEA (INTERMITTENT)                                   0              0      
  ORTHOSTATIC HYPOTENSION                                 0              0      
  WEAKNESS                                                0              0      
VASCULAR DISORDERS                                                              
  Total number of patients with at least one event    93 (63.7%)     75 (48.7%) 
  Total number of events                                 240            122     
  ABDOMINAL DISCOMFORT                                    0              0      
  ABDOMINAL FULLNESS DUE TO GAS                           0              0      
  BACK PAIN                                               0              0      
  DIARRHEA                                                0              0      
  FAECES SOFT                                             0              0      
  GINGIVAL BLEEDING                                       0              0      
  HEADACHE                                                0              0      
  HYPOTENSION                                             44             31     
  NAUSEA (INTERMITTENT)                                   0              0      
  ORTHOSTATIC HYPOTENSION                                 49             44     
  WEAKNESS                                                0              0

Note that the overall summary, which belongs in the first two rows, is still missing. It can be added with an initial analyze call, followed by setting nested to FALSE in the subsequent split_rows_by call:

tbl <- basic_table() %>% 
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  analyze("USUBJID", afun = s_events_patients) %>% 
  split_rows_by("AEBODSYS", child_labels = "visible", nested = FALSE, indent_mod = 1)  %>%
  summarize_row_groups("USUBJID", cfun = s_events_patients) %>%
  analyze("AEDECOD", table_count_once_per_id, show_labels = "hidden", indent_mod = -1) %>%
  build_table(ADAE2, col_counts = N_per_arm)

tbl

                                                        ARM A          ARM B    
                                                       (N=146)        (N=154)   
--------------------------------------------------------------------------------
Total number of patients with at least one event     114 (78.08%)   150 (97.4%) 
Total number of events                                   2060           1058    
GASTROINTESTINAL DISORDERS                                                      
  Total number of patients with at least one event   114 (78.08%)   130 (84.42%)
  Total number of events                                 760            374     
  ABDOMINAL DISCOMFORT                                    24             28     
  ABDOMINAL FULLNESS DUE TO GAS                           18             26     
  BACK PAIN                                               0              0      
  DIARRHEA                                                17             17     
  FAECES SOFT                                             17             14     
  GINGIVAL BLEEDING                                       18             25     
  HEADACHE                                                0              0      
  HYPOTENSION                                             0              0      
  NAUSEA (INTERMITTENT)                                   20             20     
  ORTHOSTATIC HYPOTENSION                                 0              0      
  WEAKNESS                                                0              0      
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                                 
  Total number of patients with at least one event   98 (67.12%)     81 (52.6%) 
  Total number of events                                 273            142     
  ABDOMINAL DISCOMFORT                                    0              0      
  ABDOMINAL FULLNESS DUE TO GAS                           0              0      
  BACK PAIN                                               58             45     
  DIARRHEA                                                0              0      
  FAECES SOFT                                             0              0      
  GINGIVAL BLEEDING                                       0              0      
  HEADACHE                                                0              0      
  HYPOTENSION                                             0              0      
  NAUSEA (INTERMITTENT)                                   0              0      
  ORTHOSTATIC HYPOTENSION                                 0              0      
  WEAKNESS                                                40             36     
NERVOUS SYSTEM DISORDERS                                                        
  Total number of patients with at least one event   113 (77.4%)    133 (86.36%)
  Total number of events                                 787            420     
  ABDOMINAL DISCOMFORT                                    0              0      
  ABDOMINAL FULLNESS DUE TO GAS                           0              0      
  BACK PAIN                                               0              0      
  DIARRHEA                                                0              0      
  FAECES SOFT                                             0              0      
  GINGIVAL BLEEDING                                       0              0      
  HEADACHE                                               113            133     
  HYPOTENSION                                             0              0      
  NAUSEA (INTERMITTENT)                                   0              0      
  ORTHOSTATIC HYPOTENSION                                 0              0      
  WEAKNESS                                                0              0      
VASCULAR DISORDERS                                                              
  Total number of patients with at least one event    93 (63.7%)     75 (48.7%) 
  Total number of events                                 240            122     
  ABDOMINAL DISCOMFORT                                    0              0      
  ABDOMINAL FULLNESS DUE TO GAS                           0              0      
  BACK PAIN                                               0              0      
  DIARRHEA                                                0              0      
  FAECES SOFT                                             0              0      
  GINGIVAL BLEEDING                                       0              0      
  HEADACHE                                                0              0      
  HYPOTENSION                                             44             31     
  NAUSEA (INTERMITTENT)                                   0              0      
  ORTHOSTATIC HYPOTENSION                                 49             44     
  WEAKNESS                                                0              0

Finally, if we want to remove the rows with zero counts, we can do that with the trim_rows function:

trim_rows(tbl)

                                                        ARM A          ARM B    
                                                       (N=146)        (N=154)   
--------------------------------------------------------------------------------
Total number of patients with at least one event     114 (78.08%)   150 (97.4%) 
Total number of events                                   2060           1058    
GASTROINTESTINAL DISORDERS                                                      
  Total number of patients with at least one event   114 (78.08%)   130 (84.42%)
  Total number of events                                 760            374     
  ABDOMINAL DISCOMFORT                                    24             28     
  ABDOMINAL FULLNESS DUE TO GAS                           18             26     
  DIARRHEA                                                17             17     
  FAECES SOFT                                             17             14     
  GINGIVAL BLEEDING                                       18             25     
  NAUSEA (INTERMITTENT)                                   20             20     
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                                 
  Total number of patients with at least one event   98 (67.12%)     81 (52.6%) 
  Total number of events                                 273            142     
  BACK PAIN                                               58             45     
  WEAKNESS                                                40             36     
NERVOUS SYSTEM DISORDERS                                                        
  Total number of patients with at least one event   113 (77.4%)    133 (86.36%)
  Total number of events                                 787            420     
  HEADACHE                                               113            133     
VASCULAR DISORDERS                                                              
  Total number of patients with at least one event    93 (63.7%)     75 (48.7%) 
  Total number of events                                 240            122     
  HYPOTENSION                                             44             31     
  ORTHOSTATIC HYPOTENSION                                 49             44
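
As a rough mental model, and not the actual rtables implementation, trimming can be thought of as dropping every data row whose cells are all zero while leaving the remaining rows untouched. The base R sketch below applies that idea to a small made-up count matrix; the object names counts, keep and trimmed are hypothetical and exist only for this illustration.

```r
# Hypothetical counts for three preferred terms in two arms
counts <- matrix(
  c(24, 28,
     0,  0,
    17, 17),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("ABDOMINAL DISCOMFORT", "BACK PAIN", "DIARRHEA"),
    c("ARM A", "ARM B")
  )
)

# A row is kept if at least one of its cells is non-zero
keep <- rowSums(counts != 0) > 0

# drop = FALSE keeps the matrix structure even if a single row remains
trimmed <- counts[keep, , drop = FALSE]
trimmed
```

In this sketch BACK PAIN is removed because both of its cells are zero, which mirrors what happens to that row in the GASTROINTESTINAL DISORDERS block above.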

Pruning is a larger topic that is covered in a separate vignette of the rtables package.

Adverse Events By ID and By Grade

The adverse events table by ID and by grade shows how many patients had at least one adverse event per grade for different subsets of the data (e.g. defined by system organ class).

Note that the layout below does not prune the zero count grades, so they appear in the output. The overall “- Any adverse events -” group is added with an initial analyze call rather than with a custom split function.

table_count_grade_once_per_id <- function(df, labelstr = "", gradevar = "AETOXGR", idvar = "USUBJID", grade_levels = NULL) {
  
  id <- df[[idvar]]
  grade <- df[[gradevar]]
  
  # optionally fix the set of grades so that grades without records are reported as 0
  if (!is.null(grade_levels)) {
    stopifnot(all(grade %in% grade_levels))
    grade <- factor(grade, levels = grade_levels)
  }
  
  # keep only the first record of each id
  id_sel <- !duplicated(id)
  
  in_rows(
    "--Any Grade--" = sum(id_sel),
    .list = as.list(table(grade[id_sel]))
  )
}

table_count_grade_once_per_id(ex_adae, grade_levels = 1:5)

in_rows object print method:
----------------------------
       row_name formatted_cell indent_mod     row_label
1 --Any Grade--            365          0 --Any Grade--
2             1            131          0             1
3             2             70          0             2
4             3             74          0             3
5             4             25          0             4
6             5             65          0             5
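
The counting logic inside table_count_grade_once_per_id can be checked in isolation with base R. The sketch below uses a small made-up data frame (toy_ae is hypothetical and not part of rtables or the example data). It shows that !duplicated(id) keeps only the first record of each patient, so the grade that gets tabulated is the one on that first record, which is not necessarily the patient's highest grade.

```r
# Hypothetical toy data: three patients, five adverse event records
toy_ae <- data.frame(
  USUBJID = c("P1", "P1", "P2", "P3", "P3"),
  AETOXGR = c(1, 3, 2, 2, 5)
)

# Keep only the first record of each patient
id_sel <- !duplicated(toy_ae$USUBJID)

# A factor with fixed levels makes grades without records show up as 0
grade <- factor(toy_ae$AETOXGR, levels = 1:5)

any_grade <- sum(id_sel)           # number of distinct patients
per_grade <- table(grade[id_sel])  # counts by grade of the first record

any_grade
per_grade
```

Here P1 is counted under grade 1 and P3 under grade 2, although both have a later record with a higher grade. If each patient should be counted at the worst grade, the data would first have to be sorted or reduced to the maximum grade per patient; whether that happens upstream for ADAE2 is not shown in this section.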

All of the layout concepts needed to create this table have already been introduced:

basic_table() %>% 
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  analyze("AETOXGR", 
          afun = table_count_grade_once_per_id, 
          extra_args = list(grade_levels = 1:5),
          var_labels = "- Any adverse events -", show_labels = "visible") %>%
  split_rows_by("AEBODSYS", child_labels = "visible", nested = FALSE, indent_mod = 1) %>%
  summarize_row_groups(cfun = table_count_grade_once_per_id, format = "xx", indent_mod = 1) %>%
  split_rows_by("AEDECOD", child_labels = "visible", indent_mod = -2)  %>%
  analyze("AETOXGR", 
          afun = table_count_grade_once_per_id, 
          extra_args = list(grade_levels = 1:5), show_labels = "hidden") %>%
  build_table(ADAE2, col_counts = N_per_arm)

                                                   ARM A     ARM B 
                                                  (N=146)   (N=154)
-------------------------------------------------------------------
- Any adverse events -                                             
  --Any Grade--                                     114       150  
  1                                                 32        34   
  2                                                 22        30   
  3                                                 11        21   
  4                                                  8         6   
  5                                                 41        59   
GASTROINTESTINAL DISORDERS                                         
    --Any Grade--                                   114       130  
    1                                               77        96   
    2                                               37        34   
    3                                                0         0   
    4                                                0         0   
    5                                                0         0   
ABDOMINAL DISCOMFORT                                               
  --Any Grade--                                     68        49   
  1                                                 68        49   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
ABDOMINAL FULLNESS DUE TO GAS                                      
  --Any Grade--                                     73        51   
  1                                                 73        51   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
BACK PAIN                                                          
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
DIARRHEA                                                           
  --Any Grade--                                     68        40   
  1                                                 68        40   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
FAECES SOFT                                                        
  --Any Grade--                                     76        44   
  1                                                  0         0   
  2                                                 76        44   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
GINGIVAL BLEEDING                                                  
  --Any Grade--                                     80        52   
  1                                                 80        52   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
HEADACHE                                                           
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
HYPOTENSION                                                        
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
NAUSEA (INTERMITTENT)                                              
  --Any Grade--                                     83        50   
  1                                                  0         0   
  2                                                 83        50   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
ORTHOSTATIC HYPOTENSION                                            
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
WEAKNESS                                                           
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                    
    --Any Grade--                                   98        81   
    1                                                0         0   
    2                                               58        45   
    3                                               40        36   
    4                                                0         0   
    5                                                0         0   
ABDOMINAL DISCOMFORT                                               
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
ABDOMINAL FULLNESS DUE TO GAS                                      
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
BACK PAIN                                                          
  --Any Grade--                                     79        62   
  1                                                  0         0   
  2                                                 79        62   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
DIARRHEA                                                           
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
FAECES SOFT                                                        
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
GINGIVAL BLEEDING                                                  
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
HEADACHE                                                           
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
HYPOTENSION                                                        
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
NAUSEA (INTERMITTENT)                                              
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
ORTHOSTATIC HYPOTENSION                                            
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
WEAKNESS                                                           
  --Any Grade--                                     73        43   
  1                                                  0         0   
  2                                                  0         0   
  3                                                 73        43   
  4                                                  0         0   
  5                                                  0         0   
NERVOUS SYSTEM DISORDERS                                           
    --Any Grade--                                   113       133  
    1                                                0         0   
    2                                                0         0   
    3                                                0         0   
    4                                                0         0   
    5                                               113       133  
ABDOMINAL DISCOMFORT                                               
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
ABDOMINAL FULLNESS DUE TO GAS                                      
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
BACK PAIN                                                          
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
DIARRHEA                                                           
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
FAECES SOFT                                                        
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
GINGIVAL BLEEDING                                                  
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
HEADACHE                                                           
  --Any Grade--                                     113       133  
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                 113       133  
HYPOTENSION                                                        
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
NAUSEA (INTERMITTENT)                                              
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
ORTHOSTATIC HYPOTENSION                                            
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
WEAKNESS                                                           
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
VASCULAR DISORDERS                                                 
    --Any Grade--                                   93        75   
    1                                                0         0   
    2                                                0         0   
    3                                               44        31   
    4                                               49        44   
    5                                                0         0   
ABDOMINAL DISCOMFORT                                               
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
ABDOMINAL FULLNESS DUE TO GAS                                      
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
BACK PAIN                                                          
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
DIARRHEA                                                           
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
FAECES SOFT                                                        
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
GINGIVAL BLEEDING                                                  
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
HEADACHE                                                           
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
HYPOTENSION                                                        
  --Any Grade--                                     66        43   
  1                                                  0         0   
  2                                                  0         0   
  3                                                 66        43   
  4                                                  0         0   
  5                                                  0         0   
NAUSEA (INTERMITTENT)                                              
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0   
ORTHOSTATIC HYPOTENSION                                            
  --Any Grade--                                     70        54   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                 70        54   
  5                                                  0         0   
WEAKNESS                                                           
  --Any Grade--                                      0         0   
  1                                                  0         0   
  2                                                  0         0   
  3                                                  0         0   
  4                                                  0         0   
  5                                                  0         0

Response Table

The response table that we will create here is composed of 3 parts:

1. Binary response table
2. Unstratified analysis comparison vs. control group
3. Multinomial response table

Let’s start with the first part, which is fairly simple to derive:

ADRS_BESRSPI <- ex_adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  mutate(
    rsp = factor(AVALC %in% c("CR", "PR"), levels = c(TRUE, FALSE), labels = c("Responders", "Non-Responders")),
    is_rsp = (rsp == "Responders")
  )

s_proportion <- function(x, .N_col) {
   in_rows(.list = lapply(as.list(table(x)), function(xi) rcell(xi * c(1, 1/.N_col), format = "xx.xx (xx.xx%)")))
}

basic_table() %>%
  split_cols_by("ARMCD", ref_group = "ARM A") %>%
  add_colcounts() %>%
  analyze("rsp", s_proportion, show_labels = "hidden") %>%
  build_table(ADRS_BESRSPI)

                    ARM A          ARM B         ARM C    
                   (N=134)        (N=134)       (N=132)   
----------------------------------------------------------
Responders       114 (85.07%)   90 (67.16%)   120 (90.91%)
Non-Responders   20 (14.93%)    44 (32.84%)    12 (9.09%)

Note that we set the ref_group argument in split_cols_by, which has no effect on the current table because we only use the cell data for counting responders and non-responders. The ref_group argument is needed for parts 2 and 3 of the table.
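The count-and-percentage cells of the binary response table can be reproduced by hand. The following is a minimal base R sketch, not the rtables implementation: the counts are copied from the ARM A column above, and the names counts, n_col, and formatted are introduced here purely for illustration.

```r
# Counts copied from the ARM A column of the binary response table (N = 134).
counts <- c(Responders = 114, `Non-Responders` = 20)
n_col <- 134  # stands in for the .N_col argument that rtables passes in

# Multiplying a count by c(1, 1 / n_col) yields the pair (count, fraction),
# i.e. the two-value cell content that s_proportion hands to rcell().
cells <- lapply(counts, function(xi) xi * c(1, 1 / n_col))

# Hand-rolled formatting for illustration only; rtables applies its own
# format strings when the table is rendered.
formatted <- vapply(
  cells,
  function(v) sprintf("%d (%.2f%%)", as.integer(v[1]), 100 * v[2]),
  character(1)
)
formatted
```

Each cell therefore carries two numbers, and the format string decides how the pair is displayed.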

We will now look at the implementation of part “2. Unstratified analysis comparison vs. control group.” Let’s start with the analysis function:

s_unstratified_response_analysis <- function(x, .ref_group, .in_ref_col) {
  
  if (.in_ref_col) {
    return(in_rows(
        "Difference in Response Rates (%)" = rcell(numeric(0)),
        "95% CI (Wald, with correction)" = rcell(numeric(0)),
        "p-value (Chi-Squared Test)" = rcell(numeric(0)),
        "Odds Ratio (95% CI)" = rcell(numeric(0))
    ))
  }
  
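  # Two-sample test of proportions. Note that correct = FALSE disables the
  # continuity correction, so the interval reported below is the plain Wald
  # interval even though the row label reads "with correction".
  fit <- stats::prop.test(
    x = c(sum(x), sum(.ref_group)),
    n = c(length(x), length(.ref_group)),
    correct = FALSE
  )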
  
  fit_glm <- stats::glm(
    formula = rsp ~ group,
    data = data.frame(
      rsp = c(.ref_group, x), 
      group = factor(rep(c("ref", "x"), times = c(length(.ref_group), length(x))), levels = c("ref", "x"))
    ),
    family = binomial(link = "logit")
  )

  in_rows(
      "Difference in Response Rates (%)" = non_ref_rcell((mean(x) - mean(.ref_group)) * 100,
                                                         .in_ref_col, format = "xx.xx") ,
      "95% CI (Wald, with correction)" = non_ref_rcell(fit$conf.int * 100,
                                                       .in_ref_col, format = "(xx.xx, xx.xx)"),
      "p-value (Chi-Squared Test)" = non_ref_rcell(fit$p.value,
                                                   .in_ref_col, format = "x.xxxx | (<0.0001)"),
      "Odds Ratio (95% CI)" = non_ref_rcell(c(
          exp(stats::coef(fit_glm)[-1]),
          exp(stats::confint.default(fit_glm, level = .95)[-1, , drop = FALSE])
      ),
      .in_ref_col, format = "xx.xx (xx.xx - xx.xx)")
  )
}

s_unstratified_response_analysis(
  x = ADRS_BESRSPI %>% filter(ARM == "A: Drug X") %>% pull(is_rsp), 
  .ref_group = ADRS_BESRSPI %>% filter(ARM == "B: Placebo") %>% pull(is_rsp),
  .in_ref_col = FALSE
)

in_rows object print method:
----------------------------
                          row_name     formatted_cell indent_mod
1 Difference in Response Rates (%)              17.91          0
2   95% CI (Wald, with correction)      (7.93, 27.89)          0
3       p-value (Chi-Squared Test)             0.0006          0
4              Odds Ratio (95% CI) 2.79 (1.53 - 5.06)          0
                         row_label
1 Difference in Response Rates (%)
2   95% CI (Wald, with correction)
3       p-value (Chi-Squared Test)
4              Odds Ratio (95% CI)
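The printed figures can be cross-checked with closed-form expressions. The sketch below assumes that the responder counts shown in the binary response table (114 of 134 and 90 of 134) are the ones behind the two groups in the call above; all variable names are illustrative. It computes the difference in rates with a Wald interval without continuity correction, and the odds ratio with a Wald interval on the log scale, which for a single binary covariate should agree with the logistic regression fit.

```r
# Responder counts and group sizes read off the binary response table.
r_x <- 114; n_x <- 134       # comparison group (argument x)
r_ref <- 90; n_ref <- 134    # reference group (argument .ref_group)

p_x <- r_x / n_x
p_ref <- r_ref / n_ref
z <- qnorm(0.975)

# Difference in response rates with a Wald interval, no continuity correction.
rate_diff <- p_x - p_ref
se_diff <- sqrt(p_x * (1 - p_x) / n_x + p_ref * (1 - p_ref) / n_ref)
ci_diff <- rate_diff + c(-1, 1) * z * se_diff

# Odds ratio from the 2x2 table with a Wald interval on the log scale.
odds_ratio <- (r_x / (n_x - r_x)) / (r_ref / (n_ref - r_ref))
se_log_or <- sqrt(1 / r_x + 1 / (n_x - r_x) + 1 / r_ref + 1 / (n_ref - r_ref))
ci_or <- exp(log(odds_ratio) + c(-1, 1) * z * se_log_or)

round(100 * c(rate_diff, ci_diff), 2)  # compare with rows 1 and 2 above
round(c(odds_ratio, ci_or), 2)         # compare with row 4 above
```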

Hence we can now add the next section to the table:

basic_table() %>%
  split_cols_by("ARMCD", ref_group = "ARM A") %>%
  add_colcounts() %>%
  analyze("rsp", s_proportion, show_labels = "hidden") %>%
  analyze("is_rsp", s_unstratified_response_analysis, show_labels = "visible", var_labels = "Unstratified Response Analysis") %>%
  build_table(ADRS_BESRSPI)

                                        ARM A             ARM B               ARM C       
                                       (N=134)           (N=134)             (N=132)      
------------------------------------------------------------------------------------------
Responders                           114 (85.07%)      90 (67.16%)         120 (90.91%)   
Non-Responders                       20 (14.93%)       44 (32.84%)          12 (9.09%)    
Unstratified Response Analysis                                                            
  Difference in Response Rates (%)                       -17.91                5.83       
  95% CI (Wald, with correction)                     (-27.89, -7.93)      (-1.94, 13.61)  
  p-value (Chi-Squared Test)                             0.0006               0.1436      
  Odds Ratio (95% CI)                               0.36 (0.2 - 0.65)   1.75 (0.82 - 3.75)

Next we will add part 3, the “multinomial response table”. To do so, we add a row split by response level and then do the same thing as we did for the binary response table above.

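s_prop <- function(df, .N_col) {
  # binom.test() returns the exact (Clopper-Pearson) interval for the
  # proportion nrow(df) / .N_col; the row label is kept as in the table shell.
  in_rows(
    "95% CI (Wald, with correction)" = rcell(binom.test(nrow(df), .N_col)$conf.int * 100, format = "(xx.xx, xx.xx)")
  )
}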

s_prop(
  df = ADRS_BESRSPI %>% filter(ARM == "A: Drug X", AVALC == "CR"), 
  .N_col = sum(ADRS_BESRSPI$ARM == "A: Drug X")
)

in_rows object print method:
----------------------------
                        row_name formatted_cell indent_mod
1 95% CI (Wald, with correction) (49.38, 66.67)          0
                       row_label
1 95% CI (Wald, with correction)
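To see which interval s_prop actually reports, the calculation can be repeated in base R. This sketch uses the count from the call above (78 complete responders among 134 patients); the variable names are illustrative. It compares the binom.test interval with the Clopper-Pearson limits written out as beta quantiles and, for contrast only, with a plain Wald interval.

```r
# 78 complete responders among 134 patients, as in the s_prop() call above.
x <- 78
n <- 134

# Interval reported by binom.test(), which is what s_prop() uses.
ci_binom <- as.numeric(binom.test(x, n)$conf.int)

# Clopper-Pearson (exact) limits written out with beta quantiles.
ci_exact <- c(qbeta(0.025, x, n - x + 1), qbeta(0.975, x + 1, n - x))

# Wald (normal approximation) interval, shown only for comparison.
p_hat <- x / n
ci_wald <- p_hat + c(-1, 1) * qnorm(0.975) * sqrt(p_hat * (1 - p_hat) / n)

round(100 * rbind(binom_test = ci_binom, clopper_pearson = ci_exact, wald = ci_wald), 2)
```

The first two rows should coincide, which suggests that the printed interval is the exact one rather than a Wald interval.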

We can now create the final response table with all three parts:

basic_table() %>%
  split_cols_by("ARMCD", ref_group = "ARM A") %>%
  add_colcounts() %>%
  analyze("rsp", s_proportion, show_labels = "hidden") %>%
  analyze("is_rsp", s_unstratified_response_analysis, 
          show_labels = "visible", var_labels = "Unstratified Response Analysis") %>%
  split_rows_by(
    var = "AVALC",
    split_fun = reorder_split_levels(neworder = c("CR", "PR", "SD", "NON CR/PD", "PD", "NE"), drlevels = TRUE), 
    nested = FALSE
  ) %>%
  summarize_row_groups() %>%
  analyze("AVALC", afun = s_prop) %>%
  build_table(ADRS_BESRSPI)

                                         ARM A              ARM B               ARM C       
                                        (N=134)            (N=134)             (N=132)      
--------------------------------------------------------------------------------------------
Responders                            114 (85.07%)       90 (67.16%)         120 (90.91%)   
Non-Responders                        20 (14.93%)        44 (32.84%)          12 (9.09%)    
Unstratified Response Analysis                                                              
  Difference in Response Rates (%)                         -17.91                5.83       
  95% CI (Wald, with correction)                       (-27.89, -7.93)      (-1.94, 13.61)  
  p-value (Chi-Squared Test)                               0.0006               0.1436      
  Odds Ratio (95% CI)                                 0.36 (0.2 - 0.65)   1.75 (0.82 - 3.75)
CR                                     78 (58.2%)         55 (41%)            97 (73.5%)    
  95% CI (Wald, with correction)     (49.38, 66.67)    (32.63, 49.87)       (65.1, 80.79)   
PR                                     36 (26.9%)        35 (26.1%)           23 (17.4%)    
  95% CI (Wald, with correction)     (19.58, 35.2)     (18.92, 34.41)       (11.38, 24.99)  
SD                                     20 (14.9%)        44 (32.8%)           12 (9.1%)     
  95% CI (Wald, with correction)     (9.36, 22.11)     (24.97, 41.47)       (4.79, 15.34)

If we wanted to rename the levels of AVALC and remove the CI for NE, we could do that as follows:

rsp_label <- function(x) {
  rsp_full_label <- c(
    CR          = "Complete Response (CR)",
    PR          = "Partial Response (PR)",
    SD          = "Stable Disease (SD)",
    `NON CR/PD` = "Non-CR or Non-PD (NON CR/PD)",
    PD          = "Progressive Disease (PD)",
    NE          = "Not Evaluable (NE)",
    Missing     = "Missing",
    `NE/Missing` = "Missing or unevaluable"
  )
  stopifnot(all(x %in% names(rsp_full_label)))
  rsp_full_label[x]
}


tbl <- basic_table() %>%
  split_cols_by("ARMCD", ref_group = "ARM A") %>%
  add_colcounts() %>%
  analyze("rsp", s_proportion, show_labels = "hidden") %>%
  analyze("is_rsp", s_unstratified_response_analysis, 
          show_labels = "visible", var_labels = "Unstratified Response Analysis") %>%
  split_rows_by(
    var = "AVALC",
    split_fun = keep_split_levels(c("CR", "PR", "SD", "NON CR/PD", "PD"), reorder = TRUE), 
    nested = FALSE
  ) %>%
  summarize_row_groups(cfun = function(df, labelstr, .N_col) {
    in_rows(nrow(df) * c(1, 1/.N_col), .formats = "xx (xx.xx%)", .labels = rsp_label(labelstr))
  }) %>%
  analyze("AVALC", afun = s_prop) %>%
  analyze("AVALC", afun = function(x, .N_col) {
    in_rows(rcell(sum(x == "NE") * c(1, 1/.N_col), format = "xx.xx (xx.xx%)"), .labels = rsp_label("NE"))
  }, nested = FALSE) %>%
  build_table(ADRS_BESRSPI)

tbl

                                         ARM A              ARM B               ARM C       
                                        (N=134)            (N=134)             (N=132)      
--------------------------------------------------------------------------------------------
Responders                            114 (85.07%)       90 (67.16%)         120 (90.91%)   
Non-Responders                        20 (14.93%)        44 (32.84%)          12 (9.09%)    
Unstratified Response Analysis                                                              
  Difference in Response Rates (%)                         -17.91                5.83       
  95% CI (Wald, with correction)                       (-27.89, -7.93)      (-1.94, 13.61)  
  p-value (Chi-Squared Test)                               0.0006               0.1436      
  Odds Ratio (95% CI)                                 0.36 (0.2 - 0.65)   1.75 (0.82 - 3.75)
Complete Response (CR)                78 (58.21%)        55 (41.04%)         97 (73.48%)    
  95% CI (Wald, with correction)     (49.38, 66.67)    (32.63, 49.87)       (65.1, 80.79)   
Partial Response (PR)                 36 (26.87%)        35 (26.12%)         23 (17.42%)    
  95% CI (Wald, with correction)     (19.58, 35.2)     (18.92, 34.41)       (11.38, 24.99)  
Stable Disease (SD)                   20 (14.93%)        44 (32.84%)          12 (9.09%)    
  95% CI (Wald, with correction)     (9.36, 22.11)     (24.97, 41.47)       (4.79, 15.34)   
Progressive Disease (PD)                 0 (0%)            0 (0%)               0 (0%)      
  95% CI (Wald, with correction)       (0, 2.72)          (0, 2.72)           (0, 2.76)     
Not Evaluable (NE)                       0 (0%)            0 (0%)               0 (0%)

Note that the table is missing the row gaps that would make it more readable. The row spacing feature is on the rtables roadmap and will be implemented in the future.

Conclusion

The topic of tables poses a rich set of problems in its own right, including but not limited to: table data structures, tabulation, outputting, formatting, and table processing. We are still actively working on rtables and expect the framework to keep evolving over the next year to meet all requirements for submitting clinical trial data analyses in a regulatory context. We also hope that our framework proves useful for other industries that rely on visualizing data with tables.

We would like to thank Roche for financing the rtables project and allowing it to be developed open source. Further, we would like to thank the NEST project team members (at Roche) for their valuable feedback and involvement in the refinement of rtables. In particular, many thanks go to Tadeusz Lewandowski, who is the NEST business lead, and to the subject matter expert team members: Nick Paszty, Jana Stoilova, Heng Wang, Francois Collin, Daniel Sabanés Bové, and Nina Qi.

A Not So Short Introduction to rtables

Gabriel Becker & Adrian Waddell

October 31, 2020
