5 Data and Environment Setup

5.1 Source Setup Script

To run the code in this book, first run “R/setup-script.R” (paths are relative to the project root), which installs and loads the required packages and reads the data. The data are located in “data/preregistration_3_data_public.csv”.

source("R/setup-script.R")
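Sourcing fails with an unhelpful error if the working directory is not the project root. The following is a minimal sketch of a guarded version, assuming both paths are relative to the project root; the variable names `setup_path`, `data_path`, and `missing_files` are illustrative and are not part of the book's code.

```r
# Paths as given in the text, assumed relative to the project root.
setup_path <- "R/setup-script.R"
data_path  <- "data/preregistration_3_data_public.csv"

# Collect whichever of the two files cannot be found from here.
paths <- c(setup_path, data_path)
missing_files <- paths[!file.exists(paths)]

if (length(missing_files) > 0) {
  # Report the problem instead of failing inside source().
  message(
    "Not found relative to ", getwd(), ": ",
    paste(missing_files, collapse = ", ")
  )
} else {
  source(setup_path)
}
```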

5.2 Session Info

R version 4.5.0 (2025-04-11)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.9.4    forcats_1.0.0      stringr_1.5.1      dplyr_1.1.4       
 [5] purrr_1.0.4        readr_2.1.5        tidyr_1.3.1        tibble_3.2.1      
 [9] tidyverse_2.0.0    ggpubr_0.6.0       rstatix_0.7.2      showtext_0.9-7    
[13] showtextdb_3.0     sysfonts_0.8.9     patchwork_1.3.0    GGally_2.2.1      
[17] ggplot2_3.5.2      sjtable2df_0.0.4   sjlabelled_1.2.0   sjmisc_2.8.10     
[21] sjPlot_2.8.17      lmerTest_3.1-3     lme4_1.1-37        mediation_4.5.0   
[25] sandwich_3.1-1     mvtnorm_1.3-3      Matrix_1.7-3       MASS_7.3-65       
[29] gvlma_1.0.0.3      performance_0.15.1 kableExtra_1.4.0   pacman_0.5.1      

loaded via a namespace (and not attached):
 [1] Rdpack_2.6.4        gridExtra_2.3       rlang_1.1.6        
 [4] magrittr_2.0.3      compiler_4.5.0      systemfonts_1.2.2  
 [7] vctrs_0.6.5         crayon_1.5.3        pkgconfig_2.0.3    
[10] fastmap_1.2.0       backports_1.5.0     rmarkdown_2.29     
[13] tzdb_0.5.0          nloptr_2.2.1        bit_4.6.0          
[16] xfun_0.52           jsonlite_2.0.0      ggeffects_2.2.1    
[19] parallel_4.5.0      broom_1.0.8         cluster_2.1.8.1    
[22] R6_2.6.1            stringi_1.8.7       RColorBrewer_1.1-3 
[25] car_3.1-3           boot_1.3-31         rpart_4.1.24       
[28] numDeriv_2016.8-1.1 Rcpp_1.0.14         knitr_1.50         
[31] zoo_1.8-14          base64enc_0.1-3     splines_4.5.0      
[34] nnet_7.3-20         timechange_0.3.0    tidyselect_1.2.1   
[37] rstudioapi_0.17.1   abind_1.4-8         yaml_2.3.10        
[40] codetools_0.2-20    curl_6.2.2          lattice_0.22-7     
[43] plyr_1.8.9          withr_3.0.2         evaluate_1.0.3     
[46] foreign_0.8-90      archive_1.1.12      ggstats_0.9.0      
[49] xml2_1.3.8          lpSolve_5.6.23      pillar_1.10.2      
[52] carData_3.0-5       checkmate_2.3.2     reformulas_0.4.0   
[55] insight_1.4.2       generics_0.1.3      vroom_1.6.5        
[58] hms_1.1.3           scales_1.4.0        minqa_1.2.8        
[61] glue_1.8.0          Hmisc_5.2-3         tools_4.5.0        
[64] data.table_1.17.8   ggsignif_0.6.4      grid_4.5.0         
[67] rbibutils_2.3       datawizard_1.2.0    colorspace_2.1-1   
[70] nlme_3.1-168        htmlTable_2.4.3     Formula_1.2-5      
[73] cli_3.6.5           viridisLite_0.4.2   svglite_2.1.3      
[76] sjstats_0.19.0      gtable_0.3.6        digest_0.6.37      
[79] htmlwidgets_1.6.4   farver_2.1.2        htmltools_0.5.8.1  
[82] lifecycle_1.0.4     bit64_4.6.0-1      
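
Output of this form is what `sessionInfo()` prints. To check whether a local installation matches the versions recorded above, something like the following sketch may help. It compares only three packages chosen for illustration, and the object names (`recorded`, `installed`, `version_check`) are hypothetical.

```r
# A few versions copied from the session info above.
recorded <- c(lme4 = "1.1-37", mediation = "4.5.0", dplyr = "1.1.4")

# Look up the locally installed version; NA if the package is not installed.
installed <- vapply(names(recorded), function(pkg) {
  tryCatch(
    as.character(packageVersion(pkg)),
    error = function(e) NA_character_
  )
}, character(1))

# Normalise the recorded strings the same way before comparing,
# so that "1.1-37" and "1.1.37" are treated as the same version.
version_check <- data.frame(
  package   = names(recorded),
  recorded  = unname(recorded),
  installed = unname(installed),
  matches   = !is.na(installed) &
    as.character(package_version(recorded)) == installed
)

version_check
```

A mismatch does not necessarily change the results, but it is the first thing to rule out if the output differs from the book.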

5.3 Datasets

The analyses in the main text use two forms of the data, one for each research question / analysis type. The characteristics of each dataset are summarized below.

RQ1 Data

These data are in long format: (22 items x 149 participants - 3 NA values) x 3 judgments per item = 9,825 observations.
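
The row count can be checked with a few lines of arithmetic. The variable names here are illustrative only.

```r
n_items        <- 22   # 10 baseline + 12 posttest items
n_participants <- 149
n_missing      <- 3    # item responses recorded as NA
n_judgments    <- 3    # Confidence, Anxiety, Difficulty

# Item responses that were actually observed.
n_responses <- n_items * n_participants - n_missing

# Each observed response contributes one row per judgment.
n_rows <- n_responses * n_judgments

n_responses  # 3275
n_rows       # 9825
```

The intermediate value of 3,275 matches the number of rows per level of perception in the summaries below.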

Note

The variable perception is equivalent to “judgment” in the manuscript.

data_rq1 |> glimpse()
Rows: 9,825
Columns: 17
$ Participant           <chr> "mvU3yT4uTFpW58Z0", "mvU3yT4uTFpW58Z0", "mvU3yT4…
$ Condition             <fct> Control, Control, Control, Control, Control, Con…
$ Gender                <fct> Men, Men, Men, Men, Men, Men, Men, Men, Men, Men…
$ Cohort                <fct> Cohort 1, Cohort 1, Cohort 1, Cohort 1, Cohort 1…
$ Timepoint             <fct> Baseline, Baseline, Baseline, Baseline, Baseline…
$ Semester_Week         <dbl> 1.912752, 1.912752, 1.912752, 1.912752, 1.912752…
$ Test_Version          <fct> B, B, B, B, B, B, B, B, B, B, B, B, B, B, B, B, …
$ Part                  <chr> "Quantitative", "Quantitative", "Quantitative", …
$ Question              <dbl> 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, …
$ Item                  <chr> "B01", "B01", "B01", "B02", "B02", "B02", "B03",…
$ Score                 <dbl> 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, …
$ `Item-Level Accuracy` <fct> Incorrect, Incorrect, Incorrect, Incorrect, Inco…
$ Accuracy_Raw          <dbl> 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, …
$ Baseline_Threat       <dbl> 1.63132, 1.63132, 1.63132, 1.63132, 1.63132, 1.6…
$ EMA_Threat            <dbl> 3.416667, 3.416667, 3.416667, 3.416667, 3.416667…
$ perception            <fct> Confidence, Anxiety, Difficulty, Confidence, Anx…
$ rating                <dbl> 1, 6, 6, 5, 6, 2, 4, 6, 3, 5, 6, 4, 5, 6, 4, 4, …

Participant-Level Variables

rq1_participants <- data_rq1 |>
  select(
    Participant,
    Condition,
    Gender,
    Cohort,
    Semester_Week,
    Baseline_Threat
  ) |>
  unique()

rq1_participants |>
  select(-where(is.numeric)) |>
  summary()
 Participant              Condition                  Gender        Cohort  
 Length:149         Control    :73   Men                :66   Cohort 1:61  
 Class :character   Mindfulness:76   Women or Non-binary:83   Cohort 2:16  
 Mode  :character                                             Cohort 3:72  
rq1_participants |>
  get_summary_stats(show = c("min", "max", "mean", "sd"))
variable          n     min     max     mean   sd
Semester_Week     149   -4.087  4.913   0      2.790
Baseline_Threat   149   -3.102  2.865   0      1.297

Item (Observation)-Level Variables

Note

Score can take values between 0 and 1 because four of the items (A01, B01, A10, and B10) were open-ended and could be awarded partial credit. Item-Level Accuracy (factor version) and Accuracy_Raw (numeric version) were derived from Score to compare incorrect responses with those that were at least partially correct (Score > 0).
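
As a minimal sketch of this derivation, using made-up scores rather than the study data, the two accuracy variables can be built from Score as follows. The object names are illustrative, and the actual setup script may do this differently.

```r
# Toy scores: 0 = incorrect, 1 = fully correct, in between = partial credit.
score <- c(0, 0.5, 1, 0, 0.25)

# Numeric version: 1 if the response earned any credit, else 0.
accuracy_raw <- as.numeric(score > 0)

# Factor version with the labels used in the summaries.
item_level_accuracy <- factor(
  accuracy_raw,
  levels = c(0, 1),
  labels = c("Incorrect", "Correct")
)

data.frame(score, accuracy_raw, item_level_accuracy)
```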

data_rq1 |>
  select(
    Item,
    Timepoint,
    Test_Version,
    `Item-Level Accuracy`,
    perception
  ) |>
  summary()
     Item              Timepoint    Test_Version Item-Level Accuracy
 Length:9825        Baseline:4470   A:4908       Incorrect:5547     
 Class :character   Posttest:5355   B:4917       Correct  :4278     
 Mode  :character                                                   
      perception  
 Confidence:3275  
 Anxiety   :3275  
 Difficulty:3275  
data_rq1 |>
  select(
    Score,
    Accuracy_Raw,
    rating
  ) |>
  get_summary_stats(show = c("min", "max", "mean", "sd"))
variable       n      min   max   mean    sd
Score          9825   0     1     0.404   0.478
Accuracy_Raw   9825   0     1     0.435   0.496
rating         9825   1     6     3.474   1.408

RQ2 Data

These data are in wide format, so every variable varies at the participant level. Judgment ratings and accuracy scores are averaged within participant at baseline and at posttest, and continuous variables are standardized. Because the mediation package does not accept factors with contrast coding, Cohort is additionally represented by numeric dummy variables (Cohort_2 and Cohort_3); this has no bearing on the results. One participant’s data were removed because they did not complete any EMA surveys.
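
The long-to-wide step described above can be sketched in base R with toy data. This is only an illustration under assumed names (`toy_long`, `toy_wide`, `z_cols`) and a single judgment type; the actual setup script may use different functions.

```r
# Toy long-format ratings: 3 participants, 2 items per timepoint.
toy_long <- data.frame(
  Participant = rep(c("p1", "p2", "p3"), each = 4),
  Timepoint   = rep(rep(c("Baseline", "Posttest"), each = 2), times = 3),
  rating      = c(1, 3, 2, 4, 5, 5, 6, 4, 2, 2, 3, 5)
)

# Average the ratings within participant and timepoint.
means <- aggregate(rating ~ Participant + Timepoint, data = toy_long, FUN = mean)

# Spread timepoints into columns: one row per participant.
toy_wide <- reshape(
  means,
  idvar = "Participant", timevar = "Timepoint", direction = "wide"
)
names(toy_wide) <- sub("^rating\\.(.*)$", "\\1_Confidence", names(toy_wide))

# Standardize the continuous columns to mean 0 and SD 1.
z_cols <- c("Baseline_Confidence", "Posttest_Confidence")
toy_wide[z_cols] <- lapply(toy_wide[z_cols], function(x) as.numeric(scale(x)))

toy_wide
```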

data_rq2 |> glimpse()
Rows: 148
Columns: 21
$ Participant           <chr> "mvU3yT4uTFpW58Z0", "bTvjXPvFUSNyAt5t", "hx3UKKU…
$ Condition             <fct> Control, Control, Control, Control, Control, Con…
$ Gender                <fct> Men, Men, Men, Women or Non-binary, Men, Women o…
$ Cohort                <fct> Cohort 1, Cohort 1, Cohort 1, Cohort 1, Cohort 3…
$ Semester_Week         <dbl> 0.6786886, 0.6786886, 0.6786886, 1.0374240, -0.7…
$ Baseline_Threat       <dbl> 1.25100904, -0.75116043, -1.05918650, 1.89273003…
$ EMA_Threat            <dbl> 1.55096412, 1.29550795, -0.95250627, 1.29550795,…
$ Baseline_n_items      <int> 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, …
$ Posttest_n_items      <int> 12, 12, 12, 12, 12, 12, 11, 12, 12, 12, 12, 12, …
$ Baseline_Score        <dbl> -0.31125932, 0.33654914, -0.49120612, -1.6068762…
$ Posttest_Score        <dbl> -1.34709242, 1.43972775, -0.78972839, -2.4618204…
$ Baseline_Confidence   <dbl> -0.37485943, 0.01220473, 0.39926888, -2.18115882…
$ Posttest_Confidence   <dbl> 0.23891063, 0.53917812, 0.53917812, -3.36429935,…
$ Baseline_Anxiety      <dbl> 2.49202355, -0.85159304, -0.57295832, 1.65611940…
$ Posttest_Anxiety      <dbl> 2.15254688, -0.25189270, -0.99809809, 2.40128201…
$ Baseline_Difficulty   <dbl> 1.15470752, -1.16207408, -0.20810518, 2.24495769…
$ Posttest_Difficulty   <dbl> 1.4257825, -0.7405502, -0.0184393, 2.4573694, 0.…
$ Baseline_Test_Version <chr> "B", "A", "B", "A", "B", "A", "A", "B", "A", "A"…
$ Posttest_Test_Version <fct> A, B, A, B, A, B, B, A, B, B, A, B, B, B, B, B, …
$ Cohort_2              <dbl> -0.3469771, -0.3469771, -0.3469771, -0.3469771, …
$ Cohort_3              <dbl> -0.9569993, -0.9569993, -0.9569993, -0.9569993, …
data_rq2 |>
  mutate(across(Baseline_Test_Version, factor)) |>
  purrr::discard(is.numeric) |>
  summary()
 Participant              Condition                  Gender        Cohort  
 Length:148         Control    :73   Men                :66   Cohort 1:61  
 Class :character   Mindfulness:75   Women or Non-binary:82   Cohort 2:16  
 Mode  :character                                             Cohort 3:71  
 Baseline_Test_Version Posttest_Test_Version
 A:75                  A:73                 
 B:73                  B:75                 
                                            
data_rq2 |>
  get_summary_stats(show = c("min", "max", "mean", "sd"))
variable              n     min      max      mean    sd
Semester_Week         148   -1.474   1.755    0.00    1.000
Baseline_Threat       148   -2.394   2.201    0.00    1.000
EMA_Threat            148   -2.230   2.420    0.00    1.000
Baseline_n_items      148   10.000   10.000   10.00   0.000
Posttest_n_items      148   11.000   12.000   11.98   0.141
Baseline_Score        148   -2.111   3.511    0.00    1.000
Posttest_Score        148   -2.462   2.957    0.00    1.000
Baseline_Confidence   148   -3.213   2.206    0.00    1.000
Posttest_Confidence   148   -3.464   1.840    0.00    1.000
Baseline_Anxiety      148   -2.152   2.492    0.00    1.000
Posttest_Anxiety      148   -1.993   2.650    0.00    1.000
Baseline_Difficulty   148   -2.389   2.381    0.00    1.000
Posttest_Difficulty   148   -2.701   2.457    0.00    1.000
Cohort_2              148   -0.347   2.863    0.00    1.000
Cohort_3              148   -0.957   1.038    0.00    1.000
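
The two values taken by Cohort_2 and Cohort_3 in the summary above are consistent with 0/1 indicator variables that were then standardized. The sketch below reproduces them from the cohort counts reported for data_rq2 (61, 16, and 71). This is an inference from the printed values, not a description of the actual setup script, and the object names are illustrative.

```r
# Rebuild a Cohort factor with the counts reported for data_rq2.
cohort <- factor(
  rep(c("Cohort 1", "Cohort 2", "Cohort 3"), times = c(61, 16, 71))
)

# 0/1 indicators with Cohort 1 as the reference level, then z-scored.
cohort_2 <- as.numeric(scale(as.numeric(cohort == "Cohort 2")))
cohort_3 <- as.numeric(scale(as.numeric(cohort == "Cohort 3")))

round(range(cohort_2), 3)  # -0.347  2.863
round(range(cohort_3), 3)  # -0.957  1.038
```

Because each variable is a linear rescaling of a 0/1 indicator, entering these columns in place of the factor changes the scale of the Cohort coefficients but not the model fit.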