readyforwhatsnext is a prototype modular and open source economic model of youth mental health that is being implemented in R. The project is led by researchers at Monash University.

What makes the readyforwhatsnext model modular?

readyforwhatsnext is developed with ready4 - a software framework for transparent, reusable and updatable health economic models. The model is comprised of four sub-models. Each sub-model is comprised of model modules that can be independently reused (e.g., in other models) and safely and flexibly combined (e.g., to model more extensive systems).

What is it being used for?

Currently, readyforwhatsnext is being applied to explore multiple economic topics in youth mental health.

Can I use it?

readyforwhatsnext is publicly available and free for you to assess (to verify and validate), apply (to generate novel insights into decision problems of interest to you) and adapt into your own derivative works (to leverage and enhance the work of others) under liberal terms of use.

Why is it a prototype?

Currently, readyforwhatsnext model software is only available in the form of development releases. That means readyforwhatsnext modules may require more development, documentation and testing before they can be confidently used for scientific purposes other than the specific studies to which our development team has already applied them.

Can I help?

readyforwhatsnext is a collaborative project and we’d love your help in progressing our priority project goals! You can help fund our development, contribute code improvements, enhance our documentation and community support, give us advice and/or lead a modelling project.

Where should I go next?

We’d recommend reading the documentation in the order in which sections appear in the table of contents (so go next to Examples, then to Getting started and so on). A scientific manuscript is also available.

2 - Examples

See how readyforwhatsnext has been applied to model real world decision problems in youth mental health.

A scientific summary briefly introducing the readyforwhatsnext model is available as a pre-print manuscript.

Examples of how we are applying readyforwhatsnext in youth mental health include:

Mapping psychological and functional measures to health utility for young people using primary mental health services (a pre-print manuscript, a replication and results dataset, replication code and R packages for developing and applying mapping models);
Assessing the heterogeneity of quality of life in a clinical sample of young people (a peer-reviewed manuscript and R package for exploring and characterising heterogeneity in quality of life data);
Predicting the online helpseeking choices of socially anxious young people (an R package for designing and analysing discrete choice experiments, a replication dataset and replication code - a manuscript will also be available shortly); and
Predicting the future spatial distribution of mental disorder in young people (a policy briefing, an app and a dataset of sample reports generated by the app).

3 - Getting started

What you need to know to start using readyforwhatsnext.

3.1 - System requirements

What you need in order to be able to use readyforwhatsnext model software on your machine.

Currently, all readyforwhatsnext model software is written in R (for model module libraries), R Markdown (for analysis programs and reporting sub-routines) and JavaScript (for the user interface component of Shiny applications) using the ready4 framework.

Therefore:

to use readyforwhatsnext model module libraries and programs / subroutines you must have an up to date version of R and the ready4 R library installed on your machine and it is recommended that you install the RStudio IDE; and
the requirements for using readyforwhatsnext model user interfaces depend on whether you are running a version we have deployed to the web (in which case you just need a supported browser) or whether you are running the app on your local machine (in which case you will need R, the ready4 library and RStudio).
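The local requirements listed above can be checked from within an R session. The following is a minimal sketch only: the function name check_requirements and the minimum R version of 4.1.0 are assumptions made for illustration, as no specific minimum version is stated here.

```r
# Hypothetical pre-flight check. The default minimum R version is an
# assumption for illustration, not a documented requirement.
check_requirements <- function(min_r_version = "4.1.0") {
  c(
    # Is the running version of R at least the assumed minimum?
    r_recent_enough = getRversion() >= min_r_version,
    # Is the ready4 library installed (it is not loaded by this check)?
    ready4_installed = requireNamespace("ready4", quietly = TRUE),
    # RStudio sets the RSTUDIO environment variable to "1" in its sessions.
    rstudio_session = identical(Sys.getenv("RSTUDIO"), "1")
  )
}

print(check_requirements())
```

A FALSE value for rstudio_session is not a problem in itself, since RStudio is recommended rather than required for module libraries.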

3.2 - Installing readyforwhatsnext model modules

To implement a modelling analysis with readyforwhatsnext you need to install model module R libraries.

Before you install

If you plan on using existing readyforwhatsnext modules for a modelling project, you can review currently available module libraries, to identify which libraries are relevant to your project.

However, please note that no readyforwhatsnext module library is yet available as a [production release](https://www.ready4-dev.com/docs/software/status/production-releases/). You should therefore understand the limitations of using readyforwhatsnext model software development releases before you make the decision to install this software.

Installation

readyforwhatsnext module libraries are currently only available as development releases, so you will need to use a tool like devtools to assist with installing readyforwhatsnext R packages directly from our GitHub organisation. If you do not have devtools on your machine you can install it with the following command.

install.packages("devtools")

The command to install each readyforwhatsnext module takes the following format.

devtools::install_github("ready4-dev/PACKAGE_NAME")

For example, if you are planning to predict health utility using some of the mapping algorithms that we have previously developed, you can install the youthu library with the following command.

devtools::install_github("ready4-dev/youthu")
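If you need several module libraries, it may be convenient to install only those that are missing. The sketch below is one possible approach and not part of the ready4 tooling: the helper names make_repo_slugs and install_missing_modules are hypothetical, and the default dry-run mode only reports what would be installed.

```r
# Hypothetical helper: build GitHub repository slugs for package names.
make_repo_slugs <- function(pkgs, org = "ready4-dev") {
  # Guard against paste0() recycling a zero-length input to "org/".
  if (length(pkgs) == 0) return(character(0))
  paste0(org, "/", pkgs)
}

# Hypothetical helper: find (and optionally install) packages that are
# not yet installed. With dry_run = TRUE nothing is installed.
install_missing_modules <- function(pkgs, dry_run = TRUE) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  slugs <- make_repo_slugs(pkgs[!is_installed])
  if (!dry_run) {
    for (slug in slugs) devtools::install_github(slug)
  }
  invisible(slugs)
}

# Dry run: lists the repositories that would be installed.
print(install_missing_modules(c("youthu", "youthvars")))
```

Calling the function with dry_run = FALSE performs the installation and requires devtools and internet access.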

Configuration

A small number of readyforwhatsnext modules require that you configure some of the dependencies installed with them before they can be used. In particular:

if you are using modules from the TTU package to undertake a utility mapping study, you will need to have both installed and configured the cmdstanr R package as per the instructions on that package’s documentation website; and
if you are using the mychoice package to undertake a discrete choice experiment study and are using a Mac, you need to ensure that you have a Fortran compiler installed. Relevant advice is available at https://mac.r-project.org/tools/.
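A quick check of whether these dependencies appear to be present might look like the sketch below. It assumes that the Fortran compiler is gfortran and that it is discoverable on your PATH, which may not hold for every setup; the function name check_module_config is hypothetical. Note that finding the cmdstanr package installed does not confirm that CmdStan itself has been configured.

```r
# Hypothetical check of the optional dependencies described above.
check_module_config <- function() {
  c(
    # Is the cmdstanr R package installed? (Does not verify CmdStan itself.)
    cmdstanr_installed = requireNamespace("cmdstanr", quietly = TRUE),
    # Assumption: the Fortran compiler is gfortran and is on the PATH.
    gfortran_on_path = unname(nzchar(Sys.which("gfortran")))
  )
}

status <- check_module_config()
print(status)
if (!status[["cmdstanr_installed"]]) {
  message("cmdstanr not found: see that package's documentation website.")
}
```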

Try it out!

Before you apply readyforwhatsnext modules to your own project, you should make sure you can run some or all of the example code included in relevant library vignette articles. The URL for each library's vignettes takes the form https://ready4-dev.github.io/PACKAGE_NAME/articles/ (e.g. the vignettes for the youthvars package are available at https://ready4-dev.github.io/youthvars/articles/).
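The URL pattern above can be filled in programmatically. This is a small illustrative sketch; the helper name make_vignettes_url is hypothetical.

```r
# Hypothetical helper: build the vignettes URL for a module library,
# following the URL pattern described above.
make_vignettes_url <- function(pkg) {
  sprintf("https://ready4-dev.github.io/%s/articles/", pkg)
}

url <- make_vignettes_url("youthvars")
print(url)

# Open the page in a browser when working interactively.
if (interactive()) utils::browseURL(url)
```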

3.3 - Terms of use

readyforwhatsnext is distributed without warranties under open source licenses - we just ask you to appropriately cite it.

3.3.1 - Open source licensing

readyforwhatsnext is freely available to all under copy-left licensing arrangements.

To help ensure the models we develop are as transparent as possible and to make their algorithms as useful to others as possible, all readyforwhatsnext software is free and open-source. You are encouraged to make as widespread use of our software, including the creation of derivative works, as you see fit, so long as it is consistent with each item’s license. Our software is typically licensed under GPL-3, a copy-left open-source licensing regime.

3.3.2 - Citing readyforwhatsnext

If you find readyforwhatsnext useful, please cite it appropriately - it is easy to do!

To make it easier to cite our software, each software item bundle includes a CITATION.cff file. Inclusion of this file means that the repositories storing our software can generate appropriate citations in the format of most relevance to you.

Currently:

Zenodo provides a free text field under the heading “Cite as” which enables you to generate a wide range of citation manager and journal specific citation outputs. There is also an “Export” tool that will generate citation metadata in multiple output formats;
OpenAire Explore has a “Cite this software” button that allows you to generate a citation in multiple journal formats or to download BibTeX or RIS files; and
GitHub repositories have a “Cite this repository” button that can generate both BibTeX and APA output as well as link to the CITATION.cff file.

Additionally, we have included a CITATION file in each of our R libraries so that you can generate a citation from within an R session using the citation function (for example: citation("ready4")).
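If you use a reference manager, the output of the citation function can be converted to BibTeX. The sketch below uses a hypothetical wrapper, get_bibtex, and demonstrates it on the stats package (which ships with R) so that it runs without any readyforwhatsnext libraries installed; substitute the name of an installed module library such as "ready4".

```r
# Hypothetical wrapper: return a BibTeX entry for an installed package.
get_bibtex <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is not installed.", call. = FALSE)
  }
  utils::toBibtex(utils::citation(pkg))
}

# Demonstrated with a package bundled with R; replace with e.g. "ready4".
print(get_bibtex("stats"))
```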

3.3.3 - Disclaimer

readyforwhatsnext is distributed without any warranties.

All readyforwhatsnext model software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Furthermore, no readyforwhatsnext model software is yet sufficiently well documented and tested to be given a production release. All readyforwhatsnext model software should therefore be viewed as experimental development releases.

3.4 - The readyforwhatsnext model

readyforwhatsnext is an in-development modular economic model of the systems shaping the mental health of young people. It is comprised of four sub-models.

3.4.1 - Modules for modelling people

Modules to model the characteristics, relationships, behaviours, risk factors and outcomes of young people and individuals who interact with young people are collectively referred to as the “Spring To Life” sub-model. A table summarising Spring To Life module libraries for modelling people is available. Additional information (e.g. tutorials and blog articles) about currently available Spring To Life modules is labelled with the “model-modules-people” tag. Resources about Spring To Life datasets are tagged with “data-datasets-people”. Brief information about additional unreleased Spring To Life modules that are in development is also available.

3.4.2 - Modules for modelling places

Modules for spatio-temporal modelling of the environments that shape young people’s mental health are collectively referred to as the “Springtides” sub-model. Both of the currently available Springtides module libraries for modelling places are highly preliminary and lack tutorials to demonstrate their use. A deprecated app built using these libraries is available for illustration purposes. Resources relating to preliminary and unreleased modules for the Springtides model are tagged with the “model-modules-places” tag and those relating to compatible datasets are tagged with “data-datasets-places”. Brief information on unreleased work in progress module libraries is also available.

3.4.3 - Modules for modelling platforms

Modules that model the processes, eligibility requirements, staffing and configurations of youth service platforms are collectively referred to as the “First Bounce” sub-model. No First Bounce modules are yet available - see details on unreleased work in progress.

3.4.4 - Modules for modelling programs

Modules for modelling the efficacy, cost-effectiveness and budget impact of youth mental health programs (e.g. interventions for prevention, treatment and wellbeing) are collectively referred to as the “On Target” sub-model. There are currently two development releases of On Target module libraries for modelling programs but both are highly preliminary. Resources (including tutorials) relating to these module libraries are tagged with “model-modules-programs”.

3.5 - Modules pipeline

Unreleased software and other preliminary work is currently being developed into readyforwhatsnext model modules.

3.5.1 - Pipeline of people modules

Current unreleased work to develop modules for modelling the characteristics, relationships, behaviours, risk factors and outcomes of young people and those important to them.

Our current pipeline of modules for modelling people is principally focused on developing tools for:

creating synthetic household datasets from multiple longitudinal datasets of varying structure, including modules specifically designed to streamline wrangling data from the HILDA and LSAC datasets (both from Australia); and
implementing agent based model simulations.

A significant amount of work has already been completed on the first of these projects. Initial development releases for each project, along with one scientific manuscript, are planned for late 2024.

3.5.2 - Pipeline of places modules

Current unreleased work to develop modules for modelling the demographic, environmental and proximity drivers of access, equity and outcomes in youth mental health.

Our current pipeline of modules for modelling places (from the Springtides sub-model) will extend the libraries listed in the summary table of module libraries for modelling places to:

predict prevalence and incidence by area; and
provide a user-interface (i.e. software to implement an updated version of the currently deprecated Springtides app).

Although unreleased, the source code for the above projects was used to generate analyses during the early phase of the COVID-19 pandemic. Initial development releases of places module libraries, along with an updated app, are anticipated in the second half of 2024.

3.5.3 - Pipeline of platforms modules

Current unreleased work to develop modules for modelling the optimal staffing and configuration of support services for young people.

Our current pipeline of modules for modelling platforms includes code for implementing:

a discrete event simulation of primary mental health services for young people;
a simple cohort model of early psychosis services; and
a blended (systems dynamics / discrete event simulation) model for optimising eligibility and referral policies across multiple services.

The first two of the above models are currently implemented in R and are sufficiently advanced to produce exploratory analysis. However, neither is adequately documented or tested, and both need to be redeveloped as First Bounce sub-model modules and re-validated prior to development releases. The optimisation model was implemented in Java and was populated with toy data, so it will require more substantial development prior to public release.

3.5.4 - Pipeline of programs modules

Current preliminary work to develop modules for modelling the affordability, value for money and appropriate targeting of interventions for young people.

We have no current pipeline of new module libraries for modelling programs (i.e. for the On Target sub-model). The currently released On Target module libraries itemised in the summary table of module libraries for modelling programs are highly preliminary and are therefore our focus for future development in this area.

4 - Tutorials

Learn how to find and use readyforwhatsnext modules and datasets.

4.1 - Find model modules and datasets

How to find individual modules, module libraries, dataset collections and datasets.

4.1.1 - Finding specific modules and sub-modules

How to find individual readyforwhatsnext modules and sub-modules.

The section below renders a vignette article from the ready4 library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

library(ready4)

Motivation

When considering whether to use a model module, it is useful to first see tutorials about appropriate use of that module.

Implementation

A table itemising individual model modules and sub-modules authored with ready4 can be generated using make_modules_tb. This function scrapes relevant data from the websites of module libraries that have been developed within a specified project’s GitHub organisation.

Use

In this example, we are going to examine modules from the readyforwhatsnext model. The value supplied to the gh_repo_1L_chr argument specifies the repository in which a dataset of readyforwhatsnext module libraries is stored. Note, the following command may take a couple of minutes to execute.

modules_tb <- make_modules_tb(gh_repo_1L_chr = "ready4-dev/ready4")

A slightly quicker method to achieve a similar result is to use the get_modules_tb function. This function retrieves an archived (and therefore potentially less up to date) version of the modules summary table.

# Not run
# modules_tb <- get_modules_tb(gh_repo_1L_chr = "ready4-dev/ready4")
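The two retrieval options described above can be combined so that the quicker archived table is tried first and the slower scrape is used only if that fails. The following is a sketch of that pattern rather than a feature of the ready4 library: the helper with_fallback is hypothetical, and the calls to ready4 functions are left commented out because they require the ready4 library and internet access.

```r
# Hypothetical helper: try a primary retrieval function and, if it
# raises an error, report the problem and call a fallback instead.
with_fallback <- function(primary_fn, fallback_fn) {
  tryCatch(
    primary_fn(),
    error = function(e) {
      message("Primary source failed (", conditionMessage(e), "); using fallback.")
      fallback_fn()
    }
  )
}

# Intended use (requires the ready4 library and internet access):
# modules_tb <- with_fallback(
#   function() ready4::get_modules_tb(gh_repo_1L_chr = "ready4-dev/ready4"),
#   function() ready4::make_modules_tb(gh_repo_1L_chr = "ready4-dev/ready4")
# )

# Stand-in demonstration that runs offline.
result <- with_fallback(
  function() stop("archive unavailable"),
  function() "rebuilt table"
)
print(result)
```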

The modules_tb object itemises both model modules (which always use R’s “S4” class type) and sub-modules (“S3” class type).
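For readers unfamiliar with the two class types, the toy sketch below contrasts them using base R only. The class names ToyModule and toy_submodule are hypothetical and are not readyforwhatsnext modules; the point is simply that S4 classes are formally defined with typed slots, while S3 classes are ordinary objects with a class attribute.

```r
# Toy S4 class: a formal definition with a typed slot (hypothetical name).
methods::setClass("ToyModule", slots = c(label = "character"))
toy_module <- methods::new("ToyModule", label = "example module")

# Toy S3 object: an ordinary list with a class attribute attached.
toy_submodule <- structure(list(value = 1:3), class = "toy_submodule")

# isS4() distinguishes the two class types.
print(isS4(toy_module))
print(isS4(toy_submodule))
```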

To display an HTML summary of just model modules, you can use the print_modules function.

print_modules(modules_tb, what_1L_chr = "S4")

Class	Description	Examples
AusACT	Meta data for processing ACT population projections
AusHeadspace	Meta data for constructing Headspace Centre geometries
AusLookup	Lookup tables for Australian geometry and spatial attribute data
AusOrygen	Meta data for constructing OYH Specialist Mental Health Catchment geometries
AusProjections	Meta data for constructing custom Australian population projections boundary
AusTasmania	Meta data for processing Tasmanian population projections
CostlyCorrespondences	Collection of input, standards definition and results datasets for projects to generate standardised costing datasets	1, 2
CostlyCountries	Collection of input, standards definition and results datasets for projects to generate standardised country data for use in costing datasets	1, 2
CostlyCurrencies	Collection of input, standards definition and results datasets for projects to generate standardised currency data for use in costing datasets	2
CostlySeed	Original (non-standardised) dataset (and metadata)	1, 2
CostlySource	Input dataset (and metadata) for generating standardised costing datasets
CostlyStandards	Dataset (and metadata) defining the allowable values of specified variables	1, 2
ScorzAqol6	A dataset and metadata to support implementation of an AQoL-6D scoring algorithm
ScorzAqol6Adol	A dataset and metadata to support implementation of a scoring algorithm for the adolescent version of AQoL-6D	3
ScorzAqol6Adult	A dataset and metadata to support implementation of a scoring algorithm for the adult version of AQoL-6D
ScorzEuroQol5	A dataset and metadata to support implementation of an EQ-5D scoring algorithm	4
ScorzProfile	A dataset to be scored, its associated metadata and details of the scoring instrument
SpecificConverter	Container for seed objects used for creating SpecificModels modules	5
SpecificFixed	Modelling project dataset, input parameters and complete fixed models results
SpecificInitiator	Modelling project dataset, input parameters and empty results placeholder
SpecificMixed	Modelling project dataset, input parameters and complete mixed models results
SpecificModels	Modelling project dataset, input parameters and model comparison results
SpecificParameters	Input parameters that specify candidate models to be explored
SpecificPredictors	Modelling project dataset, input parameters and predictor comparison results
SpecificPrivate	Analysis outputs not intended for public dissemination
SpecificProject	Modelling project dataset, parameters and results
SpecificResults	Analysis results
SpecificShareable	Analysis outputs intended for public dissemination
SpecificSynopsis	Input, Output and Authorship Data For Generating Reports
TTUProject	Input And Output Data For Undertaking and Reporting Utility Mapping Studies	6
TTUReports	Metadata to produce utility mapping study reports
TTUSynopsis	Input, Output and Authorship Data For Generating Utility Mapping Study Reports
VicinityArguments	Function arguments for constructing a spatial object
VicinityLocal	Object defining data to be saved in local directory
VicinityLocalProcessed	Object defining data to be saved in local directory in a processed (R) format
VicinityLocalRaw	Object defining data to be saved in local directory in a raw (unprocessed) format
VicinityLookup	Look up tables for spatiotemporal data
VicinityMacro	Macro level context
VicinityMesoArea	Meso level context - area
VicinityMesoRegion	Meso level context - region
VicinityMicro	Micro level context
VicinityProfile	Information to create a profiled area object
VicinitySpaceTime	Spatiotemporal environment
YouthvarsDescriptives	Metadata about descriptive statistics to be generated
YouthvarsProfile	A dataset and its associated dictionary, descriptive statistics and metadata	8
YouthvarsSeries	A longitudinal dataset and its associated dictionary, descriptive statistics and metadata	8

You can use the same function to display only model sub-modules.

print_modules(modules_tb, what_1L_chr = "S3")

Class	Description	Examples
specific_models	Candidate models lookup table
specific_predictors	Candidate predictors lookup table
vicinity_abbreviations	ready4 submodule class for tibble object lookup table for spatial data abbreviations
vicinity_identifiers	ready4 submodule class for tibble object lookup table of unique feature identifiers used for different spatial objects
vicinity_mapes	ready4 submodule class for tibble object that stores spatial simulation parameters relating to Mean Absolute Prediction Errors
vicinity_parameters	ready4 submodule class for tibble object that stores simulation structural parameters relating to the spatial environment
vicinity_points	ready4 submodule class for tibble object lookup table of the longitude and latitude coordinates of sites of services / homes
vicinity_processed	ready4 submodule class for tibble object lookup table of meta-data for spatial data packs (imported and pre-processed data)
vicinity_raw	ready4 submodule class for tibble object lookup table of metadata about raw (un-processed) spatial data to import
vicinity_resolutions	ready4 submodule class for tibble object lookup table of the relative resolutions of different spatial objects
vicinity_templates	ready4 submodule class for tibble object lookup table for base file used in creation of certain spatial objects
vicinity_values	ready4 submodule class for tibble object that stores simulation parameter values for each iteration
youthvars_aqol6d_adol	youthvars ready4 sub-module (S3 class) for Assessment of Quality of Life Six Dimension Health Utility - Adolescent Version (AQoL6d Adolescent)	7
youthvars_bads	youthvars ready4 sub-module (S3 class) for Behavioural Activation for Depression Scale (BADS) scores	7
youthvars_chu9d_adolaus	youthvars ready4 sub-module (S3 class) for Child Health Utility Nine Dimension Health Utility - Australian Adolescent Scoring (CHU-9D Australian Adolescent)	7
youthvars_gad7	youthvars ready4 sub-module (S3 class) for Generalised Anxiety Disorder Scale (GAD-7) scores	7
youthvars_k10	youthvars ready4 sub-module (S3 class) for Kessler Psychological Distress Scale (K10) - US Scoring System scores	7
youthvars_k10_aus	youthvars ready4 sub-module (S3 class) for Kessler Psychological Distress Scale (K10) - Australian Scoring System scores	7
youthvars_k6	youthvars ready4 sub-module (S3 class) for Kessler Psychological Distress Scale (K6) - US Scoring System scores	7
youthvars_k6_aus	youthvars ready4 sub-module (S3 class) for Kessler Psychological Distress Scale (K6) - Australian Scoring System scores	7
youthvars_oasis	youthvars ready4 sub-module (S3 class) for Overall Anxiety Severity and Impairment Scale (OASIS) scores	7
youthvars_phq9	youthvars ready4 sub-module (S3 class) for Patient Health Questionnaire (PHQ-9) scores	7
youthvars_scared	youthvars ready4 sub-module (S3 class) for Screen for Child Anxiety Related Disorders (SCARED) scores	7
youthvars_sofas	youthvars ready4 sub-module (S3 class) for Social and Occupational Functioning Assessment Scale (SOFAS)	7

How to search for themed collections of modules is described in another article.

4.1.2 - Model module libraries

Bundles of readyforwhatsnext modules are distributed as R libraries.

readyforwhatsnext model modules are intended to be both transferable (they are tools that can be used in multiple decision contexts) and modular (they are comprised of self-contained components, each of which performs a narrow sub-set of tasks). For these reasons, readyforwhatsnext model modules are developed and distributed as libraries of modules.

The three types of readyforwhatsnext module libraries are:

- modules for describing and quality assuring model data;
- modules to specify, assess and report statistical models; and
- modules for making predictions.

A table summarising currently available readyforwhatsnext module libraries can be retrieved from an online repository by using the get_libraries_tb function from the ready4 framework library.

library(ready4)

libraries_tb <- get_libraries_tb() %>% update_libraries_tb(include_1L_chr = "modules") # make_libraries_tb("modules")

Module libraries are thematically grouped under one of four “sub-models” of readyforwhatsnext, one each for modelling People (the “Spring To Life” sub-model), Places (the “Springtides” sub-model), Platforms (the “First Bounce” sub-model) and Programs (the “On Target” sub-model). We can use the print_packages function to display the module libraries currently available for each section (currently, there are no publicly available libraries of readyforwhatsnext modules for modelling platforms).
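The grouping by section can be illustrated on a toy table before working with the real libraries_tb object. In the sketch below the data frame, its package names and their assignment to sections are all made up for illustration; only the Section column name and its values are taken from the filter calls used in this article.

```r
# Toy stand-in for libraries_tb: package names are placeholders.
toy_libraries_df <- data.frame(
  Package = c("pkg_a", "pkg_b", "pkg_c", "pkg_d"),
  Section = c("People", "People", "Places", "Programs"),
  stringsAsFactors = FALSE
)

# Split the table into one subset per sub-model section.
sections <- c("People", "Places", "Platforms", "Programs")
by_section <- lapply(stats::setNames(sections, sections), function(section) {
  toy_libraries_df[toy_libraries_df$Section == section, , drop = FALSE]
})

# Count libraries per section; Platforms has none in this toy table.
counts <- vapply(by_section, nrow, integer(1))
print(counts)
```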

Module libraries for modelling people

print_packages(libraries_tb %>% dplyr::filter(Section == "People"))

Purpose	Documentation	Code	Examples
Describe and Validate Youth Mental Health Dataset Variables	Citation , Website , Manual - Short (PDF) , Manual - Full (PDF)	Dev , Archive	12, 13
Score Multi-Attribute Utility Instruments	Citation , Website , Manual - Short (PDF) , Manual - Full (PDF)	Dev , Archive	14, 15
Model Youth Choice Behaviours	Citation , Website , Citation	Dev , Archive
Implement Transfer to Utility Mapping Algorithms	Citation , Website , Manual - Short (PDF) , Manual - Full (PDF)	Dev , Archive	16
Explore and Characterise Heterogeneity in Quality of Life Data	Citation , Website , Manual - Short (PDF) , Manual - Full (PDF)	Dev , Archive
Specify Models to Solve Inverse Problems	Citation , Website , Manual - Short (PDF) , Manual - Full (PDF)	Dev , Archive	17
Transform Youth Outcomes to Health Utility Predictions	Citation , Website , Manual - Short (PDF) , Manual - Full (PDF)	Dev , Archive	18

Module libraries for modelling places

print_packages(libraries_tb %>% dplyr::filter(Section == "Places"))

Type	Package	Purpose	Documentation	Code	Examples
		Model Australian Spatial Data	Citation , Website , Manual - Short (PDF) , Manual - Full (PDF)	Dev , Archive
		Model Spatial Features of Health Systems	Citation , Website , Manual - Short (PDF) , Manual - Full (PDF)	Dev , Archive

Module libraries for modelling programs

print_packages(libraries_tb %>% dplyr::filter(Section == "Programs"))

Type	Package	Purpose	Documentation	Code	Examples
		Undertake Health Economic Budget Impact Analysis.	Citation , Website , Manual - Short (PDF) , Manual - Full (PDF)	Dev , Archive
		Develop, Use and Share Unit Cost Datasets for Health Economic	Citation , Website , Manual - Short (PDF) , Manual - Full (PDF)	Dev , Archive	19, 20

4.1.3 - Find open access model data

Tools from the ready4 framework library can be used to search for relevant open access readyforwhatsnext model data collections and datasets.

library(ready4)

The make_datasets_tb function from the ready4 library can be used to create a summary table of the open access datasets we curate in our ready4 Dataverse Collection.

make_datasets_tb("ready4") -> x

One way to inspect this information is to group contents by Dataverse Collections using the print_data function.

print_data(x,
           by_dv_1L_lgl = TRUE) %>%
  kableExtra::scroll_box(width = "100%")

Dataverse	Name	Description	Creator	Datasets
TTU	Transfer to Utility	A collection of transfer to utility datasets developed with the ready4 open science framework.	Orygen	1, 2, 3
fakes	Fake Data For Instruction And Illustration	Fake data used to illustrate toolkits developed with the ready4 open science framework.	Orygen	4 , 5 , 6 , 7 , 8 , 9 , 10, 11
firstbounce	First Bounce	A ready4 framework model of platforms. Aims to identify opportunities to improve the efficiency and equity of mental health services.	Orygen
ready4fw	ready4 Framework	A collection of datasets that support implementation of the ready4 framework for open science computational models of mental health systems.	Orygen	12
readyforwhatsnext	readyforwhatsnext	Data collections for the readyforwhatsnext mental health systems model.	Orygen	13, 14
springtides	Springtides	A ready4 framework model of places. Synthesises geometry (boundary, coordinate) and spatial attribute (e.g. population counts, environmental characteristics, service identifier and model coefficients associated with areas) data.	Orygen	15
springtolife	Spring To Life	A ready4 framework model of people. Models the characteristics, behaviours, relationships and outcomes of groups of individuals relevant to policymakers and service planners aiming to improve population mental health.	Orygen	16

Alternatively, we can itemise individual Dataverse Datasets. When doing so, it makes sense to prepare separate views for toy datasets designed for instruction and real datasets appropriate for use in modelling.
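The idea of separate views for toy and real datasets can be sketched on a small stand-in table. This is an illustration only and makes an assumption about how the distinction could be drawn (by membership of the "fakes" Dataverse collection); it is not a description of how print_data is implemented. The titles and Dataverse values are copied from the tables in this article.

```r
# Toy stand-in for the datasets summary table.
toy_datasets_df <- data.frame(
  Title = c(
    "Transfer to AQoL-6D Utility Mapping Algorithms",
    "Synthetic (fake) youth mental health datasets and data dictionaries",
    "readyforwhatsnext posters"
  ),
  Dataverse = c("TTU", "fakes", "readyforwhatsnext"),
  stringsAsFactors = FALSE
)

# Assumption: toy datasets are those held in the "fakes" collection.
is_fake <- toy_datasets_df$Dataverse == "fakes"
fake_df <- toy_datasets_df[is_fake, , drop = FALSE]
real_df <- toy_datasets_df[!is_fake, , drop = FALSE]

print(fake_df$Title)
print(real_df$Title)
```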

Datasets appropriate for use in modelling projects can be returned by supplying the value “real” to the what_1L_chr argument of print_data.

print_data(x,
           what_1L_chr = "real") %>%
  kableExtra::scroll_box(width = "100%")

Title	Description	Dataverse
Transfer to AQoL-6D Utility Mapping Algorithms	Catalogues of models (and the programs that produced them) that can be used in conjunction with the youthu R package to predict AQoL-6D health utility (and thus, derive QALYs) from measures collected in youth mental health services.	TTU
Transfer to AQoL-6D From Measures Collected In Primary Youth Mental Health Services	This is a work in progress dataset to support the implementation and reporting of a study to map measures collected in Australian primary youth mental health services to AQoL-6D health utility.	TTU
Transfer to CHU-9D From Measures Collected In Primary Youth Mental Health Services	This is a work in progress dataset to support the implementation and reporting of a study to map measures collected in Australian primary youth mental health services to CHU-9D health utility	TTU
ready4 Framework Abbreviations and Definitions	This dataset contains resources that help ready4 Framework Developers adopt common standards and workflows.	ready4fw
readyforwhatsnext posters	A collection of poster summaries about the readyforwhatsnext project and its outputs.	readyforwhatsnext
Australian demographic input parameters for Springtides model	Geometry, spatial attribute and metadata inputs for the demographic module of the readyforwhatsnext model. The demographic module is a systems dynamics spatial simulation of area demographic characteristics. The current version of the model is quite rudimentary and is designed to be extended by other models developed with the ready4 open science mental health modelling tools.	readyforwhatsnext
Springtides reports for Local Government Areas in the North West of Melbourne	This dataset is a collection of reports generated by a development version of the Springtides Model Of Places. Each report summarises prevalence projections for a specified mental disorder / mental health condition for a Local Government Area that is wholly or partially within the catchment area of the Orygen youth mental health service in North West Melbourne. As these reports were generated by a development version of the Springtides Model, these projections should be regarded as exploratory.	springtides
Modelling the online helpseeking choice of socially anxious young people	Models to predict the online helpseeking choices of socially anxious young people in Australia and replication code and documentation to implement the discrete choice experiment that generated the models. All study outputs were created with the aid of the mychoice R package (https://ready4-dev.github.io/mychoice).	springtolife

To view toy datasets, instead supply the value “fakes”.

print_data(x,
           what_1L_chr = "fakes") %>%
  kableExtra::scroll_box(width = "100%")

Title	Description	Dataverse
TTU (Transfer to Utility) R package - AQoL-6D vignette output	This dataset has been generated from fake data as an instructional aid. It is not to be used to inform decision making.	fakes
TTU (Transfer to Utility) R package - EQ-5D vignette output	This dataset is provided as a teaching aid. It is the output of tools from the TTU R package, applied to a synthetic dataset (Fake Data) of psychological distress and psychological wellbeing. It is not to be used to support decision-making.	fakes
Synthetic (fake) youth mental health datasets and data dictionaries	The datasets in this collection are entirely fake. They were developed principally to demonstrate the workings of a number of utility scoring and mapping algorithms. However, they may be of more general use to others. In some limited cases, some of the included files could be used in exploratory simulation based analyses. However, you should read the metadata descriptors for each file to inform yourself of the validity and limitations of each fake dataset. To open the RDS format files included in this dataset, the R package ready4use needs to be installed (see ). It is also recommended that you install the youthvars package ( ) as this provides useful tools for inspecting and validating each dataset.	fakes
ready4use R package vignette output	This dataset is provided so that others can compare the output they generate when implementing vignette code with that generated by the authors.	fakes
Replication Data For Quality of Life Heterogeneity Analysis In A Clinical Youth Mental Health Sample	This dataset is provided so that others can apply and test the analysis algorithms we have developed. It includes synthetic (fake) data that was generated for the sole purpose of enabling users to rerun our analysis algorithm.	fakes
Specific R Package - AQoL-6D Vignette Output	This dataset is provided so that others can apply the algorithms we have developed, consistent with the principles of the ready4 open science framework for data synthesis and simulation in mental health.	fakes
Synthetic (fake) dataset for hypothetical replication of study mapping psychological distress and functioning measures to AQoL-6D health utility	This dataset is comprised of fake data that has been created to illustrate the potential transfer of a study algorithm for creating utility mapping models to new data. Outputs in this dataset are for instructional purposes only and should not be used to inform decision making.	fakes
Synthetic (fake) dataset for hypothetical replication of study mapping psychological distress and functioning measures to CHU-9D health utility	This dataset is comprised of fake data that has been created to illustrate the potential transfer of a study algorithm for creating CHU-9D utility mapping models to new data. Outputs in this dataset are for instructional purposes only and should not be used to inform decision making	fakes

4.2 - Use readyforwhatsnext model modules

How to use readyforwhatsnext model modules to model the people, places, platforms and programs that shape young people’s mental health.

4.2.1 - Add metadata to datasets of individual human records

Appending appropriate metadata to datasets of individual unit records can facilitate partial automation of some modelling tasks. This tutorial describes how a module from the youthvars R package can help you to add metadata to a youth mental health dataset so that it can be more readily used by other readyforwhatsnext modules.

The section below renders a vignette article from the youthvars library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

Note: This vignette is illustrated with fake data. The dataset explored in this example should not be used to inform decision-making.

library(ready4)
library(youthvars)

The youthvars package provides two ready4 framework modules, YouthvarsProfile and YouthvarsSeries, that form part of the readyforwhatsnext economic model of youth mental health. The ready4 modules in youthvars extend the Ready4useDyad module and can be used to help describe key structural properties of youth mental health datasets.

Ingest data

To start we ingest X, a Ready4useDyad (dataset and data dictionary pair) that we can download from a remote repository.

X <- ready4use::Ready4useRepos(dv_nm_1L_chr = "fakes",
                               dv_ds_nm_1L_chr = "https://doi.org/10.7910/DVN/W95KED",
                               dv_server_1L_chr = "dataverse.harvard.edu") %>%
  ingest(fls_to_ingest_chr = "ymh_clinical_dyad_r4",
         metadata_1L_lgl = FALSE)

Add metadata

If a dataset is cross-sectional or we wish to treat it as if it were (i.e., where data collection rounds are ignored) we can create Y, an instance of the YouthvarsProfile module, to add minimal metadata (the name of the unique identifier variable).

Y <- YouthvarsProfile(a_Ready4useDyad = X, id_var_nm_1L_chr = "fkClientID")

If the temporal dimension of the dataset is important, it may be preferable to instead transform X into a YouthvarsSeries module instance. YouthvarsSeries objects contain all of the fields of YouthvarsProfile objects, but also include additional fields that are specific to longitudinal datasets (e.g. timepoint_var_nm_1L_chr and timepoint_vals_chr, which respectively specify the data-collection timepoint variable name and values, and participation_var_1L_chr, which specifies the desired name of a yet to be created variable that will summarise the data-collection timepoints for which each unit record supplied data).

Z <- YouthvarsSeries(a_Ready4useDyad = X,
                     id_var_nm_1L_chr = "fkClientID",
                     participation_var_1L_chr = "participation",
                     timepoint_vals_chr = c("Baseline","Follow-up"),
                     timepoint_var_nm_1L_chr = "round")
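As a rough illustration of what a participation variable summarises, the base R sketch below derives one from a toy long-format dataset. It is not the youthvars implementation: the helper summarise_rounds, the toy values and the label format are all assumptions made for this example.

```r
# Toy long-format dataset: one row per client per data-collection round.
# Column names mirror those used above; the values are made up.
ds <- data.frame(
  fkClientID = c("a", "a", "b", "c", "c"),
  round = c("Baseline", "Follow-up", "Baseline", "Baseline", "Follow-up"),
  stringsAsFactors = FALSE
)
timepoint_vals <- c("Baseline", "Follow-up")

# Hypothetical helper: describe the rounds at which one client supplied data,
# listed in the order given by timepoint_vals.
summarise_rounds <- function(rounds) {
  paste(timepoint_vals[timepoint_vals %in% rounds], collapse = " and ")
}

# ave() applies the helper within each client and recycles the result
# across that client's rows.
ds$participation <- ave(ds$round, ds$fkClientID, FUN = summarise_rounds)
print(ds)
```

Each row now carries a label describing the full set of rounds its client took part in, which is the kind of information needed to compare cases with baseline-only data against cases with data at both timepoints.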

YouthvarsProfile methods

Inspect data

We can now specify the variables that we would like to prepare descriptive statistics for by using the renew method. The variables to be profiled are specified in the profile_chr argument. The number of decimal digits (default = 3) to report for numeric values in the summary tables can be specified with nbr_of_digits_1L_int.

Y <- renew(Y, nbr_of_digits_1L_int = 2L, profile_chr = c("d_age","d_sexual_ori_s","d_studying_working"))

We can now view the descriptive statistics we created in the previous step.

Y %>%
  exhibit(profile_idx_int = 1L, scroll_box_args_ls = list(width = "100%"))

Descriptive summary

		(N =	1711)
Age	Mean (SD)	17.64	(3.09)
	Median (Q1, Q3)	18.00	(15.00, 20.00)
	Min - Max	12.00	25.00
	Missing	0.00
Sexual orientation	Heterosexual	1178.00	(71.74%)
	Other	464.00	(28.26%)
	Missing	69.00
Education and employment status	Not studying or working	311.00	(18.75%)
	Studying and working	451.00	(27.19%)
	Studying only	572.00	(34.48%)
	Working only	325.00	(19.59%)
	Missing	52.00
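The statistics reported for a numeric variable in a table like this one can be approximated with base R. The sketch below uses a made-up age vector and a hypothetical fmt helper; it makes no claim about how youthvars computes or formats its tables.

```r
# Made-up ages, including one missing value.
age <- c(12, 15, 17, 18, 18, 20, 21, 25, NA)
nbr_of_digits <- 2L

# Hypothetical helper: fixed number of decimal digits.
fmt <- function(v) formatC(v, format = "f", digits = nbr_of_digits)

quartiles <- quantile(age, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)

summary_df <- data.frame(
  Statistic = c("Mean (SD)", "Median (Q1, Q3)", "Min - Max", "Missing"),
  Value = c(
    paste0(fmt(mean(age, na.rm = TRUE)), " (", fmt(sd(age, na.rm = TRUE)), ")"),
    paste0(fmt(quartiles[2]), " (", fmt(quartiles[1]), ", ", fmt(quartiles[3]), ")"),
    paste0(fmt(min(age, na.rm = TRUE)), " - ", fmt(max(age, na.rm = TRUE))),
    as.character(sum(is.na(age)))
  ),
  stringsAsFactors = FALSE
)
print(summary_df)
```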

We can also plot the distributions of selected variables in our dataset.

depict(Y, var_nms_chr = c("c_sofas"), labels_chr = c("SOFAS"))

SOFAS total scores

YouthvarsSeries methods

Validate data

To explore longitudinal data we need to first use the ratify method to ensure that Z has been appropriately configured for methods examining datasets reporting measures at two timepoints.

Z <- ratify(Z,
            type_1L_chr = "two_timepoints")

Inspect data

We can now specify the variables that we would like to prepare descriptive statistics for using the renew method. The variables to be profiled are specified in arguments beginning with “compare_”. Use compare_ptcpn_chr to compare variables based on whether cases reported data at one or both timepoints and compare_by_time_chr to compare the summary statistics of variables by timepoint, e.g. at baseline and follow-up. If you wish these comparisons to report p values, then use the compare_ptcpn_with_test_chr and compare_by_time_with_test_chr arguments.

Z <- renew(Z,
           compare_by_time_chr = c("d_age","d_sexual_ori_s","d_studying_working"),
           compare_by_time_with_test_chr = c("k6_total", "phq9_total", "bads_total"),
           compare_ptcpn_with_test_chr = c("k6_total", "phq9_total", "bads_total"))

The tables generated in the preceding step can be inspected using the exhibit method.

Z %>%
  exhibit(profile_idx_int = 1L,
          scroll_box_args_ls = list(width = "100%"))

Outcomes by data completeness
		Baseline only		Baseline and follow-up
		(N =	1068)	(N =	643)	p
Kessler Psychological Distress Scale (6 Dimension)	Mean (SD)	12.153	(5.409)	11.069	(5.778)	0.001
	Median (Q1, Q3)	12.000	(8.000, 16.000)	11.000	(7.000, 15.000)	0.001
	Min - Max	0.000	24.000	0.000	24.000	0.001
	Missing	0.000		3.000		0.001
Patient Health Questionnaire	Mean (SD)	12.632	(6.086)	11.194	(6.434)	0.000
	Median (Q1, Q3)	13.000	(8.000, 17.000)	11.000	(6.000, 16.000)	0.000
	Min - Max	0.000	27.000	0.000	27.000	0.000
	Missing	1.000		5.000		0.000
Behavioural Activation for Depression Scale	Mean (SD)	79.814	(26.478)	83.571	(25.809)	0.010
	Median (Q1, Q3)	79.000	(62.000, 95.250)	84.000	(66.000, 101.000)	0.010
	Min - Max	0.000	150.000	0.000	150.000	0.010
	Missing	1.000		10.000		0.010

Z %>%
  exhibit(profile_idx_int = 2L,
          scroll_box_args_ls = list(width = "100%"))

Outcomes by data collection round
		Baseline		Follow-up
		(N =	1068)	(N =	643)
Age	Mean (SD)	17.555	(3.090)	17.770	(3.091)
	Median (Q1, Q3)	17.000	(15.000, 20.000)	18.000	(16.000, 20.000)
	Min - Max	12.000	25.000	12.000	25.000
	Missing	0.000		0.000
Sexual orientation	Heterosexual	738.000	(71.860%)	440.000	(71.545%)
	Other	289.000	(28.140%)	175.000	(28.455%)
	Missing	41.000		28.000
Education and employment status	Not studying or working	159.000	(15.347%)	152.000	(24.398%)
	Studying and working	305.000	(29.440%)	146.000	(23.435%)
	Studying only	405.000	(39.093%)	167.000	(26.806%)
	Working only	167.000	(16.120%)	158.000	(25.361%)
	Missing	32.000		20.000

Z %>%
  exhibit(profile_idx_int = 3L,
          scroll_box_args_ls = list(width = "100%"))

Outcomes by data collection round (with p values)
		Baseline		Follow-up
		(N =	1068)	(N =	643)	p
Kessler Psychological Distress Scale (6 Dimension)	Mean (SD)	12.082	(5.603)	10.100	(5.665)	0.000
	Median (Q1, Q3)	12.000	(8.000, 16.000)	10.000	(6.000, 14.000)	0.000
	Min - Max	0.000	24.000	0.000	24.000	0.000
	Missing	1.000		2.000		0.000
Patient Health Questionnaire	Mean (SD)	12.646	(6.230)	9.736	(6.210)	0.000
	Median (Q1, Q3)	13.000	(8.000, 17.000)	10.000	(5.000, 14.000)	0.000
	Min - Max	0.000	27.000	0.000	27.000	0.000
	Missing	4.000		2.000		0.000
Behavioural Activation for Depression Scale	Mean (SD)	78.429	(25.608)	89.615	(25.205)	0.000
	Median (Q1, Q3)	78.000	(61.000, 95.000)	88.000	(73.000, 106.000)	0.000
	Min - Max	0.000	150.000	0.000	150.000	0.000
	Missing	7.000		4.000		0.000
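To make the p value columns more concrete, the sketch below compares a simulated total score between two timepoints with base R. It assumes a Welch two-sample t-test purely for illustration; this section does not state which test youthvars applies, and the simulated scores are not drawn from the dataset above.

```r
set.seed(1)

# Simulated K6-like total scores at two timepoints (made-up values).
k6_baseline <- round(rnorm(50, mean = 12, sd = 5))
k6_followup <- round(rnorm(40, mean = 10, sd = 5))

# Welch two-sample t-test (an assumption, see the note above).
test <- t.test(k6_baseline, k6_followup)

# Round the p value to three decimal digits for reporting.
p_value <- round(test$p.value, 3)
print(p_value)
```

Very small p values round to 0 at three decimal digits, which is how an entry such as 0.000 can arise in a table that uses this level of precision.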

The depict method can create plots, comparing numeric variables by timepoint.

depict(Z,
       type_1L_chr = "by_time",
       var_nms_chr = c("c_sofas"),
       label_fill_1L_chr = "Time",
       labels_chr = c("SOFAS"),
       y_label_1L_chr = "")

SOFAS total scores by data collection round

Share data

If and only if the dataset you are working with is appropriate for public dissemination (e.g. is synthetic data), you can use the following workflow for sharing it. We can share the dataset we created for this example using the share method, specifying the repository to which we wish to publish the dataset (and for which we have write permissions) in a Ready4useRepos object.

A <- ready4use::Ready4useRepos(gh_repo_1L_chr = "ready4-dev/youthvars", # Replace with your repository
                               gh_tag_1L_chr = "Documentation_0.0") # (need write permissions).
A <- share(A,
           obj_to_share_xx = Z,
           fl_nm_1L_chr = "ymh_YouthvarsSeries")

Z is now available for download as the file ymh_YouthvarsSeries.RDS from the “Documentation_0.0” release of the youthvars package.

4.2.2 - Validate variable total scores

Vector based classes can be used to help validate variable values. This tutorial describes how to do that with sub-module classes exported as part of the youthvars R package.

The section below renders a vignette article from the youthvars library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

library(ready4)
library(youthvars)

Variable classes and data integrity

The youthvars package includes a number of ready4 framework sub-module classes that form part of the readyforwhatsnext economic model of youth mental health. The primary use of youthvars sub-modules is to quality assure the variables used in model input and output datasets by:

facilitating automated data integrity checks that verify no impermissible values (e.g. utility scores greater than one) are present in source data, transformed data or results; and
implementing rules-based automated selection and application of appropriate methods for each dataset variable.
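The idea of a vector class that enforces permissible values can be sketched in base R as follows. This is not the youthvars implementation: new_bounded_score and the toy_utility class name are hypothetical, and the error wording only loosely imitates the messages shown in this tutorial.

```r
# Hypothetical constructor for a numeric vector class with enforced bounds.
new_bounded_score <- function(x, min_val, max_val, class_nm) {
  stopifnot(is.numeric(x))
  vals <- x[!is.na(x)] # missing values are allowed and ignored by the checks
  if (any(vals < min_val)) {
    stop("All non-missing values must be greater than or equal to ", min_val, ".",
         call. = FALSE)
  }
  if (any(vals > max_val)) {
    stop("All non-missing values must be less than or equal to ", max_val, ".",
         call. = FALSE)
  }
  structure(x, class = c(class_nm, class(x)))
}

# Valid values are returned with the new class attached.
u <- new_bounded_score(c(0.03, 0.2, 1), 0.03, 1, "toy_utility")
class(u)

# An impermissible value triggers an error, captured here as a message.
bad <- tryCatch(
  new_bounded_score(1.2, 0.03, 1, "toy_utility"),
  error = function(e) conditionMessage(e)
)
print(bad)
```

Because the check runs whenever an object is constructed, an out-of-range value is caught where it enters the workflow rather than surfacing later in results.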

Included sub-module classes

The initial set of sub-module classes included in the youthvars package are one class for Assessment of Quality of Life (Adolescent) health utility and one for each of the predictors used in the utility prediction algorithms included in the related youthu package.

Assessment of Quality of Life Six Dimension (Adolescent) Health Utility

The youthvars_aqol6d_adol class is defined for numeric vectors with a minimum value of 0.03 and maximum value of 1.0.

youthvars_aqol6d_adol(0.4)
#> [1] 0.4
#> attr(,"class")
#> [1] "youthvars_aqol6d_adol" "numeric"

youthvars_aqol6d_adol(c(0.03,0.2,1))
#> [1] 0.03 0.20 1.00
#> attr(,"class")
#> [1] "youthvars_aqol6d_adol" "numeric"

Non-numeric objects and values outside these ranges will produce errors.

youthvars_aqol6d_adol("0.5")
#> Error in make_new_youthvars_aqol6d_adol(x): is.numeric(x) is not TRUE

youthvars_aqol6d_adol(-0.1)
#> Error: All non-missing values in valid youthvars_aqol6d_adol object must be greater than or equal to 0.03.

youthvars_aqol6d_adol(1.2)
#> Error: All non-missing values in valid youthvars_aqol6d_adol object must be less than or equal to 1.

Child Health Utility Nine Dimension - Australian Adolescent Scoring

The youthvars_chu9d_adolaus class is defined for numeric vectors with a minimum value of -0.2118 and maximum value of 1.0.

youthvars_chu9d_adolaus(0.4)
#> [1] 0.4
#> attr(,"class")
#> [1] "youthvars_chu9d_adolaus" "numeric"

youthvars_chu9d_adolaus(c(0.03,0.2,1))
#> [1] 0.03 0.20 1.00
#> attr(,"class")
#> [1] "youthvars_chu9d_adolaus" "numeric"

Non-numeric objects and values outside these ranges will produce errors.

youthvars_chu9d_adolaus("0.5")
#> Error in make_new_youthvars_chu9d_adolaus(x): is.numeric(x) is not TRUE

youthvars_chu9d_adolaus(-0.3)
#> Error: All non-missing values in valid youthvars_chu9d_adolaus object must be greater than or equal to -0.2118.

youthvars_chu9d_adolaus(1.2)
#> Error: All non-missing values in valid youthvars_chu9d_adolaus object must be less than or equal to 1.

Behavioural Activation for Depression Scale (BADS)

The youthvars_bads class is defined for integer vectors with a minimum value of 0 and maximum value of 150.

youthvars_bads(143L)
#> [1] 143
#> attr(,"class")
#> [1] "youthvars_bads" "integer"

youthvars_bads(as.integer(c(1,15,150)))
#> [1]   1  15 150
#> attr(,"class")
#> [1] "youthvars_bads" "integer"

Non-integers and values outside these ranges will produce errors.

youthvars_bads(22.5)
#> Error in make_new_youthvars_bads(x): is.integer(x) is not TRUE

youthvars_bads(-1L)
#> Error: All non-missing values in valid youthvars_bads object must be greater than or equal to 0.

youthvars_bads(160L)
#> Error: All non-missing values in valid youthvars_bads object must be less than or equal to 150.

Generalised Anxiety Disorder Scale (GAD-7)

The youthvars_gad7 class is defined for integer vectors with a minimum value of 0 and a maximum value of 21.

youthvars_gad7(15L)
#> [1] 15
#> attr(,"class")
#> [1] "youthvars_gad7" "integer"

youthvars_gad7(as.integer(c(0,14,21)))
#> [1]  0 14 21
#> attr(,"class")
#> [1] "youthvars_gad7" "integer"

Non-integers and values outside these ranges will produce errors.

youthvars_gad7(14.6)
#> Error in make_new_youthvars_gad7(x): is.integer(x) is not TRUE

youthvars_gad7(-1L)
#> Error: All non-missing values in valid youthvars_gad7 object must be greater than or equal to 0.

youthvars_gad7(22L)
#> Error: All non-missing values in valid youthvars_gad7 object must be less than or equal to 21.

Kessler Psychological Distress Scale (K6) - Australian Scoring System

The youthvars_k6_aus class is defined for integer vectors with a minimum value of 6 and a maximum value of 30.

youthvars_k6_aus(21L)
#> [1] 21
#> attr(,"class")
#> [1] "youthvars_k6_aus" "integer"

youthvars_k6_aus(as.integer(c(6,13,25)))
#> [1]  6 13 25
#> attr(,"class")
#> [1] "youthvars_k6_aus" "integer"

Non-integers and values outside these ranges will produce errors.

youthvars_k6_aus(11.2)
#> Error in make_new_youthvars_k6_aus(x): is.integer(x) is not TRUE

youthvars_k6_aus(1L)
#> Error: All non-missing values in valid youthvars_k6_aus object must be greater than or equal to 6.

youthvars_k6_aus(31L)
#> Error: All non-missing values in valid youthvars_k6_aus object must be less than or equal to 30.

Kessler Psychological Distress Scale (K6) - US Scoring System

The youthvars_k6 class is defined for integer vectors with a minimum value of 0 and a maximum value of 24.

youthvars_k6(21L)
#> [1] 21
#> attr(,"class")
#> [1] "youthvars_k6" "integer"

youthvars_k6(as.integer(c(0,13,24)))
#> [1]  0 13 24
#> attr(,"class")
#> [1] "youthvars_k6" "integer"

Non-integers and values outside these ranges will produce errors.

youthvars_k6(11.2)
#> Error in make_new_youthvars_k6(x): is.integer(x) is not TRUE

youthvars_k6(-1L)
#> Error: All non-missing values in valid youthvars_k6 object must be greater than or equal to 0.

youthvars_k6(25L)
#> Error: All non-missing values in valid youthvars_k6 object must be less than or equal to 24.

Kessler Psychological Distress Scale (K10) - Australian Scoring System

The youthvars_k10_aus class is defined for integer vectors with a minimum value of 10 and a maximum value of 50.

youthvars_k10_aus(21L)
#> [1] 21
#> attr(,"class")
#> [1] "youthvars_k10_aus" "integer"

youthvars_k10_aus(as.integer(c(13,25,41)))
#> [1] 13 25 41
#> attr(,"class")
#> [1] "youthvars_k10_aus" "integer"

Non-integers and values outside these ranges will produce errors.

youthvars_k10_aus(11.2)
#> Error in make_new_youthvars_k10_aus(x): is.integer(x) is not TRUE

youthvars_k10_aus(9L)
#> Error: All non-missing values in valid youthvars_k10_aus object must be greater than or equal to 10.

youthvars_k10_aus(51L)
#> Error: All non-missing values in valid youthvars_k10_aus object must be less than or equal to 50.

Kessler Psychological Distress Scale (K10) - US Scoring System

The youthvars_k10 class is defined for integer vectors with a minimum value of 0 and a maximum value of 40.

youthvars_k10(21L)
#> [1] 21
#> attr(,"class")
#> [1] "youthvars_k10" "integer"

youthvars_k10(as.integer(c(0,13,34)))
#> [1]  0 13 34
#> attr(,"class")
#> [1] "youthvars_k10" "integer"

Non-integers and values outside these ranges will produce errors.

youthvars_k10(11.2)
#> Error in make_new_youthvars_k10(x): is.integer(x) is not TRUE

youthvars_k10(-1L)
#> Error: All non-missing values in valid youthvars_k10 object must be greater than or equal to 0.

youthvars_k10(41L)
#> Error: All non-missing values in valid youthvars_k10 object must be less than or equal to 40.
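The ranges above (6 to 30 versus 0 to 24 for the K6, and 10 to 50 versus 0 to 40 for the K10) are consistent with the two scoring systems differing only in whether each item is coded 1 to 5 or 0 to 4. Under that assumption, which you should verify against your own data source, totals can be converted by subtracting the number of items. The aus_to_us helper below is hypothetical and is not part of youthvars.

```r
# Hypothetical helper: convert totals from 1-5 item coding to 0-4 item coding.
# Assumes the two systems differ only by a one-point shift on every item.
aus_to_us <- function(total_int, n_items_int) {
  stopifnot(is.integer(total_int), is.integer(n_items_int))
  total_int - n_items_int
}

k6_us <- aus_to_us(c(6L, 13L, 30L), 6L)
k10_us <- aus_to_us(c(10L, 25L, 50L), 10L)

# Converted totals should fall within the ranges of the 0-4 coded classes.
stopifnot(all(k6_us >= 0L & k6_us <= 24L))
stopifnot(all(k10_us >= 0L & k10_us <= 40L))
```

Keeping the two scoring systems as separate classes means a total recorded under one system is rejected by the other whenever it falls outside the shared range, rather than being silently misread.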

Overall Anxiety Severity and Impairment Scale (OASIS)

The youthvars_oasis class is defined for integer vectors with a minimum value of 0 and a maximum value of 20.

youthvars_oasis(15L)
#> [1] 15
#> attr(,"class")
#> [1] "youthvars_oasis" "integer"

youthvars_oasis(as.integer(c(0,12,20)))
#> [1]  0 12 20
#> attr(,"class")
#> [1] "youthvars_oasis" "integer"

Non-integers and values outside these ranges will produce errors.

youthvars_oasis(14.2)
#> Error in make_new_youthvars_oasis(x): is.integer(x) is not TRUE

youthvars_oasis(-1L)
#> Error: All non-missing values in valid youthvars_oasis object must be greater than or equal to 0.

youthvars_oasis(21L)
#> Error: All non-missing values in valid youthvars_oasis object must be less than or equal to 20.

Patient Health Questionnaire (PHQ-9)

The youthvars_phq9 class is defined for integer vectors with a minimum value of 0 and a maximum value of 27.

youthvars_phq9(11L)
#> [1] 11
#> attr(,"class")
#> [1] "youthvars_phq9" "integer"

youthvars_phq9(as.integer(c(0,13,27)))
#> [1]  0 13 27
#> attr(,"class")
#> [1] "youthvars_phq9" "integer"

Non-integers and values outside these ranges will produce errors.

youthvars_phq9(15.2)
#> Error in make_new_youthvars_phq9(x): is.integer(x) is not TRUE

youthvars_phq9(-1L)
#> Error: All non-missing values in valid youthvars_phq9 object must be greater than or equal to 0.

youthvars_phq9(28L)
#> Error: All non-missing values in valid youthvars_phq9 object must be less than or equal to 27.

Screen for Child Anxiety Related Disorders (SCARED)

The youthvars_scared class is defined for integer vectors with a minimum value of 0 and a maximum value of 82.

youthvars_scared(77L)
#> [1] 77
#> attr(,"class")
#> [1] "youthvars_scared" "integer"

youthvars_scared(as.integer(c(0,42,82)))
#> [1]  0 42 82
#> attr(,"class")
#> [1] "youthvars_scared" "integer"

Non-integers and values outside these ranges will produce errors.

youthvars_scared(33.2)
#> Error in make_new_youthvars_scared(x): is.integer(x) is not TRUE

youthvars_scared(-1L)
#> Error: All non-missing values in valid youthvars_scared object must be greater than or equal to 0.

youthvars_scared(83L)
#> Error: All non-missing values in valid youthvars_scared object must be less than or equal to 82.

Social and Occupational Functioning Assessment Scale (SOFAS)

The youthvars_sofas class is defined for integer vectors with a minimum value of 0 and a maximum value of 100.

youthvars_sofas(44L)
#> [1] 44
#> attr(,"class")
#> [1] "youthvars_sofas" "integer"

youthvars_sofas(as.integer(c(0,23,89)))
#> [1]  0 23 89
#> attr(,"class")
#> [1] "youthvars_sofas" "integer"

Non-integers and values outside these ranges will produce errors.

youthvars_sofas(73.2)
#> Error in make_new_youthvars_sofas(x): is.integer(x) is not TRUE

youthvars_sofas(-1L)
#> Error: All non-missing values in valid youthvars_sofas object must be greater than or equal to 0.

youthvars_sofas(103L)
#> Error: All non-missing values in valid youthvars_sofas object must be less than or equal to 100.

4.2.3 - Standardise Variable Values With Fuzzy Logic And Correspondence Tables

Costing health economic datasets is an activity that can involve repeated use of lookup tables. This tutorial describes how a module from the costly R package can help you to use a combination of fuzzy logic and correspondence tables to standardise variable values and thus facilitate partial automation of costing algorithms.

The section below renders a vignette article from the costly library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

library(ready4)
library(ready4use)
library(costly)

In brief

The steps described and explained in this vignette can also be (more succinctly) accomplished with the following code.

X <- CostlyCountries() 
X <- renew(X, type_1L_chr = "default") 
X <- renew(X, "jw", type_1L_chr = "slot", what_1L_chr = "logic") 
X <- renew(X, TRUE, type_1L_chr = "slot", what_1L_chr = "force")
X <- ratify(X)

Create project

We begin by creating X, an instance of the CostlyCorrespondences module.

X <- CostlyCorrespondences()

Supply seed dataset

We next create a CostlySeed module instance that includes a dataset containing our variable of interest (in this case, countries). The dataset needs to be paired with a data dictionary using the Ready4useDyad module from the ready4use R library. You can supply a custom dataset (a tibble), dictionary (a ready4use_dictionary) and the concept represented by our variable of interest using a command of the following format.

# Not run
# A <- CostlySeed(Ready4useDyad_r4 = Ready4useDyad(ds_tb = tibble::tibble(), dictionary_r3 = ready4use_dictionary()), include_chr = c("Country"), label_1L_chr = "Country")

The add_default_country_seed function will perform the previous step using values that pair the world.cities dataset of the maps R library with an appropriate dictionary and specify countries as the concept we will be standardising.

A <- CostlySeed() %>% add_default_country_seed()

We can now inspect the first few records from our labelled seed dataset.

renewSlot(A, "Ready4useDyad_r4", type_1L_chr = "label") %>%
  exhibitSlot("Ready4useDyad_r4", display_1L_chr = "head", scroll_box_args_ls = list(width = "100%"))

Dataset
City name	Country name	Population size	Latitude coordinate	Longitude coordinate	Is the nation's capital city
'Abasan al-Jadidah	Palestine	5629	31.31	34.34	0
'Abasan al-Kabirah	Palestine	18999	31.32	34.35	0
'Abdul Hakim	Pakistan	47788	30.55	72.11	0
'Abdullah-as-Salam	Kuwait	21817	29.36	47.98	0
'Abud	Palestine	2456	32.03	35.07	0
'Abwein	Palestine	3434	32.03	35.20	0

We can also inspect the data dictionary contained in A.

exhibitSlot(A, "Ready4useDyad_r4", type_1L_chr = "dict", scroll_box_args_ls = list(width = "100%"))

Data Dictionary
Variable	Category	Description	Class
name	City	City name	character
country.etc	Country	Country name	character
pop	Population	Population size	integer
lat	Latitude	Latitude coordinate	numeric
long	Longitude	Longitude coordinate	numeric
capital	Capital	Is the nation's capital city	integer
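The relationship between a dataset and its dictionary can be illustrated with base R: labelling amounts to swapping each variable name for its dictionary description. The sketch below is only a conceptual approximation. The dictionary column names (var_nm, var_desc) are assumptions for this example and ready4use may structure its dictionaries differently.

```r
# Two records and three variables borrowed from the seed dataset shown above.
ds <- data.frame(
  name = c("'Abud", "'Abwein"),
  country.etc = c("Palestine", "Palestine"),
  pop = c(2456L, 3434L),
  stringsAsFactors = FALSE
)

# Toy dictionary; the column names are hypothetical.
dict <- data.frame(
  var_nm = c("name", "country.etc", "pop"),
  var_desc = c("City name", "Country name", "Population size"),
  stringsAsFactors = FALSE
)

# Replace each variable name with its description from the dictionary.
labelled <- ds
names(labelled) <- dict$var_desc[match(names(ds), dict$var_nm)]
print(labelled)
```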

We now specify the dictionary category that corresponds to the variable we wish to standardise (“Country”). We need to use the same category name to label the results objects that we generate in subsequent steps.

A@include_chr <- A@label_1L_chr <- "Country"

We now add A to X.

X <- renew(X, A, what_1L_chr = "seed")

Specify standards

We next must specify a dataset that includes the complete list of allowable variable values.

The workflow for this step is similar to that for supplying the seed dataset, except that instead of a CostlySeed module we use a CostlyStandards module.

# Not run
# Y <- CostlyStandards(Ready4useDyad_r4 = Ready4useDyad(ds_tb = tibble::tibble(), dictionary_r3 = ready4use_dictionary()))

In many cases using the ISO_3166_1 dataset from the ISOcodes library will be the optimal choice for the standardised form of country names. We can use the add_country_standards function to pair this dataset with its dictionary and create B, a CostlyStandards module instance.

B <- CostlyStandards(Ready4useDyad_r4 = Ready4useDyad() %>% add_country_standards())

We can inspect the first few cases of the labelled version of the dataset in B.

renewSlot(B, "Ready4useDyad_r4", type_1L_chr = "label") %>% 
  exhibitSlot("Ready4useDyad_r4", display_1L_chr = "head", scroll_box_args_ls = list(width = "100%"))

Dataset
Alpabetical country code (two letters)	Alpabetical country code (three letters)	Numeric country code	Country name	Country name (official)	Country name (common alternative)
AW	ABW	533	Aruba	NA	NA
AF	AFG	004	Afghanistan	Islamic Republic of Afghanistan	NA
AO	AGO	024	Angola	Republic of Angola	NA
AI	AIA	660	Anguilla	NA	NA
AX	ALA	248	Åland Islands	NA	NA
AL	ALB	008	Albania	Republic of Albania	NA

We can also inspect the data dictionary contained in B.

exhibitSlot(B, "Ready4useDyad_r4", type_1L_chr = "dict", scroll_box_args_ls = list(width = "100%"))

Data Dictionary
Variable	Category	Description	Class
Alpha_2	A2	Alpabetical country code (two letters)	character
Alpha_3	A3	Alpabetical country code (three letters)	character
Numeric	N	Numeric country code	character
Name	Country	Country name	character
Official_name	Official	Country name (official)	character
Common_name	Common	Country name (common alternative)	character

We can now specify both the concept (from the “Category” column of the data dictionary) that defines allowable values for our target variable and all concepts we plan to use for fuzzy logic matching (described below).

B@label_1L_chr <- "Country"
B@include_chr <- c("Country", "Official","Common","A3","A2")

We now add B to X.

X <- renew(X, B, what_1L_chr = "standards")

Compare variable of interest values from seed and standards datasets

To identify any disparities between the variable of interest in our seed and standards datasets we can use the ratify method. Supplying the value “identity” ensures that the output will differ from input only in the slot reserved for results.

X <- ratify(X, new_val_xx = "identity")

We can now identify the values from our seed dataset variable of interest that were not in our standard values.

X@results_ls$Country_Output_Validation$Invalid_Values

We can also identify standard values that were not present in the seed dataset variable of interest.

X@results_ls$Country_Output_Validation$Absent_Values
#>  [1] "Åland Islands"                                "Antarctica"                                   "Bolivia, Plurinational State of"              "Bonaire, Sint Eustatius and Saba"             "Bouvet Island"                               
#>  [6] "British Indian Ocean Territory"               "Brunei Darussalam"                            "Cabo Verde"                                   "Christmas Island"                             "Cocos (Keeling) Islands"                     
#> [11] "Congo, The Democratic Republic of the"        "Côte d'Ivoire"                                "Curaçao"                                      "Czechia"                                      "Eswatini"                                    
#> [16] "Falkland Islands (Malvinas)"                  "French Southern Territories"                  "Guernsey"                                     "Heard Island and McDonald Islands"            "Holy See (Vatican City State)"               
#> [21] "Hong Kong"                                    "Iran, Islamic Republic of"                    "Korea, Democratic People's Republic of"       "Korea, Republic of"                           "Lao People's Democratic Republic"            
#> [26] "Macao"                                        "Micronesia, Federated States of"              "Moldova, Republic of"                         "Palestine, State of"                          "Réunion"                                     
#> [31] "Russian Federation"                           "Saint Barthélemy"                             "Saint Helena, Ascension and Tristan da Cunha" "Saint Martin (French part)"                   "Saint Vincent and the Grenadines"            
#> [36] "Sint Maarten (Dutch part)"                    "South Georgia and the South Sandwich Islands" "Syrian Arab Republic"                         "Taiwan, Province of China"                    "Tanzania, United Republic of"                
#> [41] "Timor-Leste"                                  "Türkiye"                                      "Turks and Caicos Islands"                     "United Kingdom"                               "United States"                               
#> [46] "United States Minor Outlying Islands"         "Venezuela, Bolivarian Republic of"            "Viet Nam"                                     "Virgin Islands, British"                      "Virgin Islands, U.S."
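Conceptually, the two result vectors above are set differences between the seed values and the standard values. The base R sketch below illustrates this with a handful of made-up inputs; it is not the costly implementation of ratify.

```r
# Made-up seed values (as found in a source dataset) and allowable standard values.
seed_vals <- c("Australia", "Czech Republic", "Brunei", "Kuwait", "Kuwait")
standard_vals <- c("Australia", "Czechia", "Brunei Darussalam", "Kuwait", "Antarctica")

# Seed values that are not allowable standard values.
invalid_vals <- setdiff(unique(seed_vals), standard_vals)

# Standard values that never appear in the seed dataset.
absent_vals <- setdiff(standard_vals, unique(seed_vals))

print(invalid_vals)
print(absent_vals)
```

Some absent values are simply not represented in the seed data (Antarctica in this toy example), while others are the standard spelling of a value that appears in the seed data under a different name.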

Standardise variable values

We can explore the extent to which we can use fuzzy logic to reconcile some of these discrepancies. To identify the types of fuzzy logic algorithms we could use, run the following command to explore the relevant part of the documentation from the stringdist library.

# Not run
# help("stringdist-metrics", package=stringdist)

In this case, we have chosen the Jaro, or Jaro-Winkler distance method (“jw”).

X <- renew(X, "jw", type_1L_chr = "slot", what_1L_chr = "logic") 
X <- ratify(X, new_val_xx = NULL)

This method will replace every previously invalid seed dataset variable value with the best available match identified by the selected fuzzy logic algorithm.

X@results_ls$Country_Output_Validation$Invalid_Values
#> character(0)
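For readers unfamiliar with the metric selected above, the following is a minimal base R sketch of how a Jaro-Winkler distance can be computed. It is an illustration only: the helper name jw_dist_dbl is hypothetical, the prefix weight of 0.1 is an assumed conventional value, and the stringdist implementation may differ in its details.

```r
# Illustrative base R sketch of the Jaro-Winkler distance (0 = identical, 1 = nothing in common).
# jw_dist_dbl is a hypothetical helper, not part of stringdist, ready4 or costly.
jw_dist_dbl <- function(a, b, p = 0.1) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n_x <- length(x)
  n_y <- length(y)
  if (n_x == 0 || n_y == 0) return(as.numeric(n_x != n_y))
  # Characters only count as matching if they sit within this many positions of each other
  window_int <- max(floor(max(n_x, n_y) / 2) - 1, 0)
  x_hit <- rep(FALSE, n_x)
  y_hit <- rep(FALSE, n_y)
  for (i in seq_len(n_x)) {
    lo <- max(1, i - window_int)
    hi <- min(n_y, i + window_int)
    if (lo > hi) next
    for (j in lo:hi) {
      if (!y_hit[j] && x[i] == y[j]) {
        x_hit[i] <- TRUE
        y_hit[j] <- TRUE
        break
      }
    }
  }
  m <- sum(x_hit)
  if (m == 0) return(1)
  # Matched characters that appear in a different order are half transpositions
  transp <- sum(x[x_hit] != y[y_hit]) / 2
  jaro <- (m / n_x + m / n_y + (m - transp) / m) / 3
  # Winkler adjustment: reward a shared prefix of up to four characters
  max_prefix_int <- min(4, n_x, n_y)
  same_lgl <- x[seq_len(max_prefix_int)] == y[seq_len(max_prefix_int)]
  prefix_int <- if (all(same_lgl)) max_prefix_int else which(!same_lgl)[1] - 1
  1 - (jaro + prefix_int * p * (1 - jaro))
}

jw_dist_dbl("MARTHA", "MARHTA") # approximately 0.039
jw_dist_dbl("Czech Republic", "Czechia")
```

Smaller values indicate closer matches, so a result near 0 suggests two strings that differ only slightly.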

However, some of the replacements will be spurious, as can be seen by inspecting the record of the replacements made.

X@results_ls$Country_Output_Correspondences
#> # A tibble: 42 × 2
#>    old_nms_chr               new_nms_chr                          
#>    <chr>                     <chr>                                
#>  1 Azores                    Timor-Leste                          
#>  2 Bolivia                   Bolivia, Plurinational State of      
#>  3 British Virgin Islands    Virgin Islands, British              
#>  4 Brunei                    Brunei Darussalam                    
#>  5 Canary Islands            Åland Islands                        
#>  6 Cape Verde                Cabo Verde                           
#>  7 Congo Democratic Republic Congo, The Democratic Republic of the
#>  8 Czech Republic            Czechia                              
#>  9 East Timor                Eswatini                             
#> 10 Easter Island             Christmas Island                     
#> # ℹ 32 more rows
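Spurious replacements of this kind are a general property of nearest-match rules: the closest candidate is always returned, even when no candidate is a sensible substitute. The sketch below illustrates the idea in base R. It is not the algorithm used by ratify; it swaps in the edit distance from utils::adist for Jaro-Winkler, and the short list of standard values is a hypothetical subset chosen for illustration.

```r
# Illustrative only: nearest-match replacement with base R's edit distance
# (utils::adist, a Levenshtein-type metric) standing in for Jaro-Winkler.
standards_chr <- c("Czechia", "Portugal", "Timor-Leste", "Christmas Island")
invalid_chr <- c("Czech Republic", "Azores", "Easter Island")

# Rows are invalid values, columns are candidate standard values
dist_mat <- utils::adist(invalid_chr, standards_chr)

# which.min() always returns a candidate, even if every distance is large
best_int <- apply(dist_mat, 1, which.min)
matches_df <- data.frame(
  old_nms_chr = invalid_chr,
  new_nms_chr = standards_chr[best_int],
  distance_dbl = dist_mat[cbind(seq_along(invalid_chr), best_int)],
  stringsAsFactors = FALSE
)
matches_df
```

Inspecting the distance column alongside each proposed replacement is one way to spot matches that deserve manual review.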

For each of the incorrect correspondences, we will need to manually specify correct values. We can do this using the ready4show_correspondences sub-module.

# Not run
# a <- ready4show::renew.ready4show_correspondences(ready4show::ready4show_correspondences(), 
#         old_nms_chr = c("old_name_1", "old_name_2", "etc...."), new_nms_chr = c("new_name_1", "new_name_2", "etc...."))

The make_country_correspondences function can be used as a shortcut for creating the alternative correspondences for this specific example.

a <- make_country_correspondences("cities")

We can inspect the values of this correspondence table.

exhibit(a, scroll_box_args_ls = list(width = "100%"))

Old name	New name
Azores	Portugal
Canary Islands	Spain
Easter Island	Chile
East Timor	Timor-Leste
Ivory Coast	Côte d'Ivoire
Kosovo	Kosovo
Madeira	Portugal
Netherlands Antilles	Bonaire, Sint Eustatius and Saba
Sicily	Italy
Vatican City	Holy See (Vatican City State)
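Conceptually, a correspondence table is a lookup from old names to new names. The base R sketch below shows that idea using three rows from the table above; it is a simplified illustration with hypothetical object names, not the ready4show implementation.

```r
# Hypothetical correspondence table: one row per manual correction
corr_df <- data.frame(
  old_nms_chr = c("Azores", "Canary Islands", "East Timor"),
  new_nms_chr = c("Portugal", "Spain", "Timor-Leste"),
  stringsAsFactors = FALSE
)

countries_chr <- c("Azores", "France", "East Timor")

# Position of each value in the old names column (NA when there is no entry)
idx_int <- match(countries_chr, corr_df$old_nms_chr)

# Replace values that have an entry and leave all other values unchanged
recoded_chr <- ifelse(is.na(idx_int), countries_chr, corr_df$new_nms_chr[idx_int])
recoded_chr
#> [1] "Portugal"    "France"      "Timor-Leste"
```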

When the ratify method was used to apply the fuzzy logic algorithm in a previous step, X was modified so that this logic is by default switched off for future calls to ratify. If we had created a new correspondence table that specified replacements for all invalid values, this would not be a problem. However, in this example we are only specifying correspondences where the fuzzy logic algorithm failed, so we need to again supply our desired fuzzy logic value.

X <- renew(X, "jw", type_1L_chr = "slot", what_1L_chr = "logic")

We now rerun our ratify method (which in this example will combine fuzzy logic with lookups from the manually created correspondences table).

X <- ratify(X, new_val_xx = a)

We once again inspect results.

Our correspondences table looks better.

X@results_ls$Country_Output_Correspondences
#> # A tibble: 42 × 2
#>    old_nms_chr               new_nms_chr                          
#>    <chr>                     <chr>                                
#>  1 Azores                    Portugal                             
#>  2 Bolivia                   Bolivia, Plurinational State of      
#>  3 British Virgin Islands    Virgin Islands, British              
#>  4 Brunei                    Brunei Darussalam                    
#>  5 Canary Islands            Spain                                
#>  6 Cape Verde                Cabo Verde                           
#>  7 Congo Democratic Republic Congo, The Democratic Republic of the
#>  8 Czech Republic            Czechia                              
#>  9 East Timor                Timor-Leste                          
#> 10 Easter Island             Chile                                
#> # ℹ 32 more rows

There is still a value that is not included in our standards.

X@results_ls$Country_Output_Validation$Invalid_Values
#> [1] "Kosovo"

We can rerun the ratify method to force the removal of any record that is not included in our standards dataset.

X <- renew(X, TRUE, type_1L_chr = "slot", what_1L_chr = "force")
X <- ratify(X, new_val_xx = "identity")

No invalid values remain.

X@results_ls$Country_Output_Validation$Invalid_Values
#> character(0)
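The effect of forcing removal can be pictured as a simple row filter. The following base R sketch assumes a small hypothetical dataset (the Pristina row and the object names are invented for illustration) and is not a description of how ratify is implemented.

```r
# Hypothetical standards and seed data for illustration
standards_chr <- c("Portugal", "Spain", "Italy")
cities_df <- data.frame(
  name = c("Lisbon", "Madrid", "Pristina"),
  country = c("Portugal", "Spain", "Kosovo"),
  stringsAsFactors = FALSE
)

# Keep only records whose country value appears in the standards
keep_lgl <- cities_df$country %in% standards_chr
forced_df <- cities_df[keep_lgl, , drop = FALSE]

# Record which values were dropped so that the removal can be audited
dropped_chr <- unique(cities_df$country[!keep_lgl])
dropped_chr
#> [1] "Kosovo"
```

Because forced removal discards records, it is worth checking which values were dropped before relying on the filtered dataset.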

However, there are also some values from our standards dataset that are not represented in the results dataset values.

X@results_ls$Country_Output_Validation$Absent_Values
#>  [1] "Åland Islands"                                "Antarctica"                                   "Bouvet Island"                                "British Indian Ocean Territory"               "Christmas Island"                            
#>  [6] "Cocos (Keeling) Islands"                      "Curaçao"                                      "French Southern Territories"                  "Heard Island and McDonald Islands"            "Hong Kong"                                   
#> [11] "Macao"                                        "Sint Maarten (Dutch part)"                    "South Georgia and the South Sandwich Islands" "United States Minor Outlying Islands"
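The distinction between invalid and absent values amounts to a set difference taken in each direction. A minimal base R sketch, using hypothetical object names and a handful of values from this example:

```r
# Values observed in the dataset being standardised (duplicates are allowed)
seed_chr <- c("Portugal", "Spain", "Kosovo", "Spain")
# Values permitted by the standards dataset
standards_chr <- c("Portugal", "Spain", "Antarctica", "Hong Kong")

# Invalid: present in the seed values but not permitted by the standards
invalid_chr <- setdiff(seed_chr, standards_chr)
invalid_chr
#> [1] "Kosovo"

# Absent: permitted by the standards but never observed in the seed values
absent_chr <- setdiff(standards_chr, seed_chr)
absent_chr
#> [1] "Antarctica" "Hong Kong"
```

In this sketch, absent_chr plays the role of the Absent_Values element shown above.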

Whether this is a problem or not depends on the intended purposes of the standardised dataset we are creating. We could choose to rerun the previous steps after making edits to either or both of the standards dataset (e.g. we could delete any superfluous, outdated or incorrect records or use an entirely new standards dataset) and the seed dataset (e.g. adding new records or recategorising existing records so that there are corresponding values for every missing standard value). In this case we are going to assume that the above missing values are not a cause for concern for the valid use of our updated dataset for its intended purposes. We can now create a new object Y, using our results dataset’s Ready4useDyad module instance.

Y <- X@results_ls$Country_Output_Lookup

We can inspect the records for cases corresponding to capital cities from our new dataset.

renewSlot(Y,"ds_tb",Y@ds_tb %>% dplyr::filter(capital==1)) %>%
  renew(type_1L_chr = "label") %>%
  exhibit(scroll_box_args_ls = list(width = "100%"))

Dataset
City name	Country name	Population size	Latitude coordinate	Longitude coordinate	Is the nation's capital city
'Amman	Jordan	1303197	31.95	35.93	1
Abu Dhabi	United Arab Emirates	619316	24.48	54.37	1
Abuja	Nigeria	178462	9.18	7.17	1
Accra	Ghana	2029143	5.56	-0.20	1
Adamstown	Pitcairn	51	-25.05	-130.10	1
Addis Abeba	Ethiopia	2823167	9.03	38.74	1
Agana	Guam	1041	13.47	144.75	1
Algiers	Algeria	2029936	36.77	3.04	1
Alofi	Niue	627	-19.05	-169.92	1
Amsterdam	Netherlands	744159	52.37	4.89	1
Andorra la Vella	Andorra	20314	42.51	1.51	1
Ankara	Türkiye	3579706	39.93	32.85	1
Antananarivo	Madagascar	1463754	-18.89	47.51	1
Apia	Samoa	40805	-13.83	-171.76	1
Asgabat	Turkmenistan	823013	37.95	58.38	1
Asmara	Eritrea	578860	15.33	38.94	1
Astana	Kazakhstan	351343	51.17	71.47	1
Asuncion	Paraguay	507574	-25.30	-57.63	1
Athens	Greece	725049	37.98	23.73	1
Avarua	Cook Islands	13645	-21.20	-159.76	1
Baghdad	Iraq	5753612	33.33	44.44	1
Bairiki	Kiribati	45982	1.33	172.99	1
Baku	Azerbaijan	1118725	40.39	49.86	1
Bamako	Mali	1342519	12.65	-7.99	1
Bandar Seri Begawan	Brunei Darussalam	67077	4.93	114.95	1
Bangkok	Thailand	4935988	13.73	100.50	1
Bangui	Central African Republic	547668	4.36	18.56	1
Banjul	Gambia	34388	13.46	-16.60	1
Basse-Terre	Guadeloupe	11298	16.00	-61.72	1
Basseterre	Saint Kitts and Nevis	12883	17.31	-62.73	1
Bayrut	Lebanon	1273440	33.88	35.50	1
Beijing	China	7602069	39.93	116.40	1
Belgrade	Serbia	1113589	44.83	20.50	1
Belmopan	Belize	14590	17.25	-88.79	1
Berlin	Germany	3378275	52.52	13.38	1
Bern	Switzerland	120596	46.95	7.44	1
Biskek	Kyrgyzstan	915625	42.87	74.57	1
Bissau	Guinea-Bissau	404119	11.87	-15.60	1
Bogota	Colombia	7235084	4.63	-74.09	1
Brasilia	Brazil	2260541	-15.78	-47.91	1
Bratislava	Slovakia	422452	48.16	17.13	1
Brazzaville	Congo	1326975	-4.25	15.26	1
Bridgetown	Barbados	98725	13.11	-59.61	1
Brussels	Belgium	1031925	50.83	4.33	1
Bucharest	Romania	1862930	44.44	26.10	1
Budapest	Hungary	1700019	47.51	19.08	1
Buenos Aires	Argentina	11595183	-34.61	-58.37	1
Bujumbura	Burundi	336561	-3.37	29.35	1
Cairo	Egypt	7836243	30.06	31.25	1
Canberra	Australia	324736	-35.31	149.13	1
Caracas	Venezuela, Bolivarian Republic of	1808937	10.54	-66.93	1
Castries	Saint Lucia	12904	14.03	-60.98	1
Cayenne	French Guiana	62926	4.92	-52.34	1
Charlotte Amalie	Virgin Islands, U.S.	10415	18.35	-64.94	1
Chisinau	Moldova, Republic of	623671	47.03	28.83	1
Cockburn Town	Turks and Caicos Islands	174	21.46	-71.14	1
Colombo	Sri Lanka	649496	6.93	79.85	1
Conakry	Guinea	1970382	9.55	-13.67	1
Copenhagen	Denmark	1091978	55.68	12.57	1
Dakar	Senegal	2406598	14.72	-17.48	1
Damascus	Syrian Arab Republic	1580909	33.50	36.32	1
Dhaka	Bangladesh	6724976	23.70	90.39	1
Dili	Timor-Leste	163305	-8.57	125.58	1
Dodoma	Tanzania, United Republic of	188150	-6.17	35.74	1
Doha	Qatar	351381	25.30	51.51	1
Douglas	Isle of Man	25621	54.15	-4.48	1
Dublin	Ireland	1030431	53.33	-6.25	1
Dushanbe	Tajikistan	538456	38.57	68.78	1
Dzaoudzi	Mayotte	14558	-12.77	45.25	1
Fakaofo	Tokelau	267	-9.38	-171.22	1
Fort-de-France	Martinique	89233	14.60	-61.08	1
Freetown	Sierra Leone	818709	8.49	-13.24	1
Gaborone	Botswana	214412	-24.65	25.91	1
George Town	Cayman Islands	30570	19.28	-81.39	1
Georgetown	Guyana	236878	6.79	-58.16	1
Gibraltar	Gibraltar	26404	36.14	-5.35	1
Guatemala	Guatemala	1010253	14.63	-90.55	1
Ha Noi	Viet Nam	1452055	21.03	105.84	1
Hamilton	Bermuda	889	32.30	-64.79	1
Harare	Zimbabwe	1575127	-17.82	31.05	1
Havanna	Cuba	2163132	23.13	-82.39	1
Helsinki	Finland	558341	60.17	24.94	1
Honiara	Solomon Islands	57410	-9.43	159.91	1
Islamabad	Pakistan	794431	33.72	73.06	1
Jakarta	Indonesia	8556798	-6.18	106.83	1
Jamestown	Saint Helena, Ascension and Tristan da Cunha	603	-15.92	-5.71	1
Jerusalem	Israel	731731	31.78	35.22	1
Jibuti	Djibouti	633884	11.56	43.15	1
Kabul	Afghanistan	3120963	34.53	69.17	1
Kampala	Uganda	1403619	0.32	32.58	1
Kathmandu	Nepal	822930	27.71	85.31	1
Khartoum	Sudan	2090001	15.58	32.52	1
Kiev	Ukraine	2491404	50.43	30.52	1
Kigali	Rwanda	800003	-1.94	30.06	1
Kingston	Jamaica	585300	17.99	-76.80	1
Kingston	Norfolk Island	890	-29.03	168.05	1
Kingstown	Saint Vincent and the Grenadines	18160	13.16	-61.23	1
Kinshasa	Congo, The Democratic Republic of the	8096254	-4.31	15.32	1
Koror	Palau	11458	7.35	134.51	1
Kuala Lumpur	Malaysia	1482359	3.16	101.71	1
Libreville	Gabon	591356	0.39	9.45	1
Lilongwe	Malawi	683477	-13.97	33.80	1
Lima	Peru	7857121	-12.07	-77.05	1
Lisbon	Portugal	508209	38.72	-9.14	1
Ljubljana	Slovenia	254188	46.06	14.51	1
Lome	Togo	737751	6.17	1.35	1
London	United Kingdom	7489022	51.52	-0.10	1
Longyearbyen	Svalbard and Jan Mayen	1263	78.21	15.61	1
Luanda	Angola	2875277	-8.82	13.24	1
Lusaka	Zambia	1306577	-15.42	28.29	1
Luxemburg	Luxembourg	76380	49.62	6.12	1
Madrid	Spain	3146804	40.42	-3.71	1
Malabo	Equatorial Guinea	161409	3.74	8.79	1
Male	Maldives	87154	4.17	73.50	1
Managua	Nicaragua	990417	12.15	-86.27	1
Manama	Bahrain	147894	26.21	50.58	1
Manila	Philippines	10546511	14.62	120.97	1
Maputo	Mozambique	1220167	-25.95	32.57	1
Maseru	Lesotho	116268	-29.31	27.49	1
Mata'utu	Wallis and Futuna	1310	-13.28	-176.13	1
Mbabane	Eswatini	78740	-26.32	31.14	1
Mexico City	Mexico	8659409	19.43	-99.14	1
Minsk	Belarus	1747482	53.91	27.55	1
Mogadishu	Somalia	2723378	2.05	45.33	1
Monaco-Ville	Monaco	975	43.74	7.42	1
Monrovia	Liberia	954458	6.31	-10.80	1
Montevideo	Uruguay	1271664	-34.87	-56.17	1
Moroni	Comoros	43704	-11.74	43.23	1
Moscow	Russian Federation	10472629	55.75	37.62	1
Muscat	Oman	24122	23.61	58.54	1
N'Djamena	Chad	737281	12.11	15.05	1
Nairobi	Kenya	2864667	-1.29	36.82	1
Nassau	Bahamas	231519	25.06	-77.33	1
Ni Dilli	India	321883	28.60	77.22	1
Niamey	Niger	801297	13.52	2.12	1
Nicosia	Cyprus	202488	35.16	33.38	1
Nicosia	Cyprus	42372	35.18	33.37	1
Nouakchott	Mauritania	731242	18.09	-15.98	1
Noumea	New Caledonia	94751	-22.27	166.44	1
Nuku'alofa	Tonga	23733	-21.14	-175.22	1
Nuuk	Greenland	15243	64.18	-51.73	1
Oranjestad	Aruba	30710	12.53	-70.03	1
Oslo	Norway	821445	59.91	10.75	1
Ottawa	Canada	885542	45.42	-75.71	1
Ouagadougou	Burkina Faso	1119775	12.37	-1.53	1
Pago Pago	American Samoa	4180	-14.24	-170.72	1
Palikir	Micronesia, Federated States of	4552	6.92	158.16	1
Panama	Panama	406070	8.97	-79.53	1
Papeete	French Polynesia	26400	-17.52	-149.56	1
Paramaribo	Suriname	224925	5.85	-55.20	1
Paris	France	2141839	48.86	2.34	1
Phnum Penh	Cambodia	1673131	11.57	104.92	1
Port Louis	Mauritius	156760	-20.17	57.51	1
Port Moresby	Papua New Guinea	289861	-9.48	147.18	1
Port Stanley	Falkland Islands (Malvinas)	2269	-51.70	-57.82	1
Port of Spain	Trinidad and Tobago	49764	10.66	-61.51	1
Port-au-Prince	Haiti	1277104	18.54	-72.34	1
Porto Novo	Benin	238199	6.48	2.63	1
Prague	Czechia	1168374	50.08	14.43	1
Praia	Cabo Verde	117342	14.93	-23.54	1
Pretoria	South Africa	1687779	-25.73	28.22	1
Pyongyang	Korea, Democratic People's Republic of	2992272	39.02	125.75	1
Quito	Ecuador	1399814	-0.19	-78.50	1
Rabat	Morocco	1688738	34.02	-6.84	1
Rangoon	Myanmar	4572948	16.79	96.15	1
Reykjavik	Iceland	114576	64.14	-21.92	1
Riga	Latvia	738386	56.97	24.13	1
Rita	Marshall Islands	21270	7.12	171.06	1
Riyadh	Saudi Arabia	4328067	24.65	46.77	1
Road Town	Virgin Islands, British	8613	18.43	-64.63	1
Rome	Italy	2561181	41.89	12.50	1
Roseau	Dominica	16577	15.30	-61.39	1
Saint George's	Grenada	4315	12.06	-61.74	1
Saint Helier	Jersey	28910	49.19	-2.11	1
Saint John's	Antigua and Barbuda	25321	17.11	-61.85	1
Saint Peter Port	Guernsey	16702	49.47	-2.55	1
Saint-Denis	Réunion	137787	-20.87	55.46	1
Saint-Pierre	Saint Pierre and Miquelon	6254	46.79	-56.18	1
San Jose	Costa Rica	32187	10.97	-85.13	1
San Jose	Costa Rica	339588	9.93	-84.08	1
San Juan	Puerto Rico	417154	18.44	-66.13	1
San Marino	San Marino	4624	43.94	12.43	1
San Salvador	El Salvador	534409	13.69	-89.19	1
San'a	Yemen	1921589	15.38	44.21	1
Santiago	Chile	4893495	-33.46	-70.64	1
Santo Domingo	Dominican Republic	2253437	18.48	-69.91	1
Sao Tome	Sao Tome and Principe	63772	0.37	6.73	1
Sarajevo	Bosnia and Herzegovina	737350	43.85	18.38	1
Singapore	Singapore	3601745	1.30	103.85	1
Skopje	North Macedonia	477493	42.00	21.47	1
Sofia	Bulgaria	1166143	42.69	23.31	1
Seoul	Korea, Republic of	10409345	37.56	126.99	1
Stockholm	Sweden	1260712	59.33	18.07	1
Sucre	Bolivia, Plurinational State of	232669	-19.06	-65.26	1
Susupe	Northern Mariana Islands	2402	15.14	145.70	1
Taipei	Taiwan, Province of China	2491662	25.02	121.45	1
Tallinn	Estonia	392386	59.44	24.74	1
Tashkent	Uzbekistan	1967879	41.31	69.30	1
Tbilisi	Georgia	1038343	41.72	44.79	1
Tegucigalpa	Honduras	872403	14.09	-87.22	1
Tehran	Iran, Islamic Republic of	7160094	35.67	51.43	1
The Valley	Anguilla	1435	18.22	-63.05	1
Thimphu	Bhutan	74175	27.48	89.70	1
Tirana	Albania	380403	41.33	19.82	1
Tokyo	Japan	8372440	35.67	139.77	1
Torshavn	Faroe Islands	13313	62.03	-6.80	1
Tripoli	Libya	1164634	32.87	13.18	1
Tunis	Tunisia	693294	36.84	10.22	1
Ulaanbaatar	Mongolia	862842	47.93	106.91	1
Vaduz	Liechtenstein	5248	47.14	9.53	1
Vaiaku	Tuvalu	4835	-8.52	179.20	1
Valletta	Malta	6748	35.91	14.52	1
Vatican City	Holy See (Vatican City State)	767	41.90	12.46	1
Victoria	Seychelles	22611	-4.62	55.45	1
Vienna	Austria	1570976	48.22	16.37	1
Vientiane	Lao People's Democratic Republic	199863	17.97	102.61	1
Vila	Vanuatu	37141	-17.74	168.31	1
Vilnius	Lithuania	542014	54.70	25.27	1
Warsaw	Poland	1634441	52.26	21.02	1
Washington	United States	548359	38.91	-77.02	1
Wellington	New Zealand	182254	-41.28	174.78	1
Willemstad	Bonaire, Sint Eustatius and Saba	98339	12.10	-68.93	1
Windhoek	Namibia	277349	-22.56	17.09	1
Yamoussoukro	Côte d'Ivoire	200103	6.82	-5.28	1
Yaounde	Cameroon	1344617	3.87	11.52	1
Yaren	Nauru	4587	-0.55	166.91	1
Yerevan	Armenia	1090537	40.17	44.52	1
Zagreb	Croatia	700717	45.80	15.97	1
al-'Ayun	Western Sahara	188084	27.16	-13.20	1
al-Kuwayt	Kuwait	63596	29.38	47.99	1

4.2.4 - Standardise Variable Values With Lookup Codes

This tutorial describes how a module from the costly R package can help you to use lookup codes to standardise variable values and thus facilitate partial automation of costing algorithms.

The section below renders a vignette article from the costly library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

library(ready4)
library(ready4use)
library(costly)

Note. Parts of the workflow described in this article are common to steps explained in more detail in the article outlining the workflow using fuzzy logic and correspondence tables.

In brief

The steps described and explained in this vignette can also be (more succinctly) accomplished with the following code.

X <- CostlyCountries()
X <- renew(X,
           new_val_xx = add_default_currency_seed(X@CostlySeed_r4, include_1L_chr = "Country"), 
           what_1L_chr = "seed")
X <- renew(X, "jw", type_1L_chr = "slot", what_1L_chr = "logic") 
X <- renew(X, new_val_xx = make_country_correspondences("currencies"), what_1L_chr = "correspondences") 
X <- renew(X, TRUE, type_1L_chr = "slot", what_1L_chr = "force")
X <- ratify(X)
Y <- CostlyCurrencies()
Y <- renew(Y, new_val_xx = add_default_currency_seed(Y@CostlySeed_r4,
                                                     Ready4useDyad_r4 = X@results_ls$Country_Output_Lookup), 
           what_1L_chr = "seed")
Y <- ratify(Y, type_1L_chr = "Lookup")
Y <- renew(Y, TRUE, type_1L_chr = "slot", what_1L_chr = "force")
Y <- ratify(Y, type_1L_chr = "Lookup")

Create project

We begin by creating X, a CostlyCorrespondences module instance.

X <- CostlyCorrespondences()

Supply seed dataset

We next create a CostlySeed module instance that includes a dataset containing our variable of interest (in this case, countries). The dataset needs to be paired with a dataset dictionary using the Ready4useDyad module from the ready4use R library. You can supply a custom seed dataset (a tibble), dictionary (a ready4use_dictionary) and the concept represented by our variable of interest using a command of the following format.

# Not run
# A <- CostlySeed(Ready4useDyad_r4 = Ready4useDyad(ds_tb = tibble::tibble(), dictionary_r3 = ready4use_dictionary()), include_chr = c("Country"), label_1L_chr = "Country")

The add_default_currency_seed function will perform the previous step using default values that pair a dataset of countries and their currencies with an appropriate dictionary. The concept to be standardised (countries) is specified in the next step.

A <- CostlySeed() %>% add_default_currency_seed()

We now add A to a new CostlyCountries module instance Y, which we use to standardise the country concept variable using a fuzzy logic and correspondence tables workflow.

A@include_chr <- A@label_1L_chr <- "Country"
Y <- CostlyCountries(CostlySeed_r4 = A) %>%
  renew("jw", type_1L_chr = "slot", what_1L_chr = "logic") %>%
  renew(new_val_xx = make_country_correspondences("currencies"), what_1L_chr = "correspondences") %>%
  renew(TRUE, type_1L_chr = "slot", what_1L_chr = "force") %>%
  ratify()

We now update X with the results Ready4useDyad from Y (a seed dataset for which country names have been standardised).

X <- renew(X, new_val_xx = CostlySeed(Ready4useDyad_r4 = Y@results_ls$Country_Output_Lookup), what_1L_chr = "seed")

We can now inspect the first few records from our labelled seed dataset.

renewSlot(X, "CostlySeed_r4@Ready4useDyad_r4", type_1L_chr = "label") %>%
  exhibitSlot("CostlySeed_r4@Ready4useDyad_r4", display_1L_chr = "head", scroll_box_args_ls = list(width = "100%"))

Dataset
Country name	Currency name	Currency symbol	Currency alphabetical ISO code (three letter)	Currency's fractional unit	Number of fractional units in basic unit
Afghanistan	Afghan afghani	؋‎	AFN	Pul	100
Albania	Albanian lek	Lek	ALL	Qintar	100
Algeria	Algerian dinar	DA	DZD	Centime	100
Andorra	Euro	€	EUR	Cent	100
Angola	Angolan kwanza	Kz	AOA	Cêntimo	100
Anguilla	Eastern Caribbean dollar	$	XCD	Cent	100

We can also inspect the seed dataset’s dictionary.

exhibitSlot(X, "CostlySeed_r4@Ready4useDyad_r4", type_1L_chr = "dict", scroll_box_args_ls = list(width = "100%"))

Data Dictionary
Variable	Category	Description	Class
State / Territory[1]	Country	Country name	character
Currency[1][2]	Currency	Currency name	character
Symbol[D] orAbbrev.[3]	Symbol	Currency symbol	character
ISO code[2]	A3	Currency alphabetical ISO code (three letter)	character
Fractionalunit	Fractional	Currency's fractional unit	character
Numberto basic	Number	Number of fractional units in basic unit	character

We specify the seed dataset concept that we are looking to standardise and the concept that we will use to lookup replacement values from the standards dataset.

X@CostlySeed_r4@label_1L_chr <- "Currency"
X@CostlySeed_r4@match_1L_chr <- "A3"

Specify standards

We can now create B, a CostlyStandards module instance that includes a dataset specifying the complete list of allowable variable values. In many cases using the ISO_4217 dataset from the ISOcodes library will be the optimal source of standardised names for currencies. Using the add_currency_standards function will pair this dataset with a dictionary.

B <- CostlyStandards(Ready4useDyad_r4 = Ready4useDyad() %>% add_currency_standards())

We can inspect the first few cases of the labelled version of the standards dataset in B.

renewSlot(B, "Ready4useDyad_r4", type_1L_chr = "label") %>% 
  exhibitSlot("Ready4useDyad_r4", display_1L_chr = "head", scroll_box_args_ls = list(width = "100%"))

Dataset
Alpabetical currency code (three letters)	Numeric currency code	Currency name
AED	784	UAE Dirham
AFN	971	Afghani
ALL	008	Lek
AMD	051	Armenian Dram
ANG	532	Netherlands Antillean Guilder
AOA	973	Kwanza

We can also inspect the data dictionary contained in B.

exhibitSlot(B, "Ready4useDyad_r4", type_1L_chr = "dict", scroll_box_args_ls = list(width = "100%"))

Data Dictionary
Variable	Category	Description	Class
Letter	A3	Alpabetical currency code (three letters)	character
Numeric	N	Numeric currency code	character
Currency	Currency	Currency name	character

We can now specify both the concept (“Currency”) that defines the allowable values for our target variable and the concept we plan to use for lookup matching (described below).

#B@include_chr <- c("Currency", "Letter")
B@label_1L_chr <- "Currency"
B@match_1L_chr <- "A3"

We now add B to X.

X <- renew(X, B, what_1L_chr = "standards")
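The idea behind matching on a lookup code can be sketched in base R as follows. This is an illustration under stated assumptions rather than the costly implementation: the data frames are small hypothetical extracts built from rows shown above, and Currency_std is an invented column name.

```r
# Extract of the seed dataset: non-standard currency names plus ISO codes
seed_df <- data.frame(
  Currency = c("Afghan afghani", "Albanian lek", "Angolan kwanza"),
  A3 = c("AFN", "ALL", "AOA"),
  stringsAsFactors = FALSE
)

# Extract of the standards dataset: ISO codes plus standard currency names
standards_df <- data.frame(
  Letter = c("AED", "AFN", "ALL", "AOA"),
  Currency = c("UAE Dirham", "Afghani", "Lek", "Kwanza"),
  stringsAsFactors = FALSE
)

# Use the shared three letter code to look up each standard name
idx_int <- match(seed_df$A3, standards_df$Letter)
seed_df$Currency_std <- standards_df$Currency[idx_int]
seed_df$Currency_std
#> [1] "Afghani" "Lek"     "Kwanza"
```

Matching on a code avoids the ambiguity of comparing free-text names, provided the codes in the seed dataset are themselves reliable.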

Compare variable of interest values from seed and standards datasets

Currently, the majority of our currency names need to be standardised. In many cases this may be due to something as simple as the use of lower case.

X <- ratify(X, new_val_xx = "identity")
X@results_ls$Currency_Output_Validation$Invalid_Values
#>   [1] "Afghan afghani"                          "Albanian lek"                            "Algerian dinar"                          "Angolan kwanza"                          "Argentine peso"                         
#>   [6] "Armenian dram"                           "Aruban florin"                           "Australian dollar"                       "Azerbaijani manat"                       "Bahamian dollar"                        
#>  [11] "Bahraini dinar"                          "Bangladeshi taka"                        "Barbadian dollar"                        "Belarusian ruble"                        "Belize dollar"                          
#>  [16] "Bermudian dollar"                        "Bhutanese ngultrum"                      "Bitcoin[4]"                              "Bolivian boliviano"                      "Bosnia and Herzegovina convertible mark"
#>  [21] "Botswana pula"                           "Brazilian real"                          "Brunei dollar"                           "Bulgarian lev"                           "Burmese kyat"                           
#>  [26] "Burundian franc"                         "Cambodian riel"                          "Canadian dollar"                         "Cape Verdean escudo"                     "Cayman Islands dollar"                  
#>  [31] "Central African CFA franc"               "CFP franc"                               "Chilean peso"                            "Colombian peso"                          "Comorian franc"                         
#>  [36] "Congolese franc"                         "Cook Islands dollar"                     "Costa Rican colón"                       "Cuban peso"                              "Czech koruna"                           
#>  [41] "Danish krone"                            "Djiboutian franc"                        "Dominican peso"                          "Eastern Caribbean dollar"                "Egyptian pound"                         
#>  [46] "Eritrean nakfa"                          "Ethiopian birr"                          "Falkland Islands pound"                  "Faroese króna"                           "Fijian dollar"                          
#>  [51] "Gambian dalasi"                          "Georgian lari"                           "Ghanaian cedi"                           "Gibraltar pound"                         "Guatemalan quetzal"                     
#>  [56] "Guernsey pound"                          "Guinean franc"                           "Guyanese dollar"                         "Haitian gourde"                          "Honduran lempira"                       
#>  [61] "Hong Kong dollar"                        "Hungarian forint"                        "Icelandic króna"                         "Indian rupee"                            "Indonesian rupiah"                      
#>  [66] "Iranian rial"                            "Iraqi dinar"                             "Israeli new shekel"                      "Jamaican dollar"                         "Japanese yen"                           
#>  [71] "Jersey pound"                            "Jordanian dinar"                         "Kazakhstani tenge"                       "Kenyan shilling"                         "Kiribati dollar[E]"                     
#>  [76] "Kuwaiti dinar"                           "Kyrgyz som"                              "Lao kip"                                 "Lebanese pound"                          "Lesotho loti"                           
#>  [81] "Liberian dollar"                         "Libyan dinar"                            "Macanese pataca"                         "Macedonian denar"                        "Malagasy ariary"                        
#>  [86] "Malawian kwacha"                         "Malaysian ringgit"                       "Maldivian rufiyaa"                       "Manx pound"                              "Mauritanian ouguiya"                    
#>  [91] "Mauritian rupee"                         "Mexican peso"                            "Moldovan leu"                            "Mongolian tögrög"                        "Moroccan dirham"                        
#>  [96] "Mozambican metical"                      "Namibian dollar"                         "Nepalese rupee"                          "Netherlands Antillean guilder"           "New Taiwan dollar"                      
#> [101] "New Zealand dollar"                      "Nicaraguan córdoba"                      "Nigerian naira"                          "Niue dollar[E]"                          "North Korean won"                       
#> [106] "Norwegian krone"                         "Omani rial"                              "Pakistani rupee"                         "Panamanian balboa"                       "Papua New Guinean kina"                 
#> [111] "Paraguayan guaraní"                      "Peruvian sol"                            "Philippine peso"                         "Pitcairn Islands dollar[E]"              "Polish złoty"                           
#> [116] "Qatari riyal"                            "Renminbi"                                "Romanian leu"                            "Russian ruble"                           "Rwandan franc"                          
#> [121] "Sahrawi peseta"                          "Saint Helena pound"                      "Samoan tālā"                             "São Tomé and Príncipe dobra"             "Saudi riyal"                            
#> [126] "Serbian dinar"                           "Seychellois rupee"                       "Sierra Leonean leone"                    "Singapore dollar"                        "Solomon Islands dollar"                 
#> [131] "Somali shilling"                         "South African rand"                      "South Korean won"                        "South Sudanese pound"                    "Sri Lankan rupee"                       
#> [136] "Sterling"                                "Sudanese pound"                          "Surinamese dollar"                       "Swazi lilangeni"                         "Swedish krona"                          
#> [141] "Swiss franc"                             "Syrian pound"                            "Tajikistani somoni"                      "Tanzanian shilling"                      "Thai baht"                              
#> [146] "Tongan paʻanga[K]"                       "Trinidad and Tobago dollar"              "Tunisian dinar"                          "Turkish lira"                            "Turkmenistani manat"                    
#> [151] "Tuvaluan dollar"                         "Ugandan shilling"                        "Ukrainian hryvnia"                       "United Arab Emirates dirham"             "United States dollar"                   
#> [156] "United States dollar[F]"                 "Uruguayan peso"                          "Uzbekistani sum"                         "Vanuatu vatu"                            "Venezuelan digital bolívar"             
#> [161] "Venezuelan sovereign bolívar"            "Vietnamese đồng"                         "West African CFA franc"                  "Yemeni rial"                             "Zambian kwacha"                         
#> [166] "Zimbabwe gold"                           "Zimbabwean dollar"
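To see how much of the mismatch is attributable to letter case alone, an exact comparison can be contrasted with a case-insensitive one. The base R sketch below uses a few seed values from the output above and their closest standard counterparts; the object names are hypothetical.

```r
seed_chr <- c("Algerian dinar", "Argentine peso", "Afghan afghani")
standards_chr <- c("Algerian Dinar", "Argentine Peso", "Afghani")

# Exact comparison: no seed value is found in the standards
exact_lgl <- seed_chr %in% standards_chr
exact_lgl
#> [1] FALSE FALSE FALSE

# Case-insensitive comparison: two of the three values now match
case_insensitive_lgl <- tolower(seed_chr) %in% tolower(standards_chr)
case_insensitive_lgl
#> [1]  TRUE  TRUE FALSE
```

Values such as "Afghan afghani" differ from the standard name in more than case, which is where lookup codes become useful.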

Standardised currency names not currently present in our seed dataset are as follows.

X@results_ls$Currency_Output_Validation$Absent_Values
#>   [1] "ADB Unit of Account"                                               "Afghani"                                                           "Algerian Dinar"                                                   
#>   [4] "Argentine Peso"                                                    "Armenian Dram"                                                     "Aruban Florin"                                                    
#>   [7] "Australian Dollar"                                                 "Azerbaijan Manat"                                                  "Bahamian Dollar"                                                  
#>  [10] "Bahraini Dinar"                                                    "Baht"                                                              "Balboa"                                                           
#>  [13] "Barbados Dollar"                                                   "Belarusian Ruble"                                                  "Belize Dollar"                                                    
#>  [16] "Bermudian Dollar"                                                  "Bolívar Soberano"                                                  "Boliviano"                                                        
#>  [19] "Bond Markets Unit European Composite Unit (EURCO)"                 "Bond Markets Unit European Monetary Unit (E.M.U.-6)"               "Bond Markets Unit European Unit of Account 17 (E.U.A.-17)"        
#>  [22] "Bond Markets Unit European Unit of Account 9 (E.U.A.-9)"           "Brazilian Real"                                                    "Brunei Dollar"                                                    
#>  [25] "Bulgarian Lev"                                                     "Burundi Franc"                                                     "Cabo Verde Escudo"                                                
#>  [28] "Canadian Dollar"                                                   "Cayman Islands Dollar"                                             "CFA Franc BCEAO"                                                  
#>  [31] "CFA Franc BEAC"                                                    "CFP Franc"                                                         "Chilean Peso"                                                     
#>  [34] "Codes specifically reserved for testing purposes"                  "Colombian Peso"                                                    "Comorian Franc"                                                   
#>  [37] "Congolese Franc"                                                   "Convertible Mark"                                                  "Cordoba Oro"                                                      
#>  [40] "Costa Rican Colon"                                                 "Cuban Peso"                                                        "Czech Koruna"                                                     
#>  [43] "Dalasi"                                                            "Danish Krone"                                                      "Denar"                                                            
#>  [46] "Djibouti Franc"                                                    "Dobra"                                                             "Dominican Peso"                                                   
#>  [49] "Dong"                                                              "East Caribbean Dollar"                                             "Egyptian Pound"                                                   
#>  [52] "El Salvador Colon"                                                 "Ethiopian Birr"                                                    "Falkland Islands Pound"                                           
#>  [55] "Fiji Dollar"                                                       "Forint"                                                            "Ghana Cedi"                                                       
#>  [58] "Gibraltar Pound"                                                   "Gold"                                                              "Gourde"                                                           
#>  [61] "Guarani"                                                           "Guinean Franc"                                                     "Guyana Dollar"                                                    
#>  [64] "Hong Kong Dollar"                                                  "Hryvnia"                                                           "Iceland Krona"                                                    
#>  [67] "Indian Rupee"                                                      "Iranian Rial"                                                      "Iraqi Dinar"                                                      
#>  [70] "Jamaican Dollar"                                                   "Jordanian Dinar"                                                   "Kenyan Shilling"                                                  
#>  [73] "Kina"                                                              "Kuna"                                                              "Kuwaiti Dinar"                                                    
#>  [76] "Kwanza"                                                            "Kyat"                                                              "Lao Kip"                                                          
#>  [79] "Lari"                                                              "Lebanese Pound"                                                    "Lek"                                                              
#>  [82] "Lempira"                                                           "Leone"                                                             "Liberian Dollar"                                                  
#>  [85] "Libyan Dinar"                                                      "Lilangeni"                                                         "Loti"                                                             
#>  [88] "Malagasy Ariary"                                                   "Malawi Kwacha"                                                     "Malaysian Ringgit"                                                
#>  [91] "Mauritius Rupee"                                                   "Mexican Peso"                                                      "Mexican Unidad de Inversion (UDI)"                                
#>  [94] "Moldovan Leu"                                                      "Moroccan Dirham"                                                   "Mozambique Metical"                                               
#>  [97] "Mvdol"                                                             "Naira"                                                             "Nakfa"                                                            
#> [100] "Namibia Dollar"                                                    "Nepalese Rupee"                                                    "Netherlands Antillean Guilder"                                    
#> [103] "New Israeli Sheqel"                                                "New Taiwan Dollar"                                                 "New Zealand Dollar"                                               
#> [106] "Ngultrum"                                                          "North Korean Won"                                                  "Norwegian Krone"                                                  
#> [109] "Ouguiya"                                                           "Pa’anga"                                                           "Pakistan Rupee"                                                   
#> [112] "Palladium"                                                         "Pataca"                                                            "Peso Convertible"                                                 
#> [115] "Peso Uruguayo"                                                     "Philippine Peso"                                                   "Platinum"                                                         
#> [118] "Pound Sterling"                                                    "Pula"                                                              "Qatari Rial"                                                      
#> [121] "Quetzal"                                                           "Rand"                                                              "Rial Omani"                                                       
#> [124] "Riel"                                                              "Romanian Leu"                                                      "Rufiyaa"                                                          
#> [127] "Rupiah"                                                            "Russian Ruble"                                                     "Rwanda Franc"                                                     
#> [130] "Saint Helena Pound"                                                "Saudi Riyal"                                                       "SDR (Special Drawing Right)"                                      
#> [133] "Serbian Dinar"                                                     "Seychelles Rupee"                                                  "Silver"                                                           
#> [136] "Singapore Dollar"                                                  "Sol"                                                               "Solomon Islands Dollar"                                           
#> [139] "Som"                                                               "Somali Shilling"                                                   "Somoni"                                                           
#> [142] "South Sudanese Pound"                                              "Sri Lanka Rupee"                                                   "Sucre"                                                            
#> [145] "Sudanese Pound"                                                    "Surinam Dollar"                                                    "Swedish Krona"                                                    
#> [148] "Swiss Franc"                                                       "Syrian Pound"                                                      "Taka"                                                             
#> [151] "Tala"                                                              "Tanzanian Shilling"                                                "Tenge"                                                            
#> [154] "The codes assigned for transactions where no currency is involved" "Trinidad and Tobago Dollar"                                        "Tugrik"                                                           
#> [157] "Tunisian Dinar"                                                    "Turkish Lira"                                                      "Turkmenistan New Manat"                                           
#> [160] "UAE Dirham"                                                        "Uganda Shilling"                                                   "Unidad de Fomento"                                                
#> [163] "Unidad de Valor Real"                                              "Unidad Previsional"                                                "Uruguay Peso en Unidades Indexadas (UI)"                          
#> [166] "US Dollar"                                                         "US Dollar (Next day)"                                              "Uzbekistan Sum"                                                   
#> [169] "Vatu"                                                              "WIR Euro"                                                          "WIR Franc"                                                        
#> [172] "Won"                                                               "Yemeni Rial"                                                       "Yen"                                                              
#> [175] "Yuan Renminbi"                                                     "Zambian Kwacha"                                                    "Zimbabwe Dollar"                                                  
#> [178] "Zloty"

Standardise variable values

We standardise the target variable values, specifying that we are using the lookup codes method and not the fuzzy-logic / correspondences method.

X <- ratify(X, type_1L_chr = "Lookup")

This significantly reduces the number of non-standard values for our target variable.

X@results_ls$Currency_Output_Validation$Invalid_Values
#>  [1] "Bitcoin[4]"                 "Cook Islands dollar"        "Faroese króna"              "Guernsey pound"             "Jersey pound"               "Kiribati dollar[E]"         "Manx pound"                 "Niue dollar[E]"            
#>  [9] "Pitcairn Islands dollar[E]" "Sahrawi peseta"             "Tuvaluan dollar"            "Zimbabwe gold"              "Zimbabwean dollar"
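The idea behind a code-based lookup can be sketched in base R. This is only an illustration of the general technique, not the implementation used by the ratify method: the helper `standardise_by_lookup`, the column names and the two small tables are hypothetical, and the missing code for "Guernsey pound" is an assumption made for the example.

```r
# Illustrative seed data: non-standard currency names with ISO codes where known.
# The NA code for "Guernsey pound" is an assumption made for this sketch.
seed_df <- data.frame(
  currency = c("Zambian kwacha", "Vanuatu vatu", "Yemeni rial", "Guernsey pound"),
  code = c("ZMW", "VUV", "YER", NA),
  stringsAsFactors = FALSE
)

# Illustrative reference table: standard names keyed by three letter code.
iso_df <- data.frame(
  code = c("VUV", "YER", "ZMW"),
  standard_name = c("Vatu", "Yemeni Rial", "Zambian Kwacha"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace names that can be matched by code and
# report the names that could not be matched.
standardise_by_lookup <- function(seed_df, iso_df) {
  idx_int <- match(seed_df$code, iso_df$code)
  matched_lgl <- !is.na(idx_int)
  seed_df$currency[matched_lgl] <- iso_df$standard_name[idx_int[matched_lgl]]
  list(data = seed_df, invalid_values = seed_df$currency[!matched_lgl])
}

result_ls <- standardise_by_lookup(seed_df, iso_df)
result_ls$data$currency
result_ls$invalid_values
```

Values without a usable code are left unchanged and are reported as invalid, which mirrors why a handful of names remain non-standard after a lookup pass.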

If we wish, we can remove the remaining non-standardised values.

X <- renew(X, T, type_1L_chr = "slot", what_1L_chr = "force") 
X <- ratify(X, type_1L_chr = "Lookup")
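As a rough base R analogue of this removal step, rows whose values are not in the set of standard names can simply be dropped. The data frame and the vector of standard names below are small hypothetical examples and do not reproduce the internals of the renew and ratify methods.

```r
# Illustrative data in which one country has both a standard and a
# non-standard currency name.
currency_df <- data.frame(
  country = c("Zambia", "Vanuatu", "Guernsey", "Guernsey"),
  currency = c("Zambian Kwacha", "Vatu", "Guernsey pound", "Pound Sterling"),
  stringsAsFactors = FALSE
)

# Illustrative subset of standard currency names.
standard_chr <- c("Pound Sterling", "Vatu", "Zambian Kwacha")

# Keep only the rows whose currency name is a standard value.
is_standard_lgl <- currency_df$currency %in% standard_chr
cleaned_df <- currency_df[is_standard_lgl, , drop = FALSE]
cleaned_df$currency
```

After filtering, every remaining value is a standard name, so a repeat validation pass would find no invalid values in this toy dataset.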

We can now inspect our results: a dataset in which the country names and currency names conform to ISO standards.

X@results_ls$Currency_Output_Lookup %>%
  renew(type_1L_chr = "label") %>%
  exhibit(scroll_box_args_ls = list(width = "100%"))

Dataset
Country name	Currency name	Currency symbol	Currency alphabetical ISO code (three letter)	Currency's fractional unit	Number of fractional units in basic unit
Afghanistan	Afghani	؋‎	AFN	Pul	100
Albania	Lek	Lek	ALL	Qintar	100
Algeria	Algerian Dinar	DA	DZD	Centime	100
Andorra	Euro	€	EUR	Cent	100
Angola	Kwanza	Kz	AOA	Cêntimo	100
Anguilla	East Caribbean Dollar	\$	XCD	Cent	100
Antigua and Barbuda	East Caribbean Dollar	\$	XCD	Cent	100
Argentina	Argentine Peso	\$	ARS	Centavo	100
Armenia	Armenian Dram	֏	AMD	Luma	100
Aruba	Aruban Florin	ƒ	AWG	Cent	100
Saint Helena, Ascension and Tristan da Cunha	Saint Helena Pound	£	SHP	Penny	100
Australia	Australian Dollar	\$	AUD	Cent	100
Austria	Euro	€	EUR	Cent	100
Azerbaijan	Azerbaijan Manat	₼	AZN	Qəpik	100
Bahamas	Bahamian Dollar	\$	BSD	Cent	100
Bahrain	Bahraini Dinar	BD	BHD	Fils	1000
Bangladesh	Taka	৳	BDT	Poisha	100
Barbados	Barbados Dollar	\$	BBD	Cent	100
Belarus	Belarusian Ruble	Br	BYN	Kopeck	100
Belgium	Euro	€	EUR	Cent	100
Belize	Belize Dollar	\$	BZD	Cent	100
Benin	CFA Franc BCEAO	Fr	XOF	Centime	100
Bermuda	Bermudian Dollar	\$	BMD	Cent	100
Bhutan	Ngultrum	Nu	BTN	Chetrum	100
Bhutan	Indian Rupee	₹	INR	Paisa	100
Bolivia, Plurinational State of	Boliviano	Bs	BOB	Centavo	100
Bonaire, Sint Eustatius and Saba	US Dollar	\$	USD	Cent	100
Bosnia and Herzegovina	Convertible Mark	KM	BAM	Fening	100
Botswana	Pula	P	BWP	Thebe	100
Brazil	Brazilian Real	R\$	BRL	Centavo	100
British Indian Ocean Territory	US Dollar	\$	USD	Cent	100
Virgin Islands, British	US Dollar	\$	USD	Cent	100
Brunei Darussalam	Brunei Dollar	\$	BND	Sen	100
Brunei Darussalam	Singapore Dollar	\$	SGD	Cent	100
Bulgaria	Bulgarian Lev	Lev	BGN	Stotinka	100
Burkina Faso	CFA Franc BCEAO	Fr	XOF	Centime	100
Burundi	Burundi Franc	Fr	BIF	Centime	100
Cambodia	Riel	៛	KHR	Sen	100
Cambodia	US Dollar	\$	USD	Cent	100
Cameroon	CFA Franc BEAC	Fr	XAF	Centime	100
Canada	Canadian Dollar	\$	CAD	Cent	100
Cabo Verde	Cabo Verde Escudo	\$	CVE	Centavo	100
Cayman Islands	Cayman Islands Dollar	\$	KYD	Cent	100
Central African Republic	CFA Franc BEAC	Fr	XAF	Centime	100
Chad	CFA Franc BEAC	Fr	XAF	Centime	100
Chile	Chilean Peso	\$	CLP	Centavo	100
China	Yuan Renminbi	¥	CNY	Jiao\[G\]	10
Colombia	Colombian Peso	\$	COP	Centavo	100
Comoros	Comorian Franc	Fr	KMF	Centime	100
Congo, The Democratic Republic of the	Congolese Franc	Fr	CDF	Centime	100
Congo	CFA Franc BEAC	Fr	XAF	Centime	100
Cook Islands	New Zealand Dollar	\$	NZD	Cent	100
Costa Rica	Costa Rican Colon	₡	CRC	Céntimo	100
Côte d'Ivoire	CFA Franc BCEAO	Fr	XOF	Centime	100
Croatia	Euro	€	EUR	Cent	100
Cuba	Cuban Peso	\$	CUP	Centavo	100
Curaçao	Netherlands Antillean Guilder	ƒ	ANG	Cent	100
Cyprus	Euro	€	EUR	Cent	100
Czechia	Czech Koruna	Kč	CZK	Heller	100
Denmark	Danish Krone	kr	DKK	Øre	100
Djibouti	Djibouti Franc	Fr	DJF	Centime	100
Dominica	East Caribbean Dollar	\$	XCD	Cent	100
Dominican Republic	Dominican Peso	\$	DOP	Centavo	100
Timor-Leste	US Dollar	\$	USD	Centavo	100
Ecuador	US Dollar	\$	USD	Centavo	100
Egypt	Egyptian Pound	LE	EGP	Piastre\[B\]	100
El Salvador	US Dollar	\$	USD	Cent	100
Equatorial Guinea	CFA Franc BEAC	Fr	XAF	Centime	100
Eritrea	Nakfa	Nkf	ERN	Cent	100
Estonia	Euro	€	EUR	Cent	100
Eswatini	Lilangeni	L or E (pl.)	SZL	Cent	100
Eswatini	Rand	R	ZAR	Cent	100
Ethiopia	Ethiopian Birr	Br	ETB	Santim	100
Falkland Islands (Malvinas)	Falkland Islands Pound	£	FKP	Penny	100
Falkland Islands (Malvinas)	Pound Sterling	£	GBP	Penny	100
Faroe Islands	Danish Krone	kr	DKK	Øre	100
Fiji	Fiji Dollar	\$	FJD	Cent	100
Finland	Euro	€	EUR	Cent	100
France	Euro	€	EUR	Cent	100
French Polynesia	CFP Franc	₣	XPF	Centime	100
French Southern Territories	Euro	€	EUR	Cent	100
Gabon	CFA Franc BEAC	Fr	XAF	Centime	100
Gambia	Dalasi	D	GMD	Butut	100
Georgia	Lari	₾	GEL	Tetri	100
Germany	Euro	€	EUR	Cent	100
Ghana	Ghana Cedi	₵	GHS	Pesewa	100
Gibraltar	Gibraltar Pound	£	GIP	Penny	100
Gibraltar	Pound Sterling	£	GBP	Penny	100
Greece	Euro	€	EUR	Cent	100
Greenland	Danish Krone	kr	DKK	Øre	100
Grenada	East Caribbean Dollar	\$	XCD	Cent	100
Guatemala	Quetzal	Q	GTQ	Centavo	100
Guernsey	Pound Sterling	£	GBP	Penny	100
Guinea	Guinean Franc	Fr	GNF	Centime	100
Guinea-Bissau	CFA Franc BCEAO	Fr	XOF	Centime	100
Guyana	Guyana Dollar	\$	GYD	Cent	100
Haiti	Gourde	G	HTG	Centime	100
Honduras	Lempira	L	HNL	Centavo	100
Hong Kong	Hong Kong Dollar	\$	HKD	Cent	100
Hungary	Forint	Ft	HUF	Fillér	100
Iceland	Iceland Krona	kr	ISK	Eyrir	100
India	Indian Rupee	₹	INR	Paisa	100
Indonesia	Rupiah	Rp	IDR	Sen	100
Iran, Islamic Republic of	Iranian Rial	Rl or Rls (pl.)	IRR	Rial	1
Iraq	Iraqi Dinar	ID	IQD	Fils	1000
Ireland	Euro	€	EUR	Cent	100
Isle of Man	Pound Sterling	£	GBP	Penny	100
Israel	New Israeli Sheqel	₪	ILS	Agora	100
Italy	Euro	€	EUR	Cent	100
Jamaica	Jamaican Dollar	\$	JMD	Cent	100
Japan	Yen	¥	JPY	Sen\[C\]	100
Jersey	Pound Sterling	£	GBP	Penny	100
Jordan	Jordanian Dinar	JD	JOD	Piastre\[H\]	100
Kazakhstan	Tenge	₸	KZT	Tıyn	100
Kenya	Kenyan Shilling	Sh or Shs (pl.)	KES	Cent	100
Kiribati	Australian Dollar	\$	AUD	Cent	100
Korea, Democratic People's Republic of	North Korean Won	₩	KPW	Chon	100
Korea, Republic of	Won	₩	KRW	Jeon	100
Kuwait	Kuwaiti Dinar	KD	KWD	Fils	1000
Kyrgyzstan	Som	som	KGS	Tyiyn	100
Lao People's Democratic Republic	Lao Kip	₭	LAK	Att	100
Latvia	Euro	€	EUR	Cent	100
Lebanon	Lebanese Pound	LL	LBP	Piastre	100
Lesotho	Loti	L or M (pl.)	LSL	Sente	100
Lesotho	Rand	R	ZAR	Cent	100
South Georgia and the South Sandwich Islands	Falkland Islands Pound	£	FKP	Penny	100
South Georgia and the South Sandwich Islands	Pound Sterling	£	GBP	Penny	100
Liberia	Liberian Dollar	\$	LRD	Cent	100
Liberia	US Dollar	\$	USD	Cent	100
Libya	Libyan Dinar	LD	LYD	Dirham	1000
Liechtenstein	Swiss Franc	Fr	CHF	Rappen	100
Lithuania	Euro	€	EUR	Cent	100
Luxembourg	Euro	€	EUR	Cent	100
Macao	Pataca	MOP\$	MOP	Avo	100
Macao	Hong Kong Dollar	\$	HKD	Cent	100
Madagascar	Malagasy Ariary	Ar	MGA	Iraimbilanja	5
Malawi	Malawi Kwacha	K	MWK	Tambala	100
Malaysia	Malaysian Ringgit	RM	MYR	Sen	100
Maldives	Rufiyaa	Rf	MVR	Laari	100
Mali	CFA Franc BCEAO	Fr	XOF	Centime	100
Malta	Euro	€	EUR	Cent	100
Marshall Islands	US Dollar	\$	USD	Cent	100
Mauritania	Ouguiya	UM	MRU	Khoums	5
Mauritius	Mauritius Rupee	Re or Rs (pl.)	MUR	Cent	100
Mexico	Mexican Peso	\$	MXN	Centavo	100
Micronesia, Federated States of	US Dollar	\$	USD	Cent	100
Moldova, Republic of	Moldovan Leu	Leu or Lei (pl.)	MDL	Ban	100
Monaco	Euro	€	EUR	Cent	100
Mongolia	Tugrik	₮	MNT	Möngö	100
Montenegro	Euro	€	EUR	Cent	100
Montserrat	East Caribbean Dollar	\$	XCD	Cent	100
Morocco	Moroccan Dirham	DH	MAD	Centime	100
Mozambique	Mozambique Metical	Mt	MZN	Centavo	100
Myanmar	Kyat	K or Ks (pl.)	MMK	Pya	100
Namibia	Namibia Dollar	\$	NAD	Cent	100
Namibia	Rand	R	ZAR	Cent	100
Nauru	Australian Dollar	\$	AUD	Cent	100
Nepal	Nepalese Rupee	Re or Rs (pl.)	NPR	Paisa	100
Nepal	Indian Rupee	₹	INR	Paisa	100
Netherlands	Euro	€	EUR	Cent	100
New Caledonia	CFP Franc	₣	XPF	Centime	100
New Zealand	New Zealand Dollar	\$	NZD	Cent	100
Nicaragua	Cordoba Oro	C\$	NIO	Centavo	100
Niger	CFA Franc BCEAO	Fr	XOF	Centime	100
Nigeria	Naira	₦	NGN	Kobo	100
Niue	New Zealand Dollar	\$	NZD	Cent	100
North Macedonia	Denar	DEN	MKD	Deni	100
Norway	Norwegian Krone	kr	NOK	Øre	100
Oman	Rial Omani	RO	OMR	Baisa	1000
Pakistan	Pakistan Rupee	Re or Rs (pl.)	PKR	Paisa	100
Palau	US Dollar	\$	USD	Cent	100
Palestine, State of	New Israeli Sheqel	₪	ILS	Agora	100
Palestine, State of	Jordanian Dinar	JD	JOD	Piastre\[H\]	100
Panama	Balboa	B/	PAB	Centésimo	100
Panama	US Dollar	\$	USD	Cent	100
Papua New Guinea	Kina	K	PGK	Toea	100
Paraguay	Guarani	₲	PYG	Céntimo	100
Peru	Sol	S/	PEN	Céntimo	100
Philippines	Philippine Peso	₱	PHP	Sentimo	100
Pitcairn	New Zealand Dollar	\$	NZD	Cent	100
Poland	Zloty	zł	PLN	Grosz	100
Portugal	Euro	€	EUR	Cent	100
Qatar	Qatari Rial	QR	QAR	Dirham	100
Romania	Romanian Leu	Leu or Lei (pl.)	RON	Ban	100
Russian Federation	Russian Ruble	₽	RUB	Kopeck	100
Rwanda	Rwanda Franc	Fr	RWF	Centime	100
Bonaire, Sint Eustatius and Saba	US Dollar	\$	USD	Cent	100
Western Sahara	Moroccan Dirham	DH	MAD	Centime	100
Saint Helena, Ascension and Tristan da Cunha	Saint Helena Pound	£	SHP	Penny	100
Saint Helena, Ascension and Tristan da Cunha	Pound Sterling	£	GBP	Penny	100
Saint Kitts and Nevis	East Caribbean Dollar	\$	XCD	Cent	100
Saint Lucia	East Caribbean Dollar	\$	XCD	Cent	100
Saint Pierre and Miquelon	Euro	€	EUR	Cent	100
Saint Pierre and Miquelon	Canadian Dollar	\$	CAD	Cent	100
Saint Vincent and the Grenadines	East Caribbean Dollar	\$	XCD	Cent	100
Samoa	Tala	\$	WST	Sene	100
Saint Barthélemy	Euro	€	EUR	Cent	100
San Marino	Euro	€	EUR	Cent	100
Sao Tome and Principe	Dobra	Db	STN	Cêntimo	100
Saudi Arabia	Saudi Riyal	Rl or Rls (pl.)	SAR	Halala	100
Senegal	CFA Franc BCEAO	Fr	XOF	Centime	100
Serbia	Serbian Dinar	DIN	RSD	Para	100
Seychelles	Seychelles Rupee	Re or Rs (pl.)	SCR	Cent	100
Sierra Leone	Leone	Le	SLE	Cent	100
Singapore	Singapore Dollar	\$	SGD	Cent	100
Singapore	Brunei Dollar	\$	BND	Sen	100
Bonaire, Sint Eustatius and Saba	US Dollar	\$	USD	Cent	100
Sint Maarten (Dutch part)	Netherlands Antillean Guilder	ƒ	ANG	Cent	100
Slovakia	Euro	€	EUR	Cent	100
Slovenia	Euro	€	EUR	Cent	100
Solomon Islands	Solomon Islands Dollar	\$	SBD	Cent	100
Somalia	Somali Shilling	Sh or Shs (pl.)	SOS	Cent	100
South Africa	Rand	R	ZAR	Cent	100
South Sudan	South Sudanese Pound	LS	SSP	Piaster	100
Spain	Euro	€	EUR	Cent	100
Sri Lanka	Sri Lanka Rupee	Re or Rs (pl.)	LKR	Cent	100
Sudan	Sudanese Pound	LS	SDG	Piastre	100
Suriname	Surinam Dollar	\$	SRD	Cent	100
Sweden	Swedish Krona	kr	SEK	Öre	100
Switzerland	Swiss Franc	Fr	CHF	Rappen\[J\]	100
Syrian Arab Republic	Syrian Pound	LS	SYP	Piastre	100
Taiwan, Province of China	New Taiwan Dollar	\$	TWD	Cent	100
Tajikistan	Somoni	SM	TJS	Diram	100
Tanzania, United Republic of	Tanzanian Shilling	Sh or Shs (pl.)	TZS	Cent	100
Thailand	Baht	฿	THB	Satang	100
Togo	CFA Franc BCEAO	Fr	XOF	Centime	100
Tonga	Pa'anga	T\$	TOP	Seniti	100
Trinidad and Tobago	Trinidad and Tobago Dollar	\$	TTD	Cent	100
Tunisia	Tunisian Dinar	DT	TND	Millime	1000
Türkiye	Turkish Lira	₺	TRY	Kuruş	100
Turkmenistan	Turkmenistan New Manat	m	TMT	Tenge	100
Turks and Caicos Islands	US Dollar	\$	USD	Cent	100
Tuvalu	Australian Dollar	\$	AUD	Cent	100
Uganda	Uganda Shilling	Sh or Shs (pl.)	UGX	(none)	(none)
Ukraine	Hryvnia	₴	UAH	Kopeck	100
United Arab Emirates	UAE Dirham	Dh or Dhs (pl.)	AED	Fils	100
United Kingdom	Pound Sterling	£	GBP	Penny	100
United States	US Dollar	\$	USD	Cent\[A\]	100
Uruguay	Peso Uruguayo	\$	UYU	Centésimo	100
Uzbekistan	Uzbekistan Sum	soum	UZS	Tiyin	100
Vanuatu	Vatu	VT	VUV	Cent	100
Holy See (Vatican City State)	Euro	€	EUR	Cent	100
Venezuela, Bolivarian Republic of	Bolívar Soberano	Bs.S	VES	Céntimo	1
Venezuela, Bolivarian Republic of	Bolívar Soberano	Bs.D	VED	Céntimo	100
Venezuela, Bolivarian Republic of	US Dollar	\$	USD	Cent	100
Viet Nam	Dong	₫	VND	Hào\[L\]	10
Wallis and Futuna	CFP Franc	₣	XPF	Centime	100
Yemen	Yemeni Rial	Rl or Rls (pl.)	YER	Fils	100
Zambia	Zambian Kwacha	K	ZMW	Ngwee	100
Zimbabwe	US Dollar	\$	USD	Cent	100

4.2.5 - Score health utility

Using modules from the scorz R package, individual responses to a multi-attribute utility instrument survey can be converted into health utility total scores. This tutorial describes how to do so for adolescent AQoL-6D health utility.

The section below renders a vignette article from the scorz library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

Note: This vignette is illustrated with fake data. The dataset explored in this example should not be used to inform decision-making. Some of the methods illustrated in this AQoL-6D vignette can also be used to score other health utility instruments - see a vignette about scoring EQ-5D.

library(ready4)
library(scorz)

AQoL-6D scoring

To derive a health utility score from the raw responses to a multi-attribute utility instrument it is necessary to implement a scoring algorithm. Scoring algorithms for the Assessment of Quality of Life Six Dimension (AQoL-6D) are publicly available in SPSS format (https://www.aqol.com.au/index.php/scoring-algorithms).

However, to include scoring algorithms in reproducible research workflows, it is desirable to have these algorithms available in open science languages such as R. The scorz package includes ready4 framework model modules of the ready4 youth mental health economic model that provide R implementations of the adult and adolescent versions of the AQoL-6D scoring algorithms.

Ingest data

To begin, we ingest an unscored dataset as an instance of the Ready4useDyad class from the ready4use package. In this case, we download our data from a remote repository.

X <- ready4use::Ready4useRepos(dv_nm_1L_chr = "fakes",
                               dv_ds_nm_1L_chr = "https://doi.org/10.7910/DVN/W95KED",
                               dv_server_1L_chr = "dataverse.harvard.edu") %>%
  ingest(fls_to_ingest_chr = "ymh_clinical_dyad_r4",
         metadata_1L_lgl = F)

To make the ingested dataset easier to interpret, we can add labels from the dictionary.

X <- X %>%
  renew(type_1L_chr = "label")
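Conceptually, labelling pairs each variable name with a description held in a data dictionary. The base R sketch below attaches descriptions as "label" attributes. The helper `add_labels`, the dictionary column names `var_nm` and `var_desc`, and the item variable name `aqol6d_q1` are assumptions made for illustration, and the renew method may work differently.

```r
# Illustrative dataset using variable names that appear in this tutorial
# (aqol6d_q1 is assumed to be the default item prefix followed by the item number).
ds_df <- data.frame(
  fkClientID = c("Participant_1", "Participant_2"),
  round = c("Baseline", "Baseline"),
  aqol6d_q1 = c(2, 3),
  stringsAsFactors = FALSE
)

# Illustrative dictionary that maps variable names to descriptions.
dictionary_df <- data.frame(
  var_nm = c("fkClientID", "round", "aqol6d_q1"),
  var_desc = c("Unique client identifier",
               "Round of data collection",
               "Assessment of Quality of Life (6 Dimension) question 1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: attach each description as a "label" attribute.
add_labels <- function(ds_df, dictionary_df) {
  for (var_nm in intersect(names(ds_df), dictionary_df$var_nm)) {
    attr(ds_df[[var_nm]], "label") <-
      dictionary_df$var_desc[match(var_nm, dictionary_df$var_nm)]
  }
  ds_df
}

labelled_df <- add_labels(ds_df, dictionary_df)
labels_chr <- vapply(labelled_df, function(x) attr(x, "label"), character(1))
labels_chr
```

Keeping labels as attributes leaves the underlying variable names unchanged, so code that refers to names such as `fkClientID` continues to work.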

We can now inspect our ingested dataset using the exhibit method.

exhibit(X,
        display_1L_chr = "head",
        scroll_box_args_ls = list(width = "100%"))

Dataset
Unique client identifier	Round of data collection	Date of data collection	Age	Gender	Sex at birth	Sexual orientation	Aboriginal or Torres Strait Islander	Country Of birth	Speaks English at home	Native English speaker	Education and employment status	Relationship status	Service centre name	Primary diagnosis	Clinical stage	Kessler Psychological Distress Scale (6 Dimension)	Patient Health Questionnaire	Behavioural Activation for Depression Scale	Generalised Anxiety Disorder Scale	Overall Anxiety Severity and Impairment Scale	Screen for Child Anxiety Related Disorders	Social and Occupational Functioning Assessment Scale	Assessment of Quality of Life (6 Dimension) question 1	Assessment of Quality of Life (6 Dimension) question 2	Assessment of Quality of Life (6 Dimension) question 3	Assessment of Quality of Life (6 Dimension) question 4	Assessment of Quality of Life (6 Dimension) question 5	Assessment of Quality of Life (6 Dimension) question 6	Assessment of Quality of Life (6 Dimension) question 7	Assessment of Quality of Life (6 Dimension) question 8	Assessment of Quality of Life (6 Dimension) question 9	Assessment of Quality of Life (6 Dimension) question 10	Assessment of Quality of Life (6 Dimension) question 11	Assessment of Quality of Life (6 Dimension) question 12	Assessment of Quality of Life (6 Dimension) question 13	Assessment of Quality of Life (6 Dimension) question 14	Assessment of Quality of Life (6 Dimension) question 15	Assessment of Quality of Life (6 Dimension) question 16	Assessment of Quality of Life (6 Dimension) question 17	Assessment of Quality of Life (6 Dimension) question 18	Assessment of Quality of Life (6 Dimension) question 19	Assessment of Quality of Life (6 Dimension) question 20
Participant_1	Baseline	2020-03-22	14	Male	Male	Heterosexual	No	Australia	Yes	Yes	Not studying or working	In a relationship	Southport	Other	0-1a	8	7	96	6	6	28	69	2	3	1	2	3	1	1	2	4	3	3	4	2	4	2	2	2	2	2	1
Participant_2	Baseline	2020-06-15	19	Female	Female	Heterosexual	Yes	Other	No	No	Studying only	In a relationship	Regional Centre	Anxiety	0-1a	13	13	63	12	12	41	58	3	3	1	1	3	2	1	3	2	4	4	3	4	3	1	2	2	2	1	1
Participant_3	Baseline	2020-08-20	21	Female	Female	Other	NA	NA	NA	NA	Studying only	Not in a relationship	Canberra	Anxiety	1b	12	17	72	16	12	43	72	2	3	2	5	1	1	1	2	4	5	2	4	2	2	2	1	1	1	1	1
Participant_4	Baseline	2020-05-23	12	Female	Female	Heterosexual	Yes	Other	No	No	Not studying or working	In a relationship	Southport	Depression and Anxiety	2-4	17	17	75	12	10	51	88	1	2	1	1	3	3	1	4	4	3	3	3	4	2	1	1	2	1	3	1
Participant_5	Baseline	2020-04-05	19	Male	Male	Heterosexual	Yes	Other	No	No	Not studying or working	Not in a relationship	Southport	Depression and Anxiety	0-1a	12	22	82	14	14	51	67	2	2	1	3	5	1	1	1	1	5	4	4	3	2	1	2	1	3	2	3
Participant_6	Baseline	2020-06-09	19	Male	Male	Heterosexual	Yes	Other	No	No	Studying only	In a relationship	Regional Centre	Anxiety	1b	11	8	105	8	3	46	60	1	2	2	1	2	2	4	1	3	3	4	3	4	2	1	2	1	2	1	1

We now add meta-data that identifies our dataset as being longitudinal using the YouthvarsSeries module of the youthvars package.

X <- youthvars::YouthvarsSeries(a_Ready4useDyad = X,
                                id_var_nm_1L_chr = "fkClientID",
                                timepoint_var_nm_1L_chr = "round",
                                timepoint_vals_chr = levels(X@ds_tb$round))
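A longitudinal dataset described this way is expected to have at most one record per participant per data collection round. The base R sketch below shows one way such a structure could be checked. The helper `is_valid_series` is hypothetical and is not part of youthvars, and the "Follow-up" level is an assumed name for a second round.

```r
# Illustrative longitudinal data; "Follow-up" is an assumed second timepoint.
long_df <- data.frame(
  fkClientID = c("Participant_1", "Participant_1", "Participant_2"),
  round = factor(c("Baseline", "Follow-up", "Baseline"),
                 levels = c("Baseline", "Follow-up")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when every timepoint value is allowed and no
# participant has more than one record for the same timepoint.
is_valid_series <- function(ds_df, id_var_nm, timepoint_var_nm, timepoint_vals) {
  all(as.character(ds_df[[timepoint_var_nm]]) %in% timepoint_vals) &&
    anyDuplicated(ds_df[, c(id_var_nm, timepoint_var_nm)]) == 0
}

is_valid_series(long_df, "fkClientID", "round", levels(long_df$round))

# Repeating a record breaks the one-record-per-round rule.
dup_df <- rbind(long_df, long_df[1, ])
is_valid_series(dup_df, "fkClientID", "round", levels(dup_df$round))
```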

We now use the data and meta-data we have created in the previous steps to create an instance of the ScorzAqol6Adol class. This class is specifically designed to facilitate scoring of the adolescent version of the AQoL-6D instrument.

Y <- ScorzAqol6Adol(a_YouthvarsProfile = X)

By default, instances of the ScorzAqol6Adol class are created with a slot specifying a value for the prefix for AQoL-6D questionnaire item responses.

procureSlot(Y,
            slot_nm_1L_chr = "itm_prefix_1L_chr")
#> [1] "aqol6d_q"

If this default value needs to be updated to match the prefix used in your dataset, use the renewSlot method.

# Not run
# Y <- renewSlot(Y, slot_nm_1L_chr = "itm_prefix_1L_chr", new_val_xx = "new_prefix")
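To see why the prefix matters, the base R sketch below builds the item variable names that a prefix implies and checks them against a dataset's column names. It assumes, for illustration only, that item variables are named as the prefix followed by the item number (1 to 20); the column names and the alternative prefix "aqol_" are hypothetical.

```r
# Default prefix reported by procureSlot in this tutorial.
itm_prefix_1L_chr <- "aqol6d_q"

# Assumption: item variables are named prefix + item number for the 20 items.
expected_items_chr <- paste0(itm_prefix_1L_chr, 1:20)

# Illustrative column names for a dataset that uses the default prefix.
ds_nms_chr <- c("fkClientID", "round", paste0("aqol6d_q", 1:20))

# Items found in the dataset and items that are missing under the default prefix.
found_items_chr <- ds_nms_chr[startsWith(ds_nms_chr, itm_prefix_1L_chr)]
missing_items_chr <- setdiff(expected_items_chr, ds_nms_chr)
length(found_items_chr)
length(missing_items_chr)

# With a prefix that does not match the dataset, no item is found.
wrong_items_chr <- setdiff(paste0("aqol_", 1:20), ds_nms_chr)
length(wrong_items_chr)
```

If any expected items were missing, that would suggest the prefix needs to be updated before scoring.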

Calculating scores

To calculate AQoL-6D adolescent utility scores, use the renew method.

Y <- renew(Y)
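Some of the simpler derived variables can be checked by hand. The base R sketch below uses the values reported for Participant_1 in the scored fake dataset of this tutorial. The relationships it checks (the unweighted total as the sum of the 20 item responses, and scores as one minus the corresponding disvalue or disutility) are inferred from the printed output rather than taken from the scorz source code, and the weighted scoring steps are not reproduced here.

```r
# Item responses for Participant_1 (questions 1 to 20).
item_responses_dbl <- c(2, 3, 1, 2, 3, 1, 1, 2, 4, 3,
                        3, 4, 2, 4, 2, 2, 2, 2, 2, 1)

# The unweighted total is the sum of the item responses.
unweighted_total_dbl <- sum(item_responses_dbl)
unweighted_total_dbl  # 46

# Dimension disvalue scores reported for Participant_1 (dimensions 1 to 6).
dim_disvalues_dbl <- c(0.19334101, 0.2964368, 0.7312060,
                       0.7708396, 0.2619285, 0.03009428)

# Each reported dimension score is the complement of the dimension disvalue.
dim_scores_dbl <- 1 - dim_disvalues_dbl
round(dim_scores_dbl, 7)

# The instrument utility score is the complement of the transformed disutility.
disutility_dbl <- 0.55838936
utility_dbl <- 1 - disutility_dbl
round(utility_dbl, 7)
```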

Viewing the updated dataset

We can inspect our updated dataset using the exhibit method. We can see that the updated dataset now has additional variables that include the intermediate and final calculations for AQoL-6D adolescent utility scores.

exhibit(Y,
        display_1L_chr = "head",
        scroll_box_args_ls = list(width = "100%"))

Dataset
Unique client identifier	Round of data collection	Date of data collection	Age	Gender	Sex at birth	Sexual orientation	Aboriginal or Torres Strait Islander	Country Of birth	Speaks English at home	Native English speaker	Education and employment status	Relationship status	Service centre name	Primary diagnosis	Clinical stage	Kessler Psychological Distress Scale (6 Dimension)	Patient Health Questionnaire	Behavioural Activation for Depression Scale	Generalised Anxiety Disorder Scale	Overall Anxiety Severity and Impairment Scale	Screen for Child Anxiety Related Disorders	Social and Occupational Functioning Assessment Scale	Assessment of Quality of Life (6 Dimension) question 1	Assessment of Quality of Life (6 Dimension) question 2	Assessment of Quality of Life (6 Dimension) question 3	Assessment of Quality of Life (6 Dimension) question 4	Assessment of Quality of Life (6 Dimension) question 5	Assessment of Quality of Life (6 Dimension) question 6	Assessment of Quality of Life (6 Dimension) question 7	Assessment of Quality of Life (6 Dimension) question 8	Assessment of Quality of Life (6 Dimension) question 9	Assessment of Quality of Life (6 Dimension) question 10	Assessment of Quality of Life (6 Dimension) question 11	Assessment of Quality of Life (6 Dimension) question 12	Assessment of Quality of Life (6 Dimension) question 13	Assessment of Quality of Life (6 Dimension) question 14	Assessment of Quality of Life (6 Dimension) question 15	Assessment of Quality of Life (6 Dimension) question 16	Assessment of Quality of Life (6 Dimension) question 17	Assessment of Quality of Life (6 Dimension) question 18	Assessment of Quality of Life (6 Dimension) question 19	Assessment of Quality of Life (6 Dimension) question 20	Assessment of Quality of Life (6 Dimension) item disvalue1	Assessment of Quality of Life (6 Dimension) item disvalue2	Assessment of Quality of Life (6 Dimension) item disvalue3	Assessment of Quality of Life (6 Dimension) item disvalue4	Assessment of Quality of Life (6 Dimension) item disvalue5	Assessment of Quality of Life (6 Dimension) item disvalue6	Assessment of Quality of Life (6 Dimension) item disvalue7	Assessment of Quality of Life (6 Dimension) item disvalue8	Assessment of Quality of Life (6 Dimension) item disvalue9	Assessment of Quality of Life (6 Dimension) item disvalue10	Assessment of Quality of Life (6 Dimension) item disvalue11	Assessment of Quality of Life (6 Dimension) item disvalue12	Assessment of Quality of Life (6 Dimension) item disvalue13	Assessment of Quality of Life (6 Dimension) item disvalue14	Assessment of Quality of Life (6 Dimension) item disvalue15	Assessment of Quality of Life (6 Dimension) item disvalue16	Assessment of Quality of Life (6 Dimension) item disvalue17	Assessment of Quality of Life (6 Dimension) item disvalue18	Assessment of Quality of Life (6 Dimension) item disvalue19	Assessment of Quality of Life (6 Dimension) item disvalue20	Disvalue Score for Dimension 1 - Independent Living	Disvalue Score for Dimension 2 - Relationships	Disvalue Score for Dimension 3 - Mental Health	Disvalue Score for Dimension 4 - Coping	Disvalue Score for Dimension 5 - Pain	Disvalue Score for Dimension 6 - Senses	Adult Score Dimension 1 - Independent Living	Adult Score Dimension 2 - Relationships	Adult Score Dimension 3 - Mental Health	Adult Score Dimension 4 - Coping	Adult Score Dimension 5 - Pain	Adult Score Dimension 6 - Senses	Overall score on a 0-1 disvalue scale	Overall score on a life-death disutility scale	AQoL-6D Adolescent Disutility Score (Untransformed)	AQoL-6D Adolescent Disutility Score (Transformed)	Instrument utility score	Instrument utility score rotated	AQOL-6D (weighted total)	AQOL-6D (unweighted total)
Participant_1	Baseline	2020-03-22	14	Male	Male	Heterosexual	No	Australia	Yes	Yes	Not studying or working	In a relationship	Southport	Other	0-1a	8	7	96	6	6	28	69	2	3	1	2	3	1	1	2	4	3	3	4	2	4	2	2	2	2	2	1	0.073	0.240	0.000	0.040	0.461	0.000	0.000	0.133	0.824	0.330	0.368	0.722	0.055	0.826	0.133	0.2	0.072	0.033	0.024	0.000	0.19334101	0.2964368	0.7312060	0.7708396	0.2619285	0.03009428	0.8066590	0.7035632	0.2687940	0.2291604	0.7380715	0.9699057	0.6436897	0.7286568	0.55838936	0.55838936	0.4416106	0.5078265	0.5698492	46
Participant_10	Baseline	2020-08-05	15	Female	Female	Other	Yes	Other	No	No	Studying and working	Not in a relationship	Canberra	Other	0-1a	11	17	34	13	15	38	60	1	2	2	3	5	1	3	3	4	4	3	4	3	3	1	2	2	3	2	1	0.000	0.033	0.041	0.297	1.000	0.000	0.648	0.392	0.824	0.784	0.368	0.722	0.382	0.423	0.000	0.2	0.072	0.223	0.024	0.000	0.27064870	0.7770111	0.8683514	0.6579841	0.1935407	0.13938313	0.7293513	0.2229889	0.1316486	0.3420159	0.8064593	0.8606169	0.7541542	0.8537026	0.74739738	0.74739738	0.2526026	0.3413671	0.3916050	52
Participant_10	Follow-up	2020-11-07	15	Female	Female	Other	Yes	Other	No	No	Not studying or working	Not in a relationship	Regional Centre	Depression	1b	7	17	95	14	10	48	64	2	3	2	1	2	2	2	2	2	3	3	5	3	2	3	1	2	2	3	2	0.073	0.240	0.041	0.000	0.074	0.193	0.197	0.133	0.142	0.330	0.368	1.000	0.382	0.057	0.642	0.0	0.072	0.033	0.205	0.187	0.18835933	0.2602305	0.5155772	0.5858738	0.4342728	0.21476953	0.8116407	0.7397695	0.4844228	0.4141262	0.5657272	0.7852305	0.6473112	0.7327563	0.56418597	0.56418597	0.4358140	0.5027214	0.5645345	47
Participant_100	Baseline	2020-07-19	25	Female	Female	Other	Yes	Other	No	No	Working only	In a relationship	Canberra	Depression and Anxiety	0-1a	7	0	120	3	0	21	76	1	1	1	1	2	1	2	2	2	2	2	2	5	3	2	1	3	1	1	1	0.000	0.000	0.000	0.000	0.074	0.000	0.197	0.133	0.142	0.097	0.064	0.056	1.000	0.423	0.133	0.0	0.338	0.000	0.000	0.000	0.00000000	0.1433888	0.2505682	0.7769222	0.2866694	0.00000000	1.0000000	0.8566112	0.7494318	0.2230778	0.7133306	1.0000000	0.4558633	0.5160373	0.29587849	0.29587849	0.7041215	0.7390198	0.7978085	36
Participant_1000	Baseline	2020-09-06	16	Male	Male	Heterosexual	Yes	Other	No	No	Not studying or working	Not in a relationship	Canberra	Anxiety	0-1a	0	0	128	0	0	0	71	2	1	1	1	1	2	1	2	1	2	2	1	2	3	1	1	1	2	1	1	0.073	0.000	0.000	0.000	0.000	0.193	0.000	0.133	0.000	0.097	0.064	0.000	0.055	0.423	0.000	0.0	0.000	0.033	0.000	0.000	0.02813508	0.1346642	0.1819574	0.3514811	0.0000000	0.01916297	0.9718649	0.8653358	0.8180426	0.6485189	1.0000000	0.9808370	0.2379252	0.2693314	0.08939064	0.08939064	0.9106094	0.9208737	0.9511345	29
Participant_1000	Follow-up	2020-12-20	16	Male	Male	Heterosexual	Yes	Other	No	No	Not studying or working	Not in a relationship	Southport	Anxiety	1b	5	0	117	5	1	14	71	2	2	1	1	1	1	2	1	3	1	2	3	2	2	1	1	1	1	2	1	0.073	0.033	0.000	0.000	0.000	0.000	0.197	0.000	0.392	0.000	0.064	0.338	0.055	0.057	0.000	0.0	0.000	0.000	0.024	0.000	0.04719190	0.1002056	0.2658587	0.2080310	0.0000000	0.01111253	0.9528081	0.8997944	0.7341413	0.7919690	1.0000000	0.9888875	0.2228889	0.2523102	0.07926885	0.07926885	0.9207312	0.9297879	0.9576133	31
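
As a quick sanity check on the table above, the unweighted total in the final column appears to be the simple sum of the 20 item responses. The sketch below confirms this for the first two records using base R only. The helper name calculate_unweighted_total is hypothetical and is not a scorz function.

```r
# Item responses (questions 1 to 20) copied from the first two records shown above.
items_participant_1 <- c(2, 3, 1, 2, 3, 1, 1, 2, 4, 3, 3, 4, 2, 4, 2, 2, 2, 2, 2, 1)
items_participant_10 <- c(1, 2, 2, 3, 5, 1, 3, 3, 4, 4, 3, 4, 3, 3, 1, 2, 2, 3, 2, 1)

# Hypothetical helper: sum the 20 item responses to get the unweighted total.
calculate_unweighted_total <- function(items_dbl) {
  stopifnot(length(items_dbl) == 20)
  sum(items_dbl)
}

unweighted_totals <- c(Participant_1 = calculate_unweighted_total(items_participant_1),
                       Participant_10 = calculate_unweighted_total(items_participant_10))
print(unweighted_totals)
```

The two sums can be compared with the last column of the corresponding baseline records in the table.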

Creating summary plots

To create plots, we use the depict method.

We can create a list of summary plots by timepoint for all individual items.

plot_ls <- depict(Y, type_1L_chr = "item_by_time")

We can then select a desired item’s summary plot by using its index number.

plot_ls[[1]]

AQoL-6D Item 1 scores by data-collection round

Alternatively, we can generate individual plots by passing the item index number to the var_idcs_int argument of depict.

depict(Y, type_1L_chr = "item_by_time", var_idcs_int = 2L)

AQoL-6D Item 2 scores by data-collection round

We can also plot domain scores by time.

depict(Y, type_1L_chr = "domain_by_time", var_idcs_int = 1L)

AQoL-6D Independent Living Domain weighted scores by data-collection round

Total AQoL-6D scores can also be plotted using the same approach, where var_idcs_int = 1L is used to plot the weighted total distribution and var_idcs_int = 2L is used for plotting the unweighted total.

depict(Y, type_1L_chr = "total_by_time", var_idcs_int = 1L)

AQoL-6D item total weighted scores by data-collection round

Composite plots can be generated as well, though these are not currently optimised to reliably produce quality plots suitable for publication.

depict(Y, type_1L_chr = "comp_item_by_time")

AQoL-6D item responses by data-collection round

depict(Y, type_1L_chr = "comp_domain_by_time")

AQoL-6D weighted domain scores by data-collection round

We can now publicly share our scored dataset and its associated metadata, using Ready4useRepos and its share method as described in a vignette from the ready4use package.

Z <- ready4use::Ready4useRepos(gh_repo_1L_chr = "ready4-dev/scorz", # Replace with details of your repo.
                               gh_tag_1L_chr = "Documentation_0.0") # You must have write permissions.
Z <- share(Z,
           obj_to_share_xx = Y,
           fl_nm_1L_chr = "ymh_ScorzAqol6Adol")

Y is now available for download as the file ymh_ScorzAqol6Adol.RDS from the “Documentation_0.0” release of the scorz package.

4.2.6 - Explore candidate utility mapping models

Using modules from the specific R package, it is possible to undertake an exploratory utility mapping analysis. This tutorial illustrates a hypothetical example of exploring how to map to EQ-5D health utility.

The section below renders a vignette article from the specific library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

Note: This vignette uses fake data - it is for illustrative purposes only and should not be used to inform decision making. The specific package includes ready4 framework model modules that form part of the ready4 youth mental health economic model. Currently, these modules are not optimised to be used directly, but are instead intended for use in other model modules. For example, the TTU package includes modules that extend specific modules to help implement utility mapping studies. However, to illustrate the main features of specific modules, this vignette demonstrates how they could be used independently. In practice, the workflow illustrated in this article would probably need to be performed iteratively in order to identify the optimal model types, predictors and covariates and to update default values to ensure model convergence.

library(ready4)
library(scorz)
library(specific)

By default, modules in the specific package will request your consent before writing files to your machine. This is the safest option. However, as there are many files that need to be written locally for this program to execute, you can override this default by supplying the value “Y” to methods with a consent_1L_chr argument.

consent_1L_chr <- "" # Default value - asks for consent prior to writing each file.
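
To make the consent pattern concrete, the following is a minimal base R sketch of a consent-gated write. It is not how the specific package implements consent. The function write_with_consent and its arguments are hypothetical names used only for illustration.

```r
# Hypothetical helper (not part of the specific package): write an object to
# disk only when consent has been given.
write_with_consent <- function(object_xx, path_1L_chr, consent_1L_chr = "") {
  if (!consent_1L_chr %in% c("Y", "N")) {
    # Ask at the console when possible; otherwise treat a missing answer as "N".
    consent_1L_chr <- if (interactive()) {
      readline(paste0("Write ", path_1L_chr, "? [Y/N]: "))
    } else {
      "N"
    }
  }
  if (identical(consent_1L_chr, "Y")) {
    saveRDS(object_xx, path_1L_chr)
    return(TRUE)
  }
  FALSE
}

demo_path <- file.path(tempdir(), "consent_demo.RDS")
written_without_consent <- write_with_consent(1:3, demo_path, consent_1L_chr = "N")
written_with_consent <- write_with_consent(1:3, demo_path, consent_1L_chr = "Y")
```

In this sketch a file is only written by the second call, which mirrors the idea of passing “Y” to skip the prompt.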

Import data

We start by ingesting our data. As this example uses EQ-5D data, we import a ScorzEuroQol5 ready4 framework module (created using the steps described in this vignette from the scorz package) into a SpecificConverter module and then apply the metamorphose method to convert it into a SpecificModels module.

X <- SpecificConverter(a_ScorzProfile = ready4use::Ready4useRepos(gh_repo_1L_chr = "ready4-dev/scorz", 
                                                                  gh_tag_1L_chr = "Documentation_0.0") %>%
                         ingest(fls_to_ingest_chr = "ymh_ScorzEuroQol5",  metadata_1L_lgl = F)) %>% 
  metamorphose()

class(X)
#> [1] "SpecificModels"
#> attr(,"package")
#> [1] "specific"

Inspect data

The dataset we are using has a total of 1786 records at two timepoints on 1068 study participants. The first six records are reproduced below.

Dataset
Unique identifier	Data collection round	Date of data collection	Age	Gender (grouped)	Sex at birth	Sexual orientation	Relationship status	Aboriginal or Torres Strait Islander	Culturally And Linguistically Diverse	Region of residence (metropolitan or regional)	Education and employment status	EQ5D - Mobility domain score	EQ5D - Self-Care domain score	EQ5D - Usual Activities domain score	EQ5D - Pain / Discomfort domain score	EQ5D - Anxiety / Depression domain score	Kessler Psychological Distress - 10 Item Total Score	Overall Wellbeing Measure (Winefield et al. 2012)	EuroQol (EQ-5D) - (weighted total)	EuroQol (EQ-5D) - (unweighted total)
1	BL	2019-10-22	14	Male	Male	Heterosexual	In a relationship	No	No	Metro	Not studying or working	1	1	1	1	2	11	87	0.879	6
2	BL	2019-10-17	19	Female	Female	Heterosexual	In a relationship	Yes	Yes	Regional	Studying only	1	2	1	1	1	14	65	0.846	6
2	FUP	2020-02-14	19	Female	Female	Heterosexual	In a relationship	Yes	Yes	Regional	Studying only	3	1	1	1	1	10	71	0.850	7
3	BL	2020-02-15	21	Female	Female	Other	Not in a relationship	NA	NA	Metro	Studying only	1	1	3	1	1	13	74	0.883	7
3	FUP	2020-06-14	21	Female	Female	Other	Not in a relationship	NA	NA	Metro	Studying only	1	1	2	1	1	10	64	0.906	6
4	BL	2019-12-14	12	Female	Female	Heterosexual	In a relationship	Yes	Yes	Metro	Not studying or working	1	1	1	3	1	18	40	0.796	7

The source dataset of X is contained in the a_YouthvarsProfile slot and is a YouthvarsSeries module. For more information about methods that can be used to explore this dataset, read this vignette from the youthvars package.
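
Counts of records, participants and records per timepoint, like those quoted earlier for the full dataset, can be derived with a few lines of base R. The sketch below applies the idea to the six records reproduced above. The column names uid and round are placeholders, not the variable names used in the actual dataset.

```r
# Identifier and data collection round for the six records shown above.
# Column names are illustrative placeholders.
records_df <- data.frame(uid = c(1, 2, 2, 3, 3, 4),
                         round = c("BL", "BL", "FUP", "BL", "FUP", "BL"),
                         stringsAsFactors = FALSE)

n_records <- nrow(records_df)                    # total number of records
n_participants <- length(unique(records_df$uid)) # number of distinct participants
records_by_round <- table(records_df$round)      # records at each timepoint

print(n_records)
print(n_participants)
print(records_by_round)
```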

Specify parameters

In preparation for exploring our dataset, we need to declare a set of model parameters in a b_SpecificParameters slot of X. This can be done in one step, or in sequential steps. In this example, we will proceed sequentially.

Dependent variable

The dependent variable (total EQ-5D utility score) has already been specified when we imported the data from the ScorzEuroQol5 module.

procureSlot(X, "b_SpecificParameters@depnt_var_nm_1L_chr")
#> [1] "eq5d_total_w"

We can now add details of the allowable range of dependent variable values.

X <- renewSlot(X, "b_SpecificParameters@depnt_var_min_max_dbl", c(-1,1))

Candidate predictors

We can now specify the names of candidate predictor variables.

X <- renewSlot(X, "b_SpecificParameters@candidate_predrs_chr", c("K10_int","Psych_well_int"))

We next add metadata about each candidate predictor variable in the form of a specific_predictors object.

X <- renewSlot(X, "b_SpecificParameters@predictors_lup", class_chr = "integer", class_fn_chr = c("youthvars::youthvars_k10_aus","as.integer"), covariate_lgl = F, increment_dbl = 1,
               long_name_chr = c("Kessler Psychological Distress - 10 Item Total Score", "Overall Wellbeing Measure (Winefield et al. 2012)"), max_val_dbl = c(50,90), min_val_dbl = c(10,18), mdl_scaling_dbl = 0.01,
               short_name_chr = c("K10_int","Psych_well_int"))

The specific_predictors object that we have added to X can be inspected using the exhibitSlot method.

exhibitSlot(X, "b_SpecificParameters@predictors_lup", scroll_box_args_ls = list(width = "100%"))

Variable	Description	Minimum	Maximum	Class	Increment	Function	Scaling	Covariate
K10_int	Kessler Psychological Distress - 10 Item Total Score	10	50	integer	1	youthvars::youthvars_k10_aus	0.01	FALSE
Psych_well_int	Overall Wellbeing Measure (Winefield et al. 2012)	18	90	integer	1	as.integer	0.01	FALSE
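
One practical use of the minimum and maximum values recorded in this lookup table is to flag out-of-range predictor values before modelling. The following base R sketch illustrates the idea with a plain data frame. It does not use the specific_predictors class, and the helper is_in_range is a hypothetical name.

```r
# Plain data frame copy of the range columns from the lookup table above.
predictors_df <- data.frame(short_name_chr = c("K10_int", "Psych_well_int"),
                            min_val_dbl = c(10, 18),
                            max_val_dbl = c(50, 90),
                            stringsAsFactors = FALSE)

# Hypothetical helper: TRUE where a value is non-missing and inside the allowed range.
is_in_range <- function(values_dbl, predictor_1L_chr, lookup_df = predictors_df) {
  row_df <- lookup_df[lookup_df$short_name_chr == predictor_1L_chr, ]
  stopifnot(nrow(row_df) == 1)
  !is.na(values_dbl) &
    values_dbl >= row_df$min_val_dbl &
    values_dbl <= row_df$max_val_dbl
}

# 11 and 14 are K10 scores from the dataset preview; 55 is a made-up invalid value.
k10_check <- is_in_range(c(11, 14, 55), "K10_int")
print(k10_check)
```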

Covariates

We also specify the covariates that we aim to explore in conjunction with each candidate predictor.

X <- renewSlot(X, "b_SpecificParameters@candidate_covars_chr", c("d_sex_birth_s", "d_age",  "d_sexual_ori_s", "d_studying_working"))

Descriptive variables

We also specify variables that we will use for generating descriptive statistics about the dataset.

X <- renewSlot(X,"b_SpecificParameters@descv_var_nms_chr", c("d_age","Gender","d_relation_s", "d_sexual_ori_s", "Region", "d_studying_working"))

Temporal variables

The name of the dataset variable for data collection timepoint and all of its unique values were imported when converting the ScorzEuroQol5 module.

procureSlot(X,"a_YouthvarsProfile@timepoint_var_nm_1L_chr")
#> [1] "Timepoint"

procureSlot(X,"a_YouthvarsProfile@timepoint_vals_chr")
#> [1] "BL"  "FUP"

However, we also need to specify the name of the variable that contains the datestamp for each dataset record.

X <- renewSlot(X, "b_SpecificParameters@msrmnt_date_var_nm_1L_chr", "data_collection_dtm")

Candidate models

X was created with a default set of candidate models, stored as a specific_models sub-module, which can be inspected using the exhibitSlot method.

exhibitSlot(X, "b_SpecificParameters@candidate_mdls_lup", scroll_box_args_ls = list(width = "100%"))

Model types lookup table
Reference	Name	Control	Family	Function	Start	Predict	Transformation	Binomial	Acronym (Fixed)	Acronym (Mixed)	Type (Mixed)	With
OLS_NTF	Ordinary Least Squares (no transformation)	NA	NA	lm	NA	NA	NTF	FALSE	OLS	LMM	linear mixed model	no transformation
OLS_LOG	Ordinary Least Squares (log transformation)	NA	NA	lm	NA	NA	LOG	FALSE	OLS	LMM	linear mixed model	log transformation
OLS_LOGIT	Ordinary Least Squares (logit transformation)	NA	NA	lm	NA	NA	LOGIT	FALSE	OLS	LMM	linear mixed model	logit transformation
OLS_LOGLOG	Ordinary Least Squares (log log transformation)	NA	NA	lm	NA	NA	LOGLOG	FALSE	OLS	LMM	linear mixed model	log log transformation
OLS_CLL	Ordinary Least Squares (complementary log log transformation)	NA	NA	lm	NA	NA	CLL	FALSE	OLS	LMM	linear mixed model	complementary log log transformation
GLM_GSN_LOG	Generalised Linear Model with Gaussian distribution and log link	NA	gaussian(log)	glm	-0.1,-0.1	response	NTF	FALSE	GLM	GLMM	generalised linear mixed model	Gaussian distribution and log link
BET_LGT	Beta Regression Model with Binomial distribution and logit link	betareg::betareg.control	NA	betareg::betareg	-0.5,-0.1,3	response	NTF	FALSE	GLM	GLMM	generalised linear mixed model	Binomial distribution and logit link
BET_CLL	Beta Regression Model with Binomial distribution and complementary log log link	betareg::betareg.control	NA	betareg::betareg	-0.5,-0.1,3	response	NTF	FALSE	GLM	GLMM	generalised linear mixed model	Binomial distribution and complementary log log link

We can choose to select just a subset of these to explore using the renewSlot method. As this is an illustrative example, we have restricted the models we will explore to just four types, passing the relevant row numbers to the slice_indcs_int argument.

X <- renewSlot(X, "b_SpecificParameters@candidate_mdls_lup", slice_indcs_int = c(1L,5L,7L,8L))
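
For readers unfamiliar with the transformations named in the lookup table, the sketch below writes out the conventional definitions of the log, logit, log log and complementary log log transformations, with their inverses, in base R. These are textbook forms and are an assumption here. The specific package may define or parameterise them differently.

```r
# Conventional forward and inverse transformations (assumed textbook definitions).
transformations_ls <- list(
  LOG = list(forward = log, inverse = exp),
  LOGIT = list(forward = qlogis, inverse = plogis),
  LOGLOG = list(forward = function(y) -log(-log(y)),
                inverse = function(z) exp(-exp(-z))),
  CLL = list(forward = function(y) log(-log(1 - y)),
             inverse = function(z) 1 - exp(-exp(z)))
)

# Weighted EQ-5D totals taken from the dataset preview shown earlier.
utility_dbl <- c(0.796, 0.846, 0.879, 0.906)

# Applying a transformation and then its inverse should return the original values.
round_trip_ls <- lapply(transformations_ls, function(tfmn_ls) {
  tfmn_ls$inverse(tfmn_ls$forward(utility_dbl))
})
print(round_trip_ls)
```

Note that the logit, log log and complementary log log forms are only finite for values strictly between 0 and 1, which is one reason the allowable range of the dependent variable matters.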

Other parameters

Depending on the type of analysis we plan on undertaking, we can also specify parameters such as the number of folds to use in cross validation, the maximum number of model runs to allow and a seed to ensure reproducibility of results. In this case we are going to use the default values generated when we first created X.

procureSlot(X, "b_SpecificParameters@folds_1L_int")
#> [1] 10

procureSlot(X, "b_SpecificParameters@max_mdl_runs_1L_int")
#> [1] 300

procureSlot(X, "b_SpecificParameters@seed_1L_int")
#> [1] 1234

Model testing

Before we start to use the data stored in X to undertake modelling, we must first validate that it contains all necessary (and internally consistent) data by using the ratify method. The call to ratify will update any variable names that are likely to cause problems when generating reports (e.g. through inclusion of characters like “_” in the variable name that can cause problems when rendering LaTeX documents).

X <- ratify(X)
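
As an illustration of the kind of renaming involved, the base R sketch below strips underscores from variable names. The rule shown (dropping a trailing "_int" and then removing remaining underscores) is an assumption chosen because it reproduces the names K10 and Psychwell that appear in later tables. It is not taken from the ratify source code, and make_report_safe_names is a hypothetical helper.

```r
# Hypothetical helper: produce names without underscores for use in LaTeX reports.
make_report_safe_names <- function(var_nms_chr) {
  var_nms_chr <- sub("_int$", "", var_nms_chr) # assumed rule: drop the class suffix
  gsub("_", "", var_nms_chr, fixed = TRUE)     # remove any remaining underscores
}

safe_nms_chr <- make_report_safe_names(c("K10_int", "Psych_well_int"))
print(safe_nms_chr)
```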

Set-up workspace

We add details of the directory to which we will write all output. In this example we create a temporary directory (tempdir()), but in practice this would be an existing directory on your local machine.

X <- renewSlot(X, "paths_chr", tempdir())

It can be useful to save fake data (useful for demonstrating the generalisability and replicability of an analysis) and real data (required for write-up and reproducibility) in distinctly labelled directories. By default, X is created with a flag to save all output in a sub-directory “Real”. As we are using fake data, we can override this value.

X <- renewSlot(X, "b_SpecificParameters@fake_1L_lgl", T)

We can now write a number of sub-directories to our specified output directory.

X <- author(X, what_1L_chr = "workspace", consent_1L_chr = consent_1L_chr)
#> New directories created:
#> C:\Users\mham0053\AppData\Local\Temp\RtmpWkpbqI/Fake
#> C:\Users\mham0053\AppData\Local\Temp\RtmpWkpbqI/Fake/Markdown
#> C:\Users\mham0053\AppData\Local\Temp\RtmpWkpbqI/Fake/Output
#> C:\Users\mham0053\AppData\Local\Temp\RtmpWkpbqI/Fake/Reports
#> C:\Users\mham0053\AppData\Local\Temp\RtmpWkpbqI/Fake/Output/_Descriptives
#> C:\Users\mham0053\AppData\Local\Temp\RtmpWkpbqI/Fake/Output/H_Dataverse
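
A directory layout like the one listed above can be reproduced outside the framework with base R. The sketch below is only an approximation of what the author method does. It creates the same sub-directory names under a temporary folder, and the parent folder name workspace_demo is made up for this example.

```r
# Parent directory for the demonstration (name is arbitrary).
output_dir_1L_chr <- file.path(tempdir(), "workspace_demo")

# Leaf sub-directories; parent folders are created automatically by recursive = TRUE.
sub_dirs_chr <- c("Fake/Markdown",
                  "Fake/Output/_Descriptives",
                  "Fake/Output/H_Dataverse",
                  "Fake/Reports")
new_dirs_chr <- file.path(output_dir_1L_chr, sub_dirs_chr)

# Create each directory unless it already exists.
created_lgl <- vapply(new_dirs_chr, function(dir_1L_chr) {
  if (dir.exists(dir_1L_chr)) TRUE else dir.create(dir_1L_chr, recursive = TRUE)
}, logical(1))
print(unname(created_lgl))
```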

Descriptives

The first set of outputs we write to our output directories is a set of descriptive tables and plots.

X <- author(X, consent_1L_chr = consent_1L_chr, digits_1L_int = 3L,  what_1L_chr = "descriptives")

Model comparisons

The investigate method can now be used to compare the candidate models we have specified earlier. In so doing it will transform X into a SpecificPredictors object.

X <- investigate(X, consent_1L_chr = consent_1L_chr, depnt_var_max_val_1L_dbl = 0.99, session_ls = sessionInfo())

class(X)
#> [1] "SpecificPredictors"
#> attr(,"package")
#> [1] "specific"

The investigate method will write each model to be tested to a new sub-directory of our output directory.

The investigate method also outputs a table summarising the performance of each of the candidate models.

exhibit(X, what_1L_chr = "mdl_cmprsn", type_1L_chr = "results")

Comparison of candidate models using highest correlated predictor
	Training model fit (averaged over 10 folds)			Testing model fit (averaged over 10 folds)
Model	R-Squared	RMSE	MAE	R-Squared	RMSE	MAE
Beta Regression Model with Binomial distribution and logit link	0.4318533	0.0742448	0.0587307	0.4128497	0.0741236	0.0587733
Beta Regression Model with Binomial distribution and complementary log log link	0.4174181	0.0751836	0.0593447	0.3996947	0.0750880	0.0594047
Ordinary Least Squares (no transformation)	0.4106104	0.0756222	0.0596955	0.3933147	0.0755461	0.0597672
Ordinary Least Squares (complementary log log transformation)	0.4105040	0.0756284	0.0597793	0.3913360	0.0755268	0.0598295
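
To show how fit statistics of this kind can be produced, the following base R sketch runs a 10-fold cross validation of an ordinary least squares model on simulated data and averages R-squared, RMSE and MAE over folds for the training and testing partitions. The data-generating values are made up, the R-squared definition (squared correlation between observed and predicted values) is an assumption about what the table reports, and none of this is the code used by investigate.

```r
set.seed(1234) # same seed value as the default shown earlier, for reproducibility

# Simulated data: utility falls as a distress score rises (made-up parameters).
n_obs <- 500
k10_dbl <- sample(10:50, n_obs, replace = TRUE)
sim_df <- data.frame(
  k10 = k10_dbl,
  utility = pmin(0.99, pmax(0.03, 1.1 - 0.012 * k10_dbl + rnorm(n_obs, sd = 0.08)))
)

# Fit statistics for a vector of observed and predicted values.
calculate_fit <- function(observed_dbl, predicted_dbl) {
  c(rsq = cor(observed_dbl, predicted_dbl)^2,
    rmse = sqrt(mean((observed_dbl - predicted_dbl)^2)),
    mae = mean(abs(observed_dbl - predicted_dbl)))
}

# Randomly assign each record to one of ten folds.
folds_1L_int <- 10L
fold_idcs_int <- sample(rep(seq_len(folds_1L_int), length.out = n_obs))

# For each fold: fit on the other nine folds, then assess on both partitions.
fold_results_ls <- lapply(seq_len(folds_1L_int), function(fold_1L_int) {
  train_df <- sim_df[fold_idcs_int != fold_1L_int, ]
  test_df <- sim_df[fold_idcs_int == fold_1L_int, ]
  mdl <- lm(utility ~ k10, data = train_df)
  rbind(training = calculate_fit(train_df$utility, predict(mdl, newdata = train_df)),
        testing = calculate_fit(test_df$utility, predict(mdl, newdata = test_df)))
})

# Average the fold-level statistics.
cv_summary_mat <- Reduce(`+`, fold_results_ls) / folds_1L_int
print(cv_summary_mat)
```

Comparing the training and testing rows gives a rough indication of overfitting: a large gap between them suggests the model generalises poorly.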

We can now identify the highest performing model in each category of candidate model based on the testing R² statistic.

procure(X, what_1L_chr = "prefd_mdls") 
#> [1] "BET_LGT" "OLS_NTF"
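
The selection logic can be sketched in base R using the testing R-squared values from the comparison table above. The grouping of models into two categories (GLM for the beta regression models, OLS for the others) is an assumption based on the Acronym (Fixed) column of the model types lookup table. This is an illustration, not the procure implementation.

```r
# Testing R-squared values copied from the model comparison table.
results_df <- data.frame(mdl_chr = c("BET_LGT", "BET_CLL", "OLS_NTF", "OLS_CLL"),
                         category_chr = c("GLM", "GLM", "OLS", "OLS"), # assumed grouping
                         testing_rsq_dbl = c(0.4128497, 0.3996947, 0.3933147, 0.3913360),
                         stringsAsFactors = FALSE)

# Sort from best to worst, then keep the first (best) model in each category.
results_df <- results_df[order(-results_df$testing_rsq_dbl), ]
prefd_mdls_chr <- results_df$mdl_chr[!duplicated(results_df$category_chr)]
print(prefd_mdls_chr)
```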

We can override these automated selections and instead incorporate other considerations (possibly based on judgments informed by visual inspection of the plots and the desirability of constraining predictions to a maximum value of one). We do this in the following command, specifying new preferred model types, in descending order of preference.

X <- renew(X, new_val_xx = c("BET_LGT", "OLS_CLL"), type_1L_chr = "results", what_1L_chr = "prefd_mdls")

Use most preferred model to compare all candidate predictors

We can now compare all of our candidate predictors (with and without candidate covariates) using the most preferred model type.

X <- investigate(X, consent_1L_chr = consent_1L_chr)

class(X)
#> [1] "SpecificFixed"
#> attr(,"package")
#> [1] "specific"

Now, we compare the performance of single predictor models of our preferred model type (in our case, a Beta Regression Model with Binomial distribution and logit link) for each candidate predictor. The last call to the investigate method saved the tested models along with model plots in a sub-directory of our output directory. These results are also viewable as a table.

exhibit(X, scroll_box_args_ls = list(width = "100%"), type_1L_chr = "results", what_1L_chr = "predr_cmprsn")

Comparison of all candidate predictors using preferred model
predr_chr	%IncMSE	IncNodePurity
K10	0.0066197	3.888246
Psychwell	0.0011094	2.342784

The most recent call to the investigate method also saved single predictor R model objects (one for each candidate predictor) along with two plots for each model in a sub-directory of our output directory. The performance of each single predictor model can also be summarised in a table.

exhibit(X, type_1L_chr = "results", what_1L_chr = "fxd_sngl_cmprsn")

Preferred single predictor model performance by candidate predictor
	Training model fit (averaged over 10 folds)			Testing model fit (averaged over 10 folds)
Model	R-Squared	RMSE	MAE	R-Squared	RMSE	MAE
K10	0.4318533	0.0742448	0.0587307	0.4128497	0.0741236	0.0587733
Psychwell	0.1507472	0.0907813	0.0699606	0.1341090	0.0909203	0.0700686

Updated versions of each of the models in the previous step (this time with covariates added) are saved to a new sub-directory of the output directory. We can summarise the performance of each of the updated models, along with all significant model terms, in a table.

exhibit(X, scroll_box_args_ls = list(width = "100%"), type_1L_chr = "results", what_1L_chr = "fxd_full_cmprsn")

We can now identify which, if any, of the candidate covariates we previously specified are significant predictors in any of the models.

procure(X, type_1L_chr = "results", what_1L_chr = "signt_covars")
#> [1] NA

We can override the covariates to select, potentially because we want to select only covariates that are significant for all or most of the models. However, in the below example we have opted not to do so and continue to use no covariates as selected by the algorithm in the previous step.

# X <- renew(X, new_val_xx = c("COVARIATE OF YOUR CHOICE", "ANOTHER COVARIATE"), type_1L_chr = "results", what_1L_chr = "prefd_covars")

Test preferred model with preferred covariates for each candidate predictor

We now conclude our model testing by rerunning the previous step, except confining our covariates to those we prefer.

X <- investigate(X, consent_1L_chr = consent_1L_chr)

class(X)
#> [1] "SpecificMixed"
#> attr(,"package")
#> [1] "specific"

The previous call to the investigate method, via the write_mdls_with_covars_cmprsn function, saves the tested models along with two plots for each model in the “E_Predrs_W_Covars_Sngl_Mdl_Cmprsn” sub-directory of “Output”.

Apply preferred model types and predictors to longitudinal data

The next main step is to use the preferred model types and covariates identified from the preceding analysis of cross-sectional data in longitudinal analysis.

Longitudinal mixed modelling

Prior to undertaking longitudinal mixed modelling, we need to check the appropriateness of the default values for modelling parameters that are stored in X. These include the number of model iterations, and any custom control parameters and priors (by default, empty lists).

procureSlot(X, "b_SpecificParameters@iters_1L_int")
#> [1] 4000

In many cases there will be no need to specify any custom control parameters or priors and using the defaults may speed up execution.

procureSlot(X, "b_SpecificParameters@control_ls")
#> [[1]]
#> list()

procureSlot(X,"b_SpecificParameters@prior_ls")
#> [[1]]
#> list()

However, in this example using the default control parameters would result in warning messages suggesting a change to the adapt_delta control value (default = 0.8). Modifying the adapt_delta control parameter value can address this issue.

X <- renewSlot(X, "b_SpecificParameters@control_ls", new_val_xx = list(adapt_delta = 0.99))

X <- investigate(X, consent_1L_chr = consent_1L_chr)

class(X)
#> [1] "SpecificMixed"
#> attr(,"package")
#> [1] "specific"

The last call to the investigate method wrote the models it tested to a sub-directory of the output directory, along with plots for each model.

Create shareable outputs

The model objects created by the preceding analysis are not suitable for sharing as they contain duplicates of the source dataset. To create model objects that can be shared (where dataset copies are replaced with fake data) use the authorData method.

X <- authorData(X, consent_1L_chr = consent_1L_chr)

Purge dataset copies

For the purposes of efficient computation, multiple objects containing copies of the source dataset were saved to our output directory during the analysis process. We therefore need to delete all of these copies by supplying “purge_write” to the type_1L_chr argument of the author method.

X <- author(X, consent_1L_chr = consent_1L_chr, type_1L_chr = "purge_write")

A copy of the module X is available for download as the file eq5d_ttu_SpecificMixed.RDS from the “Documentation_0.0” release of the specific package.

4.2.7 - Implement a utility mapping study

Using modules from the TTU R package, it is possible to implement a fully reproducible utility mapping study. This tutorial illustrates the main steps using a hypothetical AQoL-6D utility mapping study.

The section below renders a vignette article from the TTU library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

Note: This vignette uses fake data - it is for illustrative purposes only and should not be used to inform decision making. This vignette outlines the workflow for developing utility mapping models using longitudinal data. The workflow for developing utility mapping models using cross-sectional data is broadly similar, with some minor modifications. An example of developing models using cross-sectional data is available at https://doi.org/10.5281/zenodo.8098595 .

Motivation

Health services do not typically collect health utility data from their clients, which makes it more difficult to place an economic value on outcomes attained in these services. One strategy for addressing this gap is to use data from similar samples of patients that contain both health utility and the types of outcome measures that are collected in clinical services. The TTU package provides a toolkit for conducting and reporting a utility mapping (or Transfer to Utility) study.

Implementation

The TTU package contains modules of the ready4 youth mental health economic model that combine and extend model modules for:

labeling, validating and summarising youth mental health datasets (from the youthvars package);
scoring health utility (from the scorz package);
specifying and testing statistical models (from the specific package);
generating reproducible analysis reports (from the ready4show package); and
sharing data via online data repositories (from the ready4use package).

Additionally, TTU relies on two RMarkdown programs:

ttu_mdl_ctlg: Generate a Template Utility Mapping (Transfer to Utility) Model Catalogue (https://doi.org/10.5281/zenodo.5936870)
ttu_lng_ss: Create a Draft Scientific Manuscript For A Utility Mapping Study (https://doi.org/10.5281/zenodo.5976987)

Outputs generated by the TTU package are designed to be compatible with health economic models developed with the ready4 framework.

Workflow

Background and citation

The following workflow illustrates (using fake data) the same steps we used in a real world study, a summary of which is available at https://doi.org/10.1101/2021.07.07.21260129. Citation information for that study is:

@article {Hamilton2021.07.07.21260129,
    author = {Hamilton, Matthew P and Gao, Caroline X and Filia, Kate M and Menssink, Jana M and Sharmin, Sonia and Telford, Nic and Herrman, Helen and Hickie, Ian B and Mihalopoulos, Cathrine and Rickwood, Debra J and McGorry, Patrick D and Cotton, Sue M},
    title = {Predicting Quality Adjusted Life Years in young people attending primary mental health services},
    elocation-id = {2021.07.07.21260129},
    year = {2021},
    doi = {10.1101/2021.07.07.21260129},
    publisher = {Cold Spring Harbor Laboratory Press},
    URL = {https://www.medrxiv.org/content/early/2021/07/12/2021.07.07.21260129},
    eprint = {https://www.medrxiv.org/content/early/2021/07/12/2021.07.07.21260129.full.pdf},
    journal = {medRxiv}
}

The program applied in that study, which this workflow closely resembles, is available at https://doi.org/10.5281/zenodo.6116077 and can be cited as follows:

@software{hamilton_matthew_2022_6212704,
  author       = {Hamilton, Matthew and
                  Gao, Caroline},
  title        = {{Complete study program to reproduce all steps from 
                   data ingest through to results dissemination for a
                   study to map mental health measures to AQoL-6D
                   health utility}},
  month        = feb,
  year         = 2022,
  note         = {{Matthew Hamilton and Caroline Gao  (2022). 
                   Complete study program to reproduce all steps from
                   data ingest through to results dissemination for a
                   study to map mental health measures to AQoL-6D
                   health utility. Zenodo.
                   https://doi.org/10.5281/zenodo.6116077. Version
                   0.0.9.3}},
  publisher    = {Zenodo},
  version      = {0.0.9.3},
  doi          = {10.5281/zenodo.6212704},
  url          = {https://doi.org/10.5281/zenodo.6212704}
}

Load required packages

We begin by loading our required packages.

library(ready4)
library(ready4show)
library(ready4use)
library(youthvars)
library(scorz)
library(TTU)

By default, methods associated with TTU modules will request your consent before writing files to your machine. This is the safest option. However, as there are many files that need to be written locally for this program to execute, you can override this default by supplying the value “Y” to methods with a consent_1L_chr argument.

consent_1L_chr <- "" # Default value - asks for consent prior to writing each file.

Add dataset metadata

We use the Ready4useRepos module to retrieve and ingest a dataset and its data dictionary, and the Ready4useDyad module to pair them.

A <- Ready4useDyad(ds_tb = Ready4useRepos(dv_nm_1L_chr = "fakes", dv_ds_nm_1L_chr = "https://doi.org/10.7910/DVN/HJXYKQ", dv_server_1L_chr = "dataverse.harvard.edu") %>%
                     ingest(fls_to_ingest_chr = c("ymh_clinical_tb"), metadata_1L_lgl = F) %>% youthvars::transform_raw_ds_for_analysis(),
                   dictionary_r3 = Ready4useRepos(dv_nm_1L_chr = "TTU", dv_ds_nm_1L_chr = "https://doi.org/10.7910/DVN/DKDIB0", dv_server_1L_chr = "dataverse.harvard.edu") %>%
                     ingest(fls_to_ingest_chr = c("dictionary_r3"), metadata_1L_lgl = F)) %>%
  renew(type_1L_chr = "label")

We use the YouthvarsSeries module to supply metadata about our longitudinal dataset.

A <- YouthvarsSeries(a_Ready4useDyad = A, id_var_nm_1L_chr = "fkClientID", timepoint_var_nm_1L_chr = "round",
                     timepoint_vals_chr = levels(procureSlot(A, "ds_tb")$round))

Score health utility

We next use the ScorzAqol6Adol module to score adolescent AQoL-6D health utility.

A <- TTUProject(a_ScorzProfile = ScorzAqol6Adol(a_YouthvarsProfile = A))
A <- renew(A, what_1L_chr = "utility") 
#> Joining with `by = join_by(fkClientID, match_var_chr)`

Evaluate candidate models

Over the next few steps we will use modules from the specific package to specify and assess a number of candidate utility mapping models.

Specify modelling parameters

We begin by specifying the parameters we will use in our modelling project. The initial step is to ensure the fields in A for storing parameter values are internally consistent with the data we have entered in the previous steps.

A <- renew(A, what_1L_chr = "parameters")

We next ingest a lookup table of metadata about the variables we plan to explore as candidate predictors. In this case, we are sourcing the lookup table from an online data repository.

A <- renew(A, "use_renew_mthd", fl_nm_1L_chr = "predictors_r3", type_1L_chr = "predictors_lup", 
           y_Ready4useRepos = Ready4useRepos(dv_nm_1L_chr = "TTU", dv_ds_nm_1L_chr = "https://doi.org/10.7910/DVN/DKDIB0", 
                                             dv_server_1L_chr = "dataverse.harvard.edu"),
           what_1L_chr = "parameters")

We can inspect the metadata on candidate predictors that we have just ingested.

exhibit(A, scroll_box_args_ls = list(width = "100%"))

We add additional metadata about variables in our dataset that will be used in exploratory modelling.

A <- renew(A, c(0.03,1), type_1L_chr = "range", what_1L_chr = "parameters") %>%
  renew(c("BADS","GAD7", "K6", "OASIS", "PHQ9", "SCARED"),
        type_1L_chr = "predictors_vars", what_1L_chr = "parameters") %>%
  renew(c("d_sex_birth_s", "d_age",  "d_sexual_ori_s", "d_studying_working", "c_p_diag_s", "c_clinical_staging_s", "SOFAS"),     
        type_1L_chr = "covariates", what_1L_chr = "parameters") %>%
  renew(c("d_age","Gender","d_relation_s", "d_sexual_ori_s" ,"Region", "d_studying_working", "c_p_diag_s", "c_clinical_staging_s","SOFAS"), 
        type_1L_chr = "descriptives", what_1L_chr = "parameters") %>%
  renew("d_interview_date", type_1L_chr = "temporal", what_1L_chr = "parameters")

We record that the data we are working with is fake (this step can be skipped if working with real data).

A <- renew(A, T, type_1L_chr = "is_fake", what_1L_chr = "parameters")

We update A for internal consistency with the values we have previously supplied and create a local workspace to which output files will be written.

A <- renew(A, consent_1L_chr = consent_1L_chr, paths_chr = tempdir(), what_1L_chr = "project")

We now generate tables and charts that describe our dataset. These are saved in a sub-directory of our output data directory, and are available for download. One of the plots is also reproduced here.

A <- author(A, consent_1L_chr = consent_1L_chr, digits_1L_int = 3L, what_1L_chr = "descriptives")

We next compare the performance of different model types. We perform this step using the investigate method. This is the first of several times that we use this method. Each time the method is called, A is updated so that the next call will use a different algorithm. The sequence of calls to investigate is therefore important (it should be in the same order as outlined in this example and you should not attempt to repeat a call to investigate to redo a prior step).

A <- investigate(A, consent_1L_chr = consent_1L_chr, depnt_var_max_val_1L_dbl = 0.9999, session_ls = sessionInfo())

The outputs of the previous command are saved into a sub-directory of our output directory. An example of this output is available for download. Once we have inspected this output, we can specify the preferred model types to use from this point onwards.

A <- renew(A, c("GLM_GSN_LOG", "OLS_CLL"), type_1L_chr = "models", what_1L_chr = "results")

Next we assess multiple versions of our preferred model type - one single predictor model for each of our candidate predictors and the same models with candidate covariates added.

A <- investigate(A, consent_1L_chr = consent_1L_chr)

The previous step saved output into a sub-directory of our output directory. Example output is available for download (single predictor comparisons and multivariate model comparisons). After reviewing this output, we can specify the covariates we wish to add to the models we will assess from this point forward.

A <- renew(A, "SOFAS", type_1L_chr = "covariates", what_1L_chr = "results")

We can now assess the multivariate models.

A <- investigate(A, consent_1L_chr = consent_1L_chr)

As a result of the previous step, more model objects and plot files have been saved to a sub-directory of our output directory. Examples of this output are available for download here and here. Once we inspect this output we can reformulate the models we finalised in the previous step so that they are suitable for modelling longitudinal change. For our primary analysis, we use a mixed model formulation of the models that we previously selected. A series of large model files are written to the local output data directory.

A <- investigate(A, consent_1L_chr = consent_1L_chr)

For our secondary analyses, we specify alternative combinations of predictors and covariates.

A <- investigate(A, consent_1L_chr = consent_1L_chr,
                 scndry_anlys_params_ls = make_scndry_anlys_params(candidate_predrs_chr = c("SOFAS"),
                                                                   candidate_covar_nms_chr = c("d_sex_birth_s", "d_age", "d_sexual_ori_s", "d_studying_working"),
                                                                   prefd_covars_chr = NA_character_) %>%
                   make_scndry_anlys_params(candidate_predrs_chr = c("SCARED","OASIS","GAD7"),
                                            candidate_covar_nms_chr = c("PHQ9", "SOFAS", "d_sex_birth_s", "d_age", "d_sexual_ori_s", "d_studying_working"),
                                            prefd_covars_chr = "PHQ9"))

Report findings

Create shareable models

The model objects created and saved in our working directory by the preceding steps are not suitable for public dissemination. They are both too large in file size and, more importantly, include copies of our source dataset. We can overcome these limitations by creating shareable versions of the models. Two types of shareable version are created - copies of the original model objects in which fake data overwrites the original source data and summary tables of model coefficients.

A <- author(A, consent_1L_chr = consent_1L_chr, what_1L_chr = "models")

Specify study reporting metadata

We update A so that we can begin to use it to render and share reports.

A <- renew(A, what_1L_chr = "reporting")

We add metadata relevant to the reports that we will be generating to these fields. Note that the data we supply to the Ready4useRepos object below must relate to a repository to which we have write permissions (otherwise subsequent steps will fail).

A <- renew(A, ready4show::authors_tb, type_1L_chr = "authors", what_1L_chr = "reporting") %>%
  renew(ready4show::institutes_tb, type_1L_chr = "institutes", what_1L_chr = "reporting") %>%
  renew(c(3L,3L), type_1L_chr = "digits", what_1L_chr = "reporting") %>%
  renew(c("PDF","PDF"), type_1L_chr = "formats", what_1L_chr = "reporting") %>%
  renew("A hypothetical utility mapping study using fake data", type_1L_chr = "title", what_1L_chr = "reporting") %>%
  renew(renew(ready4show_correspondences(), old_nms_chr = c("PHQ9", "GAD7"), new_nms_chr = c("PHQ-9", "GAD-7")), type_1L_chr = "changes", what_1L_chr = "reporting") %>%
  renew(Ready4useRepos(dv_nm_1L_chr = "fakes", dv_ds_nm_1L_chr = "https://doi.org/10.7910/DVN/D74QMP", dv_server_1L_chr = "dataverse.harvard.edu"), type_1L_chr = "repos", what_1L_chr = "reporting")

Author model catalogues

We download a program for generating a catalogue of models and use it to summarise the models created under each study analysis (one primary and two secondary). The catalogues are saved locally.

A <- author(A, consent_1L_chr = consent_1L_chr, download_tmpl_1L_lgl = T, what_1L_chr = "catalogue")

Author manuscript

We add some content about the manuscript we wish to author.

A <- renew(A, "Quality Adjusted Life Years (QALYs) are often used in economic evaluations, yet utility weights for deriving them are rarely directly measured in mental health services.", 
           type_1L_chr = "background", what_1L_chr = "reporting") %>%
  renew("None declared", type_1L_chr = "conflicts", what_1L_chr = "reporting") %>%
  renew("Nothing should be concluded from this study as it is purely hypothetical.", type_1L_chr = "conclusion", what_1L_chr = "reporting") %>%
  renew("The study was reviewed and granted approval by no-one." , type_1L_chr = "ethics", what_1L_chr = "reporting") %>%
  renew("The study was funded by no-one.", type_1L_chr = "funding", what_1L_chr = "reporting") %>%
  renew("three months", type_1L_chr = "interval", what_1L_chr = "reporting") %>%
  renew(c("anxiety", "AQoL","depression", "psychological distress", "QALYs", "utility mapping"), type_1L_chr = "keywords", what_1L_chr = "reporting") %>%
  renew("The study sample is fake data.", type_1L_chr = "sample", what_1L_chr = "reporting")

We create a brief summary of results that can be interpreted by the program that authors the manuscript.

A <- renew(A, c("AQoL-6D", "Adolescent AQoL Six Dimension"), type_1L_chr = "naming", what_1L_chr = "reporting")

A <- renew(A, "use_renew_mthd", type_1L_chr = "abstract", what_1L_chr = "reporting")

We create and save the plots that will be used in the manuscript.

A <- author(A, consent_1L_chr = consent_1L_chr, what_1L_chr = "plots")

We download a program for generating a template manuscript and run it to author a first draft of the manuscript.

A <- author(A, consent_1L_chr = consent_1L_chr, download_tmpl_1L_lgl = T, what_1L_chr = "manuscript")

We can copy the RMarkdown files that created the template manuscript to a new directory (called “Manuscript_Submission”) so that we can then manually edit those files to produce a manuscript that we can submit for publication.

A <- author(A, consent_1L_chr = consent_1L_chr, type_1L_chr = "copy", what_1L_chr = "manuscript")

At this point in the workflow, additional steps are required to adapt / author the manuscript that will be submitted for publication. However, in this example we are going to skip that step and keep working with the unedited template manuscript. If we had a finalised manuscript authoring program stored online, we could now specify the repository from which the program can be retrieved.

# Not run
# A <- renew(A, c("URL of GitHub repository with", "Program version number"), type_1L_chr = "template-manuscript", what_1L_chr = "reporting")

We can now configure the output to be generated by the manuscript authoring program. The below commands will specify a Microsoft Word format manuscript and a PDF technical appendix. Unlike the template manuscript, the figures and tables will be positioned after (and not within) the main body of the manuscript. Note that the Word version of the manuscript generated by these values will require some minor formatting edits (principally to the display of tables and numbering of sections).

A <- renew(A, F, type_1L_chr = "figures-body", what_1L_chr = "reporting") %>%
  renew(F, type_1L_chr = "tables-body", what_1L_chr = "reporting") %>%
  renew(c("Word","PDF"), type_1L_chr = "formats", what_1L_chr = "reporting")

Once any edits to the RMarkdown files for creating the submission manuscript have been finalised, we can run the following command to author the manuscript. If we are using a custom manuscript authoring program downloaded from an online repository the download_tmpl_1L_lgl argument will need to be set to T.

A <- author(A, consent_1L_chr = consent_1L_chr, download_tmpl_1L_lgl = F, type_1L_chr="submission", what_1L_chr = "manuscript")

We can now generate the Supplementary Information for the submission manuscript.

A <- author(A, consent_1L_chr = consent_1L_chr, supplement_fl_nm_1L_chr = "TA_PDF", type_1L_chr="submission", what_1L_chr = "supplement")

We can now share non-confidential elements (i.e. no copies of individual records) of the outputs that we have created via our study online repository. To run this step you will need write permissions to the online repository. In the below step we are sharing model catalogues, details of the utility instrument, the shareable mapping models (designed to be used in conjunction with the youthu package), our manuscript files and our supplementary information. In most real world studies the manuscript would not be shared via an online repository - the what_chr argument would need to be amended to reflect this.

A <- share(A, types_chr = c("auto", "submission"), what_chr = c("catalogue", "instrument" ,"manuscript", "models", "supplement"))

The dataset we created in the previous step is viewable here: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/D74QMP

Tidy workspace

The preceding steps saved multiple objects (mostly R model objects) that have embedded within them copies of the source dataset. To protect the confidentiality of these records we can now purge all such copies from our output data directory.

A <- author(A, what_1L_chr = "purge")

4.2.8 - Find and deploy utility mapping models

Using tools (soon to be formalised into ready4 modules) from the youthu R package, it is possible to find and deploy relevant utility mapping algorithms.

4.2.8.1 - Example 1: Predict health utility from psychological and functional measures (PHQ-9 and SOFAS)

This tutorial illustrates the main steps for predicting AQoL-6D utility from psychological and functional measures using a longitudinal dataset in long format.

The section below renders a vignette article from the youthu library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

library(ready4)
library(ready4use)
library(youthu)

This vignette outlines a workflow for:

Searching, selecting and retrieving transfer to utility models;
Preparing a prediction dataset for use with a selected transfer to utility model; and
Applying the selected transfer to utility model to a prediction dataset to predict Quality Adjusted Life Years (QALYs).

The practical value of implementing such a workflow is discussed in the economic analysis vignette and a scientific manuscript. Note that this example uses fake data and should not be used to inform decision making.

Search, select and retrieve transfer to utility models

To identify datasets that contain transfer to utility models compatible with youthu (i.e. those developed with the TTU package), you can use the get_ttu_dv_dss function. The function searches specified dataverses (in the below example, the TTU dataverse) for datasets containing output from the TTU package.

ttu_dv_dss_tb <- get_ttu_dv_dss("TTU")

The ttu_dv_dss_tb table summarises some pertinent details about each dataset containing TTU models found by the preceding command. These details include a link to any scientific summary (the “Article” column) associated with a dataset.

Transfer to Utility Datasets
ID	Utility	Predictors	Article
1	aqol6dtotalw	BADS total score , GAD7 total score , K6 total score , OASIS total score , PHQ9 total score , SCARED total score, SOFAS total score

To identify models that predict a specified type of health utility from one or more of a specified subset of predictors, use:

mdls_lup <- get_mdls_lup(ttu_dv_dss_tb = ttu_dv_dss_tb,
                         utility_type_chr = "AQoL-6D",
                         mdl_predrs_in_ds_chr = c("PHQ9 total score",
                                                  "SOFAS total score"))

The preceding command will produce a lookup table with information that includes the catalogue names of models, the predictors used in each model and the analysis that generated each one.

Selected elements from Models Look-Up Table
Catalogue reference	Predictors	Analysis
PHQ9_1_GLM_GSN_LOG	PHQ9	Primary Analysis
PHQ9_1_OLS_CLL	PHQ9	Primary Analysis
PHQ9_SOFAS_1_GLM_GSN_LOG	PHQ9 , SOFAS	Primary Analysis
PHQ9_SOFAS_1_OLS_CLL	PHQ9 , SOFAS	Primary Analysis
OASIS_SOFAS_1_GLM_GSN_LOG	OASIS, SOFAS	Primary Analysis
OASIS_SOFAS_1_OLS_CLL	OASIS, SOFAS	Primary Analysis
BADS_SOFAS_1_GLM_GSN_LOG	BADS , SOFAS	Primary Analysis
BADS_SOFAS_1_OLS_CLL	BADS , SOFAS	Primary Analysis
K6_SOFAS_1_GLM_GSN_LOG	K6 , SOFAS	Primary Analysis
K6_SOFAS_1_OLS_CLL	K6 , SOFAS	Primary Analysis
SCARED_SOFAS_1_GLM_GSN_LOG	SCARED, SOFAS	Primary Analysis
SCARED_SOFAS_1_OLS_CLL	SCARED, SOFAS	Primary Analysis
GAD7_SOFAS_1_GLM_GSN_LOG	GAD7 , SOFAS	Primary Analysis
GAD7_SOFAS_1_OLS_CLL	GAD7 , SOFAS	Primary Analysis
SOFAS_1_GLM_GSN_LOG	SOFAS	Secondary Analysis A
SOFAS_1_OLS_CLL	SOFAS	Secondary Analysis A
OASIS_PHQ9_1_GLM_GSN_LOG	OASIS, PHQ9	Secondary Analysis B
OASIS_PHQ9_1_OLS_CLL	OASIS, PHQ9	Secondary Analysis B
GAD7_PHQ9_1_GLM_GSN_LOG	GAD7, PHQ9	Secondary Analysis B
GAD7_PHQ9_1_OLS_CLL	GAD7, PHQ9	Secondary Analysis B
SCARED_PHQ9_1_GLM_GSN_LOG	SCARED, PHQ9	Secondary Analysis B
SCARED_PHQ9_1_OLS_CLL	SCARED, PHQ9	Secondary Analysis B

To review the summary information about the predictive performance of a specific model, use:

get_dv_mdl_smrys(mdls_lup,
                 mdl_nms_chr = "PHQ9_SOFAS_1_OLS_CLL")
#> $PHQ9_SOFAS_1_OLS_CLL
#>        Parameter Estimate    SE          95% CI
#> 1 SD (Intercept)    0.348 0.017   0.312 , 0.382
#> 2      Intercept    0.428 0.129   0.174 , 0.686
#> 3  PHQ9 baseline   -9.115 0.249 -9.601 , -8.618
#> 4    PHQ9 change   -7.331 0.339 -8.007 , -6.665
#> 5 SOFAS baseline    0.960 0.172   0.616 , 1.292
#> 6   SOFAS change    1.146 0.235   0.674 , 1.607
#> 7             R2    0.767 0.012   0.743 , 0.788
#> 8           RMSE    0.925 0.004   0.922 , 0.928
#> 9          Sigma    0.406 0.012   0.384 , 0.429

More information about a selected model can be found in the online model catalogue, the link to which can be obtained with the following command:

get_mdl_ctlg_url(mdls_lup,
                 mdl_nm_1L_chr = "PHQ9_SOFAS_1_OLS_CLL")

#> [1] "https://dataverse.harvard.edu/api/access/datafile/6484935"

Prepare a prediction dataset for use with a selected transfer to utility model

Import data

You can now import and inspect the dataset you plan on using for prediction. In the below example we use fake data.

data_tb <- make_fake_ds_one()

Illustrative example of a prediction dataset
UID	Timepoint	Date	PHQ_total	SOFAS_total
Participant_1	Baseline	2022-12-20	7	69
Participant_10	Baseline	2022-11-16	17	60
Participant_10	Follow-up	2023-02-21	17	64
Participant_100	Baseline	2023-01-31	0	76
Participant_1000	Baseline	2023-02-05	0	71
Participant_1000	Follow-up	2023-04-10	0	71

Confirm dataset can be used as a prediction dataset

The prediction dataset must contain variables that correspond to all the predictors of the model you intend to apply. The allowable range and required class of each predictor variable are described in the min_val_dbl, max_val_dbl and class_chr columns of the model predictors lookup table, which can be accessed with a call to the get_predictors_lup function.

predictors_lup <- get_predictors_lup(mdls_lup = mdls_lup,
                                     mdl_nm_1L_chr = "PHQ9_SOFAS_1_OLS_CLL")

Model predictors lookup table
short_name_chr	long_name_chr	min_val_dbl	max_val_dbl	class_chr	increment_dbl	class_fn_chr	mdl_scaling_dbl	covariate_lgl
PHQ9	PHQ9 total score	0	27	integer	1	youthvars::youthvars_phq9	0.01	FALSE
SOFAS	SOFAS total score	0	100	integer	1	youthvars::youthvars_sofas	0.01	TRUE
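The range and class requirements described above can also be checked by hand before any youthu function is called. The following is a minimal base R sketch and is not part of youthu: the toy_tb data frame and the check_predr_range helper are hypothetical, and the limits are copied by hand from the lookup table above.

```r
# Hypothetical stand-in for a prediction dataset (not produced by youthu).
toy_tb <- data.frame(PHQ_total = c(7L, 17L, 0L),
                     SOFAS_total = c(69L, 60L, 76L))

# Limits copied by hand from the model predictors lookup table.
limits_tb <- data.frame(var_nm = c("PHQ_total", "SOFAS_total"),
                        min_val = c(0, 0),
                        max_val = c(27, 100),
                        stringsAsFactors = FALSE)

# Hypothetical helper: TRUE if a column is of integer class and every
# non-missing value lies within [min_val, max_val].
check_predr_range <- function(ds, var_nm, min_val, max_val) {
  vals <- ds[[var_nm]]
  is.integer(vals) && all(vals >= min_val & vals <= max_val, na.rm = TRUE)
}

# Apply the check to each predictor in turn.
checks_lgl <- mapply(check_predr_range,
                     var_nm = limits_tb$var_nm,
                     min_val = limits_tb$min_val,
                     max_val = limits_tb$max_val,
                     MoreArgs = list(ds = toy_tb))
print(checks_lgl)
```

A FALSE value for any predictor would suggest that the corresponding column needs to be recoded or cleaned before it is used for prediction.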

The prediction dataset must also include both a unique client identifier variable and a measurement time-point identifier variable (which must be a factor with two levels). The dataset also needs to be in long format (i.e. measures at different time-points for the same individual are stacked on top of each other in separate rows). We can confirm these conditions hold by creating a dataset metadata object using the make_predn_metadata_ls function. In creating the metadata object, the function checks that the dataset can be used in conjunction with the model specified at the mdl_nm_1L_chr argument. If the prediction dataset uses different variable names for the predictors to those specified in the predictors_lup lookup table, a named vector detailing the correspondence between the two sets of variable names needs to be passed to the predr_vars_nms_chr argument. Finally, if you wish to specify a preferred variable name to use for the predicted utility values when applying the model, you can do this by passing this name to the utl_var_nm_1L_chr argument.

predn_ds_ls <- make_predn_metadata_ls(data_tb,
                                      id_var_nm_1L_chr = "UID",
                                      msrmnt_date_var_nm_1L_chr = "Date",
                                      predr_vars_nms_chr = c(PHQ9 = "PHQ_total",SOFAS = "SOFAS_total"),
                                      round_var_nm_1L_chr = "Timepoint",
                                      round_bl_val_1L_chr = "Baseline",
                                      utl_var_nm_1L_chr = "AQoL6D_HU",
                                      mdls_lup = mdls_lup,
                                      mdl_nm_1L_chr = "PHQ9_SOFAS_1_OLS_CLL")
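To make the dataset conditions more concrete, the base R sketch below checks them on a small hand-made data frame. It is an illustration only and does not reproduce what make_predn_metadata_ls does internally: toy_tb and the three *_ok_lgl flags are hypothetical names, and the rename at the end is simply one way to read the named correspondence vector.

```r
# Hypothetical long-format dataset mimicking the illustrative example.
toy_tb <- data.frame(
  UID = c("Participant_1", "Participant_10", "Participant_10"),
  Timepoint = factor(c("Baseline", "Baseline", "Follow-up"),
                     levels = c("Baseline", "Follow-up")),
  PHQ_total = c(7L, 17L, 17L),
  SOFAS_total = c(69L, 60L, 64L),
  stringsAsFactors = FALSE
)

# Named vector: names are the model's predictor names, values are the
# corresponding column names in our dataset.
predr_vars_nms_chr <- c(PHQ9 = "PHQ_total", SOFAS = "SOFAS_total")

# Condition 1: the round variable is a factor with exactly two levels.
round_ok_lgl <- is.factor(toy_tb$Timepoint) && nlevels(toy_tb$Timepoint) == 2

# Condition 2: long format, i.e. at most one row per participant per round.
long_ok_lgl <- !any(duplicated(toy_tb[, c("UID", "Timepoint")]))

# Condition 3: every dataset column named in the correspondence vector exists.
predrs_ok_lgl <- all(predr_vars_nms_chr %in% names(toy_tb))

# Illustration only: rename dataset columns to the model's predictor names.
renamed_tb <- toy_tb
names(renamed_tb)[match(predr_vars_nms_chr, names(renamed_tb))] <- names(predr_vars_nms_chr)
print(names(renamed_tb))
```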

Apply the selected transfer to utility model to a prediction dataset to predict Quality Adjusted Life Years (QALYs)

Predict health utility at baseline and follow-up timepoints

To generate utility predictions we use the add_utl_predn function. The function needs to be supplied with the prediction dataset (the value passed to argument data_tb) and the validated prediction metadata object we created in the previous step.

data_tb <- add_utl_predn(data_tb,
                         predn_ds_ls = predn_ds_ls)
#> Joining with `by = join_by(UID, Timepoint)`

By default the add_utl_predn function samples model parameter values based on a table of model coefficients when making predictions and constrains predictions to an allowed range. You can override these defaults by supplying the additional arguments new_data_is_1L_chr = "Predicted" (which uses mean parameter values), force_min_max_1L_lgl = F (which removes the range constraint) and (if the source dataset makes available downloadable model objects) make_from_tbl_1L_lgl = F. These settings will produce different predictions. It is strongly recommended that you consult the model catalogue (see above) to understand how such decisions may affect the validity of the predicted values that will be generated.

Prediction dataset with predicted utilities
UID	Timepoint	Date	PHQ_total	SOFAS_total	AQoL6D_HU
Participant_1	Baseline	2022-12-20	7	69	0.9193293
Participant_10	Baseline	2022-11-16	17	60	0.6721956
Participant_10	Follow-up	2023-02-21	17	64	0.4242752
Participant_100	Baseline	2023-01-31	0	76	0.7530591
Participant_1000	Baseline	2023-02-05	0	71	0.7613385
Participant_1000	Follow-up	2023-04-10	0	71	0.9930864
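The range constraint mentioned earlier can be pictured as clamping each prediction to a lower and upper bound. The base R sketch below is an assumption about what such a constraint amounts to and is not taken from the youthu source code: the input values and both bounds are hypothetical.

```r
# Hypothetical unconstrained predictions (not output from youthu).
raw_predns_dbl <- c(-0.12, 0.45, 0.98, 1.07)

# Assumed bounds, chosen for illustration only.
min_val_1L_dbl <- 0.03
max_val_1L_dbl <- 1

# Values below the minimum are raised to it; values above the maximum are
# lowered to it; values already inside the range are unchanged.
constrained_dbl <- pmin(pmax(raw_predns_dbl, min_val_1L_dbl), max_val_1L_dbl)
print(constrained_dbl)
```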

Our health utility predictions are now available for use and are summarised below.

summary(data_tb$AQoL6D_HU)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#> 0.06525 0.42832 0.62654 0.62142 0.83585 0.99999

Calculate QALYs

The last step is to calculate Quality Adjusted Life Years, using a method assuming a linear rate of change between timepoints.

data_tb <- data_tb %>% add_qalys_to_ds(predn_ds_ls = predn_ds_ls,
                                       include_predrs_1L_lgl = F,
                                       reshape_1L_lgl = F)

Prediction dataset with QALYs
UID	Timepoint	Date	PHQ_total	SOFAS_total	AQoL6D_HU	AQoL6D_HU_change_dbl	duration_prd	qalys_dbl
Participant_1	Baseline	2022-12-20	7	69	0.9193293	0.0000000	0S	0.0000000
Participant_10	Baseline	2022-11-16	17	60	0.6721956	0.0000000	0S	0.0000000
Participant_10	Follow-up	2023-02-21	17	64	0.4242752	-0.2479204	97d 0H 0M 0S	0.1455957
Participant_100	Baseline	2023-01-31	0	76	0.7530591	0.0000000	0S	0.0000000
Participant_1000	Baseline	2023-02-05	0	71	0.7613385	0.0000000	0S	0.0000000
Participant_1000	Follow-up	2023-04-10	0	71	0.9930864	0.2317479	64d 0H 0M 0S	0.1537073
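As a worked check of the linear-change assumption, the base R sketch below recomputes the QALY value for Participant_10 from the utilities and dates in the table above. It assumes a 365.25-day year, which appears consistent with the tabulated values but has not been confirmed against the add_qalys_to_ds source code; all object names are hypothetical.

```r
# Predicted utilities and dates for Participant_10, copied from the table above.
bl_utl_1L_dbl <- 0.6721956
fup_utl_1L_dbl <- 0.4242752
bl_date <- as.Date("2022-11-16")
fup_date <- as.Date("2023-02-21")

# Duration between timepoints in years (assumes 365.25 days per year).
duration_yrs_1L_dbl <- as.numeric(fup_date - bl_date) / 365.25

# With a linear rate of change, QALYs equal the mean of the two utilities
# multiplied by the duration in years (the area under a straight line).
qalys_1L_dbl <- (bl_utl_1L_dbl + fup_utl_1L_dbl) / 2 * duration_yrs_1L_dbl
print(round(qalys_1L_dbl, 7))
```

Under these assumptions the result agrees with the qalys_dbl value reported for Participant_10 at follow-up (0.1455957).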

4.2.8.2 - Example 2: Predict health utility from psychological measures (PHQ-9 and GAD-7)

This tutorial illustrates the main steps for predicting AQoL-6D utility from two psychological measures using a longitudinal dataset in wide format.

The section below renders a vignette article from the youthu library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

This vignette article is an abridged and modified version of another article on predicting Quality Adjusted Life Years with youthu.

Motivation

This article illustrates how to make QALY predictions using a dataset in wide format with no health-utility measures but containing two psychological measures (GAD-7 and PHQ-9).

Install youthu

If not already installed it will be necessary to install the youthu R library. As youthu is not yet available on CRAN, it will be necessary to install it directly from its GitHub repository using an R package like remotes or devtools.

# Uncomment and run if installation is required.
# utils::install.packages("devtools") 
# devtools::install_github("ready4-dev/youthu")

Load required packages

We now load the libraries we will be using in subsequent steps. Note that the ready4, ready4show and ready4use framework libraries will have been installed automatically when youthu was installed. The specific readyforwhatsnext module library and the dplyr, purrr, stringr and tidyr CRAN libraries will have been installed at the same time.

library(ready4)
library(ready4show)
library(ready4use)
library(specific)
library(youthu)

Specify data sources

We begin by specifying the sources for our data. In this example, our data sources are online repositories.

X <- Ready4useRepos(dv_nm_1L_chr = "fakes", dv_ds_nm_1L_chr = "https://doi.org/10.7910/DVN/HJXYKQ", 
                    dv_server_1L_chr = "dataverse.harvard.edu",
                    gh_repo_1L_chr = "ready4-dev/youthu", gh_tag_1L_chr = "v0.0.0.91125")

Inspect dataset

We can now inspect the dataset we will be using to make predictions. As this is a demonstration article, we use synthetic data. Our first step is to ingest a preexisting synthetic dataset (in wide format) using the method explained in another vignette article.

data_tb <- ingest(X, fls_to_ingest_chr = c("ymh_phq_gad_tb"), metadata_1L_lgl = F)

Our resulting dataset has unique IDs for each participant (character class), timestamps for each data collection timepoint (Date class variables) and GAD-7 and PHQ-9 scores for each timepoint (integer class).

data_tb %>% head() %>% ready4show::print_table(caption_1L_chr = "Dataset", output_type_1L_chr = "HTML")

Dataset
fkClientID	d_interview_date_t1	d_interview_date_t2	gad7_t1	gad7_t2	phq9_t1	phq9_t2
Participant_1	2020-03-22	NA	6	NA	7	NA
Participant_2	2020-06-15	NA	12	NA	13	NA
Participant_3	2020-08-20	NA	16	NA	17	NA
Participant_4	2020-05-23	2020-08-19	12	12	17	14
Participant_5	2020-04-05	2020-07-19	14	6	22	8
Participant_6	2020-06-09	NA	8	NA	8	NA

Get mapping models

We retrieve details of relevant AQoL-6D mapping models for either of the predictors we plan on using. How these models were derived is described in a pre-print and details of model performance are included in catalogues available in an open access data repository.

mdls_lup <- get_mdls_lup(ttu_dv_dss_tb = get_ttu_dv_dss("TTU"),
                         utility_type_chr = "AQoL-6D",
                         mdl_predrs_in_ds_chr = c("GAD7 total score", "PHQ9 total score"))

mdls_lup[,c(1,2,5)] %>% 
  ready4show::print_table(caption_1L_chr = "Available models", output_type_1L_chr = "HTML")

Available models
mdl_nms_chr	predrs_ls	source_chr
PHQ9_1_GLM_GSN_LOG	PHQ9	Primary Analysis
PHQ9_1_OLS_CLL	PHQ9	Primary Analysis
GAD7_1_GLM_GSN_LOG	GAD7	Primary Analysis
GAD7_1_OLS_CLL	GAD7	Primary Analysis
PHQ9_SOFAS_1_GLM_GSN_LOG	PHQ9 , SOFAS	Primary Analysis
PHQ9_SOFAS_1_OLS_CLL	PHQ9 , SOFAS	Primary Analysis
GAD7_SOFAS_1_GLM_GSN_LOG	GAD7 , SOFAS	Primary Analysis
GAD7_SOFAS_1_OLS_CLL	GAD7 , SOFAS	Primary Analysis
OASIS_PHQ9_1_GLM_GSN_LOG	OASIS, PHQ9	Secondary Analysis B
OASIS_PHQ9_1_OLS_CLL	OASIS, PHQ9	Secondary Analysis B
GAD7_PHQ9_1_GLM_GSN_LOG	GAD7, PHQ9	Secondary Analysis B
GAD7_PHQ9_1_OLS_CLL	GAD7, PHQ9	Secondary Analysis B
SCARED_PHQ9_1_GLM_GSN_LOG	SCARED, PHQ9	Secondary Analysis B
SCARED_PHQ9_1_OLS_CLL	SCARED, PHQ9	Secondary Analysis B
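Several of the models listed above also require predictors (such as SOFAS or SCARED) that our dataset does not contain. One way to narrow the list to models that can actually be applied is sketched below in base R. The toy_lup data frame is a hypothetical, hand-built stand-in for a few rows of the table above, and storing the predictors as a list column is an assumption suggested by the predrs_ls column name rather than something confirmed here.

```r
# Hypothetical stand-in for part of the models lookup table shown above.
toy_lup <- data.frame(
  mdl_nms_chr = c("PHQ9_1_OLS_CLL", "GAD7_1_OLS_CLL",
                  "PHQ9_SOFAS_1_OLS_CLL", "GAD7_PHQ9_1_OLS_CLL",
                  "SCARED_PHQ9_1_OLS_CLL"),
  stringsAsFactors = FALSE
)
# Assumed structure: one character vector of predictor names per model.
toy_lup$predrs_ls <- list("PHQ9", "GAD7", c("PHQ9", "SOFAS"),
                          c("GAD7", "PHQ9"), c("SCARED", "PHQ9"))

# Predictors that are present in our prediction dataset.
available_predrs_chr <- c("GAD7", "PHQ9")

# Keep only models whose predictors are all available.
usable_lgl <- vapply(toy_lup$predrs_ls,
                     function(predrs_chr) all(predrs_chr %in% available_predrs_chr),
                     logical(1))
usable_mdls_chr <- toy_lup$mdl_nms_chr[usable_lgl]
print(usable_mdls_chr)
```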

We select our preferred model and retrieve summary data about the model’s predictor variables.

predictors_lup <- get_predictors_lup(mdls_lup = mdls_lup, mdl_nm_1L_chr = "GAD7_PHQ9_1_OLS_CLL")

exhibit(predictors_lup)

Variable	Description	Minimum	Maximum	Class	Increment	Function	Scaling	Covariate
GAD7	GAD7 total score	0	21	integer	1	youthvars::youthvars_gad7	0.01	FALSE
PHQ9	PHQ9 total score	0	27	integer	1	youthvars::youthvars_phq9	0.01	FALSE

Transform prediction dataset

To be used with the mapping models available to us, our prediction dataset needs to be in long format. We perform the necessary transformation.

data_tb <- transform_ds_to_long(data_tb, predictors_chr = c("gad7", "phq9"),
                             msrmnt_date_var_nm_1L_chr = "d_interview_date", round_var_nm_1L_chr = "When")
#> Joining with `by = join_by(case_id, fkClientID, When)`
#> Joining with `by = join_by(case_id, fkClientID, When)`
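
As a rough illustration of what a wide-to-long transformation involves, the sketch below reshapes a small invented dataset with base R. It is not the implementation of transform_ds_to_long; the two participants, their scores and the _t1 / _t2 column suffixes are assumptions made for this example only.

```r
# Hypothetical wide dataset: one row per participant, one column per measure and round.
wide_tb <- data.frame(fkClientID = c("Participant_A", "Participant_B"),
                      gad7_t1 = c(15, 13), gad7_t2 = c(13, 10),
                      phq9_t1 = c(17, 13), phq9_t2 = c(18, 10))

# Reshape so that each row is one participant at one data collection round.
long_tb <- reshape(wide_tb,
                   direction = "long",
                   idvar = "fkClientID",
                   varying = list(c("gad7_t1", "gad7_t2"),
                                  c("phq9_t1", "phq9_t2")),
                   v.names = c("gad7", "phq9"),
                   timevar = "When",
                   times = c("t1", "t2"))
row.names(long_tb) <- NULL
long_tb
```

The long dataset has two rows per participant, with the round identified by the When column.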

We drop records where we are missing data for either GAD7 or PHQ9 at either timepoint.

data_tb <- transform_ds_to_drop_msng(data_tb, predictors_chr = c("gad7", "phq9"), 
                                      uid_var_nm_1L_chr = "fkClientID")
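
The rule described above (drop a participant entirely if any predictor is missing at any timepoint) can be sketched in base R as follows. This is an illustration on invented data, not the code used by transform_ds_to_drop_msng.

```r
# Hypothetical long dataset with some missing predictor values.
long_tb <- data.frame(fkClientID = rep(c("Participant_A", "Participant_B", "Participant_C"), each = 2),
                      When = rep(c("t1", "t2"), times = 3),
                      gad7 = c(15, 13, NA, 10, 6, 8),
                      phq9 = c(17, 18, 13, 10, 16, NA))

# Identify participants with at least one incomplete record ...
incomplete_ids_chr <- unique(long_tb$fkClientID[!complete.cases(long_tb[, c("gad7", "phq9")])])

# ... and drop all of their records, not just the incomplete rows.
complete_tb <- long_tb[!long_tb$fkClientID %in% incomplete_ids_chr, ]
complete_tb
```

Dropping at the participant level rather than the row level keeps only cases with both baseline and follow-up values, which the subsequent QALY calculation needs.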

We now predict AQoL-6D health utility for each case with complete data.

predn_ds_ls <- make_predn_metadata_ls(data_tb,
                                      id_var_nm_1L_chr = "fkClientID",
                                      msrmnt_date_var_nm_1L_chr = "d_interview_date",
                                      predr_vars_nms_chr = c(GAD7 = "gad7", PHQ9 = "phq9"),
                                      round_var_nm_1L_chr = "When",
                                      round_bl_val_1L_chr = "t1",
                                      utl_var_nm_1L_chr = "AQoL6D_HU",
                                      mdls_lup = mdls_lup,
                                      mdl_nm_1L_chr = "GAD7_PHQ9_1_OLS_CLL")
data_tb <- add_utl_predn(data_tb, new_data_is_1L_chr = "Predicted", predn_ds_ls = predn_ds_ls)
#> Joining with `by = join_by(fkClientID, When)`

Finally, we derive QALY predictions from the health utility measures at both time-points.

data_tb <- data_tb %>% add_qalys_to_ds(predn_ds_ls = predn_ds_ls, include_predrs_1L_lgl = F, reshape_1L_lgl = T)

data_tb %>% head() %>%
  ready4show::print_table(caption_1L_chr = "Final dataset", output_type_1L_chr = "HTML",
                          scroll_box_args_ls = list(width = "100%"))

Final dataset
fkClientID	d_interview_date_t1	d_interview_date_t2	gad7_t1	gad7_t2	phq9_t1	phq9_t2	AQoL6D_HU_t1	AQoL6D_HU_t2	AQoL6D_HU_change_dbl_t1	AQoL6D_HU_change_dbl_t2	duration_prd_t1	duration_prd_t2	qalys_dbl_t1	qalys_dbl_t2
Participant_10	2020-08-05	2020-11-07	15	13	17	18	0.3891806	0.6342526	0	0.2450720	0S	94d 0H 0M 0S	0	0.1316943
Participant_1000	2020-09-06	2020-12-20	13	10	13	10	0.6609298	0.2963083	0	-0.3646215	0S	105d 0H 0M 0S	0	0.1375907
Participant_1001	2020-07-05	2020-10-15	10	11	10	16	0.5324127	0.6192971	0	0.0868844	0S	102d 0H 0M 0S	0	0.1608137
Participant_1003	2020-05-18	2020-08-12	6	8	16	7	0.5630164	0.8584193	0	0.2954030	0S	86d 0H 0M 0S	0	0.1673422
Participant_1005	2020-05-09	2020-08-25	14	5	20	9	0.5090272	0.7799675	0	0.2709403	0S	108d 0H 0M 0S	0	0.1905701
Participant_1006	2020-05-29	2020-08-25	15	9	21	17	0.2969778	0.2734973	0	-0.0234805	0S	88d 0H 0M 0S	0	0.0687225
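
The QALY values in the table above are consistent with a simple area under the curve calculation: the mean of the two utility values multiplied by the time between measurements expressed in years. The sketch below applies that arithmetic to the first two rows. The 365.25 days per year divisor is inferred from the tabulated output and is an assumption, not a documented detail of add_qalys_to_ds.

```r
# Values copied from the first two rows of the final dataset.
utility_t1_dbl <- c(0.3891806, 0.6609298)
utility_t2_dbl <- c(0.6342526, 0.2963083)
days_between_dbl <- c(94, 105)

# Trapezoid rule: average utility over the interval, weighted by interval length in years.
qalys_dbl <- (utility_t1_dbl + utility_t2_dbl) / 2 * days_between_dbl / 365.25
round(qalys_dbl, 7)
```

The results can be compared with the qalys_dbl_t2 column (0.1316943 and 0.1375907).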

4.2.9 - Use utility mapping algorithms to help implement cost-utility analyses

Using tools (soon to be formalised into ready4 framework modules) from the youthu R package, it is possible to use utility mapping algorithms to help implement cost-utility analyses. This tutorial illustrates the main steps for doing so using psychological and functional measures collected on clinical samples of young people.

The section below renders a vignette article from the youthu library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

library(youthu)
library(ggplot2)
library(ready4use)
set.seed(1234)

This vignette illustrates the rationale for and practical decision-making utility of youthu’s QALYs prediction workflow. Note, this example is illustrated with fake data and should not be used to inform decision-making.

Motivation

The main motivation behind the youthu package is to extend the types of economic analysis that can be undertaken with both single group (e.g. pilot study, health service records) and matched groups (e.g. trial) longitudinal datasets that do not include measures of health utility. This article focuses on its application to matched group datasets.

Example dataset

First, we must import our data. In this example we will use a fake dataset.

ds_tb <- make_fake_ds_two()
#> Joining with `by = join_by(fkClientID, study_arm_chr)`

Our dataset includes 268 matched comparisons, with each comparison containing baseline and follow-up records for one intervention arm participant and one control arm participant. The first few records are as follows.

First few records from input dataset
fkClientID	round	date_psx	duration_prd	PHQ9	SOFAS	costs_dbl	study_arm_chr	match_idx_int
Participant_20	Baseline	2023-07-04	0S	16	41	301.1868	Intervention	1
Participant_593	Baseline	2023-05-11	0S	19	43	259.3190	Control	1
Participant_593	Follow-up	2023-11-02	175d 0H 0M 0S	16	65	1290.4220	Control	1
Participant_20	Follow-up	2023-12-29	178d 0H 0M 0S	15	74	1787.4242	Intervention	1
Participant_259	Baseline	2023-08-29	0S	19	39	311.0018	Control	2
Participant_962	Baseline	2023-10-11	0S	10	45	276.2181	Intervention	2

This dataset contains features that make it possible to use in conjunction with youthu’s economic analysis functions. These requirements are described in the vignette about finding and using compatible models to predict QALYs.

The dataset also contains a cost variable, which is a requirement for most, though not all, of the economic analyses that can be undertaken with youthu.

Limitations of datasets without measures of health utility

A notable omission from the dataset is any measure of utility. This omission means that, in the absence of using mapping algorithms such as those included with youthu, the most feasible types of economic evaluation to apply to this dataset would likely be cost-consequence analysis (where a synopsis of the differences in a range of measures are presented alongside cost differences) and cost-effectiveness analysis (where a summary statistic - the incremental cost-effectiveness ratio or ICER - is calculated by dividing differences in costs by differences in a single outcome measure).

These types of economic analyses can be relatively simple to interpret if either the intervention or control arm is simultaneously cheaper and more effective across all included outcome measures. However, these conditions don’t hold in our sample data.

summary((ds_tb %>% dplyr::filter(study_arm_chr == "Control" & round == "Baseline"))[5:6])
#>       PHQ9          SOFAS      
#>  Min.   : 0.0   Min.   :39.00  
#>  1st Qu.: 7.0   1st Qu.:60.00  
#>  Median :12.0   Median :66.00  
#>  Mean   :10.9   Mean   :66.13  
#>  3rd Qu.:15.0   3rd Qu.:72.00  
#>  Max.   :19.0   Max.   :89.00

summary((ds_tb %>% dplyr::filter(study_arm_chr == "Control" & round == "Follow-up"))[5:7])
#>       PHQ9            SOFAS         costs_dbl     
#>  Min.   : 0.000   Min.   :39.00   Min.   : 889.9  
#>  1st Qu.: 4.000   1st Qu.:64.00   1st Qu.:1321.1  
#>  Median : 8.000   Median :71.00   Median :1486.7  
#>  Mean   : 8.493   Mean   :70.65   Mean   :1489.0  
#>  3rd Qu.:13.000   3rd Qu.:77.00   3rd Qu.:1627.0  
#>  Max.   :27.000   Max.   :98.00   Max.   :2216.5

summary((ds_tb %>% dplyr::filter(study_arm_chr == "Intervention" & round == "Baseline"))[5:6])
#>       PHQ9           SOFAS      
#>  Min.   : 0.00   Min.   :36.00  
#>  1st Qu.: 7.00   1st Qu.:61.00  
#>  Median :11.00   Median :67.00  
#>  Mean   :10.81   Mean   :66.74  
#>  3rd Qu.:15.00   3rd Qu.:72.25  
#>  Max.   :19.00   Max.   :88.00

summary((ds_tb %>% dplyr::filter(study_arm_chr == "Intervention" & round == "Follow-up"))[5:7])
#>       PHQ9            SOFAS      costs_dbl     
#>  Min.   : 0.000   Min.   :40   Min.   : 923.4  
#>  1st Qu.: 2.000   1st Qu.:60   1st Qu.:1625.6  
#>  Median : 6.500   Median :68   Median :1777.3  
#>  Mean   : 6.851   Mean   :68   Mean   :1807.8  
#>  3rd Qu.:11.000   3rd Qu.:77   3rd Qu.:1996.0  
#>  Max.   :25.000   Max.   :93   Max.   :2872.7
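
To make the comparison between arms easier to read, the sketch below condenses the mean values printed above into incremental differences. It is illustrative only: it uses the rounded means from the summaries, and it compares follow-up costs alone because baseline costs are not included in those summaries.

```r
# Mean values copied from the summary output above.
control_dbl <- c(phq9_bl = 10.9, phq9_fup = 8.493, sofas_bl = 66.13, sofas_fup = 70.65, costs_fup = 1489.0)
intervention_dbl <- c(phq9_bl = 10.81, phq9_fup = 6.851, sofas_bl = 66.74, sofas_fup = 68, costs_fup = 1807.8)

# Mean change from baseline to follow-up in each arm.
change_in <- function(x, measure) unname(x[paste0(measure, "_fup")] - x[paste0(measure, "_bl")])

# Incremental differences (Intervention minus Control).
incr_phq9_dbl <- change_in(intervention_dbl, "phq9") - change_in(control_dbl, "phq9")    # negative favours Intervention
incr_sofas_dbl <- change_in(intervention_dbl, "sofas") - change_in(control_dbl, "sofas") # negative favours Control
incr_costs_dbl <- unname(intervention_dbl["costs_fup"] - control_dbl["costs_fup"])

# Incremental cost per additional one-point reduction in PHQ-9.
icer_phq9_dbl <- incr_costs_dbl / -incr_phq9_dbl
c(incr_phq9 = incr_phq9_dbl, incr_sofas = incr_sofas_dbl, incr_costs = incr_costs_dbl, icer_phq9 = icer_phq9_dbl)
```

On these figures the Intervention arm shows a larger PHQ-9 improvement and higher costs, while the Control arm shows a larger SOFAS improvement.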

The pattern of results summarised above creates some significant barriers to meaningfully interpreting economic evaluations that are based on cost-consequence or cost-effectiveness analysis:

A cost-effectiveness analysis in which change in PHQ-9 was the benefit measure would be difficult to interpret as the Intervention arm is both more effective and more costly, which raises the question: is it worth paying the extra dollars for this improvement? Also, would a judgment of cost-effectiveness remain the same if the study had measured a slightly different incremental benefit or recorded change over a longer or shorter time horizon? It is likely that there is no commonly used value for money benchmark for improvements measured in PHQ-9, nor is there any time weighting associated with the measure. Furthermore, if the potential funding for the intervention is from a budget that is allocated to non-depressive illnesses (e.g. physical health), results from a cost-effectiveness analysis using PHQ-9 as its benefit measure are not readily comparable with economic evaluations of interventions from other illness groups using different benefit measures that are potentially competing for the same scarce funding.
A cost-consequence analysis that summarised the differences in costs alongside the differences in changes in PHQ-9 and SOFAS scores would be difficult to interpret because while the intervention is more effective than control for improvements measured on PHQ-9 (where lower scores are better), the control group is superior if benefits are based on functioning improvements as measured by SOFAS scores (where higher scores are better). The lack of any formal weighting for how to trade off clinical symptoms and functioning means that interpretation of this analysis will be highly subjective and likely to change across potential decision makers.

These types of shortcomings can be substantially addressed by undertaking cost-utility analyses (CUAs) as:

they use a measure of benefit - the Quality Adjusted Life Year (QALY) - that captures multiple domains of health, weighted by time and population preferences in a single index measure that can be applied across health conditions;
there are published benchmark willingness to pay values for QALYs that are routinely used by decision makers in many countries to make ICER statistics readily interpretable in the context of health budget allocation.

The rest of this article demonstrates how youthu functions can be used to undertake CUA based analyses on the type of data we have just profiled.

Using youthu in a cost-utility analysis workflow

Predict adolescent AQoL-6D health utility

Our first step is to identify which youthu models we will use to predict adolescent AQoL-6D and apply these models to our data. This step was explained in more detail in another vignette article about finding and using transfer to utility models, so will be dealt with briefly here.

We ingest metadata about the mapping models we plan to use. NOTE: This is a temporary step that is required due to the metadata file not being in the study online repository. This code will cease to work once the metadata file has been moved from its temporary location to the study dataset. We will perform this task when an associated manuscript exits its current review process.

mdl_meta_data_ls <- ingest(Ready4useRepos(gh_repo_1L_chr = "ready4-dev/youthu", gh_tag_1L_chr = "v0.0.0.91125"), fls_to_ingest_chr = c("mdl_meta_data_ls"), metadata_1L_lgl = F)

We now make sure that our dataset can be used as a prediction dataset in conjunction with the model we intend to use.

predn_ds_ls <- make_predn_metadata_ls(ds_tb,
                                      cmprsn_groups_chr = c("Intervention", "Control"),
                                      cmprsn_var_nm_1L_chr = "study_arm_chr",
                                      costs_var_nm_1L_chr = "costs_dbl",
                                      id_var_nm_1L_chr = "fkClientID",
                                      mdl_meta_data_ls = mdl_meta_data_ls,
                                      msrmnt_date_var_nm_1L_chr = "date_psx",
                                      round_var_nm_1L_chr = "round",
                                      round_bl_val_1L_chr = "Baseline",
                                      utl_var_nm_1L_chr = "AQoL6D_HU",
                                      mdls_lup = get_mdls_lup(utility_type_chr = "AQoL-6D",
                                                              mdl_predrs_in_ds_chr = c("PHQ9 total score",
                                                                                       "SOFAS total score"),
                                                              ttu_dv_nms_chr = "TTU"),
                                      mdl_nm_1L_chr =  "PHQ9_SOFAS_1_OLS_CLL")

We now use our preferred model to predict health utility from the measures in our dataset.

ds_tb <- add_utl_predn(ds_tb,
                       predn_ds_ls = predn_ds_ls) %>%
  dplyr::select(fkClientID, round, study_arm_chr, date_psx, duration_prd, dplyr::everything())
#> Joining with `by = join_by(fkClientID, round)`

Calculate QALYs

Next we combine the health utility data with data on the interval between measurements to calculate QALYs and add them to the dataset.

ds_tb  <- ds_tb %>% add_qalys_to_ds(predn_ds_ls = predn_ds_ls,
                                    include_predrs_1L_lgl = T,
                                    reshape_1L_lgl = T)

First few records from updated dataset with QALYs
fkClientID	study_arm_chr	match_idx_int	date_psx_Baseline	date_psx_Follow-up	duration_prd_Baseline	duration_prd_Follow-up	costs_dbl_Baseline	costs_dbl_Follow-up	PHQ9_Baseline	PHQ9_Follow-up	SOFAS_Baseline	SOFAS_Follow-up	AQoL6D_HU_Baseline	AQoL6D_HU_Follow-up	PHQ9_change_dbl_Baseline	PHQ9_change_dbl_Follow-up	SOFAS_change_dbl_Baseline	SOFAS_change_dbl_Follow-up	AQoL6D_HU_change_dbl_Baseline	AQoL6D_HU_change_dbl_Follow-up	qalys_dbl_Baseline	qalys_dbl_Follow-up
Participant_10	Control	243	2023-04-19	2023-10-13	0S	177d 0H 0M 0S	647.9386	1696.235	8	10	61	64	0.7597988	0.6079774	0	2	0	3	0	-0.1518214	0	0.3314119
Participant_1000	Control	191	2023-06-15	2023-12-16	0S	184d 0H 0M 0S	428.9205	1619.037	4	2	63	82	0.8459579	0.7688131	0	-2	0	19	0	-0.0771448	0	0.4067322
Participant_1001	Intervention	230	2023-05-10	2023-11-05	0S	179d 0H 0M 0S	429.3703	1844.219	10	14	59	72	0.6138300	0.8607305	0	4	0	13	0	0.2469005	0	0.3613228
Participant_1003	Intervention	115	2023-06-08	2023-12-07	0S	182d 0H 0M 0S	395.1637	1537.365	9	0	71	81	0.5808015	0.9315788	0	-9	0	10	0	0.3507773	0	0.3768011
Participant_1005	Intervention	183	2023-09-09	2024-03-13	0S	186d 0H 0M 0S	402.9910	1826.511	17	0	78	88	0.5460607	0.9593811	0	-17	0	10	0	0.4133204	0	0.3833158
Participant_1006	Intervention	219	2023-10-05	2024-04-01	0S	179d 0H 0M 0S	534.2285	2401.478	9	14	75	73	0.7239490	0.5885972	0	5	0	-2	0	-0.1353518	0	0.3216232

Analyse results

Now we can run the main economic analysis. This is implemented by the make_hlth_ec_smry function, which first bootstraps the dataset (implemented by the boot function from the boot package) before passing the mean values for costs and QALYs from each bootstrap sample to the bcea function of the BCEA package to calculate a range of health economic statistics. For this example we pass a value of 50,000 for the willingness to pay parameter, as this is the dollar amount commonly used in Australia as a benchmark for the value of a QALY.

Note, for this illustrative example we only request 1000 bootstrap iterations - in practice this number may be higher.

he_smry_ls <- ds_tb %>% make_hlth_ec_smry(predn_ds_ls = predn_ds_ls,
                                                 wtp_dbl = 50000,
                                                 bootstrap_iters_1L_int = 1000L)
#> Warning: There was 1 warning in `dplyr::summarise()`.
#> ℹ In argument: `dplyr::across(.fns = mean)`.
#> Caused by warning:
#> ! Using `across()` without supplying `.cols` was deprecated in dplyr 1.1.0.
#> ℹ Please supply `.cols` instead.

As the output of the make_hlth_ec_smry function includes a BCEA object, we can use the BCEA package to produce a number of graphical summaries of economic results. One of the most important is the cost-effectiveness plane. This plot highlights that, with an ICER of $-98,145.56, less than half of the bootstrapped iteration incremental cost and QALY pairs fall within the zone of cost-effectiveness (green). In fact, at the cost-effectiveness threshold we supplied, the results suggest there is an 8% probability that the intervention is cost-effective.

BCEA::ceplane.plot(he_smry_ls$ce_res_ls, wtp =50000,  graph = "ggplot2", theme = ggplot2::theme_light())
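
To show the logic behind a probability of cost-effectiveness, the following base R sketch bootstraps incremental costs and QALYs and counts the share of iterations with a positive incremental net monetary benefit at the willingness to pay threshold. It is a simplified illustration, not the code inside make_hlth_ec_smry, boot or BCEA: the per-participant costs and QALYs are simulated from invented parameters, and resampling matched pairs together is an assumption of this example.

```r
set.seed(1234)
n_pairs_int <- 268L   # number of matched comparisons, as in the example dataset
wtp_dbl <- 50000      # willingness to pay per QALY
iters_int <- 1000L    # bootstrap iterations

# Invented (fake) per-participant totals for each arm.
costs_int_dbl <- rnorm(n_pairs_int, mean = 1800, sd = 300)
costs_ctl_dbl <- rnorm(n_pairs_int, mean = 1500, sd = 250)
qalys_int_dbl <- rnorm(n_pairs_int, mean = 0.360, sd = 0.05)
qalys_ctl_dbl <- rnorm(n_pairs_int, mean = 0.355, sd = 0.05)

# Each iteration resamples matched pairs with replacement and records
# the incremental mean cost and incremental mean QALYs.
boot_mat <- t(replicate(iters_int, {
  idx_int <- sample.int(n_pairs_int, replace = TRUE)
  c(delta_cost = mean(costs_int_dbl[idx_int]) - mean(costs_ctl_dbl[idx_int]),
    delta_qaly = mean(qalys_int_dbl[idx_int]) - mean(qalys_ctl_dbl[idx_int]))
}))

# Incremental net monetary benefit: positive values mean cost-effective at this threshold.
inmb_dbl <- wtp_dbl * boot_mat[, "delta_qaly"] - boot_mat[, "delta_cost"]
prob_ce_dbl <- mean(inmb_dbl > 0)
icer_dbl <- mean(boot_mat[, "delta_cost"]) / mean(boot_mat[, "delta_qaly"])
prob_ce_dbl
```

Each row of boot_mat corresponds to one point on a cost-effectiveness plane, and prob_ce_dbl is the proportion of those points that fall on the cost-effective side of the willingness to pay line.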

4.2.10 - Develop choice models

Using tools (soon to be formalised into ready4 framework modules) from the mychoice R package, it is possible to develop choice models from responses to a discrete choice experiment survey.

The section below renders a vignette article from the mychoice library. You can use the following links to:

view the vignette on the library website (adds useful hyperlinks to code blocks);
view the source file for that article; and
edit its contents (requires a GitHub account).

library(mychoice)
#> The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
#> which was just loaded, will retire in October 2023.
#> Please refer to R-spatial evolution reports for details, especially
#> https://r-spatial.org/r/2023/05/15/evolution4.html.
#> It may be desirable to make the sf package available;
#> package maintainers should consider adding sf to Suggests:.
#> The sp package is now running under evolution status 2
#>      (status 2 uses the sf package in place of rgdal)

The tools in mychoice are designed to make it easier to develop and use choice models with ready4 - an open source health economic model of the systems shaping mental health and wellbeing in young people.

This development version of the mychoice package has been made available as part of the process of testing and documenting the package.

Currently there are no vignettes available. However, examples of the application of mychoice functions to a real world discrete choice experiment are in programs available at https://doi.org/10.5281/zenodo.6626256 (design of a discrete choice experiment survey) and https://doi.org/10.5281/zenodo.7223286 (analysis of discrete choice experiment survey responses). PDF versions of each program, along with the artefacts produced by each are available in the online dataset at https://doi.org/10.7910/DVN/VGPIPS.

5 - Programs for replicating study designs and analyses

R Markdown programs combine modules of the readyforwhatsnext model with compatible datasets to implement reproducible and replicable analyses of youth mental health policy and system design topics.

5.1 - Finding programs

Programs are used to generate and report a model analysis.

What are programs?

Programs can be executed in their current form without the need for additional input data and, unless modified or run interactively (prompting a user for inputs during execution), will always generate the exact same output. They are typically deployed for configuring the run specifications of a computational model, specifying the data to which it will be applied and reporting analysis results.

What are they used for in readyforwhatsnext?

readyforwhatsnext programs can be used for the following purposes:

to reproduce a study analysis, in which case you will need access to the original study data, and may also need to modify the program to specify the path to this data from your machine;
to replicate a study analysis (i.e. to apply the study algorithm to similar but different input data [this can be a new sample from the same population or, if used for demonstration purposes, fake data representative of the original study dataset]), in which case you will need to modify the program to specify the path to this data; and
to transfer a study analysis, in which case you use the program as a template to be modified to reflect key differences between the original study and your study.

Current readyforwhatsnext programs

Currently available readyforwhatsnext programs are summarised in the below table.

Program	Release	Date	Description	Source
aqol6dmap_fakes	0.0.9.0	02-Mar-2022	This program generates a purely synthetic (i.e. fake - no trace of any real records) population that is reasonably representative of the input data we used for the utility mapping study described in the article https://doi.org/10.1101/2021.07.07.21260129.	Dev, Archive
aqol6dmap_use	0.1	13-Sep-2022	Apply AQoL-6D Utility Mapping Models To New Data. This release includes a minor formatting change and an updated version number.	Dev, Archive
dce_sa_analysis	0.1.1	27-Oct-2022	A self-documenting R Markdown program for analysing responses to a discrete choice experiment exploring the online help-seeking preferences of socially anxious young people.	Dev, Archive
dce_sa_design	0.0.9.3	26-Oct-2022	An R Markdown program to create the experimental design for a Discrete Choice Experiment (DCE) exploring online help seeking in socially anxious young people. This release uses functions from the mychoice R package (https://github.com/ready4-dev/mychoice).	Dev, Archive
ttu_lng_aqol6_csp	0.1	16-Sep-2022	Complete study program to reproduce all steps from data ingest through to results dissemination for a study to map mental health measures to AQoL-6D health utility.	Dev, Archive

Documentation

readyforwhatsnext programs are typically self-documenting, meaning that each section of code is integrated with plain English descriptions of the purpose it fulfills. The only programs that are not self-documenting are those whose primary purpose is to produce a document (normally an analysis report). Self-documenting programs and sub-routines will be typically documented as a PDF or HTML render of the RMarkdown source file. This rendered document will be bundled with the program, but in some cases may also be shared in online data repositories.

5.2 - Using readyforwhatsnext programs

The code used when applying readyforwhatsnext modules to a number of real world youth mental health policy and research projects is publicly available for review and reuse.

5.2.1 - Model health utility

Replication programs for developing, finding and applying utility mapping algorithms.

5.2.1.1 - Develop health utility mapping algorithms

Using modules from the TTU, youthvars, scorz and specific libraries, we developed utility mapping algorithms from a sample of young people attending primary mental health care services.

The section below embeds a PDF version of an R Markdown program. The following alternative options may provide an improved viewing experience, more contextual information and access to more useful code formats:

5.2.1.2 - Predict health utility

Using functions (soon to be formalised into ready4 framework modules) from the youthu R package, we predicted health utility for a synthetic population of young people attending primary mental health care services.

The section below embeds a PDF version of an R Markdown program. The following alternative options may provide an improved viewing experience, more contextual information and access to more useful code formats:

5.2.2 - Model youth choices

Replication programs for designing, analysing and reporting discrete choice experiments.

5.2.2.1 - Design a Discrete Choice Experiment

We used functions (soon to be formalised into ready4 modules) from the mychoice R package to design a discrete choice experiment.

The section below embeds a PDF version of an R Markdown program. The following alternative options may provide an improved viewing experience, more contextual information and access to more useful code formats:

5.2.2.2 - Analyse the results of a Discrete Choice Experiment

Using functions (soon to be formalised into ready4 framework modules) from the mychoice R package, it is possible to develop choice models from responses to a discrete choice experiment survey.

The section below embeds a PDF version of an R Markdown program. The following alternative options may provide an improved viewing experience, more contextual information and access to more useful code formats:

5.2.3 - Create synthetic populations

Replication programs for constructing synthetic populations.

5.2.3.1 - Create a synthetic population of young people attending primary mental health services

We created a basic synthetic dataset to represent a clinical youth mental health sample.

The section below renders an R Markdown program. The following alternative options may provide an improved viewing experience, more contextual information and access to more useful code formats:

Introduction

This program generates a purely synthetic (i.e. fake - no trace of any real records) population that is reasonably representative of the input data we used for the utility mapping study described in the article https://doi.org/10.1101/2021.07.07.21260129.

No access to the real data is required in order to use this program - it is based on summary statistics (e.g. means and standard deviations of variables, correlation matrices). It should be noted however, that a different (and simpler) workflow can be implemented when you do have access to the source dataset (for example, by using the syn function from the synthpop package).

The output of this program is very similar but not identical to a fake dataset created by an earlier version of this program and which is saved in the “ymh_clinical_dict_r3.RDS” file from the https://doi.org/10.7910/DVN/HJXYKQ data repository.

Install required R packages

If you do not have the following packages already installed, uncomment and run the following lines.

# install.packages("faux")
# devtools::install_github("ready4-dev/ready4")
# devtools::install_github("ready4-dev/youthvars")
# devtools::install_github("ready4-dev/scorz")
# devtools::install_github("ready4-dev/specific")
# devtools::install_github("ready4-dev/TTU")
# devtools::install_github("ready4-dev/youthu")

Load the ready4 framework package.

library(ready4)

Specify parameters to generate fake outcome data

AQoL item response parameters

The first set of input data are the proportions for each allowed response for each of the twenty AQoL-6D questions at both baseline and follow-up.

aqol_items_prpns_tbs_ls <- list(bl_answer_props_tb = tibble::tribble(
    ~Question, ~Answer_1, ~Answer_2, ~Answer_3, ~Answer_4, ~Answer_5, ~Answer_6,
    "Q1", 0.35, 0.38, 0.16, 0.03, NA_real_,100, # Check item 5 in real data.
    "Q2", 0.28, 0.38, 0.18, 0.08, 0.04,100,
    "Q3", 0.78, 0.18, 0.03, 0.01, 0.0, 100,
    "Q4", 0.64, 0.23, 0.09, 0.0, 100, NA_real_,
    "Q5", 0.3, 0.48, 0.12, 0.05, 100, NA_real_,
    "Q6", 0.33, 0.48, 0.15, 100, NA_real_,NA_real_,
    "Q7", 0.44, 0.27, 0.11, 100, NA_real_, NA_real_,
    "Q8", 0.18, 0.29, 0.23, 0.21, 100, NA_real_,
    "Q9", 0.07, 0.27, 0.19, 0.37, 100, NA_real_,
    "Q10", 0.04, 0.15, 0.4, 0.25, 100, NA_real_,
    "Q11", 0.03, 0.13, 0.52, 0.25, 100, NA_real_,
    "Q12", 0.06, 0.21, 0.25, 0.34, 100, NA_real_,
    "Q13", 0.05, 0.25, 0.31, 0.28, 100, NA_real_,
    "Q14", 0.05, 0.3, 0.34, 0.25, 100, NA_real_,
    "Q15", 0.57, 0.25, 0.12, 100, NA_real_,NA_real_,
    "Q16", 0.48, 0.42, 0.06, 100, NA_real_, NA_real_,
    "Q17", 0.44, 0.3, 0.16, 0.07, 100, NA_real_,
    "Q18", 0.33, 0.38, 0.25, 0.04, 0.0, 100,
    "Q19", 0.33, 0.49, 0.16, 0.02, 0.0, 100,
    "Q20", 0.67, 0.21, 0.02, 100, NA_real_,NA_real_),
    fup_answer_props_tb = tibble::tribble(
    ~Question, ~Answer_1, ~Answer_2, ~Answer_3, ~Answer_4, ~Answer_5, ~Answer_6,
    "Q1", 0.51, 0.33, 0.12, 0.02, NA_real_, 100,
    "Q2", 0.36, 0.38, 0.16, 0.06, 0.02,100,
    "Q3", 0.81, 0.15, 0.04, 0.00, 0.0, 100,
    "Q4", 0.73, 0.18, 0.09, 0.0, 100, NA_real_,
    "Q5", 0.36, 0.42, 0.12, 0.05, 100, NA_real_,
    "Q6", 0.48, 0.40, 0.11, 100, NA_real_,NA_real_,
    "Q7", 0.57, 0.25, 0.09, 100, NA_real_, NA_real_,
    "Q8", 0.31, 0.33, 0.17, 0.12, 100, NA_real_,
    "Q9", 0.13, 0.35, 0.19, 0.23, 100, NA_real_,
    "Q10", 0.1, 0.21, 0.43, 0.16, 100, NA_real_,
    "Q11", 0.06, 0.25, 0.48, 0.18, 100, NA_real_,
    "Q12", 0.08, 0.27, 0.26, 0.25, 100, NA_real_,
    "Q13", 0.07, 0.37, 0.31, 0.19, 100, NA_real_,
    "Q14", 0.08, 0.37, 0.34, 0.15, 100, NA_real_,
    "Q15", 0.62, 0.23, 0.09, 100, NA_real_,NA_real_,
    "Q16", 0.52, 0.40, 0.06, 100, NA_real_, NA_real_,
    "Q17", 0.51, 0.28, 0.15, 0.06, 100, NA_real_,
    "Q18", 0.37, 0.35, 0.25, 0.03, 0.0, 100,
    "Q19", 0.43, 0.40, 0.16, 0.01, 0.0, 100,
    "Q20", 0.77, 0.21, 0.02, 100, NA_real_,NA_real_)) %>%
  youthvars::make_complete_prpns_tbs_ls()

Outcome variable correlation parameters

First we specify the names of variables we will be creating as outcome variables.

var_names_chr <- c("aqol6d_total_w","phq9_total","bads_total",
                   "gad7_total","oasis_total","scared_total","k6_total")

The next step is to specify the correlations between outcome variables (variables assumed to be ordered as in previous step) at baseline and follow-up timepoints.

cor_mat_ls <- list(matrix(c(1,-0.78,0.72,-0.67,-0.71,-0.65,-0.67,
                               NA,1,-0.73,0.69,0.66,0.63,0.71,
                               NA,NA,1,-.57,-0.64,-0.57,-0.65,
                               NA,NA,NA,1,0.74,0.70,0.63,
                               NA,NA,NA,NA,1,0.7,0.59,
                               NA,NA,NA,NA,NA,1,0.55,
                               NA,NA,NA,NA,NA,NA,1),7,7),
                    matrix(c(1,-0.81,0.72,-0.71,-0.73,-0.64,-0.68,
                        NA,1,-0.72,0.69,0.68,0.61,0.68,
                        NA,NA,1,-0.59,-0.61,-0.51,-0.61,
                        NA,NA,NA,1,0.75,0.71,0.6,
                        NA,NA,NA,NA,1,0.68,0.59,
                        NA,NA,NA,NA,NA,1,0.52,
                        NA,NA,NA,NA,NA,NA,1),7,7))
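
Because R fills matrices column by column, each matrix above has its correlations in the lower triangle and NA values in the upper triangle. The sketch below, using only the first three variables from the baseline matrix, shows one way to mirror the lower triangle and check that the result is a valid correlation matrix. Whether the downstream functions perform this step internally is not shown here, so treat this as an illustrative check only.

```r
# First three baseline variables: aqol6d_total_w, phq9_total, bads_total.
cor_mat <- matrix(c(1, -0.78, 0.72,
                    NA, 1, -0.73,
                    NA, NA, 1), 3, 3)

# Copy the lower triangle into the (NA) upper triangle.
cor_mat[upper.tri(cor_mat)] <- t(cor_mat)[upper.tri(cor_mat)]

# A usable correlation matrix is symmetric with strictly positive eigenvalues.
is_symmetric_lgl <- isSymmetric(cor_mat)
is_pos_def_lgl <- all(eigen(cor_mat, symmetric = TRUE, only.values = TRUE)$values > 0)
c(symmetric = is_symmetric_lgl, positive_definite = is_pos_def_lgl)
```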

We now specify the univariate distribution parameters for each of the outcome variables.

synth_data_spine_ls <- list(cor_mat_ls = cor_mat_ls,
                            nbr_obs_dbl = c(1068,643),
                            timepoint_nms_chr = c("BL","FUP"),
                            means_ls = list(c(0.6,12.8,78.2, 10.4,8.1,34.2,12.2),
                                            c(0.7,9.8,89.4, 7.9,6.3,28.8,9.8)),
                            sds_ls = list(c(0.2,6.6,24.8,5.7,4.7,17.9,5.8),
                                          c(0.2,6.5,24.4,5.5,4.3,17.8,5.9)),
                            missing_ls = list(c(0,4,10,6,7,7,4),
                                              c(0,5,2,2,1,2,2)),
                            min_max_ls = list(c(0.03,1),
                                              c(0,27),
                                              c(0,150),
                                              c(0,21),
                                              c(0,20),
                                              c(0,82),
                                              c(0,24)),
                            discrete_lgl = c(F,rep(T,6)),
                            var_names_chr = var_names_chr,
                            aqol_tots_var_nms_chr = c(cumulative = "aqol6d_total_c",
                                                      weighted = "aqol6d_total_w"))
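
To give a sense of how parameters like these can be turned into fake data, the sketch below draws correlated values for two of the outcome variables at baseline using a Cholesky factor of the covariance matrix, then clamps each variable to its allowed range. This is a generic technique shown under stated assumptions; it is not the method implemented by the scorz or youthvars functions used later in this program, and it ignores the missing data and AQoL item parameters.

```r
set.seed(1234)
n_obs_int <- 1068L
means_dbl <- c(aqol6d_total_w = 0.6, phq9_total = 12.8)
sds_dbl <- c(0.2, 6.6)
cor_mat <- matrix(c(1, -0.78, -0.78, 1), 2, 2)

# Covariance matrix implied by the standard deviations and correlation.
cov_mat <- diag(sds_dbl) %*% cor_mat %*% diag(sds_dbl)

# Independent standard normal draws, given the target covariance via the
# Cholesky factor and then shifted to the target means.
z_mat <- matrix(rnorm(n_obs_int * 2), n_obs_int, 2)
draws_mat <- sweep(z_mat %*% chol(cov_mat), 2, means_dbl, "+")
colnames(draws_mat) <- names(means_dbl)

# Clamp to the allowed ranges; PHQ-9 is discrete so it is also rounded.
draws_mat[, "aqol6d_total_w"] <- pmin(pmax(draws_mat[, "aqol6d_total_w"], 0.03), 1)
draws_mat[, "phq9_total"] <- round(pmin(pmax(draws_mat[, "phq9_total"], 0), 27))
summary(draws_mat)
```

Clamping and rounding slightly distort the means, standard deviations and correlation of the simulated values, which is one reason a dedicated workflow is preferable to this simplified approach.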

Generate fake data

Create fake outcome variable datasets

We now use the parameters we have just specified to create baseline and follow-up datasets with fake data for our nominated outcome variables.

aqol_scores_pars_ls <- list(means_dbl = c(44.5,40.6), 
                            sds_dbl = c(9.9,9.8),
                            corr_dbl = -0.95)
aqol6d_adol_pop_tbs_ls <- aqol_items_prpns_tbs_ls %>%
  scorz::make_aqol6d_adol_pop_tbs_ls(aqol_scores_pars_ls = aqol_scores_pars_ls,
                                     series_names_chr =  c("bl_outcomes_tb",
                                                           "fup_outcomes_tb"),
                                     synth_data_spine_ls = synth_data_spine_ls,
                                     temporal_cors_ls = list(aqol6d_total_w = 0.85))
#> Joining with `by = join_by(id, match_var_chr)`
#> Joining with `by = join_by(id)`
#> Joining with `by = join_by(id, match_var_chr)`
#> Joining with `by = join_by(id)`

Create fake descriptive variables

We now specify the names and statistical parameters of the variables we will be using in descriptive statistics. For this analysis we are not interested in capturing the joint distribution between these variables, so we only use univariate parameters.

descriptives_BL_tb <- tibble::tibble(fkClientID = aqol6d_adol_pop_tbs_ls$bl_outcomes_tb$fkClientID,
                                     round = c(1) ,
                                     d_age = rnorm(1068,18.1,3.3) %>% 
                                       purrr::map_dbl(~min(.x,25) %>% 
                                                        max(12)),
                                     d_gender = c(rep(1,653),
                                                  rep(2,359),
                                                  rep(3,39),
                                                  rep(NA_real_,17)) %>% 
                                       specific::scramble_xx() %>%
                                       factor(labels = c("Female","Male","Other")),
                                     d_sexual_ori_s = c(rep(1,738),
                                                        rep(2,289),
                                                        rep(NA_real_,41)) %>% 
                                       specific::scramble_xx() %>%
                                       factor(labels = c("Straight","Other")),
                                     Region = c(rep(1,671),
                                                rep(2,397)) %>% 
                                       specific::scramble_xx() %>%
                                       factor(labels = c("Metro","Regional")),
                                     CALD = c(rep(T,759),
                                              rep(F,169),
                                              rep(NA,140)) %>% 
                                       specific::scramble_xx(),
                                     d_studying_working = c(rep(1,405),
                                                            rep(2,167),
                                                            rep(3,305),
                                                            rep(4,159),
                                                            rep(NA_real_,32)) %>% 
                                       specific::scramble_xx() %>% 
                                       factor(labels = c("Studying only",
                                                         "Working only",
                                                         "Studying and working",
                                                         "Not studying or working")),
                                     c_p_diag_s = c(rep(1,182),
                                                    rep(2,264),
                                                    rep(3,332),
                                                    rep(4,237),
                                                    rep(NA_real_,53)) %>% 
                                       specific::scramble_xx() %>%
                         factor(labels = c("Depression", "Anxiety","Depression and Anxiety", "Other")),
                         c_clinical_staging_s = c(rep(1,625),
                                                  rep(2,326),
                                                  rep(3,86),
                                                  rep(NA_real_,31)) %>% 
                           specific::scramble_xx() %>%
                           factor(labels = c("0-1a","1b","2-4")),
                         c_sofas = c(rnorm(1068-30,65.2,9.5),
                                     rep(NA_real_,30)) %>% 
                           purrr::map_dbl(~min(.x,100) %>% 
                                            max(0)) %>% 
                           specific::scramble_xx(),
                         s_centre = NA_character_, 
                         d_agegroup = NA_character_, 
                         d_sex_birth_s = NA_character_, 
                         d_country_bir_s = NA_character_,
                         d_ATSI = NA_character_,
                         d_english_home = NA, 
                         d_english_native = NA, 
                         d_relation_s = c(rep(1,325),
                                          rep(2,426),
                                          rep(3,286),
                                          rep(NA_real_,31)) %>% 
                           specific::scramble_xx() %>%
                           factor(labels = c("REPLACE_ME_1",
                                             "REPLACE_ME_2",
                                             "REPLACE_ME_3")))  %>%
  dplyr::mutate(d_sex_birth_s = dplyr::case_when(is.na(d_gender) ~ NA_integer_,
                                                 as.integer(d_gender) %in% 
                                                   c(1L,2L) & 
                                                   runif(1068)>0.995 ~ as.integer(d_gender) %>%
                                                   purrr::map_int(~ ifelse(is.na(.x), 
                                                                           .x, 
                                                                           switch(.x,2L,1L,3L))),
                                                 as.integer(d_gender) == 3 ~ sample(c(1L,2L), 
                                                                                    1068, 
                                                                                    replace = T),
                                                 TRUE ~ as.integer(d_gender)
                                                 ) %>%
                  factor(labels = c("Female","Male")))
descriptives_FUP_tb <- descriptives_BL_tb %>% 
  dplyr::filter(fkClientID %in% 
                  aqol6d_adol_pop_tbs_ls$fup_outcomes_tb$fkClientID) %>%
  dplyr::mutate(round = 2,
                d_age = d_age + 0.25,
                Region = Region %>% 
                  specific::randomise_changes_in_fct_lvls(0.98),
                d_studying_working = d_studying_working %>%
                  specific::randomise_changes_in_fct_lvls(0.9),
                c_p_diag_s = c_p_diag_s %>% 
                  specific::randomise_changes_in_fct_lvls(0.90),
                c_clinical_staging_s = c_clinical_staging_s %>% 
                  specific::randomise_changes_in_fct_lvls(0.8),
                c_sofas = (c_sofas + rnorm(643,4.7,10)) %>% 
                         purrr::map_dbl(~min(.x,100) %>% max(0)))
bl_tb <- dplyr::inner_join(descriptives_BL_tb,
                           aqol6d_adol_pop_tbs_ls$bl_outcomes_tb) 
#> Joining with `by = join_by(fkClientID)`
fup_tb <- dplyr::inner_join(descriptives_FUP_tb,
                            aqol6d_adol_pop_tbs_ls$fup_outcomes_tb)
#> Joining with `by = join_by(fkClientID)`
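
The categorical variables above all follow the same pattern: build a vector with fixed category counts (plus a fixed number of missing values), shuffle it, then convert it to a factor. The base R sketch below illustrates that pattern using the gender counts from above. It assumes that `specific::scramble_xx()` randomly reorders a vector and substitutes `sample()` for it; that is our reading of the code, not a documented guarantee.

```r
set.seed(1234) # Makes the illustrative shuffle reproducible.

# Fixed counts taken from the d_gender specification above (sums to 1068).
gender_codes_dbl <- c(rep(1, 653),
                      rep(2, 359),
                      rep(3, 39),
                      rep(NA_real_, 17))

# sample() stands in for specific::scramble_xx() (an assumption of this sketch).
gender_fct <- factor(sample(gender_codes_dbl),
                     labels = c("Female", "Male", "Other"))

# Shuffling changes the order of records but not the category counts.
gender_counts_tb <- table(gender_fct, useNA = "ifany")
print(gender_counts_tb)
```

Because only the order is randomised, the marginal distribution of each variable matches the specified counts exactly, while associations between variables are left to chance.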

We make some adjustments to ensure that the c_sofas variable is correlated with our aqol6d_total_w variable at both baseline and follow-up.

bl_tb <- bl_tb %>%
  dplyr::mutate(c_sofas = faux::rnorm_pre(bl_tb$aqol6d_total_w %>% 
                                            as.vector(), 
                                          mu = 65.2, 
                                          sd = 9.5, 
                                          r = 0.5, 
                                          empirical = T) %>% 
                  purrr::map_dbl(~min(.x,100) %>% max(0)))
fup_tb <- fup_tb %>%
  dplyr::mutate(c_sofas = faux::rnorm_pre(fup_tb$aqol6d_total_w %>% 
                                            as.vector(), 
                                          mu = 69.9, 
                                          sd = 10, 
                                          r = 0.5, 
                                          empirical = T) %>% 
                  purrr::map_dbl(~min(.x,100) %>% max(0)))
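
To make the idea behind this adjustment concrete, the base R sketch below shows one way to construct a variable that has an exact sample correlation with an existing vector, along with an exact sample mean and standard deviation. It is an illustration of the general technique only and is not claimed to be how `faux::rnorm_pre()` is implemented; the function name `make_correlated_dbl()` and its arguments are hypothetical.

```r
# Create a variable with a pre-specified sample correlation (r_1L_dbl) with x_dbl.
make_correlated_dbl <- function(x_dbl, mu_1L_dbl, sd_1L_dbl, r_1L_dbl) {
  noise_dbl <- rnorm(length(x_dbl))
  # Residuals from regressing noise on x are exactly uncorrelated with x.
  resid_dbl <- unname(resid(lm(noise_dbl ~ x_dbl)))
  x_std_dbl <- (x_dbl - mean(x_dbl)) / stats::sd(x_dbl)
  resid_std_dbl <- (resid_dbl - mean(resid_dbl)) / stats::sd(resid_dbl)
  # Mix the two standardised components so that cor(x, y) equals r.
  y_std_dbl <- r_1L_dbl * x_std_dbl + sqrt(1 - r_1L_dbl^2) * resid_std_dbl
  mu_1L_dbl + sd_1L_dbl * y_std_dbl
}

set.seed(42)
aqol_dbl <- runif(200, 0.03, 1) # Stand-in for aqol6d_total_w.
sofas_dbl <- make_correlated_dbl(aqol_dbl,
                                 mu_1L_dbl = 65.2,
                                 sd_1L_dbl = 9.5,
                                 r_1L_dbl = 0.5)
# Bounding values to the 0-100 range, as above, can slightly shift these moments.
sofas_bounded_dbl <- pmax(pmin(sofas_dbl, 100), 0)
```

Note that any subsequent truncation to the valid score range means the final correlation, mean and standard deviation are only approximately equal to the requested values.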

Combine datasets

We now add the fake outcome variables dataset to the fake descriptive variables dataset.

composite_tb <- dplyr::bind_rows(bl_tb, fup_tb) %>%
  dplyr::mutate(d_age = floor(d_age)) %>%
  dplyr::mutate(d_gender = d_gender %>% as.character() %>%
                  purrr::map_chr(~ifelse(.x=="Other",
                                         sample(c("Genderqueer/gender nonconforming/agender",
                                                              "Transgender"),1),
                                         .x)),
                s_centre = Region %>% as.character() %>%
                  purrr::map_chr(~ifelse(.x=="Metro",
                                         sample(c("Canberra","Southport","Knox"),1),
                                         "Regional Centre")),
                d_country_bir_s = CALD %>%
                  purrr::map_chr(~ifelse(.x,
                                         "Other",
                                         "Australia")), 
                       d_ATSI = CALD %>%
                  purrr::map_chr(~ifelse(.x,
                                         "Yes",
                                         "No")),
                       d_english_home = CALD %>%
                  purrr::map_chr(~ifelse(.x,
                                         "No",
                                         "Yes")), 
                       d_english_native = CALD %>%
                  purrr::map_chr(~ifelse(.x,
                                         "No",
                                         "Yes"))
                ) %>%
  dplyr::select(-CALD) %>%
  dplyr::select(-Region)
composite_tb <- composite_tb %>%
  dplyr::select(-setdiff(names(composite_tb)[startsWith(names(composite_tb),
                                                        "aqol6d_")],
                         names(composite_tb)[startsWith(names(composite_tb),
                                                        "aqol6d_q")]))
composite_tb <- composite_tb %>%
  dplyr::mutate(c_sofas = as.integer(round(c_sofas,0))) %>%
  dplyr::mutate(round = factor(round, labels = c("Baseline",
                                                 "Follow-up"))) %>%
  dplyr::mutate(d_relation_s = dplyr::case_when(d_relation_s %in% 
                                                  c("REPLACE_ME_1","REPLACE_ME_2") ~ 
                                                  "Not in a relationship",
                                                T ~ "In a relationship")) %>%
  youthu::add_dates_from_dstr(bl_start_date_dtm = Sys.Date() - lubridate::days(600),##
                              bl_end_date_dtm = Sys.Date() - lubridate::days(420),
                              duration_args_ls = list(a = 60, b = 140, mean = 90, sd = 10),
                              duration_fn = truncnorm::rtruncnorm,
                              date_var_nm_1L_chr = "d_interview_date") %>%
  dplyr::select(-duration_prd) %>%
  youthvars::transform_raw_ds_for_analysis() %>%
  dplyr::rename(phq9_total = PHQ9,
                bads_total = BADS,
                gad7_total = GAD7,
                oasis_total = OASIS,
                scared_total = SCARED,
                k6_total = K6,
                c_sofas = SOFAS) %>%
  dplyr::select(-c("d_agegroup","Gender", "CALD", "Region"))

6 - Subroutines (reporting templates)

Subroutines perform part of an analysis and reporting algorithm.

What are subroutines?

Subroutines need to be called by parent programs that supply them with input data. A subroutine can be called by multiple programs and will produce output that varies with the input values it is supplied with. Subroutines are typically deployed to implement parts of a model’s analysis and reporting algorithm.
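
As a rough illustration of this relationship, the base R sketch below defines a toy reporting subroutine and calls it from two different "parent programs". Everything in it is hypothetical: `write_summary_report()` is not a readyforwhatsnext subroutine, and actual subroutines are distributed as collections of template files rather than as a single function.

```r
# A toy subroutine: it does nothing until a parent program supplies input data.
write_summary_report <- function(results_df,
                                 title_1L_chr,
                                 path_1L_chr = tempfile(fileext = ".txt")) {
  lines_chr <- c(title_1L_chr,
                 strrep("=", nchar(title_1L_chr)),
                 sprintf("%s: mean = %.2f (SD = %.2f)",
                         names(results_df),
                         vapply(results_df, mean, numeric(1)),
                         vapply(results_df, sd, numeric(1))))
  writeLines(lines_chr, path_1L_chr)
  invisible(path_1L_chr) # Return the location of the generated report.
}

# Two parent programs call the same subroutine with different inputs.
report_a_chr <- write_summary_report(data.frame(phq9_total = c(5, 9, 13)),
                                     title_1L_chr = "Study A")
report_b_chr <- write_summary_report(data.frame(gad7_total = c(3, 7),
                                                k6_total = c(4, 8)),
                                     title_1L_chr = "Study B")
report_a_lines_chr <- readLines(report_a_chr)
report_b_lines_chr <- readLines(report_b_chr)
```

The same template logic yields a different report for each caller, which is the property that makes subroutines reusable across studies.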

Why are they useful?

readyforwhatsnext model subroutines are used primarily as templates for generating reports. They can be used in three ways:

to help execute a program or function written by a third party (in which case you probably won’t need to modify the subroutine and may not even be aware that it is being used);
to help execute a program or function that you write (in which case, you shouldn’t have to modify the subroutine, but may find it useful to customise it to your purposes); and
to serve as a template for subroutines you write yourself that perform similar tasks (in which case you will be rewriting the subroutine’s code).

Current readyforwhatsnext subroutines

Currently available readyforwhatsnext subroutines are summarised in the table below.

Subroutine	Release	Date	Description	Source
ms_tmpl	0.1.1.0	19-Apr-2022	A collection of files to provide a template for generating scientific manuscripts describing open source mental health systems models projects that use the ready4 framework. This release is a minor patch to correct an incorrectly specified version number.	Dev, Archive
mychoice_results	0.1	07-Nov-2022	Report results from a Discrete Choice Experiment implemented with the mychoice R package.	Dev, Archive
ttu_lng_ss	0.9.0.1	23-Jun-2023	This subroutine program extends the R package TTU by providing a toolkit for automatically authoring a first draft of a scientific manuscript from results generated by TTU modules. The program is intended for use as the last component of TTU’s reporting workflow for utility mapping modelling projects. An example of this workflow is available at: https://doi.org/10.5281/zenodo.6116077 . This program generalises a program that produced the manuscript for a real world study (https://doi.org/10.1101/2021.07.07.21260129). The program can produce manuscripts in PDF / LaTeX (example - https://dataverse.harvard.edu/api/access/datafile/4957407) and Word (example - https://dataverse.harvard.edu/api/access/datafile/4957416). It should be noted that the Word output requires some manual editing to adapt section numbering, modify table headers and resize tables to page boundaries. This release fixes a bug that prevented the previous version from rendering.	Dev, Archive
ttu_mdl_ctlg	0.1.0.1	18-Jul-2023	A reporting template for utility mapping models created using the TTU package (https://ready4-dev.github.io/TTU/index.html). This release includes minor bug fixes.	Dev, Archive

Documentation

readyforwhatsnext subroutines are currently minimally documented, typically in the form of notes contained in a README file in the source code bundle.

7 - Decision aids

User interfaces can make it easier to generate practical insight from readyforwhatsnext model modules.

7.1 - Predicting the spatial epidemiology of emerging mental disorders

We previously developed a user interface for the epidemiology modules of our Springtides model of places.

The Springtides app reproduced below is currently deprecated, pending a new version to be released in 2024. We don’t encourage use of this app to inform decision making as the input data has become dated and the current web-based version often fails when generating large or customised geometries. The app is reproduced below purely for illustrative purposes. If you try it out, we recommend that you only select the default type of geometry (“Select from a menu of existing options”) as the web version is not configured to render any but the simplest custom geometries. To use the app, you first need to confirm your selections in the “Where” tab, then confirm selections in the “What” box, then the “Who” box and finally the “When” box, after which an orange box appears that gives you the option to generate a report. When trying this app out, we recommend keeping the number of simulations low (e.g. 10) as it takes several minutes for even small numbers of runs to execute.

8 - Contribute to our development

How to contribute to readyforwhatsnext.

8.1 - Our priorities

Our current list of priorities for the development of readyforwhatsnext shapes when and how we need your help.

8.1.1 - Priority 1: Launch a Minimum Viable Product (MVP)

We want to give potential users confidence that they can appropriately apply readyforwhatsnext to their decision problems by bringing all our existing development release and unreleased software to production release status.

Why?

All our software, regardless of status, is supplied without any warranty. However, our views about whether an item of software is potentially appropriate for others to use in undertaking real world analyses can be inferred from its release status. If it is not a production release, we probably believe that it needs more development and testing and better documentation before it can be used for any purpose other than the specific studies in which we have already applied it. Partly for this reason, it is unlikely that any item of our software will be widely adopted until it is available as a production release. We also cannot meaningfully track uptake of our software until it becomes available as a dedicated production release on CRAN. We also need a critical mass of model modules available as production releases so that they can be combined to model moderately complex systems.

What?

Bringing an initial set of development version and pipeline libraries to production release will constitute the launch of a readyforwhatsnext Minimum Viable Product (MVP). The MVP will comprise an initial skeleton of production ready modules for modelling people, places, platforms and programs.

The most important types of help we need with achieving this goal are funding, code contributions, community support and advice.

How?

The main tasks to be completed to bring all of our existing code libraries to production release are as follows:

(For unreleased software) Address all issues preventing public release of code repositories (e.g. fixing errors preventing core functions working, removing all traces of potentially confidential artefacts from all versions/branches of repository, etc.).
(For code libraries that are implemented using only the functional programming paradigm) Author and test new modules.
Write / update unit tests (tests of individual functions / modules for multiple potential uses / inputs that will be automatically run every time a new version of a library is pushed to the main branch of its development release repository).
Enhance the documentation that is automatically authored by algorithms from the ready4 framework we use to author readyforwhatsnext. This will involve some or all of:

minor modifications of function names / arguments / code;
updating the taxonomies datasets used in the documentation writing algorithm; and/or
updating the documentation authoring algorithms (within the ready4fun and ready4class packages).

Adding human authored documentation for the modules contained in each library.
(For some libraries) Adding a user-interface.
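
To illustrate the unit testing task in the list above, here is a minimal base R sketch of tests that exercise one function across several kinds of input. The function `bound_score()` and its tests are hypothetical examples written for this page. R packages commonly write such tests with the testthat package; this sketch uses only `stopifnot()` and `tryCatch()` so that it runs without additional dependencies.

```r
# Hypothetical function under test: constrain scores to a valid range.
bound_score <- function(scores_dbl, lower_1L_dbl, upper_1L_dbl) {
  if (lower_1L_dbl > upper_1L_dbl) {
    stop("lower_1L_dbl must not exceed upper_1L_dbl")
  }
  pmax(pmin(scores_dbl, upper_1L_dbl), lower_1L_dbl)
}

# Each check targets a different potential use / input of the same function.
run_bound_score_tests <- function() {
  # Typical use: values outside the range are pulled back to the bounds.
  stopifnot(identical(bound_score(c(-5, 50, 120), 0, 100), c(0, 50, 100)))
  # Missing values should be passed through rather than replaced.
  stopifnot(is.na(bound_score(NA_real_, 0, 100)))
  # Invalid bounds should raise an error instead of returning a result.
  errored_lgl <- tryCatch({
    bound_score(1, 10, 0)
    FALSE
  }, error = function(e) TRUE)
  stopifnot(errored_lgl)
  TRUE
}

tests_passed_lgl <- run_bound_score_tests()
```

Running checks like these automatically on every push to a main branch is what allows regressions to be caught before they reach users.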

When?

Our production releases will be submitted to the Comprehensive R Archive Network (CRAN). CRAN does not allow for submitted R packages to depend on development version R packages, so the dependency network of our code-libraries shapes the sequence in which we bring them to production releases.
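
The constraint described above amounts to releasing libraries in dependency order: a library can only go to production once everything it depends on is already there. The base R sketch below shows one simple way such an order could be derived. The library names and dependency map are entirely hypothetical and do not describe the actual readyforwhatsnext dependency network.

```r
# Hypothetical dependency map: each library lists the libraries it depends on.
deps_ls <- list(lib_a = character(0),
                lib_b = "lib_a",
                lib_c = c("lib_a", "lib_b"),
                lib_d = "lib_c")

# Repeatedly release every library whose dependencies have all been released.
order_releases <- function(deps_ls) {
  remaining_ls <- deps_ls
  released_chr <- character(0)
  while (length(remaining_ls) > 0) {
    ready_lgl <- vapply(remaining_ls,
                        function(deps_chr) all(deps_chr %in% released_chr),
                        logical(1))
    if (!any(ready_lgl)) {
      stop("Circular dependency detected among: ",
           paste(names(remaining_ls), collapse = ", "))
    }
    released_chr <- c(released_chr, names(remaining_ls)[ready_lgl])
    remaining_ls <- remaining_ls[!ready_lgl]
  }
  released_chr
}

release_order_chr <- order_releases(deps_ls)
print(release_order_chr)
```

Libraries with no unreleased dependencies come first, and a library that many others depend on becomes a bottleneck for everything downstream of it.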

How quickly we can launch production releases of all our code depends on how much and what type of help we get. Working within our current resources we expect the first of the 23 libraries listed to be released during late 2024 and the last during late 2026. With your help this release schedule can be sped up.

8.1.2 - Priority 2: Maintain readyforwhatsnext

We want readyforwhatsnext to be continually improved and updated in response to the needs of potential users and stakeholders.

Why?

A significant limitation of many health economic models is that they are not updated and can become progressively less valid with time. The importance of maintaining a computational model increases if, like readyforwhatsnext, it is intended to have multiple applications and users. As we progressively make production releases to launch the MVP model, we intend that people will start using it. As readyforwhatsnext becomes more widely used, its limitations (errors, bugs, restrictive functionality and confusing / inadequate documentation) are more likely to become exposed and to require remediation. Addressing such issues needs to be implemented skillfully and considerately to avoid unintended consequences for existing model users (e.g. to ensure software edits to fix one problem do not prevent previously written replication code or downstream dependencies from executing correctly). Open source projects like readyforwhatsnext also need to make changes in response to decisions by third parties - such as edits to upstream dependencies and changes in the policies of hosting repositories - and to update citation / acknowledgement information to appropriately reflect new contributors.

What?

All readyforwhatsnext model software needs to be maintained and updated to identify and fix bugs, enhance functionality and usability, respond to changes in upstream dependencies and to conscientiously deprecate outdated code. Open access datasets made available for use in modelling analyses need to be actively curated to ensure they remain relevant to current decision contexts. Decision aids need to be reviewed and updated to ensure they continue to use the most up to date and appropriate modules and input data.

The most important types of help we need with this priority area are funding, code contributions, community support and advice.

How?

The main tasks for the maintenance of framework and model software are to:

Appropriately configure and update the settings of the ready4 GitHub organisation and its constituent repositories to facilitate easy to follow and efficient maintenance workflows.
Proactively:

author ongoing improvements to software testing, documentation and functionality;
make archived releases of key development milestones in the ready4 Zenodo community; and
submit new production releases to the Comprehensive R Archive Network (CRAN).

Reactively elicit, review and address feedback and contributions from the readyforwhatsnext community (e.g. bugs, issues and feature-requests).

The main tasks for curating model data collections include:

Implementing ongoing improvements and updates to meta-data descriptors of data collections and individual files.
Facilitating the linking of datasets to and from the ready4 Dataverse.
Reviewing all collections within the ready4 Dataverse to identify datasets or files that are potentially out of date.
Creating and publishing new versions of affected datasets with the necessary additions, deletions and edits and updated metadata. Prior versions of data collections remain publicly available.
Informing the readyforwhatsnext community of the updated collections.

The main tasks for curating decision aids include:

Monitoring the repositories of the software and the data used by the app for important updates.
Deploying an updated app bundle of software and data to a test environment on Shinyapps.io.
Testing the new deployment and eliciting user feedback.
Implementing any required fixes identified during testing.
Deploying the updated app to a Shinyapps.io production environment.
Informing the readyforwhatsnext community of the updated decision aid.

When?

Maintenance is an ongoing and current responsibility. Maintenance obligations are expected to grow considerably as we launch more production releases, extend the readyforwhatsnext model and grow a user community.

8.1.3 - Priority 3: Apply readyforwhatsnext to undertake replications and transfers

We want readyforwhatsnext to be used to implement replications and transfers of the original studies for which that software was developed.

Why?

Authoring new readyforwhatsnext modules can involve a significant investment of time and skills, an investment that is typically made in the context of implementing a modelling project for a scientific study. However, once authored, these modules may significantly streamline the implementation of replications and transfers of that study. More replications and generalisations mean more open access data and module customisations available to other users, enhancing the practical utility of readyforwhatsnext.

What?

We plan to demonstrate that studies implemented with readyforwhatsnext are relatively straightforward and efficient to replicate and transfer. The most important initial types of help we need with achieving this goal are funding, projects, code contributions and advice.

How?

The main tasks for implementing study replications and transfers are:

Identify the example study to be replicated or transferred.
Review that study’s analysis program:

do the data used in this program have similar structure / concepts / sampling to the data for which a new analysis is planned?
are modules used in that program from production release module libraries and do any of them require authoring of inheriting modules to selectively update aspects of module data-structures or algorithms?

Create a new input dataset, labelling and (for non-confidential data) storing the data in an online repository (which can be kept private for now).
(If new inheriting modules are required) Make a code contribution to create and test new inheriting modules.
Adapt the original study’s analysis program to account for differences in input data, model modules and study reporting.
Share the new analysis program in the ready4 Zenodo community.
Ensure the online model input dataset is made public and submit it as a Linked Dataverse Dataset in the appropriate section of the ready4 Dataverse.
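
The review question in the steps above about whether a new dataset has a similar structure to the original can be partly automated. The base R sketch below compares the variable names and classes of two data frames. The helper `compare_structure()` and both toy datasets are hypothetical, and a check of this kind cannot assess whether concepts or sampling are comparable.

```r
# Report variables that are missing, additional or of a different class.
compare_structure <- function(original_df, new_df) {
  shared_chr <- intersect(names(original_df), names(new_df))
  classes_match_lgl <- vapply(shared_chr,
                              function(nm) {
                                identical(class(original_df[[nm]]),
                                          class(new_df[[nm]]))
                              },
                              logical(1))
  list(missing_from_new_chr = setdiff(names(original_df), names(new_df)),
       extra_in_new_chr = setdiff(names(new_df), names(original_df)),
       class_mismatch_chr = shared_chr[!classes_match_lgl])
}

# Toy stand-ins for the original study dataset and a planned new dataset.
original_df <- data.frame(fkClientID = 1:3,
                          d_age = c(15, 18, 21),
                          phq9_total = c(4L, 9L, 12L))
new_df <- data.frame(fkClientID = 1:3,
                     d_age = c("15", "18", "21"), # Stored as text, not numbers.
                     k6_total = c(3L, 5L, 8L))

structure_diffs_ls <- compare_structure(original_df, new_df)
print(structure_diffs_ls)
```

Differences flagged in this way indicate where the original analysis program, or the modules it uses, would need adapting before a transfer could run.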

When?

In most cases, we recommend waiting until production releases of relevant module libraries are available. However, we are currently planning or actively undertaking some initial study analysis transfers using the development versions of our utility mapping and choice modelling module libraries. We are undertaking this work in parallel with testing and, where necessary, extending the required modules. We suggest that, should you believe that any of our development version software is potentially relevant to a study you wish to undertake, you first get in touch with our project lead to discuss the pros / cons and timing of using this software.

8.1.4 - Priority 4: Grow a user community

We want to develop a community of readyforwhatsnext users, contributors and stakeholders to sustain the development, maintenance, application, extension and impact of the project.

Why?

readyforwhatsnext is open source because we believe that transparent and collaborative approaches to model development are more likely to produce transparent, reusable and updatable models. No one modelling team has the resources or breadth of expertise and diversity of values to adequately address all of the important decision topics in youth mental health systems design and policy. Opportunities for modellers to test, re-use, update and combine each other’s work help make modelling projects more valid and tractable. Models have become increasingly complex, so simply publishing model code and data may have limited impact on improving model transparency. These artefacts also need to be understood and tested. Clear documentation and frequent re-use in different contexts by multiple types of stakeholder make it more likely that errors and limitations can be exposed and remedied. Decentralising ownership of a model to an active community can help sustain the maintenance and extension of a model over the long term and mitigate risks and bottlenecks associated with dependency on a small number of team members.

What?

Our aim is to enhance the resilience, quality, legitimacy and impact of readyforwhatsnext by developing a community of users and contributors. The most important initial types of help we need with achieving this goal are funding, community support and advice.

How?

The process of developing the readyforwhatsnext community involves the following tasks:

Creating and recruiting to volunteer advisory structures to elicit guidance on strategic, technical and conceptual topics.
Enhancing the ease of use for third parties of existing framework authoring tools.
Developing improved documentation and collateral (e.g. video tutorials) for readyforwhatsnext model modules and datasets.
Configuring hosting repositories to implement clear collaborative development workflows.
Promoting readyforwhatsnext to potential users and stakeholders.
Continually expanding, diversifying and updating the authorship and maintenance responsibilities of all readyforwhatsnext software.

When?

The speed at which we undertake activities to grow a user community depends on our success at securing funding to provide required support infrastructure.

8.1.5 - Priority 5: Extend the scope of the readyforwhatsnext model

We want to progressively extend the capability of readyforwhatsnext to explore new economic topics in youth mental health.

Why?

We hope that, once launched, the readyforwhatsnext MVP systems model will be transparent, reusable and updatable and useful for addressing some important topics in youth mental health. However, there will inevitably be a much greater number of topics that the MVP model lacks the scope to adequately address. The two main scope limitations of the MVP model are expected to be omissions and level of abstraction. Some relevant system features will be omitted from representation by the MVP model - for example our pipeline of platforms modules does not currently include any planned modules for modelling the operations of digital mental health services or schools. System features that are represented in the MVP model may only have one level of abstraction, which may be either too simple or too complex to be appropriately applied to some modelling goals.

What?

We plan to progressively extend the scope of readyforwhatsnext and the range of decision topics to which it can validly be applied. The most important initial types of help we need to achieve this goal are funding, projects and advice.

How?

The two main strategies for extending readyforwhatsnext are to translate existing models and develop new models. The process for developing new models is outlined elsewhere as the steps required to undertake a modelling project.

Translating existing models involves the following steps:

Identify existing computational model(s) of relevant youth mental health systems to be redeveloped using the ready4 framework. Processes for identifying models could include:

A modelling team reviewing some of the models that they have previously implemented using other software; and/or
A systematic search of published literature and/or model repositories.

(Optional - only if a single project plans to redevelop multiple models) Develop a data extraction tool into which data on relevant model features will be collated and categorised.
Extract data on relevant model features. In the (highly likely) event that the reporting and documentation of the model being redeveloped lacks important details:

Contact the original model authors for assistance; and/or
Seek relevant advice to help determine plausible / appropriate values for missing data.

Author module libraries for representing the included model(s).
Author labelled open access datasets of model input data (which can be set to private for now).
Author analysis and reporting programs designed to replicate the original modelling study / studies.
Compare results from original and replication analyses. Ascertain the most plausible explanations for any divergence between results. Where this explanation relates to an error or limitation in the new readyforwhatsnext modules or analysis programs that have been authored, fix these issues.
Complete documentation of model libraries, datasets and analyses.
(If not already done) Publish / link to datasets on the ready4 Dataverse and share releases of libraries and programs in the ready4 Zenodo community.

When?

As our current focus is on developing the MVP model, we are not yet actively pursuing this priority. That will change if we are successful in securing more support from funders. In the meantime, if you are a researcher and/or modeller who is interested in leading a project that can help extend readyforwhatsnext, you can contact our project lead for guidance and/or to discuss the potential for collaborations.

8.1.6 - Priority 6: Integrate readyforwhatsnext software with other open source tools

We want coders and modellers working in languages such as python to be able to readily use and contribute to readyforwhatsnext.

Why?

Currently all readyforwhatsnext model software is developed using the R language. Although R is powerful, popular and flexible, there are limitations to relying on this toolkit alone. For some tasks, tools written in other languages provide superior performance. Requiring coders to have knowledge of R also erects barriers to participation that limit the rate and quality of ready4’s development.

What?

We aim to support and integrate the development and use of tools to implement and extend readyforwhatsnext in multiple languages, with an initial focus on python. The most important initial types of help we need with achieving this goal are advice, funding and code contributions.

How?

This is a longer term program of activity that has yet to be planned. We expect the first step in this process will be convening an advisory group of interested stakeholders to help us identify appropriate actions.

When?

We have no active plans to progress this during our current 2023-2025 activity cycle. However, we are open to providing whatever support and guidance we can to researchers and organisations who are interested in leading a project of this nature.

8.2 - How to help

There are a number of ways you can contribute to readyforwhatsnext.

8.2.1 - Provide advice

ready4 needs the guidance of community members, decision-makers and technical experts to shape its development.

What?

We need advice:

to help review and update our priority goals and develop, refine and implement strategies for achieving these goals;
to help plan and execute modelling projects that produce transparent, reusable and updatable computational models; and
to identify how our existing software and data can be usefully improved.

Who?

We want advice from our users (coders, modellers and planners), stakeholders (funders, researchers and community members) and other supporters (those with relevant expertise in technical communication, building open source communities, product development, etc).

How?

Advice can be provided by:

Participating in the advisory structures and events of individual modelling projects. The nature of these opportunities will vary by project and the team responsible for implementing each project.
Flagging software features, usability and documentation issues. If you have the capacity and willingness to also fix the issues, you can approach this using the process for making a code contribution. Otherwise, you can do so by creating an issue on that software project’s repository in our GitHub organisation. For example, to create a new issue relating to the ready4 foundation library, use https://github.com/ready4-dev/ready4/issues/new (you will need a GitHub account).

8.2.2 - Contribute code

Help improve the reliability, functionality and ease of use of ready4 software.

What?

Test, improve or extend our software. This is essential to us achieving our following priority goals:

Priority 1. Launching the readyforwhatsnext MVP systems model.
Priority 2. Maintaining readyforwhatsnext.
Priority 3. Applying readyforwhatsnext.
Priority 4. Growing a readyforwhatsnext community.

Who?

To make a code contribution, you need to be a coder familiar with R, R Markdown and git. You will also need a GitHub account. For many types of contribution, you will also need to use our software framework. We have yet to adequately document and refine these tools to make them easier for third parties to use (we plan to do this), so if you are interested in making anything other than a relatively minor code edit, we recommend that you first contact our project lead to discuss your idea.

As a contributor to readyforwhatsnext, you will also be expected to adhere to the Contributor Covenant (code of conduct).

How?

The process for making a code contribution broadly conforms to the steps itemised below, which we have minimally adapted from this template. If you need further help to make a contribution, you can contact the ready4 project lead directly.

Find an issue that you are interested in addressing or a feature that you would like to add. Ideally consider how your planned contribution matches our current priorities.
Fork the repository associated with the issue from our GitHub organization to your own GitHub account. This means that you will have a copy of the repository under your-GitHub-username/repository-name.
Clone the repository to your local machine using:

git clone https://github.com/github-username/repository-name.git

Create a new branch for your fix using:

git checkout -b branch-name-here

Make the appropriate changes for the issue you are trying to address or the feature that you want to add.
To add the file contents of the changed files to the “snapshot” git uses to manage the state of the project, also known as the index, use:

git add insert-paths-of-changed-files-here

To store the contents of the index with a descriptive message, use:

git commit -m "Insert a short message of the changes made here"

Push the changes to the remote repository using:

git push origin branch-name-here

Submit a pull request to the upstream repository.
Title the pull request with a short description of the changes made and the issue or bug number associated with your change. For example, you can title a pull request like so: “Added more log outputting to resolve #4352”.
In the description of the pull request, explain the changes that you made, any issues you think exist with the pull request you made, and any questions you have for the maintainer. It’s OK if your pull request is not perfect (no pull request is). The reviewer will be able to help you fix any problems and improve it!
Wait for the pull request to be reviewed by a maintainer.
Make changes to the pull request if the reviewing maintainer recommends them.
Celebrate your success after your pull request is merged!
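The command-line steps above can be rehearsed end to end without touching GitHub. The following is a minimal sketch only: it assumes that a local bare repository stands in for your fork, and the file name, commit messages and commit identity are placeholders introduced purely for illustration. It also shows that, in a typical GitHub workflow, changes requested by a reviewer are made as further commits pushed to the same branch.

```shell
#!/bin/sh
# Sketch of the contribution workflow, run entirely on the local machine.
# ASSUMPTION: a local bare repository stands in for your fork on GitHub;
# the file name, commit messages and identity below are placeholders.
set -eu

work=$(mktemp -d)

# Stand-in for https://github.com/github-username/repository-name.git
git init --quiet --bare "$work/fork.git"

# Clone the (stand-in) fork to your local machine
git clone --quiet "$work/fork.git" "$work/repository-name"
cd "$work/repository-name"

# Commit identity used for this throwaway clone only
git config user.name "Example Contributor"
git config user.email "contributor@example.com"

# Create a new branch for your fix
git checkout --quiet -b branch-name-here

# Make a change, add it to the index and commit it with a message
echo "Clarified installation instructions." > NOTES.md
git add NOTES.md
git commit --quiet -m "Clarify installation instructions"

# Push the branch to the remote repository
git push --quiet origin branch-name-here

# Changes requested during review: commit again and push to the same branch
echo "Addressed reviewer comment." >> NOTES.md
git add NOTES.md
git commit --quiet -m "Address reviewer comment"
git push --quiet origin branch-name-here

# Confirm what the remote now holds on the pushed branch
pushed_commits=$(git --git-dir="$work/fork.git" rev-list --count branch-name-here)
echo "Commits on pushed branch: $pushed_commits"
```

When working with a real fork, replace the bare repository path with the URL of your fork and skip the git config lines if your identity is already configured.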

8.2.3 - Fund projects

Help us secure our future and accelerate our development.

What?

Provide cash or in-kind resources to support us to achieve any or all of our priority goals:

Priority 1. Launching the readyforwhatsnext MVP systems model.
Priority 2. Maintaining readyforwhatsnext.
Priority 3. Applying readyforwhatsnext.
Priority 4. Growing a user community.
Priority 5. Extending readyforwhatsnext.
Priority 6. Integrating readyforwhatsnext with other tools.

Who?

We are seeking support from multiple different types of funder. At this early stage of our development we would expect that the most impactful way of supporting ready4’s development will be to award funding for that purpose directly to Monash University. Other ways to support ready4 will be to fund readyforwhatsnext modelling projects led by other research institutions and which may or may not be formally affiliated with us.

How?

The two main categories of funding we seek are:

Core infrastructure. Essential to the success of priorities 1-2 and 3-6 above is adequately resourced support infrastructure. Financial support we receive for this purpose will primarily be dedicated to recruiting a skilled team of data scientists (coders), modellers, technical documentation / training developers, community builders and stakeholder managers. Other important resource requirements relate to licensing appropriate technical solutions (hosting, security, workflow optimisation, etc) to support the readyforwhatsnext community.
Modelling projects. To advance priorities 3 and 5 above, teams with high quality plans to undertake modelling projects with readyforwhatsnext need to be backed with financing. Typically, funding provided to these types of projects will be primarily spent on employing modellers, data scientists and other researchers and on supporting processes to meaningfully engage community members, planners and other stakeholders.

If you would like to invite a funding proposal from ready4, contact the project lead.

8.2.4 - Undertake projects

Plan, conduct and disseminate readyforwhatsnext modelling projects.

What?

A readyforwhatsnext modelling project undertakes novel analysis of youth mental health topics by using, enhancing and/or authoring model modules, datasets and executables. Each ready4 modelling project has its own unique funder(s), governance, objectives and team. The links between modelling projects are in the form of a common framework.

Undertaking modelling projects will help us achieve our following priority goals:

Priority 3. Applying readyforwhatsnext.
Priority 5. Extending readyforwhatsnext.

Who?

Modelling projects should typically be led by a researcher who may or may not be a modeller. The core project team will always include modelling expertise and, should authorship of new modules (or extensions to existing modules) be required, will also need to include coders. Advisory structures to engage community members and planners are also recommended.

How?

There are three main steps in implementing a ready4 modelling project.

Step 1: Develop model

Each project’s computational model is constructed by adopting one or more of the following strategies:

selecting a subset of existing readyforwhatsnext modules and using them in unmodified form;
selecting a subset of existing readyforwhatsnext modules and contributing code edits to these modules to add desired functionality;
selecting a subset of existing readyforwhatsnext modules and using them as templates from which to author new inheriting modules (which can be code contributions to an existing module library or distributed as part of a new library); and/or
authoring new ready4 modules (most likely to be distributed in new code libraries).

As part of the validation and verification process for all new and derived modules, tests should be defined, bundled as part of the relevant module libraries and rerun every time these libraries are edited.

Step 2: Add data

By data we typically mean digitally stored information, principally relating to model parameter values, that can be added to the ready4 computational model to tailor it to a specific decision context (e.g. a particular population / jurisdiction / service / intervention) and set of underpinning beliefs (e.g. preferred evidence sources). Data for a ready4 modelling project will be from one or both of the following options:

finding and using existing open access data from other ready4 projects;
supplying new project specific data, appropriately describing these data and (for non-confidential records) sharing these data publicly.

Step 3: Run analyses

ready4 project analyses apply algorithms contained in ready4 modules to supplied data to generate insight and can be implemented by:

adapting existing replication programs;
authoring new analysis programs; and / or
developing a user-interface to allow non-technical users to run custom analyses.

When reporting analyses, using a reporting template can be useful.

8.2.5 - Support the readyforwhatsnext community

Help develop high quality, clear and comprehensive documentation, instruction and responsive help.

What?

Help other members of the readyforwhatsnext community to apply ready4 by authoring documentation, developing training and posting answers in online help. This support is essential for us to advance the following project goals:

Priority 2. Maintaining readyforwhatsnext.
Priority 4. Growing a user community.
Priority 5. Extending readyforwhatsnext.

Who?

Any community member (user or other stakeholders) can help us to improve the accessibility, clarity and usefulness of our documentation. Coders and modellers are particularly welcome to contribute support that leverages their technical expertise.

How?

The types of support that we welcome contributions on include:

Improving the documentation contained on this website. To do this, you will need a GitHub account. Once you have that, you can:

flag a general issue and suggest improvements by clicking on the “Create documentation issue” link or visiting https://github.com/ready4-dev/ready4web/labels/documentation ; and/or
suggest edits to a specific page by clicking on the “Edit this page” link.

Improving the documentation for a specific library, executable or dataset:

for software documentation edits, you can use the same workflow as that for making a code contribution; and
for improvements to dataset documentation, we have yet to set up a streamlined workflow for this process, so for the moment please contact the ready4 project lead directly if you are interested in making this type of contribution.

Contributing to the development of other training and support resources (e.g. answering questions in online help, video tutorials, etc). We believe that this type of content is most likely to become relevant when we have made more progress in developing the readyforwhatsnext community. But again, if you are interested in this area, please contact the project lead to discuss.

8.3 - Contributor covenant (code of conduct)

To foster an inclusive and respectful community, all contributors to readyforwhatsnext are expected to adhere to the Contributor Covenant.

Our pledge

We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.

Our standards

Examples of behavior that contributes to a positive environment for our community include:

Demonstrating empathy and kindness toward other people
Being respectful of differing opinions, viewpoints, and experiences
Giving and gracefully accepting constructive feedback
Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
Focusing on what is best not just for us as individuals, but for the overall community

Examples of unacceptable behavior include:

The use of sexualized language or imagery, and sexual attention or advances of any kind
Trolling, insulting or derogatory comments, and personal or political attacks
Public or private harassment
Publishing others’ private information, such as a physical or email address, without their explicit permission
Other conduct which could reasonably be considered inappropriate in a professional setting

Enforcement responsibilities

Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.

Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.

Scope

This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.

Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement. All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the reporter of any incident.

Enforcement guidelines

Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:

1. Correction

Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.

Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.

2. Warning

Community Impact: A violation through a single incident or series of actions.

Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.

3. Temporary ban

Community Impact: A serious violation of community standards, including sustained inappropriate behavior.

Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.

4. Permanent ban

Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.

Consequence: A permanent ban from any sort of public interaction within the community.

Attribution

This Code of Conduct is adapted from the Contributor Covenant, version 2.1, available at https://www.contributor-covenant.org/version/2/1/code_of_conduct.html.

Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.

For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.

Unique identifier	Data collection round	Date of data collection	Age	Gender (grouped)	Sex at birth	Sexual orientation	Relationship status	Aboriginal or Torres Strait Islander	Culturally And Linguistically Diverse	Region of residence (metropolitan or regional)	Education and employment status	EQ5D - Mobility domain score	EQ5D - Self-Care domain score	EQ5D - Usual Activities domain score	EQ5D - Pain / Discomfort domain score	EQ5D - Anxiety / Depression domain score	Kessler Psychological Distress - 10 Item Total Score	Overall Wellbeing Measure (Winefield et al. 2012)	EuroQol (EQ-5D) - (weighted total)	EuroQol (EQ-5D) - (unweighted total)
1	BL	2019-10-22	14	Male	Male	Heterosexual	In a relationship	No	No	Metro	Not studying or working	1	1	1	1	2	11	87	0.879	6
2	BL	2019-10-17	19	Female	Female	Heterosexual	In a relationship	Yes	Yes	Regional	Studying only	1	2	1	1	1	14	65	0.846	6
2	FUP	2020-02-14	19	Female	Female	Heterosexual	In a relationship	Yes	Yes	Regional	Studying only	3	1	1	1	1	10	71	0.850	7
3	BL	2020-02-15	21	Female	Female	Other	Not in a relationship	NA	NA	Metro	Studying only	1	1	3	1	1	13	74	0.883	7
3	FUP	2020-06-14	21	Female	Female	Other	Not in a relationship	NA	NA	Metro	Studying only	1	1	2	1	1	10	64	0.906	6
4	BL	2019-12-14	12	Female	Female	Heterosexual	In a relationship	Yes	Yes	Metro	Not studying or working	1	1	1	3	1	18	40	0.796	7

fkClientID	study_arm_chr	match_idx_int	date_psx_Baseline	date_psx_Follow-up	duration_prd_Baseline	duration_prd_Follow-up	costs_dbl_Baseline	costs_dbl_Follow-up	PHQ9_Baseline	PHQ9_Follow-up	SOFAS_Baseline	SOFAS_Follow-up	AQoL6D_HU_Baseline	AQoL6D_HU_Follow-up	PHQ9_change_dbl_Baseline	PHQ9_change_dbl_Follow-up	SOFAS_change_dbl_Baseline	SOFAS_change_dbl_Follow-up	AQoL6D_HU_change_dbl_Baseline	AQoL6D_HU_change_dbl_Follow-up	qalys_dbl_Baseline	qalys_dbl_Follow-up
Participant_10	Control	243	2023-04-19	2023-10-13	0S	177d 0H 0M 0S	647.9386	1696.235	8	10	61	64	0.7597988	0.6079774	0	2	0	3	0	-0.1518214	0	0.3314119
Participant_1000	Control	191	2023-06-15	2023-12-16	0S	184d 0H 0M 0S	428.9205	1619.037	4	2	63	82	0.8459579	0.7688131	0	-2	0	19	0	-0.0771448	0	0.4067322
Participant_1001	Intervention	230	2023-05-10	2023-11-05	0S	179d 0H 0M 0S	429.3703	1844.219	10	14	59	72	0.6138300	0.8607305	0	4	0	13	0	0.2469005	0	0.3613228
Participant_1003	Intervention	115	2023-06-08	2023-12-07	0S	182d 0H 0M 0S	395.1637	1537.365	9	0	71	81	0.5808015	0.9315788	0	-9	0	10	0	0.3507773	0	0.3768011
Participant_1005	Intervention	183	2023-09-09	2024-03-13	0S	186d 0H 0M 0S	402.9910	1826.511	17	0	78	88	0.5460607	0.9593811	0	-17	0	10	0	0.4133204	0	0.3833158
Participant_1006	Intervention	219	2023-10-05	2024-04-01	0S	179d 0H 0M 0S	534.2285	2401.478	9	14	75	73	0.7239490	0.5885972	0	5	0	-2	0	-0.1353518	0	0.3216232

Documentation

1 - Overview

What is readyforwhatsnext?

What makes readyforwhatsnext model modular?

What is it being used for?

Can I use it?

Why is it a prototype?

Can I help?

Where should I go next?

2 - Examples

3 - Getting started

3.1 - System requirements

3.2 - Installing readyforwhatsnext model modules

Before you install

Installation

Configuration

Try it out!

3.3 - Terms of use

3.3.1 - Open source licensing

3.3.2 - Citing readyforwhatsnext

3.3.3 - Disclaimer

3.4 - The readyforwhatsnext model

3.4.1 - Modules for modelling people

3.4.2 - Modules for modelling places

3.4.3 - Modules for modelling platforms

3.4.4 - Modules for modelling programs

3.5 - Modules pipeline

3.5.1 - Pipeline of people modules

3.5.2 - Pipeline of places modules

3.5.3 - Pipeline of platforms modules

3.5.4 - Pipeline of programs modules

4 - Tutorials

4.1 - Find model modules and datasets

4.1.1 - Finding specific modules and sub-modules

Motivation

Implementation

Use

Related content

4.1.2 - Model module libraries

Module libraries for modelling people

Module libraries for modelling places

Module libraries for modelling programs

4.1.3 - Find open access model data

4.2 - Use readyforwhatsnext model modules

4.2.1 - Add metadata to datasets of individual human records

Ingest data

Add metadata

YouthvarsProfile methods

Inspect data

YouthvarsSeries methods

Validate data

Inspect data

Share data

4.2.2 - Validate variable total scores

Variable classes and data integrity

Included sub-module classes

Assessment of Quality of Life Six Dimension (Adolescent) Health Utility

Child Health Utility Nine Dimension - Australian Adolescent Scoring

Behavioural Activation for Depression Scale (BADS)

Generalised Anxiety Disorder Scale (GAD-7)

Kessler Psychological Distress Scale (K6) - Australian Scoring System

Kessler Psychological Distress Scale (K6) - US Scoring System

Kessler Psychological Distress Scale (K10) - Australian Scoring System

Kessler Psychological Distress Scale (K10) - US Scoring System

Overall Anxiety Severity and Impairment Scale (OASIS)

Patient Health Questionnaire (PHQ-9)

Screen for Child Anxiety Related Disorders (SCARED)

Social and Occupational Functioning Assessment Scale (SOFAS)

4.2.3 - Standardise Variable Values With Fuzzy Logic And Correspondence Tables

In brief

Create project

Supply seed dataset

Specify standards

Compare variable of interest values from seed and standards dataset.

Standardise variable values

4.2.4 - Standardise Variable Values With Lookup Codes

In brief

Create project

Supply seed dataset

Specify standards