Run CohortContrast Analysis
Usage
CohortContrast(
  cdm,
  targetTable = NULL,
  controlTable = NULL,
  pathToResults = getwd(),
  domainsIncluded = c("Drug", "Condition", "Measurement", "Observation", "Procedure",
    "Visit", "Visit detail", "Death"),
  prevalenceCutOff = 10,
  topK = FALSE,
  presenceFilter = 0.005,
  complementaryMappingTable = NULL,
  runChi2YTests = TRUE,
  runLogitTests = TRUE,
  getAllAbstractions = FALSE,
  getSourceData = FALSE,
  maximumAbstractionLevel = 5,
  createOutputFiles = TRUE,
  complName = NULL,
  runRemoveTemporalBias = FALSE,
  removeTemporalBiasArgs = list(),
  runAutomaticHierarchyCombineConcepts = FALSE,
  automaticHierarchyCombineConceptsArgs = list(),
  runAutomaticCorrelationCombineConcepts = FALSE,
  automaticCorrelationCombineConceptsArgs = list(),
  numCores = max(1L, ceiling(0.2 * parallel::detectCores()), na.rm = TRUE)
)
Arguments
- cdm
CDM reference for the database connection, for example one created with `CDMConnector::cdmFromCon()`
- targetTable
Table for target cohort (tbl)
- controlTable
Table for control cohort (tbl)
- pathToResults
Path to the results folder, can be project's working directory
- domainsIncluded
Character vector of CDM domains to include
- prevalenceCutOff
Numeric; if set, removes all concepts that are not present (in the target cohort) more than `prevalenceCutOff` times
- topK
Numeric; if set, keeps only this number of features in the analysis. This is the maximum number of features exported.
- presenceFilter
Numeric; if set, removes all features that are represented less than the given percentage
- complementaryMappingTable
Mapping table for mapping concept IDs, if present. Columns: CONCEPT_ID, CONCEPT_NAME, NEW_CONCEPT_ID, NEW_CONCEPT_NAME, ABSTRACTION_LEVEL, TYPE
- runChi2YTests
Boolean for running the CHI2Y test (chi-squared test for two proportions with Yates continuity correction).
- runLogitTests
Boolean for running the logit tests
- getAllAbstractions
Boolean for creating abstraction levels for the imported data. This is useful when exploring the data in the GUI.
- getSourceData
Boolean for fetching source data
- maximumAbstractionLevel
Maximum level of abstraction allowed
- createOutputFiles
Boolean for creating output files. The default is `TRUE`.
- complName
Name of the output study directory
- runRemoveTemporalBias
Logical; when `TRUE`, runs `removeTemporalBias()` as an optional post-processing step.
- removeTemporalBiasArgs
A list of additional arguments passed to `removeTemporalBias()` (for example `ratio`, `alpha`, `domainsIncluded`, `removeIdentified`). Missing arguments default to `ratio = 1`, `alpha = 0.05`, `domainsIncluded = NULL`, and `removeIdentified = TRUE`.
- runAutomaticHierarchyCombineConcepts
Logical; when `TRUE`, runs `automaticHierarchyCombineConcepts()` after temporal-bias processing.
- automaticHierarchyCombineConceptsArgs
A list of additional arguments passed to `automaticHierarchyCombineConcepts()` (for example `abstractionLevel`, `minDepthAllowed`, `allowOnlyMinors`). Missing arguments default to `abstractionLevel = -1`, `minDepthAllowed = 0`, and `allowOnlyMinors = TRUE`.
- runAutomaticCorrelationCombineConcepts
Logical; when `TRUE`, runs `automaticCorrelationCombineConcepts()` after hierarchy combining.
- automaticCorrelationCombineConceptsArgs
A list of additional arguments passed to `automaticCorrelationCombineConcepts()` (for example `abstractionLevel`, `minCorrelation`, `maxDaysInBetween`, `heritageDriftAllowed`). Missing arguments default to `abstractionLevel = -1`, `minCorrelation = 0.7`, `maxDaysInBetween = 1`, and `heritageDriftAllowed = FALSE`.
- numCores
Number of cores to allocate to parallel processing. Defaults to 20 percent of detected cores (minimum 1).
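The `*Args` arguments and the `numCores` default described above can be illustrated with base R alone. The sketch below is not the package's internal code: it uses `utils::modifyList()` to mimic the documented "missing arguments default to" behaviour, and the helper `defaultNumCores()` is a hypothetical name that simply wraps the default expression from the usage block so it can be evaluated for fixed core counts.

```r
# Documented defaults for automaticCorrelationCombineConcepts(), copied from
# the argument description above.
documentedDefaults <- list(
  abstractionLevel = -1,
  minCorrelation = 0.7,
  maxDaysInBetween = 1,
  heritageDriftAllowed = FALSE
)

# A partial list, as it might be passed in
# automaticCorrelationCombineConceptsArgs.
userArgs <- list(minCorrelation = 0.8)

# Illustration only: overlay the user-supplied values on the defaults.
# The package may resolve missing arguments differently internally.
effectiveArgs <- utils::modifyList(documentedDefaults, userArgs)
str(effectiveArgs)

# Hypothetical helper: the numCores default with the detected core count
# passed in explicitly instead of calling parallel::detectCores().
defaultNumCores <- function(detectedCores) {
  max(1L, ceiling(0.2 * detectedCores), na.rm = TRUE)
}

# parallel::detectCores() can return NA, which is why na.rm = TRUE matters.
coreCounts <- vapply(c(4, 8, 16, NA), defaultNumCores, numeric(1))
coreCounts
```

Only the entries that differ from the documented defaults need to appear in an `*Args` list; an empty `list()` leaves every default in place.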
Value
A CohortContrastObject. This is a list with the main analysis tables data_patients, data_initial, data_person, data_features, conceptsData, complementaryMappingTable, selectedFeatureData, trajectoryDataList, and config. Together these components contain the processed cohort-level, person-level, feature-level, optional mapping, and configuration outputs produced by the workflow.
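Because the returned object is a list, its components can be checked by name before downstream code relies on them. The following is a minimal sketch under stated assumptions: `missingComponents()` is a hypothetical helper, not a package function, and `mockResult` is an empty stand-in used only so the snippet runs without a database. A real result may also hold empty or `NULL` values for optional components such as the mapping table.

```r
# Component names as listed in the Value section above.
expectedComponents <- c(
  "data_patients", "data_initial", "data_person", "data_features",
  "conceptsData", "complementaryMappingTable", "selectedFeatureData",
  "trajectoryDataList", "config"
)

# Hypothetical helper: report which documented components are absent.
missingComponents <- function(object) {
  setdiff(expectedComponents, names(object))
}

# Stand-in for a real CohortContrast() result: a named list of empty slots.
mockResult <- stats::setNames(
  vector("list", length(expectedComponents)),
  expectedComponents
)

# Drop one component to show what the helper reports.
mockResult$config <- NULL
missingComponents(mockResult)
```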
Examples
# \donttest{
if (requireNamespace("CDMConnector", quietly = TRUE) &&
    requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("duckdb", quietly = TRUE) &&
    nzchar(Sys.getenv("EUNOMIA_DATA_FOLDER")) &&
    isTRUE(tryCatch(
      CDMConnector::eunomiaIsAvailable("GiBleed"),
      error = function(...) FALSE
    ))) {
  pathToJSON <- system.file(
    "example", "example_json", "diclofenac",
    package = "CohortContrast"
  )
  con <- DBI::dbConnect(
    duckdb::duckdb(),
    dbdir = CDMConnector::eunomiaDir("GiBleed")
  )
  cdm <- CDMConnector::cdmFromCon(
    con = con,
    cdmName = "eunomia",
    cdmSchema = "main",
    writeSchema = "main"
  )
  targetTable <- cohortFromJSON(pathToJSON = pathToJSON, cdm = cdm)
  controlTable <- createControlCohortInverse(cdm = cdm, targetTable = targetTable)
  result <- CohortContrast(
    cdm = cdm,
    targetTable = targetTable,
    controlTable = controlTable,
    pathToResults = tempdir(),
    prevalenceCutOff = 1,
    topK = 3,
    presenceFilter = FALSE,
    runChi2YTests = TRUE,
    runLogitTests = TRUE,
    createOutputFiles = FALSE,
    numCores = 1
  )
  head(result$data_features)
  DBI::dbDisconnect(con, shutdown = TRUE)
}
# }
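The `runChi2YTests` argument is described as a chi-squared test for two proportions with Yates continuity correction. In base R, that kind of test is available through `stats::prop.test()` with `correct = TRUE`. The sketch below uses made-up counts and only illustrates the statistical test itself; how CohortContrast computes and reports it internally may differ.

```r
# Hypothetical counts: people with a given concept in each cohort.
withConcept <- c(target = 30, control = 10)
cohortSize <- c(target = 100, control = 100)

# correct = TRUE applies the Yates continuity correction.
chi2yTest <- stats::prop.test(
  x = withConcept,
  n = cohortSize,
  correct = TRUE
)

# Test statistic, p-value, and the estimated proportion in each cohort.
chi2yTest$statistic
chi2yTest$p.value
chi2yTest$estimate
```

A small p-value here would indicate that the concept's prevalence differs between the two cohorts by more than chance alone would explain.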