Setup
Introduction
The CohortContrast package is designed to facilitate cohort exploration. It accepts a target cohort of any size from your OMOP CDM instance and allows you to provide a custom control cohort or generate one using matching or inverse controls. The package performs concept-level enrichment analysis and provides a handy GUI for visualization, mapping, and trajectory creation.
Initiating database connection
The CDMConnector package is used to establish the database connection for running CohortContrast. You can configure the connection either by reading credentials from a .Renviron file or by writing them explicitly in your script.
################################################################################
#
# Initiate the database connection
#
#################################################################################
user <- Sys.getenv("DB_USERNAME")
pw <- Sys.getenv("DB_PASSWORD")
host <- Sys.getenv("DB_HOST")
dbName <- Sys.getenv("DB_NAME")
port <- Sys.getenv("DB_PORT")
cdmSchema <- Sys.getenv("OHDSI_CDM") # Schema which contains the OHDSI Common Data Model
cdmVocabSchema <- Sys.getenv("OHDSI_VOCAB") # Schema which contains the OHDSI Common Data Model vocabulary tables
cdmResultsSchema <- Sys.getenv("OHDSI_RESULTS") # Schema which contains the "cohort" table (not mandatory)
writeSchema <- Sys.getenv("OHDSI_WRITE") # Schema for temporary tables, which will be deleted
writePrefix <- "cc_"
# Open the connection using the credentials read above
db <- DBI::dbConnect(
  RPostgres::Postgres(),
  dbname = dbName,
  host = host,
  user = user,
  password = pw,
  port = port
)
# Create the CDM reference used by CohortContrast
cdm <- CDMConnector::cdmFromCon(
  con = db,
  cdmSchema = cdmSchema,
  achillesSchema = cdmResultsSchema,
  writeSchema = c(schema = writeSchema, prefix = writePrefix)
)
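Because Sys.getenv() returns an empty string for variables that are not set, a missing entry in .Renviron tends to surface only later as an unclear connection error. The following is a minimal sketch of a pre-flight check in base R. The helper name checkEnvVars is hypothetical and is not part of CohortContrast or CDMConnector, and the list of required variables is an assumption based on the names used above.

```r
# Hypothetical helper (not part of CohortContrast or CDMConnector):
# return the names of environment variables that are unset or empty.
checkEnvVars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  missingVars <- vars[!nzchar(values)]
  if (length(missingVars) > 0) {
    warning("Unset environment variables: ",
            paste(missingVars, collapse = ", "))
  }
  invisible(missingVars)
}

# Variable names taken from the connection script above
requiredVars <- c("DB_USERNAME", "DB_PASSWORD", "DB_HOST", "DB_NAME",
                  "DB_PORT", "OHDSI_CDM", "OHDSI_WRITE")
missingVars <- checkEnvVars(requiredVars)
```

Running this before DBI::dbConnect() makes it easier to tell a configuration problem apart from a database problem.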
Building a target cohort
Let's say we want to explore our cohort of breast cancer patients. We can import this target cohort in multiple ways:
1. Target cohort from OHDSI OMOP database.
If your cohort is defined in ATLAS, you can use its unique cohort ID. Ensure the cohort is generated within ATLAS on the CDM instance.
################################################################################
#
# Create target table from OMOP CDM instance (ATLAS's cohort id)
#
#################################################################################
cohortsTableName = 'cohort'
targetCohortId = 1403
targetTable <- CohortContrast::cohortFromCohortTable(
  cdm = cdm,
  db = db,
  tableName = cohortsTableName,
  schemaName = cdmResultsSchema,
  cohortId = targetCohortId
)
2. Target cohort from JSON description file.
If you have the JSON expression of the cohort (exportable from ATLAS), you can import the cohort directly.
################################################################################
#
# Create target table from a JSON
#
#################################################################################
pathToJSON = '/Users/markushaug/UT/R-packages/Develop/Git/CohortContrast/tests/testthat/inst/JSON'
targetTable <- CohortContrast::cohortFromJSON(pathToJSON = pathToJSON, cdm = cdm)
3. Target cohort from a CSV file.
We can import the target table from a CSV file. The file should have the same columns as a cohort table (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date).
################################################################################
#
# Create target table from a CSV
#
#################################################################################
pathToCsv = '/Users/markushaug/UT/R-packages/Develop/Git/CohortContrast/tests/testthat/inst/CSV/cohort/cohort.csv'
targetTable <- CohortContrast::cohortFromCSV(pathToCsv = pathToCsv, cohortId = 2)
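To make the expected CSV layout concrete, here is a small sketch in base R that writes a two-row file with the four required columns and reads it back to confirm the header. The temporary file and the variable names are illustrative assumptions; the two rows reuse values from the in-memory example in this vignette.

```r
# Build a tiny example cohort with the four required columns
exampleCohort <- data.frame(
  cohort_definition_id = c(1, 2),
  subject_id = c(4804, 5325),
  cohort_start_date = c("1997-03-23", "1982-06-02"),
  cohort_end_date = c("2018-10-29", "2019-03-17")
)

# Write it to a temporary file; in practice this would be your own CSV path
exampleCsv <- tempfile(fileext = ".csv")
write.csv(exampleCohort, exampleCsv, row.names = FALSE)

# Read it back and confirm the column names match the cohort table layout
requiredColumns <- c("cohort_definition_id", "subject_id",
                     "cohort_start_date", "cohort_end_date")
readBack <- read.csv(exampleCsv)
hasAllColumns <- all(requiredColumns %in% names(readBack))
```

A file of this shape could then be passed to cohortFromCSV() through pathToCsv.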
4. Target cohort from a table
We can also use a data frame that is already held in memory.
################################################################################
#
# Create target table
#
#################################################################################
library(tibble)
# Create the data frame
data <- tribble(
  ~cohort_definition_id, ~subject_id, ~cohort_start_date, ~cohort_end_date,
  1, 4804, '1997-03-23', '2018-10-29',
  1, 4861, '1982-06-02', '2019-05-23',
  1, 1563, '1977-06-25', '2019-04-20',
  1, 2830, '2006-08-11', '2019-01-14',
  1, 1655, '2004-09-29', '2019-05-24',
  2, 5325, '1982-06-02', '2019-03-17',
  2, 3743, '1997-03-23', '2018-10-07',
  2, 2980, '2004-09-29', '2018-04-01',
  2, 1512, '2006-08-11', '2017-11-29',
  2, 2168, '1977-06-25', '2018-11-22'
)
targetTable <- CohortContrast::cohortFromDataTable(data = data, cohortId = 2)
For cases 2-4 you do not have to specify the cohortId parameter in the function call, but doing so is advised when the input contains multiple cohorts.
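The in-memory table above mixes two cohort definitions, which is why naming one explicitly helps. The sketch below uses base R only to illustrate the idea of selecting a single definition. It is an assumption about the effect of cohortId, not the package's implementation.

```r
# Toy table with two cohort definitions (subject ids taken from the example above)
cohorts <- data.frame(
  cohort_definition_id = c(1, 1, 2, 2),
  subject_id = c(4804, 4861, 5325, 3743)
)

# Count rows per cohort definition to see that more than one is present
rowsPerCohort <- table(cohorts$cohort_definition_id)

# Keep only the rows that belong to cohort definition 2
selected <- cohorts[cohorts$cohort_definition_id == 2, ]
```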
Building a control cohort
The control cohort is the cohort that the target cohort is compared against. This means we will compare the proportion of subjects with each concept between the two cohorts. The result of the analysis depends heavily on both cohorts, so they should be selected with care.
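As a rough illustration of what comparing proportions means for a single concept, the sketch below uses made-up counts and the base R function prop.test(). It is only meant to show the idea. The statistics that CohortContrast actually computes may differ.

```r
# Made-up numbers: cohort sizes and how many subjects in each cohort
# have a given concept recorded
nTarget <- 100
nControl <- 400
withConceptTarget <- 40
withConceptControl <- 40

# Proportion of subjects with the concept in each cohort
prevalenceTarget <- withConceptTarget / nTarget
prevalenceControl <- withConceptControl / nControl
prevalenceRatio <- prevalenceTarget / prevalenceControl

# Two-sample test for equality of proportions (stats package, base R)
testResult <- prop.test(
  x = c(withConceptTarget, withConceptControl),
  n = c(nTarget, nControl)
)
```

With these numbers the concept is far more common in the target cohort, which is the kind of contrast the analysis looks for. A different choice of control cohort would change both proportions, and with them the result.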
The control cohort can be created in the same ways as the target cohort (examples 1-4 above). In addition, the CohortContrast package provides a few automatic ways to generate it:
1. Control cohort based on matches
One common approach is to select matched controls from the database based on age and sex. The ratio parameter sets how many controls we want for each case. The min parameter sets the minimum number of matches required (an error is raised otherwise), and the max parameter sets the maximum number of matches.
################################################################################
#
# Create control cohort table based on matches
#
#################################################################################
controlTable = CohortContrast::createControlCohortMatching(cdm = cdm, targetTable = targetTable, ratio = 4)
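To show what a matching ratio means in practice, here is a toy sketch of exact matching in base R. The data are made up, the helper matchControls is hypothetical, and birth year stands in for age. This is not how createControlCohortMatching is implemented.

```r
# Made-up data: two cases and a pool of candidate controls
cases <- data.frame(
  subject_id = c(1, 2),
  sex = c("F", "M"),
  birth_year = c(1970, 1985)
)
pool <- data.frame(
  subject_id = 101:108,
  sex = c("F", "F", "F", "M", "M", "F", "M", "M"),
  birth_year = c(1970, 1970, 1970, 1985, 1985, 1971, 1985, 1990)
)

# Hypothetical helper: exact matching on sex and birth year, picking up to
# `ratio` controls per case and never reusing a control
matchControls <- function(cases, pool, ratio = 2) {
  matched <- list()
  for (i in seq_len(nrow(cases))) {
    candidates <- pool[pool$sex == cases$sex[i] &
                         pool$birth_year == cases$birth_year[i], ]
    chosen <- head(candidates, ratio)
    if (nrow(chosen) > 0) {
      chosen$case_id <- cases$subject_id[i]
      matched[[length(matched) + 1]] <- chosen
      # Remove the chosen controls from the pool so they are not reused
      pool <- pool[!pool$subject_id %in% chosen$subject_id, ]
    }
  }
  do.call(rbind, matched)
}

matches <- matchControls(cases, pool, ratio = 2)
```

Each case receives at most ratio controls. A case with fewer eligible candidates ends up with fewer matches, which is the situation a minimum-matches setting is meant to catch.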
2. Control cohort based on inverse controls
Sometimes it makes the most sense to use inverse controls: the same subjects, observed during the parts of their observation period that are not covered by the target cohort. This is the case, for example, if we want to see the contrast after a diagnosis.
################################################################################
#
# Create control cohort table based on inverse controls
#
#################################################################################
controlTable = CohortContrast::createControlCohortInverse(cdm = cdm, targetTable = targetTable)
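The idea of an inverse control period can be sketched for a single subject with base R dates. The dates below are made up, and the code only illustrates the concept of taking the observation period minus the target cohort period. It is not the implementation of createControlCohortInverse.

```r
# Made-up data: one subject's observation period and target cohort period
observationStart <- as.Date("2000-01-01")
observationEnd <- as.Date("2020-12-31")
cohortStart <- as.Date("2010-06-01")
cohortEnd <- as.Date("2012-05-31")

# Periods inside the observation period that the target cohort does not cover:
# one before the cohort starts and one after it ends
inverse <- data.frame(
  start = c(observationStart, cohortEnd + 1),
  end = c(cohortStart - 1, observationEnd)
)

# Drop empty intervals, e.g. when the cohort starts on the first observed day
inverse <- inverse[inverse$start <= inverse$end, ]
```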
Other considerations
If you have constructed the cohorts by hand, it is strongly advised to check for and resolve overlapping cohort periods, as well as conflicts with the observation periods in the OMOP CDM.
################################################################################
#
# Resolve conflicts
#
#################################################################################
targetTable = CohortContrast::resolveCohortTableOverlaps(cdm = cdm, cohortTable = targetTable)
controlTable = CohortContrast::resolveCohortTableOverlaps(cdm = cdm, cohortTable = controlTable)
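To see what an overlap conflict looks like, the following base R sketch flags rows in a made-up cohort table that start before the same subject's previous row has ended. It only detects overlaps between adjacent rows and is meant as an illustration. It does not reproduce what resolveCohortTableOverlaps does.

```r
# Made-up cohort table: subject 1 has two overlapping rows
cohort <- data.frame(
  subject_id = c(1, 1, 2),
  cohort_start_date = as.Date(c("2010-01-01", "2010-06-01", "2011-01-01")),
  cohort_end_date = as.Date(c("2010-12-31", "2011-03-31", "2011-12-31"))
)

# Sort by subject and start date
cohort <- cohort[order(cohort$subject_id, cohort$cohort_start_date), ]
n <- nrow(cohort)

# End date of the previous row, and whether that row is the same subject
previousEnd <- c(as.Date(NA), cohort$cohort_end_date[-n])
sameSubject <- c(FALSE, cohort$subject_id[-1] == cohort$subject_id[-n])

# A row overlaps if it starts on or before the previous row's end date
cohort$overlaps_previous <- sameSubject &
  cohort$cohort_start_date <= previousEnd
```

Here only the second row of subject 1 is flagged. A similar comparison against each subject's observation period start and end dates would reveal rows that fall outside observed time.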