Thanks for reporting, @OskarGauffin, and thanks to @ilovemane for fixing it.
Hi,
I found this in the exercises of the Oxford RWE summer school. It looks a bit like a bug to me, but I may be incorrect.
It's the exercise where we're supposed to find the average number of prescriptions using PatientProfiles.
//Oskar
Setup:

```r
library(CodelistGenerator)
library(CDMConnector)
library(duckdb)
library(PatientProfiles)
library(dplyr)

# Connect to the Eunomia example database and build the cdm reference
con <- dbConnect(duckdb(), eunomia_dir())
cdm <- cdmFromCon(con = con, cdmSchema = "main", writeSchema = "main")

# One cohort per sinusitis concept, plus a combined "any_sinusitis" cohort
cdm <- generateConceptCohortSet(
  cdm = cdm,
  name = "sinusitis",
  conceptSet = list(
    "bacterial_sinusitis" = 4294548,
    "viral_sinusitis" = 40481087,
    "chronic_sinusitis" = 257012,
    "any_sinusitis" = c(4294548, 40481087, 257012)
  ),
  limit = "all",
  end = 0
)
```

The solution:

```r
cdm$sinusitis |>
  addTableIntersectCount(
    tableName = "drug_exposure",
    window = c(-Inf, Inf),
    targetEndDate = NULL,
    nameStyle = "number_prescriptions"
  ) |>
  filter(cohort_definition_id == 2) |> # Filter on cohort after intersection.
  summarise(mean_prescription = mean(number_prescriptions))
```

gives you 50.

```r
cdm$sinusitis |>
  filter(cohort_definition_id == 2) |> # Filter on cohort before intersection.
  addTableIntersectCount(
    tableName = "drug_exposure",
    window = c(-Inf, Inf),
    targetEndDate = NULL,
    nameStyle = "number_prescriptions"
  ) |>
  summarise(mean_prescription = mean(number_prescriptions))
```

gives you 25.

Which one is correct?

To check, look at the number of prescriptions for subject_id = 806. This subject belongs to all four sinusitis cohorts:

```r
cdm$sinusitis |>
  filter(subject_id == 806) |>
  distinct(subject_id, cohort_definition_id)
```

And there are 21 records in the drug_exposure table for person_id = 806:

```r
cdm$drug_exposure |>
  filter(person_id == 806) |>
  count()
```

Filtering on the cohort before the intersection:

```r
cdm$sinusitis |>
  filter(subject_id == 806) |>
  filter(cohort_definition_id == 2) |> # Filter on cohort before intersection.
  addTableIntersectCount(
    tableName = "drug_exposure",
    window = list(c(-Inf, Inf)),
    nameStyle = "number_prescriptions"
  ) |>
  pull("number_prescriptions") |>
  mean()
```

21, which is correct.

Filtering on the cohort after the intersection:

```r
cdm$sinusitis |>
  filter(subject_id == 806) |>
  addTableIntersectCount(
    tableName = "drug_exposure",
    window = list(c(-Inf, Inf)),
    nameStyle = "number_prescriptions"
  ) |>
  filter(cohort_definition_id == 2) |> # Filter on cohort after intersection.
  pull("number_prescriptions") |>
  mean()
```

42, double the correct answer. I find it surprising (and a possible bug) that the count is doubled when the cohort filter is applied after the intersection instead of before it.
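For comparison, here is a minimal sketch of the behaviour I would expect, written with plain dplyr on made-up in-memory tables so that it runs without a database. The toy `cohort` and `drug_exposure` tables and the helper `count_prescriptions()` are hypothetical stand-ins, not part of PatientProfiles, and the sketch says nothing about what causes the doubling inside `addTableIntersectCount()`. It only shows that with an unbounded window the per-subject count should be the same whether the cohort filter comes before or after the count is attached.

```r
library(dplyr)

# Made-up stand-ins for cdm$sinusitis and cdm$drug_exposure
cohort <- tibble(
  cohort_definition_id = c(1L, 2L, 3L, 4L, 2L),
  subject_id           = c(806L, 806L, 806L, 806L, 99L)
)
drug_exposure <- tibble(person_id = c(rep(806L, 21), rep(99L, 3)))

# Hypothetical reference helper: with window = c(-Inf, Inf) every
# drug_exposure record of a person counts, so the expected value is
# simply the number of rows per person.
count_prescriptions <- function(cohort, drug_exposure) {
  per_person <- drug_exposure |>
    count(person_id, name = "number_prescriptions")

  cohort |>
    left_join(per_person, by = c("subject_id" = "person_id")) |>
    mutate(number_prescriptions = coalesce(number_prescriptions, 0L))
}

# Filter on cohort before attaching the count
filter_first <- cohort |>
  filter(cohort_definition_id == 2) |>
  count_prescriptions(drug_exposure)

# Filter on cohort after attaching the count
filter_last <- cohort |>
  count_prescriptions(drug_exposure) |>
  filter(cohort_definition_id == 2)

# Both orders should agree row by row
identical(filter_first$number_prescriptions, filter_last$number_prescriptions)
```

In this toy data subject 806 is in four cohorts and has 21 drug records, mirroring the case above, and both orders return 21 for that subject. The same comparison could be run against the real `cdm` tables to check a fix.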