CNICS Data Elements
The CNICS cohort maintains data on 10 unique domains: (1) disease diagnoses, (2) laboratory data, (3) medication data, (4) demographics, (5) health care utilization, (6) vital status, (7) patient-reported outcomes (PROs), (8) antiretroviral drug resistance, (9) biologic specimens, and (10) census block data. The details of each domain are provided below.
a. Diagnosis Data are mapped to standardized codes defined in the CNICS Standards and Submissions Document and include AIDS-defining diagnoses, non-AIDS-defining malignancies, cardiovascular and cerebrovascular disease, kidney disease, diabetes, dyslipidemia, liver disease, hypertension, mental health disorders, and substance use. Diagnoses are prospectively recorded in the electronic health record (EHR) by the treating provider at the time care is provided, selected from a constrained list of standardized diagnosis codes. Data regarding diagnoses that occurred prior to the patient’s initial visit to a CNICS site are collected at enrollment and coded as ‘patient reported’ with or without outside documentation. Sites verify diagnoses through systematic review of provider notes and other medical records, event-driven audits, and verification of random samples of events. Standardized protocols for ascertainment and endpoint verification of myocardial infarctions (MIs) and malignancies (including lung cancer) have been implemented. In addition, access to tissue block or fresh frozen specimens can be set up in real time (on demand) for specific studies (e.g., R01s, R21s). Encounter data now include a broad range of encounter types that patients engage in as part of routine HIV care, along with detailed appointment information used to assess retention in care through visit adherence assessments.
Malignancy Data: All 8 sites have submitted incident invasive cancer data through 2010. Currently, 2,364 verified cancer diagnoses exist in CNICS; 77% of non-KS diagnoses and 59% of Kaposi sarcoma (KS) diagnoses are biopsy confirmed. As of March 2014, AIDS-defining cancers include KS mucocutaneous (29%), KS visceral (5%), NHL (17%), NHL CNS (3%), and cervical (1%). Non-AIDS-defining malignancies include non-melanoma skin (10%), anal (5%), lung (5%), Hodgkin (4%), prostate (3%), melanoma (2%), liver (2%), breast (2%), colorectal (2%), and oral cavity/pharynx (1%).
b. Laboratory Test Results that are collected and maintained in the central database include plasma HIV-1 RNA levels, CD4+ T cell count, viral hepatitis, hematologic, kidney, and chemistries/metabolic markers. Laboratory data are uploaded directly from clinical site laboratory medicine systems and surveillance for coding changes and outliers is conducted.
c. Medication Data that are collected and maintained in the central database include antiretroviral, anti-anxiety, antidepressant, anti-infective, antifungal, anti-hypertensive, anti-psychotic, anti-tuberculosis, antiviral, diabetes, lipid-lowering, and mood-stabilizing medications. Medication data entered by providers into the electronic health record (including start/stop dates), or prescription fill/refill data from institutional pharmacy dispensing systems, are uploaded directly into CNICS and verified through medical record review. Data regarding antiretroviral treatment that occurred prior to the patient’s initial visit to a CNICS site are collected at enrollment and coded as to completeness and level of date precision. Expansion of the medications submitted to CNICS is driven primarily by the requirements of CNICS-approved studies, and includes medications used to treat a condition under investigation or with side effects related to the clinical outcome under investigation.
d. Demographic Data include sex, race/ethnicity, age, and risk factor for HIV transmission collected at the time the patient initiates care using standardized categories. Race and ethnicity are coded according to HRSA standards, and risk factors for HIV transmission are classified according to the CDC 1993 case definition. A copy of the CNICS cohort demographics updated 04/2014 can be downloaded here.
e. Utilization Data include initial patient enrollment, primary care visits, and hospitalizations, and will be expanded to address planned studies examining predictors of missed visits and utilization of other medical resources through other specialty services. These data are captured in patient encounter and appointment systems, and inpatient medical records systems at each site.
f. Vital Status Data: CNICS sites use local procedures to track deaths and maintain death registries. CNICS also subscribes to the US National Death Index (NDI), which is used to verify patient death dates. Sites query the NDI semiannually to ensure complete ascertainment of death data in CNICS.
g. Patient-reported outcomes (PROs) data are collected at most sites from consenting patients. Patients use touch-screen tablets or PCs with an easy-to-navigate interface, connected to a wireless network using SSL/TLS encryption. The following domains are obtained: depression and anxiety (PHQ); adherence (AACTG, VAS, and self-rating item); smoking, alcohol, and drug use (AUDIT-C and ASSIST); HIV transmission risk behaviors; symptom burden (HIV Symptom Index); physical activity level (LRCQ); body morphology (FRAM); and quality of life (EuroQol; EQ-5D).
Currently, patients at seven CNICS sites use the CNICS Patient Reported Data and Outcomes (PRO) System to complete clinical assessments on 12 domains that are used both for clinical care at the time of the encounter and for research. During the past year, the University of North Carolina site began using the CNICS PRO System, and Johns Hopkins University should complete testing and begin using the system in September 2013. As of March 2014, over 36,000 assessments have been completed for over 10,000 unique patients, with more than 8,600 sessions completed in 2013 alone. See patient demographics for PROs below.
h. Antiretroviral drug resistance data are complex in format and heterogeneous in delivery and availability. CNICS has made considerable progress in overcoming the proprietary and technical challenges of collecting resistance data at each site. CNICS has achieved electronic capture of diverse and evolving HIV drug resistance data, including full nucleotide genotype, phenotype, and tropism assays, and can readily expand to cover new drug targets (e.g., integrase).
i. Specimen collection data are also managed centrally. The goal of any specimen repository is to have the desired specimens readily available, ‘on demand’, for every CNICS-approved project. While it is very difficult to predict the research questions of the future, CNICS has taken a proactive, strategic approach in selecting patients from whom specimens will be collected. Specimen repositories at all CNICS sites collect specimens from targeted populations of interest (e.g., treatment-naïve individuals initiating therapy, “elite” controllers and/or long-term non-progressors). Specimens are also collected universally at specified sites from all consenting patients, thereby creating a broad palette to support future studies. CNICS sites collect the types of specimens most likely to be used to address emerging translational and clinical questions, including plasma (e.g., for biomarkers), viably frozen PBMCs (e.g., for functional immunologic assays), and snap-frozen PBMCs (for genetic analyses). Each of the CNICS sites has experience with sample preparation and quality assurance through the ACTG / AVEU / HVTN networks.
Host Genetics Studies. CNICS is collecting snap frozen PBMCs to enable investigators to perform host genetic analysis in future studies. Patients have been consented to allow genetic studies at each site.
j. Census block data are also routinely obtained. With the development of geographic information systems (GIS) over the past decade, interest has grown in applying geospatial information to frame individual-level health data in the context of neighborhoods and communities. The burgeoning field of spatial epidemiology allows location to be incorporated as a variable in statistical modeling, and permits formal hypothesis testing of case clustering, spread of an epidemic over time, and innumerable other questions requiring geo-referenced data. Linkage to geographic data will permit analysis of patients’ health outcomes and socio-demographics in the context of communities, environment, and social structure, a first for HIV research on a large scale. The protection of individual privacy is a challenge with geo-referenced data. Currently, HIPAA permits the use of the first three digits of a zip code if the total population formed by combining all the zip codes with the same three digits contains more than 20,000 individuals. An attractive alternative to zip codes is the census block group (CBG), the smallest geographic entity for which the U.S. Census Bureau tabulates and publishes its decennial sample data. Census block groups are numbered subdivisions of counties delineated to contain between 600 and 3,000 residents (240-1,200 housing units). Privacy is protected by “translating” a street address into a CBG number using commercially available GIS software; back-translation of a CBG number to determine an address is not possible. Including CBG numbers as a new data element is an exciting opportunity that will further distinguish CNICS from other large cohort studies.
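The three-digit zip code rule described above can be illustrated with a minimal Python sketch. This is not CNICS code: the function name, the population lookup table, and the example counts are all hypothetical, and the use of "000" as the placeholder for small areas is an assumption based on the usual HIPAA Safe Harbor convention rather than anything stated in this document.

```python
from typing import Mapping

# Threshold taken from the text: the combined population of all zip codes
# sharing the same first three digits must exceed 20,000.
HIPAA_MIN_POPULATION = 20_000


def generalize_zip(zip_code: str, zip3_population: Mapping[str, int]) -> str:
    """Return the 3-digit zip prefix if its combined population exceeds
    the threshold; otherwise return the placeholder "000" (assumed
    convention, not specified in the source text)."""
    prefix = zip_code.strip()[:3]
    if len(prefix) != 3 or not prefix.isdigit():
        raise ValueError(f"not a usable zip code: {zip_code!r}")
    # Prefixes missing from the lookup are treated as too small to release.
    if zip3_population.get(prefix, 0) > HIPAA_MIN_POPULATION:
        return prefix
    return "000"


# Hypothetical population counts, for illustration only.
example_population = {"981": 750_000, "036": 12_000}

print(generalize_zip("98104", example_population))       # large area: prefix kept
print(generalize_zip("03601-1234", example_population))  # small area: masked
```

Treating an unknown prefix as too small to release is a deliberately conservative choice in this sketch; a real pipeline would take its population counts from an authoritative census source.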