Types of Data (Introduction to Medical Informatics) (http://www.cpmc.columbia.edu/edu/textbook) LAST REVIEWED: 22 September 1997 MEDICAL DATA in the reading, talk about lawyers, accountants, engineers, ... using computers distinguish from physicians: min computer use even for mundane information processing activity but changing, partially due to shift of focus from knowledge to data data vs knowledge data = observation (heart murmur: description or tape recording) knowledge = interpretation of data 1. the interpretation itself (defective heart valve) knowledge to A may be data to B at a higher level 2. methods to interpret data (process of diagnosis) information includes both importance of data: characterize patients enable diagnostic and therapeutic decisions data can be... costly risky uncertain (inaccuracy of test; gaps in scientific knowledge; poor memory) parameters of a single data element patient identification parameter being measured can be nested: can have parts that in turn have parts... value of parameter time of observation circumstances of observation ex: body position in BP measurement; arm used questions data can answer what is the history? symptoms? examination findings? changes in symptoms and signs over time? changes in physiologic function over time? previous treatment? rationale for treatment? data integration can be key to diagnose disease and etiology, may need multiple elements ex: dx of renal failure secondary to medication requires name of med; start/stop date; series of measures of renal function; precision/granularity of data data (esp numerical data and esp time) can have varying precision ex: 1994 vs 5/1994 vs 5-18-1994 vs 1:03 PM on 5-18-1994... NARRATIVE DATA written in natural language examples: CXR narrative, DC summary bulk of data is in this form admit note, progress note, discharge summary, nursing note, ... radiology, pathology, ..., reports English plus technical terms specific to the field idioms/phrases: shorthand narration "dyspnea on exertion" (DOE) take on meaning beyond its parts: succinct characterization use of shorthand "PERRLA"=pupils equal, round, and reactive to light and accommodation often takes on new meanings like idioms not so natural can be entered on paper or computer write in chart write on form and transcribe dictate and transcribe word processor "clickable" templates CODED DATA impose structure on the data examples: lab screen, CXR parse, paper form eg, what might you want to know about a lab test or medication order (from CPMC clinical database) "primary time" start time stop time update time status event code patient database key accession number alert flag event class frequency quantity value: code, class, parent, type, value, units, range can be viewed as a tuple (relation; fact with several components): patient identifier (eg, medical record number=1234) name of parameter (eg, potassium) value (eg, 4.5 mg/dl) time (eg, 1993-01-04 14:35:23) modifiers (eg, likely) use of modifiers vs. columns modifiers allow easier storage: db need not be changed when new kinds of modifiers are added columns allow easier recall: all attributes in a single tuple parameter specifies what the tuple is describing coded term from a vocabulary eg, on paper form, might be field label ("last name") vocabulary well-defined, finite set of terms controlled (ltd set), structured (relationships) often associate a code for each term can be arbitrary (MED) or symbolic (ICD-9, where each part of code can mean something) define the terms express relations among terms (synonym, subclass) eg, ICD-9, SNOMED, CPT, UMLS numeric values age, temperture, lab values, ... variable precision (3 vs 3.00) includes units (3 g = 3000 mg) normal range (3 g on machine A = 4 g on B) categorical values small number of discrete choices dichotomous = binary (yes/no, male/female) nominal = no order, qualitative (religion) ordinal = natural order, qualitative or quantitative (age range 0-10, 11-20, ...) coded values taken from vocabulary permit relations among concepts: coded parameter to coded value eg, allergen = penicillin null value sometimes only need parameter without value sometimes value is not known explicit vs implicit null: explicit: value known to be not known implicit: no value for an attribute in db important in framing queries (may need to screen out implicit nulls) (free text or narrative values) not fully coded, enjoys only some of benefits of coded data time instants = points in time (1993-09-15 at 12:45:67.0) intervals (pair of instants) special case = points with granularity duration (3 months) seasons (winter of any year) fuzzy times (sometime last year, recently) can be represented as pair of intervals can be represented as exact point with parameter indicating most significant part ex: 1994-01-01-00.00.00; sig=year probability plot - most general for each instant give P(it is included) can be used to represent other types often several times eg, for lab: collect, arrival, perform, report, acknowledge "primary time" is the medically most relevant time (eg, collect) time and versioning (in CS literature) "historical database" = store time with data "rollback database" = keep log of updates (what was true "as of" yesterday) "temporal database" = historical + rollback modifiers context (eg, blood pressure sitting) certainty (eg, likely) degree (eg, mild) location (eg, right lower lobe) specimen (eg, plasma) accession number ... missing data negative, normal, or unknown default reasoning may depend on context (diagnosis) nesting example: microbiology is a blood pressure "120/80" or two items: 120 and 80 microbiology result: set of organisms -> antibiotics -> sensitivities 1. can you represent complex relations 2. what does the parent-child relation mean? (part of, result of) 3. even if you can represent, can you get the view you want? another example: two views of a CBC many observation are difficult to code or even express patients reaction to questions (eg, avoidance) MDs initial impression of patient's well-being like narrative, can be based on paper or computer have been using forms for years (eg, name and age) structured entry form and entry clerk direct entry automatically or manually code narrative text ADVANTAGES OF NARRATIVE VS. CODED DATA (independent of use of computer) narrative universal and familiar little training, except for overall format of note expressive, even for new and unexpected situations coded encourage complete entry (phone number field) how can I reach you; vs. name, address, phone improve ability to locate and read data force provider to classify eg, billing needs a single "primary diagnosis" force provider to use standard vocabulary (ICD-9, DRG) therefore aids retrieval, research, billing QUASI-CODED DATA reports may appear to be coded example: two echos that are slightly different unless one can translate unambiguously into tuples, not really coded OTHER DATA signals electrocardiogram images drawn sketches X-ray related reading: McDonald CJ, Tierney WM. Computer-stored medical records: their future role in medical practice. JAMA 1988;259(23):3433-40. EXAMPLES OF MEDICAL DATA (all patient references have been changed to protect privacy) [1] DISCHARGE SUMMARY (NARRATIVE) Name: PATIENT, TEST Sex: F Birthdate: 10/10/916 MRN: 3131313 Discharge Summary Date: 01/11/94 Rpt. Num: EA17055000 -------------------------------------------------Status: Unsigned & Preliminary ADMITTED: 12/26/93 DISCHARGED: 01/11/94 CHIEF COMPLAINT: This is one of several CPMC admissions for this 77-year-old retired bank clerk who enters with an acute myocardial infarction. HISTORY OF PRESENT ILLNESS: Miss Levine has been under the direct care of Dr. John Wilson with hypertension and recently was hospitalized with E. coli urosepsis and atrial fibrillation. She was discharged on 11/10/93 on Sotalol, Tenormin and Lanoxin with Lozol for a diuretic. She was also anticoagulated. She is treated with Cipro for a 14-day course. In follow up in the DPO she had several problems: 1. She developed a rash which was attributed to Sotalol which was subsequently stopped. 2. She developed gross painless hematuria with therapeutic prothrombin time, so Coumadin was stopped on 11/26. 3. Her hematocrit decreased progressively from October 12 through 11/12 to 29%. 4. She developed shortness of breath and increased edema, treated with increased Lozol. Evaluation on 11/30/93 showed a sedimentation rate of 133, a hematocrit of 26.4, creatinine of 1.9, BUN of 28, sodium of 128. By December 6, her hematocrit was 26.7 with 5% retics. The creatinine was 2.1. She was last seen on 12/21/93 weighing 132 lbs. with a blood pressure of 200/100, a pulse of 88 which was regular, jugular venous distention, a murmur of aortic insufficiency, 2+ edema. Her urinalysis showed 3+ albumin, many red cells and white cells. The sedimentation rate was repeated and found to be 130. Sodium was 135, potassium 3.7, BUN 30, creatinine 1.4. A blood culture evaluation was negative X 1. Latex fixation was positive only 1 to 80. EKG showed normal sinus rhythm, left atrial enlargement and a nonspecific STT changes. ST depressions were noted in the lateral leads. The inferior T-wave inversions had been seen before. She had a positive ASLO. Her chest x-ray showed pulmonary vascular congestion, congestive heart failure. She was stable until the night before admission when she developed the onset of substernal chest pain with nausea, vomiting and pain, with radiation into the left arm. Emergency services were called and she was found to be hypotensive with a blood pressure of 90 and a heart rate of 40-50. She was treated with Atropine and transferred to the emergency room. Her left cardiogram was shown to be abnormal with ST elevations in I, III and F; ST depressions in I and L, II through III. Her hematocrit was found to be 21%. She denies any interval chest pain or palpitations other than these episodes. In the emergency room she was not administered thrombolytic therapy because of extensive comorbid disease. Her pain decreased and she was admitted to the Intensive Care Unit. PAST MEDICAL HISTORY: She has undergone prior surgeries, including C-section, ... [2] CHEST XRAY (NARRATIVE) Probable mild pulmonary vascular congestion with new left pleural effusion, question mild congestive changes. Elevated left hemidiaphragm, not changed from prior film. [3] CHEST XRAY (CODED) pulmonary vascular congestion certainty: high degree: low pleural effusion region: left status: new congestive changes certainty: moderate degree: low elevated hemidiaphragm bodyloc: hemidiaphragm (region: left) change: no change previous_exam: available [4] SINGLE LABORATORY REPORT (CODED) Result Display (d) Name: PATIENT, TEST Sex: F Birthdate: 10/10/916 MRN: 3131313 DIFFERENTIAL Received: 09/22/94 15:58 Acc.No: H50500 ------------------------------- Price: -----------Status: Final Test Result Unit Range NEUTS 73 % 40 -70 LYMPHS 9 % 20 -50 MONO 8 % 4 - 8 EOS 8 % 0 - 6 BASO 1 % 0 - 2 BAND 1 % 0 - 5 RBC MORPH 1+ ANISO PLATELET EST INCR % ADEQ [5] MULTIPLE LABORATORY REPORT (CODED) Result Display (d) Name: PATIENT, TEST Sex: F Birthdate: 10/10/916 MRN: 3131313 DIFFERENTIAL Received: 09/22/94 15:58 Acc.No: H50500 --------------------------- Price: -----------Status: Final DATE NEUTS LYMPHS MONO EOS BASO BAND RBC MORPH PLT 9/22/94 73 9 8 8 1 1 1+ ANISO INCR 9/12/94 60 10 8 7 2 5 2+ ANISO INCR 8/23/94 52 13 10 5 1 0 ADEQ 8/15/94 45 12 13 8 0 2 1+ ANISO ADEQ 8/10/94 48 11 10 5 3 1 ADEQ 7/10/94 51 15 15 6 1 1 ADEQ 7/05/94 43 14 12 9 0 2 1+ ANISO ADEQ 4/01/94 55 13 9 6 1 0 ADEQ 3/12/92 49 12 13 8 2 1 ADEQ [6] NESTED LABORATORY REPORT (CODED) Result Display (d) Name: PATIENT, TEST Sex: F Birthdate: 10/10/916 MRN: 3131313 CULTURE, ROUTINE Received: 09/07/94 21:11 Acc.No: W42183 ------------------------------- Price: $33.00 -----------Status: Final SPECIMEN DESCRIPTION: CATHETERIZED URINE CULTURE: >100K COL/ML ESCHERICHIA COLI ORGANISM: >100K COL/ML ESCHERICHIA COLI METHOD: MICROSCAN MI AMP T/S TIM PIP CIP CFZ CFX CEZ A/S CFT GEN IMP AMI MIC <=2 <=8 <=8 <=1 <=2 <=2 <=2 <=4 <=1 <=4 8 S S S S S S S S S S S S S [7] OPERATIVE REPORT (QUASI-CODED) Name: PATIENT, TEST Sex: M Birthdate: 08/07/926 MRN: 3131313 Operative Report Date: 04/12/91 Rpt. Num:BD17145000 ------------------------------------------Status: Unsigned & Preliminary ADMITTED: OPERATION: 4/12/1991 DISCHARGED: SURGEON: E. FA, M.D. PREOPERATIVE DIAGNOSIS: ANASTOMIC STRICTURE. POSTOPERATIVE DIAGNOSIS: SAME. OPERATION: CYSTOSCOPY, VISUAL URETHROTOMY, TRANSURETHRAL RESECTION OF BLADDER NECK. ANESTHESIA: SPINAL. COMPLICATIONS: NONE. MEDICATIONS: AMPICILLIN AND GENTAMICIN. CONDITION: STABLE. [8] OPERATIVE REPORT (QUASI-CODED, SLIGHTLY DIFFERENT FROM [7]) Name: PATIENT, TEST Sex: M Birthdate: 08/07/926 MRN: 3131313 Operative Report Date: 06/09/94 Rpt. Num: EF14055000 -----------------------------------------Status: Unsigned & Preliminary ADMITTED: 05/11/94 OPERATION DATE: 06/09/94 DISCHARGED: PREOPERATIVE DIAGNOSIS: URETHRAL STRICTURE. POSTOPERATIVE DIAGNOSIS: SAME. OPERATION: CYSTOSCOPY, URETHRAL DILATION AND CATHETER PLACEMENT. SURGEON: RONALD M. BENNETT, M.D. ASSISTANT: ELAINE R. SMITH, M.D. ANESTHESIA: MONITORED ANESTHESIA CARE.