Diagnosis and Therapy Aids
(Introduction to Medical Informatics)
(http://www.cpmc.columbia.edu/edu/textbook)
LAST REVIEWED: 31 December 1997

knowledge-based systems (KBSs) in medical informatics
    whereas AI is more the study of KBSs,
    medical informatics stresses the use of them

history
    idea for Bayes for diagnosis [Ledley & Lusted 59]
    prototype Bayesian system [Warner 64]
    Bayesian abdominal pain [de Dombal 72]
    50s  pre-computer
    60s  proof of concept
    70s  heuristic
    80s  laboratory
    90s  return to healthcare

EXTERNAL FACTORS

goals
    assist, not replace the physician
    do what computer does best: memory, processing
    focus attention
    provide diagnosis
    facilitate management

what is true = pure diagnosis (eg, de Dombal)
    user decides data needed or fixed set of questions
    simpler
    findings (manifestations): symptoms, signs, tests => diagnosis

what to do (eg, QMR suggesting next test)
    what data to gather (based on previous results)
    whether to treat
    how to treat
    how to follow up
    requires knowledge of risks & costs of tests, Rx

[diagram: knowledge base -> inference engine; data input -> inference engine;
 inference engine -> diagnosis, explanation, questions -> user;
 inference engine -> therapy -> patient; user and patient -> data input]

required for good decision (analogous to a person)
    accurate data
    reliable knowledge (knowledge base)
    good problem-solving skills (inferencing mechanism)

advice mode
    passive - user approaches system when help is desired
        consulting - user enters data, system gives answer
            eg, QMR
        critiquing - user enters plan, system critiques
            eg, ATTENDING critiques anesthesia plan
    active - give advice when help is needed
        even if user does not realize help is needed
        must be integrated with database
        eg, HELP, CPMC

human-computer interface
    fast - eg, causal network is slow
    easy to use - must be familiar
    incorporated into normal daily activity
        user will not stop what she is doing

data acquisition
    probably the single major problem
    will not re-enter data already
        on computer elsewhere
        most systems are not linked to the database
    will not spend time entering new data
        QMR consultation can take an hour

integration with clinical information system
    link to database
    triggered by clinical events
    part of normal data review process for user

evaluation
    huge number of finding combinations to be tested
    cost of evaluating outcomes

ability to explain result
    do not know whether the system is reliable
    canned text
    generate explanations of how system came to conclusion (not why)
    deeper reasoning about causes
    but MDs will accept unexplained advice if they trust it
        eg, antibiotic book, Washington manual

psychological factors
    loss of control
    inertia
    but these seem to be less important than first thought

liability of KBSs
    negligence law - reasonable expectations for safety
        like current malpractice
    strict liability - must not be harmful
        pay 3 times damages
    liability of not using KB (like not using MEDLINE)

INTERNAL METHODOLOGIES

issues
    knowledge representation - how is knowledge stored
        procedural code
        probabilities
        rules
        semantic network
    inference generation (many types; see below)
    knowledge acquisition
        clinical experts
        clinical databases
        physiology theory
    maintenance of the knowledge base
        how does one change affect the rest
    validation of the knowledge base
        does it do what I want it to do
        does what I want it to do work
    completeness
        does the KB cover all relevant diseases
        for diagnosis the KB must be complete
        for alerts, each part can do something useful
        does the KBS know when it does not know the answer
            or does it just guess wrong

A. algorithmic
    procedural code
    mathematical formulae and branching logic
    mix KR and inferencing mechanism
    acid-base and electrolyte management [Bleich 72]
        basically a large, procedural computer program
        really used
    ECG interpretation
        early ones were deterministic
        now a mixture of techniques
    digoxin, early protocol systems
    very fast performance
    *maintenance and complexity of coding
    limited to narrow domains

B.
classification (statistical or pattern)

    1. Bayesian classification: use Bayes Theorem to calculate P(D|T)
        acute abdominal pain, appendicitis [de Dombal 72]
            University of Leeds -> then in English ERs
            review thousands of cases to get P(D), P(T|D)
            findings (T) include symptoms, signs, simple tests
                usually entered via data collection form
            then use Bayes theorem to calculate P(D|T)
            diagnoses
                appendicitis
                diverticulitis
                perforated ulcer
                cholecystitis
                small bowel obstruction
                pancreatitis
                nonspecific abdominal pain
            evaluation on 304 cases
                MDs 65% to 80% accuracy
                system 92% accurate
                appendicitis: no FN, 6 FP (vs 20 by MDs)
        in its simple form, Bayes Theorem
            assumes each patient has only one diagnosis
            assumes conditional independence of findings:

                number of patients with (D) and without (-D) the disease

                f1   f2     D   -D
                 0    0     1   11
                 0    1     0    1
                 1    0     0    1
                 1    1     5    1

                P(D) = 6/20
                P(f1|D) = 5/6    P(f1|-D) = 2/14
                P(f2|D) = 5/6    P(f2|-D) = 2/14

                directly from the table (5 of the 6 patients with both findings):
                P(D|f1,f2) = 5/6 = 0.83

                but according to the odds ratio form of Bayes theorem
                OD(D|f1,f2) = OD(D) * P(f1|D)/P(f1|-D) * P(f2|D)/P(f2|-D)
                            = 6/14 * (5/6)/(1/7) * (5/6)/(1/7)
                            = 175/12
                P(D|f1,f2) = 175/187 = 0.94
                (too high, because f1 and f2 are not conditionally independent)

        also HELP (KB later transferred to *Iliad*) [Warner]
            use clinical database and experts to get Ps
            Boolean frames to handle conditional dependence
                do not count redundant evidence twice
        more general belief network
            can handle dependence directly
            (may be "evidence" or prob)
            if not assume independence and exclusivity
        if assumed independence need only 1+2*(#findings) probs
        without assumptions need 2**(#findings)
        difficult to get probabilities, but can use
            expert opinion
            statistics from a clinical database
        transfer, but can change prior probabilities
        works relatively well despite strict assumptions

    2. classification trees, factor analysis, linear discriminant analysis, ...
        train system on existing data (findings+disease)
        system creates a classification tree, equation, ...
        then put in data from new case
        get back probability of disease
        eg, recursive partitioning in MI [Goldman 81]
            spec (74 system vs 71 MDs), sens (88 for both)
        for trees, not necessarily even need computer
        need lots of data

    3. neural networks
        network of nodes
        train with existing data with known diagnoses
        then run new cases -> diagnosis
        acts much like a statistical classifier
        not much real use

    4. database comparison
        compare patient to similar ones in large database
        start with exact match
        loosen criteria to get enough patients
        needs much data, slow, problems for naive user

C. decision analysis systems [Pauker 81]
    systems for knowledge engineers to build new trees
    systems for users to run a particular decision tree
        probabilities on nodes already determined
        need to add personal utilities,
            differences due to seriousness of illness

D. production rule systems
    if-then facts (not branching logic)
    inference through forward or backward chaining
        forward: +GPCs -> possible Staph -> oxacillin
        backward: thesis -> what is needed to prove it
    can attach uncertainty to the if-then rules
    MYCIN [Shortliffe 76]
        uses production rules with certainty factors
        knowledge base is modular (rules)
        certainty factors allow fuzzy statements
        laboratory evaluation showed promise
        not put into production use

        IF    infection is meningitis
              type of infection is bacterial
              ...
        THEN  (certainty = 0.80)
              cause may be pneumococcus or neisseria

    explanation - assemble natural language version of rule reasoning;
        but inflexible - system interprets user's "why?" question
    must specify all antecedents (eg, dilated pupil not due to med)
    problems with maintenance as KB grows
        not really modular - each rule affects many others
    difficult, time consuming to turn expertise into rules
        therefore limited to narrow domains
    difficult to validate, except by running many cases
    minority of real systems use this method

E.
Cognitive model
    attempt to mimic human reasoning
    diagnostic process
        hypothesis generation
        hypothesis evaluation
        question generation
        repeat
    want to generate hypotheses that are:
        coherent - without contradictions
        adequate - explain all findings
        parsimonious - simple explanation (Occam's razor)
    found that:
        long term associative memory more important than methodology
        humans use 3-levels: pos, neg, no info
        count positive findings that should be there - neg that should be there
    Quick Medical Reference (QMR) [Miller]
        derived from INTERNIST, an early program
        [see QMR addendum]
        knowledge base of disease profiles
            relate findings to diseases
            frequency - how often seen in disease P(T|D)
            evoking strength - how many pts with finding have disease P(D|T)
            import - how important is it to explain this finding
        first enter initial findings
        scoring algorithm to order diseases by likelihood
        type of diagnosis approach depends on distribution of scores:
            pursue: one disease in the lead
            rule out: two diseases are close and separate from all others
            discriminate: many diseases lumped together
        it can then ask for more data, and repeat the process
        terminate when one better than others, or cannot get better list
    DXplain [Barnett]
        was on AMA network, but now stopped
    among the most successful, but still not used much
    "generalized set covering theory" lends rigor

F. causal network
    create physiology model of body
    try to represent cause and effect
        disease causes findings (eg, pneumonia causes cough)
        assign probability of pneumonia causing cough
    therefore can reason about cases never seen before
    can give reasoned explanations
    in most areas of medicine, causation is not known
    MDs do not really use causal reasoning in practice
    can be slow - execution time rises exponentially
    not much real use

G.
protocol systems
    already done in HELP and RMRS (see below)
    ONCOCIN [Shortliffe]
        direct management of patients on cancer protocols
        system recommends
            what data is required
            what drugs to give and when
        alerts to harmful interactions and effects
        much work on linking problems over time
        knowledge acquisition tool called OPAL

H. modular-independent systems (event-driven, often alerting)
    two early examples still in use today
        HELP [Warner]
        RMRS [McDonald]
    drug-drug interactions
        several commercial knowledge bases
        incorporated into pharmacy products
    very successful, probably beneficial
        still unclear how beneficial
    CPMC Event Monitor
        [see event monitor attachment]

related reading:
    Tuhrim S, Reggia JA. An introduction to computer-assisted medical
    decision making I. M.D. Computing 1985;2(1):28-36.

    Reggia JA, Tuhrim S. An introduction to computer-assisted medical
    decision making II. M.D. Computing 1985;2(4):40-6.
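
worked example for B.1 (illustrative sketch, not part of the original notes)
    The f1/f2 table under "Bayesian classification" can be checked by hand or
    with a few lines of Python. The sketch below recomputes the posterior
    directly from the table counts, then again with the odds ratio form of
    Bayes theorem under assumed conditional independence. The names COUNTS,
    exact_posterior, likelihood_ratio and naive_posterior are hypothetical
    and chosen only for this illustration.

```python
from fractions import Fraction

# Counts from the f1/f2 table in section B.1:
# (f1, f2) -> (patients with D, patients without D)
COUNTS = {
    (0, 0): (1, 11),
    (0, 1): (0, 1),
    (1, 0): (0, 1),
    (1, 1): (5, 1),
}


def totals(counts):
    """Return (number with D, number without D) over the whole table."""
    n_d = sum(d for d, _ in counts.values())
    n_not_d = sum(nd for _, nd in counts.values())
    return n_d, n_not_d


def exact_posterior(counts, findings):
    """P(D | findings) read directly from one row of the table."""
    d, not_d = counts[findings]
    return Fraction(d, d + not_d)


def likelihood_ratio(counts, index):
    """P(f=1 | D) / P(f=1 | -D) for the finding at position `index`."""
    n_d, n_not_d = totals(counts)
    pos_d = sum(d for key, (d, _) in counts.items() if key[index] == 1)
    pos_not_d = sum(nd for key, (_, nd) in counts.items() if key[index] == 1)
    return Fraction(pos_d, n_d) / Fraction(pos_not_d, n_not_d)


def naive_posterior(counts):
    """Odds and probability of D given f1=1 and f2=1, assuming the two
    findings are conditionally independent (simple-form Bayes)."""
    n_d, n_not_d = totals(counts)
    odds = Fraction(n_d, n_not_d)          # prior odds OD(D) = 6/14
    for index in range(2):                 # multiply in each finding's LR
        odds *= likelihood_ratio(counts, index)
    return odds, odds / (1 + odds)


if __name__ == "__main__":
    odds, naive = naive_posterior(COUNTS)
    exact = exact_posterior(COUNTS, (1, 1))
    print(f"exact P(D|f1,f2) = {exact} = {float(exact):.2f}")   # 5/6 = 0.83
    print(f"naive odds       = {odds}")                         # 175/12
    print(f"naive P(D|f1,f2) = {naive} = {float(naive):.2f}")   # 175/187 = 0.94
```

    The gap between 0.83 and 0.94 arises because f1 and f2 nearly always occur
    together in this table, so multiplying their likelihood ratios counts
    largely the same evidence twice.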