A comparison of two methods for retrieving ICD-9-CM data: The effect of using an ontology-based method for handling terminology changes

Academic Article


  • Objective: Most existing controlled terminologies can be characterized as collections of terms, wherein the terms are arranged in a simple list or organized in a hierarchy. These kinds of terminologies are considered useful for standardizing terms and encoding data and are currently used in many existing information systems. However, they suffer from a number of limitations that make data reuse difficult. Relatively recently, it has been proposed that formal ontological methods can be applied to some of the problems of terminological design. Biomedical ontologies organize concepts (embodiments of knowledge about biomedical reality) whereas terminologies organize terms (what is used to code patient data at a certain point in time, based on the particular terminology version). However, the application of these methods to existing terminologies is not straightforward. The use of these terminologies is firmly entrenched in many systems, and what might seem to be a simple option of replacing these terminologies is not possible. Moreover, these terminologies evolve over time in order to suit the needs of users. Any methodology must therefore take these constraints into consideration, hence the need for formal methods of managing changes. Along these lines, we have developed a formal representation of the concept-term relation, around which we have also developed a methodology for management of terminology changes. The objective of this study was to determine whether our methodology would result in improved retrieval of data. Design: Comparison of two methods for retrieving data encoded with terms from the International Classification of Diseases (ICD-9-CM), based on their recall when retrieving data for ICD-9-CM terms whose codes had changed but which had retained their original meaning (code change). Measurements: Recall and interclass correlation coefficient. Results: Statistically significant differences were detected (p< 0.05) with the McNemar test for two terms whose codes had changed. Furthermore, when all the cases are combined in an overall category, our method also performs statistically significantly better (p< 0.05). Conclusion: Our study shows that an ontology-based ICD-9-CM data retrieval method that takes into account the effects of terminology changes performs better on recall than one that does not in the retrieval of data for terms whose codes had changed but which retained their original meaning. © 2011 Elsevier Inc.
  • Authors

    Published In

    Digital Object Identifier (doi)

    Author List

  • Yu AC; Cimino JJ
  • Start Page

  • 289
  • End Page

  • 298
  • Volume

  • 44
  • Issue

  • 2