Docle coding and classification system browser
Oon Y K
Docle Systems
www.docle.com.au
The linnean Docle coding and classification system is the most widely used coding
system among Australian primary health care providers. With the emerging role of
the electronic medical record (EMR), expectations of greater utility from the EMR
are heightened. There will be a need for hospital-hospital and GP-hospital
transfer of medical data, and coding issues will arise in the context of this
data interchange. A transportable electronic medical record without a clearly
defined medical coding system is like a vehicle without wheels. In fact, a
transportable medical record system leveraging the natural expressivity of Docle
is afoot. But which coding system? There is a spectrum of choices, from natural
language at one end to the representation of medical entities as numbers at the
other. Docle is an all but natural alphabetic language of medicine suitable for
machine processing. The Docle framework has equal facility for ‘lumpers’ and
‘splitters’.
Docle is over ten years old, but it did not make much of an impact until it went
linnean in 1995. Within months, it was incorporated in Australia’s most widely
used medical package. The Docle browser operates on over 16,000 medical objects
that are related to one another in a linnean hierarchy, much like what Carolus
Linnaeus did for biological classification. The secret of a good coding system is
properties and not pointing, which also solves the ostension problem. The Docle
system is mapped to the ICD codes and takes a ‘less is more’ approach, recycling
lexical patterns and making judicious use of operators to build powerful
expressions. Docle may well emerge to fulfil the vision of UMLS without carrying
the dead baggage of the old numeric codes. This Docle browser, in a nutshell,
explodes the power of linnean medical classification.
Why Docle?
1) Docle is a belief system modelled on the linnean biological classification
system. The linnean biological framework has been stable for over 200 years.
There is no coding and classification problem in biology. By modelling on this
framework, Docle unleashes the full potential of medical coding and
classification. Medical entities are classified as species and placed in a
hierarchy, much like a biological species such as Homo sapiens. Every one of the
over ten thousand Docle medical objects is thus related in a congruous framework.
For instance, the liver object at the level called ORDER knows all its associated
diseases, symptoms and signs. The Docle health classification system has drawn
the two strands of biology and medicine together with a common linnean model.
2) Backward compatibility and future potential.
The coding system chosen for the future of Australian medical practice must, in a
sense, offer at least the functions of the three buttons one expects on a video
recorder: Play, Rewind and Fast Forward. The linnean Docle coding and
classification system, which is being played on the Medical Director software by
over 6000 Australian general practitioners in 1999, has those three functions.
Docle is in an advanced stage of mapping to ICD9 and ICD10; in that sense it
has a rewind function. Docle has the expressivity of natural language and yet the
precision of numeric coding, and can be mapped to, and made backward and forward
compatible with, any coding system. The fast forward function of Docle is its
intrinsic strength in knowledge representation. The embodiment of such a
knowledge tool is the medical spreadsheet demonstrated at the HIC98 conference in
Brisbane.
3) Simplicity of design framework, clarity of embodiment.
Docle is a linnean, hierarchical system with multiple inheritance. All that means
is that Docle combines some of the ideas of Smalltalk and the linnean biological
classification of Carolus Linnaeus to tackle the issues of medical coding and
classification. The problem in medical coding and classification is comparable to
the challenge of covering all the surface area in, say, a bathroom with band-aids
with either no overlap or controlled overlap. Certain surfaces in the bathroom,
such as the taps and toilet bowls, are difficult to say the least. This metaphor
describes the need to code for every symptom, sign, disease, investigation and
investigation result in clinical medicine. Another way to look at the medical
coding challenge is to imagine the work as similar to that of a biologist
classifying thousands of problematic species like the platypus.
The linnean biological framework comprising phylum, class, order, family and
species is, after modifications, an ingeniously suited framework for classifying
medical objects. Docle is different in that it is totally alphabetic and uses
primary, secondary and tertiary keys to access "objects" that hold the linnean
properties of each medical object. All these objects are linked together in a
congruent 'belief system'. No more mapping of diseases to meaningless numbers.
Docle breaks free from the restrictions of multiaxial coding. Nothing stays the
same in medicine, so the Docle framework is designed for constant change and
improvement in our understanding of medical science. With Docle, the band-aids
come in various shapes and sizes in order that everything is covered in a
congruent manner. One can even imagine tiny band-aids that are arrayed in the
manner of jigsaw pieces to make up the complete area of a large irregularly
shaped band-aid. Certain parts of the bathroom are covered by up to six or seven
layers of band-aids that are meticulously cut, measured and catalogued. Each
band-aid, no matter how small, knows its relationships to every other band-aid in
the bathroom.
Embodiments of Docle:
The primary key for Colles fracture is fracture.radius@coll-es, the secondary
key is frac.radi@colles (the Docle algorithm automatically generates a
standard abbreviation), and the tertiary or alias key is collesFracture.
There is also fracture.radius@galleazi. These two entities have as their genus
fracture.radius^. The entry for Colles fracture was a late entry. The previous
‘best’ code had been fracture.radius (frac.radi), a “band-aid” that covered a lot
of territory; fracture.radius@coll-es is a species of the genus fracture.radius.
A medical entity often has several genera (multiple inheritance). This is akin to
a small band-aid overlaid by several large irregularly shaped band-aids. Another
good name for genus is UMG, or useful medical grouping; remember Docle is
linnean, hence it looks better to use terms like genus, family and order, where
the hierarchy is already predetermined. If you had coded a Colles fracture with
fracture.radius, no matter. In future, when you search for Colles fracture, the
search engine would recruit all the species associated with the genus or UMG
called "fracture.radius", as that is the UMG that fracture.radius@coll-es also
belongs to. The issue of deprecation of codes is important. With Docle, the
scheme used to manage codes that are no longer used is termed Graceful
Deprecation (see below).
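The genus-based recruitment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the name GENERA and the function recruit are hypothetical and are not part of Docle itself, and the older code fracture.radius is assumed to be filed under the genus of the same name.

```python
# Hypothetical miniature index: Docle key -> set of genera (UMGs) it belongs to.
# The keys come from the text; the data structure itself is an assumption.
GENERA = {
    "fracture.radius@coll-es": {"fracture.radius"},
    "fracture.radius@galleazi": {"fracture.radius"},
    "fracture.radius": {"fracture.radius"},  # the older, broader code
    "fracture.femur": {"fracture.femur"},
}


def recruit(key, genera=GENERA):
    """Return every key that shares at least one genus with `key`, sorted."""
    wanted = genera.get(key, set())
    return sorted(k for k, g in genera.items() if g & wanted)


print(recruit("fracture.radius@coll-es"))
```

A search for the Colles fracture key therefore also returns records coded with the broader fracture.radius, which is the behaviour the paragraph describes.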
Updating Docle is easy. A hypothetical example is that of an Escherichia coli
strain causing an outbreak of food poisoning in Melbourne. The microbiologists in
Parkville have identified a new species they have christened the 'Melbourne'
strain. Coding this in Docle is a doddle:
infection<escherichia@coli@melbourne -> infe<esch@coli@melb
as there is, by historical precedent, a whole series of
infection<escherichia@coli@subspecies codes.
The operator characters are . for “located at”, @ for “associated with” and of
course > and < meaning “leading to” and “due to”, plus other useful operators.
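To make the operator notation concrete, here is a minimal Python sketch that splits a Docle expression into words and operators and produces a rough English reading. The function names tokenize and gloss are hypothetical, and only the four operators defined in the preceding paragraph are handled; this is not the Docle parser.

```python
import re

# Operator meanings as given in the text. Docle has further operators
# that this sketch deliberately does not cover.
OPERATORS = {
    ".": "located at",
    "@": "associated with",
    ">": "leading to",
    "<": "due to",
}


def tokenize(expr):
    """Split an expression into words and operator characters, in order."""
    return [t for t in re.split(r"([.@<>])", expr) if t]


def gloss(expr):
    """Render an expression as a rough English reading."""
    return " ".join(OPERATORS.get(t, t) for t in tokenize(expr))


print(tokenize("fracture.radius@coll-es"))
print(gloss("infection<escherichia@coli"))
```

Because the operators are single characters, a plain regular expression split is enough here; a real implementation would also need rules for precedence and for the remaining operators.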
Now compare the above with the complexities of ICD9 and ICD10 below, where a
lifetime of study still will not make you competent in coding. Life is too short
for living with numeric coding systems with self-inflicted complexity.
4) The KISS principle.
Docle is simple and efficient; it is totally alien to the look and feel of numeric
coding systems such as ICD9 and ICD10.
Below is an example of the trauma of working with ICD-9 and ICD-10 coding:
ICD-9-CM codes
910.0 Abrasion or friction burn of face, neck or scalp except eye, without mention of infection
910.1 Abrasion or friction burn of face, neck or scalp except eye, infected
both map forward historically to ICD-10-AM code
S00.01 Superficial injury of scalp, abrasion.
However, in the AR-DRG grouper software, code 910.0 groups to DRGs 492-494 Trauma
to the skin, subcutaneous tissue & breast, while code 910.1 groups to DRGs 489-491
Cellulitis. The two ICD-9-CM codes group to different DRGs but map historically to
the same ICD-10-AM code (S00.01). To maintain congruency, a patch is needed: the
ICD-9-CM codes need to be logically mapped to maintain the DRG groupings. A
logical map from 910.1 to ICD-10-AM code L08.9 Local infection of skin and
subcutaneous tissue, unspecified is needed. The above example shows that when
codes span multiple anatomical sites and pathophysiological processes, and
medical idioms are mixed, future incongruencies inevitably follow. Constructing
medical belief systems for decision support with such a numeric scheme is a
material challenge.
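The incongruency in this example can be detected mechanically. The Python sketch below uses only the codes and DRG ranges quoted above; the function find_conflicts and the dictionary names are hypothetical and stand in for whatever tooling a mapping team might actually use.

```python
from collections import defaultdict

# Values taken from the example in the text.
HISTORICAL_MAP = {"910.0": "S00.01", "910.1": "S00.01"}
DRG_GROUP = {"910.0": "492-494", "910.1": "489-491"}


def find_conflicts(code_map, drg_group):
    """Return target codes whose source codes fall into different DRG groups."""
    by_target = defaultdict(set)
    for source, target in code_map.items():
        by_target[target].add(drg_group[source])
    return {t: sorted(g) for t, g in by_target.items() if len(g) > 1}


print(find_conflicts(HISTORICAL_MAP, DRG_GROUP))

# Applying the logical map for 910.1 described in the text removes the conflict.
LOGICAL_MAP = {**HISTORICAL_MAP, "910.1": "L08.9"}
print(find_conflicts(LOGICAL_MAP, DRG_GROUP))
```

The first call reports S00.01 as receiving codes from two DRG groups; after the patch the report is empty.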
5) The Ostension problem - properties, not pointing
One significant difference between Docle and the traditional numeric approach to
medical coding and classification involves the philosophical issue termed the
ambiguity of ostension. Grappling with this issue is the inflection point in the
evolution of medical coding systems. Ambiguity of ostension is a topic found in
basic philosophy textbooks. The numeric coding schemes use 'pointing' while Docle
uses the concept of 'properties' or behaviour. Pointing leads to a conundrum
referred to in philosophy as the ambiguity of ostension or the fallacy of mere
pointing. In the classic sense this happens when we try to extend the vocabulary
of a young child. In the parable of the inquisitive child (as covered by Gareth
Matthews in 'Philosophy and the Young Child'), a child asks the meaning of the
French word 'La Table'. He is not satisfied with his father merely pointing to a
table. He asks 'How does one know that the pointing is not at the table top or
the colour of the table?'. Looking up the meaning of a word in the dictionary is
not helpful when it just points to a bigger unknown word. Looking up a key in a
numeric coding scheme leads to an obscure numeric code. A grand scheme like UMLS
is like looking up the meaning of a word and being told the equivalent of that
word in Swedish, German and Swahili. Augustine in De Magistro resolved the problem
by stating that the meaning of a word, say "bird-catching", is demonstrated by a
bird-catcher doing his thing. Augustine goes on to say that an observer who is
intelligent enough will eventually catch on to what bird-catching is and hence
what bird-catching means. Meaning is derived from the properties or aspects of
the behaviour of 'bird-catching'. In a sense, the Docle coding scheme comprises
objects with behaviour. Each Docle object is evaluated by its 'behaviour'; its
utility is derived by leveraging its relationships to other Docle objects. It is
much easier for the software implementor to build decision support on a coding
scheme based on properties and behaviour than on a pointing system. An alphabetic
scheme like Docle is much easier to support. Over a weekend, Medical Director
could request 50 to 100 items to be coded in Docle. Which coding scheme can give
that sort of turnaround? Docle can. Remember the fallacy of ostension. The prayer
for every medical informatician is “Give us each day more properties rather than
more pointing in our coding systems and forgive us for the trouble we have caused
with the codes we have deprecated.” At 16,000 items identified and classified
over the past 10 years, the hack work for Docle is largely done.
6) Decision Support
The cogent reason for digitizing medical data is machine-facilitated decision
support. Docle has demonstrated its suitability for decision support in the form
of the PLUM MEDICAL SPREADSHEET, as documented in the HIC98 CDROM.
7) Universal Transportable Medical Record.
Any universal transportable medical record initiative without a defined medical
coding system is like a vehicle without wheels. Watch out for a newcomer that
uses Docle: this project has been christened DocleScript.
8) Size.
Docle is not too big, and that is a plus. UMLS is big, maybe too big. The same
applies to the other numeric coding schemes. I personally think that bigger is
not necessarily better. Who needs 60 terms to describe candida infection? Just
two or three are enough. A big coding list will just slow the pick list. The
optimal size should be large enough to fulfil our needs and no more. Remember
properties and not pointing! When people boast about their 400,000-term lists, it
is more hubris than logic. Resolution matters, not size: a poor resolution
photograph, if blown up, will only show more fuzziness. It is just like photos
captured on a Kodak box camera; no matter how you enlarge the print, the
resolution is still a disappointment. Mapping four fuzzy numeric coding systems
together gives you a fuzzy numeric system four times bigger. There is so much
hubris and hype about the existing numeric "official" codes that we need our eyes
opened to see the emperor's clothes for what they are.
9) Splitters versus lumpers; differentiation versus integration.
The perennial conundrum faced by the WHO classification body has always been
whether to lump codes together or to split them into more specific codes. With
Docle, we can split and lump at the same time. We can differentiate the various
species and subspecies of “DiabetesMellitus”, while the differentiated species
still know their memberships in the various genera. We need complexity, but with
complete integrity. This differentiation/integration design creates outward
simplicity for end users, who are only interested in real life medical entities
and not in abstract artificial codes that span disjointed anatomical areas and
pathophysiology, as in the numeric code examples.
10) Variables versus computer constants
One of the biggest differences between Docle and the prior art of Read, ICD,
SNOMED and ICPC is that Docle uses the computer variable concept, while the rest
of the field is implemented like computer constants. A variable is like a
container. This container in Docle, called a Docle object, can be accessed via
primary, secondary and tertiary keys. The three types of keys are equivalent in
the sense that they all point to the same container with its stored methods and
data. Inside the container is the belief system about the object. While the name
of the variable is fixed, the contents of the variable may vary over time. Docle
separates the belief system data from the key code itself. This separation of the
properties from the code key gives Docle the flexibility to expand and mutate
with the growth of medical knowledge. The key, be it primary, secondary (Docle
abbreviations) or tertiary (aliases), leads to the same medical object with its
stored behaviour. Medical advances will lead to gradual adjustments to the
behaviour of the medical object. It is hard to envisage the need to change
species names such as rheumatoidArthritis or diabetesMellitus.
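The container idea can be illustrated with a small Python sketch in which three keys resolve to one shared object whose contents can change while the keys stay fixed. The classes DocleObject and Registry, and the properties dictionary, are hypothetical names chosen for this illustration; the three keys are the Colles fracture keys quoted earlier in the paper.

```python
class DocleObject:
    """Container holding the properties ("belief system") of one medical object."""

    def __init__(self, primary, genera=()):
        self.primary = primary
        self.genera = set(genera)
        self.properties = {}  # contents may change over time


class Registry:
    """Maps primary, secondary and tertiary keys to the same container."""

    def __init__(self):
        self._by_key = {}

    def register(self, obj, *other_keys):
        for key in (obj.primary, *other_keys):
            self._by_key[key] = obj
        return obj

    def lookup(self, key):
        return self._by_key[key]


registry = Registry()
colles = registry.register(
    DocleObject("fracture.radius@coll-es", genera={"fracture.radius"}),
    "frac.radi@colles",  # secondary key (abbreviation)
    "collesFracture",    # tertiary key (alias)
)

# Updating the contents through one key is visible through every other key.
registry.lookup("collesFracture").properties["note"] = "hypothetical update"
print(registry.lookup("frac.radi@colles").properties)
```

A constant-style code, by contrast, would have nothing behind it to update: the meaning would live only in an external lookup table.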
11) Number codes are not a viable belief system
The linnean system in biology is a viable belief system that is alive, moving on
with the advancement of biological knowledge. It is a framework or road map to
the realm of biology. The gaps in the linnean framework excite the imagination of
biologists about missing links in their knowledge. It is a powerful method of
cognating the knowledge that is being accumulated. This yearning for classifying
and cognating medical knowledge was expressed in the preface of the ICD 9 manual,
but it was just a yearning.
12) Granularity problem and the Genus chunking solution.
The granularity problem is familiar to anyone attempting to write a decision
support program in medicine. An instance of this problem is the flagging of the
disease/drug interaction between the beta-blockers and diabetes mellitus. It
would be tedious, inefficient and prone to error to try to pick up every specific
type of beta-blocker interacting with every variation of diabetes mellitus.
Diabetes mellitus illustrates the benefit of chunking at the genus level.
Chunking up the three variants of diabetesMellitus:
diabetesMellitus@gestation,
diabetesMellitus@insulinDependentDiabetesMellitus,
diabetesMellitus@nonInsulinDependentDiabetesMellitus
into a genus called diabetesMellitus allows the common behaviour to be stored in
the diabetesMellitus genus object. Likewise we can chunk up the therapeutic
species of propranolol, atenolol and metoprolol into the medical genus betaBlocker.
An adverse drug-disease interaction is flagged when the two genera of betaBlocker
and diabetesMellitus are combined. A new beta-blocker will inherit this
interaction behaviour as soon as it is tagged as belonging to the genus
betaBlocker in the container holding its belief system.
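A minimal Python sketch of genus chunking follows. It stores the interaction once, at genus level, and lets species inherit it through their genus membership. The names GENUS_OF, INTERACTIONS, genera_of and interacts are hypothetical, and "newBetaBlocker" is a placeholder rather than a real drug.

```python
# Species -> genera. The species and genus names follow the text;
# the dictionary layout is an assumption made for this sketch.
GENUS_OF = {
    "propranolol": {"betaBlocker"},
    "atenolol": {"betaBlocker"},
    "metoprolol": {"betaBlocker"},
    "diabetesMellitus@gestation": {"diabetesMellitus"},
    "diabetesMellitus@nonInsulinDependentDiabetesMellitus": {"diabetesMellitus"},
}

# The interaction is recorded once, between the two genera.
INTERACTIONS = {frozenset({"betaBlocker", "diabetesMellitus"})}


def genera_of(code):
    """A code counts as a member of its genera and of itself."""
    return GENUS_OF.get(code, set()) | {code}


def interacts(drug, disease):
    """Flag an interaction if any genus pair of the two codes is listed."""
    return any(
        frozenset({a, b}) in INTERACTIONS
        for a in genera_of(drug)
        for b in genera_of(disease)
    )


print(interacts("atenolol", "diabetesMellitus@gestation"))

# A new beta-blocker inherits the interaction as soon as it is tagged.
before = interacts("newBetaBlocker", "diabetesMellitus@gestation")
GENUS_OF["newBetaBlocker"] = {"betaBlocker"}
after = interacts("newBetaBlocker", "diabetesMellitus@gestation")
print(before, after)
```

One genus-level rule here replaces what would otherwise be a table with one row per drug and disease pair.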
13) Why choke on number codes when Docle is a feast in verse?
Numeric coding schemes look like this:
T-2800 M-44060 E-2001 F-03003
for pulmonary tuberculosis with granuloma, while some Docle codes for chest pains
look like these:
chest@pain/neck@move
chest@pain/food
chest@pain/respiratory
chest@pain%min
where the / operator means “increased or aggravated by” and the % operator
denotes quantification.
14) Docle carries with it a free and unified medical abbreviation standard.
The abbreviation is the computer generated secondary key. For example, carcinoma
located at thyroid is coded as carcinoma.thyroid, and the secondary
key/abbreviation is carc.thyr.
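The paper does not spell out the abbreviation algorithm, but the examples it gives are consistent with keeping the first four letters of each word and leaving the operators untouched. The Python sketch below implements only that guess; the function name abbreviate and the width parameter are hypothetical, and keys containing hyphenated words are not covered.

```python
import re


def abbreviate(code, width=4):
    """Truncate each word of a Docle key to `width` letters, keeping operators.

    This is an inferred rule, not the documented Docle algorithm.
    """
    # With a capturing group, re.split alternates words (even indices)
    # and operator characters (odd indices).
    parts = re.split(r"([.@<>/%&])", code)
    return "".join(p if i % 2 else p[:width] for i, p in enumerate(parts))


print(abbreviate("carcinoma.thyroid"))
print(abbreviate("fracture.radius"))
```

Both calls reproduce abbreviations quoted in the paper (carc.thyr and frac.radi), which is the only evidence offered here that the guess is reasonable.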
15) Code shear technology.
Docle is built up of words joined by operators, much like an internet address.
Coded entities are modified by aspects such as laterality, acute, chronic, simple,
compound, complicated and male or female. The modifiers are added to the main
code by clicking buttons. The & character is the shear operator; an example is
the code fracture.femur&rightHandSide&simple. During processing, a substring such
as &rightHandSide can be sheared off, and shearing off every modifier returns the
basic code: fracture.femur.
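Shearing is simple string processing, as this Python sketch suggests. The function names shear, modifiers and shear_one are hypothetical; the only thing taken from the paper is that & separates the basic code from its modifiers.

```python
def shear(code):
    """Return the basic code with every &modifier removed."""
    return code.split("&", 1)[0]


def modifiers(code):
    """Return the list of modifiers attached to a code."""
    return code.split("&")[1:]


def shear_one(code, modifier):
    """Remove a single named modifier and keep the rest."""
    base, *mods = code.split("&")
    return "&".join([base, *[m for m in mods if m != modifier]])


code = "fracture.femur&rightHandSide&simple"
print(shear(code))
print(modifiers(code))
print(shear_one(code, "rightHandSide"))
```

Because modifiers sit after the basic code, a record coded with any combination of modifiers can still be matched against the plain code by shearing first.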
16) Best Practice by stealth
EBM can be encoded inside the Docle object: each disease Docle object can hold a
list of ranked recommended treatments and a list of ranked investigations.
Adoption of a Docle type coding system will achieve Cochrane by stealth.
17) Anatomical belief system behind the Docle framework.
All Docle objects are linked together to form a viable and congruous belief
system, even in the anatomical sense. The need to map every Docle object onto an
anatomical framework has thrown up a previously unnamed body organ by exposing a
gap in the anatomical hierarchy. The anatomical locations scrotum and testis were
missing a superclass, and Docle has christened this organ the tistum. The Docle
for tistum is tist, which has as its subclasses scrotum and testis. Docle is the
first medical coding system with an official term for balls. That a disease entity
involving a finger can trigger the message that the hand is involved is part of
this anatomical belief system.
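The anatomical roll-up can be sketched as a walk up a parent table. In the Python below, the finger/hand and scrotum/testis/tistum relations come from the text, while the single PARENT table, which treats "part of" and "subclass of" alike, and the function involved_sites are simplifying assumptions.

```python
# Child -> enclosing site. Only relations mentioned in the text are listed.
PARENT = {
    "finger": "hand",
    "scrotum": "tistum",
    "testis": "tistum",
}


def involved_sites(site, parent=PARENT):
    """Return the site followed by each enclosing anatomical site.

    Assumes the parent table contains no cycles.
    """
    chain = [site]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


print(involved_sites("finger"))
print(involved_sites("testis"))
```

A disease object located at the finger can thus report the hand as involved without the hand being coded explicitly.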
18) Docle has been subjected to the duress of actual use.
Docle was introduced four years ago and is undergoing stepwise refinement with
constant usage, both in the medical community and in decision support project
development.
19) Graceful Deprecation
All coding schemes get into the sticky situation of having to dispose of codes that are no longer wanted. There are a variety of schemes to deal with codes that are superseded. One way is to publish a list stating that these codes are no longer valid. Another scheme is to use an autogenerated number to represent a code; once used, this autogenerated number is never recycled. The problem is that your electronic medical record will be studded with all these autogenerated numbers. To make sense of them one will need a lookup table, and making sense of the relationships among them will require great ingenuity and effort. Docle employs a softer and gentler approach to deprecated codes, whereby the worst case scenario is being saddled with a human readable and understandable Docle code or expression that makes clinical sense. With future Docle revisions, this scheme for handling deprecated codes, termed Graceful Deprecation, is introduced.
20) The Docle Browser.
The Docle concept is best explained by diving into the browser. It is like the
world wide web, which is hard to explain; it is much easier to give someone an
internet connection and a Netscape browser and let him roam.
The Docle browser is a multi-paned application that shows the Docle framework to
advantage. Properties of each Docle object, such as its phylum, class, order,
family, genus and species, are displayed on selection of a Docle entry. Associated
species with membership of similar genera are also displayed on a listing pane.
The main pick list contents can be altered by selecting the various levels of the
linnean hierarchy. For example, the main pick list can be populated with
anatomical entries by clicking at the linnean Order level. By selecting an
anatomical object, all Docle species with reference to the anatomical object are
listed in the main pick list itself. This ‘diving’ or drilling down capacity makes
the browser a powerful tool for extending and enhancing the integrity of Docle
itself. By listing all conditions associated with the common bile duct, one may
find that a single medical condition has been inadvertently coded twice or
overlaps with another code.
References
Oon, Y Kuang. HISA HIC Conference Proceedings for 1996, 1997, 1998.
www.docle.com.au contains links to articles about Docle.
Matthews, Gareth. Philosophy and the Young Child. Harvard University Press, 1980. Page 96.