The Gelati Syndrome

- the fatal flaw in any numeric/multi-axial  reference coding system -



There is a stark choice of medical coding systems:

·        reference coding system

·        name/scripting coding system.


A medical code often describes a central theme and associated modifiers, for example ‘injury involving cycling and traffic accident’. This is analogous to specifying the type of cone (theme) and the flavours of gelati (modifiers): a cone/gelati combination can be ordered either by quoting a reference code displayed on the wall or by naming what is wanted. Examples of reference coding are the V series injury codes in ICD10, which draw on 98 modifiers, or gelati flavours, for injury. There are only 11,000 slots for V codes, yet there are 3,764,376 combinations of 3 or 4 modifiers from the pool of 98. In theory, fewer than 1% (11,000 out of 3,764,376) of contextual situations can be coded exactly right. Call by name/scripting, as in Docle, is exact and precise in contextual input and output. Docle has in excess of 220 modifiers for injury alone. With up to 4 modifiers, Docle specifies over 24 billion injury variants (including sports) in 3 pages, versus the 7,151,370 pages needed for reference coding to cover the same ground. In medicine, the number of themes/cones and modifiers/gelati keeps expanding; a reference coding system has to keep track of the relationships among the exponentially growing codes. Docle keeps track of the relationships among the individual themes/cones and modifiers/gelati only.


Back to a formal definition of the Gelati Syndrome: an adverse outcome for the patient or a component of the health system arising from the use of a call-by-reference coding system (ICD, ICPC, Read, Snomed), leading to the loss of contextual information or to misinformation, or to suffering from the complexity of using or maintaining such a system.






Coding in medicine is at the crossroads. In a nutshell, the problem boils down to the issues of integrity and efficiency as to which is better for medical data representation:



The thesis of this paper is that text, or all-but-natural text such as Docle, may offer the winning edge through its natural facility for human-readable scripting using modifiers. Reference codes such as Read/Snomed, which are modelled on the old numeric paradigm, appear unable to meet the challenges of e-medicine. Based on the rigid multi-axial systems originally seen with the Snomed codes, these codes map a mainly rigid numeric string to a textual description of the target medical entity, which may contain a plurality of contextual information. The alternative to this method is a system comprising an all-but-natural human and computer-readable language of health, termed Docle. Docle comprises terms that are coded and classified in a Linnean hierarchy in the manner of biological classification. Scripting of these terms, in combination with modifiers and predefined computer syntax, allows the creation of DocleScripts to represent patient medical files and medical knowledge for decision-support. The case for Docle is threefold today:







“Men, it is a mad-brained trick, but it is no fault of mine,” Lord Cardigan.


The lesson from the Charge of the Light Brigade can be applied to medical coding - are all numeric codes like Raglan’s courier?

The circumstances surrounding the Charge of the Light Brigade in 1854 at the Battle of Balaclava show the importance of context. Lord Raglan, the British C-in-C, looked down from a high vantage point and saw a small Russian force capture some British guns located at a ridge. He scribbled an order to the Light Brigade, located at the bottom of the valley, to advance rapidly to prevent the Russians from carrying off the guns. From his vantage point, the meaning was obvious.

Unfortunately, Lord Cardigan, the Light Brigade commander, was too far down to see the crest of the ridge. The only guns he was aware of were the ones which covered the main Russian army at the end of the valley. Unable to extract more information from Raglan’s courier, the officers made what sense they could of the order and ordered the fateful and fabled charge that is immortalised by Tennyson’s poem. The Charge could have been avoided by a series of contextual modifiers, such as a DocleScript-type instruction:


attack &start@location%Map21F9 &target@location%Map21B12 &force@enemy%100 &objective%protection@guns%4 &force@self%50 &casualty@allowable%50 &when%now!


…..maximal loss would have been 50 men, the target would have been obvious.

Medical decision-support needs fine-grained information, and so does medical research, to pick up the tiny nuances that may lead to a breakthrough in understanding. Loss of coded information and loss of context is seen in the injury or V codes in ICD 10, where 3356 V codes cover all types of injuries. By contrast, the Docle codes cover all injuries, including all Olympic and non-Olympic sports events, utilising fewer than 221 modifiers. Using up to 4 modifiers to describe 250 basic types of injuries, over 24 billion injury codes can be scripted. More complex expressions rivalling natural speech are possible using information tunnelling. A system of context conflict catching is described.





This short discussion will show the incongruity of the V codes design, with its limited range of discriminants, or context modifiers, for injuries. There are a total of 98 discriminants, or points of difference, in the V codes. They are: 'accident' 'activities' 'agricultural' 'air' 'aircraft' 'alighting' 'all-terrain' 'animal' 'animal-drawn' 'animal-rider' 'antecedent' 'balloon' 'boarding' 'boat' 'bus' 'canoe' 'car' 'causing' 'collision' 'commercial' 'construction' 'craft' 'cycle' 'cyclist' 'derailment' 'driver' 'drowning' 'eating' 'fall' 'fishing' 'fixed-wing' 'glider' 'ground' 'hang-glider' 'heavy' 'helicopter' 'hit' 'income' 'industrial' 'inflatable' 'injured' 'injury' 'kayak' 'leisure' 'merchant' 'microlight' 'mode' 'motor' 'motor-' 'motor-vehicle' 'motorcycle' 'noncollision' 'nonmotor' 'nonmotor-vehicle' 'nonpowered' 'nonpowered-aircraft' 'nontraffic' 'object' 'occupant' 'off-road' 'outside' 'parachutist' 'passenger' 'pedal' 'pedestrian' 'person' 'pick-truck' 'pick-up' 'powered' 'powered-glider' 'premises' 'private' 'railway' 'rider' 'rolling' 'sailboat' 'ship' 'spacecraft' 'sports' 'stationary' 'stock' 'streetcar' 'submersion' 'three-wheeled' 'thrown' 'traffic' 'train' 'transport' 'truck' 'unpowered' 'van' 'vehicle' 'victim's' 'water' 'water-skis' 'water-transport-related' 'watercraft' 'work'.

The V series of codes are based on the template of Vnn.n[n], where there is a capital V with 2 digits before and one or two digits after the decimal point.


The number of V codes possible for 3 digits: 10 x 10 x 10 = 1,000

The number of V codes possible for 4 digits: 10 x 10 x 10 x 10 = 10,000


There is a total of 11,000 possible V codes maximum, without a total rewrite of the V codes series and without losing the multi-axial information encoded in the number and its location within the code. The V series usually has three or four descriptor subthemes on average.


Using the nCr formula with the 98 discriminants:

3 out of 98 gives 152,096 combinations

4 out of 98 gives 3,612,280 combinations


Based on the assumption that 98 discriminants are all that the health and related professions and the medical research community will ever need, the code space needed is at least


3,612,280 + 152,096 = 3,764,376


There is a gross mismatch between the contextual situations that the user wants to code and the codes allowed for by the V series design.  There are


3,764,376 – 11,000 = 3,753,376


contextual combinations that may be required but have found no expression in the V code series. In short, fewer than 1% (11,000 out of 3,764,376) of contextual situations can, in theory, be coded exactly right. The remaining 99% or more are coded incorrectly, do not have a code assigned to them or get lumped into the unspecified activity categories.
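The arithmetic above can be checked in a few lines of Python. This is only a sketch: the variable names are illustrative, and it assumes, as the text does, that a code combines exactly 3 or 4 of the 98 discriminants with order ignored.

```python
from math import comb

# Code slots available under the Vnn.n[n] template (3-digit plus 4-digit forms).
slots = 10**3 + 10**4

# Unordered selections of 3 or 4 discriminants from the pool of 98.
needed = comb(98, 3) + comb(98, 4)

# Combinations that cannot receive a code of their own, and the fraction that can.
uncoded = needed - slots
coverage = slots / needed

print(f"slots available:     {slots:,}")
print(f"combinations needed: {needed:,}")
print(f"without a code:      {uncoded:,}")
print(f"coverage:            {coverage:.2%}")
```

The coverage works out to roughly 0.3%, so the 1% quoted in the text is, if anything, a generous rounding.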

This has resulted in some interesting ICD codes with dubious context:


"V81.70","Occupant of railway train or railway vehicle injured in derailment without antecedent collision, while engaged in sports activity","Y",

"V81.71","Occupant of railway train or railway vehicle injured in derailment without antecedent collision, while engaged in leisure activity","Y".





Since there is a mismatch  of the number of  codes and the combinations possible with the discriminants, information coded must be retrieved via a lookup table and not by  inspection of the ICD codes themselves.
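The dependence on a lookup table can be illustrated with the four V codes quoted in this paper. The sketch below is hypothetical: `ICD_TABLE` and `codes_mentioning` are names invented here, and a real table would hold thousands of rows. The point is that a discriminant can only be found by searching the descriptions, never the code strings themselves.

```python
# Descriptions copied from the V codes quoted in this paper.
ICD_TABLE = {
    "V10.30": "Pedal cyclist injured in collision with pedestrian or animal, "
              "while boarding or alighting, while engaged in sports activity",
    "V43.4": "Car occupant injured in collision with car, pick-up truck or van, "
             "while boarding or alighting",
    "V81.70": "Occupant of railway train or railway vehicle injured in derailment "
              "without antecedent collision, while engaged in sports activity",
    "V81.71": "Occupant of railway train or railway vehicle injured in derailment "
              "without antecedent collision, while engaged in leisure activity",
}


def codes_mentioning(discriminant: str) -> list[str]:
    """Return the codes whose description (not the code itself) contains the word."""
    needle = discriminant.lower()
    return sorted(code for code, text in ICD_TABLE.items() if needle in text.lower())


print(codes_mentioning("alighting"))
print(codes_mentioning("derailment"))
```

Nothing in the strings "V10.30" and "V43.4" signals that both involve alighting; that link exists only in the table.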


Example :

The discriminant ‘alighting’ is found in both V codes below, yet no digit position common to both codes represents ‘alighting’. This is to be expected, as the coding space allows for less than 1 percent of the possibilities. In other words, the user will have to look up the ICD table to see what was actually being coded.


"V10.30","Pedal cyclist injured in collision with pedestrian or animal, while boarding or alighting, while engaged in sports activity","Y",

"V43.4","Car occupant injured in collision with car, pick-up truck or van, while boarding or alighting","N".





What happens in the case of the professional golfer hurt by a golf buggy while coaching a Japanese tourist at Cape Schanck? Should it be coded as ‘V99.90’, ‘V99.91’ or ‘V99.92’, or is it ‘V98.8’, ‘V98.94’ or ‘V98.99’, or just ‘V99’, or all of the above? Would the system allow a couple of V codes? Would not a single unified code do? This makes coding very prone to error. It makes the training and staffing of coders a very specialised and arduous task, like the teaching of Latin.




If the number of discriminants stays the same, then there is no maintenance problem at all. As more discriminants are added, the system becomes flakier, because the location-specific number becomes meaningless. With time, however, the pressure will mount for modifiers like … Vietnam, Timor, and even ethnic labels, in order to do good public health research.


Some sports-mad public health officials would want to throw in a large number of sporting events, such as the ones at the Sydney 2000 Olympic Games. A sports medicine doctor might be interested in looking at injuries involved in distance running. The V codes are confined to two decimal digits before the point and at most two after it, so adding new discriminants will throw everything out of kilter. To make the new framework backward compatible is a nightmare. There are over 100 Olympic events; the number of valid combinations of injuries is a gargantuan figure. To code for all the possible types and causes of sporting injuries using the ICD paradigm is a big ask.


A good coding system must be able to cope with this technical problem of  the explosion  of  terms in  medical coding or any area of knowledge coding. When the situation arises or when the end-user wants coding for tiny nuances, it should not be necessary to add  thousands of number codes. The explosion of  codes that was seen in the Read coding system, which reached 400,000, must have made maintenance a nightmare.


Information retrieval


An ICD code, once applied, creates a situation whereby it is hard to retrieve or to analyse a particular subcomponent of the code. For example, the code for pedestrian injuries can be anything from ‘V01’ to ‘V09’. The digits of the decimal code do not appear to have a numeric pattern that is mapped to a particular activity such as ‘collision’, a ‘truck’, a ‘car’, ‘sports activity’ or ‘work’. Because the ICD code itself (like other similar numeric codes) does not have a fixed alphanumeric pattern mapped to an external reality, searching for subcomponents is possible but excruciatingly difficult. To search for all truck injuries, a belief system has to be constructed from the ICD list, in which all truck injuries are mapped to the collection of ICD V codes that have a truck as a possible cause of the injury. This lack of “regular patterns” means that the authors of decision-support software cannot use the technique of lazy parsing, whereby pattern or string-matching searches are used to locate potential candidates fulfilling the required criteria. It also means that the moment a single new ICD code is issued, the analytical tools for ICD diseases are out of date. To bring an analytical tool back to correctness, each new ICD code has to be analysed, its impact on every subroutine of the analytical machine has to be assessed, and each subroutine has to be individually updated.


Limitation on the level of complexity of events described


Scientists may become more curious. They may want to study the following:


From another angle, some minor social practices may lead to profound changes in disease demographics. These tiny nuances are the source of many medical breakthroughs. An example, and a hot topic, is the role of condom use or the practice of circumcision in the spread of AIDS. A disease representational system must be able to code for these tiny variations as a minimum requirement. In that sense the current ICD scheme is not sensitive or expressive enough to code for the comprehensive picture that researchers require. These same researchers might create ad hoc series of codes to be used once and discarded, which is unfortunate. A powerful coding scheme, by contrast, can take on virtually any health research project and collate the results of medical research over the generations.


One of the coding challenges in medicine (or in any knowledge domain) is the problem of converting handbook and textbook guidelines into fully-fledged computerized decision-support systems, offering answers at the point of clinical decision-making. This is distinct from the conversion of text material into a series of lookup pages in HTML displayed in a web browser. The type of decision-support required is more like this: the user does data entry into a clinical recording system and, at the click of a button, the computer responds with the most pertinent treatment plans tailored to that particular patient.


The  business rule that needs to be captured is like:


If the patient with endocarditis is an adult, has no allergy to penicillin, then give

Benzylpenicillin 2g 4hourly intravenously for 2 weeks


Gentamicin 2mg/kg 4 hourly intravenously for 7 days, monitoring blood levels


Vancomycin 1 g intravenously 12 hourly  PLUS  Gentamicin 2mg/kg 4 hourly intravenously, both for 3 days.


Coding the above with a numeric coding system is a challenge.






This system and method of coding for medical data and knowledge is, of course, not confined to medicine and can be extended to any other knowledge domain. The escalation of medical codes, such as seen in ICD10, Snomed and the National Library of Medicine’s UMLS, can be checked by scripting medical themes and modifiers that are already predefined in a Linnean multi-inheritance hierarchical framework, as outlined below. These predefined words can be combined to represent simple and complex data with a precision suitable for human understanding and computer processing. Knowledge, such as that held in medical texts and guidelines, can be translated into these scripts, termed DocleScript. DocleScript employs the concept of a script with a theme, plot and subplot construct. The problem of data and knowledge representation in medicine and other domains such as law or engineering is akin to that of data and knowledge representation in English. In crude form, we need a lexicon, such as an Oxford Dictionary of the English language, which lists all the words, their correct spellings and their meanings. We then need syntactic and semantic rules to combine the English words into prose, verse, treatises, essays or reports that convey the information. The Docle notational, Linnean classification system of medical terms provides the grist for this invention. This invention is a scripting system and method that uses controlled combinations of these predefined terms, which are themes and modifiers, in constructs that can accurately represent medical and non-medical data, knowledge and concepts.

There is a limited number of species (or special unique) terms in medicine; each species is associated with its unique set of identifying criteria, prognostication and treatment. However, there is an infinite variety of variations of these species terms, much like the situation of the species called Homo sapiens. There are Homo sapiens that work for the bank and Homo sapiens that live in the Arctic regions or the tropics. There are dark-skinned Homo sapiens and there are Homo sapiens who are still hunter-gatherers. The problem arises from the prior art of attempting to force variants of medical species into their own unique species, each with a unique code.


The solution is to limit coding to terms identified as species only and to describe variants using modifiers. Furthermore, the modifiers are structured into contextual constructs called plots and subplots. Because there is no limit to the size of the plots or to the number of layers of subplots, which enable information tunnelling, there is no limit to the expression of the most complicated ideas and concepts in data and knowledge coding.


The solution is to define a computer scripting language underpinned by keywords held in a Linnean medical belief system. There are two types of keywords. Keywords preceded by the ‘&’ character are known as modifiers. Keywords without the ‘&’ character are symptoms and signs, diseases, investigations and results of investigations, procedures, process of care and therapeutics; they become the subject of themes. Representing the knowledge in a medical text in a form suitable for decision-support requires a data and knowledge representation system. The solution in this invention is based on a Linnean medical belief system for all modifier and thematic (non-modifier) terms.


These modifier and non-modifier terms are written in an all but natural English expression that is human readable and suitable for computer  processing. To code for knowledge representation, the prerequisite is the ability to code for basic events such as medical events using small scale scripting with the  modifiers construct.

Since all the keywords are human-readable and non-numeric based, they are  immediately decipherable. It is unlikely the user will code for nonsensical scripts utilising the wrong modifiers. These keywords, termed themes and modifiers, are classified and arranged  in a Linnean hierarchy with multiple inheritance. The Docle Linnean framework models the ideas of biological classification started by Linnaeus in the 1750s.  One of the central tenets of biological classification is the concept of the species.  The other tenets are the hierarchies and the concept of the taxon (plural taxa).  A taxon is a group with shared values in each hierarchy.  Species identification is half the work; the other half involves placing the species in the right taxon in the right hierarchy.


The system of classification in Docle is based on the above framework with modifications.






The coding system uses the system and method of constructing a script to describe  a medical event.  The script comprises a theme, a plot construction by appending predefined preclassified word modifiers and optional series of subplots embedded in parentheses. Information tunnelling is achieved using the subplot construct. As there are infinite layers of sub-plotting allowed, very detailed description of events, without fear of contextual confusion, is achievable.


A practical example of a theme is ‘injury’.


Examples of modifiers, which are marked with the initial ‘&’ character, are:


&ctx%driver &ctx%car &ctx%accident@traffic  &ctx%coll-ision@headOn   &speed%80kph    &ctx%truck  &speed%100kph


Hence, in Extended Backus-Naur Form (EBNF), the script is:

theme = doclePrimaryKey

script = theme [ plot ]

plot = { modifier } [ subplot ]

subplot = "(" { modifier } { subplot } ")"


Plots are just a sequence of modifiers, whereas subplots are enclosed in parentheses. Subplots are recursively defined in the EBNF notation above.

Subplots modify the behaviour of the modifiers at the information level at which they occur.
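The grammar lends itself to a small recursive-descent parser. The Python sketch below is one possible reading of the EBNF, not the Docle implementation: the class and function names are invented here, a theme is assumed to be a single token without a leading ‘&’, and a modifier is assumed to be any token that starts with ‘&’.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Subplot:
    modifiers: list = field(default_factory=list)
    subplots: list = field(default_factory=list)


@dataclass
class Script:
    theme: str
    modifiers: list = field(default_factory=list)
    subplot: Optional[Subplot] = None


def parse_script(text: str) -> Script:
    # Tokens are "(", ")" or any run of characters without spaces or parentheses.
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def modifiers() -> list:
        nonlocal pos
        found = []
        while pos < len(tokens) and tokens[pos].startswith("&"):
            found.append(tokens[pos])
            pos += 1
        return found

    def subplot() -> Subplot:
        # subplot = "(" { modifier } { subplot } ")"
        nonlocal pos
        pos += 1  # consume "("
        node = Subplot(modifiers=modifiers())
        while pos < len(tokens) and tokens[pos] == "(":
            node.subplots.append(subplot())
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        pos += 1  # consume ")"
        return node

    if not tokens or tokens[0] in ("(", ")") or tokens[0].startswith("&"):
        raise ValueError("a script must start with a theme")
    script = Script(theme=tokens[0])
    pos = 1
    # plot = { modifier } [ subplot ]
    script.modifiers = modifiers()
    if pos < len(tokens) and tokens[pos] == "(":
        script.subplot = subplot()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return script


example = ("injury &ctx%driver &ctx%car &ctx%accident@traffic "
           "( &ctx%coll-ision@headOn ( (&ctx%car &speed%80kph) (&ctx%truck &speed%100kph) ) )")
parsed = parse_script(example)
print(parsed.theme, parsed.modifiers)
print(parsed.subplot)
```

Because `subplot` calls itself for every nested parenthesis, the depth of information tunnelling is limited only by the input.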


Some demonstrations of theme, plot and subplot constructions

The example of the car driver that was injured in a traffic accident during which his car, travelling at 80kph, collided head on with a truck travelling at 100kph.


injury &ctx%driver &ctx%car &ctx%accident@traffic ( &ctx%coll-ision@headOn  ( (&ctx%car &speed%80kph)   (&ctx%truck  &speed%100kph ) ) )


The example of the truck driver that was injured in the same traffic accident  as above. An extra layer of information tunnelling is used to describe the sex of the car driver as male, aged 18 and that his car was red.


injury  &ctx%driver &ctx%truck  &ctx%accident@traffic ( &ctx%coll-ision@headOn    ( ( &ctx%truck &speed%100kph) (&ctx%car  &speed%80kph &color%red ( &ctx%driver &sex%male &age%18)  ) ) )


The above information regarding the injury of the truck driver can be represented as an indented, layered contextual information diagram, as shown below:



            &ctx%driver &ctx%truck 



                                    &ctx%truck &speed%100kph

                                    &ctx%car  &speed%80kph &color%red

                                                &ctx%driver &sex%male &age%18
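A layered view of this kind can be derived mechanically from the script by tracking parenthesis depth. The following Python sketch is illustrative only: the function name and the four-space indent per level are assumptions, so the output may differ in detail from a hand-drawn diagram.

```python
import re


def layered_diagram(script: str, indent: str = "    ") -> str:
    """Render one line per group of tokens, indented by its information level."""
    lines = []
    depth = 0
    current = []  # tokens collected at the current level since the last parenthesis

    def flush():
        if current:
            lines.append(indent * depth + " ".join(current))
            current.clear()

    for token in re.findall(r"[()]|[^\s()]+", script):
        if token == "(":
            flush()
            depth += 1  # tunnel one level down
        elif token == ")":
            flush()
            depth -= 1  # return to the enclosing level
            if depth < 0:
                raise ValueError("unbalanced ')'")
        else:
            current.append(token)
    flush()
    if depth != 0:
        raise ValueError("unbalanced '('")
    return "\n".join(lines)


truck_driver = ("injury &ctx%driver &ctx%truck &ctx%accident@traffic "
                "( &ctx%coll-ision@headOn ( ( &ctx%truck &speed%100kph) "
                "(&ctx%car &speed%80kph &color%red ( &ctx%driver &sex%male &age%18) ) ) )")
print(layered_diagram(truck_driver))
```

Each extra level of indentation in the output corresponds to one more layer of information tunnelling in the script.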



What happens when it is two trucks colliding head on? The   protagonist context ‘&ctx%prot-agonist’ is used to distinguish it from the other truck. The injured patient’s truck was travelling at the sedate pace of 60kph. Example below shows the other truck driver was aged 80 and travelling at 120kph.


injury &ctx%driver &ctx%truck &ctx%accident@traffic ( &ctx%coll-ision@headOn ( ( &ctx%truck &speed%60kph &ctx%prot-agonist) (&ctx%truck &speed%120kph ( &ctx%driver &sex%male &age%80) ) ) )


The indented diagram to show the information layers:



            &ctx%driver &ctx%truck 



                                    &ctx%truck &speed%60kph  &ctx%prot-agonist

                                    &ctx%truck  &speed%120kph

                                                &ctx%driver &sex%male &age%80


It is possible to add information such as alcohol levels to the parties involved if required. This scripting shows the possibility of coding for public health and medical research utilising these constructs. For example there are over 24 billion  potential  combinations for the coding of injuries using just  four modifiers out  of a range of 220 modifiers combined with the 250 basic injury codes. The 250 injury codes stem from the generic (genus)  concept of  injury plus 249 types of  injury associated with each specific anatomical site using the “dot” or  “located at” operator variants.


Examples: injury ; injury.brain ; injury.foot ; injury.finger ; injury.eye


In addition there are more than 220 Docle modifiers for the injury, these specify how the injury took place, notated such as:


 injury &ctx%driver &ctx%car &ctx%accident@traffic


When modifiers are used without repetition, all the possible combinations can be calculated using the formula:

nCr = n! / ( r! (n - r)! )

This standard formula gives the number of selections of “n” different objects taken “r” at a time, that is, the number of “r”-subsets that can be formed from an “n”-set. For modifiers recruited from the pool of 220 in groups of 1 to 4, the numbers of combinations are:

220C1 = 220

220C2 = 24,090

220C3 = 1,750,540

220C4 = 94,966,795

Therefore the number of selections possible is:

220 + 24,090 + 1,750,540 + 94,966,795 = 96,741,645

given the number is restricted to the range from 1 to 4 out of the pool of 220. Based on the use of up to four modifiers, the possible number of injury codes is:

96,741,645 x 250 = 24,185,411,250

potential Docle codes for injuries. If we add the 250 injury codes that have zero modifiers, we get the grand total of 24,185,411,500. That is over 24 billion codes for injury alone, but of course these include codes for all types of sporting injuries, inclusive of all Olympic sports events.
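The totals above can be reproduced with the standard library. The sketch below uses illustrative names and the figures given in the text (a pool of 220 modifiers, 250 base injury codes, 1 to 4 modifiers per script, no repetition, order ignored).

```python
from math import comb

POOL = 220        # injury modifiers available
BASE_CODES = 250  # generic 'injury' plus 249 anatomical sites

# Number of ways to choose r modifiers from the pool, for r = 1..4.
selections = {r: comb(POOL, r) for r in range(1, 5)}
total_selections = sum(selections.values())

# Every selection can be attached to every base injury code.
scripted = total_selections * BASE_CODES

# Add the 250 base codes used with no modifiers at all.
grand_total = scripted + BASE_CODES

for r, count in selections.items():
    print(f"{POOL}C{r} = {count:,}")
print(f"selections:  {total_selections:,}")
print(f"scripted:    {scripted:,}")
print(f"grand total: {grand_total:,}")
```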


Needless to say, when eight modifiers are specified, the potential number of codes will be astronomical, all these without the use of the subplotting construct.


The sensible objection is to ask whether we need such a detailed coding system. Closer inspection shows that we do not have that many codes. We have only a collection of 250 injury codes: a generic code for ‘injury’ plus one for each of the 249 currently defined anatomical structures, making a grand total of 250 types of injuries, all linked via an anatomical belief system.


The important concept here is the breaking down of a coding exercise into the main theme, its nuances (plot) and an unlimited depth of micro-nuances (subplots) when the situation arises. The elegance of this solution is that we can define the scripting and its lexical elements, yet restrict their use to only a tiny subset of the definitions, if there is any irrational fear of overcoding. The modifier plot/subplot construct allows the formation of scripts with tenuous rather than tenacious links as in the ICD scheme. In the latter case, discriminants such as ‘pedestrian’ and ‘traffic accident’ are tenaciously bonded together in a series of compounded expressions, each tenaciously mapped to a single, mainly numeric code. Managing change in medicine needs a “tenuous” rather than a “tenacious” stance. The tenuous and flexible method in this scripting language allows for graceful development of coding expressions to express the new ideas and subtle nuances needed by the researcher.


Another example of how  information is tunnelled under a lower level, similar to a city multi-level carpark:


(&ctx%bite ( &ctx%man &ctx%dog ) ) - in this subplot we are not sure who bit whom; someone was bitten

(&ctx%man  &ctx%bite &ctx%dog   ) – in this subplot it is clear man bites dog as the actor precedes the object with the verb interposed.


The good news is that regular patterns are easy to search and they simplify data retrieval. DocleScript keeps evolving towards a natural language (just like English), yet is apprehensible to the computer.






The keywords and modifiers in the Docle coding system are in an all-but-natural language that reads almost like ordinary English. Any contextual errors can potentially be picked up by an alert user. To automate context conflict detection, a Context Conflict Catcher can be implemented: a program written to vet the correctness of the scripting. It catches logical errors such as a code for:


injury &ctx%driver &ctx%pedestrian


where the injured person cannot be the driver and the pedestrian concurrently.


The above scripting has a theme called ‘injury’. The ‘&ctx%driver’ can be analysed as  components. The ‘&’ character denotes that it is a modifier, the ‘ctx’ means context, the ‘%’ character means value, and the ‘driver’ defines the context.
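This decomposition is simple enough to express directly in code. The Python sketch below is an assumption-laden illustration: the `Modifier` type and `parse_modifier` function are invented here, and everything after the first ‘%’, including any ‘@’ qualifiers, is kept as an opaque value.

```python
from typing import NamedTuple


class Modifier(NamedTuple):
    name: str   # e.g. "ctx" for context
    value: str  # e.g. "driver"


def parse_modifier(token: str) -> Modifier:
    """Split a token of the form &name%value into its two components."""
    if not token.startswith("&"):
        raise ValueError(f"{token!r} is not a modifier: missing '&'")
    name, sep, value = token[1:].partition("%")
    if not (name and sep and value):
        raise ValueError(f"{token!r} is not of the form &name%value")
    return Modifier(name, value)


print(parse_modifier("&ctx%driver"))
print(parse_modifier("&speed%80kph"))
print(parse_modifier("&ctx%accident@traffic"))
```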


One way of catching contextual conflicts is to limit the modifiers offered in actual coding practice. For example, when the coder selects ‘injury’, the user is prompted for the location of the injury, then for traffic or non-traffic injury, the type of vehicle, and whether the patient is a passenger, driver, pedestrian, etc. This limits the potential for contextual conflicts yet, because of the availability of modifiers, still allows choice for the investigators involved in a research project.


Another way of catching contextual conflicts is to employ a dictionary of incongruous terms that are stored as associations. For example, the association of ‘&ctx%driver’ with ‘&ctx%pedestrian’ is incongruent at the same information level. Another incongruent association is that of ‘&ctx%driver’ with ‘&ctx%passenger’ at the same information level.
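A dictionary-based catcher along these lines might look as follows. This is a minimal Python sketch, not the Docle implementation: `INCONGRUOUS` holds only the two pairs named in the text, `find_conflicts` is a hypothetical name, and each parenthesised subplot is treated as its own information level.

```python
import re

# Hypothetical dictionary of incongruous associations (unordered pairs).
INCONGRUOUS = {
    frozenset({"&ctx%driver", "&ctx%pedestrian"}),
    frozenset({"&ctx%driver", "&ctx%passenger"}),
}


def find_conflicts(script: str) -> list:
    """Return pairs of modifiers that clash within the same information level."""
    conflicts = []
    stack = [[]]  # one list of modifiers per currently open level

    def check(level):
        for i, first in enumerate(level):
            for second in level[i + 1:]:
                if frozenset({first, second}) in INCONGRUOUS:
                    conflicts.append((first, second))

    for token in re.findall(r"[()]|[^\s()]+", script):
        if token == "(":
            stack.append([])        # open a new level
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            check(stack.pop())      # vet the level that just closed
        elif token.startswith("&"):
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    check(stack.pop())              # vet the top-level plot
    return conflicts


print(find_conflicts("injury &ctx%driver &ctx%pedestrian"))
print(find_conflicts("injury &ctx%passenger ( &ctx%driver &sex%male )"))
```

The second script raises no conflict because the passenger and the driver sit at different information levels.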


Patterns can also be used to catch context conflicts. Problematic conflicts are documented and stored as a set of “dirty character strings”. Vetting for correctness involves matching the “dirty” strings against the putative script; if one matches, the script is flagged as suspect. Efficiency can be improved by presorting the tokens into alphanumeric collating sequence.
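One way to read the presorting idea is sketched below in Python. The function names and the two “dirty” entries are assumptions made for illustration, and the sketch vets a single plot level without parentheses. Sorting both the script tokens and the dirty tokens means each check is a single left-to-right pass, whatever order the coder typed the modifiers in.

```python
def canonical(plot: str) -> tuple:
    """Sort the modifier tokens so that equivalent orderings compare equal."""
    return tuple(sorted(t for t in plot.split() if t.startswith("&")))


# Hypothetical store of documented conflicts, kept in canonical (sorted) form.
DIRTY = [
    canonical("&ctx%pedestrian &ctx%driver"),
    canonical("&ctx%passenger &ctx%driver"),
]


def contains_sorted(haystack: tuple, needle: tuple) -> bool:
    """Single pass over two sorted sequences: is every needle token present?"""
    matched = 0
    for token in haystack:
        if matched < len(needle) and token == needle[matched]:
            matched += 1
    return matched == len(needle)


def is_suspect(plot: str) -> bool:
    tokens = canonical(plot)
    return any(contains_sorted(tokens, dirty) for dirty in DIRTY)


print(is_suspect("injury &ctx%pedestrian &ctx%car &ctx%driver"))
print(is_suspect("injury &ctx%driver &ctx%car"))
```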


The Linnean medical belief system can also be used to parse the script for conflict detection.


The health system may be interested in creating a list of, say, 5000 mini DocleScripts for coding injuries. To select a code, users will then click on the key description or the most suitable DocleScript.






Any coding scheme that reaches 100,000 items or more becomes anaplastic and difficult to maintain and use. The medical informatics community of the 21st century does not need a medical coding system as such, but is in dire need of a medical language system. To describe the situation of the numeric reference coding systems, we can borrow Lord Cardigan’s succinct words, “Men, it is a mad-brained trick”. In the event that things go awry, we do not have the luxury of Lord Cardigan’s defense, “but it is no fault of mine”. In the scenario that a national health system mandates a numeric reference coding system, we can paraphrase Alfred, Lord Tennyson:


            Was there a man dismayed?

            Not though the soldier knew

            Someone had blundered.

            Theirs not to make reply,

            Theirs not to reason why,

            Theirs to do and die,

            Into the valley of Death

            Rode the six hundred medical coders,  medical software developers…..


On a lighter note, maybe it will only be gelati on our face for a short while until common sense reigns supreme.


Footnote: The order by Lord Raglan, scribbled in pencil by General Richard Airey, had contextual problems, “Lord Raglan the Commander-in-Chief wishes the cavalry to advance rapidly to the front, and try to prevent the enemy carrying away the guns…..”.


APPENDIX: More examples in injury coding


injury &ctx%aircraft@comb-at  &ctx%peace  - injuries associated with warplanes in peacetime.


injury &ctx%aircraft@comb-at  &ctx%war  - injuries associated with warplanes during war.


injury &ctx%school@excu-rsion - injuries sustained during school excursions


injury &ctx%cycl-ing  &ctx%helm-et@non - injury sustained while cycling without helmet


injury &ctx%driv-er  &ctx%accident@traffic &ctx%coll-ision@rear@server - injured driver drove into the rear of another vehicle. This might be relevant in a study on whiplash.


injury &ctx%driv-er  &ctx%accident@traffic  &ctx%coll-ision@rear@rece-iver - injured driver’s car was hit from the rear. This might be relevant in a study on whiplash.


injury &ctx%pass-enger  &ctx%accident@traffic   &ctx%pass-enger@rear@driv-er  &ctx%coll-ision  &ctx%pole  - injured passenger was sitting behind the driver. This might be relevant in a study on which spot is safest for the passenger when the car wraps around a lamp post.


injury &ctx%occu-pant &ctx%car &ctx%accident@traffic &ctx%coll-ision@rear@receiver ( &ctx%coll-ision@rear@server &ctx%driver &sex%male &age%17 ) – injured occupant was in a car which was hit from the rear. The driver of the other vehicle is male and aged 17. This might be relevant in the tenuous situation when the personal coding of patient health can be drawn out for the good of the public health. If we get a rash of paraplegics or serious head injuries from 17-year-olds serving up bad rear-end collisions, something can be done quickly. This is an example of information tunnelling using subplot construction.


Coding for complex medical entities can be standardised to extract maximal relevant  information possible for public health and the common good  with the least input effort ( clever front ends). The user need never see the DocleScripts as such. The Docle coding and classification process is AHEAD of the user and can respond by adding a handful of modifiers to reflect public health changes. For example, Timor ( &ctx%timor ) or Vietnam (&ctx%vietnam)  context modifiers  may be required for Australian servicemen.


These examples also show the benefits of a self-descriptive coding system. For instance, a search for the pattern ‘&ctx%accident@traffic’ will extract all the cases of traffic accidents. This is called “lazy parsing”, as we search for and collect meaningful cases using dumb string matching.
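Lazy parsing needs nothing more than a substring test. In the Python sketch below the scripts are taken from the appendix examples, while the record identifiers and the `lazy_search` function are invented for the illustration.

```python
# Scripts copied from the appendix; the record ids are made up for this sketch.
RECORDS = {
    "rec-1": "injury &ctx%school@excu-rsion",
    "rec-2": "injury &ctx%cycl-ing &ctx%helm-et@non",
    "rec-3": "injury &ctx%driv-er &ctx%accident@traffic &ctx%coll-ision@rear@server",
    "rec-4": "injury &ctx%driv-er &ctx%accident@traffic &ctx%coll-ision@rear@rece-iver",
}


def lazy_search(pattern: str) -> list:
    """Collect the records whose script contains the pattern as plain text."""
    return sorted(rid for rid, script in RECORDS.items() if pattern in script)


print(lazy_search("&ctx%accident@traffic"))
print(lazy_search("&ctx%coll-ision@rear"))
print(lazy_search("&ctx%helm-et@non"))
```

A plain substring test will also match a pattern that happens to be the prefix of a longer token, so a production search might compare whole tokens instead. That refinement is left out of this sketch.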


The ABC modifiers:

Copyright and all rights reserved – Dr Y K Oon.

Aspects of  Docle and DocleScript systems and methods  are patent pending. The exact syntactical construct  of DocleScript is changing as it undergoes refinements towards simplification. This paper is an abstract of  the submission to the GPCG coding jury Sydney 2000. It was also presented at the HIC/HISA conference in Adelaide, 3-5 Sep 2000.



Dr Y Kuang Oon

Docle Systems

29 Darryl St


Victoria 3179