Open-Source Medical Information Management

Copyright 1999, Daniel L. Johnson



I have no commercial interest in any of the software discussed in this essay; in fact, I've spent a lot of my own money on this project just for the pure pleasure of it. My only conflict in this arena is that I have lately come to own a little Red Hat stock through accident of birth.

I am not a GNU/Linux or free software / open source zealot; I simply recognize its genuine strengths and enormous potential. I am not opposed to commercial software; in fact, I am an investor in and board member of Technology Concepts, Inc., a provider of real estate database software that uses no free or open source technology and is wedded to Microsoft technology.

My employer, the Red Cedar Medical Center, and our owner, the Mayo Regional Health System, have not supported this work, nor have they been asked to endorse it. It is purely my own.

I have done enough coding to know that my time is better spent supporting skilled hackers rather than trying to become one. I have watched the computer industry carefully for twenty years, but I do not know nearly enough about it; in this essay I have done my best to tell the truth: all errors are inadvertent, and I'll be grateful to be educated where you see a need for it.

Whenever something I say in this document seems not to make sense, please consider it a failed attempt at humor.

Author's background

I am Daniel L. Johnson, an internist from Menomonie, Wisconsin. I've had an interest in office ergonomics for about 30 years, since being an office supervisor before medical school; and an interest in finding ways to use computers to aid clinical efficiency since about 1983, when I found a replacement for the accounts-receivable software that my clinic was using. In 1985, after a change in practices, I became interested in the intellectual and manual processes of mining information from a clinical record for medical decision-making. This led me in 1986 to write a specification for software to make the intellectual tasks of the office physician more efficient, but the databases and tools did not yet exist to make this feasible. This specification, updated for current technology, is available at

The tools now exist to create such a system, but it remains to be determined whether interest, motivation, and human capital can be assembled to bring this about. This specification I whimsically call "QuickQuack."

The HTML version of this essay will be located at

You may contact me at
    johnson DOT danl AT mayo DOT edu
    johnsondanl AT m1 DOT uwstout DOT edu

Open-source medical software.

The focus of this essay is medical-record software aimed at the outpatient-care setting. Hospital care requires record-keeping with an entirely different philosophy than outpatient clinical care, such that it is not possible to do justice to both in one presentation.

Hospital documentation is oriented around single, completely encapsulated events of care, lasting hours to weeks. Outpatient-clinic documentation is oriented around longitudinal care for at least an episode of illness, but in primary medicine, the "episode of care" is the patient's entire life. In the most general sense, any chronic illness or primary care requires rapt attention to the longitudinal aspects of the patient's condition; a transient condition is an "interlude"[1] of care.

Thus a hospital-based electronic medical record could be subsumed within an outpatient-clinic record, as a "lobe" off the main record; but it is not possible to take a record designed for the hospital and generalize it to the outpatient-clinic setting.

A strategy of any project must be to begin simply; to identify accurately the essentials of the electronic record and focus on these while planning for the inevitable addition of complexity and evolutionary needs. One of my goals here is to begin to identify the "kernel" of the record and recognize the directions it must take.

Extant Computerized Medical Information Management Projects
Project Internet address for more information, if any

(principal author or project leader, project location)


(Greg Wettstein, Fargo, North Dakota)




(Stephan R.A. Deibel, John Ehresman, Massachusetts)


(Chris Fraser, Australia)


(Alexander Caldwell, California)


(Jeff Buchbinder, Connecticut)


(Tim Cook)


(Brian Bray, Joseph Dal Molin, Hamilton, Ontario)


and /QQMIM/qq4.html

A listing of most projects is at

A brief history of free software:

(See and

In the 1970's, the success of proprietary operating systems, particularly UNIX, created frustration in the academic community, who could not use these systems for study or teaching. This was a tremendous hindrance to their effectiveness. This situation motivated Richard Stallman to begin the GNU project in the mid 1980's. At first this project produced utilities and applications, but its fundamental goal was a free OS. Meanwhile, the proliferation of different flavors of UNIX resulted in development of the POSIX standards. In 1991, the world still was without a functional OS that was available for teaching.

The FSF had already begun an OS project, but this had bogged down due to hardware limitations and its own complexity. Into this vacuum, Linus Torvalds in 1991 posted his preliminary work on making a UNIX-compatible OS for Intel processors. This project was sufficiently simple and its goals quite clear; and the felt need was very great, so that it attracted the interest of many skilled programmers around the world.

It is important to realize that Linux was initially a success: it immediately was a useful teaching tool and its development quickly liberated academicians from the vendor lock that had paralyzed them for a quarter-century. These developments are more important to the rapid progress of software and connectivity than is its subsequent commercial success, as free access by programmers to code and basic tools makes them significantly more efficient and more effective.

A by-product of this success has been a controversy over the meaning of "free" and the development of an interesting variety of points of view on what restrictions are or are not important for the continued development of such software. The controversy ends up, of course, with some disagreement over whether there is some code that must be or should be confidential and proprietary in order to ensure the viability and reliability of businesses that develop it and are expected to maintain it.

To review the open-source material, visit

The Moral Basis of Free Software/Open Source.

Now that I have your full attention, let me explain why I use the word "moral" here. Although I have seen programmers deride the enthusiasms of f-s/o-s enthusiasts as "religious," this accusation is inaccurate, and the movement is not in any way a religion. The word "moral" is an important secular term to describe the fact that when we form groups and associations, we encourage and follow rules that define the group and that help the group produce benefit for all its members. Such rules are, in the most general sense, "moral" obligations, which are relative to the goals of the group. The social nature of mankind is such that group loyalties are strong, and the norms of any tightly-knit group may sometimes be enforced with "religious" zeal. The f-s/o-s movement was begun by people with strong convictions, and continues to attract some people with strong convictions about what it can and should be.

Hence, to understand the strength of the "free software"/"open-source" community and the vigor of the debates over proper handling of free software, we must notice that these phenomena involve groups of people, not individuals. These groups did not form on the basis of economic need, but of educational and functional need. Economic use followed.

Let's step aside briefly to understand the free-software/open-source revolution better:

Capitalism has traditionally been a bastion of individualism, and personal freedom is thought to be part of its foundation. On the other hand, academia and commercial enterprise have often been in conflict, partly because free exchange of ideas is the most important principle of an academic community, while privileged knowledge has always been a source of riches for the businessman.

In the centuries since Adam Smith's work, economists have paid ever-closer theoretical attention to the idealized construct of "perfect competition." Two of the requirements of perfect competition are freedom to participate in the economy ("to trade") and equal access to information. Businessmen who profess loyalty to "competition" and to the ideals of capitalism are assertive in advocating their rights to freely participate in trade and to compete, but don't share information, favoring imperfection as long as it favors them. Academicians have paid scant attention to trade and have devoted much attention to freedom of ideas, often with trumpeting and breast-beating.

In 1986, David Gauthier famously wrote in Morals by Agreement that a perfectly competitive market, "Were it realized, would constitute a morally-free zone, a zone within which the constraints of morality would have no place. ...this is not a fault, but the essential virtue of the market." This libertarian ideal has had significant influence on the behavior of judges and businessmen. But it is not correct. The reason it is not correct, and the fact it is not correct, are important to understanding the vitality of the open-source, free-software community.

In his forthcoming book, The Moral Conditions of Economic Efficiency, Walter Schultz analyzes the Fundamental Theorem of Economics and clarifies its presuppositions, and proves rigorously that an idealized economy cannot be efficient if the agents acting within it have neither moral constraints nor an internal incentive to act morally.

One key to understanding is that "morality," at its kernel, is neither religious nor absolute. It is relative to the values of a particular community. Schultz, a philosopher, presents a minimal definition of "morality" that is valid across Western and modern Eastern cultures:

Morality is
    - a normative social practice,
      the purpose of which pertains to collective and individual well being,
    - guided by beliefs held in common,
        - criteria by which to evaluate behavior,
        - criteria for mutual responsibility, and
        - procedures for mutual accountability.

That is, "morality" is the word we use to describe the behavioral standards or limits a group evolves to define itself, for the benefit of the group and the individuals in it. The well being of the individual does not necessarily conflict with group well-being, but when they do conflict, the balance is tipped somewhat in favor of the well being of the group as long as the individual is not harmed.[2]

We humans are social animals; we--that is, our self-concept and our public identity--are significantly defined by the groups we belong to and which will have us. Some of our deepest convictions and strongest emotions involve group dynamics. Much of what we regard as "right" or "wrong," "just" or "unjust," stems not from religious moral absolutes but from informal group or community dynamics.

So Robert Young of Red Hat is taking a consciously moral stance when he states that Red Hat releases all its code because "it is the right thing to do," and when he refers to partnerships with developers and users as "setting up an ecosystem" that creates a "virtuous circle." (See PC Week, Sept. 27, 1999, p.100)

As Linux has drawn millions into the fringes of the free-software movement, we have seen vigorous debates over the standards and "normative constraints" which are proper for this evolving community. These debates are an essential part of community formation and evolution; their outcomes define the community and the nature of the "well being" it confers on its members. The enduring conflict of values and priorities between the commercial and the academic communities, both of which can benefit from free software, has fueled the fires of debate. The existence of "community" does not imply consensus! Anyone who's lived in a small town knows this.

In fact, the objectivity and the personal restraint of most participants in this debate are more remarkable than the occasional bursts of intolerance or foolish egocentrism. (As opposed to prudent egocentrism...)

Robert Young notes the lack of consensus in the same PC Week interview: "This term 'Linux Community' and the implication to outsiders that the community is cohesive--it has never been cohesive. It is, far and away, the most argumentative, acerbic group I have ever had the misfortune to be a part of. But don't get me wrong. That has been good for the technology. It's a community that values truth and values engineering excellence over marketing and compromise."

Academic Freedom and Capitalist Opportunity.

The history of the medical community is a paradigm for what has been developing in the free software/open source community, as the same debates have occurred across recent centuries.

Two and three hundred years ago, doctors, particularly surgeons, were entrepreneurial craftsmen. Those who had discovered secrets of anatomy, surgery, or medication used this special knowledge to make themselves famous and rich. They used this knowledge to attract clients and apprentices, and an apprenticeship to a famous surgeon was not purchased cheaply. Their discoveries were published after decades, or posthumously, if at all.

In fact, publication itself is a late development. Gutenberg's invention of the printing press was not done in order to make mass publication possible. The motive and the first use was simply to reduce the production cost of illuminated manuscripts, to sell these for the (very high) going rate, and make a large profit. Mass communication became possible only with inexpensive methods of typesetting, paper production, and printing -- and with the discovery that a mass market might indeed exist, a nineteenth-century phenomenon.

Today doctors, particularly surgeons, still put enormous energy and politically-sophisticated efforts into justifying and protecting our high fees and comfortable incomes. But this is no longer done through entrepreneurial promotion of medical secrets; it is done by maintaining special expertise in areas of highly complex public knowledge and providing service of extremely high quality.

In fact, if a health practitioner claims to have special, secret knowledge, this is always assumed to be quackery -- until it is published and subjected to the rigors of scientific validation. The "doctor" who practices strictly entrepreneurial medicine using "peculiar" knowledge is viewed contemptuously by physicians, and is in fact acting immorally based on the standards (social norms) of the medical community using the cross-cultural definition of "morality" above.

This is exactly the transformation that free software, particularly GNU/Linux, is fostering. Software is becoming a community asset, and community ownership is becoming a moral standard.

Why is this happening?

Because there is unequivocal community benefit.

The justification for academic freedom is ultimately that common knowledge benefits society -- "community" in the broadest sense.

The reason that medical knowledge has become public property is that there have been successive revolutions in knowledge of anatomy, surgical technique, anesthesia, bacteriology, antibiotics, physiology, pharmacology, and now immunology and molecular genetics which have transformed medical care from shamanism to reliability. To share this knowledge benefits mankind -- "community" in the broadest sense.

And the reason that proprietary operating systems and basic tools are coming under the rubric of academic freedom -- the underlying significance of "free software" -- is that computers are becoming a ubiquitous and essential tool of society.

Our definition of "moral" limns (highlights) the observation that social benefit, in practice, outweighs individual benefit. That is, if a group is to exist at all, benefit to the group must outweigh benefit to an individual when they are in conflict. To put it another way, the group exists to benefit its members: this is an individual benefit. But when taking an action that benefits an individual will "harm" the group in some way, then the individual is "morally" constrained in some way to avoid the harm; ideally without also harming the individual.

The "harm" that proprietary, secret code brings to a community of users (end-users and the programmers that serve them) is, for example, delayed development and failure to resolve bugs, frustration in achieving goals of known feasibility, inefficiency, and financial exploitation. The "benefit" of open code is, for example, to accelerate development, to enhance efficient use of code, to permit free exchange and debate of ideas (which leads to improved algorithms and techniques), to expedite agreement on communication and exchange protocols, and to hinder financial exploitation and gouging by introducing competition for service.

How does this apply to the development of medical software?

It means that those features of software which will be of general use to the entire medical community in promoting communication, appropriate data exchange, and those features which tend to improve health care in society should be subject to the principles of academic freedom: the code should be open.

It means that software designed to perform tasks that tend to be unique to organizations or matters of individual preference, or knowledge that is special to a particular enterprise need not be open; in fact, opening such code does jeopardize the security of the vendor.[3]

Nevertheless, users of software tools are learning that open code helps protect them from vendor lock and exploitation, and sophisticated users are beginning to require, as part of vendor agreements, that the code as well as the executables be released to the user, typically with a non-disclosure agreement executed by the user on behalf of the vendor.

Rights of the Programmer

Dr. Schultz, after proving rigorously that economic inefficiency is the outcome of a morality-free community, then asks what normative conditions will provide efficiency in an idealized competitive economy. He rigorously proves that at least four exist in respect to economic exchange situations, which happen to be moral rights in the cross-cultural sense already given. I list them without his proof:
· A right to truth. This is a right to truth regarding goods and services and acceptable prices; it entails an obligation not to lie.
· A right to property. This permits a set of defined property rights; it entails an obligation against theft.
· A right to autonomy. This is a liberty right, to act freely within group constraints; it prohibits exploitation and slavery.
· A right to welfare. The Fundamental Theorem presumes an "initial consumption bundle;" this right obligates the community to restore a minimally adequate consumption bundle to the person whom disaster strikes; everyone else contributes to its restoration; it entails an obligation to give. (This is what commercial insurance and government disaster relief provide.)

It is useful, in seeking to understand the nature of the free-software/open-source movement, to extend these rights to production situations, specifically the economics of software production and service. In this sense, what do these rights entail? I attempt here to connect them with the known mores of the community, to the extent that there appears to be any consensus.

· A right to truth.

The hacker has a right to verifiable code.

There is an obligation not to distribute (deliberately) obfuscated code, rogue patches or binaries, Trojan horses, and not to give false instruction.

· A right to property.

There can be a set of ownership rights by which a hacker may own and distribute code, and there is (separately) an absolute right to have possession of code.

This right also implies a right to hack; there is an obligation not to hinder hacking, an obligation not to plagiarize code; and an obligation not to destroy code or its repositories (an obligation not to disseminate destructive viruses).

This property right gives a hacker the right to earn a living from the community's code and his/her own modifications.[4]

· A right to autonomy.

There is a right to liberty within the community; to hack in whatever way the individual wishes; the right not to be exploited, interfered with, or enslaved. It entails an obligation not to intrude on the autonomy of others (e.g., with false announcements).

· A right to welfare.

There is a right to receive a "grubstake" from the community, either as a newbie or after a destructive disaster. This entails a right to learn to hack new code (apprenticeship) and an obligation to teach others in the community. (Property welfare is in our society covered by commercial insurance.)

Programmers in the free-software/open-source community are not, of course, in any sense consciously working under these principles; what I have done is to attempt to take a well-defined set of theoretical rights that apply to an idealized exchange economy and ask informally whether there is commonality with what I've observed in the f-s/o-s community. These parallels are interesting.

Motivations of Open-Source Participants

What causes programmers to participate in the f-s/o-s community? Several people have commented on this with interest, and after reading some of this commentary and after analyzing my own observations, the only possible conclusion is that people participate in this activity for the full range of reasons that they participate in any other activity, and it is not realistic to attribute any particular motive to the entire community, even though one part of the community might be easily characterized for one reason or another.

Regardless, some factors are important in understanding the free software / open source communities.

Non-economic incentives.

In the beginning of free software, there was "officially" only Richard Stallman, and he has made his motives clear in his own writings. Some of those who latched on perhaps shared his motives, but it's also clear that not all did. Nevertheless, for several years, free and open software was not commercially useful, so while economic dreams might exist, participants had to satisfy their economic needs elsewhere. Thus free software was the hobby of the "rich," and those who devoted time to it were doing so for reasons other than simple avarice.

By "rich," I do not mean that the participants have been or are financially wealthy; only that their survival needs were somehow satisfied in a way that left considerable time for experimentation with free software. Some may have been wealthy; most simply had "day jobs" or were students with the usual sources of student support.

My own impression, from the sidelines, was that to some extent the free software community was a "sandbox culture." That is, like play in a child's sandbox, some of the work was done for the sheer pleasure of being able to build something by yourself, which one did not have the opportunity to do otherwise, for any variety of natural reasons. Linus Torvalds was quoted in an interview in the November, 1999, Linux Journal, as saying, "Linux didn't start out as a message to the masses. Unlike Richard Stallman, I really don't have a message. He has one and can go on about it forever. I'm just an engineer. Let's see. Do things well! Do them with heart!..." The desire to build seems to be built into the human psyche, and to build well is a natural goal.

It is also clear that some participation is morally based, as Stallman's seems to be; often this is a response or adverse reaction to negative commercial values such as greed or debilitating secrecy. In any case, as a community develops, it intrinsically develops a morality that defines its borders and its purposes. This results in strong and even militant advocacy of these characteristics. To the extent that non-economic incentives are seen as an essential characteristic of the community, there will be emotional and persuasive argument against permitting economic incentives to guide the community. We have seen such debates.

Economic Incentives.

The quality and usefulness of the mature GNU/Linux system has been great enough to make these tools useful in many ways to people who must make their living with software. This has led to non-commercial successes such as Apache and Debian and to commercial successes such as Red Hat and Caldera. As a result, we now have a larger community, which presently includes all the usual economic motives for participating. This has resulted in the "free" versus "open" software debate, and the recognition by most people that it is economically appropriate for some software not to be free. (Overall, the commercial-software community does not understand the strength of freedom for power and quality, does not understand its benefits to the end user, and does not understand when software is most suitably "open" and when it is suitable to keep it "closed.") The agreement that some software is best "free" and other software may be suitably proprietary has, of course, resulted in vigorous and sometimes intense debate about where to draw the grey zone.

"Open" vs. "closed" medical software.

The medical community brings a special complication to this debate, because the information that is kept by this community is confidential. The requirement for confidentiality will inevitably confuse the debate about what aspects of medical software may be open-source. Some of this confusion will likely be deliberate. While the code may be free, the information it contains and manipulates must never be free. It is perhaps less obvious that commercial exploitation of this confidential information, whether or not behind the veil of closed code, is unethical.

Briefly, our need in the medical community is for open-source connectivity tools, common databases, and open-source security and authentication tools. That is, we medical professionals need to be able, on behalf of good care for our patients, to transfer clinical information electronically to other professionals involved in a patient's medical care, either concomitant with a referral or by the direction of the patient. At the same time, we must protect this data from mining by insurance companies, government agencies, and interested individuals who have no right to the information, and who might use it in ways adverse to the patient's benefit.

It is clear that user interfaces, productivity tools, and display techniques can be as proprietary as desired. Business offices, medical ancillary staff, and physicians do need to have top-level tools that help them work efficiently, and this is best done by being able to customize these tools to local and individual needs.

Open-source Medical Software Projects

With this philosophical rubric in mind, let's review current open-source projects and consider how they meet the need for free code in medical information management.

Classification of software:
DHCP / VistA 

The classifications are whimsical, and should not need elaborate definition; there is no "hidden meaning."

Perceptions

At the beginning of this decade, the Roger Maris Cancer Center was formed in Fargo, ND. The staff was faced with the challenge of managing four legacy systems which could not communicate with each other. Important goals were to develop an information support system (for the clinicians brought together by the center) and to increase business efficiency. The usual barriers were encountered while trying to link the four legacy systems through cooperative efforts with the vendors, and during this process Linux emerged into the world.

Dr. Wettstein has offered some reflections on my comments, which I quote in this typeface (Helvetica).

Dr. Greg Wettstein developed this information system, which was deployed on Linux 0.96c and upgraded continuously as the kernel matured. It achieved all the functional goals of the group and survived from 1993 until 1997, when it was replaced by a less efficient commercial system. The design and functionalities of Perceptions are important to this project.

I call Perceptions "creviceware" because an important prerequisite for clinical usefulness is to fill the (very large) interstices left by commercial software, which is characteristically single-task software, and which is never designed primarily with the information needs or efficiency needs of clinicians in mind.

Using shell scripts, Dr. Greg created a set of "interrogatory robots" to mine the legacy databases. When a patient came to the center and registered, these interrogatory robots were dispatched from the workstation to collect data for a packaging utility. This data packaging utility followed the patient through the center, and additional data was added along the way. Update utilities were used to maintain parallel database concurrency. A modular mid-level tool set, written in Perl 5, was used to manage this process. The user interface was written with Tcl/Tk.
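
The data-gathering flow just described can be sketched roughly as follows. This is only an illustration in Python, not the shell and Perl of the original system; the legacy-system names, field names, sample values, and function names are all hypothetical stand-ins and are not taken from Perceptions.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for two legacy systems. Each "robot" takes a
# patient ID and returns whatever that system knows about the patient.
def query_registration(patient_id: str) -> Dict[str, str]:
    return {"name": "DOE, JANE", "registered": "yes"}  # made-up sample data

def query_laboratory(patient_id: str) -> Dict[str, str]:
    return {"last_wbc": "4.2"}  # made-up sample data

ROBOTS: Dict[str, Callable[[str], Dict[str, str]]] = {
    "registration": query_registration,
    "laboratory": query_laboratory,
}

def build_packet(patient_id: str) -> Dict[str, Dict[str, str]]:
    """Dispatch every robot and gather the answers into one packet."""
    packet: Dict[str, Dict[str, str]] = {"patient_id": {"id": patient_id}}
    for source, robot in ROBOTS.items():
        try:
            packet[source] = robot(patient_id)
        except Exception as exc:
            # One unreachable legacy system should not block the others.
            packet[source] = {"error": str(exc)}
    return packet
```

Calling `build_packet("12345")` returns a single dictionary keyed by source system, which later stations could extend as the patient moves through the center.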

Perceptions was basically built as a series of software packages that sat on top of the information distribution system. This design strategy basically flowed from the fact that Perceptions started out as simple patient tracking software.

This paradigm actually proved to be quite powerful. One of the most interesting features of the system from a pharmacy perspective was that the pharmacy component of the tracking system actually 'looked' for patients that were scheduled for treatment. This work was actually motivated by my study of Just In Time Inventory (JITI) control methods that were being deployed in the late 1980's and early 1990's in American industry.

A big component of ambulatory treatment of cancer patients involves administration of multi-day treatment regimens, e.g., VP16/CDDP, CF/5-FU. The pharmacists would designate that treatment orders were multi-day in nature and Perceptions would immediately schedule subsequent treatment dates. On subsequent days the pharmacy software would watch the tracking logs to see when a patient registered at the front desk. When they did, the orders would automatically be executed in the pharmacy and labels created to initiate creation of the chemotherapy product.

This design allowed logical enhancements to be made to the system. One of the problems with JITI was that as dose-intensity increased, situations began to arise when the patient's clinical state warranted discontinuation or modification of therapy. The pharmacy tracking system component was modified to implement a state-engine which required that multiple criteria be recognized from the tracking log analysis before an order could be generated. For example, the patient had to check in to the front desk and be placed in a treatment room before the order would be executed. Extensive work was being initiated on this component of the software when the roof fell in.
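
A state engine of the kind described, one that waits for several tracking-log criteria before releasing an order, might look something like the sketch below. It is written in Python for illustration only; the event names `checked_in` and `roomed` and the class `PendingOrder` are assumptions, since the actual tracking-log vocabulary is not given here.

```python
from dataclasses import dataclass, field
from typing import Set

# Hypothetical event names; both must be seen before the order fires.
REQUIRED_EVENTS = frozenset({"checked_in", "roomed"})

@dataclass
class PendingOrder:
    patient_id: str
    seen: Set[str] = field(default_factory=set)
    executed: bool = False

    def observe(self, patient_id: str, event: str) -> bool:
        """Record one tracking-log event; return True when the order fires."""
        if self.executed or patient_id != self.patient_id:
            return False  # already done, or an event for someone else
        if event in REQUIRED_EVENTS:
            self.seen.add(event)
        if REQUIRED_EVENTS <= self.seen:
            self.executed = True
            return True
        return False
```

Feeding the log to `observe` one entry at a time, the order executes exactly once, and only after both the front-desk check-in and the room placement have been seen for that patient.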

The Linux workstations that composed the system basically replaced terminals at many locations that were used to contact legacy systems. Typically these terminals had serial connections to the legacy systems which Linux talked to. When a patient arrived at the front desk an initial tracking message was broadcast to all the workstations. These workstations then contributed additional information that they were able to find on these patients and broadcast this information as well.

The software was designed on the following hierarchy:

Shell script wrapper -> Perl functionality -> Tcl/Tk interface

All the utilities and programs were encapsulated with a shell script wrapper which did things like parse options etc. Major functionality was implemented with Perl programs which could stand by themselves if necessary. In the case where a GUI was needed the Perl scripts were designed to open a wish shell and would talk back and forth over a bidirectional pipe to implement the user interface.
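The last step, a program driving its user interface over a bidirectional pipe, is worth a small illustration. The sketch below uses Python rather than Perl, and a trivial child process stands in for the wish shell so that the example runs without Tcl/Tk installed. The command text and the helper names open_ui and send are invented for this sketch.

```python
import subprocess
import sys

# Stand-in for the GUI process. Perceptions started a Tcl/Tk "wish" shell here;
# this child just acknowledges each line it receives.
CHILD_SOURCE = r'''
import sys
for line in sys.stdin:
    print("ack: " + line.strip(), flush=True)
'''

def open_ui():
    """Start the child with pipes attached to its stdin and stdout."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered, so each command is sent promptly
    )

def send(ui, command):
    """Write one command down the pipe and read one line of reply."""
    ui.stdin.write(command + "\n")
    ui.stdin.flush()
    return ui.stdout.readline().strip()

ui = open_ui()
reply = send(ui, "show_patient 12345")
ui.stdin.close()  # end of input tells the child to exit
ui.wait()
```

Because the functional program and the interface only exchange lines of text, either side can be replaced or run by itself, which is the modularity the hierarchy above is after.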

This system provided a means to perform the fundamental clinical information task of the center -- collection, organization, and presentation of clinical data -- and also increased staff efficiency sufficiently that a 75% increase in patient load required only a 20% increase in staff.

Subsequent development of mid-level languages and standards would make this task easier today, but functionally the needed design is the same: both horizontal and vertical modularity of software, particularly to separate the process of data acquisition from the task of presentation.

Two crucially important features of this system, not present in any commercial software I've seen, are that it was specifically designed:

· to aid clinical decision-making by collecting, organizing, and displaying information for physicians, and

· to make the work of business staff and ancillary medical staff more efficient.

Another major feature of the system is that it was actually implementing functional data interchange long before XML was in vogue.

The data abstraction was carried out interestingly enough by using TeX. When a pharmacist entered an order into the pharmacy component of the system the function of the data entry was to basically encapsulate all the patient information into a TeX script. Running the TeX script through a document header file specific to the needs of the pharmacy resulted in generation of IV labels etc. The same TeX script run through a document header specific to the nursing unit caused it to generate a charting label which met the requirements laid out by the Oncology Nursing Society (ONS) for chart notes.
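The idea of entering data once and formatting it per discipline can be shown without TeX. The sketch below is an analogy in Python using the standard string.Template class: one order record, two "headers." The record fields, the sample values, and both layouts are invented for illustration and do not reproduce the actual Perceptions labels or the ONS charting format.

```python
from string import Template

# One order record, entered once. All values are invented.
order = {
    "patient": "DOE, JANE",
    "drug": "cisplatin",
    "dose": "100 mg",
    "date": "1995-03-06",
}

# Two "headers": each discipline formats the same record for its own purpose.
PHARMACY_LABEL = Template("IV LABEL\n$patient\n$drug $dose\nPrepare: $date")
NURSING_NOTE = Template("$date: $patient received $drug $dose IV.")

def render(template, record):
    """Fill a discipline-specific layout from the shared record."""
    return template.substitute(record)

label = render(PHARMACY_LABEL, order)
note = render(NURSING_NOTE, order)
```

Adding a third consumer of the data means writing a third layout, not a third data-entry screen.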

This worked to support one of the most important design criteria of the system. This criterion was that information obtained and/or entered by one discipline or group within the center should work to increase or aid the functioning of other groups. More simplistically we were trying to address the age-old issue of having to double enter data.

Perceptions is no longer in use due to the merger of the Cancer Center with a large hospital and the administrative insistence on a commercial "solution." Perceptions is not maintained, nor is the code available to the community. Dr. Wettstein and his colleague, oncologist Paul Etzell, MD, presented an excellent summary of their work at the first MIT conference on Free Software in 1996. Dr. Wettstein has promised to convert this paper to .html and .ps or .pdf and place it on his web site "shortly" at
and he is considering making the entire code available to the free software community.

Perceptions deserves a place of honor in f-s/o-s annals not only as the first open-source medical information manager but also because it was thoughtfully conceived, ergonomically designed, and well engineered.

We hope that future medical information management software will be not only creviceware, but the central software tool of the enterprise. Still, the special strength of the GNU/Linux system is in communications and data acquisition, so that this is the best choice for linking disparate systems no matter what other tasks are assigned to it.

I would have to say quite unequivocally that the most important component of the success of Perceptions was that it was based on an open source philosophy and toolset. I was never really afraid of failure since we were in control of all aspects of the project. If something didn't work we simply invented something different that did work. That sense of flexibility and solution mobility simply does not exist when working with commercial solutions.


This is the medical software project of the (USA) Veterans Administration system. We'll call this Whaleware because it is the largest public-domain medical information mammal on earth. Originally called the DHCP (Decentralized Hospital Computer Program), it was begun in 1982. It is now called VistA, Veterans Health Information Systems & Technology Architecture. Information about this sophisticated and complete medical information system is available at the VA's web site at
and at a programmers' web site
where it is maintained by volunteers who are current and former employees of the Department of Veterans Affairs (DVA). The VistA system is available on CD-ROM through a Freedom of Information Request, which can be initiated at the hardhats web site. Some software components have been published to the hardhats web site. A full descriptive monograph is published at

VistA is clearly well developed and in use. It comprises more than 80 integrated DHCP applications which include both administrative and clinical functions, including medical imaging, laboratory management, and pharmacy management.

The M computer language is the foundation of DHCP / VistA. This non-proprietary 4GL began life at Massachusetts General Hospital as MUMPS, and has become an ANSI-standard programming language, database management system with related bindings and protocols (for a non-technical explanation of M, see

I do not know whether this work, designed around the needs of the VA system, could be "ported" to the private-sector medical community with an acceptable expenditure of time and effort. There are two barriers to use: the M language, which "makes sendmail scripts seem organized and Perl seem well structured;" and that it was designed around the needs of the VA system, which is like no other medical organization on earth.


The Littlefish project is an ambitious enterprise led by an Australian, Chris Fraser, to bring the power and efficiency of database tools, particularly epidemiologic analysis, to third-world and remote practices. The project will follow the GEHR or Good Electronic Health Record standards (see The GEHR standards are at in .pdf format).

I lightheartedly call such software "wholeware" because their goal is to be a complete solution to the perceived needs. This project is in design. I have not investigated it in detail yet.


This is a personal project of Dr. Alexander Caldwell, a family physician in California. He uses Tcl/Tk to produce an information-gathering and documentation system. Features include menu-driven progress note generation, prescription management, preventive health management, and importation of lab values from his lab information system.

This software is available for Linux, Windows 95/98/NT, and Macintosh OS's. It is oriented toward specific tasks important to the physician, so I'll call this "taskware."

Dr. Caldwell is constructing a system that serves the primary care physician's ergonomic needs, as can be seen by his list of working modules. Inspection shows that some of these tasks are essential to efficiency, others are decorative enhancements made possible by powerful software tools. I quote from his own description.

Insert drawings into progress notes - mods to Impress, a Tcl/Tk program. Store templates for various anatomical or other drawings, draw on them, then save directly into a progress note.

Lab Results - automated download from an IBM AS/400 directly into Tk_familypractice with some user intervention on Linux only. Requires a TCP/IP connection to the AS/400. Script edited by hand to suit each user's log in and host configuration.

Clinical Decision MAPs - integrated with progress note writing module. Tcl/Tk widgets used in clinical decision making algorithms create chart notes as you interact with the program.

Prescription Module - Fax based, stores and sends drug refill information to drugstores. The stored data can be used to compile a medication list for inclusion in clinical notes. Prints hardcopy Rx and patient education monograph for the patient.

Demographic Data Module - addresses, phone numbers etc.

Problem List Module - stores problem list, allergies, past medical history. Data can be inserted into clinical notes.

Progress Note Generator - History and Physical Synthesizer - GUI based program that presents menus for numerous common office problems or presenting complaints. Your commonly used phrases can be inserted at the click of a mouse. Phrases are easily added or removed from the menus. Automatic saving to patient's file and a daily file that can be printed out at the end of the day for hard copies. Integrated with the other modules so data from problem list, allergies, medications can be inserted into the notes with the push of one button.

Progress Note Display Module - data stored as HTML so you can insert pictures, tables, etc. and enables data to be accessed via a web server. Can work with IBM's Via Voice for speech recognition under Windows 95/98, or run the Linux version on one machine and use X-win 32 (an Xwindow server that runs on a Windows machine) for the display to dictate into the Tk text widgets using Via Voice with the data stored on the Linux machine.

Allergy Checking - checks prescription refills against patient's allergies

Drug Interactions - checks prescription refills against the patient's drug profile for drug interactions.

Drug Doses - checks prescription refills for appropriate doses, with Pop-up menus.

Drug Information - patient package insert information can be viewed or printed out for the patient.

HMO authorization request generation - generates an HMO authorization request form that can be faxed or sent via e-mail to an HMO office. Includes copies of progress notes you specify.

Recall Letter Generator - when writing a progress note, you can set a future date for a recall letter.

Referral Letter Generator - fax or e-mail a referral letter to the consultant, including a copy of your patient's progress note if desired. You just highlight the part of the note you want to send and pick from a menu of consultants you use.

Statistics - view or print various stats on no. of prescriptions, list of patients on a certain drug, most commonly prescribed drugs etc.

Graphic data plotting - scan your data and plot weights, blood pressures and lab data, etc., over time.

The modules are linked so that if you are working on a patient's drug refills, all his or her data in the other modules are pulled up at the same time so you have access to all information on that patient if you need it.


The Arachne project was a "Toolset for the Development of Clinical Workstation Applications from Distributed Components." It currently is a more general tool designed to provide an extensive, CORBA-2 compliant, object-oriented tool set for integration of disparate systems. The first iteration was done in order to permit the development of clinically useful tools. The Arachne group, whose principals were IT specialists in Massachusetts, was part of an "Internet Collaboratory" project, InterMed, that included medical informatics specialists at Columbia, Stanford, Massachusetts General Hospital, and others. This part of the project has been abandoned. The open source license does not include any healthcare-specific parts of the code. So at this point the project has been pared down to purely an open source CORBA implementation.

The motivation for its development was the frustration inherent in the current state of commercial medical software, based primarily on large, single-vendor systems not designed for integration or customization. This is characterized by a lack of common software infrastructure, services, or paradigms. The Arachne project has as its goal the ability to construct richly interoperating software components via a suite of cross-platform software tools, collectively referred to as Arachne. A description of this ambitious project is at

Work on Arachne began in 1992. Its first release was in August 1997; its current version is dated Dec 16, 1998. The developers of course discovered that cost and limited function prohibited the use of commercial products as a basis for portable component development and integration. The development of Arachne required laborious tracking of relevant and conflicting industry standards. Arachne consists of several largely independent subsystems, which are capable of platform independence and permit the construction and integration of arbitrary software components. It is currently available for Windows 95/NT, Linux, HP/UX, SunOS, and Macintosh.

Arachne fits into the category of "creviceware" because its most important purpose was to permit connectivity and data exchange for the purpose of presenting integrated medical data to clinicians to support decision making.


This is a project of a Connecticut IT professional, Jeff B, and a podiatrist. It is designed to be a functional clone of a commercial medical management system, The Medical Manager.


It uses MySQL, a proprietary SQL server. It is functional. I have not been able to access its site for a couple of weeks, so I don't know the current public status of the project.

Its chief limitation as an open-source/free software "solution" is that it was designed to duplicate the functionality of a particular commercial product, and thus has the inherent limitations that this design approach entails.


FreePM is a new project, aiming to produce a completely open-source practice management system. Design was begun in June, 1999.

The design of this project appears to be carefully done, and the database is currently in design.

PostgreSQL is being used for the database; ZOPE for middleware. They plan to use CORBA, the distributed-object standard of the Object Management Group (OMG).

FreePM's efforts are being coordinated with Circare.


Circare is a new project to build an open source patient centered network index system. It is a commercial project of Minoru Development, Inc., a Toronto firm. Primary funding has been by the non-profit Hamilton, Ontario, Information Network, HappIN. The project's goal is to provide key infrastructure for Regional Health Networks, by developing new modules which include
· a clinical management system,
· a Web-based physician-pharmacist consultation system (this involves an effort by pharmacists to electronically notify physicians of the full list of medications, including herbal remedies, taken by an individual, and of potential interactions); and
· a geriatric patient index: this includes a minimum clinical data set.

Circare is a client and provider index that ties together the information about a single patient and makes it available securely to care providers in a distributed network. Thus it aims to be a solution to the "portability" problem that hinders the exchange of clinical information necessary to care for patients as they are referred from one provider to another in an extended health care system, or as they necessarily change primary providers for all the usual reasons.

Overall, this appears to be a sophisticated and well-managed project by experienced IT professionals.

Circare is open source software, distributed under the GNU General Public License:

Minoru Development maintains a web page that collects all the open-source efforts related to health care

Minoru sponsored an Open Source Practice Management Summit, held in Toronto, Canada, on September 23, 1999. I was not able to attend this conference, and have not seen a summary of it yet. At this conference, Minoru staff offered to coordinate open-source coding projects by sponsoring a discussion group and offering space.


QQ-MIM is my project. It began in 1985, when I changed practices. By then I had been able to learn about databases, medical software, and microcomputers, so the clinician's need to gather information came into focus while I adjusted to a new charting system, at the same time that I became fully aware of the potential of computers as a tool for information storage and retrieval. I wrote a long specification that I whimsically called "QuickQuack" that described the ergonomics a physician needed in such a system, but was unable to persuade any of the leaders I knew in software firms to invest their capital in solving all the world's problems.

It also turned out that the leadership of my clinic and most of the other physicians were extremely comfortable with the world as it was, and had neither sufficient discomfort with the limitations of the paper chart to motivate interest, nor sufficient knowledge of computing to understand what I was trying to say. In retrospect, they have not been interested in learning, either; but I naively believed that if they saw a system that put a few concepts into practice, that the light would dawn. It has not.

Meanwhile, my First Offspring, Michael K. Johnson, who had been immersed in the free software world since Linus Torvalds's first post, has worked diligently to educate me on its potential.

After I was done paying college tuition, I decided to fund a demonstration project with some simple programming. Stage One was the development of a simple progress-notes reader: In this project we took a collection of ASCII progress notes, created a simple PostgreSQL database, and used Perl to create a reader that displayed each patient's notes in a standard browser, in reverse chronological order.
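For readers who want to picture Stage One, here is a minimal sketch of such a reader. It is not the QQ-MIM code: it is written in Python rather than Perl, it uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs without a server, and the table layout, sample notes, and the function name notes_page are all invented for illustration.

```python
import sqlite3
from html import escape

# Stand-in database with a hypothetical one-table layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (patient_id INTEGER, note_date TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO notes VALUES (?, ?, ?)",
    [
        (1, "1999-01-15", "Follow-up for hypertension."),
        (1, "1999-06-02", "BP improved on current dose."),
        (2, "1999-03-20", "Annual physical."),
    ],
)

def notes_page(conn, patient_id):
    """Build an HTML page of one patient's notes, newest first."""
    rows = conn.execute(
        "SELECT note_date, body FROM notes "
        "WHERE patient_id = ? ORDER BY note_date DESC",
        (patient_id,),
    ).fetchall()
    items = "\n".join(
        f"<h3>{escape(note_date)}</h3>\n<p>{escape(body)}</p>"
        for note_date, body in rows
    )
    return f"<html><body>\n{items}\n</body></html>"

page = notes_page(conn, 1)
```

Storing dates in year-month-day form lets a plain text sort give the reverse chronological order; the notes are escaped so that stray angle brackets in a note cannot break the page.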

Stage Two was the development of a prescription-tracking system. This was done in two steps: First, the drugs in the FDA Orange Book (available in several segments on the web) were reduced to a PostgreSQL database.

Later, ZOPE was used to create a medication-tracking system and prescription writer that is still in development.

One criterion of this project has been to use only free software, and to donate the finished work to the free software community when it is sufficiently mature to do so.

During this project, I came out of my hole into the sunlight and looked around, blinked, and discovered that several other projects were under way, most begun in 1999, to perform many of the same functions. This has redirected my interests toward the overall effort to create a free-software medical information manager; the most important question to answer, in aiming to contribute, is to ask how free software can be most useful to the medical community.

The functional priorities seem clear to many people: connectivity, data exchange, and usability (ergonomics) are worth attention, energy, and time. A crucial secondary priority, because we deal with confidential information, is authentication and security. (It is secondary not in importance but is functionally subordinate to the task of achieving connectivity and exchange while creating tolerable ergonomics.)

I assume that you understand the need for connectivity and easier data exchange. It is not clear whether the need for efficient ergonomics is well understood by anyone. Administrators seem to assume that the inherent efficiencies of computers are obtained automatically; programmers insulate themselves from users to increase their work output and do not do "time and motion" studies to see if users are actually made efficient. Users are hindered both administratively and technologically from making their tasks more efficient. Physician-users sometimes have the power and sophistication necessary, but tend to be afflicted with egocentrism, which tends to produce idiosyncratic solutions rather than general solutions.

Important tasks for medical software projects:

I believe that the most effective evolution of free and open source medical information management software will be roughly like this: First, the greatest strength of the GNU/Linux system is connectivity. This is why the first clinically effective use of free software, the Perceptions project, was "creviceware," a collection of tools and programs that made connectivity and efficiency possible among independent commercial products that were never designed to work together.

Second, free and open software tends to begin as hobby projects. These begin with individuals designing tools that meet their own needs. Superficial examination of such projects may leave the impression that they are purely idiosyncratic; the truth is that beneath the idiosyncrasy there are usually useful, generalizable paradigms. More importantly, they serve as demonstration projects that teach users, observers, and programmers how to make the next iteration of software design more efficient and functional. In this stage specific tools are designed -- "taskware."

If specific tools are designed within a culture that understands their design principles ("has a clue") and recognizes the importance of connectivity, then the project can begin to actually reshape the medical community, producing "wholeware" -- software systems that actually begin to meet the needs of an enterprise.

At first, these enterprise systems will need to "embrace" proprietary legacy software; but as the power and flexibility of using open code becomes generally understood, strictly proprietary solutions will atrophy.

How can we best proceed to build shareable open code? I believe that the following conceptual foundations are necessary:
· Protocols and data structures for area-wide exchange of individuals' basic health information: a "medical demographic." This is the significance of the Circare project in Hamilton, Ontario.
· Agreement on database design. I believe that it is important for the medical f-s/o-s community to maintain a common database design, independent of code. This is the significance of the FreePM project of Tim Cook.
· Free availability of coding systems for clinical information. The fact that the CPT system is proprietary to the AMA is a great hindrance, and the AMA should be pressured to make this coding system freely available to the community, under appropriate commercial restrictions that permit the AMA to recover the high costs of maintaining this complex system.

Let's look at these principles in more detail.

Foundational needs of open-source medical software

Contemplating the various f-s/o-s efforts that are being attempted, and the work that has been done to create data standards for medical software, such as the Health Level 7 project ( ), it is difficult to envision how to effectively corral the enormous efforts that are going on and harness them to a single wagon.

Each "player" is performing in a different arena; has unique needs; has individual priorities. There is such a large variety of tools and mid-level languages that it is difficult to envision how it may be possible to provide a substrate suitable for every need.

I conclude that it is impossible to create a software project that can encompass the needs of everyone in the medical community, and therefore we should not attempt to do so. Instead, we must focus first on identifying the commonalities: what everyone needs (whether they realize it or not).

Let's approach this logically. The reason this community exists is to provide health care for individuals. Hence the first consideration is, what are the fundamental "data needs" of the individual?

Demographics must include health information

The first mistake made by commercial software developers is to assume that the standard "demographic information" is an adequate description of a patient. It is not. For billing we try to collect enough information about each person to uniquely identify them. Name and date of birth are the starting points; in a large population this is not sufficient. Social security number is used in the USA, but organizations do not have a legal right to require this, and many people either refuse or provide a false one. So we collect a large amount of ancillary, usually temporary, information that serves to locate rather than identify persons.

But in all this effort, we have not included in our demographics the medical identity of the patient: their conditions. Every physician knows that this is the medical identity of a person; this is why we refer to "the gallbladder in room 320A." Such talk can be demeaning, but in a professional context it is a whimsical way of focusing discussion on a disease process rather than on a personality.

So a minimum medical demographic must include some version of what is called a person's "problem list." The fact that this is not "public knowledge" and must be protected from inappropriate access and use has been an absolute hindrance to adding it to the demographics. But medically to do so is an absolute necessity. There is a corresponding obligation to include in the specification of the complete medical demographic a confidentiality rule and procedure.

This rule and procedure is at its heart simple: the public and non-public data in the medical demographic record must be differentiated within the record; release of non-public data is permissible from one provider to another in order to provide health care to the individual and other release is permissible only with the express, documented, permission of the individual. Any software project must therefore include security procedures that permit protection of this confidentiality.
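The rule is simple enough to state as code. The sketch below, in Python, is one possible reading of it and nothing more: the class name, the field names, and the way consent is recorded (a set of requester names) are my assumptions for illustration. Real authentication of who is a treating provider is outside the sketch.

```python
from dataclasses import dataclass, field

@dataclass
class DemographicRecord:
    public: dict                       # data the patient treats as public
    non_public: dict                   # e.g. the problem list
    consents: set = field(default_factory=set)  # requesters with documented permission

def release(record, requester, is_treating_provider=False):
    """Return only the fields that may be released to this requester."""
    visible = dict(record.public)
    # Non-public data goes to a provider caring for the patient,
    # or to anyone the patient has expressly authorized.
    if is_treating_provider or requester in record.consents:
        visible.update(record.non_public)
    return visible

# Hypothetical record and requests.
rec = DemographicRecord(
    public={"name": "Jane Doe"},
    non_public={"problem_list": ["hypertension"]},
    consents={"insurer"},
)
to_stranger = release(rec, "journalist")
to_provider = release(rec, "dr_smith", is_treating_provider=True)
```

Which fields belong in the public part is itself the patient's decision; a patient may wish even the name to be non-public.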

But without this "problem list" the patient remains "unidentified" medically; no software will be clinically useful that excludes this data.

As an aside, it is worth mentioning that all the information in the demographic record, including the patient's name, may be considered confidential and non-public by the patient. Famous or notorious people may not want anyone to know they are part of your organization's client list; telephone numbers may be unlisted, addresses private. The medical information is especially sensitive to breaches of confidentiality, but it is proper and prudent to give equal importance to preserving the confidentiality and privacy of every datum held on a person.

This has smaller implications for area-wide data repositories than might be thought. First, the patient must be informed that this area-wide repository exists, its purposes, the conditions under which consent to share data is implied, the conditions under which explicit release is necessary, and the recourse the patient has for violations. If all the organization's data is held in an off-site repository, the organization has the responsibility to ensure its security and confidentiality. There is no reason for an individual to object to an area-wide data-sharing arrangement if the privacy protection is as good as it should be.

In fact, practices are already beginning to use data services located as far as across the continent from the practice location, and there will of course be the suspicion that vendors will mine this data for commercial purposes. I have no doubt that this will occur unless contractual and procedural protection is added to the legal protections that already exist.

Dynamic inaccuracy.

The most important feature of the medical demographic is its dynamic inaccuracy. What do I mean by this?

Persons are real, and exist until death (after which they are at least no longer dynamic). Any demographic information is a representation of a person, to permit indexing of records and to aid locating persons in order to communicate with them. Historically, the only interest in communicating with the patient was to send a bill. Recent changes in practice paradigms have led to communications about health maintenance, which a cynic would see as a means to ensure the sending of more bills.

Anyone who has ever interacted with demographic records knows that they are inherently inaccurate for several reasons:
· The person is constant, but characteristics, even names, change.
· Typographical errors and incomplete or partially duplicate entries are made by operators.
· Locations change.
· Medical conditions change.

This means that any demographic record is only temporarily accurate, and that the degree of inaccuracy will increase with the passage of time. I call this dynamic inaccuracy, and it is, in my judgment, the most important characteristic of a demographic record.

Thus the most important task for the keeper of the record is maintenance. Who is willing to take responsibility for "keeping" the record? There are four people who have clear and primary interest in its accuracy:
· the person about whom the record is a summary
· the medical professional using the record to make healthcare decisions
· the fiscal intermediary who is responsible for making proper compensation for correct claims
· the programmer who creates the repository and the tools to access it.

Note that this list does not include the well-meaning receptionist who records the information in the data repository, the manager of the clinic or hospital providing care, or anyone's attorney. These folks have a secondary interest in its accuracy.

Successful continual validation of the dynamically erroneous record requires that there be an audit trail of changes. It should include (or point to) the prior field contents and the new contents, and include when the change was made, who made the change (i.e., who was the source or informant and which entry tech recorded the change), why the change was made, if a reason is relevant (such as "marriage" "adoption" or "divorce" for a name change; "moved" "postal office directive" or "temporary address" for an address change). This audit trail need not be kept forever; once changes are validated independently and confirmed, the audit trail becomes irrelevant, and the record is ready to acquire fresh inaccuracy.
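An audit entry of the kind just described might look like the following sketch in Python. The class AuditEntry, the function change_field, and every field name are hypothetical; the point is only that each change carries its old value, new value, time, informant, entry tech, and reason.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class AuditEntry:
    field_name: str
    old_value: str
    new_value: str
    changed_at: datetime
    informant: str                 # who supplied the new information
    entered_by: str                # which entry tech recorded it
    reason: Optional[str] = None   # e.g. "marriage", "moved"

def change_field(record, trail, field_name, new_value,
                 informant, entered_by, reason=None, when=None):
    """Update one field of a record and log the change in the audit trail."""
    entry = AuditEntry(
        field_name=field_name,
        old_value=record.get(field_name, ""),
        new_value=new_value,
        changed_at=when or datetime.now(),
        informant=informant,
        entered_by=entered_by,
        reason=reason,
    )
    trail.append(entry)
    record[field_name] = new_value
    return entry

# Hypothetical name change after a marriage.
record = {"surname": "Smith"}
trail = []
entry = change_field(record, trail, "surname", "Jones",
                     informant="patient", entered_by="tech_07", reason="marriage")
```

Once a change has been independently confirmed, its entries can be dropped from the trail, as described above.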

The remarkable thing about our current record-keeping practices is that we never allow the person whose data is stored to see, enter, or verify its accuracy directly. With rare exceptions, even the medical information stored there is not reviewed or verified by the healthcare professional who created it. (In most current systems this is the collection of diagnosis codes in the accounts-receivable system.)

A goal, therefore, of any clinical information system must be to allow the patient and the physician each to have access to this summary record and to propose corrections to it.

What should this "medical demographic" contain?
· Identifying information
· Location
· Payors
· Medical data set.

The minimum contents of this medical data set are well known to practitioners:
· Medication allergies and intolerances
· Active medical problems
· Past health events that will affect future health decisions
· Heritable health influences
· Continuing medication use.
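The two lists above translate directly into a record layout. The sketch below, in Python, is one way to write it down; the class and field names are mine, and the choice of plain lists and dictionaries for the contents is an assumption made to keep the example short, not a proposed standard.

```python
from dataclasses import dataclass, field

@dataclass
class MedicalDataSet:
    """The minimum medical data set, one list per item above."""
    allergies_and_intolerances: list = field(default_factory=list)
    active_problems: list = field(default_factory=list)
    past_health_events: list = field(default_factory=list)
    heritable_influences: list = field(default_factory=list)
    continuing_medications: list = field(default_factory=list)

@dataclass
class MedicalDemographic:
    """Identifying information, location, payors, and the medical data set."""
    identifying_information: dict
    location: dict
    payors: list
    medical_data: MedicalDataSet = field(default_factory=MedicalDataSet)

# A hypothetical patient; every value is invented.
patient = MedicalDemographic(
    identifying_information={"name": "Jane Doe", "date_of_birth": "1950-04-01"},
    location={"city": "Menomonie", "state": "WI"},
    payors=["Example Health Plan"],
)
patient.medical_data.allergies_and_intolerances.append("penicillin")
patient.medical_data.active_problems.append("hypertension")
```

Keeping the medical data set as its own unit makes it easy to mark it non-public as a whole while the rest of the record is handled under ordinary rules.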


Two seductive characteristics of computers lure the unsuspecting: the potential for automated data handling, and the possibility of enhanced efficiency. But computers do not save time, money, or effort -- unless they and their use are managed intelligently and with discipline. To use computers to manage large databases efficiently and effectively does require technical expertise. Ergonomic design for efficiency requires detailed knowledge of how specific work is or should be done.

Studies have shown that the chief barriers to acceptance and deployment of computerized medical record systems are cost and usability. It is not my purpose to address cost, except to note as an aside that medical enterprises are throwing away money on commercial systems that turn out to be extremely expensive to maintain due to needless inefficiency and to the economic captivity of vendor lock.

It is important to acknowledge the major barriers to usability, well identified in the literature:
· workflow integration
· geographic access to devices
· the importance of actually improving productivity
· the effect of the "learning curve" on the use of systems
· the effect of failing to use web-based, modular, "lightly structured" approaches.

Executives seldom have knowledge of either information technology or physician work patterns; in any organization they tend to isolate themselves from detail and often become captives to a whirlwind of communication with each other and communiques to their underlings. IT professionals and physicians are both often thought to be too "narrow" to be effective leaders; American management culture is mistrustful of experts, who are presumptively viewed as egocentric and biased, as if ignorance granted objectivity or wisdom.

Commercial medical software has been and remains "pieceware," applications designed to serve a particular part of the enterprise. More importantly, data exchange with other applications has been completely and deliberately ignored, in the best circumstance to ensure that when a company expands its efforts to another area within the medical enterprise, it will not have "given away" anything to a potential competitor.

The first software in clinics and hospitals was accounts-receivable systems, and these products continue to dominate the market. Their vendors have been completely unsuccessful in producing useful computerized medical record applications because they have designed their efforts around the needs and requirements of medical records technicians, which has characteristically produced software that is ergonomically stressful for physicians.

Comprehensive laboratory data-management systems have emerged in this decade; they are chiefly oriented to the needs of laboratory technicians and their regulatory responsibilities; providing data to clinicians is superficially considered, and interfacing to clinical systems is not planned for.

Pharmacy software has emerged to aid in pharmacy management and dispensing; this has not been designed with the idea of providing integration with physician prescription-writing, or receiving prescriptions from physicians with electronic validation.

Transcription software has been designed around the needs of typists, and has not been planned to create any kind of a structured record that might be useful in creating even a modestly useful database.

Some hospitals have tried to create ordering software and integrated charting, but these are characteristically oriented toward complete documentation, not ergonomic efficiency. In any case, the hospitals that have felt able to invest in such systems are large and complex, and the temptation is to try to do everything imaginable at once. This spawns bloatware and hinders the discipline that could progress from a simple system that does essential, easily-automated tasks well into a collection of simple systems that interact smoothly, later adding features and complexity in a logical and ergonomically-driven sequence.

A few physicians have tried to create electronic medical records; not surprisingly, the only clinically respected EMR/CPR systems are those that have been designed by physicians for clinical usefulness. These developers have discovered that integration with legacy systems, especially those doing accounts receivable and laboratory data management, has been an interesting and laborious challenge.

Security, Confidentiality, and Authentication

It is not my goal to reproduce here the important discussions that have taken place regarding the challenge that the Internet presents for security and authentication for system access and data exchange. Instead, I wish to point out that there are several important issues, of which the technological challenge of S & A is only the foundation.

It is worth pointing out, as an aside to this discussion, that the most important threat to confidentiality of medical information is not unauthorized access to either paper or electronic records. The greatest threat is authorized access. Insurance companies, in particular, have for decades habitually required applicants to sign a blanket release for all medical records. Their signature authorizes exactly this, and clinics comply. What electronic records do is merely to make release easier and less expensive, and permit extremely sophisticated searches and statistical analysis. They also present an opportunity to use the information for commercial purposes without release, which itself presents some ethical concerns.


Ethically, all information regarding a patient, whether provided voluntarily or at the insistence of the nice receptionist, or produced by the clinicians and consultants, belongs jointly to the patient and the organization; despite having a share in ownership, the organization has only a limited "property right" over this information, as the patient has a greater interest in its control and more to lose by mishandling than does the organization.

There are no ethical levels of confidentiality; there are regulatory levels, related to the adverse consequences to the organization or to individuals in it that may follow inappropriate release. The organization does not have the right to release even the "public" information about a patient, for to do so reveals that the patient is in fact a client of the organization.

Based on the likelihood of "injury" to the patient and the consequences to the miscreant of inappropriate release, there are at least three levels of confidentiality:
· Public information: address, telephone number, names of related persons, etc.
· Routine medical information: ordinary diagnoses, lab results of no "general" interest and so on.
· Sensitive medical and counseling information: psychology and psychiatry notes, pregnancy test results, lab results and clinical notes regarding sexually transmitted disease and the like.

Because of this construct, clinics customarily create "superconfidentiality" for psychology notes and HIV test results, which in most states is protected by law.

Thus any medical information management system should be designed to provide multiple levels of confidentiality.
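
A minimal sketch of how the three levels above might be enforced for viewing inside the organization follows. The role names and the clearance table are assumptions made for illustration; a real clinic would set its own policy, and release outside the organization is a separate matter not modeled here.

```python
from enum import IntEnum


class Confidentiality(IntEnum):
    PUBLIC = 1      # address, telephone number, names of related persons
    ROUTINE = 2     # ordinary diagnoses, lab results of no general interest
    SENSITIVE = 3   # psychiatry notes, STD results, and the like


# Hypothetical role-to-clearance table.
CLEARANCE = {
    "receptionist": Confidentiality.PUBLIC,
    "nurse": Confidentiality.ROUTINE,
    "treating_physician": Confidentiality.SENSITIVE,
}


def may_view(role: str, item_level: Confidentiality) -> bool:
    """Allow access only when the role's clearance covers the item's level."""
    clearance = CLEARANCE.get(role)
    if clearance is None:
        return False  # unknown roles see nothing
    return clearance >= item_level
```

Because the levels are ordered, adding a fourth level (for example, the "superconfidentiality" mentioned above) would only require a new enum member and updated clearances.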

Medical enterprises depend primarily on a culture of confidentiality rather than strict policing to preserve the privacy of medical records. The rules have been simple and clear for ages; the incentive to comply with these rules is part of "professionalism," and we store the records in mildly inaccessible places, not in bank vaults. Within our buildings, we leave charts lying all over, in stacks that are part of the records-handling process, and only a few employees are permitted (by rule) to have access to them. Security within an intranet can use professionalism similarly to keep security procedures simple enough that they do not hinder efficient work.

But the Internet poses special problems for security and authentication: we must make sure that our firewalls effectively prevent unauthorized access, and, with the growing use of off-site data storage pioneered by Oracle and others, we must devise effective protections against both breaches of confidentiality and commercial exploitation of the stored contents.

Clinical data (database design)

The key to creating a community medical informatics system is common agreement on the database configuration. It is not necessary to agree on every detail, as any user is free to modify the structure and any code as desired. But the felt need for such modifications will be minimized by careful attention to defining the essentials. The FreePM project, led by Tim Cook, is paying careful attention to database design in this manner. He realizes that it is more important to design well than to begin coding promptly. He expects to spend several months with database design.

There are four arenas which require this fundamental attention:

Professional fees. To create a billing system for the salaried health care provider is a non-task. All other professionals have some fee structure that has these components:
- the service(s) provided
- the condition(s) treated
- the identity of the provider (with location)
- the identity of the patient (with location)
- the date or duration of the encounter
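
The five components above map naturally onto a single encounter record. The following is only a sketch under assumed names (`Party`, `Encounter`, and the identifiers are hypothetical); real systems would carry coded services and diagnoses rather than free text.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Party:
    identifier: str   # hypothetical clinic-assigned ID
    location: str


@dataclass
class Encounter:
    services: List[str]        # the service(s) provided
    conditions: List[str]      # the condition(s) treated
    provider: Party            # identity of the provider, with location
    patient: Party             # identity of the patient, with location
    encounter_date: date       # the date of the encounter
    duration_minutes: int = 0  # 0 when only the date matters

    def is_billable(self) -> bool:
        """A charge needs at least one service and one condition."""
        return bool(self.services) and bool(self.conditions)


visit = Encounter(
    services=["office visit"],
    conditions=["sprained ankle"],
    provider=Party("DR-1", "Menomonie"),
    patient=Party("PT-1", "Menomonie"),
    encounter_date=date(1999, 3, 20),
)
```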

Clinical documentation. This is most importantly "progress notes," but also related records such as prescription or medication tracking, problem lists, immunization history, growth charts, pregnancy flow sheets, and other documents created by the provider that serve as an institutional memory.

We must remember, in designing the database, that individual practitioners have strong personal preferences for various types of organization and presentation of clinical documents, so the database design must not presume any particular display organization for the elements that contribute to this material.

For example, some providers produce a single, unfragmented, narrative clinical document for each encounter. At the other end of the spectrum are technologically sophisticated providers such as Dr. Alex Caldwell, author of Tk-FP, who has written finely granular menuing software that permits creation of a progress note via mouse clicks from boilerplate text and can paste text from the problem list, medication list, etc.

The solution for the database designer is to create a finely granular structure, as this can accommodate the highly subdivided note structure, and can be easily adapted to the needs of the user who creates a less structured note.

But to create a clinical record, even an organized one, is not the real task. The real challenge is to create a clinical record from which either oneself or another provider can easily cull newly-important information. The difficulty of this task is apparent when one thinks about the fact that the record created today, for a well patient with a sprained ankle, may be reviewed next month when the same person presents with abdominal pain. The fragments of history and observation that are generally pertinent across time need to be recorded or displayed differently than those fragments that are not of enduring interest medically, and are preserved only because the profession of law has a representative down the block who also may serve the patient in the indefinite near future.

What does a clinician actually look for in reviewing past notes? I claim, based on twenty years of experience in internal medicine, that a clinician is never interested in the continuous narrative of an old note. We chiefly look for the following types of information:
· The (approximate) date of onset of a chronic condition; e.g., when hypertension was noted, or diabetes began.
· The (approximate) date and nature of any life-changing isolated events; e.g., when the left ankle fracture occurred, how severe it was, and what treatment was used; or when the laparotomy was done, why, and what was found; or significant but temporary medical illnesses like tuberculosis.
· A history of (dates of and results of) "health maintenance" actions: e.g., mammograms, endoscopies, immunizations, pap smears, and the like.
· A medication history: allergies and intolerances, hopefully with a contemporaneous description; what medications were prescribed (especially for chronic conditions), what was the clinical response, and when and why they were discontinued.
· Treatment history of chronic disease: especially radiation or chemotherapy for malignancies, immunologic treatment of non-malignant disease (e.g. Crohn's or RA), or significant changes in insulin regimens for diabetics.
· "Descriptive" historical data: baseline and recent chest xrays, EKG's, laboratory values, weight, blood pressure, visual acuity, etc.

We tend to depend on summaries for this information, and so look for hospital admission and discharge summaries, surgical reports, radiology reports, problem lists, and laboratory flow sheets.

Ergonomically, the clinician's task in reviewing past clinical notes is to sort out what has become "chaff" from what has been made "wheat" by the current complaint. It is possible to design our database, the display, and the user interface either to hinder or to expedite this task.

New enforcement pressures have been put on providers from the US government, which threatens criminal fraud action if the "level of documentation" does not match or exceed the "level of care" charged for. This has resulted in sometimes massive increases in verbiage to guarantee that the clinical note is as fat as the professional fee, and the result for the clinician, winnowing old notes in the chart for kernels of crucial information, has been a blizzard of chaff that often successfully obscures essential factoids and at least makes the information-gathering task laborious.

Dual-track Clinical Notes Record Needed.

The best solution to this challenge is to judiciously fragment clinical notes, using boilerplate as needed for documentation of charges, but to store separately the boilerplate and any customization. The electronic record, then, would consist of a collection of pointers to boilerplate text and a collection of unique observations about the individual.

When the entire note is to be printed or displayed, the boilerplate and customizations are merged to produce a complete document; but the clinician when viewing the record can choose to view only the unique observations.
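
The dual-track idea can be sketched in a few lines: each fragment of a note holds a pointer to shared boilerplate, a unique observation, or both. The names below (`BOILERPLATE`, `NoteFragment`, and the sample text) are hypothetical and stand in for whatever a real database design would use.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical library of reusable boilerplate, keyed by a short identifier.
BOILERPLATE: Dict[str, str] = {
    "ros_negative": "Review of systems otherwise negative.",
    "exam_ankle": "Ankle examined for swelling, tenderness, and stability.",
}


@dataclass
class NoteFragment:
    boilerplate_id: Optional[str] = None  # pointer into BOILERPLATE
    observation: Optional[str] = None     # text unique to this patient


def full_note(fragments: List[NoteFragment]) -> str:
    """Merge boilerplate and unique observations into a complete document."""
    parts = []
    for frag in fragments:
        if frag.boilerplate_id is not None:
            parts.append(BOILERPLATE[frag.boilerplate_id])
        if frag.observation is not None:
            parts.append(frag.observation)
    return " ".join(parts)


def unique_observations(fragments: List[NoteFragment]) -> List[str]:
    """Return only the patient-specific text, skipping all boilerplate."""
    return [f.observation for f in fragments if f.observation is not None]


note = [
    NoteFragment(boilerplate_id="exam_ankle",
                 observation="Marked lateral swelling on the left."),
    NoteFragment(boilerplate_id="ros_negative"),
]
```

Here `full_note(note)` yields the printable document, while `unique_observations(note)` yields only the single sentence a reviewing clinician is likely to want.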

Problem lists, flow sheets, medication lists and prescription writing are all features of a clinical notes system, each of which requires its own specification. It is not now my purpose to create such specifications.

Clinical supporting data: intelligent system needed.

This supporting data is pre-eminently laboratory results. The clinician's need for presentation of this data is usually different from what the standard laboratory output provides. Hence the extensive use of flow sheets in clinical practice, for patients with conditions that persist for some time. Not nearly enough is done with data analysis. We need an intelligent system to mine and present this physiologic data in ways that are pertinent to the patient's current and continuing medical conditions.

This intelligent system would, based on the diagnoses in the patient's "problem list" (summarized as ICD-9 codes, for example) and the presenting complaint(s) (gathered by ancillary staff and summarized as V-codes, for example), "mine" the electronic record for pertinent laboratory data, radiology reports, health maintenance actions, and relevant clinical notes. It would, between the time the patient checks in and the time the physician joins the patient, prepare customized flow sheets of lab data and an index to relevant narrative notes, and have these available for display on screen.

Here's how this could work for laboratory results:

The flow sheet format that will work best is one which does not display empty columns, and that "collapses" into a single column those values obtained on approximately equivalent dates. The older the lab value, the less important for clinical decisions is its exact date. Use actual dates for the past month, monthly dates for four months, quarterly for a year, and then annual data telescoped in single columns. The "collapsed" or "telescoped" columns might contain several values, in which case the range and number of observations could be reported, e.g.,
            |alk ptase | 32-75 (4) | .
If exact numbers are wanted, the user should be able to reference the cell or column with the cursor or keystrokes and "expand" perhaps by clicking on it. e.g.,
            YR - QTR - MON - DAY .
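
The collapsing rule can be sketched as two small functions: one chooses a column for each observation according to its age, and one summarizes the values that land in the same column. The exact day thresholds (31 days, 4 x 31 days, 365 days) and the column label formats are assumptions made for this sketch.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple


def column_label(observed: date, today: date) -> str:
    """Pick the flow-sheet column for one observation, coarser with age."""
    age_days = (today - observed).days
    if age_days <= 31:
        return observed.isoformat()                    # exact date
    if age_days <= 4 * 31:
        return observed.strftime("%Y-%m")              # monthly column
    if age_days <= 365:
        quarter = (observed.month - 1) // 3 + 1
        return f"{observed.year} Q{quarter}"           # quarterly column
    return str(observed.year)                          # annual column


def cell_text(values: List[float]) -> str:
    """Summarize a collapsed cell as 'low-high (count)'."""
    if len(values) == 1:
        return f"{values[0]:g}"
    low, high = min(values), max(values)
    span = f"{low:g}" if low == high else f"{low:g}-{high:g}"
    return f"{span} ({len(values)})"


def flow_sheet_row(observations: List[Tuple[date, float]],
                   today: date) -> Dict[str, str]:
    """Group one test's values by column; empty columns never appear."""
    columns: Dict[str, List[float]] = defaultdict(list)
    for observed, value in observations:
        columns[column_label(observed, today)].append(value)
    return {label: cell_text(vals) for label, vals in columns.items()}
```

Because columns are created only when a value falls into them, the "no empty columns" requirement is met without any extra work. Keeping the raw values in the database and collapsing only at display time is what makes the "expand" action possible.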

The rightmost column could simply be labelled "older," and either contain simply a tag noting that values exist (this would be easier to implement and faster to calculate and display), or the range of all older values. It would be useful to summarize lengthy reports such as a urinalysis or complete blood count by simply noting that they have been done--with a minus if no abnormal values are reported or a plus if there are abnormal values (the plus could be over-ridden by a doctor if the abnormalities were judged to be trivial, so the display would not be misleading in the future.) An "expand" feature would open to the complete report, or to a flow sheet devoted to a set of complete reports for the period of time encompassed in the column. For example,

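Joe Markiewicz          Wednesday 3-20-99
Clinic # 377-95-2287    DOB 12-17-64
Dr. Gruenhagen

[Flow sheet; the column alignment and most row labels were lost. Surviving cells, in their original order:]
Jan 99 | 4 Q 98
3.3-4.0 (2) | 3.5-4.3 (3) | 3.2-5.0 (12)
1.6 (2) | 10.2 (3)
Fast. gluc | 193 (2) | 92-420 (15)
- (7) + (2)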

Besides clinical notes, there are many other types of narrative supporting data:

- medical imaging - typically consisting of a report by a physician of a graphic
        xray and fluoroscopy
        nuclear medicine
        procedure videos
        clinical photographs
- pulmonary function
- nerve conduction (EMG)

Chart index.  It is not necessary, for example, to always show the actual xray report (or other narrative report). Just having the xray procedure and its date would permit the creation of a table of contents to this part of the chart as well as a flow chart of the procedures that had been performed. At most, one could add a summary diagnosis to the list:
Date    Exam               Result
6-85    Barium Enema       diverticulosis
8-99    CXR                nl
11-94   UGI                DU
8-98    BE                 tics, polyp
9-00    MRI lumbar spine   HNP Rt L5
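
A chart index of this kind is easy to generate once each procedure is stored as a small record. The sketch below uses a hypothetical `IndexEntry` type, stores four-digit years so that sorting is unambiguous, and prints the two-digit form used in the list above.

```python
from typing import List, NamedTuple


class IndexEntry(NamedTuple):
    year: int      # four digits internally
    month: int
    exam: str
    result: str    # short summary diagnosis


def chart_index(entries: List[IndexEntry]) -> str:
    """Render a table of contents, oldest procedure first."""
    lines = [f"{'Date':<7} {'Exam':<18} Result"]
    for e in sorted(entries, key=lambda e: (e.year, e.month)):
        when = f"{e.month}-{e.year % 100:02d}"
        lines.append(f"{when:<7} {e.exam:<18} {e.result}")
    return "\n".join(lines)


entries = [
    IndexEntry(1998, 8, "BE", "tics, polyp"),
    IndexEntry(1985, 6, "Barium Enema", "diverticulosis"),
    IndexEntry(1994, 11, "UGI", "DU"),
]
print(chart_index(entries))
```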

Professional communication. Many types of communication need to be kept in a patient's record. Many of these need not be reduced to electronic form. At most, the electronic record should be an index to the paper documents that have been archived in a paper file. Examples of historical records that do not require electronic access are:
- reports from consultants
- hospital admission and discharge summaries, operative and procedure notes
- letters to or from patients
- correspondence with employers and attorneys
- forms from or for employers, DME providers, home health nurses, insurance companies, etc.
- old records sent from past caregivers
- nursing home documentation
- reprints from the medical literature
- archival clinical records

Communication between providers is more and more likely to be in electronic form, usually as email. If intranet connectivity is achieved between providers' clinical databases, it will be possible to directly add data and notes generated by other providers in a clinician's own electronic record. If this is done, then the source of the data must be indicated within the clinician's database.

Prescriptions and a drugs database

This I consider to be a part of the "clinical notes" function, because it is the clinician who uses it, and the results of the prescribing process belong with the clinical note. I will briefly list here the essential features of a prescription system.

· Drugs database. It is not a difficult task to import the FDA Orange Book, available at their web site, into a database, and the FDA publishes monthly updates which could be used to automatically update the drugs database.

· Prescription-writing tool. A prescription should be easy to create, and the software should be capable of printing a prescription on paper for signature, faxing a signed prescription directly to the pharmacist, or electronically submitting the prescription if the pharmacy is capable of accepting this.

· Medication list. It is important to be able to produce not only a complete list of current and recent medications for the chart, but to prepare a list of the patient's current medications for the patient, for the pharmacist, and for home health nurses.

· Allergy/intolerance list. It is not essential for this list to be part of the database if the clinician is working from a paper chart, as it is the clinician's responsibility to be sure that all medication intolerance is considered prior to writing a prescription, and such lists are notoriously incomplete. But it is helpful to have this feature, and the clinician should work diligently to maintain an accurate list in the database.

· Medication interactions. The Medical Letter has a web site, and during this last year has made their drug interactions software available to subscribers via the Internet. As set up on the Internet, the user can check interactions on up to six drugs at a time. This is a limit of the web page, not of the software. The program does not check for interactions with food or "natural" products, but does provide reprints of literature for many interactions. I have found it more than sufficient for my own needs, and the response has been very fast.
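
For the drugs database item above, the import step might look something like the following sketch. The tilde delimiter, the three column names, and the sample rows are all assumptions made for illustration; the file layout actually published by the FDA should be checked before relying on any of them. Re-running the load with a newer file replaces the table, which is one simple way to apply a monthly update.

```python
import csv
import io
import sqlite3

# Stand-in for a downloaded product file (hypothetical layout, see above).
SAMPLE = (
    "Ingredient~Trade_Name~Strength\n"
    "ATENOLOL~TENORMIN~50MG\n"
    "ATENOLOL~ATENOLOL~50MG\n"
)


def load_products(conn: sqlite3.Connection, text: str) -> int:
    """Replace the products table with the rows in `text`; return the row count."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="~"))
    conn.execute("DROP TABLE IF EXISTS products")
    conn.execute(
        "CREATE TABLE products (ingredient TEXT, trade_name TEXT, strength TEXT)"
    )
    conn.executemany(
        "INSERT INTO products VALUES (:Ingredient, :Trade_Name, :Strength)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")   # in-memory database for the example
count = load_products(conn, SAMPLE)
```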

It is important to the clinician providing longitudinal care to have an audit trail of past prescriptions, listing when each was begun and stopped and, whenever a change was made in medication, form, dosage, or schedule, succinctly indicating the reason. Such an audit trail can eliminate the repetition of past futility.
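
A prescription audit trail of this kind can be sketched as an append-only list of courses, each with a start date, a stop date, and a required reason for stopping or changing. The class and method names are hypothetical, and the drug data are illustrative only.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PrescriptionEvent:
    drug: str
    dose: str
    started: date
    stopped: Optional[date] = None
    reason: str = ""           # why it was stopped or changed


class MedicationHistory:
    def __init__(self) -> None:
        self.events: List[PrescriptionEvent] = []

    def start(self, drug: str, dose: str, when: date) -> None:
        self.events.append(PrescriptionEvent(drug, dose, when))

    def stop(self, drug: str, when: date, reason: str) -> None:
        """Close the open course of `drug`; a reason is required."""
        if not reason:
            raise ValueError("a reason is required")
        for event in self.events:
            if event.drug == drug and event.stopped is None:
                event.stopped, event.reason = when, reason
                return
        raise KeyError(f"no active prescription for {drug}")

    def change(self, drug: str, new_dose: str, when: date, reason: str) -> None:
        """Record a dosage change as a stop followed by a fresh start."""
        self.stop(drug, when, reason)
        self.start(drug, new_dose, when)

    def current(self) -> List[PrescriptionEvent]:
        return [e for e in self.events if e.stopped is None]

    def past_courses(self, drug: str) -> List[PrescriptionEvent]:
        """Closed courses of `drug`, each with its recorded reason."""
        return [e for e in self.events if e.drug == drug and e.stopped is not None]


history = MedicationHistory()
history.start("atenolol", "50 mg daily", date(1998, 1, 5))
history.change("atenolol", "100 mg daily", date(1998, 6, 1),
               "blood pressure still elevated")
history.stop("atenolol", date(1999, 2, 1), "fatigue")
```

Before prescribing, a call such as `history.past_courses("atenolol")` would show that the drug was tried and why it was abandoned.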

Coding Systems

It is not my purpose in this document to review coding systems for symptoms, diagnoses, supplies, and professional services. A concern is that the AMA does not make the CPT coding system freely available; nevertheless the cost of this database is within the means of any practice.

Information Resources

Two advances in software technology have greatly reduced the work needed to bring a usable integrated medical information system into the exam room: the multi-windowed desktop and the web browser.

The multi-window desktop, with proper tools, permits the clinician to have simultaneous access to disparate systems that live on the same network. It is not necessary to switch from one application to another. I can have open at once a word processor for patient instructions and educational material that can be customized for the patient's individual needs in seconds, a terminal session to access her lab data, an Internet session to check drug interactions, an intranet session for access to an electronic medical library, and a window on real-time radar so she can see if the rain will quit before she has to go back out to her car.

The web browser solves an access and display problem, and its ubiquity has made information and tools readily available without the intervention (i.e., work) of the local programmer. We will do well if all displays use this technology to present information. However, HTML presently has very limited ability to use screen real estate efficiently, and has inadequate flexibility in managing keyboard input, so that the ergonomics of browser-based data display and entry are awkward. A current example of this is the FAA's new DIWS software for Aviation Medical Examiners. Most of you will never see this, and it is not possible for me to demonstrate it here; so let me say simply that its ergonomics are awkward, error checking of input is scant, response times are highly variable from seconds to tens of minutes, and security and authentication were a daunting challenge. After talking personally to the programmer, I am confident that this system pushes pure HTML as far as it can.

Despite their severe ergonomic limitations, HTML and browsers prove that a shared display technology is an important part of the foundation for the necessary clinical connectivity of the future. In my own institution, the use of XML is being specified for all future clinical applications, and many commentators have agreed that this will likely be the future common specification for presentation and display.

The Organizational Politics of Systems Design

This discussion about the genuine and perceived strengths and shortcomings of commercial versus open software is in fact a political one, as within institutions decisions are usually made politically, even though experts would prefer that technical merit be the basis for planning and decisions.

Commercial software solutions do have some strengths. These are, by and large, well understood. But vendors, no surprise, obfuscate or deny their deficiencies.

Free and open source software does have some limitations and deficiencies, upon which vendors focus; but also some strengths, which are not widely recognized in part because vendors deny their existence and partly because they are simply new and word has not gotten out.

Strengths of commercial software

Commercial applications exist, they are supported, many have a very long history, the companies understand their customers' needs, and many commercial systems provide extensive training as well as installation, customization, and usually maintenance. Only a commercial solution can provide a "turnkey installation." A customer with no technical expertise whatever can own and use a well designed software application.

A strength of commercial software that is not well understood is the cutting off of political unrest by taking huge portions of a company's IT efforts outside company walls. Is there controversy in the company over technical or ergonomic issues? The canny administrator can obviate all this infighting simply by going to a turnkey commercial solution. A side benefit is that the internal combatants then become allies against the alien invader that management has so "stupidly" obtained.

A corollary to this principle is that if a company is not able to bring organizational discipline and focus to its IT efforts, a commercial solution may be less costly. Organizational "focus" is particularly difficult in academic institutions and highly democratized firms which have powerful, independent department heads, particularly when IT professionals are kept in a "service" role and not permitted to participate at leadership levels.

Myths about commercial software

The assumption that in buying commercial software the customer is typically getting a reliable, cost-effective, well supported system is wrong. Some very good commercial solutions exist, but good experiences are hardly universal.

In particular, the cost effectiveness of commercial software is often assumed without actually examining costs carefully before purchase, and almost no one does a continuing cost analysis of such software after installation. My former clinic, the Rhinelander Medical Center, is exceptional: it did this in the mid 1980's, and after reviewing excellent performance responded to the favorable numbers by opening a service bureau as a profit-making subsidiary.

Weaknesses inherent in commercial software

The two greatest hindrances to the use of commercial software are the proprietary (and therefore unique) files, databases, protocols, display technology, etc.; and inflexibility toward any single user. Getting two separate vendors to cooperate with exchanging data is difficult and slow. At the two clinics with which I am affiliated, approximately eight disparate systems in two locations were more or less wedded by creating a single common demographic database which all applications must access and use (to some extent), a task that was not easy or brief.

Administrators often say that one reason they use commercial software is that if something goes wrong, there is someone to sue. This is one of those rhetorical bites that is catchy but wrong. An economically strong vendor will be able to resolve genuine difficulties; an economically weak one might be unable to garner the resources to solve a difficult issue, may have "thin" technical support staff, and the failing company will have nothing for the user to recover in a suit -- never mind the years it takes to get a settlement.

In fact, the chance that a vendor might disappear while their application is still needed is one of the biggest potential weaknesses of a commercial product.

The most difficult situation that plagues the users of commercial software is vendor lock. This is a real phenomenon; fundamentally it involves the vendor milking the cow once it is in the barn. For example, our hospital's accounts-receivable software vendor told us that A) they planned not to upgrade or support the product we had been using; B) it was not Y2K compatible; C) they had replacement software, a completely different system, available for $1 million and we should plan to purchase and install it well before December 31. A conflict followed, wasting time and money for both parties, following which the old software became Y2K compliant.

Strengths of free and open source software

There is no vendor lock. If one manages projects poorly, there can develop "programmer lock," if code is not well designed or properly documented. But the organization possesses its own code, and is free to hire any programmer, as a contractor or employee, to improve, modify, or adapt to the organization's specific needs.

Interconnectivity. The majority of the Internet is run by GNU/Linux systems and tools. The genesis of the free and open source community was within the Internet, and so this system has the best toolset and best expertise; and it is designed to use old, out of date, inexpensive equipment efficiently.

Large numbers of skilled programmers. The GNU/Linux community has thousands of developers, most able to work remotely, by contract. The challenge to an organization is not finding them, but managing its projects and personnel. Organizations which cannot manage technical professionals well should purchase commercial solutions and be happy with vendor lock.

Robust, redundant toolset. The GNU/Linux system, bolstered by more or less open software, has the broadest, most complete collection of software tools, including the most interesting middleware. (Medical information will continue to be dominated by commercial vendors of proprietary, closed software for at least a decade, and there is no solution for the clinician who hopes to create a comprehensive system for data access and display except to use middleware to mine data from disparate proprietary systems.)

Reasonable and controllable costs. There are no royalties to be paid in this arena; the organization that is able to manage software projects and is able to focus strategically to use free and open software for its strengths will be rewarded with the ability to control its costs and manage the pace of its development, as well as focusing software on its own priorities.

Weaknesses of open source software

No medical applications. In the past, GNU/Linux was disparaged because it had no applications; this is because this is a new OS and free/open software is a new paradigm for development and management. This will change, but it is true that now there are no applications ready for deployment. In fact, non-commercial code will likely be slow to enter this arena.

No turnkey installations. Obviously.

No training. This is a corollary of "no apps."

Myths about open-source software

No support. Actually, one of the fascinating characteristics of the free-software community is its broad and skilled support via Internet discussion groups. But commercial support has become available as companies have emerged to serve this arena. As commercial offerings continue to develop, the number of organizations offering 24 x 7 support will continue to grow.

Unreliable. This is simply false. My Linux systems have been up without crashing since installation, and have been down only for kernel upgrades. Device drivers can be installed without re-booting. My Windows98 system must be taken down every third day, or memory leaks lock it up. WinNT/2000 must be re-booted for installation of any device driver.

Summary: Free/open software solutions are suitable for medical organizations that have managers with technical understanding and that need maximal cost effectiveness; and are best used as "creviceware" for connectivity and data gathering, and "taskware" that is designed for display and manipulation of stored data.

The most important limitation of open source solutions now is that an enterprise needs truly to manage its IS and its "internal customers" well in order to benefit efficiently from their use. For many businesses to purchase turnkey software, with all its obvious severe limitations and inherent frustrations, is a way to avoid managing IS personnel or to obviate internal debates over ergonomics, priorities, and function. As turnkey open-source solutions emerge, this concern will abate.

Commercial solutions are most appropriate for organizations that are unable to understand or manage technical resources, that have more money than knowledge, or that are not able to agree on IT priorities, and for those that need proven software or turnkey solutions.

It is my judgment that it will not be possible to provide a complete free/open source medical information management system for three to five years, and that the chief mistake managers are making with regard to open software is to ban it or to ignore its genuine strengths, as it is extremely cost effective.

I am not sure whether vendors and users can agree on a basis for interchange of common data; but that we do so is important for our patients, who need their health information to be portable as they are referred among specialists and move about the country and the world.

Responses may be directed to me at:

johnson DOT danl AT mayo DOT edu