Judging Cardiac Surgical Quality is Necessarily Complex

Consumers easily judge the quality of most products and services. We all know from experience that places like Disney and Wegmans are superior to the alternatives. Healthcare is no different. Patients know when they are listened to, treated with respect and given explanations by their physicians that they understand. The importance of providing a superior patient experience has grown now that aspects of it are measured, with the results used for public reporting and to financially incentivize performance. When it comes to high-risk activities like heart surgery, consumers also know that outcomes are better at some facilities than others and that surgeons don’t all have the same skill sets. They know whether they would recommend the hospital or surgeon to others in need of cardiac care, based in part on their interpersonal interactions with the surgical team and perceptions of their surgical outcomes. Surgery patients who communicated well with their care team, had no complications after surgery and are willing to recommend their providers to others are the ones likely to have judged their overall service as high quality.

However, the patient perspective is only part of the story. Most consumers are not equipped to accurately judge the true quality of specialized services like heart surgery because they don’t understand the technical and scientific aspects of highly complex care. These are the critical determinants of their outcome and therefore the best measure of the quality of the service they received. Instead of making independent judgments, patients rely on their referring providers to weigh these issues on their behalf and trust their recommendations about the best place to go. This tried and true approach works well for mature programs performing traditional surgical procedures like coronary artery bypass, where reputations become well known around the medical community. In that setting, when a patient dies after heart surgery, it is a safe assumption that the surgeon with an excellent reputation did everything appropriately and that the event does not change the risk for future patients.

In the case of innovative procedures like robotic coronary artery bypass, this assumption no longer consistently works. Referring doctors find it difficult to appraise the quality of a procedure that is new to a community. Skepticism, denial, fear, and confusion always attend new ideas, particularly a robotic approach, performed by only a small subset of surgeons, that challenges a powerful and entrenched status quo practiced by all cardiac surgeons (i.e., open chest surgery). Safety, which is the cornerstone of quality, can be questioned during the early learning curve phase of robotic heart surgery. When complications occur with a surgeon or procedure that has no established reputation, the community is no longer able to give the benefit of the doubt and turns to the hospital to verify quality. US hospitals generally do a poor job of identifying and rooting out patient risks in everyday routine care and are even less capable of identifying the preventable risks of a new, unfamiliar procedure like robotics. A hospital’s standard approach is for its chief medical officer and/or medical executive committee to document the appropriate prerequisites and training, investigate incident reports, determine the cause of death from autopsies, and examine the annual performance reviews of the operating surgeon. Unfortunately, these standard quality assurance tools are even less equipped than referring doctors to decipher the quality of the new program.

Surgical peers attending morbidity and mortality (M&M) rounds are more qualified for this role. The purpose of these conferences, mandated by the Joint Commission, is to review anecdotal adverse events (deaths and major complications) and determine whether surgeon performance was blameworthy and in need of corrective action. Hindsight bias tends to focus the discussions on the surgeon’s decisions and skills, with little understanding of performance relative to benchmarks or the broader impact of team performance on the event. Sometimes the analysis takes on a “shame and blame” approach, which limits the psychological safety others need to admit the role they might have played in the adverse event. On other occasions, the rigor of case review is hindered by the political climate of the institution, which makes surgeons reluctant to openly criticize their colleagues. These flaws are particularly harmful for innovation in cardiac surgery because it is ultimately a team sport: everyone must have the same playbook and the same goal for high quality to become a reality. Encumbered with these deficits, the usefulness of M&M committees is modest at best.

The most effective means of quality assurance (QA) is to compare the overall results of the innovation to a credible benchmark, which in cardiac surgery is provided by the Society of Thoracic Surgeons National Cardiac Database (STS NCD). Quality concerns are then based on objective findings: results that are significantly worse than the benchmark. The worst possible quality outcome is the death of the patient, so the most common performance measurement is the mortality rate. However, mortality is a crude measure that doesn’t have the precision to discriminate quality in a reliable way for most programs. At volumes of fewer than 200 cases per year, a mortality rate higher than expected is more likely to be due to chance than to a true problem with quality (i.e., a “type I error”). Few clinicians or administrators understand the statistical concepts, such as risk adjustment, confidence intervals and statistical power, that are required to analyze rare events. This is not just an academic point. When an adverse result that was in fact due to chance is attributed to poor underlying quality and prompts hasty corrective action, the credibility of the entire QA effort is compromised.
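To make the role of chance concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not an STS method: it assumes a hypothetical benchmark mortality of 2%, treats each case as an independent event with that same risk (no risk adjustment), and asks how often a program whose true quality exactly matches the benchmark would still record a mortality rate at least double the benchmark in a single year. The function name `prob_at_least` and the case volumes are invented for the example.

```python
from math import comb


def prob_at_least(n_cases: int, true_rate: float, k_deaths: int) -> float:
    """Return P(X >= k_deaths) for X ~ Binomial(n_cases, true_rate)."""
    prob_fewer = sum(
        comb(n_cases, i) * true_rate**i * (1 - true_rate) ** (n_cases - i)
        for i in range(k_deaths)
    )
    return 1.0 - prob_fewer


# Hypothetical benchmark: 2 expected deaths per 100 cases (illustrative only).
BENCHMARK_DEATHS_PER_100 = 2

for n_cases in (50, 100, 200, 1000):
    # Smallest death count whose observed rate is at least double the benchmark
    # (integer ceiling avoids floating-point rounding surprises).
    k_deaths = (2 * BENCHMARK_DEATHS_PER_100 * n_cases + 99) // 100
    p = prob_at_least(n_cases, BENCHMARK_DEATHS_PER_100 / 100, k_deaths)
    print(f"{n_cases:5d} cases/year: P(at least {k_deaths} deaths by chance) = {p:.4f}")
```

Under these assumed numbers, a program doing 50 cases a year has roughly a one in four chance of appearing to have double the benchmark mortality through chance alone, and that probability shrinks steadily as annual volume grows. A real analysis would also need risk adjustment and confidence intervals, which this sketch deliberately leaves out.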

Just as patients don’t consider only a single variable in their own choices about healthcare, it is widely accepted that the mortality rate should not be the sole measure of quality of a cardiac surgery program. For example, patients who survive surgery but have prolonged hospital stays due to complications are unlikely to judge their care as high quality. A more comprehensive way to measure quality of care combines the risk of mortality with the risk of major complications (e.g., stroke, kidney failure, deep wound infection) into a single composite score. Because these combined events occur more often than death alone, the STS composite score helps to address the inherent problem of statistical power that comes with using just the mortality rate, and it is now one of the most sophisticated and widely regarded overall measures of quality in health care.

Quality is also defined by compliance with processes in accordance with evidence-based clinical guidelines. Examples of process measures from the STS NCD are the use of the internal mammary artery as a bypass graft and the appropriate prescription of certain medications on hospital discharge. Combining both process and outcome metrics provides a more complete picture of quality. The process measures from the STS NCD were chosen because their use has been shown to improve patient outcomes. A high level of compliance with these measures reveals a program that is able to adopt a standardized approach to its care delivery, which is an important component of a culture of safety. This culture signifies a cohesive surgical team whose members support each other by speaking up on behalf of patient safety and who are able to rapidly learn from and improve on their past mistakes. Successful adoption of robotics in cardiac surgery is heavily dependent on such a culture, mainly because it helps to mitigate the difficult and potentially lethal period at the outset of the program, known as the “learning curve”. Several processes promote such an environment: 1) case briefings for the OR team prior to skin incision, 2) debriefings after each OR case, 3) interdisciplinary team meetings to consider issues outside the OR that affect the safety of patients undergoing the new procedure and 4) the use of simulation to rehearse new or unusual procedures such as robotic surgery or ICU codes that involve reopening the sternotomy (http://www.ncbi.nlm.nih.gov/pubmed/25959836). These efforts are rare at most cardiac surgical programs.

It is reasonable to question why it is necessary to make judging cardiac surgical quality more complex than just asking about the mortality rate. When answering this question, consider a tale of two heart surgery programs with similar mortality rates. One program’s patients have a low morbidity rate after surgery, and its team members communicate well enough to identify the system issues and root causes of problems. The other doesn’t pay attention to its complication rates, has minimal compliance with evidence-based processes and tends to blame individuals when deaths occur. Which program is going to be more aggressive at improving safety and quality rather than just responding to adverse events after they happen? Which is likely to struggle with the adoption of novel techniques and technologies? Where would you want to have your robotic heart surgery performed?