It is well known that few things in life are risk free and that we therefore take on risk whenever we seek the benefits of healthcare. This leads to the straightforward notion that we should evaluate risks and benefits, and then compare them in order to make a rational decision about undertaking the activity. While it is often said that this is being done, there is no consistent methodology for assessing either risk or benefit. And even if we could determine a measure for risk and one for benefits, we wouldn’t know how to rationally compare them because they are different concepts that would be measured in different ways. In addition, how much benefit offsets how much risk? Do benefits have to just edge out risk or be significantly greater by some amount? Similarly, do benefits offset risks even when those risks could be reasonably mitigated?
These kinds of evaluations as applied to medical devices are addressed in a recent FDA draft guidance document called Factors to Consider Regarding Benefit-Risk in Medical Device Product Availability, Compliance, and Enforcement Decisions. This document presents FDA’s thinking on the elements of risks and benefits that should be considered in reaching a benefit-risk determination with regard to certain postmarket regulatory issues, including recalls. The core issue here is that a device that is a candidate for a recall may have current benefits that should be considered relative to its risks. In a stop-use recall, these benefits could be lost, perhaps temporarily, subject to the availability of alternative devices or treatments.
The components of risk considered in the FDA document include the common attributes of severity and likelihood, along with the duration of exposure, uncertainty, detectability, and patient tolerance of risk. Of these, severity and likelihood have often been subject to quasi-numerical analysis by dividing each into a finite number of steps and assigning a numerical value to each step. A risk score then might be obtained by multiplying the individual rankings. While common, this type of calculation has little fundamental basis and a number of problems.
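The multiplication described above can be sketched in a few lines. This is only an illustration: the five-step scales, their labels, and the function name risk_score are assumptions made for the example and do not come from the FDA document. It shows one familiar difficulty with such scores, which is that very different situations can end up with the same number.

```python
# Hypothetical five-step ordinal scales; labels and values are assumptions.
SEVERITY = {"negligible": 1, "minor": 2, "serious": 3, "critical": 4, "catastrophic": 5}
LIKELIHOOD = {"remote": 1, "unlikely": 2, "occasional": 3, "probable": 4, "frequent": 5}


def risk_score(severity: str, likelihood: str) -> int:
    """Multiply the two ordinal rankings to get a single risk score."""
    return SEVERITY[severity] * LIKELIHOOD[likelihood]


# A rare catastrophe and a frequent nuisance receive the same score.
rare_catastrophe = risk_score("catastrophic", "remote")
frequent_nuisance = risk_score("negligible", "frequent")
print(rare_catastrophe, frequent_nuisance)
```

Because the rankings are ordinal labels rather than measured quantities, the product has no units, and nothing in the arithmetic says whether these two situations should really be treated alike.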
Uncertainty can be considered by recognizing that the assignment of a severity or likelihood level is not absolute. Similarly, detectability might be used to modify likelihood since the idea is that the potential for harm will be noticed before it actually causes injury. This is common in device manufacturing where inspections are meant to catch defects before they reach patients and cause injury. Similarly, a clinical alarm is meant to alert staff to a patient issue (detection) before that issue becomes a source of harm.
Duration of exposure might be used to modify severity, but it is only one of several multipliers or weighting factors that might be used. In assigning varying weights, we face the fundamental question of whether and to what degree some factors are more important than others. Patient tolerance is quite another issue. What level of knowledge is assumed in assessing a patient’s willingness to undertake risk? Is tolerance assessed across all patients (measured somehow) or is it individualized? Is the patient’s willingness an overriding factor or just part of the assessment? If just a part, how important?
Combining multiple risks is also problematic. If numerical values are used, it is tempting to add the scores for each risk, but this has no underlying basis and creates additional issues. For example, are multiple small risks “equal” to fewer larger risks? Even if a risk result is achieved, there is still no inherent basis for deciding whether the risk is acceptable, and there remains the question of acceptable to whom.
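The question about small and large risks can be made concrete with a short sketch. The individual scores below are invented for illustration, and combined_score is a hypothetical helper that does nothing more than add them.

```python
def combined_score(scores: list[int]) -> int:
    """Add individual risk scores, as the tempting but unsupported approach would."""
    return sum(scores)


# Invented example values: five modest risks versus a single large one.
many_small_risks = [4, 4, 4, 4, 4]
one_large_risk = [20]

print(combined_score(many_small_risks), combined_score(one_large_risk))
```

Both totals come out the same, yet the sum offers no reason to believe that five modest risks are equivalent to one large one, and no threshold at which either total becomes acceptable.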
In the FDA document, the notion of benefits is also multifactorial, including type of benefit, magnitude of benefit, likelihood of the patient seeing the benefit, duration of the benefit, and patient preference. Each of these factors also could be subject to scales, descriptive words, and scores. For example, the magnitude of the benefit might be labeled very high, high, moderate, low, or very low. Duration might be designated as extended, medium, or short, noting that the various factors need not have the same number of levels. Patient preference has the same issues as patient tolerance of risk. Is this all patients, subgroups of patients, or individual patients? What is their preference based on, or does this matter? How good are people at deciding what is best for them, and is this an unfettered decision or one that can be second-guessed by family, regulators, courts, providers, etc.? Again, a hierarchy of importance is needed in order to weigh the various factors. For example, is duration more or less important than magnitude of benefit?
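To see why the hierarchy of importance matters, consider a minimal sketch using the example scales mentioned above. The numeric values attached to the labels, the two devices, the weights, and the function benefit_score are all assumptions made for illustration; the FDA document prescribes none of them.

```python
# Hypothetical numeric values for the descriptive levels (assumptions).
MAGNITUDE = {"very low": 1, "low": 2, "moderate": 3, "high": 4, "very high": 5}
DURATION = {"short": 1, "medium": 2, "extended": 3}


def benefit_score(magnitude: str, duration: str, w_magnitude: int, w_duration: int) -> int:
    """Weighted sum of two benefit factors; the weights encode the hierarchy."""
    return w_magnitude * MAGNITUDE[magnitude] + w_duration * DURATION[duration]


# Two invented devices described with the same labels in both comparisons.
device_a = ("high", "short")
device_b = ("moderate", "extended")

# If magnitude is weighted more heavily, device A scores higher.
a_mag_first = benefit_score(*device_a, w_magnitude=3, w_duration=1)
b_mag_first = benefit_score(*device_b, w_magnitude=3, w_duration=1)

# If duration is weighted more heavily, device B scores higher.
a_dur_first = benefit_score(*device_a, w_magnitude=1, w_duration=3)
b_dur_first = benefit_score(*device_b, w_magnitude=1, w_duration=3)

print(a_mag_first, b_mag_first, a_dur_first, b_dur_first)
```

The ranking of the two devices flips depending solely on the weights, so the outcome is settled by a choice for which no objective basis has been given.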
And if we did have a rational assessment of risk, perhaps reduced to a score, and an assessment of benefit, perhaps also scored, then what? What do the assessments or numbers mean for things that are fundamentally different and not inherently quantifiable? How do we compare them? Do we need to just tip the scale toward benefit or do we need a more demanding standard? As an engineer, I am used to having specifications that are measurable. Which is greater, stronger, or stiffer is a question that can then be answered by objective evidence using a consistent set of units and measurements. I can describe the evaluation and others can see exactly what I did. But I don’t know how to compare strength or weight to, say, color. Nor do I know how to compare risks to benefits, even if I knew how to actually measure each, which I don’t.
The FDA does not have any suggestions in its draft document for how to combine factors and make comparisons, but perhaps if this draft ever emerges as an actual guidance document it will be more forthcoming. If not, we will have a list of things to consider (which might be of value) but not how to rate and combine them. If each component of an analysis were carefully described, we might be able to follow the analyst’s thinking in reaching decisions. But decisions based on this process will not necessarily be consistent and transparent, which are among the FDA’s goals for the guidance document. These challenges are not limited to this document or the FDA. Whenever people speak of comparing benefits and risks, they are on shaky ground with respect to how they are making their determinations of each, and how they are making the comparison.
William Hyman, ScD, is professor emeritus of biomedical engineering at Texas A&M University. He now lives in New York where he is a consultant and adjunct professor of biomedical engineering at The Cooper Union.