Use of a Validated Algorithm to Judge the Appropriateness of Total Knee Arthroplasty in the United States: A Multicenter Longitudinal Cohort Study

2014 
Several recent high-profile publications have described the dramatic growth in utilization of total knee arthroplasty (TKA) in the United States (1–4). Between 1991 and 2010, for example, the annual volume of TKA surgeries among Medicare beneficiaries increased 161.5% and per capita utilization increased 99.2% (1). Some have suggested that TKA is over-utilized (2) or that over-utilization may be one factor explaining large per capita increases in TKA surgery (1). Cram and colleagues (1) contend that recent growth in TKA utilization is likely due both to increased utilization of a highly effective procedure and to over-utilization of a procedure that is highly reliant on subjective criteria. Any determination of the extent to which TKA surgery is appropriate or inappropriate requires the use of valid appropriateness criteria. To our knowledge, such criteria for patients undergoing TKA have not been formally developed or studied in the US, although they have been developed in other countries (5–8).
The most commonly recommended approach for establishing appropriateness criteria for elective surgical procedures is the RAND/UCLA method (9–11). First, a systematic literature review of risks, benefits and indications for the procedure is conducted. Second, an extensive and mutually exclusive set of clinical scenarios (typically numbering in the hundreds) is written to capture the gamut of potential patient presentations, reflecting all potentially important clinical indications. Third, an expert panel is formed to conduct a modified Delphi survey to classify each scenario as appropriate, inconclusive or inappropriate for the procedure. A rating of “appropriate” indicates the expected benefits of the procedure outweigh the expected harms to the extent that the procedure is justified. A rating of “inconclusive” indicates either that the expected benefits and harms are roughly equal or that panel members did not reach consensus. A rating of “inappropriate” indicates the expected harms outweigh the expected benefits.
The most extensively studied RAND/UCLA-based appropriateness algorithm for TKA is the approach developed in Spain by Escobar and colleagues (5;12–15). The authors conducted a systematic review of TKA evidence related to indications, effectiveness, and risks and used this evidence to develop 624 clinical scenarios based on the following literature-based key variables: symptom behavior, functional status, extent and location of radiographic arthritis, age, knee joint mobility and stability, and prior history of surgical and non-surgical treatment. A modified Delphi survey approach was used with two independent national panels (n=11 each), which together comprised arthroplasty surgeons (n=18) and physiatrists or rheumatologists (n=4). Reliability between the two panels in rating TKA for each scenario as appropriate, inappropriate or inconclusive was high (weighted kappa = 0.75). In a subsequent study of 775 TKA patients, those judged as appropriate by these criteria (5) demonstrated the largest WOMAC Scale improvements 6 months following surgery, and those judged as inappropriate had the smallest improvements (14). Ghomrawi and colleagues contend that appropriateness criteria like those developed by Escobar and colleagues are among the most powerful tools for improving quality of care and controlling costs (2). Because studies in other countries reported that 60% to 80% of arthroplasty procedures were found to be appropriate, Ghomrawi et al suggested that similar over-utilization of TKA was possible in the US.
Given that no appropriateness criteria for TKA have been developed in the US, we used a modified version of the Escobar et al appropriateness criteria to make an initial approximation of the proportion of knee arthroplasties that may be inappropriate in the US. While the Escobar et al system was not designed for US patients, we contend that the key criteria used in the system (i.e., pain and functional status, extent of radiographic arthritis, age and knee joint impairment) are likely among the most important criteria for US patients as well (16). Our purpose was to use a modified version of the Escobar et al (5) appropriateness criteria to estimate the proportion of TKA procedures classified as appropriate, inconclusive and inappropriate. We hypothesized that the prevalence of TKAs judged to be inappropriate would be similar to that in prior reports (11;13;14), approximately 20%.
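The three-way classification that a RAND/UCLA panel assigns to each clinical scenario, and the proportions this study sets out to estimate, can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the text above: panelists rate each scenario on a 1 to 9 scale, a median of 7 to 9 maps to appropriate and 1 to 3 to inappropriate, and "lack of consensus" is taken to mean that at least roughly one third of panelists fall in each extreme. The function names and the `min_extreme` parameter are hypothetical, and the ratings are invented for illustration; this is not the Escobar et al algorithm itself.

```python
import math
from collections import Counter
from statistics import median


def classify_scenario(ratings, min_extreme=None):
    """Classify one scenario from panelists' ratings (assumed 1-9 scale)."""
    if not ratings or any(r < 1 or r > 9 for r in ratings):
        raise ValueError("ratings must be a non-empty list of values from 1 to 9")
    # Assumption: disagreement means at least ~one third of the panel
    # rates in each extreme (1-3 and 7-9). Panels may define this differently.
    if min_extreme is None:
        min_extreme = math.ceil(len(ratings) / 3)
    low = sum(1 for r in ratings if r <= 3)
    high = sum(1 for r in ratings if r >= 7)
    if low >= min_extreme and high >= min_extreme:
        return "inconclusive"  # lack of consensus among panel members
    med = median(ratings)
    if med >= 7:
        return "appropriate"
    if med <= 3:
        return "inappropriate"
    return "inconclusive"  # benefits and harms judged roughly equal


def summarize(labels):
    """Return the proportion of cases falling in each category."""
    counts = Counter(labels)
    total = len(labels)
    return {k: counts[k] / total
            for k in ("appropriate", "inconclusive", "inappropriate")}


# Invented ratings from a hypothetical 11-member panel for four scenarios.
panel_ratings = [
    [8, 9, 7, 8, 9, 7, 8, 8, 9, 7, 8],  # high median, no disagreement
    [1, 2, 2, 3, 1, 2, 3, 2, 1, 2, 3],  # low median, no disagreement
    [5, 5, 4, 6, 5, 5, 4, 6, 5, 5, 5],  # middle median
    [1, 1, 2, 2, 8, 8, 9, 9, 9, 9, 9],  # high median but split panel
]
labels = [classify_scenario(r) for r in panel_ratings]
proportions = summarize(labels)
```

In this sketch the fourth scenario has a high median but is labeled inconclusive because the panel is split, which mirrors the second meaning of "inconclusive" described above.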