Identifying the design of a failed implant is a key step in the preoperative planning of revision total joint arthroplasty. Manual identification of the implant design from radiographic images is time-consuming and prone to error. Failure to identify the implant design preoperatively can lead to increased operating room time, more complex surgery, increased blood loss, increased bone loss, increased recovery time, and overall increased healthcare costs. In this study, we present a novel, fully automatic and interpretable approach to identify the design of total hip replacement (THR) implants from plain radiographs using a deep convolutional neural network (CNN). The CNN achieved 100% accuracy in identifying three commonly used THR implant designs. Such a CNN can be used to automatically identify the design of a failed THR implant preoperatively in just a few seconds, saving time and improving identification accuracy. This can potentially improve patient outcomes, free practitioners' time, and reduce healthcare costs.
Purpose A crucial step in the preoperative planning for a revision total hip replacement (THR) surgery is the accurate identification of the failed implant design, especially if one or more well‐fixed/functioning components are to be retained. Manual identification of the implant design from preoperative radiographic images can be time‐consuming and inaccurate, which can ultimately lead to increased operating room time, more complex surgery, and increased healthcare costs. Method In this study, we present a novel approach to identifying THR femoral implant designs from plain radiographs using a convolutional neural network (CNN). We evaluated a total of 402 radiographs of nine different THR implant designs: Accolade II (130 radiographs), Corail (89 radiographs), M/L Taper (31 radiographs), Summit (31 radiographs), Anthology (26 radiographs), Versys (26 radiographs), S‐ROM (24 radiographs), Taperloc Standard Offset (24 radiographs), and Taperloc High Offset (21 radiographs). We implemented a transfer learning approach and adopted a DenseNet‐201 CNN architecture, replacing its final classifier with nine fully connected neurons. Furthermore, we used saliency maps to explain the CNN's decision‐making process by visualizing the pixels in a given radiograph that most influenced the CNN's outcome. We also compared the CNN's performance with that of three board‐certified and fellowship‐trained orthopedic surgeons. Results The CNN performed the same as or better than at least one of the surgeons in identifying eight of the nine THR implant designs and underperformed all three surgeons in identifying one THR implant design (Anthology). Overall, the CNN achieved a lower Cohen's kappa (0.78) than surgeon 1 (1.00), the same Cohen's kappa as surgeon 2 (0.78), and a slightly higher Cohen's kappa than surgeon 3 (0.76) in identifying all nine THR implant designs.
Furthermore, the saliency maps showed that the CNN generally focused on each implant's unique design features to make a decision. Regarding the time spent performing the implant identification, the CNN accomplished this task in ~0.06 s per radiograph. The surgeons' identification time varied based on the method they used. When using their personal experience to identify the THR implant design, they spent negligible time. However, the identification time increased to an average of 8.4 min (standard deviation 6.1 min) per radiograph when they used another identification method (online search, consulting with the orthopedic company representative, or using an image atlas), which occurred in about 17% of cases in the test subset (40 radiographs). Conclusions CNNs such as the one developed in this study can be used to automatically identify the design of a failed THR femoral implant preoperatively in just a fraction of a second, saving time and in some cases improving identification accuracy.
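A saliency map of the kind used above can be computed with vanilla gradients: the gradient of the winning class score with respect to the input pixels highlights the pixels that most influenced the decision. The sketch below uses a toy CNN standing in for the trained model; `radiograph` is a hypothetical input tensor.

```python
import torch
import torch.nn as nn

# Toy CNN standing in for the trained DenseNet-201 (nine design classes).
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 9),
)
model.eval()

# Hypothetical radiograph; requires_grad lets gradients flow to the pixels.
radiograph = torch.rand(1, 3, 224, 224, requires_grad=True)
scores = model(radiograph)
top_class = scores.argmax(dim=1)

# Backpropagate the top class score to the input pixels.
scores[0, top_class.item()].backward()

# Saliency = maximum absolute gradient across the color channels.
saliency = radiograph.grad.abs().max(dim=1).values
print(saliency.shape)  # torch.Size([1, 224, 224])
```

Overlaying `saliency` on the radiograph shows which regions, e.g., the stem shoulder or tip geometry, drove the classification.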
Plain radiography is widely used to detect mechanical loosening of total hip replacement (THR) implants. Currently, radiographs are assessed manually by medical professionals, which may be prone to poor inter- and intra-observer reliability and low accuracy. Furthermore, manual detection of mechanical loosening of THR implants requires experienced clinicians who might not always be readily available, potentially resulting in delayed diagnosis. In this study, we present a novel, fully automatic and interpretable approach to detect mechanical loosening of THR implants from plain radiographs using a deep convolutional neural network (CNN). We trained a CNN on 40 patients' anteroposterior hip x-rays using five-fold cross-validation and compared its performance with that of a high-volume, board-certified orthopaedic surgeon (AFC). To increase confidence in the machine outcome, we also implemented saliency maps to visualize where the CNN looked to make a diagnosis. The CNN outperformed the orthopaedic surgeon in diagnosing mechanical loosening of THR implants, achieving significantly higher sensitivity (0.94) than the orthopaedic surgeon (0.53) with the same specificity (0.96). The saliency maps showed that the CNN looked at clinically relevant features to make a diagnosis. Such CNNs can be used for automatic radiologic assessment of mechanical loosening of THR implants to supplement practitioners' decision-making process, increasing their diagnostic accuracy and freeing them to engage in more patient-centric care.
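The five-fold cross-validation scheme above can be illustrated at the patient level. This is a schematic of the fold construction only (the imaging pipeline is omitted), with hypothetical patient indices: each of the 40 patients lands in exactly one validation fold, so the CNN is never validated on radiographs from a patient it was trained on.

```python
import numpy as np

# Shuffle the 40 hypothetical patient indices and split into five folds.
rng = np.random.default_rng(0)
patient_ids = rng.permutation(40)
folds = np.array_split(patient_ids, 5)  # five folds of 8 patients each

for k, val_ids in enumerate(folds):
    # All patients not in the validation fold form the training set.
    train_ids = np.setdiff1d(patient_ids, val_ids)
    # Train the CNN on train_ids' radiographs, evaluate on val_ids' here.
    print(f"fold {k}: train={len(train_ids)} val={len(val_ids)}")
```

Averaging sensitivity and specificity across the five held-out folds gives the performance estimates reported above.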
Delayed diagnosis of syndesmosis instability can lead to significant morbidity and accelerated arthritic change in the ankle joint. Weight-bearing computed tomography (WBCT) has shown promising potential for early and reliable detection of isolated syndesmotic instability using 3D volumetric measurements. While these measurements have been reported to be highly accurate, they are also experience-dependent, time-consuming, and require a particular 3D measurement software tool, which leads clinicians to remain more interested in the conventional diagnostic methods for syndesmotic instability. The purpose of this study was to increase accuracy, accelerate analysis time, and reduce inter-observer bias by automating 3D volume assessment of syndesmosis anatomy using WBCT scans. We conducted a retrospective study using previously collected WBCT scans of patients with unilateral syndesmotic instability. A total of 144 bilateral ankle WBCT scans were evaluated (48 unstable, 96 control). We developed three deep learning (DL) models for analyzing WBCT scans to recognize syndesmosis instability. These included two state-of-the-art models (Model 1, a 3D convolutional neural network [CNN], and Model 2, a CNN with long short-term memory [LSTM]) and a new model (Model 3, a differential CNN-LSTM) that we introduce in this study. Model 1 failed to analyze the WBCT scans (F1-score = 0). Model 2 misclassified only two cases (F1-score = 0.80). Model 3 outperformed Model 2 and achieved nearly perfect performance, misclassifying only one case in the control group as unstable (F1-score = 0.91), while also being faster than Model 2.
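A CNN-LSTM of the kind compared above can be sketched as follows. This is an illustrative reconstruction, not the study's code: a small 2D CNN encodes each WBCT slice and an LSTM aggregates the slice sequence into a stable/unstable score. The "differential" step is interpreted here as feeding the LSTM the per-slice feature difference between the affected and contralateral ankles; that reading of the model's name is an assumption, since the abstract does not spell out the architecture.

```python
import torch
import torch.nn as nn

class DifferentialCNNLSTM(nn.Module):
    """Sketch: per-slice CNN features, differenced between the two ankles
    (assumed interpretation of 'differential'), aggregated by an LSTM."""

    def __init__(self, feat_dim=32, hidden=64):
        super().__init__()
        # Small per-slice encoder standing in for the study's CNN.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 16, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)  # stable vs. unstable

    def forward(self, affected, contralateral):
        # Inputs: (batch, slices, 1, H, W) stacks of WBCT slices.
        b, s = affected.shape[:2]
        fa = self.encoder(affected.flatten(0, 1)).view(b, s, -1)
        fc = self.encoder(contralateral.flatten(0, 1)).view(b, s, -1)
        # Differential step: side-to-side feature difference per slice.
        out, _ = self.lstm(fa - fc)
        return self.head(out[:, -1])

model = DifferentialCNNLSTM()
affected = torch.rand(2, 10, 1, 32, 32)       # 2 patients, 10 slices each
contralateral = torch.rand(2, 10, 1, 32, 32)  # matching healthy-side slices
logits = model(affected, contralateral)
print(logits.shape)  # torch.Size([2, 2])
```

Subtracting contralateral features lets the classifier respond to side-to-side asymmetry rather than absolute anatomy, which is one plausible reason a differential model could outperform a single-ankle CNN-LSTM on bilateral scans.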