Quality-based assessment of camera navigation skills for laparoscopic fundoplication
Florentine Huettl, Hauke Lang, M. Paschold, F. Bartsch, Sebastian Hiller, B. Hensel, Florian Corvinus, Peter Grimminger, Werner Kneist, Tobias Huber
Abstract:
Laparoscopic fundoplication is considered the gold standard surgical procedure for the treatment of symptomatic hiatus hernia. Studies on surgical performance in minimally invasive hiatus hernia repair have so far neglected the role of the camera assistant. The current study was designed to assess the applicability of the structured assessment of laparoscopic assistance skills (SALAS) score to laparoscopic fundoplication as an advanced and commonly performed laparoscopic upper GI procedure. Randomly selected laparoscopic fundoplications (n = 20) at a single institute were evaluated. Four trained reviewers independently assigned SALAS scores based on synchronized video and voice recordings. The SALAS score (5–25 points) consists of five key aspects of laparoscopic camera navigation, as previously described. Experience in camera assistance was defined as at least 100 assistances in complex laparoscopic procedures. Nine different surgical teams, consisting of five surgical residents, three fellows, and two attending physicians, were included. Experienced and inexperienced camera assistants were equally distributed (10/10). Construct validity was demonstrated, with significant discrimination between experienced and inexperienced camera assistants for all reviewers (P < 0.05). An intraclass correlation coefficient of 0.897 demonstrates the score's low interrater variability. Total operation time decreased with increasing SALAS score, although this did not reach statistical significance. The SALAS score proved effective in discriminating between experienced and inexperienced camera assistants in an upper GI surgical procedure. This study demonstrates the applicability of the SALAS score to a more advanced laparoscopic procedure such as fundoplication, enabling future investigations of the influence of camera navigation on surgical performance and operative outcome.
Keywords: Gold standard (test); Inter-Rater Reliability
BACKGROUND AND PURPOSE:
Hemodynamic features of brain AVMs may portend increased hemorrhage risk. Previous studies have suggested that MTT is shorter in ruptured AVMs as assessed on quantitative color-coded parametric DSA. This study assesses the interrater reliability of MTT measurements obtained using quantitative color-coded DSA.
MATERIALS AND METHODS:
Thirty-five color-coded parametric DSA images of 34 brain AVMs were analyzed by 4 neuroradiologists with experience in interventional neuroradiology. Hemodynamic features assessed included the MTT of the AVM and the TTP of the dominant feeding artery and draining vein. Agreement among the 4 raters was assessed using the intraclass correlation coefficient.
RESULTS:
Interrater reliability among the 4 raters for MTT assessment was poor (intraclass correlation coefficient = 0.218; 95% CI, 0.062–0.414; P = .002). When the analysis was limited to cases in which the raters selected the same image to analyze, the same primary feeding artery, and the same primary draining vein, interrater reliability improved to fair (intraclass correlation coefficient = 0.564; 95% CI, 0.367–0.717; P < .001).
CONCLUSIONS:
Interrater reliability in deriving color-coded parametric DSA measurements such as MTT is poor, so minor differences among raters may result in a large variance in MTT and TTP results, partly due to the sensitivity and 2D nature of the technique. Reliability can be improved by defining a standard projection, feeding artery, and draining vein for analysis.
Intraclass correlation (ICC) is one of the most commonly misused indicators of interrater reliability, but a simple step-by-step process will get it right. In this article, I provide a brief review of reliability theory and interrater reliability, followed by a set of practical guidelines for the calculation of ICC in SPSS.
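The article above walks through ICC calculation in SPSS; as a language-neutral companion, the following is a minimal sketch of the same quantity computed directly from its two-way ANOVA definition. The function name, the ICC(2,1) model choice (two-way random effects, absolute agreement, single rater), and the example ratings matrix are illustrative assumptions, not taken from the article.

```python
# Hypothetical sketch: ICC(2,1) from its ANOVA mean squares with NumPy.
import numpy as np

def icc_2_1(ratings: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings -- an (n_subjects, k_raters) matrix with no missing values.
    """
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)   # one mean per subject
    col_means = ratings.mean(axis=0)   # one mean per rater

    ss_total = ((ratings - grand) ** 2).sum()
    ss_rows = k * ((row_means - grand) ** 2).sum()   # between-subject
    ss_cols = n * ((col_means - grand) ** 2).sum()   # between-rater
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Illustrative data: 6 subjects each rated by 3 raters.
scores = np.array([[9, 2, 5], [6, 1, 3], [8, 4, 6],
                   [7, 1, 2], [10, 5, 6], [6, 2, 4]], dtype=float)
print(f"ICC(2,1) = {icc_2_1(scores):.3f}")
```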
Objectives
To confirm interrater reliability using blinded evaluation of a skills-assessment instrument to assess the surgical performance of resident and fellow trainees performing pediatric direct laryngoscopy and rigid bronchoscopy in simulated models.
Design
Prospective, paired, blinded observational validation study.
Subjects
Paired observers from multiple institutions simultaneously evaluated residents and fellows who were performing surgery in an animal laboratory or using high-fidelity manikins. The evaluators had no previous affiliation with the residents and fellows and did not know their year of training.
Interventions
One- and 2-page versions of an objective structured assessment of technical skills (OSATS) instrument composed of global and task-specific surgical items were used to evaluate surgical performance.
Results
Fifty-two evaluations were completed by 17 attending evaluators. Agreement for the 2-page instrument was 71.4% when measured as a binary variable (ie, competent vs not competent) (κ = 0.38; P = .08). Evaluation as a continuous variable revealed 42.9% agreement (κ = 0.18; P = .14). The intraclass correlation was 0.53, considered substantial/good interrater reliability (69% reliable). For the 1-page instrument, agreement was 77.4% when measured as a binary variable (κ = 0.53; P = .0015). Agreement when evaluated as a continuous measure was 71.0% (κ = 0.54; P < .001). The intraclass correlation was 0.73, considered high interrater reliability (85% reliable).
Conclusions
The OSATS assessment instrument is an effective tool for evaluating surgical performance among trainees, with acceptable interrater reliability in a simulator setting. Reliability was good for both the 1- and 2-page OSATS checklists, and both serve as excellent tools for providing immediate formative feedback on operational competency.
Intraclass correlation coefficients are useful statistics for estimating interrater reliability. The ICC provides a means for quantifying the level of rater agreement as well as rater consistency. The ICC is easier to use than the Pearson r when more than two raters are involved and can be computed when data are missing on some subjects (Haggard, 1958). Use of this statistic allows the researcher to decide whether or not to include rater effects in estimating IRR and to determine the precision of the reliability estimate. Information about the various types of intraclass correlations and their use is frequently absent from psychometric references commonly used by nurse researchers, resulting in confusion about correct usage and interpretation. Because different values are obtained depending on which ICC formula is selected, ICC formulae reported in the literature can have varying interpretations. For this reason, it is important for researchers to become familiar with the various forms of intraclass correlations and to report the version used in their calculations and the rationale for their choice.
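To make the point above concrete — that different ICC formulas yield different values on the same data — here is a hedged sketch computing the three common single-rater forms side by side. The function name and ratings matrix are illustrative; the mean-square definitions follow the standard one-way and two-way ANOVA decompositions.

```python
# Hypothetical illustration: ICC(1,1), ICC(2,1), and ICC(3,1) differ
# on the same ratings matrix, so the chosen form must be reported.
import numpy as np

def icc_variants(ratings: np.ndarray) -> dict:
    n, k = ratings.shape
    grand = ratings.mean()
    ss_total = ((ratings - grand) ** 2).sum()
    ss_rows = k * ((ratings.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((ratings.mean(axis=0) - grand) ** 2).sum()

    msr = ss_rows / (n - 1)                          # between subjects
    msw = (ss_total - ss_rows) / (n * (k - 1))       # within subjects (one-way)
    msc = ss_cols / (k - 1)                          # between raters
    mse = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    return {
        "ICC(1,1) one-way random": (msr - msw) / (msr + (k - 1) * msw),
        "ICC(2,1) two-way random": (msr - mse)
            / (msr + (k - 1) * mse + k * (msc - mse) / n),
        "ICC(3,1) two-way mixed": (msr - mse) / (msr + (k - 1) * mse),
    }

ratings = np.array([[9, 2, 5], [6, 1, 3], [8, 4, 6],
                    [7, 1, 2], [10, 5, 6], [6, 2, 4]], dtype=float)
for name, value in icc_variants(ratings).items():
    print(f"{name}: {value:.3f}")    # the three values differ
```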
This chapter focuses on three measures of interrater agreement, including Cohen's kappa, Scott's pi, and Krippendorff's alpha, which researchers use to assess reliability in content analyses. Statisticians generally consider kappa the most popular measure of agreement for categorical data. Weighted kappa became an important measure in the social sciences, allowing researchers to move beyond unordered nominal categories to measures containing ordered observations. The intraclass correlation coefficient serves as a viable option for testing agreement when more than two raters assess ordinal content. A key concern in using an intraclass correlation coefficient as a measure of agreement is the selection of the correct ICC statistic. Intraclass correlation coefficients also provide indications of reliability with ordinal data, as does Kendall's coefficient of concordance. The chapter offers SPSS instructions for computing kappa and intraclass correlation coefficients.
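Alongside the chapter's SPSS instructions, the following is a minimal sketch of unweighted and linearly weighted Cohen's kappa for two raters. The function name, the `weights` flag, and the rating data are illustrative assumptions; only the standard kappa definition (observed vs chance-expected agreement) is taken as given.

```python
# Hypothetical sketch of Cohen's kappa and linearly weighted kappa.
import numpy as np

def cohens_kappa(a, b, categories, weights=None):
    """Kappa for two raters; pass weights='linear' for ordered categories."""
    idx = {c: i for i, c in enumerate(categories)}
    m = len(categories)
    # Observed joint distribution of the two raters' labels.
    obs = np.zeros((m, m))
    for x, y in zip(a, b):
        obs[idx[x], idx[y]] += 1
    obs /= obs.sum()
    # Expected distribution if the raters labeled independently (chance).
    exp = np.outer(obs.sum(axis=1), obs.sum(axis=0))
    if weights is None:              # unweighted: only exact matches count
        w = np.eye(m)
    else:                            # linear agreement weights (ordinal data)
        i, j = np.indices((m, m))
        w = 1 - np.abs(i - j) / (m - 1)
    po = (w * obs).sum()             # (weighted) observed agreement
    pe = (w * exp).sum()             # (weighted) chance agreement
    return (po - pe) / (1 - pe)

rater1 = [1, 2, 3, 3, 2, 1, 2, 3, 1, 2]
rater2 = [1, 2, 3, 2, 2, 1, 3, 3, 1, 1]
cats = [1, 2, 3]
print(f"kappa          = {cohens_kappa(rater1, rater2, cats):.3f}")
print(f"weighted kappa = {cohens_kappa(rater1, rater2, cats, 'linear'):.3f}")
```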
OBJECTIVE:
To evaluate the intra- and interrater reliability of thigh circumference measurement using different anatomical references.
MATERIAL AND METHODS:
Twenty-five volunteers with no history of pathology or surgery in the dominant leg entered the study. The measurements were performed by two independent evaluators on two occasions with an interval of one week. The order of measurements and participants was randomized. The results of the interim measures were concealed and analyzed by a third investigator. The assessment protocol was defined in advance. Intra- and interrater correlation was measured with the intraclass correlation coefficient (ICC). Limits of agreement were established according to the method of Bland and Altman.
RESULTS:
Intrarater reproducibility was high (SPP ICC 0.96; KJL ICC 0.95; ASIS ICC 0.96). In the interrater analysis, the coefficients were: SPP ICC 0.91 (95% CI, 0.79–0.96); KJL ICC 0.94 (95% CI, 0.86–0.97); ASIS ICC 0.90 (95% CI, 0.77–0.95).
CONCLUSIONS:
All methods presented high intra- and interrater reliability; given the simplicity of the measurement method, this may favor the choice of SPP in the absence of pathology in the anatomical segment evaluated.
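The study above establishes limits of agreement by the method of Bland and Altman; the following is a minimal sketch of that calculation (mean difference ± 1.96 SD of the paired differences). The variable names and circumference values are illustrative, not the study's data.

```python
# Hypothetical sketch: Bland-Altman 95% limits of agreement for two raters.
import numpy as np

rater_a = np.array([52.1, 48.3, 55.0, 50.2, 47.8, 53.4])  # thigh girth, cm
rater_b = np.array([51.6, 48.9, 54.1, 50.8, 47.2, 53.9])

diff = rater_a - rater_b
bias = diff.mean()                          # systematic difference (bias)
sd = diff.std(ddof=1)                       # SD of the paired differences
loa = (bias - 1.96 * sd, bias + 1.96 * sd)  # 95% limits of agreement

print(f"bias = {bias:.2f} cm, "
      f"limits of agreement = ({loa[0]:.2f}, {loa[1]:.2f}) cm")
```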
To determine the reliability of volitional and nonvolitional limb muscle strength assessment in critically ill patients and to provide guidelines for the implementation of limb muscle strength assessment in this population, the following computerized bibliographic databases were searched with MeSH terms, keywords, or combinations thereof: MEDLINE through PubMed and Embase through Embase.com. Articles were screened by two independent reviewers. All included studies were original articles performed in humans. The research population consisted of adult, critically ill patients or ICU survivors of either sex admitted to a medical, surgical, respiratory, or mixed ICU. A study was included if the reliability of muscle strength measurements was determined in this population. Data on baseline characteristics (country, study population, eligibility, age, setting, and method and equipment of limb muscle strength assessment) and reliability scores were obtained by two independent reviewers. Data from six observational studies were analyzed. Interrater reliability of the Medical Research Council scale for individual muscle groups varied from "fair" or "substantial" (weighted κ, 0.23-0.64) to "very good" agreement (weighted κ, 0.80-0.96). Interrater reliability of the Medical Research Council sum score was found to be very good in all four studies (intraclass correlation coefficients, 0.86-0.99, or Pearson product-moment correlation coefficient = 0.96). Interrater reliability of handheld dynamometry was comparable between two studies (intraclass correlation coefficients, 0.62-0.96). Interrater reliability of handgrip dynamometry was very good in two studies (intraclass correlation coefficients, 0.89-0.97). Intrarater reliability of handheld dynamometry and handgrip dynamometry was assessed in one study, and results were very good (intraclass correlation coefficients > 0.81). No studies were obtained on the reliability of nonvolitional muscle strength assessment. Voluntary muscle strength measurement has proven reliable in critically ill patients provided that strict guidelines on adequacy and standardized test procedures and positions are followed.