Leveraging class semantic descriptions and examples of known objects, zero-shot learning makes it possible to train a recognition model for an object class whose examples are not available. In this paper, we propose a novel zero-shot learning model that takes advantage of clustering structures in the semantic embedding space. The key idea is to impose the structural constraint that semantic representations must be predictive of the locations of their corresponding visual exemplars. In practice, this reduces to training multiple kernel-based regressors on semantic representation-exemplar pairs from the labeled data of the seen object categories. Despite its simplicity, our approach significantly outperforms existing zero-shot learning methods on standard benchmark datasets, including the ImageNet dataset with more than 20,000 unseen categories.
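To make the construction concrete, here is a minimal sketch, assuming class semantic embeddings and per-class visual exemplars (e.g., averaged visual features) are available as NumPy arrays; it uses scikit-learn's KernelRidge as a stand-in for the kernel-based regressors and is illustrative rather than the authors' implementation.

```python
# Illustrative sketch (not the authors' code): predict visual exemplars of unseen
# classes from their semantic embeddings with kernel ridge regression, then label
# test images by their nearest predicted exemplar. All shapes/names are hypothetical.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def fit_exemplar_regressor(sem_seen, exemplars_seen, gamma=1.0, alpha=0.1):
    """sem_seen: (n_seen, d_sem) semantic embeddings of seen classes.
    exemplars_seen: (n_seen, d_vis) visual exemplars (e.g., mean visual features)."""
    reg = KernelRidge(kernel="rbf", gamma=gamma, alpha=alpha)
    reg.fit(sem_seen, exemplars_seen)  # multi-output kernel regression
    return reg

def zero_shot_predict(reg, sem_unseen, test_feats):
    """Predict exemplars for unseen classes, then assign each test image
    to the class whose predicted exemplar is closest in visual space."""
    exemplars_unseen = reg.predict(sem_unseen)                      # (n_unseen, d_vis)
    dists = ((test_feats[:, None, :] - exemplars_unseen[None]) ** 2).sum(-1)
    return dists.argmin(axis=1)                                     # unseen-class indices
```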
The aims of this study were to investigate (i) whether and when blood pressure would rise or fall and (ii) the associated changes in human heart rate variability (HRV) during manual stimulation of the Neiguan (PC 6) acupuncture site. Two groups of six healthy male volunteers, aged 20-56 and 20-55 years and with no neurological diseases, participated in this study. To minimize artefacts, the electrocardiogram (ECG) and radial arterial pulse pressure wave were recorded with the subjects alert but with eyes closed before, during, and after sham/manual acupuncture. No statistically significant changes (P > 0.05) were found in the sham acupuncture group. In the manual acupuncture group, the needle was inserted into the PC 6 acupoint and manually stimulated for about 15 to 30 seconds to achieve the De Qi sensation. Needles were left in place for 30 min and then removed. Data recorded during acupuncture were then compared with the baseline values. The results indicate that blood pressure can either rise (P < 0.01) or fall (P < 0.01) depending on the subject. To further identify an indicator for one subject who exhibited both a rise and a fall of blood pressure, 7 more trials were conducted with the same protocol until statistically significant results were obtained (P < 0.01). We found that his change of blood pressure was highly correlated (correlation coefficients of -0.94 and -0.99 for rise and fall, respectively) with the ratio of the magnitude of the pulse pressure to that of the dicrotic notch in the local radial pulse wave (P < 0.01). In the heart rate variability (HRV) spectra, significant changes in the low frequency (LF) and very low frequency (VLF) ranges were also detected. These results indicate that the autonomic innervation of the heart was modified. However, the powers of LF, high frequency (HF), and the LF/HF ratio of HRV are not conclusive enough to statistically differentiate the sympathetic contribution from that of the parasympathetic nervous system at this stage.
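For readers unfamiliar with frequency-domain HRV analysis, the sketch below illustrates how VLF, LF, and HF powers and the LF/HF ratio are commonly derived from R-R intervals; the band limits follow widely used HRV conventions, while the resampling rate and windowing choices are assumptions rather than this study's actual pipeline.

```python
# Illustrative sketch only, not the study's analysis pipeline: HRV spectral power
# (VLF, LF, HF) and the LF/HF ratio computed from R-R intervals extracted from an ECG.
import numpy as np
from scipy.signal import welch
from scipy.interpolate import interp1d

def hrv_band_powers(rr_ms, fs=4.0):
    """rr_ms: successive R-R intervals in milliseconds; fs: resampling rate in Hz."""
    t = np.cumsum(rr_ms) / 1000.0                        # beat times in seconds
    grid = np.arange(t[0], t[-1], 1.0 / fs)              # even resampling grid
    tachogram = interp1d(t, rr_ms, kind="cubic")(grid)   # evenly sampled RR series
    f, psd = welch(tachogram - tachogram.mean(), fs=fs, nperseg=min(256, len(grid)))

    def band_power(lo, hi):
        mask = (f >= lo) & (f < hi)
        return np.trapz(psd[mask], f[mask])              # integrate PSD over the band

    vlf = band_power(0.003, 0.04)   # very low frequency
    lf = band_power(0.04, 0.15)     # low frequency
    hf = band_power(0.15, 0.40)     # high frequency
    return {"VLF": vlf, "LF": lf, "HF": hf, "LF/HF": lf / hf}
```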
Federated learning aims to collaboratively train a strong global model by accessing users' locally trained models but not their own data. A crucial step is therefore to aggregate local models into a global model, which has been shown to be challenging when users have non-i.i.d. data. In this paper, we propose a novel aggregation algorithm named FedBE, which takes a Bayesian inference perspective by sampling higher-quality global models and combining them via Bayesian model Ensemble, leading to much more robust aggregation. We show that an effective model distribution can be constructed by simply fitting a Gaussian or Dirichlet distribution to the local models. Our empirical studies validate FedBE's superior performance, especially when users' data are not i.i.d. and when the neural networks go deeper. Moreover, FedBE is compatible with recent efforts in regularizing users' model training, making it an easily applicable module: you only need to replace the aggregation method and leave the other parts of your federated learning algorithm intact.
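The aggregation idea can be sketched as follows, assuming PyTorch models with identical architectures on each client; the Gaussian fitting and ensembling shown here are illustrative, and FedBE's subsequent distillation of the ensemble back into a single global model is omitted.

```python
# Minimal sketch of the aggregation idea, not FedBE's reference implementation:
# fit a diagonal Gaussian to the clients' model weights, sample candidate global
# models, and ensemble their predictions. Hyperparameters are illustrative.
import torch

def sample_global_models(local_state_dicts, n_samples=10):
    """Fit a per-parameter Gaussian over client models and draw sampled global models."""
    keys = local_state_dicts[0].keys()
    stacked = {k: torch.stack([sd[k].float() for sd in local_state_dicts]) for k in keys}
    mean = {k: v.mean(0) for k, v in stacked.items()}
    std = {k: v.std(0) + 1e-8 for k, v in stacked.items()}
    samples = [{k: mean[k] + std[k] * torch.randn_like(std[k]) for k in keys}
               for _ in range(n_samples)]
    return [mean] + samples        # keep the mean (FedAvg-like) model in the ensemble

def ensemble_predict(model, sampled_state_dicts, x):
    """Average the softmax outputs of the sampled global models (Bayesian model ensemble)."""
    probs = []
    for sd in sampled_state_dicts:
        model.load_state_dict(sd)
        model.eval()
        with torch.no_grad():
            probs.append(torch.softmax(model(x), dim=-1))
    return torch.stack(probs).mean(0)
```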
Introduction: The revised Fukuoka 2017 International Consensus Guidelines (ICG) for branch duct (BD)-IPMNs are extensively employed; however, their diagnostic accuracy for distinguishing BD-IPMNs with low-grade dysplasia (LGD) from those with high-grade dysplasia/adenocarcinoma (HGD-Ca) remains suboptimal. EUS-guided needle-based confocal endomicroscopy (EUS-nCLE) has shown promising diagnostic performance in predicting HGD-Ca in IPMNs. The objective of this study was to integrate nCLE variables with the existing ICG criteria to enhance the differentiation of BD-IPMNs. Methods: Subjects with available reference histopathology for BD-IPMNs were enrolled from prospective databases at a single center (2015-2023). Unedited EUS-nCLE videos were reviewed by a blinded expert, and for each subject the parameters of papillary size, epithelial darkness, epithelial thickness, and density were documented (Figure 1). The diagnostic performance of the ICG high-risk criteria and the nCLE variables in identifying HGD-Ca was assessed. Multivariable logistic regression was then performed to predict HGD-Ca. Results: Among the 67 enrolled subjects, the mean (±SD) age was 67.5±9.7 years, and 43.3% were female. The mean (±SD) BD-IPMN diameter was 34.6±10.3 mm. HGD-Ca was detected in 24 (35.8%) of the BD-IPMNs. The sensitivity, specificity, and accuracy for detecting HGD-Ca using the 2017 ICG high-risk criteria were 46% (95% CI: 28-65%), 93% (95% CI: 81-98%), and 76% (95% CI: 65-85%), respectively. Table 1 presents the diagnostic indices of each nCLE variable for predicting HGD-Ca. The variable with the highest performance was papillary epithelial thickness (sensitivity: 63%, specificity: 81%, and accuracy: 75%). Logistic regression analysis to predict HGD-Ca using the ICG high-risk classification and the nCLE variables revealed that papillary epithelial thickness (OR 9.74, 95% CI 1.65-57.37) was the nCLE variable most predictive of HGD-Ca in BD-IPMNs. Combining papillary epithelial thickness with the ICG high-risk criteria (Table 1) improved sensitivity to 83% (95% CI: 64-93%) and accuracy to 79% (95% CI: 68-87%). Conclusion: By providing in vivo real-time visualization of cyst epithelium, EUS-nCLE allows for the characterization of papillary structures and enhances the accuracy of risk stratification in BD-IPMNs. This finding supports ongoing multicenter collaborations and the use of artificial intelligence-assisted models to further improve image analysis.
Figure 1: EUS-nCLE images illustrating different degrees of papillae size, papillary epithelium darkness (representing nuclear stratification), papillary epithelium thickness (representing cellular stratification), and papillary/epithelium density.
Table 1. Diagnostic performance of EUS-nCLE variables and ICG classification for advanced neoplasia based on high-risk features

Variable | Sensitivity % (95% CI) | Specificity % (95% CI) | PPV % (95% CI) | NPV % (95% CI) | Accuracy % (95% CI)
Papillae size | 54.2 (35.1, 72.1) | 60.5 (45.6, 73.6) | 43.3 (27.4, 60.8) | 70.3 (54.2, 82.5) | 58.2 (46.3, 69.3)
Papillary epithelium darkness | 50.0 (31.4, 68.6) | 81.4 (67.4, 90.3) | 60.0 (38.7, 78.1) | 74.5 (60.5, 84.7) | 70.1 (58.3, 79.8)
Papillary epithelium thickness | 62.5 (42.7, 78.8) | 81.4 (67.4, 90.3) | 65.2 (44.9, 81.2) | 79.5 (65.5, 88.8) | 74.6 (63.1, 83.5)
Papillae/epithelium density | 41.7 (24.5, 61.2) | 81.4 (67.4, 90.3) | 55.6 (33.7, 75.4) | 71.4 (57.6, 82.2) | 67.2 (55.3, 77.2)
ICG-HR | 45.8 (27.9, 64.9) | 93.0 (81.4, 97.6) | 78.6 (52.4, 92.4) | 75.5 (62.4, 85.1) | 76.1 (64.7, 84.7)
ICG-HR + Papillary epithelium thickness | 83.3 (64.1, 93.3) | 76.7 (62.3, 86.8) | 66.7 (48.8, 80.8) | 89.2 (75.3, 95.7) | 79.1 (67.9, 87.1)

PPV = Positive Predictive Value, NPV = Negative Predictive Value, CI = Confidence Interval, ICG-HR = 2017 International Consensus Guidelines High-Risk Criteria, EUS-nCLE = EUS-guided needle-based confocal endomicroscopy.
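As a sanity check on the table, the snippet below recomputes the diagnostic indices for the ICG-HR row from the 2x2 counts implied by the reported percentages (24 HGD-Ca and 43 non-HGD-Ca cysts give TP = 11, FN = 13, TN = 40, FP = 3); these counts are inferred from the percentages rather than stated in the text.

```python
# Standard 2x2 diagnostic indices; the counts below are inferred from the reported
# percentages for the ICG high-risk criteria, not taken directly from the study data.
def diagnostic_indices(tp, fp, tn, fn):
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

print(diagnostic_indices(tp=11, fp=3, tn=40, fn=13))
# -> sensitivity ~0.458, specificity ~0.930, PPV ~0.786, NPV ~0.755, accuracy ~0.761,
#    matching the ICG-HR row of Table 1.
```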
One fundamental challenge in building an instance segmentation model for a large number of classes in complex scenes is the lack of training examples, especially for rare objects. In this paper, we explore the possibility of increasing the number of training examples without laborious data collection and annotation. We find that an abundance of instance segments can potentially be obtained freely from object-centric images, based on two insights: (i) an object-centric image usually contains one salient object in a simple background; (ii) objects from the same class often share similar appearances or similar contrasts to the background. Motivated by these insights, we propose FreeSeg, a simple and scalable framework for extracting and leveraging these "free" object foreground segments to facilitate model training in long-tailed instance segmentation. Concretely, we exploit the similarity among object-centric images of the same class to propose candidate segments of foreground instances, followed by a novel ranking of segment quality. The resulting high-quality object segments can then be used to augment the existing long-tailed datasets, e.g., by copying and pasting the segments onto the original training images. Extensive experiments show that FreeSeg yields substantial improvements on top of strong baselines and achieves state-of-the-art accuracy for segmenting rare object categories.
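The copy-and-paste augmentation mentioned above can be sketched as follows; the function name and the random placement policy are assumptions for illustration, not the FreeSeg implementation.

```python
# Illustrative sketch, not the FreeSeg code: paste a "free" object segment onto a
# training image, copy-paste style, and return the corresponding instance mask.
import numpy as np

def paste_segment(image, segment_rgb, segment_mask, rng=None):
    """image: (H, W, 3) uint8 training image.
    segment_rgb: (h, w, 3) uint8 object crop from an object-centric image.
    segment_mask: (h, w) bool foreground mask of the crop."""
    rng = rng or np.random.default_rng()
    H, W, _ = image.shape
    h, w = segment_mask.shape
    y = rng.integers(0, H - h + 1)          # random top-left placement
    x = rng.integers(0, W - w + 1)

    out = image.copy()
    region = out[y:y + h, x:x + w]
    region[segment_mask] = segment_rgb[segment_mask]   # overwrite only foreground pixels

    pasted_mask = np.zeros((H, W), dtype=bool)
    pasted_mask[y:y + h, x:x + w] = segment_mask       # new instance annotation
    return out, pasted_mask
```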
We study the problem of developing autonomous agents that can follow human instructions to infer and perform a sequence of actions to complete the underlying task. Significant progress has been made in recent years, especially for tasks with short horizons. However, when it comes to long-horizon tasks with extended sequences of actions, an agent can easily ignore some instructions or get stuck in the middle of long instructions and eventually fail the task. To address this challenge, we propose a model-agnostic milestone-based task tracker (M-TRACK) to guide the agent and monitor its progress. Specifically, we propose a milestone builder that tags the instructions with navigation and interaction milestones that the agent needs to complete step by step, and a milestone checker that systematically checks the agent's progress on its current milestone and determines when to proceed to the next. On the challenging ALFRED dataset, our M-TRACK leads to a notable 33% and 52% relative improvement in unseen success rate over two competitive base models.
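A minimal sketch of the milestone-tracking idea is shown below, assuming milestones are simple (type, target) pairs extracted from the instructions; the class and method names are hypothetical rather than M-TRACK's actual interface, and the milestone builder and the agent itself are outside the snippet's scope.

```python
# Hypothetical sketch of a milestone checker: track which milestone the agent is on
# and advance only when the current one appears complete.
from dataclasses import dataclass

@dataclass
class Milestone:
    kind: str      # "navigate" or "interact"
    target: str    # e.g., "apple", "microwave"

class MilestoneChecker:
    def __init__(self, milestones):
        self.milestones = milestones
        self.idx = 0

    @property
    def current(self):
        return self.milestones[self.idx] if self.idx < len(self.milestones) else None

    def update(self, visible_objects, last_interaction):
        """Advance past the current milestone once it is judged complete.
        Returns True when every milestone has been completed."""
        m = self.current
        if m is None:
            return True
        done = (m.kind == "navigate" and m.target in visible_objects) or \
               (m.kind == "interact" and last_interaction == m.target)
        if done:
            self.idx += 1
        return self.idx >= len(self.milestones)
```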
Natural images are virtually surrounded by low-density misclassified regions that can be efficiently discovered by gradient-guided search, enabling the generation of adversarial images. While many techniques for detecting these attacks have been proposed, they are easily bypassed when the adversary has full knowledge of the detection mechanism and adapts the attack strategy accordingly. In this paper, we adopt a novel perspective and regard the omnipresence of adversarial perturbations as a strength rather than a weakness. We postulate that if an image has been tampered with, these adversarial directions either become harder to find with gradient methods or have substantially higher density than they do for natural images. We develop a practical test for this signature characteristic to detect adversarial attacks, achieving unprecedented accuracy in the white-box setting where the adversary is given full knowledge of our detection mechanism.
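One way to picture the proposed signature is to measure how hard it is to find an adversarial direction by gradient search; the PyTorch sketch below counts the gradient steps needed to flip a prediction and thresholds on that count. It illustrates only one side of the postulate, and the step size, step budget, and threshold are assumptions, not the paper's actual test.

```python
# Schematic sketch of the detection intuition, not the paper's exact test: count how
# many small gradient steps are needed to push a single input across a decision boundary.
import torch
import torch.nn.functional as F

def steps_to_flip(model, x, step_size=1e-3, max_steps=50):
    """Return the number of gradient steps needed to change the predicted label of x."""
    model.eval()
    with torch.no_grad():
        y0 = model(x).argmax(dim=-1)
    x_adv = x.clone().detach().requires_grad_(True)
    for step in range(1, max_steps + 1):
        loss = F.cross_entropy(model(x_adv), y0)            # push away from current label
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = (x_adv + step_size * grad.sign()).detach().requires_grad_(True)
        if model(x_adv).argmax(dim=-1).item() != y0.item():
            return step
    return max_steps + 1                                     # never flipped within budget

def looks_adversarial(model, x, threshold=5):
    """Flag inputs whose adversarial direction is unusually hard to find (illustrative rule)."""
    return steps_to_flip(model, x) > threshold
```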