Purpose: To develop a deep learning–based approach that reduces the scan time of multipool CEST MRI for Parkinson's disease (PD) while maintaining sufficient prediction accuracy.
Method: A deep learning approach based on a modified one-dimensional U-Net, termed Z-spectral compressed sensing (CS), was proposed to recover dense Z-spectra from sparse ones. The network was trained on Z-spectra simulated with the Bloch equations under various parameter settings. Its feasibility and effectiveness were validated through numerical simulations and in vivo rat-brain experiments, in comparison with the commonly used linear, piecewise cubic Hermite (pchip), and Lorentzian interpolation methods. The proposed method was applied to detect metabolism-related changes in the 6-hydroxydopamine PD model with multipool CEST MRI, including APT, CEST@2 ppm, nuclear Overhauser enhancement, direct saturation, and magnetization transfer; prediction performance was evaluated by the area under the curve.
Results: Both the numerical simulations and the in vivo rat-brain experiments demonstrated that the proposed method retrieves dense Z-spectra with higher fidelity than the existing methods. Significant differences were observed in APT, CEST@2 ppm, nuclear Overhauser enhancement, and direct saturation between the striatum regions of wild-type and PD models, whereas magnetization transfer exhibited no significant difference. Receiver operating characteristic analysis demonstrated that multipool CEST achieved better predictive performance than individual pools. Combined with Z-spectral CS, the scan time of multipool CEST MRI can be reduced to 33% without distinctly compromising prediction accuracy.
Conclusion: Integrating Z-spectral CS with multipool CEST MRI enhances the prediction accuracy of PD while keeping the scan time within a reasonable range.
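The linear and pchip interpolation baselines mentioned above can be illustrated with a toy example. The sketch below recovers a dense Z-spectrum from sparse offsets using a simplified single-Lorentzian line shape; the line-shape parameters and grid sizes are illustrative stand-ins for the Bloch-simulated spectra, and this is the baseline comparison only, not the proposed U-Net:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

def lorentzian_zspectrum(offsets, amp=0.9, width=1.5, center=0.0):
    """Simplified single-pool Z-spectrum: 1 minus a Lorentzian
    direct-saturation line (illustrative stand-in for a Bloch simulation)."""
    return 1.0 - amp * width**2 / (width**2 + (offsets - center) ** 2)

dense = np.linspace(-6, 6, 121)   # dense offset grid (ppm)
sparse = np.linspace(-6, 6, 41)   # roughly one third of the dense samples
z_sparse = lorentzian_zspectrum(sparse)

# Two of the baseline interpolators the abstract compares against.
z_linear = np.interp(dense, sparse, z_sparse)
z_pchip = PchipInterpolator(sparse, z_sparse)(dense)

truth = lorentzian_zspectrum(dense)
mae_linear = np.mean(np.abs(z_linear - truth))
mae_pchip = np.mean(np.abs(z_pchip - truth))
print(f"linear MAE: {mae_linear:.2e}, pchip MAE: {mae_pchip:.2e}")
```

On a smooth spectrum like this, pchip tracks the curvature between sampled offsets better than linear interpolation; the paper's point is that a learned prior can do better still on realistic multipool spectra.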
Acute respiratory distress syndrome (ARDS) is a life-threatening condition with a high incidence and mortality rate among intensive care unit (ICU) admissions. Early identification of patients at high risk of developing ARDS is crucial for timely intervention and improved clinical outcomes. However, the complex pathophysiology of ARDS makes early prediction challenging. This study aimed to develop an artificial intelligence (AI) model for automated lung lesion segmentation and early prediction of ARDS, so as to facilitate timely intervention in the ICU.
Acute respiratory distress syndrome (ARDS) is a life-threatening condition that can develop in critically ill patients. Early identification of risk factors associated with ARDS development is essential for timely intervention and improved patient outcomes. This study aimed to investigate potential predictors of ARDS in critically ill patients admitted to the intensive care unit (ICU).
Prompt tuning methods have achieved remarkable success in parameter-efficient fine-tuning of large pre-trained models. However, their application to dual-modal fusion-based visual-language pre-trained models (VLPMs), such as GLIP, remains problematic: existing prompt tuning methods do not effectively address the mapping and alignment of tokens across modalities, leading to poor transfer generalization. To address this issue, we propose Synchronous Dual Prompt Tuning (SDPT). SDPT initializes a single set of learnable unified prototype tokens in the established modality-aligned space to represent the aligned semantics of the text and image modalities for downstream tasks. Furthermore, SDPT establishes inverse linear projections, which require no training, to embed the information of the unified prototype tokens into the input spaces of the two modalities. These inverse linear projections allow the unified prototype tokens to represent both modalities synchronously, so that SDPT shares the unified text-image semantics across the prompts of different modalities for downstream tasks. Experimental results demonstrate that SDPT enables fusion-based VLPMs to achieve superior outcomes while training only 0.04% of model parameters across various scenarios, outperforming other single- and dual-modal methods. The code will be released at https://github.com/wuyongjianCODE/SDPT.
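The training-free inverse linear projection idea can be sketched in numpy under the simplifying assumption that each modality's embedding into the aligned space is a linear map; the matrices, dimensions, and prototype count below are illustrative toys, not GLIP's actual weights:

```python
import numpy as np

rng = np.random.default_rng(0)

d_align, d_text, d_img = 64, 48, 32  # aligned-space / modality input dims (toy sizes)
n_proto = 4                          # number of unified prototype tokens

# Stand-ins for pre-trained linear embeddings from each modality's
# input space into the shared aligned space (aligned = W @ x).
W_text = rng.standard_normal((d_align, d_text))
W_img = rng.standard_normal((d_align, d_img))

# A single set of learnable unified prototype tokens in the aligned space.
P = rng.standard_normal((n_proto, d_align))

# Training-free inverse linear projections: Moore-Penrose pseudo-inverses
# carry the shared prototypes into each modality's input space.
P_text = P @ np.linalg.pinv(W_text).T  # shape (n_proto, d_text)
P_img = P @ np.linalg.pinv(W_img).T    # shape (n_proto, d_img)

# Re-embedding a projected prototype recovers the component of P that
# lies in the column space of the modality's embedding.
P_text_back = P_text @ W_text.T        # shape (n_proto, d_align)
```

Because the pseudo-inverse needs no gradient updates, only the prototype tokens themselves would be trained, which is consistent with the very small trainable-parameter budget the abstract reports.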
Recent advancements in large vision-language models (VLMs) tailored for autonomous driving (AD) have shown strong scene-understanding and reasoning capabilities, making them strong candidates for end-to-end driving systems. However, little work exists on the trustworthiness of DriveVLMs -- a critical factor that directly impacts public transportation safety. In this paper, we introduce AutoTrust, a comprehensive trustworthiness benchmark for large vision-language models in autonomous driving (DriveVLMs) that considers diverse perspectives, including trustfulness, safety, robustness, privacy, and fairness. We constructed the largest visual question-answering dataset for investigating trustworthiness issues in driving scenarios, comprising over 10k unique scenes and 18k queries. We evaluated six publicly available VLMs, spanning from generalist to specialist and from open-source to commercial models. Our exhaustive evaluations unveiled previously undiscovered vulnerabilities of DriveVLMs to trustworthiness threats. Specifically, we found that generalist VLMs such as LLaVA-v1.6 and GPT-4o-mini surprisingly outperform models fine-tuned specifically for driving in terms of overall trustworthiness. DriveVLMs such as DriveLM-Agent are particularly vulnerable to disclosing sensitive information. Additionally, both generalist and specialist VLMs remain susceptible to adversarial attacks and struggle to ensure unbiased decision-making across diverse environments and populations. Our findings call for immediate and decisive action to address the trustworthiness of DriveVLMs -- an issue of critical importance to public safety and to everyone who relies on autonomous transportation systems. Our benchmark is publicly available at https://github.com/taco-group/AutoTrust, and the leaderboard is released at https://taco-group.github.io/AutoTrust/.
Accurate carbon price forecasting is crucial for effective carbon market analysis and decision-making. We propose a novel Temporal Feature-Refined (TFR) model to address the challenges of complex dependencies and high noise levels in carbon price time series. The TFR model integrates Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) for signal decomposition, an autoencoder for feature optimization, and a Temporal Convolutional Network (TCN) for capturing long-range temporal dependencies. It incorporates both traditional economic factors and unconventional determinants such as air quality, policy uncertainty, and public sentiment. Experiments on the Shanghai carbon trading market demonstrate that the TFR model significantly outperforms existing methods, achieving an 83.96% improvement in MAE over Support Vector Regression (SVR) and up to a 65.56% improvement over Long Short-Term Memory (LSTM) networks. Further analyses, including comparisons with alternative decomposition models and ablation studies, confirm the effectiveness of each component and of the overall model.
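The long-range dependency modeling attributed to the TCN rests on dilated causal convolutions. A minimal sketch of that core operation (a generic illustration, not the TFR implementation) is:

```python
import numpy as np

def causal_dilated_conv(x, kernel, dilation):
    """1D causal convolution with dilation: the output at time t depends
    only on x[t], x[t-d], x[t-2d], ... -- the core operation of a TCN layer."""
    k = len(kernel)
    pad = (k - 1) * dilation              # left-pad so no future samples leak in
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    return np.array([
        sum(kernel[j] * xp[pad + t - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

# Stacking layers with dilations 1, 2, 4, ... grows the receptive field
# exponentially: kernel size k over L layers covers 1 + (k-1)*(2**L - 1) steps.
x = np.arange(8, dtype=float)
y = causal_dilated_conv(x, kernel=[1.0, -1.0], dilation=2)  # computes x[t] - x[t-2]
```

The exponential receptive-field growth is what lets a TCN relate a carbon price to observations many steps in the past without recurrent connections.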
Action Quality Assessment (AQA) has wide applications in various scenarios. For the AQA of long-term figure skating, the main challenges lie in semantic context feature learning for Program Component Score (PCS) prediction and fine-grained technical subaction analysis for Technical Element Score (TES) prediction. In this paper, we propose a Localization-assisted Uncertainty Score Disentanglement Network (LUSD-Net) to handle both PCS and TES prediction. In the LUSD-Net, we design an uncertainty score disentanglement solution, comprising score disentanglement and uncertainty regression, to decouple PCS-oriented and TES-oriented representations from skating sequences, ensuring that differentiated representations are learned for the two types of score prediction. For long-term feature learning, a temporal interaction encoder is presented to model temporal context relations between PCS-oriented and TES-oriented features. To address subactions in TES prediction, weakly supervised temporal subaction localization is adopted to locate technical subactions in long sequences. For evaluation, we collect a large-scale Fine-grained Figure Skating dataset (FineFS) containing RGB videos and estimated skeleton sequences, with rich annotations for multiple downstream action analysis tasks. Extensive experiments show that the proposed LUSD-Net significantly improves AQA performance and that the FineFS dataset provides a rich data source for AQA. The source code of LUSD-Net and the FineFS dataset are released at https://github.com/yanliji/FineFS-dataset.
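One common way to realize the uncertainty regression mentioned above is a heteroscedastic Gaussian negative log-likelihood, where the model predicts a score together with its log-variance; the sketch below is a generic illustration of that loss with made-up score values, not necessarily the exact LUSD-Net formulation:

```python
import numpy as np

def gaussian_nll(score_true, score_pred, log_var):
    """Heteroscedastic Gaussian negative log-likelihood: predicting a larger
    log-variance down-weights a clip's squared error but pays a penalty,
    so the model learns calibrated per-clip uncertainty."""
    return 0.5 * np.mean(
        np.exp(-log_var) * (score_true - score_pred) ** 2 + log_var
    )

# Toy ground-truth and predicted scores for three skating clips.
scores = np.array([62.3, 71.0, 55.4])
preds = np.array([60.0, 70.5, 58.0])

confident = gaussian_nll(scores, preds, log_var=np.zeros(3))     # claims certainty
hedged = gaussian_nll(scores, preds, log_var=np.full(3, 2.0))    # admits uncertainty
```

With errors this large, admitting uncertainty yields a lower loss than claiming certainty, which is exactly the mechanism that lets such a head separate noisy clips from reliable ones during score regression.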