The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert image formation processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images. In recent years, deep learning has shown promising improvements on various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems. This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held at the CVPR 2024 workshop. The challenge consisted of eight tracks focusing on Low-Light Enhancement and Detection as well as High Dynamic Range (HDR) Imaging. This report details the objectives, methodologies, and results of each track, highlighting the top-performing solutions and their innovative approaches.
Addressing the 'data silo' issue among different elevator operating units and the temporal correlations in elevator vibration signals, a novel small-sample fault diagnosis method for elevator carriages based on temporal generative federated distillation is proposed. This method incorporates a temporal generative adversarial network into Federated Distillation via Generative Learning (FedGen). FedGen combines federated learning, knowledge distillation, and generative models to enhance model aggregation efficiency while mitigating data heterogeneity. However, the original generative model struggles to maintain dynamic correlations between signals when extracting temporal features. Therefore, an improved Time Series Generative Adversarial Networks (TimeGAN) model is introduced, substituting the original logarithmic loss function with a least-squares error function, thereby improving training stability and data quality. This approach eliminates the need for proxy datasets in knowledge distillation, avoiding the loss of temporal information during feature extraction on the central server. Simulation results demonstrate that this method enables data sharing while protecting data privacy, and enhances model generalization capabilities.
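The core modification described above, replacing the logarithmic (cross-entropy) GAN objective with a least-squares error, can be illustrated with a minimal sketch. This is not the authors' implementation; it only shows the least-squares discriminator and generator losses (as in LSGAN) that would substitute for the log loss inside a TimeGAN-style training loop, assuming the discriminator outputs raw scores:

```python
import numpy as np

def lsgan_discriminator_loss(d_real, d_fake):
    # Least-squares loss: push scores on real sequences toward 1
    # and scores on generated sequences toward 0.
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_generator_loss(d_fake):
    # Generator tries to make the discriminator score fakes as 1.
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

# Toy scores for a batch of 4 sequences (hypothetical values).
d_real = np.array([0.9, 0.8, 1.0, 0.7])
d_fake = np.array([0.2, 0.1, 0.0, 0.3])
d_loss = lsgan_discriminator_loss(d_real, d_fake)
g_loss = lsgan_generator_loss(d_fake)
```

Because the quadratic penalty keeps gradients informative even for samples the discriminator classifies confidently, this loss is commonly credited with more stable GAN training, which matches the stability claim in the abstract.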
The prediction of pick-up regions for online ride-hailing can reduce the number of vacant vehicles on the streets, which optimizes the transportation efficiency of cities, reduces energy consumption and carbon emissions, and increases the income of online ride-hailing drivers. However, traditional studies have ignored the temporal and spatial dependencies among pick-up regions and the effects of similarity of POI attributes across regions in their modelling, leaving the model's features incomplete. To address these problems, we propose a new multigraph aggregation spatiotemporal graph convolutional network (MAST-GCN) model to predict pick-up regions for online ride-hailing. In this paper, we propose a graph aggregation method to extract the spatiotemporal and preference features of spatial graphs, order graphs, and POI graphs. A GCN is applied to the aggregated graphs to extract spatial features from the graph-structured data. The historical data are divided by period into slices of fixed temporal granularity, and convolution operations are performed along the time axis to obtain features in the temporal dimension. An attention mechanism assigns higher weights to features with strong periodicity and strong correlation, which effectively addresses the pick-up region prediction problem. We implemented the MAST-GCN model on the PyTorch framework, stacking two spatiotemporal graph convolution modules with a graph convolution dimension of 64. We evaluate the proposed model on two real-world, large-scale ride-hailing datasets. The results show that our method provides significant improvements over state-of-the-art baselines.
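The spatial component described above, a GCN applied to an aggregated region graph, can be sketched in a few lines. This is a generic symmetrically normalized graph convolution (not the authors' MAST-GCN code), using numpy for clarity; the region adjacency matrix `A`, node features `H`, and the output dimension of 64 follow the abstract's description:

```python
import numpy as np

def gcn_layer(A, H, W):
    # One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W),
    # where self-loops (A + I) let each region keep its own features
    # and the symmetric normalization averages over neighbors.
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

# Toy example: 3 pick-up regions, 8 input features per region,
# projected to the 64-dimensional graph convolution used in the paper.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)   # region adjacency (hypothetical)
H = rng.normal(size=(3, 8))              # per-region input features
W = rng.normal(size=(8, 64))             # learnable weights
out = gcn_layer(A, H, W)                 # shape (3, 64)
```

In the full model, two such modules would be stacked and interleaved with temporal convolutions along the time axis, with attention weighting the resulting spatiotemporal features.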
As artificial intelligence technology rapidly advances, its deployment within the medical sector presents substantial ethical challenges. Consequently, it becomes crucial to create a standardized, transparent, and secure framework for processing medical data, including setting the ethical boundaries for medical artificial intelligence and safeguarding both patient rights and data integrity. This consensus governs every facet of medical data handling by artificial intelligence, encompassing data gathering, processing, storage, transmission, utilization, and sharing. Its purpose is to ensure that the management of medical data adheres to ethical standards and legal requirements while safeguarding patient privacy and data security. The principles of legal compliance, respect for patient privacy, protection of patient interests, and safety and reliability are underscored. Key issues such as informed consent, data usage, intellectual property protection, conflict of interest, and benefit sharing are examined in depth. The enactment of this expert consensus is intended to foster the deep integration and sustainable advancement of artificial intelligence within the medical domain, while ensuring that artificial intelligence adheres strictly to the relevant ethical norms and legal frameworks when processing medical data.