Multimodal sensor fusion techniques have promoted the development of autonomous driving, while perception in complex environments remains a challenging problem. This chapter proposes the Open Multimodal Perception Dataset (OpenMPD), a multimodal perception benchmark aimed at difficult examples. Compared with existing datasets, OpenMPD focuses more on complex urban traffic scenes with overexposure or darkness, crowded environments, unstructured roads, and intersections. The data were acquired with a vehicle equipped with 6 cameras and 4 LiDARs for a 360-degree field of view, yielding 180 clips of 20-second synchronized images at 20 Hz and point clouds at 10 Hz. In particular, we applied a 128-beam LiDAR to provide high-resolution point clouds for a better understanding of the 3D environment and sensor fusion. We sampled 15K keyframes at equal intervals from the clips for annotation, including 2D/3D object detection, 3D object tracking, and 2D semantic segmentation. Moreover, we provide four benchmarks, one per task, to evaluate algorithms, and conduct extensive 2D/3D detection and segmentation experiments on OpenMPD. Data and further information are available at http://www.openmpd.com/ .
Recently, researchers observed that gradient descent for deep neural networks operates in an ``edge-of-stability'' (EoS) regime: the sharpness (the maximum eigenvalue of the Hessian) is often larger than the stability threshold $2/\eta$ (where $\eta$ is the step size). Despite this, the loss oscillates yet converges in the long run, and the sharpness at the end is just slightly below $2/\eta$. While many other well-understood nonconvex objectives, such as matrix factorization or two-layer networks, can also converge despite large sharpness, there is often a larger gap between the sharpness of the endpoint and $2/\eta$. In this paper, we study the EoS phenomenon by constructing a simple function that exhibits the same behavior. We give a rigorous analysis of its training dynamics in a large local region and explain why the final converging point has sharpness close to $2/\eta$. Globally, we observe that the training dynamics of our example exhibit an interesting bifurcating behavior, which has also been observed in the training of neural nets.
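The origin of the $2/\eta$ threshold can already be seen on a quadratic, where gradient descent contracts if and only if the sharpness is below $2/\eta$. The following sketch illustrates only this classical threshold, not the paper's nonconvex construction; the function name and parameter choices are illustrative.

```python
def gd_quadratic(sharpness, eta, x0=1.0, steps=100):
    """Run gradient descent on f(x) = (sharpness/2) * x**2.
    The update is x <- (1 - eta*sharpness) * x, so |x| shrinks
    iff sharpness < 2/eta -- the classical stability threshold.
    Note the iterates oscillate in sign whenever sharpness > 1/eta."""
    x = x0
    for _ in range(steps):
        x -= eta * sharpness * x
    return x

eta = 0.1                              # step size; threshold 2/eta = 20
x_stable = gd_quadratic(19.0, eta)     # sharpness just below 2/eta: converges
x_unstable = gd_quadratic(21.0, eta)   # sharpness just above 2/eta: diverges
```

On a quadratic the sharpness is fixed, so crossing $2/\eta$ simply flips convergence to divergence; the interesting EoS behavior studied in the paper arises because, on nonconvex objectives, the sharpness itself changes along the trajectory.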
High-quality quasi-parallel x-ray microbeams have appreciable application value in x-ray diffraction analysis, currently one of the most significant non-destructive analysis techniques. A simulation of a parabolic single capillary is carried out based on the Monte Carlo simulation toolkit Geant4. The simulation results show that it is feasible to obtain high-quality quasi-parallel x-ray microbeams with a parabolic capillary and a conventional laboratory x-ray source. We manufactured a parabolic capillary based on the simulation results, and characterized the physical parameters of the resulting x-ray beams by building an x-ray imaging system. The experimental results show that an x-ray beam with submicrometer size and almost zero divergence can be obtained from a conventional laboratory x-ray source by utilizing a parabolic single capillary as a collimator.
As an essential element of multi-sensor fusion, uncertainty provides theoretical support for making better decisions from multi-source data. This chapter begins by introducing the prerequisite knowledge. Following that, two types of uncertainty are introduced from the perspectives of models and data, known as epistemic uncertainty and theoretical uncertainty, respectively. By delving into these two types of uncertainty, it is possible to deal with contingencies for a given model or set of sensors on a vehicle. This chapter then proposes a novel multimodal fusion architecture from an information-theoretic perspective. The proposed model is presented from four aspects: baseline, uncertainty modeling, fusion step, and implementation. Furthermore, the experimental procedure, including data preprocessing, noise simulation, experimental results, and analysis, is described in detail.
Deep neural network classifiers are vulnerable to adversarial attacks, where an imperceptible perturbation can result in misclassification. However, the vulnerability of DNN-based image ranking systems remains under-explored. In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations. Specifically, the expected ranking order is first represented as a set of inequalities, and then a triplet-like objective function is designed to obtain the optimal perturbation. Conversely, an anti-collapse triplet defense is proposed to improve the robustness of ranking models against all the proposed attacks, where the model learns to prevent positive and negative samples from being pulled close to each other by an adversarial attack. To comprehensively measure the empirical adversarial robustness of a ranking model with our defense, we propose an empirical robustness score, which involves a set of representative attacks against ranking models. Our adversarial ranking attacks and defenses are evaluated on the MNIST, Fashion-MNIST, CUB200-2011, CARS196, and Stanford Online Products datasets. Experimental results demonstrate that a typical deep ranking system can be effectively compromised by our attacks. Nevertheless, our defense significantly improves ranking system robustness and simultaneously mitigates a wide range of attacks.
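The idea of encoding a desired ranking order as inequalities and descending a triplet-style hinge loss can be sketched in a few lines. The following is a minimal illustrative sketch, not the paper's actual attack: it uses raw Euclidean distances instead of a learned embedding model, an analytic gradient instead of backpropagation, and all names (`candidate_attack`, `eps`, `margin`) are hypothetical.

```python
import numpy as np

def candidate_attack(q, c, others, eps=0.1, lr=0.05, steps=50, margin=0.0):
    """Sketch of a Candidate-Attack-style perturbation: nudge candidate c
    within an L-inf ball of radius eps so that it ranks above 'others'
    for query q, i.e. enforce d(q, c+p) < d(q, o) for each o."""
    p = np.zeros_like(c)
    for _ in range(steps):
        grad = np.zeros_like(p)
        d_pos = np.sum((q - (c + p)) ** 2)          # distance to be reduced
        for o in others:
            d_neg = np.sum((q - o) ** 2)
            if d_pos - d_neg + margin > 0:           # inequality still violated
                grad += -2.0 * (q - (c + p))         # d(d_pos)/dp
        p -= lr * grad                               # descend the hinge loss
        p = np.clip(p, -eps, eps)                    # keep it imperceptible
    return p
```

In a real deep ranking attack the distance would be computed in the embedding space of the model and the gradient obtained by backpropagation, but the inequality-to-hinge-loss structure is the same.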
Recent studies unveil the vulnerabilities of deep ranking models, where an imperceptible perturbation can trigger dramatic changes in the ranking result. While previous attempts focus on manipulating the absolute ranks of certain candidates, the possibility of adjusting their relative order remains under-explored. In this paper, we formulate a new adversarial attack against deep ranking systems, i.e., the Order Attack, which covertly alters the relative order among a selected set of candidates according to an attacker-specified permutation, with limited interference to other unrelated candidates. Specifically, it is formulated as a triplet-style loss imposing an inequality chain that reflects the specified permutation. However, direct optimization of such a white-box objective is infeasible in a real-world attack scenario due to various black-box limitations. To cope with them, we propose a Short-range Ranking Correlation metric as a surrogate objective for the black-box Order Attack, approximating the white-box method. The Order Attack is evaluated on the Fashion-MNIST and Stanford-Online-Products datasets under both white-box and black-box threat models. The black-box attack is also successfully implemented on a major e-commerce platform. Comprehensive experimental evaluations demonstrate the effectiveness of the proposed methods, revealing a new type of ranking model vulnerability.
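A black-box surrogate of this kind can be sketched as a Kendall-style concordance computed only over the selected candidates that are actually visible in the returned (top-k) list. This is an illustrative sketch of the idea, not necessarily the paper's exact Short-range Ranking Correlation definition; the function name and conventions are assumptions.

```python
import itertools

def short_range_correlation(visible_ranking, desired_order):
    """Kendall-style concordance over pairs of the attacker-selected
    candidates that both appear in the visible (top-k) ranking.
    desired_order lists candidates in the attacker-specified permutation
    (earlier = should rank higher). Returns a value in [-1, 1];
    +1 means the visible relative order matches the permutation exactly."""
    pos = {cand: i for i, cand in enumerate(visible_ranking)}
    pairs = [(a, b) for a, b in itertools.combinations(desired_order, 2)
             if a in pos and b in pos]       # ignore candidates cut off at k
    if not pairs:
        return 0.0
    score = sum(1 if pos[a] < pos[b] else -1 for a, b in pairs)
    return score / len(pairs)
```

Because such a score needs only the returned ordering and no gradients, it can serve as a zeroth-order objective for a black-box optimizer that searches over query perturbations.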
Intangible welfare refers to welfare that cannot be reflected in traditional auditing reports or detected by other means, which distinguishes it from the "tangible welfare" stipulated by state laws and regulations. Intangible welfare is invisible, diversified, and regulation-violating. It is embodied in intangible housing welfare, daily consumption welfare, financial speculation welfare, knowledge product welfare, etc. To prevent the occurrence of intangible welfare in state-owned enterprises, it is necessary to establish a risk-oriented auditing warning model, reinforce the government's macro guidance to enterprises, and set up effective self-regulation schemes.
To achieve better object detection results for autonomous vehicles under complex outdoor conditions, we integrate sensor fusion, hierarchical multi-view networks, and traditional heuristic methods. The most important environmental perception sensors for autonomous vehicles are the camera and LiDAR, and we utilize the 2D RGB images and 3D point clouds they provide. This paper proposes the hierarchical multi-view proposal network (HMVPN), which can effectively fuse the multimodal information of the camera and LiDAR. Among the hierarchical network layers in HMVPN, the image serves as the input of the primary object-detection network. The LiDAR data are additionally projected into four images (HBV, IBV, HCV, DCV), which are then combined with the original 3D point cloud in a hierarchical second network to generate candidate proposals using machine learning and heuristic methods. Experiments on the well-known KITTI autonomous driving benchmark show that our approach obtains about 20% higher AP than state-of-the-art methods.
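Projecting a point cloud into a 2D view image, as the multi-view inputs above require, can be sketched as a simple bird's-eye-view height map. This is a generic illustration of the projection step, not HMVPN's exact HBV definition; the ranges, resolution, and function name are assumptions.

```python
import numpy as np

def bev_height_map(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), res=0.5):
    """Project an (N, 3) point cloud [x, y, z] to a bird's-eye-view
    height map: discretize the ground plane into cells of size res
    and keep the maximum z per cell (cells default to 0)."""
    H = int((x_range[1] - x_range[0]) / res)
    W = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((H, W))
    xs = ((points[:, 0] - x_range[0]) / res).astype(int)
    ys = ((points[:, 1] - y_range[0]) / res).astype(int)
    mask = (xs >= 0) & (xs < H) & (ys >= 0) & (ys < W)  # drop out-of-range points
    for x, y, z in zip(xs[mask], ys[mask], points[mask, 2]):
        bev[x, y] = max(bev[x, y], z)
    return bev
```

A multi-view network would stack several such projections (e.g. height, intensity, density channels) and feed them to a 2D convolutional backbone alongside the RGB image.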
3D object detection is becoming an indispensable functional module for environmental perception in autonomous driving, and LiDAR-based detection methods have made remarkable progress in terms of accuracy. However, point clouds often fail to distinguish objects with similar structures, leading to false detections. Fusing LiDAR with other sensors is therefore a natural solution. Nevertheless, current fusion methods suffer from either poor precision or poor efficiency. To this end, this chapter proposes a plug-and-play module named RI-Fusion to achieve effective fusion of LiDAR and camera data; the module can easily be integrated into existing LiDAR-based algorithms. Furthermore, a fusion method for radar and 16-line LiDAR based on multimodal and multi-scale fusion is proposed, called $$M^{2}$$ -Fusion. Interaction is achieved by learning the features of each modality and exchanging information between the intermediate feature layers with a self-attention mechanism. Experiments show that the method has better environmental adaptability and low cost.
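The attention-based exchange of intermediate features between modalities builds on a standard attention primitive, sketched below with NumPy. This is only the core operation, not the $$M^{2}$$ -Fusion architecture itself (which learns projections and exchanges features in both directions); the function name and unprojected Q/K/V are simplifying assumptions.

```python
import numpy as np

def cross_modal_attention(lidar_feat, img_feat):
    """Single-head attention sketch: each LiDAR token (row) attends over
    all image tokens and aggregates their features.
    lidar_feat: (N_l, d) queries; img_feat: (N_i, d) keys and values."""
    d = lidar_feat.shape[-1]
    scores = lidar_feat @ img_feat.T / np.sqrt(d)          # (N_l, N_i)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                     # row-wise softmax
    return w @ img_feat                                    # (N_l, d)
```

A full fusion block would apply learned query/key/value projections, run the symmetric direction (image tokens attending over LiDAR tokens), and merge the results back into each backbone's intermediate feature maps.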