A One-Dimensional Depthwise Separable Convolutional Neural Network for Bearing Fault Diagnosis Implemented on FPGA
Citations: 0 | References: 28 | Related Papers: 10
Abstract:
This paper presents a hardware implementation of a one-dimensional convolutional neural network using depthwise separable convolution (DSC) on the VC707 FPGA development board. The design processes the one-dimensional rolling bearing current signal dataset provided by Paderborn University (PU), employing minimal preprocessing to maximize the comprehensiveness of feature extraction. To address the high parameter demands commonly associated with convolutional neural networks (CNNs), the model incorporates DSC, significantly reducing computational complexity and parameter load. Additionally, the DoReFa-Net quantization method is applied to compress network parameters and activation function outputs, thereby minimizing memory usage. The quantized DSC model requires approximately 22 KB of storage and performs 1,203,128 floating-point operations in total. The implementation achieves a power consumption of 527 mW at a clock frequency of 50 MHz, while delivering a fault diagnosis accuracy of 96.12%.
Keywords:
Activation function
Convolution (computer science)
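As an illustration of the two ideas combined in the abstract above, the following sketch shows a one-dimensional depthwise separable convolution block and a DoReFa-style k-bit weight quantizer in PyTorch. All layer sizes, names, and the example signal shape are hypothetical, chosen for illustration rather than taken from the paper.

```python
import torch
import torch.nn as nn

class DSConv1d(nn.Module):
    """1-D depthwise separable convolution: a per-channel (depthwise)
    convolution followed by a 1x1 (pointwise) channel-mixing convolution."""
    def __init__(self, in_ch, out_ch, kernel_size):
        super().__init__()
        # groups=in_ch makes each depthwise filter see exactly one input channel
        self.depthwise = nn.Conv1d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv1d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

def dorefa_quantize_weights(w, k=8):
    """DoReFa-Net k-bit weight quantization (forward pass only): squash with
    tanh, map to [0, 1], round to 2^k - 1 levels, map back to [-1, 1]."""
    levels = 2 ** k - 1
    t = torch.tanh(w)
    w01 = t / (2 * t.abs().max()) + 0.5          # map to [0, 1]
    w01_q = torch.round(w01 * levels) / levels   # k-bit uniform quantization
    return 2 * w01_q - 1                         # map back to [-1, 1]

# Example: one block over a window of a 1-D current signal (sizes hypothetical)
x = torch.randn(1, 4, 1024)                      # (batch, channels, samples)
block = DSConv1d(in_ch=4, out_ch=8, kernel_size=9)
print(block(x).shape)                            # torch.Size([1, 8, 1024])
print(dorefa_quantize_weights(block.pointwise.weight, k=8).unique().numel())
```

In the actual FPGA design the quantized weights would be stored as fixed-point integers; the straight-through estimator that DoReFa-Net uses for backpropagation during training is omitted here.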
The convolution operation is the most critical component in the recent surge of deep learning research. A conventional 2D convolution needs O(C^2K^2) parameters, where C is the channel count and K is the kernel size. This parameter cost has become substantial as models have grown to meet the needs of demanding applications. Among the various implementations of convolution, separable convolution has proven more efficient at reducing model size. For example, depthwise separable convolution reduces the complexity to O(C⋅(C+K^2)), while spatially separable convolution reduces it to O(C^2K). However, these are ad hoc designs that cannot guarantee optimal separation in general. In this research, we propose a novel and principled operator, called optimized separable convolution, which optimally chooses the internal number of groups and kernel sizes of a general separable convolution to achieve a complexity of O(C^(3/2)K). When the restriction on the number of separated convolutions is lifted, an even lower complexity of O(C⋅log(CK^2)) can be achieved. Experimental results demonstrate that the proposed optimized separable convolution achieves improved accuracy-vs-parameter trade-offs over conventional, depthwise separable, and spatially separable convolutions.
Citations (7)
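To make the complexity figures in the abstract above concrete, here is a small, self-contained sketch counting parameters of a 2-D convolution with C input and C output channels and a K×K kernel under each scheme. The first three formulas follow directly from the layer shapes; the O(C^(3/2)K) count of the optimized variant is quoted from the abstract, not derived here.

```python
def conventional(C, K):
    # C output filters, each spanning all C input channels and a KxK window
    return C * C * K * K                     # O(C^2 K^2)

def depthwise_separable(C, K):
    # depthwise: one KxK filter per channel; pointwise: CxC 1x1 mixing
    return C * K * K + C * C                 # O(C(C + K^2))

def spatially_separable(C, K):
    # a Kx1 convolution followed by a 1xK convolution, both full-channel
    return 2 * C * C * K                     # O(C^2 K)

for C, K in [(64, 3), (256, 3), (256, 5)]:
    print(f"C={C:4d} K={K}: conv={conventional(C, K):9,d}  "
          f"depthwise-sep={depthwise_separable(C, K):9,d}  "
          f"spatial-sep={spatially_separable(C, K):9,d}")
```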
We present an interpretation of Inception modules in convolutional neural networks as being an intermediate step in-between regular convolution and the depthwise separable convolution operation (a depthwise convolution followed by a pointwise convolution). In this light, a depthwise separable convolution can be understood as an Inception module with a maximally large number of towers. This observation leads us to propose a novel deep convolutional neural network architecture inspired by Inception, where Inception modules have been replaced with depthwise separable convolutions. We show that this architecture, dubbed Xception, slightly outperforms Inception V3 on the ImageNet dataset (which Inception V3 was designed for), and significantly outperforms Inception V3 on a larger image classification dataset comprising 350 million images and 17,000 classes. Since the Xception architecture has the same number of parameters as Inception V3, the performance gains are not due to increased capacity but rather to a more efficient use of model parameters.
Citations (160)
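The "maximally large number of towers" view in the Xception abstract can be stated in code: a grouped convolution interpolates between a regular convolution (groups=1) and a depthwise convolution (groups = number of channels), and the separable operation is that extreme point followed by a 1×1 channel-mixing convolution. A minimal sketch (the channel count and the groups=4 middle point are arbitrary illustrations):

```python
import torch.nn as nn

C, K = 32, 3

regular   = nn.Conv2d(C, C, K, padding=1, groups=1)  # one "tower" over all channels
inception = nn.Conv2d(C, C, K, padding=1, groups=4)  # 4 towers, Inception-like
depthwise = nn.Conv2d(C, C, K, padding=1, groups=C)  # one tower per channel
pointwise = nn.Conv2d(C, C, kernel_size=1)           # 1x1 cross-channel mixing

# Xception-style separable convolution = extreme grouping + pointwise mixing
separable = nn.Sequential(depthwise, pointwise)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(regular), count(inception), count(separable))
```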
In recent times, the use of separable convolutions in deep convolutional neural network architectures has been explored. Several researchers, most notably (Chollet, 2016) and (Ghosh, 2017) have used separable convolutions in their deep architectures and have demonstrated state of the art or close to state of the art performance. However, the underlying mechanism of action of separable convolutions are still not fully understood. Although their mathematical definition is well understood as a depthwise convolution followed by a pointwise convolution, deeper interpretations such as the extreme Inception hypothesis (Chollet, 2016) have failed to provide a thorough explanation of their efficacy. In this paper, we propose a hybrid interpretation that we believe is a better model for explaining the efficacy of separable convolutions.
Citations (0)
Super-resolution is the task of converting a low-resolution image into a high-resolution one, and research has largely concentrated on improving performance through heavy computation. We observed that when a deep convolutional neural network based super-resolution model is used alongside other techniques or in a mobile environment, a lightweight model is needed. Accordingly, this paper proposes SMSRN (Separable Convolution Based Multi-Scale Residual Network), a lightweight variant of the state-of-the-art super-resolution model MSRN (Multi-Scale Residual Network) built with depthwise separable convolution. SMSRN's parameter count is reduced to 14.64% of MSRN's. Meanwhile, quantitative experiments on various benchmark datasets show that 98.53% of the performance is retained, and in qualitative experiments the degradation is hard to perceive. Since SMSRN uses convolution filters of various sizes, we expect it can be applied to super-resolution models of diverse structures to make them lightweight as well.
Citations (0)
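A hedged sketch of the lightweighting step described above: replace the standard convolutions inside a residual block with depthwise separable ones and measure the parameter ratio. The block below is a generic residual body, not the actual MSRN/SMSRN architecture, so the printed ratio will not match the paper's 14.64% figure, which depends on MSRN details not reproduced here.

```python
import torch.nn as nn

def conv(in_ch, out_ch, k, separable):
    if not separable:
        return nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
    return nn.Sequential(  # depthwise + pointwise replacement
        nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch),
        nn.Conv2d(in_ch, out_ch, kernel_size=1))

def residual_body(ch, k, separable):
    return nn.Sequential(conv(ch, ch, k, separable), nn.ReLU(),
                         conv(ch, ch, k, separable))

count = lambda m: sum(p.numel() for p in m.parameters())
heavy = residual_body(ch=64, k=5, separable=False)
light = residual_body(ch=64, k=5, separable=True)
print(f"{count(light) / count(heavy):.2%} of the original parameters")
```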
Without activation functions, a neural network could learn only very basic tasks, so the activation function is a key element of a neural network's architecture. It allows the network to learn more complicated tasks and also affects the performance of the resulting model. Consequently, activation functions are the subject of continuous and widespread research aimed at identifying the most suitable function for a specific task. In this paper, we propose four activation functions that bring improvements on different datasets in computer vision tasks. These functions are combinations of popular activation functions such as the sigmoid, bipolar sigmoid, Rectified Linear Unit (ReLU), and hyperbolic tangent (tanh). By making the activation functions learnable, we obtain more robust models. To validate these functions, we tested them on several datasets and on architectures of different depths, showing that their properties are significant and useful. We also compared them with other powerful activation functions to see how our proposed functions affect accuracy.
Citations (8)
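One way to realize the "learnable combination" idea from the abstract above is to blend standard activations with trainable coefficients. This is a generic sketch of the approach, not the paper's four specific functions; the blend f(x) = a·ReLU(x) + b·tanh(x) + c·sigmoid(x) and its initialization are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LearnableBlend(nn.Module):
    """Activation f(x) = a*ReLU(x) + b*tanh(x) + c*sigmoid(x),
    where a, b, c are learned jointly with the network weights."""
    def __init__(self):
        super().__init__()
        # start as plain ReLU; training can shift mass to tanh/sigmoid
        self.coef = nn.Parameter(torch.tensor([1.0, 0.0, 0.0]))

    def forward(self, x):
        a, b, c = self.coef
        return a * torch.relu(x) + b * torch.tanh(x) + c * torch.sigmoid(x)

act = LearnableBlend()
y = act(torch.randn(4))   # the coefficients receive gradients during training
```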