To investigate whether the Graphics Processing Unit (GPU), a stream processor with high floating-point performance, is suitable for neural networks, this paper proposes a parallel recognition algorithm for Convolutional Neural Networks (CNNs). The algorithm adopts Compute Unified Device Architecture (CUDA) technology, defines parallel data structures, and describes the mapping mechanism for computing tasks on CUDA. The parallel recognition algorithm, implemented on a GPU with the GTX 200 hardware architecture, is compared with the serial algorithm on a CPU and improves speed by nearly 60 times. The results show that the GPU's stream processor architecture is more suitable than the CPU for neural network applications.
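The core of the CUDA mapping described above is that the output of a convolution layer is tiled by a grid of thread blocks, with each thread computing one output element independently. The sketch below simulates that mapping in pure Python for clarity; it is an illustration of the general technique, not the paper's actual kernel, and the block size is an arbitrary placeholder.

```python
# Illustrative sketch: mapping a 2D convolution onto a CUDA-style grid of
# thread blocks, where each (block, thread) pair computes one output pixel.

def conv2d_serial(image, kernel):
    """Reference serial 2D convolution (valid padding), as run on the CPU."""
    H, W, k = len(image), len(image[0]), len(kernel)
    out = [[0] * (W - k + 1) for _ in range(H - k + 1)]
    for y in range(H - k + 1):
        for x in range(W - k + 1):
            out[y][x] = sum(image[y + i][x + j] * kernel[i][j]
                            for i in range(k) for j in range(k))
    return out

def conv2d_cuda_style(image, kernel, block=(2, 2)):
    """Same computation, organised the way a CUDA kernel would be launched:
    a grid of blocks tiles the output; every thread works independently."""
    H, W, k = len(image), len(image[0]), len(kernel)
    oh, ow = H - k + 1, W - k + 1
    by, bx = block
    grid = ((oh + by - 1) // by, (ow + bx - 1) // bx)   # like gridDim
    out = [[0] * ow for _ in range(oh)]
    for gy in range(grid[0]):              # blockIdx.y
        for gx in range(grid[1]):          # blockIdx.x
            for ty in range(by):           # threadIdx.y
                for tx in range(bx):       # threadIdx.x
                    y, x = gy * by + ty, gx * bx + tx
                    if y < oh and x < ow:  # boundary guard, as in CUDA
                        out[y][x] = sum(image[y + i][x + j] * kernel[i][j]
                                        for i in range(k) for j in range(k))
    return out
```

On a real GPU the four inner loops disappear: every iteration runs as its own hardware thread, which is where the reported speedup comes from.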
Recent years have witnessed the dramatic popularity of online music streaming and of headphones like AirPods, which millions of people use daily [1]. Melodic EQ was inspired by these users and aims to create the best audio listening experience for listeners with various preferences [2]. Melodic EQ is a project that creates custom EQs tailored to the user's music tastes and filters the audio to match their favorite settings. To achieve this goal, the process starts with a song file taken from an existing source, for example, Spotify downloads or MP3s. This file is then uploaded to the app. The software runs the song through a genre-detection algorithm and assigns a genre label to it. Inside the app, the user creates or selects EQs for that genre and applies them to their music. The interface is easy to use, and the app aims to make everyone's preferences achievable on the fly. That is why there are presets for each category for users who are unfamiliar with equalizers, and custom settings for advanced users to create their perfect sound for each genre.
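One band of such an equalizer is conventionally a peaking biquad filter. The sketch below uses the standard RBJ Audio EQ Cookbook formulas; the function names and the direct-form-I implementation are illustrative placeholders, not the app's actual code.

```python
# Hypothetical sketch of a single EQ band: a peaking biquad filter
# (RBJ Audio EQ Cookbook). Not the app's actual implementation.
import math

def peaking_biquad(fs, f0, gain_db, q):
    """Normalized biquad coefficients (b0, b1, b2, a1, a2) for a peaking
    band: a boost/cut of `gain_db` dB centred at f0 Hz, sample rate fs."""
    a = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b1, b2 = 1 + alpha * a, -2 * math.cos(w0), 1 - alpha * a
    a0, a1, a2 = 1 + alpha / a, -2 * math.cos(w0), 1 - alpha / a
    return b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0

def apply_band(samples, coeffs):
    """Run one biquad band (direct form I) over a mono sample sequence."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x1, x2, y1, y2 = x, x1, y, y1
        out.append(y)
    return out
```

A genre preset would then just be a list of `(f0, gain_db, q)` triples applied band by band, e.g. a hypothetical "rock" preset boosting 80 Hz by a few dB.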
In this paper, we propose a scribble-based video colorization network with temporal aggregation called SVCNet. It can colorize monochrome videos based on different user-given color scribbles. It addresses three common issues in the scribble-based video colorization area: colorization vividness, temporal consistency, and color bleeding. To improve the colorization quality and strengthen the temporal consistency, we adopt two sequential sub-networks in SVCNet for precise colorization and temporal smoothing, respectively. The first stage includes a pyramid feature encoder to incorporate color scribbles with a grayscale frame, and a semantic feature encoder to extract semantics. The second stage finetunes the output from the first stage by aggregating the information of neighboring colorized frames (as short-range connections) and the first colorized frame (as a long-range connection). To alleviate the color bleeding artifacts, we learn video colorization and segmentation simultaneously. Furthermore, we perform the majority of operations at a fixed small image resolution and use a Super-resolution Module at the tail of SVCNet to recover the original resolution. This allows SVCNet to handle different image resolutions at inference time. Finally, we evaluate the proposed SVCNet on the DAVIS and Videvo benchmarks. The experimental results demonstrate that SVCNet produces both higher-quality and more temporally consistent videos than other well-known video colorization approaches. The code and models can be found at https://github.com/zhaoyuzhi/SVCNet.
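The second-stage aggregation idea can be reduced to a toy form: refine the current colorized frame by blending it with its neighbours (short-range) and the first colorized frame (long-range anchor). The fixed weights below are illustrative placeholders; the actual SVCNet learns this aggregation with convolutional layers rather than a fixed blend.

```python
# Toy illustration (not SVCNet's actual network) of temporal aggregation:
# fuse the current frame with short-range neighbours and the long-range
# first frame, per pixel.

def aggregate_frame(current, neighbors, first, w_cur=0.6, w_nbr=0.3, w_first=0.1):
    """Blend per-pixel colour values; the weights are placeholders standing
    in for what the real model learns."""
    n = len(neighbors)
    out = []
    for i, c in enumerate(current):
        nbr = sum(f[i] for f in neighbors) / n if n else c
        out.append(w_cur * c + w_nbr * nbr + w_first * first[i])
    return out
```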
Methods for interpreting machine learning black-box models increase the transparency of their outcomes and, in turn, generate insight into the reliability and fairness of the algorithms. However, the interpretations themselves can contain significant uncertainty that undermines trust in the outcomes and raises concerns about the model's reliability. Focusing on the method Local Interpretable Model-agnostic Explanations (LIME), we demonstrate the presence of two sources of uncertainty, namely the randomness in its sampling procedure and the variation of interpretation quality across different input data points. Such uncertainty is present even in models with high training and test accuracy. We apply LIME to synthetic data and two public data sets, text classification in 20 Newsgroups and recidivism risk-scoring in COMPAS, to support our argument.
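The first uncertainty source can be seen with a minimal LIME-style procedure written from scratch (pure Python, not the `lime` library): perturb the input, query the black box, weight samples by proximity, and fit a weighted linear surrogate. Because the perturbations are random, different seeds yield different local coefficients for the same model and input. All parameter values below are illustrative.

```python
# Minimal LIME-style sketch: the explanation (local slope) depends on the
# random perturbation sample, so it varies with the seed.
import math, random

def lime_1d(black_box, x0, seed, n=200, width=1.0, kernel=0.75):
    """Fit a proximity-weighted linear surrogate around x0; return its slope."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0.0, width) for _ in range(n)]
    ys = [black_box(x) for x in xs]
    ws = [math.exp(-((x - x0) ** 2) / kernel ** 2) for x in xs]
    # Weighted least squares for y ≈ a + b·x (closed form for one feature).
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    cov = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    return cov / var

# A nonlinear "black box": its local explanation at x0 = 1 varies by seed.
f = lambda x: math.sin(3 * x)
slopes = [lime_1d(f, 1.0, seed) for seed in range(5)]
```

The spread of `slopes` across seeds is exactly the sampling-procedure uncertainty the abstract describes, here for a perfectly deterministic model.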
Brain tumors rank among the most lethal types of tumors. Magnetic Resonance Imaging (MRI) technology can clearly display the position, size, and borders of tumors. Hence, MRI is frequently used in clinical diagnostics to detect brain tumors. Deep learning and related methods have recently been widely used in computer vision research for the diagnosis and classification of MRI images of brain tumors. One trend is to achieve better performance by increasing model complexity; however, the trainable parameters of the model increase accordingly. Too many parameters make model training and optimization more difficult and leave the model prone to overfitting. To address this problem, this study constructs a model that incorporates a depthwise separable convolution technique and an attention mechanism to balance model performance and complexity. The model achieves 97.41% accuracy in the brain tumor classification task, which exceeds the 96.72% accuracy of the pre-trained model MobileNetV2, and shows good image classification ability. Future work could test the model's classification results on noisy images and investigate how to optimize the model's generalization ability.
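The parameter savings from depthwise separable convolution follow from a simple count: a standard k×k convolution couples spatial filtering and channel mixing (k·k·C_in·C_out parameters), while the separable version splits them into a per-channel k×k depthwise step plus a 1×1 pointwise step (k·k·C_in + C_in·C_out). A back-of-envelope sketch, with an example layer size chosen for illustration:

```python
# Parameter counts for a standard vs. depthwise separable convolution layer
# (ignoring biases), illustrating why the separable form stays lightweight.

def standard_conv_params(k, c_in, c_out):
    """A k×k kernel per (input channel, output channel) pair."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    depthwise = k * k * c_in    # one k×k spatial filter per input channel
    pointwise = c_in * c_out    # 1×1 convolution mixes the channels
    return depthwise + pointwise
```

For example, a 3×3 layer mapping 64 to 128 channels needs 73,728 parameters in standard form but only 8,768 in separable form, roughly an 8.4× reduction.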
RAW files are widely used for storage in cameras and scanners because they contain the original optical data. Different cameras usually process RAW files using diverse, mutually incompatible algorithms. To address this issue, we propose a general transformation method for cross-camera RAW-to-RGB mapping based on a Generative Adversarial Network (GAN). Moreover, we propose a saliency map-aided data augmentation technique, where the saliency maps are produced by Saliency GAN (SalGAN). Given a RAW file as input, the network jointly predicts the RGB image and the corresponding saliency map to enhance the perceptual quality of the generated image. The proposed architecture is trained on the Zurich RAW2RGB (ZRR) dataset. Experimental results show that our method can generate clearer and more visually plausible images than state-of-the-art networks.
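One way to read "saliency map-aided data augmentation" (our illustrative interpretation, not necessarily the paper's exact procedure) is to bias random crop centres toward pixels the saliency map marks as important, so augmented training patches keep salient content:

```python
# Illustrative sketch: pick a crop centre with probability proportional
# to the saliency value at each pixel.
import random

def saliency_weighted_crop_center(saliency, seed=None):
    """Return a (row, col) crop centre sampled in proportion to saliency."""
    rng = random.Random(seed)
    coords = [(r, c) for r in range(len(saliency))
                     for c in range(len(saliency[0]))]
    weights = [saliency[r][c] for r, c in coords]
    return rng.choices(coords, weights=weights, k=1)[0]
```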
In the evolving landscape of healthcare, timely and accurate medical predictions are paramount, especially in managing chronic conditions like kidney disease. This paper introduces an innovative AI-driven application designed to enhance renal health management by predicting anemia and the need for dialysis, two critical aspects of kidney care. Utilizing advanced algorithms such as Support Vector Machine (SVM) and XGBoost, coupled with cross-validation techniques, the application aims to provide reliable health predictions based on patient data. Challenges including model accuracy and processing speed were meticulously addressed through algorithm optimization and efficient data handling, ensuring the system's responsiveness to varying data complexities. Experimentation with mock patient scenarios revealed the system's capability to deliver precise anemia predictions and identify dialysis needs promptly, highlighting its potential in clinical settings. The application's blend of accuracy, speed, and user-centric design positions it as a valuable tool for patients and healthcare providers, promising to improve outcomes and decision-making in kidney health management.
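The cross-validation step mentioned above follows the standard k-fold pattern. A minimal sketch in pure Python, where `model_score` is a hypothetical stand-in for training and evaluating one SVM or XGBoost model on a train/test split:

```python
# k-fold cross-validation sketch: split the data into k folds, hold each
# fold out once for testing, and average the resulting scores.

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, k, model_score):
    """Average model_score(train, test) over the k held-out folds."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for test_idx in folds:
        test = [data[j] for j in test_idx]
        train = [data[j] for f in folds if f is not test_idx for j in f]
        scores.append(model_score(train, test))
    return sum(scores) / k
```

Averaging over folds is what makes the reported predictions more reliable than a single train/test split, since every patient record is used for evaluation exactly once.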