OpenCL-Darknet: implementation and optimization of OpenCL-based deep learning object detection framework

World Wide Web (2020)

Yongbon Koo Sung‐Hoon Kim Young-Guk Ha

Keywords:

Graphics processing unit

Topics:

Advanced Neural Network Applications

Advanced Image and Video Retrieval Techniques

Video Surveillance and Tracking Methods

10.1007/s11280-020-00778-y

Implementation and optimization of high speed AES algorithm based on GPGPU and CUDA

Journal of the Graduate School of the Chinese Academy of Sciences (2011)

Zhen Bao

Compared with the CPU, which is good at handling logically complex tasks, the GPGPU (general-purpose graphics processing unit) is suited to large-scale parallel computing. The emergence of CUDA (Compute Unified Device Architecture) has accelerated the spread of GPGPU applications. We accelerate an implementation of the AES algorithm using GPGPU and CUDA and achieve a total throughput of 6 to 7 Gbit/s. Excluding the time spent loading and storing data, a throughput of 20 Gbit/s can be achieved for input sizes over 1 MB.

Speedup

Graphics processing unit

Citations (2)

GPGPU Based Simulations for One and Two Dimensional Quantum Walks

Communications in computer and information science (2010)

Marek Sawerwain Roman Gielerak

Quantum walk

10.1007/978-3-642-13861-4_3

Citations (5)

Performance Analysis of GPGPU Systems Using CUDA

Conference of the Institute of Electronics Engineers of Korea (2009)

조명진 이홍석 김선욱

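In this paper, we implement matrix multiplication, an operation with very high data parallelism, using OpenMP, MPI, and CUDA. By comparing the performance of a conventional supercomputer with that of a CUDA-based GPGPU system, we confirm the performance scalability of CUDA systems and the potential for further development of the technology.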

Citations (0)

Characterizing the challenges and evaluating the efficacy of a CUDA-to-OpenCL translator

Parallel Computing (2013)

Mark K. Gardner Paul Sathre Wu-chun Feng Gabriel Bravo Martínez

Graphics processing unit

Code (set theory)

10.1016/j.parco.2013.09.003

Citations (10)

Parallel Programming For High-Performance Computing on CUDA

Microcomputer applications (2009)

Zhang Min-fang

Targeting GPU processing, this paper presents an approach to high-performance computing on the GPU, including a detailed description of the CUDA programming model and its optimization principles. Comparative experiments show that CUDA has strong parallel processing capability and offers new methods and ideas for GPGPU.

Citations (2)

Moving object detection strategy for augmented-reality applications in a GPGPU by using CUDA

IEEE International Conference on Consumer Electronics (ICCE) (2012)

Daniel Berjón Carlos Cuevas Francisco Morán Narciso García

A spatial-color-based non-parametric background-foreground modeling strategy implemented on a GPGPU using CUDA is proposed. This strategy is suitable for augmented-reality applications, providing real-time, high-quality results in a wide variety of scenarios.

10.1109/icce.2012.6161886

Citations (2)

Modern Video Adapter Architectures. GPGPU Technology. Part 2

Data Recording, Storage & Processing (2013)

Sergiy Pogorilyy Dmytro Vitel O. A. Vereshchynsky

The basic principles of working with shared and distributed memory in NVIDIA CUDA technology are examined in detail. Thread interaction patterns and the problems of global synchronization are described. A comparative analysis is made of the main technologies used in the GPGPU approach: NVIDIA CUDA, OpenCL, and Direct Compute.

10.35681/1560-9189.2013.15.1.103367

Citations (0)

Parallel connected-component labeling algorithm for GPGPU applications

In-Yong Jung Chang‐Sung Jeong

This paper proposes a new connected-component labeling algorithm for GPGPU applications based on NVIDIA's CUDA. Various approaches and algorithms for connected-component labeling with minimal execution time have been designed, but most of them focus on optimizing CPU algorithms, so they are hard to apply to GPGPU programming models such as NVIDIA's CUDA. Today, GPGPU (general-purpose graphics processing unit) technologies offer dedicated parallel hardware and a programming model, and many applications are being moved onto the GPGPU. The proposed algorithm is a multi-pass algorithm designed for GPGPU applications, and evaluation results show a maximum speedup of more than 2x over conventional CPU algorithms.

Speedup

Graphics processing unit

Component (thermodynamics)

10.1109/iscit.2010.5665161

Citations (15)

Introductory on GPGPU Programming Technique

Computer Programming Skills & Maintenance (2010)

Jinnan Zhang

The structure of HPC is changing with the growing use of GPGPU, and this change points to a new direction for HPC development. CUDA, provided by NVIDIA, is a C-language programming environment for developing parallel computing applications. The efficiency of a computing system can be improved by using CUDA to speed up large-scale, highly parallel computation on supported graphics cards. This essay introduces the development of the GPU and explains how to use CUDA to develop parallel computing applications.

Speedup

Citations (0)

CUDA-NP: Realizing Nested Thread-Level Parallelism in GPGPU Applications

Journal of Computer Science and Technology (English edition) (2015)

Yang Yun Yi Chao Li 周辉阳

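Parallel programs consist of a series of code sections with different degrees of thread-level parallelism (TLP). As a result, it is quite common for a thread in a parallel program, such as a GPU kernel in a CUDA program, to still contain both sequential code and parallel loops. To exploit such parallel loops, the recent NVIDIA Kepler architecture introduced dynamic parallelism, which allows a GPU thread to launch another GPU kernel, thereby reducing the overhead of launching kernels from the CPU. With dynamic parallelism, however, a parent thread can communicate with its child threads only through global memory, and the overhead of launching GPU kernels is significant even within the GPU. In this paper, we first study a set of GPGPU benchmarks that contain parallel loops and highlight that these benchmarks do not have a very high loop count or a high degree of TLP. Consequently, the benefit of exploiting such parallel loops with dynamic parallelism is too limited to offset its overhead. We then present our proposed solution for exploiting nested parallelism in CUDA, called CUDA-NP. With CUDA-NP, we enable a large number of threads when a GPU program starts and use control flow to activate different numbers of threads for different code sections. We implement the proposed CUDA-NP framework using a directive-based compiler approach. For a GPU kernel, an application developer only needs to add OpenMP-like pragmas to the parallelizable code sections. Our CUDA-NP compiler then automatically generates the optimized GPU kernels. It supports the reduction and scan primitives, explores different ways of distributing parallel loop iterations across threads, and efficiently manages on-chip resources. Our experiments show that for a set of GPGPU benchmarks that have already been optimized and contain nested parallelism, the proposed CUDA-NP framework further improves performance by up to 6.69 times and by 2.01 times on average.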

Citations (0)