The Multicomputer Toolbox includes sparse, dense, and iterative scalable linear algebra libraries. This paper covers the dense direct and iterative linear algebra libraries, as well as the distributed data structures used to implement these algorithms; concurrent BLAS are covered elsewhere. We discuss uniform calling interfaces and functionality for linear algebra libraries. We include a detailed explanation of how the level-3 dense LU factorization works, including features that support data distribution independence with a blocked algorithm. We illustrate the data motion for this algorithm and for a representative iterative algorithm, PCGS. We conclude that data-distribution-independent libraries are feasible and highly desirable. Much work remains to be done in performance tuning of these algorithms, though good portability and application relevance have already been achieved.
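To make the structure of a level-3 (blocked) LU factorization concrete, the following is a minimal sequential sketch in C, not the Toolbox code itself: it omits pivoting, data distribution, and the concurrent BLAS, and the block size `b` and function name `lu_blocked` are illustrative assumptions. Each iteration factors a panel, triangularly solves for a block row of U, and applies a rank-b (matrix-matrix) update to the trailing submatrix, which is where the level-3 work lies.

```c
#include <stdio.h>

/* Illustrative right-looking blocked LU, no pivoting (not the Toolbox code).
 * A is n x n in row-major order; on exit it holds L (unit lower) and U. */
static void lu_blocked(double *A, int n, int b)
{
    for (int k = 0; k < n; k += b) {
        int kb = (k + b <= n) ? b : n - k;

        /* 1. Unblocked LU of the panel A[k:n, k:k+kb]. */
        for (int j = k; j < k + kb; j++)
            for (int i = j + 1; i < n; i++) {
                A[i*n + j] /= A[j*n + j];
                for (int jj = j + 1; jj < k + kb; jj++)
                    A[i*n + jj] -= A[i*n + j] * A[j*n + jj];
            }

        /* 2. Triangular solve for the U block row: L11 * U12 = A12. */
        for (int j = k + kb; j < n; j++)
            for (int i = k + 1; i < k + kb; i++)
                for (int p = k; p < i; p++)
                    A[i*n + j] -= A[i*n + p] * A[p*n + j];

        /* 3. Level-3 rank-kb update of the trailing submatrix:
         *    A22 -= L21 * U12 (the matrix-matrix, GEMM-like step). */
        for (int i = k + kb; i < n; i++)
            for (int j = k + kb; j < n; j++)
                for (int p = k; p < k + kb; p++)
                    A[i*n + j] -= A[i*n + p] * A[p*n + j];
    }
}

int main(void)
{
    /* Small diagonally dominant example so no pivoting is needed. */
    double A[16] = { 4, 1, 0, 1,
                     1, 5, 1, 0,
                     0, 1, 6, 1,
                     1, 0, 1, 7 };
    lu_blocked(A, 4, 2);
    for (int i = 0; i < 4; i++, puts(""))
        for (int j = 0; j < 4; j++)
            printf("%8.4f ", A[i*4 + j]);
    return 0;
}
```

In the distributed setting, steps 1-3 become calls on distributed panels and blocks, which is where data distribution independence and the concurrent BLAS enter.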
MPI is the de facto message-passing standard for multicomputers and networks of workstations, established by the MPI Forum, a group of universities, research centers, and national laboratories (from both the United States and Europe), as well as multi-national vendors in the area of high-performance computing. MPI has already been implemented by several groups, and worldwide acceptance has been quite rapid. This paper overviews several areas in which MPI can be extended, discusses the merits of making such extensions, and begins to demonstrate how some of these extensions can be made. In some areas, such as intercommunicator extensions, we have already made significant progress. In other areas (such as remote memory access), we are merely proposing extensions to MPI that we have not yet reduced to practice. We also note that other researchers are evidently working in parallel with us on their own extension concepts for MPI.
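As background for the intercommunicator discussion, here is a minimal sketch using only standard MPI-1 calls (MPI_Comm_split and MPI_Intercomm_create). It shows the base construct that intercommunicator extensions build upon, not the proposed extensions themselves; the group split, tag value, and message contents are arbitrary illustrative choices.

```c
#include <mpi.h>
#include <stdio.h>

/* Illustrative only: split MPI_COMM_WORLD into two groups and connect
 * them with an intercommunicator (run with at least 2 ranks). */
int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Lower half of the ranks form group 0, upper half group 1. */
    int color = (rank < size / 2) ? 0 : 1;
    MPI_Comm local;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &local);

    /* Local leader is rank 0 of each group; the remote leader is named
     * by its rank in the peer communicator (MPI_COMM_WORLD here). */
    int remote_leader = (color == 0) ? size / 2 : 0;
    MPI_Comm inter;
    MPI_Intercomm_create(local, 0, MPI_COMM_WORLD, remote_leader,
                         /* tag */ 99, &inter);

    /* Leaders exchange one integer across the intercommunicator;
     * ranks in point-to-point calls refer to the remote group. */
    int lrank, msg = color, got = -1;
    MPI_Status st;
    MPI_Comm_rank(local, &lrank);
    if (lrank == 0) {
        MPI_Sendrecv(&msg, 1, MPI_INT, 0, 0,
                     &got, 1, MPI_INT, 0, 0, inter, &st);
        printf("group %d leader received %d from remote group\n", color, got);
    }

    MPI_Comm_free(&inter);
    MPI_Comm_free(&local);
    MPI_Finalize();
    return 0;
}
```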
We consider the systematic parallel solution of ordinary differential-algebraic equations (DAEs) of low index (including stiff ODEs). We target multicomputers, message-passing concurrent computers such as Intel's iPSC/2 hypercube and the Symult s2010 2D mesh. The programming model is reactive and/or loosely synchronized communicating sequential processes.
We present new approaches to efficient application-level message passing through the Zipcode communication layer (built upon the Caltech Reactive Kernel), which is shown to be both portable and effective for complex multicomputer codes. Zipcode promotes the elegant expression of message passing in large applications, an important sub-goal.
We present closed-form O(1)-memory, O(1)-time data distributions that provide parametric control over the degree of coefficient blocking and scattering. These new distributions permit effective formulations of the DAEs and higher sparse linear algebra performance.
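One standard closed-form family with exactly these O(1) properties is the block-cyclic (blocked-and-scattered) mapping, sketched below, in which the block size `b` and process count `P` play the roles of the blocking and scattering parameters. This is an illustration of the idea, not a reproduction of the paper's specific distributions.

```c
#include <stdio.h>

/* Illustrative block-cyclic mapping: global index <-> (owner, local index).
 * b = block size (degree of blocking), P = number of processes (scattering).
 * Both directions are closed-form, O(1) time and O(1) memory. */
static int  owner(long i, int b, int P)        { return (int)((i / b) % P); }
static long local_index(long i, int b, int P)  { return (i / ((long)b * P)) * b + i % b; }
static long global_index(int p, long l, int b, int P)
{
    return (l / b) * (long)b * P + (long)p * b + l % b;
}

int main(void)
{
    int b = 3, P = 4;
    for (long i = 0; i < 12; i++)
        printf("global %2ld -> process %d, local %ld\n",
               i, owner(i, b, P), local_index(i, b, P));

    /* Round-trip check for one index. */
    long i = 10;
    printf("round trip: %ld -> %ld\n",
           i, global_index(owner(i, b, P), local_index(i, b, P), b, P));
    return 0;
}
```

Small `b` with large `P` scatters coefficients for load balance; larger `b` keeps blocks contiguous for locality, which is the trade-off the parametric control exposes.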
We present results for concurrent sparse, unsymmetric linear algebra. A two-phase approach is used, like Harwell's MA28. New results include reduced-communication pivoting and improvement of triangular-solve performance via the parametric distributions, in which LU factorization load balance is traded against solve performance; overall performance is thereby increased. Good factorization speedups are attained for the examples, but exploitation of multiple concurrent pivots remains a needed extension. Triangular solves prove disappointing on an absolute scale, despite significant effort.
Two approaches to concurrent simulation are developed. The Waveform Relaxation (Picard-Lindelöf) methodology extends to binary distillation simulation and beyond, and is inherently very concurrent. We address the achievable concurrent performance of sequential approaches via Concurrent DASSL, which extends Petzold's DASSL algorithm to multicomputers. A simulation driver for arbitrary networks of distillation columns is described. For a 9009-integration-state system with seven distillation columns, we demonstrate a speedup of approximately five. The low speedup is attributable to the simplicity of the thermodynamic model used and the nearly narrow-banded Jacobian structure; other chemical-engineering systems could perform substantially better.
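To indicate the flavor of Waveform Relaxation and why it is inherently concurrent, the following is a toy sketch rather than the distillation code: a Jacobi-style WR sweep on a small linear ODE system x' = A x, in which each component is integrated over the whole time window (here with forward Euler for simplicity) while the other components' waveforms are frozen at the previous iterate. In a concurrent setting each component or subsystem could be integrated on a different processor, exchanging only whole waveforms between sweeps; all sizes, names, and the model problem are illustrative assumptions.

```c
#include <stdio.h>
#include <string.h>

#define NVAR   2    /* number of ODE components (subsystems)      */
#define NSTEP  100  /* time steps per window                      */
#define SWEEPS 8    /* waveform-relaxation sweeps over the window */

int main(void)
{
    /* Toy stable linear system x' = A x, illustrative only. */
    const double A[NVAR][NVAR] = { { -2.0,  1.0 },
                                   {  1.0, -3.0 } };
    const double h = 0.01, x0[NVAR] = { 1.0, 0.0 };

    /* Waveforms over the whole window: previous and current iterates. */
    static double xold[NVAR][NSTEP + 1], xnew[NVAR][NSTEP + 1];

    /* Initial guess: constant waveforms equal to the initial condition. */
    for (int i = 0; i < NVAR; i++)
        for (int n = 0; n <= NSTEP; n++)
            xold[i][n] = x0[i];

    for (int sweep = 0; sweep < SWEEPS; sweep++) {
        /* Jacobi WR: each component is integrated independently over the
         * whole window, using the previous iterate for the other
         * components -- this loop is the concurrency opportunity. */
        for (int i = 0; i < NVAR; i++) {
            xnew[i][0] = x0[i];
            for (int n = 0; n < NSTEP; n++) {
                double f = A[i][i] * xnew[i][n];
                for (int j = 0; j < NVAR; j++)
                    if (j != i)
                        f += A[i][j] * xold[j][n];
                xnew[i][n + 1] = xnew[i][n] + h * f;  /* forward Euler */
            }
        }
        memcpy(xold, xnew, sizeof xold);
    }

    printf("x(%.2f) ~ (%.6f, %.6f) after %d WR sweeps\n",
           NSTEP * h, xold[0][NSTEP], xold[1][NSTEP], SWEEPS);
    return 0;
}
```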
We suggest Waveform Relaxation as the key focus of future research for the particular distillation problem class cited. We indicate future areas for application of Concurrent DASSL, and suggest ways to improve its concurrent performance, coupled with improvements in sparse linear algebra.
This paper discusses the proposed National High Performance Distributed Computing Consortium (NHPDCC), a second-generation high performance computing consortium. NHPDCC will build high performance computing solutions via distributed, heterogeneous machines with diverse, complementary capabilities. Unlike single-machine consortia, NHPDCC will strive to demonstrate practical solutions to current and future "challenge" and "grand-challenge" applications by utilizing multi-vendor, commercial products rather than prototypes. Specific software projects to be implemented by consortium personnel include the MPI message-passing standard and related tools, scalable parallel libraries, benchmarks, and systems software. Consortium members will benefit through access to both the hardware and software of the consortium. In addition, the synergy generated by interactions among NHPDCC members will be a significant benefit to them.
Advancement in communication technologies and the Internet of Things (IoT) is driving smart-city adoption, which aims to increase operational efficiency and improve the quality of services and citizen welfare. It is estimated that by 2020, 75% of cars shipped globally will be equipped with hardware to facilitate vehicle connectivity. The privacy, reliability, and integrity of communication must be ensured so that actions can be accurate and implemented promptly after actionable information is received. Because vehicles are equipped with the ability to compute, communicate, and sense their environment, there is a concomitant critical need to create and maintain trust among network entities despite the network's dynamism; trust between entities must be built and validated in the short time before they leave each other's range. In this work, we present a multi-tier scheme consisting of an authentication and trust building/distribution framework designed to ensure the safety and validity of the information exchanged in the system.