This article presents the development of new materials for teaching an Industrial Communications course, within the Technical Industrial Engineering and Industrial Engineering degrees, and with objectives oriented toward the
Several studies have demonstrated the benefits of job malleability, that is, the capacity of an application to adapt its parallelism to a dynamically changing number of allocated processors. The most remarkable advantages of executing malleable jobs as part of a high performance computer workload are increased throughput and more efficient utilization of the underlying resources. Malleability has mostly been applied to iterative applications where all the processes execute the same operations over different sets of data and with a balanced per-process load. Unfortunately, not all scientific applications adhere to this process-level malleable job structure. Some scientific applications are either noniterative or present an irregular per-process load distribution. Unlike many other reconfiguration tools, the Dynamic Management of Resources Application Programming Interface (DMR API) provides the flexibility needed to make these out-of-target applications malleable. In this article, we study the particular case of using the DMR API to generate a malleable version of HPG aligner, a distributed-memory noniterative genomic sequencer featuring an irregular communication pattern among processes. Through this first conversion of an out-of-target application to a malleable job, we both illustrate how the DMR API may be used to make this type of application malleable and test the benefits of this conversion in production clusters. Our experimental results reveal a significant reduction in the completion time of malleable HPG aligner jobs compared to the original HPG aligner version. Furthermore, malleable HPG aligner workloads achieve a greater throughput than their fixed counterparts.
Laser-Induced Breakdown Spectroscopy (LIBS) is one of the techniques successfully used for measuring experimental atomic and ionic transition probabilities. This is because the excitation procedure easily provides highly ionised species and neutral atoms. Its range also extends to other applications, such as industry or astrophysics. In this work, we describe a specific experimental set-up formed by a Nd:YAG laser and a grating monochromator coupled with a time-resolved optical multichannel analyser. The use of temporally and spatially resolved spectroscopy in a laser-produced plasma for obtaining transition probabilities is also described. From laser-produced plasmas it is also possible to determine some of their properties, such as the temperature or the composition. Moreover, owing to the high emission and temperature, the existence of Local Thermodynamic Equilibrium can be demonstrated, which allows the determination of absolute values for the transition probabilities and the evaluation of characteristics such as self-absorption. The experimental data treatment for obtaining the transition probabilities and the different plasma properties that can be derived is also explained.
We describe a prototype Web service for model reduction of very large-scale linear systems. A user-friendly interface is designed so that model reduction can be easily performed on a cluster of 32 nodes. Access via the Web isolates the user of the service from the complexities of installing and using the parallel model reduction codes and of maintaining the hardware.
OpenMP is the de facto standard application programming interface (API) for on-node parallelism. The most popular OpenMP runtimes rely on POSIX threads (pthreads) implementations that offer excellent performance for coarse-grained parallelism and match current hardware well. However, a recent trend in runtimes and applications points toward leveraging massive on-node parallelism in conjunction with fine-grained and dynamic scheduling paradigms. It has been demonstrated that lightweight thread (LWT) solutions are more appropriate for these new parallel paradigms. We have developed GLTO, an OpenMP implementation over the recently emerged Generic Lightweight Threads (GLT) API. GLT exports a common API for LWT libraries that offers the possibility of running the same application over different native LWT solutions. In this paper we use GLTO to analyze different scenarios where OpenMP implementations may benefit from the use of either LWT or pthreads. Our study reveals that none of the threading approaches obtains the best performance in all the scenarios, and that there are important gaps among them.
In this paper, we present precise time and energy models for an intra-only HEVC video encoder. These models are a step toward understanding and estimating the computational complexity and energy demands of an HEVC encoder, which in turn opens the path to finely tuning the computational resources that are dedicated to this purpose. Our models estimate the complexity and energy consumed by the HEVC encoder, on a frame-by-frame basis, considering two factors: the quantization parameter used to encode each frame and the spatial information of that frame. Our experimental validation demonstrates the accuracy of these models, which report errors that are, on average, below 10% for full HD videos, and 5% for 832 × 480 videos.
In this paper we analyze the impact that energy-saving strategies, such as the application of DVFS via Linux governors and the MPI communication mode, have on the performance and energy consumption of message-passing dense linear algebra operations. In the study, we employ codes from ScaLAPACK for three matrix kernels, the matrix-matrix and matrix-vector products and the Cholesky factorization, which exhibit different levels of concurrency and CPU/memory activity. Following a recent trend, we also include an accelerated version of the matrix-matrix product that off-loads all computation to a graphics processor, and we study the energy gains of this hybrid solver when the general-purpose cores of the system are switched to a low-power mode. Experimental results on a cluster equipped with state-of-the-art computation and communication hardware illustrate the findings of this study.