
Performance Prediction

One main focus of the IPACS-Project is the development of methods for modeling and predicting the performance of commercial codes. The need for, and the benefit of, performance prediction techniques falls roughly into two areas.
On the one hand, there is software development, where the developer needs to understand the performance of an implemented algorithm in order to find, and perhaps cure, the bottlenecks of the code. This requires very precise modeling techniques, and detailed information from hardware counters and/or source code analysis is mandatory. In many situations and on many systems, however, this kind of information is hard or even impossible to obtain (e.g. for commercial software).
On the other hand, performance prediction is also important for users who want to find out which hardware upgrade would most effectively improve the run-time of their own application. In this case, the data on which the model is based must be easily obtainable, while a prediction accuracy of about 10% might be sufficient. A reasonable estimate should nevertheless be possible for an architecture that cannot be accessed or has not even been built yet. This is the area addressed by the modeling methods in the IPACS-Project.
With this in mind, a rather simple model has been developed. A detailed description and a comparison with experimental data can be found here. Only the main characteristics are mentioned in this section. As a simplified approach, the model depends on some basic assumptions and is therefore restricted to a special class of applications. It is assumed that the main part of the run-time is spent in a repeated loop over a large number of small elementary building blocks such as points or cells, where 'large' and 'small' are meant relative to the fastest available memory layer. This assumption can be expected to hold for general evolution-problem applications and especially for CFD applications, to which the results given here are restricted.
In the model, the application and the case under consideration are described by a set of characteristic numbers. For a single-processor run, these are the numbers of cache loads, stores, main memory accesses and flops. For a parallel run, the size of the boundary partitions per process, the amount of data to be communicated and the number of communication steps have to be added. These numbers are combined with the results of the low-level benchmarks and with performance metrics: the theoretical peak performance, the Cachebench read and write bandwidths on the different memory levels, and the network bandwidth and latency from the PMB benchmarks. These results are taken directly from the repository, so the performance prediction automatically depends on the measured benchmark data.
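To make the combination of characteristic numbers and benchmark results more concrete, the following is a minimal sketch of one way such a prediction could be assembled. It is not the IPACS model itself, whose formulas are given only in the detailed description. The purely additive cost terms, the single cache level, the 8-byte word size, and all class, field and function names are assumptions made for illustration, and the numbers in the usage example are invented.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    """Characteristic numbers of one application case (hypothetical layout)."""
    flops: float                # floating-point operations
    cache_loads: float          # words loaded from cache
    cache_stores: float         # words stored to cache
    memory_accesses: float      # words moved to or from main memory
    comm_bytes: float = 0.0     # bytes communicated (parallel runs only)
    comm_steps: int = 0         # number of communication steps (parallel runs only)


@dataclass
class MachineProfile:
    """Low-level benchmark results for one machine (hypothetical layout)."""
    peak_flops: float           # theoretical peak performance, flop/s
    cache_read_bw: float        # cache read bandwidth, bytes/s
    cache_write_bw: float       # cache write bandwidth, bytes/s
    memory_bw: float            # main memory bandwidth, bytes/s
    net_bw: float               # network bandwidth, bytes/s
    net_latency: float          # network latency, seconds per message


def predict_runtime(app: AppProfile, machine: MachineProfile,
                    word_bytes: int = 8) -> float:
    """Estimate the run-time in seconds as a sum of independent cost terms.

    Assumption: computation, cache traffic, memory traffic and
    communication do not overlap, so their times simply add up.
    """
    compute = app.flops / machine.peak_flops
    cache = (app.cache_loads * word_bytes / machine.cache_read_bw
             + app.cache_stores * word_bytes / machine.cache_write_bw)
    memory = app.memory_accesses * word_bytes / machine.memory_bw
    # Each communication step pays the latency once; the payload is
    # limited by the network bandwidth.
    comm = app.comm_steps * machine.net_latency + app.comm_bytes / machine.net_bw
    return compute + cache + memory + comm


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    app = AppProfile(flops=2e9, cache_loads=1e9, cache_stores=5e8,
                     memory_accesses=1e8, comm_bytes=1e6, comm_steps=100)
    machine = MachineProfile(peak_flops=1e9, cache_read_bw=8e9,
                             cache_write_bw=4e9, memory_bw=1.6e9,
                             net_bw=1e8, net_latency=1e-5)
    print(f"predicted run-time: {predict_runtime(app, machine):.3f} s")
```

Because the machine profile is kept separate from the application profile, the same application numbers can be evaluated against the benchmark results of any machine in the repository, including one the user cannot access directly.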

List of available computers
Name                        | Type               | Architecture | Manufacturer | Insert Date
Seaborg                     | SP POWER3          | IBM SP       | IBM          | 2004-06-09
Unimatrix Zero              | Linux SMP-Cluster  | x86          | Dell         | 2005-02-04
vr-master                   | PC cluster         | Linux        | NetworX      | 2005-04-22
cheetah                     | IBM pSeries System | Power 4      | IBM          | 2005-04-26
rumtest7.rz.uni-mannheim.de | n/a                | x86_64       | n/a          | 2006-01-25
rumtest6.rz.uni-mannheim.de | n/a                | x86_64       | n/a          | 2006-01-25
rumtest5.rz.uni-mannheim.de | n/a                | SMP          | n/a          | 2006-01-30
rumtest2.rz.uni-mannheim.de | n/a                | x86_64       | n/a          | 2006-03-17

Copyright 2002-2005 ipacs-benchmark.org