Performance Skeletons
The performance skeleton of an application is a short-running program whose performance
in any scenario reflects the performance of the application it represents. Such
a skeleton can be employed to quickly estimate the performance of a large application
in a new and unpredictable environment.
Skeleton Construction
The approach is based on capturing the execution behavior of an application and
automatically generating a synthetic skeleton program that reflects that execution
behavior.
Illustration of our Approach
- Record application's execution trace: The application is executed
on a controlled test bed and its execution activity, specifically CPU usage and
message exchanges, is recorded. This is the execution trace.
- Trace logicalization: The communication pattern is identified by
analyzing the communication traffic in the execution traces. Subsequently the family
of traces from parallel execution is converted to a single trace with communication
between physical neighbors converted to communication between logical neighbors.
Cool pics of communication patterns of NAS benchmarks.
- Compress execution trace into an execution signature: The repeated
patterns in the recorded execution trace are identified and used to generate a compact
representation of the trace by introducing a "loop structure". The new compact representation
is the execution signature.
- Generate performance skeleton program from the execution signature:
The application execution signature is converted to a computer program which generates
execution activity that is similar to the recorded execution signature but with
execution time scaled down by a given factor K. This is the performance skeleton.
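The compression and scaling steps above can be illustrated with a small sketch. This is not the project's implementation: the event format (`("compute", seconds)`, `("send", bytes)`), the greedy repeat detection, and the scaling rule (divide loop iteration counts by K where possible, otherwise divide event sizes by K) are all assumptions chosen to keep the example short. The names `compress`, `scale`, and `compute_time` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical trace event: ("compute", seconds) or ("send", bytes).
Event = Tuple[str, float]


def compress(trace: List[Event]) -> list:
    """Greedily fold immediately repeated blocks into ("loop", count, body) items."""
    out = []
    i, n = 0, len(trace)
    while i < n:
        best_len, best_count = 1, 1
        # Try every block length that could repeat at least twice from position i.
        for length in range(1, (n - i) // 2 + 1):
            block = trace[i:i + length]
            count = 1
            while trace[i + count * length:i + (count + 1) * length] == block:
                count += 1
            # Keep the candidate that covers the most events; ties favor the shorter block.
            if count > 1 and count * length > best_count * best_len:
                best_len, best_count = length, count
        if best_count > 1:
            out.append(("loop", best_count, trace[i:i + best_len]))
        else:
            out.append(trace[i])
        i += best_len * best_count
    return out


def scale(signature: list, k: int) -> list:
    """Shrink a signature by about a factor k (assumed rule, see lead-in)."""
    scaled = []
    for item in signature:
        if item[0] == "loop":
            _, count, body = item
            if count >= k:
                # Enough iterations: run fewer of them, leaving the body unchanged.
                scaled.append(("loop", count // k, body))
            else:
                # Too few iterations: keep the count and shrink each event instead.
                scaled.append(("loop", count, [(kind, v / k) for kind, v in body]))
        else:
            kind, value = item
            scaled.append((kind, value / k))
    return scaled


def compute_time(signature: list) -> float:
    """Total CPU seconds described by a signature (communication is ignored)."""
    total = 0.0
    for item in signature:
        if item[0] == "loop":
            _, count, body = item
            total += count * sum(v for kind, v in body if kind == "compute")
        elif item[0] == "compute":
            total += item[1]
    return total


# Four identical compute/send iterations followed by one final compute phase.
trace = [("compute", 2.0), ("send", 1024.0)] * 4 + [("compute", 5.0)]
signature = compress(trace)
skeleton = scale(signature, 2)
```

Here `signature` holds one loop of four iterations plus the trailing compute event, and `skeleton` describes half the compute time of the original. Dividing message sizes by K is a crude stand-in: communication time does not scale linearly with message size, so a real skeleton generator would need a more careful treatment.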
Publications
- J. Subhlok and Q. Xu, Automatic Construction of Coordinated Performance Skeletons,
The NSF Next Generation Software Workshop at IPDPS 2008, Miami, FL, April 2008,
pdf,
slides
These 3 publications comprehensively cover the major recent results from this project:
- Q. Xu and J. Subhlok, Construction and Evaluation of Coordinated Performance Skeletons,
Technical Report UH-CS-08-09, University of Houston, May 2008
pdf
- Q. Xu and J. Subhlok, Efficient discovery of loop nests in communication traces
of parallel programs, Technical Report UH-CS-08-08, University of Houston, May 2008
pdf
- Q. Xu, R. Prithivathi, J. Subhlok, and R. Zheng, Logicalization of MPI communication
traces, Technical Report UH-CS-08-07, University of Houston, May 2008
pdf
The following covers the general problem of skeleton construction, basic techniques
and a wider set of results:
- S. Sodhi, Q. Xu and J. Subhlok, Performance Prediction with Skeletons, Cluster Computing:
The Journal of Networks, Software Tools and Applications, Volume 11, No. 2, June 2008.
source,
pdf
A full list of publications relating to this project is included in
Subhlok's papers
This work is supported by the National Science Foundation under Grant No. ACI-0234328
and Grant No. CNS-0410797.
Contact
For questions, please send email to Qiang Xu at Qiang.Xu[AT]mail[DOT]uh[DOT]edu or
Jaspal Subhlok at jaspal[AT]uh[DOT]edu.