UH Header Computer Science Department Logo
Self Adapting Numerical Software (SANS) – Effort and Fault Tolerance in Linear Algebra Algorithms: 03.06.06

Abstract
A new generation of software libraries and algorithms are needed for the effective and reliable use of (wide area) dynamic, distributed and parallel environments.  Some of the software and algorithm challenges have already been encountered, such as management of communication and memory hierarchies through a combination of compile--time and run--time techniques, but the increased scale of computation, depth of memory hierarchies, range of latencies, and increased run--time environment variability will make these problems much harder.

Along these lines we will discuss work on the development of parameterizable and annotatable software libraries in the linear algebra area that will permit performance tuning for a broad range of architectures including grid computing.   Self Adapting Numerical Software (SANS) is a software effort that will automatically generate highly optimized numerical kernels for our high performance computers.
In addition we will describe an implementation of MPI which extends the message passing model to allow for recovery in the presence of a faulty process. Our implementation allows a user to catch the fault and then provide for a recovery.

We will also touch on the issues related to using diskless checkpointing to allow for effective recovery of an application in the presence of a process fault.

 

 

 

 

 

 

 

 

Feedback Contact U H Site Map Privacy and Policies U H System Statewide Search Compact with Texans State of Texas