SURF2008
[wiki:SURF2007]
Parallelization with Trilinos in FiPy - Inclusion of Parallel Vectors in FiPy
For incorporating Trilinos into FiPy to be advantageous, there must be a demonstrable speedup when using Trilinos in a multi-processor environment over serial operations in Trilinos, serial operations with NumPy, and serial operations using inline C code. This was tested with an automated Python script that collected timing data. Each run of the script created a vector (one-dimensional array) of a given size filled with random float values, multiplied it by itself repeatedly, and recorded the wall time elapsed during the multiplication. The multiplication was repeated 25 times to ensure that the measured interval was significant. The process was tested using NumPy arrays, inline C code, serial MPI/Trilinos, and MPI/Trilinos on up to 10 processors. The number of operations varied from 3 to 30 in steps of 3, and the vector size varied from 100,000 to 500,000 in steps of 10,000. The test machines were three 32-bit Debian machines, one with two processors and two with four; the work was distributed across multiple machines in the case of a small number of processors.
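The serial NumPy case of the benchmark described above can be sketched as follows. This is a minimal illustration, not the actual test script; the sizes swept and the helper name `time_vector_multiply` are assumptions, and only the repeat count of 25 comes from the text.

```python
import time

import numpy as np


def time_vector_multiply(size, repeats=25):
    """Time repeated elementwise self-multiplication of a random vector.

    Creates a vector of `size` random floats, multiplies it by itself
    `repeats` times, and returns the elapsed wall time in seconds.  The
    repetition makes the measured interval large enough to be meaningful.
    """
    v = np.random.random(size)
    start = time.time()
    for _ in range(repeats):
        v = v * v
    return time.time() - start


if __name__ == "__main__":
    # Illustrative subset of the 100,000 to 500,000 size range.
    for size in (100_000, 300_000, 500_000):
        elapsed = time_vector_multiply(size)
        print(f"size={size}: {elapsed:.4f} s")
```

The same timing loop can then be repeated with inline C (e.g. via `weave`) or with distributed Epetra vectors to produce comparable numbers for the other methods.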
The data showed a significant average speedup for parallel processing over serial processing, which correlated roughly with the number of processors -- two processors cut the elapsed time approximately in half. This speedup factor was essentially independent of the number of operations performed: even for a small number of operations, parallel execution remained significantly faster than every serial method. Elapsed time increased linearly with vector size for all methods, and the speedup for large vectors did not differ significantly from that for small ones. The data did contain anomalies in what would otherwise be a linear increase in elapsed time versus vector size: large, temporary jumps in elapsed time around certain array sizes, regardless of the operation method used. Some of this variance may be attributed to expected measurement error. Because the machines were not dedicated exclusively to these tests, the benchmark processes may not always have received full priority, so perfectly consistent results cannot be expected. Some variance may also be attributed to communication overhead between processors and machines. Given the regularity of the anomalous data, however, some other factor, such as exceeding cache size, may also be responsible.
The general trend of the data shows a significant advantage to using parallel processing with MPI/Trilinos over serial methods. FiPy can therefore solve PDEs more efficiently using parallel operations.
When running FiPy in parallel mode with Trilinos, the FIPY_SOLVERS environment variable must be set to 'Trilinos' in the user's shell resource configuration file. mpirun does not pass command-line arguments through to the program it runs in parallel, so FiPy must take its cue from the environment instead.
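For a Bash user, the configuration described above might look like the following sketch; the file name (here `~/.bashrc`) and the example script name are assumptions, while the variable name and value come from the text.

```shell
# In the user's shell resource file (e.g. ~/.bashrc for Bash),
# so the setting is visible to every process mpirun launches:
export FIPY_SOLVERS=Trilinos

# FiPy then reads the solver choice from the environment, e.g.:
#   mpirun -np 4 python myFiPyScript.py
```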
For any future work using Trilinos, Epetra.Import objects allow information to be passed between processors.
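A rough sketch of how an Epetra.Import might be used from PyTrilinos is shown below. This is an illustration under assumed conditions (PyTrilinos installed, run under mpirun), not code from the project; the maps, sizes, and variable names are all hypothetical. It gathers a distributed vector onto every processor.

```python
# Hypothetical sketch: requires PyTrilinos and an MPI launch (mpirun).
from PyTrilinos import Epetra

comm = Epetra.PyComm()
n = 100  # global vector length (illustrative)

# Source map: the n elements divided evenly across the processors.
sourceMap = Epetra.Map(n, 0, comm)
source = Epetra.Vector(sourceMap)
source.PutScalar(float(comm.MyPID()))  # mark each piece with its owner's rank

# Target map: every processor holds a copy of all n elements.
targetMap = Epetra.Map(-1, list(range(n)), 0, comm)
target = Epetra.Vector(targetMap)

# The Import object encodes the communication pattern between the two
# maps; the Import() call then performs the actual data movement.
importer = Epetra.Import(targetMap, sourceMap)
target.Import(source, importer, Epetra.Insert)
```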