Figure: Visual description of random stream chunking, M.Suzen (2017)
Terminology on repeatability, reproducibility and replicability
Even though we use reproducibility as an umbrella term, there are clear distinctions between these terms, as recommended by the ACM. We summarise them from a computational science perspective:
repeatability: The owner re-runs the code and produces the same results in their own environment.
reproducibility: Others re-run the code and produce the same results in their own environment.
replicability: Others re-code the same thing differently and produce the same results in their own environment.
In the context of this post, parallel and serial settings constitute different environments, so the practice of getting the same results in both falls under reproducibility.
Single Instruction Multiple Data (SIMD) setting.
This is the most widely used approach in parallel processing, and probably the only one you would ever need. It implies using the same instruction, i.e., algorithm or function/module, on distinct portions of the data. It originates from the applied mathematics technique known as the domain decomposition method.
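The idea of one instruction applied to distinct portions of data can be sketched in a few lines of Python. This is a minimal illustration, not taken from any particular package: the helper names `split_into_chunks` and `sum_of_squares` are hypothetical, and threads stand in for separate compute resources.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, k):
    """Split data into k contiguous chunks of near-equal size."""
    size, rem = divmod(len(data), k)
    chunks, start = [], 0
    for i in range(k):
        # The first `rem` chunks take one extra element.
        end = start + size + (1 if i < rem else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The single 'instruction' applied to every portion of the data."""
    return sum(x * x for x in chunk)


data = list(range(10))
chunks = split_into_chunks(data, k=3)

# Serial: apply the same function to each chunk in turn.
serial_total = sum(sum_of_squares(c) for c in chunks)

# Parallel: apply the same function to each chunk on a separate worker.
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel_total = sum(pool.map(sum_of_squares, chunks))
```

For a deterministic function like this one, `serial_total` and `parallel_total` agree trivially. The difficulty addressed in the rest of the post arises once the function draws random numbers.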
Simultaneous Random Stream Chunking
The approach to ensuring exact reproducibility of a stochastic algorithm in both serial and parallel implementations lies in chunking the random number stream by default, in both the serial and the parallel code. This approach is used in the Bristol Python package. Here is the mathematical definition.
Definition (Random Stream Chunking): A random number generator $\mathscr{R}(s_{k})$ with seed $s_{k}$ is used over $k$ partitions. These partitions always correspond to the $k$ datasets, the MD portion of SIMD. In the case of parallel algorithms, each $k$ corresponds to a different compute resource $\mathscr{C}_{k}$.
With this definition we ensure that both parallel and serial processing receive exactly the same random numbers. This is summarised in the Figure.
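A minimal sketch of the definition in Python, using only the standard library, might look as follows. It is an illustration under stated assumptions and not the Bristol package's implementation: the function names and seed values are hypothetical, `random.Random(seed)` plays the role of $\mathscr{R}(s_{k})$, and threads stand in for the compute resources $\mathscr{C}_{k}$.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_chunk(seed, n_samples):
    """Draw n_samples from a generator that is private to this chunk."""
    rng = random.Random(seed)  # R(s_k): independent generator seeded with s_k
    return [rng.random() for _ in range(n_samples)]


def run_serial(seeds, n_samples):
    """Process the chunks one after another, still one generator per chunk."""
    return [simulate_chunk(s, n_samples) for s in seeds]


def run_parallel(seeds, n_samples):
    """Process the chunks concurrently, one worker per chunk."""
    with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        # pool.map returns results in the order of the input seeds.
        return list(pool.map(simulate_chunk, seeds, [n_samples] * len(seeds)))


# Hypothetical seeds s_k, one per partition (illustrative values only).
seeds = [4242, 4243, 4244, 4245]

serial_draws = run_serial(seeds, n_samples=5)
parallel_draws = run_parallel(seeds, n_samples=5)
```

Because the serial code chunks its stream exactly as the parallel code does, `serial_draws == parallel_draws` holds regardless of how the workers are scheduled. Note that consecutive integer seeds are used here only for readability; choosing seeds that give statistically independent streams is a separate question.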
We outlined a simple yet effective way of ensuring exact reproducibility of serial and parallel simulations simultaneously. However, the reproducibility of stochastic simulations is also highly hardware dependent: on devices such as GPUs and NPUs, the internal random streams might not be easy to partition. Still, the generic idea presented here should be applicable.
Please cite this article as:
@misc{suezen24rep,
title = {Exact reproducibility of stochastic simulations for parallel and serial algorithms simultaneously},
howpublished = {\url{https://memosisland.blogspot.com/2024/02/exact-reproducibility-of-stochastic.html}},
author = {Mehmet Süzen},
year = {2024}
}