Saturday 24 February 2024

Inducing time-asymmetry on reversible classical statistical mechanics via Interventional Thermodynamic Ensembles (ITEs).

Preamble 

Probably one of the most fundamental issues in classical statistical mechanics is explaining how many-particle systems built from reversible dynamics behave irreversibly. In other words, how does time's arrow appear even though the constituent dynamics evolves reversibly? This is the main idea behind Loschmidt's paradox. The resolution to this paradox lies in so-called interventional thermodynamic ensembles (ITEs).

Figure: Leaning Tower of Pisa; recall Galileo's experiments (Wikipedia)

Time-asymmetry is about different histories: Counterfactual dynamics

Before trying to understand how ITEs are used in resolving Loschmidt's paradox, note that inducing different trajectories on an identical dynamical system in "a parallel universe" implies time-asymmetry, while each trajectory on its own remains reversible. The so-called "parallel universe" here means imagining a different history generated via a sampling; this corresponds to counterfactuals within causal inference frameworks.

Interventional Thermodynamic Ensembles (ITEs)

An interventional ensemble builds upon another ensemble; for the sake of simplicity, we can think of an ensemble as an associated, chosen sampling scheme. From this perspective, a sampling scheme $\mathscr{E}$ has an interventional counterpart $do(\mathscr{E})$ if the adjusted scheme introduces a change that does not alter the inherent dynamics but does affect the dynamical history. One of the first examples of this appeared recently, single-spin-flip vs. dual-spin-flip dynamics [suezen23], demonstrated with simulations.
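
As a concrete illustration, here is a minimal sketch in Python (not the simulation code of [suezen23]; names and parameters are illustrative): a 1D Ising chain sampled with Metropolis dynamics, where the base ensemble $\mathscr{E}$ proposes single-spin flips and the intervened ensemble $do(\mathscr{E})$ proposes dual-spin flips. The Hamiltonian and the acceptance rule, i.e., the inherent dynamics, are untouched; only the proposal scheme changes, and with it the dynamical history.

import numpy as np

def metropolis_ising_1d(n_spins=64, n_steps=2000, beta=0.5,
                        flips_per_move=1, seed=42):
    """Metropolis sampling of a 1D Ising chain with periodic boundaries.

    flips_per_move=1 is the base ensemble E (single-spin-flip);
    flips_per_move=2 is a do(E)-style intervention (dual-spin-flip):
    the Hamiltonian and the acceptance rule are unchanged, only the
    proposal scheme differs, so the history differs.
    """
    rng = np.random.default_rng(seed)
    spins = rng.choice([-1, 1], size=n_spins)

    def total_energy(s):
        # Nearest-neighbour coupling, J = 1, periodic boundary conditions.
        return -np.sum(s * np.roll(s, 1))

    energies = []
    for _ in range(n_steps):
        sites = rng.choice(n_spins, size=flips_per_move, replace=False)
        proposal = spins.copy()
        proposal[sites] *= -1
        dE = total_energy(proposal) - total_energy(spins)
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins = proposal
        energies.append(total_energy(spins))
    return np.array(energies)

# Same seed, same Hamiltonian, same acceptance rule -- different histories.
single = metropolis_ising_1d(flips_per_move=1)
dual = metropolis_ising_1d(flips_per_move=2)
print("trajectories differ:", not np.allclose(single, dual))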

Outlook

Reversibility and time-asymmetry in classical dynamics are long-standing issues in physics. Bringing a causal inference perspective, i.e., the interpretation of the $do$-operator, into computing the dynamical evolution of many-body systems leads to a reconciliation of reversibility and time-asymmetry.

References

[suezen23] H-theorem do-conjecture (2023), arXiv:2310.01458 (simulation code on GitHub).

Please Cite as:

 @misc{suezen24ite, 
     title = {Inducing time-asymmetry on reversible classical statistical mechanics via Interventional Thermodynamic Ensembles (ITEs)}, 
     howpublished = {\url{https://memosisland.blogspot.com/2024/02/inducing-time-asymmetry-on-reversible.html}},
     author = {Mehmet Süzen},
     year = {2024}
}  





Friday 9 February 2024

Exact reproducibility of stochastic simulations for parallel and serial algorithms simultaneously
Random Stream Chunking

Preamble 

Figure: Visual description of random stream chunking, M. Suzen (2017)

The use of computational science approaches, i.e., data science and machine learning, has become common practice in almost all organisations due to the democratisation of data science tools and the availability of inexpensive cloud infrastructure. This brings the requirement, or even the compulsory practice, of code being so-called parallelised. Note that parallelisation is used here as an umbrella term for using multiple compute resources to accelerate an otherwise serial computation; it could mean distributed or multi-core computing, i.e., multiple CPUs/GPUs. Here we provide a simple yet very powerful approach to preserving the reproducibility of parallel and serial implementations of the same algorithm that uses random numbers, i.e., of stochastic simulations. We assume a Single Instruction Multiple Data (SIMD) setting. 

Terminology on repeatability, reproducibility and  replicability 

Even though we use reproducibility as an umbrella term here, there are clear distinctions, recommended by the ACM. We summarise them from a computational science perspective:

repeatability: the owner re-runs the code and produces the same results in their own environment. 
reproducibility: others can re-run the code and produce the same results in their own environment. 
replicability: others can re-implement the same thing differently and produce the same results in their own environment. 

In the context of this post, since parallel and serial settings constitute different environments, the practice of getting the same results from both falls under reproducibility.

Single Instruction Multiple Data (SIMD) setting. 

This is the most used approach in parallel processing, and probably the only one you will ever need. It implies using the same instruction, i.e., algorithm or function/module, on distinct portions of the data. It originates from an applied mathematics technique called the domain decomposition method.

Simultaneous Random Stream Chunking 

The approach that ensures exact reproducibility of a stochastic algorithm in both its serial and parallel implementations lies in chunking the random number streams by default, in both the serial and the parallel code. This approach is used in the Bristol Python package. Here is the mathematical definition.

Definition (Random Stream Chunking): A random number generator $\mathscr{R}(s_{k})$ with seed $s_{k}$ is used for partition $k$ of the $k$ partitions. These partitions always correspond to the $k$ datasets, the MD portion of SIMD. In the case of a parallel algorithm, each $k$ also corresponds to a different compute resource $\mathscr{C}_{k}$.

By this definition we ensure that both the parallel and the serial processing receive exactly the same random numbers per partition; this is summarised in the Figure.
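
A minimal sketch of this chunking in plain NumPy follows; it is an illustration under assumptions (the base seed, the kernel and the helper names are hypothetical), not the Bristol package's actual API. Each partition $k$ gets its own generator $\mathscr{R}(s_{k})$ seeded by the partition index, so the serial loop and the parallel map over the same SIMD-style chunks draw identical random numbers.

import numpy as np
from multiprocessing import Pool

BASE_SEED = 2024  # hypothetical base seed; the per-partition seed s_k is (BASE_SEED, k)

def stochastic_kernel(args):
    """Same instruction on partition k, drawing from its own generator R(s_k)."""
    k, chunk = args
    rng = np.random.default_rng([BASE_SEED, k])  # stream tied to the partition, not the worker
    noise = rng.normal(size=chunk.shape)
    return float(np.sum(chunk + noise))

if __name__ == "__main__":
    data = np.arange(40.0)
    parts = list(enumerate(np.array_split(data, 4)))  # the k-partitions, the MD part of SIMD

    serial = [stochastic_kernel(p) for p in parts]    # serial run
    with Pool(processes=4) as pool:                   # parallel run on 4 compute resources C_k
        parallel = pool.map(stochastic_kernel, parts)

    print(np.allclose(serial, parallel))              # True: identical streams per partition

The design choice is that the seed is a function of the data partition, not of the compute resource or the scheduling order, so any assignment of partitions to workers reproduces the serial result exactly.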

Conclusion

We outlined a simple yet effective way of ensuring exact reproducibility of serial and parallel simulations simultaneously. However, reproducibility of stochastic simulations is also highly hardware dependent; on GPUs and NPUs, for instance, the internal random streams may not be easy to partition and control, but the generic idea presented here should still be applicable.


Please cite this article as: 

 @misc{suezen24rep, 
     title = {Exact reproducibility of stochastic simulations for parallel and serial algorithms simultaneously}, 
     howpublished = {\url{https://memosisland.blogspot.com/2024/02/exact-reproducibility-of-stochastic.html}},
     author = {Mehmet Süzen},
     year = {2024}
}
  



Tuesday 28 November 2023

What's the purpose of randomness in causal discovery techniques?

Figure: Roulette wheel (Wikipedia)

Preamble
 

In this short exposition, we inquire about the purpose of randomness and how it relates to discovering or testing causal relationships using data and causal models. In his seminal work, Holland (1986) pointed out something striking that had not been put in such form in earlier works. He stated the "obvious": in almost all data sets of an interventional nature, such as treatment vs. non-treatment, the person or unit we study cannot be treated and not treated at the same time. We delve into the question of randomness from this perspective, i.e., the so-called fundamental problem of causal inference.

Group assignment for causal inference   

Group assignment is probably one of the most fundamental approaches in statistical research, as in the famous lady tasting tea problem. The idea of assignment in causal inference is that we need to find a matching person or unit that is not treated if we have a treated sample, or the other way around; this is so-called matching or balancing.
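
As a toy illustration of matching, here is a hedged sketch on synthetic, hypothetical data (a one-covariate nearest-neighbour match, not a recommended production estimator): each treated unit is paired with the closest untreated unit, and the matched difference is compared with the naive group difference.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observational data: covariate x, binary treatment t, outcome y.
n = 200
x = rng.normal(size=n)
t = (rng.random(n) < 1.0 / (1.0 + np.exp(-x))).astype(int)  # treated more often when x is large
y = 2.0 * t + x + rng.normal(scale=0.5, size=n)             # true treatment effect is 2

# Nearest-neighbour matching on x: pair each treated unit with the closest control.
treated = np.where(t == 1)[0]
control = np.where(t == 0)[0]
matches = control[np.abs(x[treated][:, None] - x[control][None, :]).argmin(axis=1)]

naive = y[t == 1].mean() - y[t == 0].mean()   # biased: x differs across the two groups
matched = np.mean(y[treated] - y[matches])    # closer to the true effect of 2
print(f"naive: {naive:.2f}, matched: {matched:.2f}")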

Randomness in causality: Removal of  pseudo-confounders

Randomness does not only allow fair representation in control and treatment group assignments, essentially reducing bias. A primary effect of randomness is the removal of pseudo-confounders, which is not well studied in the literature. What this means is that if we do not randomise, there would be other causal connections in the model that really should not be there.
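
A small simulation on assumed toy data makes the point: when units self-select into treatment based on a background covariate $u$, $u$ becomes correlated with the treatment indicator and behaves like a confounder; when assignment is randomised, that spurious link vanishes in expectation.

import numpy as np

rng = np.random.default_rng(1)
n = 100_000
u = rng.normal(size=n)  # background covariate

# Self-selected treatment: assignment depends on u, so u confounds any naive estimate.
t_observational = (rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * u))).astype(int)

# Randomised treatment: assignment ignores u entirely.
t_randomised = rng.integers(0, 2, size=n)

print("corr(u, t), observational:", round(np.corrcoef(u, t_observational)[0, 1], 3))
print("corr(u, t), randomised:   ", round(np.corrcoef(u, t_randomised)[0, 1], 3))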

Conclusion

Here, we hinted at something called pseudo-confounders. Randomisation in matching and other causal techniques primarily removes bias, but the removal of pseudo-confounders is not commonly mentioned and remains open research.

Further reading

Please cite this article as: 

 @misc{suezen23ran, 
     title = {What's the purpose of randomness in causal discovery techniques?}, 
     howpublished = {\url{https://memosisland.blogspot.com/2023/11/causal-inference-randomisation.html}},
     author = {Mehmet Süzen},
     year = {2023}
}
  




  

Saturday 25 November 2023

Why should there be no simultaneity rule for causal models?

Figure: Dominoes in motion (Wikipedia)

Preamble
 

The definition of weighted directed acyclic graphs (wDAGs) provides a great opportunity to express causal relationships among given variates. Usually this is expressed as an SCM, a Structural Causal Model, or more generally a causal model. A given causal model can be expressed as a set of simultaneous equations with a direction attached to the equality, right to left, meaning $A=B$ implies that B causes A to happen, $B \to A$. Then what happens if A is a function of B and C, $A=f(B,C)$? We would say $B \to A$ and $C \to A$ occur simultaneously. In this post we argue that a no-simultaneity rule should hold in causal models, even if they are not time-series models.

Understanding  causal models

The basic definition of a causal model follows a functional form with a set of equations, realistically with added noise. The model visually forms a weighted Directed Acyclic Graph (wDAG). Here is the mathematical definition due to Pearl (Causality, 2009), made a bit coarser in this definition:

Definition (Causal Model): Given a set of $n$ variables $X \in \mathbb{R}^{n}$ and two subsets with $X = x_{1} \cup x_{2}$, they form a set of equations $x_{2}=f(x_{1}; \alpha; \epsilon)$, with $\alpha$ being the causal effect sizes of the causes $x_{1}$ on $x_{2}$ and $\epsilon$ some noise. This corresponds to a wDAG formed on $X$ with weights $\alpha$, so that there is a graph $\mathscr{G}(X, \alpha)$ representing this set of equations, where each equality puts a direction from the right to the left side of the equation.
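
Before discussing its consequences, here is a toy realisation of this definition, with hypothetical variables and effect sizes: the structural equation $A = 0.8 B + 0.3 C + \epsilon_{A}$ is stored as a wDAG whose edges point from the right to the left of the equality, with the $\alpha$ values as edge weights.

import numpy as np

rng = np.random.default_rng(7)

# Structural equations (hypothetical): B = eps_B, C = eps_C, A = 0.8*B + 0.3*C + eps_A.
# The right-to-left direction of each equality gives the edges B -> A and C -> A.
alpha = {("B", "A"): 0.8, ("C", "A"): 0.3}  # the wDAG as an (edge -> weight) map

def sample(n=5):
    """Draw n joint samples consistent with the structural equations above."""
    eps = {v: rng.normal(size=n) for v in ("A", "B", "C")}
    B, C = eps["B"], eps["C"]
    A = alpha[("B", "A")] * B + alpha[("C", "A")] * C + eps["A"]
    return {"A": A, "B": B, "C": C}

print({name: values.round(2) for name, values in sample().items()})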

However, this definition does not set any constraints on the values of $\alpha$. Any two or more values of $\alpha$ can be equal on paths into the same variable within $X$. Interestingly, this implies that there could be a set of variates simultaneously causing the same thing. It sounds plausible, and physically possible to a degree within a Planck time, but it brings in the ambiguity of breaking ties when ordering events.

Perfect Causal Ordering

A wDAG given as a causal model induces a causal ordering among all members of $X$; we described how this can be achieved in a recent post: Practical causal ordering. In this context, a perfect causal ordering implies that the $\alpha$ values on the first-order paths into a given end variable are all different. Mathematically, a definition follows.

Definition (No simultaneity rule): Given all $k$ triplets $(x_{i}, y, \alpha_{i})$, where $x_{i}$ is one of the causes of $y$ and $\alpha_{i}$ the causal effect sizes, all $\alpha_{i}$ are distinct numbers, inducing a perfect causal ordering.

This rule ensures that we do not need to break ties randomly, as the causal ordering is established unambiguously.
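
A minimal sketch of how the rule could be checked and used (the wDAG representation and the function name are illustrative): the direct causes of a target are ranked by their $\alpha$ values, and identical first-order weights are rejected instead of being tie-broken at random.

def causes_in_order(wdag, target):
    """Rank the direct causes of `target` by |alpha|, enforcing the no simultaneity rule.

    `wdag` maps (cause, effect) edges to causal effect sizes alpha.
    Raises if two first-order causes of `target` share the same alpha,
    i.e. if a perfect causal ordering would need an arbitrary tie-break.
    """
    direct = {cause: a for (cause, effect), a in wdag.items() if effect == target}
    if len(set(direct.values())) != len(direct):
        raise ValueError("simultaneous causes: identical alpha on first-order paths")
    return sorted(direct, key=lambda cause: abs(direct[cause]), reverse=True)

# Distinct weights give a unique ordering of causes, useful for prioritising interventions.
wdag = {("B", "A"): 0.8, ("C", "A"): 0.3, ("D", "A"): 0.5}
print(causes_in_order(wdag, "A"))  # ['B', 'D', 'C']

# Equal weights violate the rule and would force a random tie-break:
# causes_in_order({("B", "A"): 0.5, ("C", "A"): 0.5}, "A")  raises ValueError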

Conclusion: Importance of no simultaneity 

By this definition we have ruled out any simultaneous causes. This may sound too restrictive for modelling, but it impacts decision making significantly: ranking the causes of an outcome determines how to prioritise policy in addressing the outcome, e.g., a medical intervention to prevent the first cause. Also, it may not be feasible to intervene on simultaneous causes. Hence, establishing primary causes in order is paramount for decision making and for the execution of any reliable policy.


Further reading

Please cite this article as: 

 @misc{suezen23nos, 
     title = {Why should there be no simultaneity rule for causal models?}, 
     howpublished = {\url{https://memosisland.blogspot.com/2023/11/causal-model-simultaneous.html}},
     author = {Mehmet Süzen},
     year = {2023}
}
  





(c) Copyright 2008-2024 Mehmet Suzen (suzen at acm dot org)

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License