Showing posts with label complexity. Show all posts
Showing posts with label complexity. Show all posts

Saturday, 28 September 2024

Matrix language conjecture for combinatorics:
Combinatorial set generation via nested cartesian products

Rubik's cube (Wikipedia)

Preamble
 

Counting is one of the most important concepts in probability and statistical mechanics as well. Two primary characteristics of choosing $n$ items from $N$ items are (no)-order and (no)-repeat. This leads to four  possible cases that leads to combinations and permutations of different kinds to found the resulting set size. Specially the case for repeated no-ordering would be quite hard to memorise or derive on the spot. There is a catch, we can compute the set sizes in combinatorics but usually there is no explicit explanation how could we get the entire combinatorial set member via a computation.  In this short exposition, we have demonstrated a conjecture  that obtaining resulting combinatorial set members is possible via nested cartesian products of the set of the $N$ items: Concepts from matrices help to identify resulting set members in the given combinatorics case.  

Compact Combinatorics

Combinatorics over choosing $n$ items over $N$ has four cases as mentioned. Here is the formulations of choosing sets over $N$ items of size $n$ with different constraints, resulting set sizes will be: 

  • Ordering matters with replacement : $N^n$
  • Ordering matters without replacement: $\Large \frac{N!}{(N-n)!} $
  • Ordering doesn't matter with replacement: $\Large \frac{(n+N-1)!}{n!(N-1)!}$
  • Ordering doesn't matter without replacement: $\Large \frac{N!}{n!(N-n)!} $
However, it is often we forget the formulations of the each case, even to identify the case given a problem.

Obtaining resulting sets: Combinatorial Matrix

We know the size of the resulting sets if we do the choose over a set given $n$. However, the members of the resulting set is not usually known via a computational procedure in standard texts. However, this is possible via cartesian product of the initial set by itself if $n=2$, i.e., pairwise matrices over $N$ item, a combinatorial matrix.     

Matrix language conjecture for combinatorics: A combinatorial matrix has different portions. An entire matrix (E), diagonals (D), upper diagonal (UD) and lower diagonal (LD).  These portions or combinations of them corresponds to the cases above, somehow obvious after a close look: 

  • Ordering matters with replacement : E
  • Ordering matters without replacement: E-D
  • Ordering doesn't matter with replacement:   UD+D or LD+D   
  • Ordering doesn't matter without replacement: UD or LD
By choosing certain portions of the matrix we full-fill constraints of generating new sets. 

Nested Combinatorial Matrices ($n \ge 3$  cases) 

It is obvious that the above conjecture is valid for $n=2$. In the case of where $n \ge 3$ we would need to do more cartesian products over a combinatorial matrix selectively. The idea is as follows: we generate new cartesian products with $N$ items row-wise with removal of the base row at each level due to portions selected. Here is the sketch of the nested cartesian products: 

  1. We start with the case of $n=2$ and generate the matrix.  Retain the corresponding matrix portion.
  2. Then repeat portion for the next levels till exhaustion that items are size $n$: We do cartesian product of the given row to all items again to generate new combinatorial matrix of each element, except the elements deleted from the row due to portion selection, if deleted.
  3. At the end we obtain multiple matrices of possibly different sizes, then choose the portions again to form the resulting set of combinatorial items. 

Conclusion

We have discussed shortly how to leverage a matrix language in finding the resulting sets in core combinatorics counting operations. This could be used as a pedagogical tool, algorithmic development and bringing new relationships between matrices and combinatorial formulaThis is still a conjecture, but taught provoking one that connects matrix generation to combinatorial structures.  

Cite this as follows: 

 @misc{suezen24mcg, 
     title = { Matrix language conjecture for combinatorics: Combinatorial set generation via nested cartesian products}, 
     howpublished = {\url{https://memosisland.blogspot.com/2024/09/combinatorics-touch-of-linear-algebra.html}}
     author = {Mehmet Süzen},
     year = {2024}
}  




Saturday, 14 October 2023

Ising-Conway lattice-games: Understanding increasing entropy

Preamble

The entropy is probably one of the most difficult physical concepts to grasp. Its inception roots in efficiency of engines and foundational connection to multi-particle classical mechanics to thermodynamics,  i.e., kinetic theory to thermo-statistics. However, computing entropy for a physical systems is a difficult task, as most of the real-physical systems lacks the explicit formulation. Apart from advanced simulation techniques that invokes thermodynamical expressions, pedagogically accessible and physically plausible system is lacking in the literature. Addressing this, we explore here, recently proposed Ising-Conway Games.

Figure: Evolution of Ising-Conway
Game  (arXiv:2310.01458)
Ising-Conway Lattice-Games (ICG)

Ising-Lenz model is probably one of the landmark models in physics, remarkably provides beyond its idealised case of magnetic domains,  now impacts even quantum computational research. However, computing entropy of Ising-Lenz models are still quite difficult. On the other hand, Conway introduce a game with simple rules generating complexity in various orders, via simple dynamical rules. By analogy to these two modelling approach,  we recently introduce game like physical system of spins or lattice sides on a finite space with constraints. This gives a physically plausible dynamics but simpler dynamical evolution to generate the trajectories. Because vanilla Ising-Models requires more complicated Monte Carlo techniques.  Here is the configuration and dynamics of Ising-Conway games,

  1. $M$ sites as a fixed space.
  2. $N$ occupied sites, or 1s.  
  3. Configuration $C(M,N,t)=C(i)$ over time changes. But at $t=0$ all occupied sites live in at the corner.
  4. Configuration can only change to neighbouring sites if they are empty. This is closely related to spin-flip dynamics of the Ising Model. 
  5. No sites occupy the same lattice cell, Pauli exclusion
  6. Should be contained within $M$ Cell.
An example evolution is shown on the Figure.

Defining ensemble Entropy on ICG

Now we are in position to define the entropy for ICGs, which easy to grasp conceptually and computationally.  $C(i, t) \in \{1,0\}$ defines the states of  the game. We build an ensemble at a given time $t$ by defining a region enclosed by 1s.  Then dimensionality of the ensemble  $ k(t) = argmax[\mathbb{I}(C(i))] - argmin [\mathbb{I}(C(i)) ]$. Here,  $\mathbb{I}$ returns index of $1$s on the lattice. This ensemble closely track maximum entropy of the system at a given time. 

Conclusions

A new game-like system that helps us to understand entropy increase that has a plausible physical characteristics that one can easily simulate.

Further reading

  • H-theorem do-conjecture, M.Süzen, arXiv:2310.01458
  • Effective ergodicity in single-spin-flip dynamics, Mehmet Süzen. Phys. Rev. E 90, 03214 url
  • do_ensemble module provides such simulation via simulate_single_spin_flip_game  from the repo h-do-conjecture 

Please cite as 

 @misc{suezen23iclg, 
     title = {Ising-Conway lattice-games: Understanding increasing entropy}, 
     howpublished = {\url{https://memosisland.blogspot.com/2023/10/ising-conway-games-entropy-increase.html}}, 
     author = {Mehmet Süzen},
     year = {2023}
}  


Sunday, 1 November 2020

Gems of data science: 1, 2, infinity

 Summary

Figure: George Gamow's book. (Wikipedia)
Problem-solving is the core activity of data science using scientific principles and evidence. On our side, there is an irresistible urge to solve the most generic form of the problem. We do this almost always from programming to formulation of the problem. But, don't try to solve a generalised version of the problem. Solve it for N=1 if N is 1 in your setting, not for any integer: Save time and resources and try to embed this culture to your teams and management. Extent later when needed on demand.

Solving for N=1 is sufficient if it is the setting


This generalisation phenomenon manifests itself as an algorithmic design: From programming to problem formulation, strategy and policy setting. The core idea can be expressed as mapping, let's say the solution to a problem  is a function, mapping from one domain to a range 

$$ f : \mathbb{R} \to \mathbb{R} $$

Trying to solve for the most generic setting of the problem, namely multivariate setting

$$ f : \mathbb{R}^{m} \to \mathbb{R}^{n} $$

where $m, n$ are the integers generalising the problem.  

Conclusion

It is elegant to solve a generic version of a problem. But is it really needed? Does it reflect reality and would be used? If N=1 is sufficient, then try to implement that solution first before generalising the problem. An exception to this basic pattern would be if you don't have a solution at N=1 but once you move larger N that there is a solution: you might think this is absurd, but SVM works exactly in this setting by solving classification problem for disconnected regions.

Postscripts

  • The title intentionally omits three, while it is a reference to Physics's inability to solve, or rather a mathematical issue of the three-body problem.


Tuesday, 5 March 2013

Big-O notation: Care on asymptotic behaviour

Big-O notation conveys us information about how a given algorithm run time scales with respect to a given input size.  However, one has to be careful on the interpretation of this scaling (limiting) behaviour. First of all, when we talk about $O(N^2)$, it doesn't mean that it is valid for any input size. Let's  illustrate that asymptotic behaviour shown by big-O notation could be “misleading” for “smaller” input. Let’s take number of operation $T(N)=4 N^2 – 2N$. $T’(4)$ must be 4 times of $T’(2)$ if we consider as $T’(N) = N^2$. But if we use $T(N)$ instead, $T(4)$ would be $4.666667$ times of $T(2)$.

A good example in this direction  is N-body force calculations (or here). An $O(N^2)$ algorithm performs better compare to $O(NlogN)$ ones up to a break-even point even 50K particles. This is a good paper explaining this;
Pringle, Gavin J. "Comparison of an O (N) and an O (N log N) N-body solver."
[pdf].
(c) Copyright 2008-2024 Mehmet Suzen (suzen at acm dot org)

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License