Tuesday, 22 April 2025

Numerical stability showcase: Ranking with SoftMax or Boltzmann factor

Preamble 

Image: Babylonian tablet for computation (Wikipedia)
Probably one of the most important aspects of computational work in quantitative fields, such as physics and the data sciences, is the stability of numerical computations. It means that, for given inputs, the outputs should not wildly blow up to large numbers, nor should they distort derived results such as rankings based on scores, one of the most common computations in data science tasks like classification or clustering. In this short post, we provide a striking example of how SoftMax produces wrong results if it is applied naively.

SoftMax: Normalisation with the Boltzmann factor

SoftMax is originally more a physics concept than a data science one; its most common usage in data science is for ranking. Given a vector with components $x_{i}$, the softmax is computed with the following expression: $$\mathrm{softmax}(x_{i}) = \frac{\exp(x_{i})}{\sum_{j} \exp(x_{j})}.$$ This originates from statistical physics, i.e., the Boltzmann factor.
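As a minimal sketch of this expression, one can implement it naively in plain Python, before any stability fix is applied; the helper name naive_softmax is ours, not a standard one:

import math

def naive_softmax(x):
    """Naive softmax: exp(x_i) / sum_j exp(x_j), with no stability fix."""
    exps = [math.exp(v) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

print(naive_softmax([1.0, 2.0, 3.0]))  # [0.0900..., 0.2447..., 0.6652...]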

Source of Numerical Instability

The exponential function in the normalising sum creates a numerical instability when one entry of the vector deviates significantly from the others: the large entry overflows the floating-point range, or dominates the sum so completely that all other entries of the softmax output underflow to zero.
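To see this concretely, note that float32 cannot represent numbers beyond roughly 3.4e38, so exp(170) already saturates to infinity; a sketch, assuming PyTorch's default float32 dtype:

import torch

x = torch.tensor([1.4, 1.5, 1.6, 170.0])  # default dtype is float32
exps = torch.exp(x)
print(exps)               # tensor([4.0552, 4.4817, 4.9530, inf]): exp(170) overflows
print(exps / exps.sum())  # tensor([0., 0., 0., nan]): naive normalisation breaks down

Library implementations avoid the infinity by subtracting the maximum score first, but the small entries then underflow to zero, which produces the ties shown in the example below.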

Example Instability: Ranking with SoftMax

Let's say we have the following scores 

scores = [1.4, 1.5, 1.6, 170]

for teams A, B, C, D on some metric, and we want to turn these into a probabilistic interpretation with SoftMax. The output reads [0., 0., 0., 1.]: we see that D comes out on top, but A, B, C are tied at zero.
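This can be verified directly with PyTorch's built-in softmax; a quick sketch:

import torch

scores = torch.tensor([1.4, 1.5, 1.6, 170.0])
print(torch.softmax(scores, dim=0))  # tensor([0., 0., 0., 1.])

After the internal max subtraction, the remaining exponentials, e.g. exp(1.4 - 170) ≈ 6e-74, are far below what float32 can represent and underflow to exactly zero, hence the ties.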

LogSoftMax

We can rectify this instability by using LogSoftMax, which reads $$\log \mathrm{softmax}(x_{i}) = \log \exp(x_{i}) - \log\Big(\sum_{j} \exp(x_{j})\Big) = x_{i} - \log \sum_{j} \exp(x_{j}).$$ For our scores this yields [-168.6000, -168.5000, -168.4000, 0.0000], so we can induce a consecutive ranking without ties, as follows: D, C, B, A.
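Reproducing this with PyTorch's built-in log_softmax, mirroring the appendix method:

import torch

scores = torch.tensor([1.4, 1.5, 1.6, 170.0])
log_probs = torch.log_softmax(scores, dim=0)
print(log_probs)  # tensor([-168.6000, -168.5000, -168.4000, 0.0000])
# Sorting in descending order recovers the full ranking: D, C, B, A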

Conclusion

There is a similar practice in statistics for likelihood computations, as Gaussians bring in exponentials repeatedly: taking the logarithm of the quantities involved stabilises the numerical issues caused by repeated exponentiation. This shows the importance of watching for numerical pitfalls in the data sciences.
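As an illustration of that practice, here is a minimal sketch with a made-up i.i.d. standard-normal sample (not from the original post): the product of the Gaussian densities underflows even in float64, while the sum of their logarithms stays perfectly representable.

import math

xs = [2.0] * 500  # hypothetical i.i.d. sample
pdf = lambda v: math.exp(-v * v / 2) / math.sqrt(2 * math.pi)  # standard normal density

likelihood = 1.0
for v in xs:
    likelihood *= pdf(v)  # product of 500 small numbers
print(likelihood)  # 0.0: underflows in float64

log_likelihood = sum(-v * v / 2 - 0.5 * math.log(2 * math.pi) for v in xs)
print(log_likelihood)  # approx. -1459.47: finite and stable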

Cite as follows

 @misc{suezen25softmax, 
     title = {Numerical stability showcase: Ranking with SoftMax or Boltzmann factor}, 
     howpublished = {\url{https://memosisland.blogspot.com/2025/04/softmax-numerical-stability.html}}, 
     author = {Mehmet Süzen},
     year = {2025}
}  

Appendix: Python method 

A Python method, using PyTorch, that computes the softmax examples from the main text.

import torch

List = list
Tensor = torch.Tensor  # the tensor class, for type annotations

def get_softmax(scores: List, log: bool = False) -> Tensor:
    """
    Compute the softmax of a list of scores.

    Set log=True for the numerically stabler LogSoftMax.
    """
    scores = torch.tensor(scores)
    if log:
        scores = torch.log_softmax(scores, dim=0)
    else:
        scores = torch.softmax(scores, dim=0)
    return scores
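
For instance, applying the method to the scores from the main text:

scores = [1.4, 1.5, 1.6, 170]
print(get_softmax(scores))            # tensor([0., 0., 0., 1.])
print(get_softmax(scores, log=True))  # tensor([-168.6000, -168.5000, -168.4000, 0.0000])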



