Saturday, 1 June 2024

A new kind of AutoML: On the fallacy of many-shot in-context learning with LLMs and PLMs

Preamble 

A graph path (Wikipedia)

With the widespread use of pre-trained large language models (PLMs/LLMs), it is now possible to direct them to perform data analysis, generate predictions, or write code for very specialised tasks without training. The primary approach to such automated analysis is many-shot learning. This short post points out that pushing the analysis into a meta model doesn't remove the analyst, contrary to some recent claims. This is a common generalisation fallacy in many AI systems. Humans are still in the loop, and presenting certain automation as if humans were removed entirely is not a fair representation of the capabilities and advantages these models bring.

How to direct PLMs to make new predictions without training?

This is quite an attractive premise. Using a foundation model, we could extend its predictive ability beyond its training data simply by inducing a memory in its input, the so-called Chain-of-Thought (CoT). We can then deploy the model for an automated task by supplying the CoT before its first prompt. This is a new kind of AutoML approach, in which supervised learning takes on a new look: the training effort is pushed into building chain-of-thought datasets. Building CoTs appears to be a meta-modelling task and requires domain knowledge.
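To make the idea concrete, here is a minimal sketch of how worked examples might be placed ahead of a new question. It calls no particular model or library; the names `WorkedExample` and `build_many_shot_prompt`, the `Q:/Reasoning:/A:` layout, and the toy parity task are all assumptions made for illustration, not a prescribed format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkedExample:
    """One human-written demonstration (hypothetical structure)."""
    question: str
    reasoning: str
    answer: str


def build_many_shot_prompt(examples: list[WorkedExample], new_question: str) -> str:
    """Concatenate worked examples ahead of the new question.

    The model is not retrained: whatever it picks up about the task
    comes from the examples placed in its input.
    """
    blocks = [
        f"Q: {ex.question}\nReasoning: {ex.reasoning}\nA: {ex.answer}"
        for ex in examples
    ]
    # The last block is left open so the model continues with its own reasoning.
    blocks.append(f"Q: {new_question}\nReasoning:")
    return "\n\n".join(blocks)


# Toy demonstrations, written by a person who knows the task.
examples = [
    WorkedExample("Is 14 even?", "14 = 2 * 7, so it is divisible by 2.", "yes"),
    WorkedExample("Is 9 even?", "9 = 2 * 4 + 1, so it leaves remainder 1.", "no"),
]
prompt = build_many_shot_prompt(examples, "Is 22 even?")
print(prompt)
```

Note where the effort went: someone with domain knowledge still had to write the examples and choose the layout, which is the meta-modelling work described above.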

Did we really remove the data analyst, software developer or the practitioner from the loop? 

The short answer is no. What we have is a new way of performing a specialised task: the modelling is moved into a meta-modelling stage. That is, the memory induced by the CoT is designed by an experienced human, and it will break if the task or the input data deviates even slightly. The usual culprit is in play here: reliability is problematic. Even though this is quite a promising development with the potential to be a game changer in how we practise machine learning, error rates are still quite high; see Agarwal-Singh et al. (2024).
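One way a practitioner might keep an eye on this brittleness is to measure the error rate on labelled inputs, both in the phrasing the examples were designed for and in a slightly shifted one. The sketch below is only an illustration: `toy_predict` is a hypothetical stand-in for a prompted model, and the exact-match scoring is an assumption, not how any published benchmark is evaluated.

```python
from typing import Callable, Sequence


def error_rate(predict: Callable[[str], str],
               inputs: Sequence[str],
               expected: Sequence[str]) -> float:
    """Fraction of inputs whose prediction does not match the expected label."""
    if len(inputs) != len(expected):
        raise ValueError("inputs and expected must have the same length")
    if not inputs:
        raise ValueError("need at least one labelled input")
    wrong = sum(
        predict(x).strip().lower() != y.strip().lower()
        for x, y in zip(inputs, expected)
    )
    return wrong / len(inputs)


def toy_predict(question: str) -> str:
    """Hypothetical stand-in for a prompted model.

    It only handles the exact phrasing it was designed for,
    mimicking a setup that breaks when the input deviates.
    """
    if question.startswith("Is ") and question.endswith(" even?"):
        number = int(question[len("Is "):-len(" even?")])
        return "yes" if number % 2 == 0 else "no"
    return "unknown"


labels = ["yes", "no"]
designed_for = ["Is 4 even?", "Is 7 even?"]
shifted = ["Tell me whether 4 is even.", "Is 7 even?"]

print("error rate, designed phrasing:", error_rate(toy_predict, designed_for, labels))
print("error rate, shifted phrasing:", error_rate(toy_predict, shifted, labels))
```

The labelled inputs and the decision about what counts as an acceptable error rate both come from a person, which is another place the human stays in the loop.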

Conclusion 

In this short exposition, we argue that many-shot learning, or chain-of-thought learning, for foundation models is actually a new kind of AutoML tool. AutoML doesn't imply that the human expert is removed from the picture completely. However, it greatly assists the scientist, analyst, programmer or practitioner in automating some tasks, as a meta-programming tool. It will also help less-experienced colleagues to become a bit more productive. These tools do improve our computational toolboxes, specifically as a new AutoML tool.

Further reading

Please cite as follows:

 @misc{suezen24pmlauto, 
     title = {A new kind of AutoML: On the fallacy of many-shot in-context learning with LLMs and PLMs}, 
     howpublished = {\url{https://memosisland.blogspot.com/2024/05/llm-analysis-fallacy.html}},
     author = {Mehmet Süzen},
     year = {2024}
}  
