Memo's Island: A new kind of AutoML: On the fallacy of many-shot in-context learning with LLMs and PLMs

An extension of this post appears as a short article with a conjecture:

Mehmet Süzen. In-context learning as a new kind of symbolic-AutoML: Lyapunov conjecture for CoTs. 2024 HAL France

Preamble

Figure: A graph path (Wikipedia)

With the widespread use of pre-trained large language models (PLMs/LLMs), it is now possible to direct them to perform data analysis, generate predictions, or write code for very specialised tasks without any further training. The primary approach to such automated analysis is many-shot in-context learning. This short post points out that pushing the analysis into a meta-model does not remove the analyst, despite some claims to the contrary. This is a common generalisation fallacy in many AI systems. Humans are still in the loop, and presenting this automation as if humans were removed entirely is not a fair representation of the capabilities and advantages these models bring.

How to direct PLMs to make new predictions without training?

This is quite an attractive premise. Using a foundation model, we could extend its predictive ability beyond its training data by simply inducing a memory in its input, the so-called Chain-of-Thought (CoT). We can then deploy the model for an automated task by supplying the CoT before its first prompt. This is a new kind of AutoML approach in which supervised learning takes on a new look: the training effort is pushed into building chain-of-thought datasets. Building CoTs is a meta-modelling task and requires domain knowledge.
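As a rough illustration of "inducing a memory in the input", the sketch below assembles human-curated worked examples into a single prompt that would be placed before the actual query. The `Shot` structure, the `build_many_shot_prompt` helper, and the `Q:/Reasoning:/A:` layout are assumptions made for this example only; they are not taken from any particular library or from the cited papers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    """One human-curated worked example (hypothetical structure)."""
    question: str
    reasoning: str
    answer: str


def build_many_shot_prompt(shots: list[Shot], query: str) -> str:
    """Prepend worked examples to the query so a frozen model sees them as context."""
    blocks = [
        f"Q: {s.question}\nReasoning: {s.reasoning}\nA: {s.answer}"
        for s in shots
    ]
    # The final block leaves the reasoning open for the model to complete.
    blocks.append(f"Q: {query}\nReasoning:")
    return "\n\n".join(blocks)


# Illustrative examples written by a person with domain knowledge.
shots = [
    Shot("Is 12 even?", "12 = 2 * 6, so it is divisible by 2.", "yes"),
    Shot("Is 7 even?", "7 = 2 * 3 + 1, so it leaves remainder 1.", "no"),
]
prompt = build_many_shot_prompt(shots, "Is 10 even?")
print(prompt)
```

Note that no model weights are touched here: all of the "training" effort sits in choosing and writing the examples, which is the meta-modelling step described above.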

Did we really remove the data analyst, software developer or the practitioner from the loop?

The short answer is no. What we have is a new way of performing a specialised task, i.e., modelling is moved into a meta-modelling stage. That is, the memory induced by the CoT is designed by an experienced human, and it will break if the task or input data deviates even slightly. The usual culprit is in play here: reliability is problematic. Even though this is quite a promising development with the potential to be a game changer in how we practise machine learning, error rates are still quite high; see Agarwal-Singh et al. (2024).
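One way a practitioner stays in the loop is by checking the prompted model against held-out cases, including inputs that deviate from the curated examples. The sketch below is a minimal version of such a check. The `toy_predictor` function is a hypothetical stand-in for a prompted model, not a real LLM call, and the data are toy values chosen for illustration, not measurements.

```python
from typing import Callable, Sequence


def error_rate(
    predict: Callable[[str], str],
    held_out: Sequence[tuple[str, str]],
) -> float:
    """Fraction of held-out (input, expected) pairs the predictor gets wrong."""
    if not held_out:
        raise ValueError("held_out must not be empty")
    wrong = sum(1 for x, expected in held_out if predict(x).strip() != expected)
    return wrong / len(held_out)


def toy_predictor(text: str) -> str:
    """Hypothetical stand-in for a prompted model.

    It only handles the decimal format its curated examples used.
    """
    return "yes" if text.endswith(("0", "2", "4", "6", "8")) else "no"


# Inputs in the same format as the curated examples.
in_distribution = [("12", "yes"), ("7", "no"), ("10", "yes"), ("3", "no")]
# Same task ("is this number even?"), slightly different input formats.
shifted = [("twelve", "yes"), ("7", "no"), ("1e1", "yes"), ("0x10", "yes")]

print(error_rate(toy_predictor, in_distribution))
print(error_rate(toy_predictor, shifted))
```

The stand-in is perfect on inputs that look like its examples and fails on half of the reformatted ones, which mirrors the brittleness argued above: deciding which deviations matter, and testing for them, remains a human task.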

Conclusion

In this short exposition, we argue that many-shot learning and chain-of-thought learning for foundation models are actually a new kind of AutoML tool. AutoML doesn't imply that the human expert is removed from the picture completely. However, it greatly assists the scientist, analyst, programmer, or practitioner in automating some tasks, as a meta-programming tool. It will also help less-experienced colleagues become more productive. These tools indeed improve our computational toolboxes, specifically as a new AutoML tool.

Further reading

On the fallacy of replacing physical laws with machine-learned inference systems, M. Süzen (2021)
AutoML Book, Hutter et al. (2023)
How does in-context learning work? A framework for understanding the differences from traditional supervised learning, Xie & Min (2022)
Many-Shot In-Context Learning, Agarwal-Singh et al. (2024)

Please cite as follows:

@misc{suezen24pmlauto,
  title = {A new kind of AutoML: On the fallacy of many-shot in-context learning with LLMs and PLMs},
  howpublished = {\url{https://memosisland.blogspot.com/2024/05/llm-analysis-fallacy.html}},
  author = {Mehmet Süzen},
  year = {2024}
}

Memo's Island

Saturday, 1 June 2024
