New Liquid AI Model Puts Non-Transformer Architecture to the Test

In this episode, the host discusses a brand-new family of models from Liquid AI that departs from the traditional Transformer architecture. These generative AI models, called the Liquid Foundation Models, come in three sizes: 1 billion, 3 billion, and 40 billion parameters.

Impressive Benchmark Performance

The host shares benchmark results showing the strong performance of these Liquid Foundation Models compared to other prominent language models such as LLaMA. The 1.3-billion-parameter model outperforms LLaMA 3.2 on the MMLU-Pro benchmark, while the 40-billion-parameter Mixture-of-Experts model beats out an even larger 57-billion-parameter model.

Memory Efficiency Advantage

One of the key differentiators of the Liquid Foundation Models is their exceptional memory efficiency. The host demonstrates how the memory footprint of these models grows much more gradually as the output length increases compared to other models, such as Apple's 3-billion-parameter offering, likely because the architecture avoids the key-value cache that Transformer models must keep extending during generation. This memory efficiency could make the Liquid models well suited for edge deployments.
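To make that scaling claim concrete, here is a rough back-of-the-envelope sketch in Python (not from the episode; the layer counts, head dimensions, state size, and 16-bit precision below are illustrative assumptions, not Liquid AI's published figures) contrasting the memory a Transformer-style key-value cache needs as output length grows with the constant memory of a fixed-size recurrent state:

```python
# Back-of-the-envelope comparison (illustrative numbers only): a Transformer's
# key-value cache grows with every generated token, while a fixed-size
# recurrent state does not depend on output length at all.

def transformer_kv_cache_mb(tokens, layers=32, heads=32, head_dim=128,
                            bytes_per_value=2):
    """Memory for the K and V tensors across all layers, in megabytes."""
    # 2 tensors (K and V) per layer, each of shape (tokens, heads * head_dim)
    return 2 * layers * tokens * heads * head_dim * bytes_per_value / 1e6


def fixed_state_mb(tokens, layers=32, state_dim=4096, bytes_per_value=2):
    """A constant-size per-layer state; `tokens` is intentionally unused."""
    return layers * state_dim * bytes_per_value / 1e6


for n in (1_000, 8_000, 32_000):
    print(f"{n:>6} tokens: KV cache ~ {transformer_kv_cache_mb(n):8.0f} MB, "
          f"fixed state ~ {fixed_state_mb(n):.2f} MB")
```

Under these assumed dimensions, the cache alone climbs from roughly half a gigabyte at 1,000 tokens into the tens of gigabytes at long outputs, while the fixed state stays flat. That is the general shape of the gap the host points to when arguing these models suit edge hardware.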

Testing the Models

The host then puts the Liquid Foundation Models to the test, running them through a series of tasks ranging from coding a Tetris game to solving logic problems. The results are mixed: the models perform well on some tasks, such as an envelope-size calculation, but struggle on others, such as generating working Tetris code and answering basic comprehension questions.

Challenges for Non-Transformer Architectures

The host expresses some disappointment in the models' performance, noting that non-Transformer architectures often seem to underperform Transformer-based models on these kinds of tests. The host remains hopeful that a truly innovative non-Transformer architecture will eventually emerge to challenge Transformer dominance, but for now, these Liquid AI models do not appear to be that breakthrough.

Overall, this episode provides an in-depth look at a novel AI architecture and the ongoing efforts to develop high-performing models outside of the Transformer paradigm. While the Liquid Foundation Models show promise in certain areas, the host's testing highlights the challenges still faced by alternative approaches in matching the capabilities of leading Transformer-based language models.