RE: LeoThread 2024-10-25 09:33

in LeoFinance · 4 months ago

New AI Algorithm Can Reduce LLM Energy Usage by 80-95%

The new Linear-complexity Multiplication (L-Mul) algorithm claims to reduce energy costs by 95% for element-wise tensor multiplications and 80% for dot products in large language models, while maintaining or even improving precision compared to 8-bit floating-point operations.

#newsonleo #llms #energy #algorithm #technology


Solution in this Paper

– Approximates floating-point multiplication using integer addition (see the sketch after this list)
– Linear O(m) complexity in the number of mantissa bits m, vs. O(m^2) for standard floating-point multiplication
– Replaces tensor multiplications in attention mechanisms and linear transformations
– Implements an L-Mul-based attention mechanism in transformer models
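
The core trick can be sketched in a few lines of NumPy. This is a hypothetical float32 illustration, not the paper's kernel (which targets low-bit fp8-style formats and adds an offset term); zeros, infinities, and exponent overflow are ignored here:

```python
import numpy as np

def lmul_approx(a, b):
    """Approximate elementwise float multiply with one integer add.

    Writing x = (1 + m) * 2^e, the exact product
    (1 + m_a)(1 + m_b) * 2^(e_a + e_b) is approximated by
    (1 + m_a + m_b) * 2^(e_a + e_b), dropping the m_a * m_b cross
    term. In the IEEE-754 bit layout this reduces to adding the two
    bit patterns and subtracting a constant bias -- no mantissa
    multiplier needed.
    """
    ia = np.ascontiguousarray(a, dtype=np.float32).view(np.uint32).astype(np.int64)
    ib = np.ascontiguousarray(b, dtype=np.float32).view(np.uint32).astype(np.int64)
    sign = (ia ^ ib) & 0x80000000            # sign of the product
    # 0x3F800000 is the bit pattern of 1.0f; subtracting it once
    # removes the doubled exponent bias introduced by the addition.
    mag = (ia & 0x7FFFFFFF) + (ib & 0x7FFFFFFF) - 0x3F800000
    mag = np.maximum(mag, 0)                 # clamp underflow toward zero
    return (sign | mag).astype(np.uint32).view(np.float32)
```

For example, on 1.5 × 2.25 this returns (1 + 0.5 + 0.125) · 2¹ = 3.25 against the exact 3.375; the paper additionally adds a small 2^(-l(m)) offset to the mantissa sum to shrink that gap on average, which is omitted here for brevity.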

Key Insights from this Paper

– L-Mul achieves higher precision than 8-bit float operations with less computation (see the rough check after this list)
– Potential 95% energy reduction for element-wise tensor multiplications
– 80% energy reduction for dot products compared to 8-bit float operations
– Can be integrated into existing models without additional training
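
As a rough, hypothetical check of that precision claim (my own illustration, not the paper's benchmark), one can compare the sketch above against multiplying inputs truncated to a 3-bit mantissa, a crude stand-in for fp8 e5m2 rounding (real fp8 conversion also rescales and clamps the exponent range, ignored here):

```python
def truncate_mantissa(x, bits=3):
    # Keep only the top `bits` mantissa bits of a float32 -- a crude
    # proxy for fp8 quantization error on the inputs.
    mask = np.uint32((0xFFFFFFFF << (23 - bits)) & 0xFFFFFFFF)
    return (np.ascontiguousarray(x, dtype=np.float32).view(np.uint32) & mask).view(np.float32)

rng = np.random.default_rng(0)
a = rng.uniform(0.5, 2.0, 10_000).astype(np.float32)
b = rng.uniform(0.5, 2.0, 10_000).astype(np.float32)
exact = a * b
err_lmul = np.mean(np.abs(lmul_approx(a, b) - exact) / exact)
err_trunc = np.mean(np.abs(truncate_mantissa(a) * truncate_mantissa(b) - exact) / exact)
print(f"L-Mul mean relative error:        {err_lmul:.4f}")
print(f"3-bit truncation mean rel. error: {err_trunc:.4f}")
```

Either way, the computational point stands independently of the exact error numbers: the L-Mul path costs one integer addition per multiplication, while even fp8 hardware still runs a mantissa multiplier.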

Results

– L-Mul with 4-bit mantissa: comparable precision to float8 e4m3
– L-Mul with 3-bit mantissa: outperforms float8 e5m2
– Attention mechanism replacement: 0.07% average performance loss across NLP tasks (see the sketch after this list)
– Vision tasks: 0.12% accuracy improvement
– Full model fine-tuning: equivalent results to float8 e4m3 accumulation precision
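
To make the "no additional training" integration concrete, here is a hypothetical sketch (names and shapes are my own, not the paper's code) of swapping the multiplications inside an attention score computation for the approximation above, while keeping the accumulation exact:

```python
def lmul_attention_scores(Q, K):
    """Drop-in replacement for Q @ K.T / sqrt(d): every scalar
    product q_i * k_j uses the integer-add approximation, while the
    sum over the feature dimension and the scaling stay exact."""
    d = Q.shape[-1]
    # Pairwise elementwise products, shape (len_q, len_k, d).
    prod = lmul_approx(Q[:, None, :], K[None, :, :])
    return prod.sum(axis=-1) / np.sqrt(d)

# Toy example: 4 tokens, 8-dim heads; compare with the exact scores.
rng = np.random.default_rng(1)
Q = rng.standard_normal((4, 8)).astype(np.float32)
K = rng.standard_normal((4, 8)).astype(np.float32)
print(np.abs(lmul_attention_scores(Q, K) - Q @ K.T / np.sqrt(8)).max())
```

Because the approximation tracks exact multiplication closely, the swap needs no retraining, which is consistent with the ~0.07% average loss reported above.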