RE: LeoThread 2024-11-11 05:49

In tandem, Berkeley’s Sky Computing Lab also birthed vLLM in 2022, spearheaded by researchers Zhuohan Li, Woosuk Kwon, and Simon Mo, who started the project after building a system to distribute complex inference workloads across multiple GPUs more efficiently. vLLM leans on an attention algorithm dubbed PagedAttention, which borrows the idea of virtual-memory paging from operating systems: it stores the attention key-value cache in fixed-size blocks rather than one contiguous allocation, cutting memory fragmentation and waste. The engine is already being used by developers at companies such as AWS, Cloudflare, and Nvidia.
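
For a sense of what using vLLM looks like in practice, here is a minimal sketch based on its offline-inference API; the prompt and model name are just illustrative examples, and PagedAttention's block-based KV-cache management happens under the hood with no extra configuration.

```python
# Minimal vLLM offline-inference sketch (model name is only an example).
from vllm import LLM, SamplingParams

prompts = ["The capital of France is"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=32)

# LLM() loads the model and sets up the paged KV cache on the available GPUs.
llm = LLM(model="facebook/opt-125m")

# generate() batches the prompts and returns completed outputs.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.prompt, output.outputs[0].text)
```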