RE: LeoThread 2024-09-11 11:59


taskmaster4450le (80)in LeoFinance • 5 months ago

Built on Nemo 12B, one of Mistral’s text models, Pixtral 12B can answer questions about an arbitrary number of images of arbitrary size, supplied either as URLs or as images encoded in base64, the binary-to-text encoding scheme. Like other multimodal models such as Anthropic’s Claude family and OpenAI’s GPT-4o, Pixtral 12B should, at least in theory, be able to perform tasks like captioning images and counting the objects in a photo.
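
For anyone curious what "images encoded using base64" looks like in practice, here is a rough Python sketch using only the standard library. It shows the encoding step only, not Mistral's actual API: the helper names `image_to_data_url` and `data_url_to_bytes` are made up for illustration, and wrapping the result in a `data:` URL is an assumption about how such an image might be passed along, not something the quoted text confirms.

```python
import base64


def image_to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    # b64encode returns bytes; decode to ASCII so it can sit inside a text payload.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def data_url_to_bytes(data_url: str) -> bytes:
    """Recover the original bytes from a data URL (hypothetical helper)."""
    # Everything after the first comma is the base64 payload.
    _header, encoded = data_url.split(",", 1)
    return base64.b64decode(encoded)


# Stand-in for real image data: the 8-byte PNG file signature.
sample_bytes = b"\x89PNG\r\n\x1a\n"
sample_url = image_to_data_url(sample_bytes)
print(sample_url)
```

The point is just that base64 turns binary image data into plain text, which is why it can travel in the same request as the question. Base64 text runs about a third larger than the original bytes, so passing a URL is usually the lighter option when the image is already hosted somewhere.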
