You are viewing a single comment's thread from:

RE: LeoThread 2024-10-22 21:22

in LeoFinance3 months ago

Instead of writing captions, the team asked annotators to record 60- to 90-second verbal descriptions answering a list of questions about each image. They then transcribed the descriptions—which often stretched across several pages—and used other large language models to clean up, crunch down, and standardize them. They found that this simple switch, from written to verbal annotation, yielded far more detail with little extra effort.