Small but mighty: H2O.ai’s new AI models challenge tech giants in document analysis
H2O.ai, a provider of open-source AI platforms, announced today two new vision-language models designed to improve document analysis and optical character recognition (OCR) tasks.
The models, named H2OVL Mississippi-2B and H2OVL Mississippi-0.8B, show competitive performance against much larger models from major tech companies, potentially offering a more efficient solution for businesses dealing with document-heavy workflows.
The H2OVL Mississippi-0.8B model, with only 800 million parameters, surpassed all other models, including those with billions more parameters, on the OCRBench Text Recognition task. Meanwhile, the 2-billion parameter H2OVL Mississippi-2B model demonstrated strong general performance across a range of vision-language benchmarks.
“We’ve designed H2OVL Mississippi models to be a high-performance yet cost-effective solution, bringing AI-powered OCR, visual understanding, and Document AI to businesses,” Sri Ambati, CEO and Founder of H2O.ai, said in an exclusive interview with VentureBeat. “By combining advanced multimodal AI with efficiency, H2OVL Mississippi delivers precise, scalable Document AI solutions across a range of industries.”
The release of these models marks a significant step in H2O.ai’s strategy to make AI technology more accessible. By making the models freely available on Hugging Face, a popular platform for sharing machine learning models, H2O.ai is allowing developers and businesses to modify and adapt the models for specific document AI needs.