Docugami uses NVIDIA Inception to advance market-leading Generative AI for Business Documents
Docugami has joined NVIDIA Inception, a program designed to nurture startups revolutionizing a wide range of industries with AI technology advancements.
Specifically, NVIDIA Inception’s advanced technology assistance in Generative AI training and inference has recently helped Docugami expand and refine its Docugami Foundation Model (DFM) for Business Documents to outperform OpenAI's GPT-4 and Cohere's Command model on key business document understanding tasks.
Using NVIDIA’s deep learning hardware and software stack, we trained a new version of our proprietary, multimodal Docugami Business Document Foundation Model on millions of business documents, resulting in a more powerful Generative AI engine that individual businesses can use to unlock the critical information currently trapped inside their unique documents.
Figure 1: Benchmark Results for CSL (Small Chunks)
Figure 2: Benchmark Results for CSL (Large Chunks)
Specifically, DFM outperforms on the more stringent comparisons, i.e., exact match and similarity > 0.8 (which can be thought of as "almost exact match" in terms of semantic similarity). This means that Docugami’s output matches human labels exactly, or very nearly so, more often than the output of the other models does.
For completeness and context, we included some other, less stringent metrics used in the industry, for example a token-wise F1 match. These looser matches are less relevant in a business setting, where accuracy and completeness are critical. We previously released the code for these measurements on GitHub, and invite community feedback.
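To make the difference between the stringent and the looser metrics concrete, here is a minimal Python sketch of exact match and token-wise F1. It is an illustration only, not the benchmark code released on GitHub: the function names and the normalization (lowercasing and whitespace tokenization) are assumptions made for this example. The semantic similarity metric is left out because it depends on an embedding model.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    # Assumed normalization for this sketch: lowercase, split on whitespace.
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> bool:
    # Stringent metric: the normalized token sequences must be identical.
    return normalize(prediction) == normalize(reference)


def token_f1(prediction: str, reference: str) -> float:
    # Looser metric: harmonic mean of token-level precision and recall.
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For a hypothetical human label "Initial Renewal Term" and a model output "Renewal Term", exact match fails, yet token-wise F1 still gives substantial credit for the partial overlap. This is why the looser metric can overstate quality when a label must be complete.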
Going forward, NVIDIA's support will help Docugami train and deploy Large Language Models (LLMs) in production with efficient GPU pipelines. The NVIDIA toolkits available through NVIDIA LaunchPad are of particular interest to Docugami, including the NeMo and Megatron frameworks for training and fine-tuning LLMs, and Triton for maximizing inference efficiency. Finally, we are actively evaluating NeMo Guardrails as a way to add safety guardrails to our LLMs.
As the leading provider of Generative AI for Business Documents, Docugami collaborates with many industry leaders. In addition to NVIDIA, Docugami has recently announced technical contributions to LangChain, LlamaIndex, and Hugging Face, with more collaborations in the pipeline.
Docugami’s business document foundation model, trained on millions of business documents, enables frontline business users to surface and repurpose all the high-value data inside their own unique business documents with minimal effort. Docugami’s powerful AI allows business users and managers to generate highly relevant business reports, abstracts, data feeds and new documents automatically, using their own business documents and reflecting the unique norms and nuances of their company.
Docugami is in the market today with customers in a variety of industry segments, including Commercial Insurance, Commercial Real Estate, Technology, a wide range of Professional Services, and more.
This is a transformative moment for businesses: Generative AI has the potential to change how organizations access and utilize all of the vital information currently inaccessible in their long-form documents. This problem has frustrated companies of all sizes for decades, and we finally have the ability to solve it by combining a wide array of AI techniques, declarative markup, and other technologies.
The NVIDIA Inception program helps startups like Docugami during critical stages of product development, prototyping and deployment. Every NVIDIA Inception member gets a custom set of ongoing benefits, such as NVIDIA Deep Learning Institute credits, marketing support, and technology assistance, which provide startups with the fundamental tools to help them grow.
We are committed to using Generative AI to turn documents into data, automatically, for frontline business users, so the tools, resources, and technical insights provided by NVIDIA Inception will greatly enhance and accelerate our work.