Glossary: AI for Business Documents
Welcome to Docugami’s glossary of terms related to AI for Business Documents, which aims to provide clear and concise explanations of a wide range of concepts in the rapidly expanding field of artificial intelligence as it relates to business documents.
Definitions of AI Business Uses
AI involves the creation of algorithms and models that allow machines to learn from experience, improve their performance, and make predictions or decisions based on data. AI has become increasingly important in many industries, as it can assist workers and managers with a wide range of tasks, automate many mundane processes, reduce costs, and improve outcomes.
As AI becomes an increasingly prevalent part of our business and personal lives, it's important to have a solid understanding of the terminology and vocabulary that surrounds it.
From machine learning to natural language processing to neural networks, this glossary is a starting point for anyone looking to gain a better understanding of AI and its applications.
A
Algorithm: A set of instructions on how to accomplish a task, organize information, or solve a problem.
Application Programming Interface (API): A set of software code that provides clearly defined methods for utilizing a software component.
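For illustration, the minimal sketch below shows a program using an API over HTTP. The endpoint URL, request fields, and response fields are hypothetical placeholders, not any particular vendor's interface.

```python
# A minimal sketch of calling a hypothetical document-summarization API.
# The URL, headers, and response fields below are illustrative placeholders.
import requests

def summarize_document(text: str, api_key: str) -> str:
    """Send document text to a (hypothetical) summarization endpoint."""
    response = requests.post(
        "https://api.example.com/v1/summarize",      # placeholder URL
        headers={"Authorization": f"Bearer {api_key}"},
        json={"text": text},
        timeout=30,
    )
    response.raise_for_status()                       # surface HTTP errors
    return response.json()["summary"]                 # assumed response field
```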
Artificial General Intelligence (AGI): A term denoting a level of Artificial Intelligence at which an AI system can perform many different tasks as well as humans can.
Artificial intelligence (AI): A branch of computer science that aims to create software that can perform tasks that would normally require human intelligence, such as recognizing images, understanding text or spoken language, and making decisions.
Assisted authoring: The use of technology to support or enhance the process of creating or editing written content.
B
Big data: A term used to describe artificial intelligence models that are trained on huge information sets, often totaling millions of pages, to develop predictive models of the words, terms, or outcomes most commonly associated with other words, phrases, or information.
C
Clause extraction: Identifying and gathering specific language from a contract.
Compliance checking: Using AI to check whether a contract adheres to relevant laws and regulations.
Contract analysis: The process of examining the content and structure of a contract to extract information or prepare it for further processing.
Contract analytics: Using AI to discover, interpret, and organize contract data in order to extract insights and trends.
Contract authoring: The use of AI technology to assist or automate the process of creating legally binding agreements between parties.
Contract compliance: The process of ensuring that a contract is being executed in accordance with its terms and conditions, as well as any relevant laws and regulations. This can include monitoring and verifying that the parties involved are fulfilling their obligations.
Contract generation: The process of automatically creating a contract from a set of data or a template.
Contract intelligence: The application of artificial intelligence techniques to extract insights from contracts and to automate contract analysis, generation, management, and recognition.
Contract lifecycle management (CLM): The process of overseeing, and in many cases automating, the various stages of a contract, from the initial drafting and negotiation stages, through the execution and performance phases, to the eventual expiration, termination, or renewal of the agreement. It encompasses a wide range of tasks and activities, including contract authoring, contract administration and execution, contract analysis, and contract compliance.
Contract management: The process of organizing, storing, and retrieving electronic or paper contracts.
Contract recognition: The process of automatically identifying and extracting information from a contract, such as text or images.
Contract summarization: The process of automatically generating a shorter synopsis of the content of a contract.
D
Data extraction: The process of automatically identifying and gathering structured information from unstructured or semi-structured sources.
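As a simple illustration, the sketch below uses rule-based pattern matching to pull structured fields out of unstructured contract text; production systems typically combine such rules with machine learning. The sample text and patterns are illustrative only.

```python
# A minimal sketch of data extraction: pulling dates and dollar amounts
# out of unstructured contract text with regular expressions.
import re

text = (
    "This Agreement is effective as of January 15, 2024. "
    "The Client shall pay a fee of $12,500 within 30 days of invoice."
)

extracted = {
    "dates": re.findall(r"\b[A-Z][a-z]+ \d{1,2}, \d{4}\b", text),
    "amounts": re.findall(r"\$\d[\d,]*(?:\.\d{2})?", text),
}

print(extracted)
# {'dates': ['January 15, 2024'], 'amounts': ['$12,500']}
```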
Data mining: The process of discovering patterns in large sets of information.
Deep learning: Artificial intelligence systems built from many layers of interconnected nodes that analyze and compare information, producing a more robust and sophisticated analysis.
Document analysis: The process of examining the content, structure, and layout of a document to extract information or prepare it for further processing.
Document engineering: The field of study that deals with the design, development, and maintenance of documents and document-based systems. It encompasses a wide range of technologies, including document analysis, document recognition, document summarization, document generation, and document management. The goal of document engineering is to make the process of creating, managing, and using documents more efficient, accurate, and effective.
Document generation: The process of automatically creating a new piece of content from a set of data or a template.
Document intelligence: The ability of a computer system to understand and extract information from documents, such as text, images, and tables. This can include a wide range of tasks, such as document analysis, document recognition, document summarization, and document generation.
Document management: The process of organizing, storing, and retrieving electronic or paper documents.
Document recognition: The process of automatically identifying and extracting information from a document, such as text or images.
Document summarization: The process of automatically generating a synopsis of the content of a document.
E
Extensible Markup Language (XML): A system for identifying, categorizing, and organizing the information contained in a document or other data structure, as well as storing, managing, sharing, and using that information. XML is an open standard (co-created by Docugami CEO Jean Paoli), published by the World Wide Web Consortium (W3C).
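For illustration, the sketch below wraps a contract clause in descriptive XML tags and reads it back with Python's standard library. The tag names are illustrative, not a formal schema.

```python
# A minimal sketch of XML: information in a clause is identified and organized
# with descriptive tags, then parsed programmatically. Tag names are illustrative.
import xml.etree.ElementTree as ET

snippet = """
<Clause type="Payment">
  <Amount currency="USD">12500</Amount>
  <DueWithinDays>30</DueWithinDays>
</Clause>
"""

clause = ET.fromstring(snippet)
print(clause.get("type"))                     # Payment
print(clause.find("Amount").text)             # 12500
print(clause.find("Amount").get("currency"))  # USD
```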
F
Few shot or few-shot learning: The process of providing a small number of prompts or guidance to make the output of a Generative Artificial Intelligence system more accurate or relevant for the user.
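As an illustration, the sketch below assembles a few-shot prompt from a handful of labeled examples so a generative model can imitate the pattern; the send_to_model() call is a placeholder rather than any specific provider's API.

```python
# A minimal sketch of few-shot prompting: a few labeled examples are placed in
# the prompt so the model follows the pattern for a new, unlabeled input.
examples = [
    ("Either party may terminate this Agreement upon 30 days' notice.", "Termination"),
    ("The Client shall pay all invoices within 45 days of receipt.", "Payment"),
    ("Each party shall keep the other's information confidential.", "Confidentiality"),
]
new_clause = "This Agreement shall renew automatically for successive one-year terms."

prompt = "Label each contract clause with its type.\n\n"
for clause, label in examples:
    prompt += f"Clause: {clause}\nLabel: {label}\n\n"
prompt += f"Clause: {new_clause}\nLabel:"

# response = send_to_model(prompt)   # placeholder for a real model call
print(prompt)
```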
Foundation Model: An extremely large LLM or other AI system, pre-trained on an enormous amount of data, that can subsequently be adapted to a wide range of tasks.
G
Generative AI: A subfield of artificial intelligence (AI) that focuses on creating models or algorithms that can generate new content, such as images, videos, music, text, code, information, data, or other results, without being explicitly programmed to do so.
H
Hierarchical Data: Information in which the relationships between individual items are identified and represented, in a structure resembling a tree.
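For illustration, the sketch below represents a document as a tree, with sections nested under the agreement and clauses nested under each section; the field names are illustrative.

```python
# A minimal sketch of hierarchical data: each item contains the items nested
# beneath it, forming a tree that can be walked from the root downward.
agreement = {
    "title": "Services Agreement",
    "sections": [
        {"heading": "Payment",
         "clauses": [{"text": "Fees are due within 30 days of invoice."}]},
        {"heading": "Termination",
         "clauses": [{"text": "Either party may terminate on 30 days' notice."}]},
    ],
}

for section in agreement["sections"]:
    for clause in section["clauses"]:
        print(section["heading"], "->", clause["text"])
```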
I
Intelligent document processing (IDP): The use of AI techniques to automate the analysis, extraction, and organization of information from documents.
L
Large Language Model (LLM): An AI system that has been trained on vast amounts of content, enabling the system to make predictions about words that are likely to relate to other words or collections of words.
M
Machine learning: A subset of AI that involves training computer systems to learn from data and make predictions or decisions without being explicitly programmed.
N
Natural Language Processing (NLP): A subfield of AI that deals with the interaction between computers and human language, including tasks such as language translation, speech recognition, and text generation.
Neural networks: A type of machine learning algorithm inspired by the structure and function of the human brain, consisting of layers of interconnected nodes (neurons) that process and transmit information.
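For illustration, the sketch below passes an input through two layers of interconnected nodes using NumPy; the weights are random stand-ins, since training would normally adjust them to fit data.

```python
# A minimal sketch of a neural network's forward pass: inputs flow through a
# hidden layer and an output layer of interconnected nodes (neurons).
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=3)           # 3 input features
W1 = rng.normal(size=(3, 4))     # weights connecting inputs to 4 hidden nodes
W2 = rng.normal(size=(4, 2))     # weights connecting hidden nodes to 2 outputs

hidden = np.maximum(0, x @ W1)   # hidden layer with a ReLU nonlinearity
output = hidden @ W2             # output layer
print(output)
```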
S
Semi-structured data: Information that has a defined format but does not conform to a traditional data model. This type of data includes a mix of structured and unstructured elements, and it often contains metadata or tags that describe the data, but the overall structure of the data is not rigidly defined.
Small data: A term designed to contrast with Big Data. Whereas Big Data denotes AI trained on huge volumes of information (often publicly sourced, and usually requiring significant time and expense), Small Data denotes AI that can function effectively on relatively modest volumes of information, usually the information of a single user or company.
Structured data: Information that is organized and formatted in a specific way, such as in a table or database, and follows a defined data model or schema.
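For illustration, the sketch below represents the same contract fact in structured, semi-structured, and unstructured form; the field names are illustrative.

```python
# A minimal sketch contrasting three common data shapes for the same fact.

# Structured: fixed fields that follow a defined schema (e.g., a database row).
structured = {"party": "Acme Corp", "fee_usd": 12500, "due_days": 30}

# Semi-structured: tagged fields mixed with free text; no rigid overall schema.
semi_structured = {
    "party": "Acme Corp",
    "notes": "Fee of $12,500 payable within 30 days; see Exhibit A for details.",
}

# Unstructured: plain prose with no pre-defined format or structure.
unstructured = "Acme Corp shall pay a fee of $12,500 within 30 days of invoice."
```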
Supervised learning: A process in which Artificial Intelligence is trained to reach conclusions based on data that has been labeled by humans.
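For illustration, the sketch below trains a simple classifier on a handful of human-labeled clause sentences and then predicts a label for new text, assuming the scikit-learn library is available; the tiny dataset is illustrative only.

```python
# A minimal sketch of supervised learning: a model is fit to human-labeled
# examples, then used to predict labels for unseen text.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

sentences = [
    "Either party may terminate this Agreement upon 30 days' notice.",
    "This Agreement may be terminated for material breach.",
    "The Client shall pay all invoices within 45 days of receipt.",
    "Fees are due upon execution of this Agreement.",
]
labels = ["Termination", "Termination", "Payment", "Payment"]  # human-provided labels

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(sentences, labels)

print(model.predict(["Invoices must be paid within 60 days of receipt."]))
```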
U
Unstructured data: Information that does not have a pre-defined format or structure. Examples include emails, reports, and articles.
Unsupervised learning: A process in which Artificial Intelligence is given data or information that has not been labeled by humans, so the Artificial Intelligence system has to discover patterns without human guidance or instruction.
X
XML: see Extensible Markup Language.
Z
Zero shot: The initial output of a Generative Artificial Intelligence system, produced prior to any additional prompts or guidance intended to make the output more refined or relevant for the user’s purposes.