Document AI in SnapApp

on 01-30-2026 12:00 AM by SnapApp by BlueVector AI

License: Corporate, Enterprise

Document AI is a feature in SnapApp that helps you automatically read, understand, and process documents. Instead of manually reviewing files, Document AI analyzes documents for you and extracts useful information.

It converts unstructured data (such as PDFs, images, or handwritten files) into structured data (clearly defined fields and values). This structured data can be stored in databases, making it easy to search, analyze, validate, and use in workflows.

Document AI uses machine learning and Google Cloud technology to create scalable, cloud-based document processing solutions that work reliably even as document volume increases.

Common examples include extracting data from:

  • Images
  • PDFs
  • Scanned documents
  • Handwritten text

Document AI is built on Google’s Vertex AI platform and uses generative AI to provide end-to-end document processing. This means documents are processed automatically from upload to extraction without requiring deep machine learning knowledge from the user.

To learn more, see the Document AI overview


Document AI Parser

A Document AI parser acts as a bridge between a document file and a machine learning model. Its main responsibility is to read documents and convert their content into meaningful data.

The parser understands the document structure and performs tasks such as:

  • Classifying documents (identifying what type of document it is)
  • Splitting documents (separating multiple documents within a single file)
  • Extracting data (pulling specific fields like names, dates, or amounts)
  • Analyzing content (understanding layout, sections, and entities)

Parsers are essential because they turn raw documents into structured and usable data inside SnapApp.
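
As a rough illustration of what "structured and usable data" can look like, the sketch below flattens a parser response into field/value pairs. The `entities`, `type`, `mentionText`, and `confidence` keys follow the JSON shape of a Google Cloud Document AI `Document`; the sample values, the `flatten_entities` helper, and the 0.5 confidence threshold are assumptions made for illustration and do not describe SnapApp internals.

```python
from typing import Any


def flatten_entities(document: dict[str, Any], min_confidence: float = 0.5) -> dict[str, str]:
    """Map each entity type to its extracted text, keeping the most confident mention."""
    best: dict[str, tuple[float, str]] = {}
    for entity in document.get("entities", []):
        confidence = entity.get("confidence", 0.0)
        if confidence < min_confidence:
            continue  # skip low-confidence extractions
        field = entity["type"]
        if field not in best or confidence > best[field][0]:
            best[field] = (confidence, entity.get("mentionText", ""))
    return {field: text for field, (_, text) in best.items()}


# Hypothetical response fragment for an invoice (values are made up).
sample_document = {
    "entities": [
        {"type": "invoice_id", "mentionText": "INV-1001", "confidence": 0.98},
        {"type": "total_amount", "mentionText": "1,250.00", "confidence": 0.93},
        {"type": "total_amount", "mentionText": "250.00", "confidence": 0.41},
        {"type": "supplier_name", "mentionText": "Acme Corp", "confidence": 0.88},
    ]
}

fields = flatten_entities(sample_document)
print(fields)
```

Keeping only the most confident mention per field is one possible policy; a workflow that needs every line item would keep a list per field instead.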


Types of Parsers

SnapApp supports different types of Document AI parsers based on the document format and business use case.

Built-in Parsers
  • Document OCR: Uses Optical Character Recognition (OCR) to read text from documents and images. It also captures layout information and automatically improves image quality by correcting skewed or unclear scans. See https://cloud.google.com/document-ai/docs/enterprise-document-ocr

  • Form Parser: Designed for structured forms. It identifies and extracts key-value pairs, table data, and checkbox values from documents such as applications and surveys. See https://cloud.google.com/document-ai/docs/form-parser

  • Layout Parser: Detects document structure such as paragraphs, headings, and content blocks. It helps you understand how information is organized within the document. See https://cloud.google.com/document-ai/docs/layout-parse-chunk

Specialized Parser

Pre-built parsers optimized for specific document types and industries, such as financial, identity, or tax documents.

Specialized Parser List

Parser | Description
1003 Parser | Extracts over 50 fields from Fannie Mae Form 1003 (URLA).
1040 Parser | Extracts data from Form 1040 including name, filing status, and amounts.
1040-C Parser | Extracts data from Form 1040 Schedule C.
1040-E Parser | Extracts data from Form 1040 Schedule E.
1065 Parser | Extracts partnership name, address, and related details.
1099-G Parser | Extracts payer and recipient information.
1099-INT Parser | Extracts interest income details.
1099-MISC Parser | Extracts miscellaneous income details.
1099-NEC Parser | Extracts non-employee compensation details.
1099-R Parser | Extracts retirement income details.
1120 Parser | Extracts corporate tax details.
1120S Parser | Extracts S-Corporation details.
Bank Statement Parser | Extracts account details and transactions.
Expense Parser | Extracts receipt data such as supplier and total amount.
France Driver License Parser | Extracts personal details from French driver licenses.
France National ID Parser | Extracts identity information from French IDs.
France Passport Parser | Extracts personal information from French passports.
Identity Document Proofing | Verifies the authenticity of identity documents.
India Aadhar Card Parser | Extracts identity details from Aadhar cards.
India Driver License Parser | Extracts details from Indian driver licenses.
India Passport Parser | Extracts details from Indian passports.
Invoice Parser | Extracts invoice ID, amount, and line items.
Lending Doc Splitter/Classifier | Identifies and classifies lending documents in bulk files.
Mortgage Statement Parser | Extracts mortgage statement details.
Pay Slip Parser | Extracts salary and employer information.
Procurement Doc Splitter | Splits procurement documents from bundled PDFs.
Receipt Parser | Extracts receipt details such as totals and suppliers.
SSA-1099 Parser | Extracts Social Security income details.
US Driver License Parser | Extracts personal details from US driver licenses.
US Passport Parser | Extracts personal details from US passports.
Utility Parser | Extracts utility bill information.
W2 Parser | Extracts employee and employer wage details.
W9 Parser | Extracts taxpayer identification details.

Custom Parser

Custom parsers are used when built-in parsers cannot extract the required information. These parsers are trained specifically for your document type and business needs.

  • Extractor: Extracts raw text, layout information, and normalized entities from documents. See https://cloud.google.com/document-ai/docs/extracting-overview

  • Classifier: Automatically categorizes documents into predefined groups. See https://cloud.google.com/document-ai/docs/custom-classifier

  • Splitter: Identifies document boundaries and separates multiple documents within a single file. See https://cloud.google.com/document-ai/docs/custom-splitter

  • Summarizer: Generates concise summaries for both short and long documents. See https://cloud.google.com/document-ai/docs/custom-summarizer

To learn more, see the Parser List


How to create a custom Document AI Parser?

Follow these steps carefully to create and deploy your own custom parser:

1. Create
  • Open the Google Cloud Console
  • Navigate to the Document AI dashboard
  • Click Create Processor to begin
  • Select Custom Processor to define your own parser
2. Define
  • Specify the document formats the parser should support
  • Define entity types and extraction rules using JSON or YAML
  • Configure any required pre-processing or post-processing logic
3. Upload
  • Upload a sample document that represents the documents you want to process
  • This document is used for both training and validation
4. Train
  • Click Train to start the learning process
  • Wait until the training process completes successfully
5. Test
  • Upload a new document
  • Verify that the extracted fields and values are accurate
6. Deploy
  • Click Deploy to make the parser available
  • The parser can now be used within SnapApp workflows

Note: Make sure you have the required Google Cloud permissions.
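
Once a processor is deployed, Google Cloud addresses it by a resource name of the form `projects/<project>/locations/<location>/processors/<processor-id>`. The sketch below only builds the URL and JSON body for a Document AI v1 `process` call so you can see what a test request outside the console might contain; it does not send anything. The project ID, location, and processor ID are placeholders, and authentication (an OAuth bearer token) is deliberately left out.

```python
import base64
import json


def processor_name(project_id: str, location: str, processor_id: str) -> str:
    """Build the resource name Google Cloud uses to address a processor."""
    return f"projects/{project_id}/locations/{location}/processors/{processor_id}"


def build_process_request(name: str, location: str, content: bytes, mime_type: str) -> tuple[str, str]:
    """Return (url, json_body) for a v1 process call; nothing is sent over the network."""
    url = f"https://{location}-documentai.googleapis.com/v1/{name}:process"
    body = {
        "rawDocument": {
            # The REST API expects the file bytes as base64 text.
            "content": base64.b64encode(content).decode("ascii"),
            "mimeType": mime_type,
        }
    }
    return url, json.dumps(body)


# Placeholder identifiers; replace them with values from your own project.
name = processor_name("my-project", "us", "abc123")
url, body = build_process_request(name, "us", b"%PDF-1.4 sample", "application/pdf")
print(url)
```

The location appears twice, once in the host name and once in the resource name, and the two need to match the region where the processor was created.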

To learn more, see Document AI Create Processor

custom-parser-creation custom-parser-details


How to import a Document AI parser?

This feature is mainly used for Custom Document AI Parsers that have already been created in Google Cloud.

  1. Go to Settings from the top bar
  2. Select Import Doc AI from the Data menu
  3. Enter the Parser Name to identify the parser
  4. Enter the Expected Entities that should be extracted
  5. Click Import
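
One way to think about the Expected Entities value is as a contract between the parser and the object that receives its output. The sketch below compares a list of expected entity names with the names a parser actually returned. The entity names, the sample values, and the `check_expected_entities` helper are all hypothetical; how SnapApp itself handles a mismatch is not described in this article.

```python
def check_expected_entities(expected: list[str], extracted: dict[str, str]) -> dict[str, list[str]]:
    """Report entity names that are expected but absent, and names that were not expected."""
    expected_set = set(expected)
    found = set(extracted)
    return {
        "missing": sorted(expected_set - found),
        "unexpected": sorted(found - expected_set),
    }


# Hypothetical entity names entered at import time.
expected_entities = ["ein", "business_name", "issue_date"]

# Hypothetical parser output for one document.
extracted = {"ein": "12-3456789", "business_name": "Acme LLC", "notes": "n/a"}

report = check_expected_entities(expected_entities, extracted)
print(report)
```

A check like this can help catch a typo in an entity name before records start arriving with empty fields.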

Example:

import-docai


How to perform real-time extraction and validation using Document AI?

Available Settings

Setting | Description
Quality Validation | Checks document quality and rejects documents that are unreadable or unclear
Classify and Validate | Identifies the correct document type before processing
Entity Validation | Ensures extracted fields match expected values

Pre-Conditions
  • The object must have Elevated Admin Access
  • Track Attachments must be enabled to allow document uploads
  • A Document AI field must be configured with validation options

Example:

DocAI-field-with-parser


1. Create a custom field for real-time extraction

Use the formula below to dynamically extract data from the uploaded document:

=LOOKUP(CONCAT("gs://",[[docAI_type_fieldname]]), "object_of_selected_parser", "_gcs_uri", "data_extraction_field_name")

Output:

Extracted value (from data_extraction_field_name)
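
Read left to right, the formula appears to build a `gs://` URI from the Document AI field, find the record in the parser's object whose `_gcs_uri` matches it, and return one field from that record. The Python sketch below mimics that reading with an in-memory list. It is an interpretation of the formula's arguments, not SnapApp's implementation, and the bucket paths, the `ein` field, and the `lookup` helper are hypothetical.

```python
from typing import Optional


def lookup(key: str, records: list[dict[str, str]], match_field: str, return_field: str) -> Optional[str]:
    """Return return_field from the first record whose match_field equals key."""
    for record in records:
        if record.get(match_field) == key:
            return record.get(return_field)
    return None  # no matching record


# Stand-in for "object_of_selected_parser": one record per processed document.
parser_object_records = [
    {"_gcs_uri": "gs://example-bucket/uploads/w9-001.pdf", "ein": "12-3456789"},
    {"_gcs_uri": "gs://example-bucket/uploads/w9-002.pdf", "ein": "98-7654321"},
]

# Stand-in for [[docAI_type_fieldname]]: the stored path without the scheme.
doc_ai_field_value = "example-bucket/uploads/w9-002.pdf"

# Equivalent of CONCAT("gs://", ...) followed by the lookup itself.
result = lookup("gs://" + doc_ai_field_value, parser_object_records, "_gcs_uri", "ein")
print(result)
```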

lookup-formula-configuration


2. Data extraction and validation
  1. Go to Settings → Objects
  2. Open the required object
  3. Click Preview to test the setup
  4. Add a new record
  5. Upload the document and save the record
  6. The system automatically extracts and validates the data
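
To make the three validation settings concrete, here is a minimal sketch of how such checks could be combined for one uploaded document. It collects every failure instead of stopping at the first one, which may differ from what SnapApp does. The quality score, the 0.6 threshold, the document type labels, and the function and class names are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    valid: bool
    errors: list[str] = field(default_factory=list)


def validate_document(
    quality_score: float,
    detected_type: str,
    expected_type: str,
    extracted: dict[str, str],
    expected_values: dict[str, str],
    min_quality: float = 0.6,  # assumed threshold, not a SnapApp default
) -> ValidationResult:
    errors: list[str] = []
    # Quality Validation: flag unreadable or unclear documents.
    if quality_score < min_quality:
        errors.append("Quality Validation: document is unreadable or unclear")
    # Classify and Validate: the detected type must be the one the field expects.
    if detected_type != expected_type:
        errors.append(f"Classify and Validate: expected {expected_type}, got {detected_type}")
    # Entity Validation: extracted fields must match the expected values.
    for name, expected in expected_values.items():
        if extracted.get(name) != expected:
            errors.append(f"Entity Validation: {name} does not match the expected value")
    return ValidationResult(valid=not errors, errors=errors)


ok = validate_document(0.92, "w9", "w9", {"ein": "12-3456789"}, {"ein": "12-3456789"})
bad = validate_document(0.35, "invoice", "w9", {"ein": "00-0000000"}, {"ein": "12-3456789"})
print(ok.valid, bad.errors)
```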

Examples:

ein-generation

validation-successful

validation-failed


How to validate rules using Document AI?

Validation rules are used to compare extracted data against predefined conditions to ensure accuracy.

  1. Enable Document AI validation and publish the view
  2. The Validity column becomes visible
  3. Go to Edit Fields
  4. Select the field that needs validation
  5. Add a validation expression that defines the condition
  6. Define an error message to display if validation fails
  7. Save the configuration
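
Conceptually, each validation rule pairs a condition on one field with the message shown when the condition fails. The sketch below models that pairing in Python. SnapApp's own expression syntax is not shown in this article, so the `ValidationRule` class, the `ein` field, and the format check are hypothetical stand-ins.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ValidationRule:
    field_name: str
    condition: Callable[[str], bool]  # returns True when the value is acceptable
    error_message: str

    def check(self, record: dict[str, str]) -> Optional[str]:
        """Return None when the rule passes, otherwise the configured error message."""
        value = record.get(self.field_name, "")
        return None if self.condition(value) else self.error_message


# Hypothetical rule: an extracted EIN must look like NN-NNNNNNN.
ein_rule = ValidationRule(
    field_name="ein",
    condition=lambda value: re.fullmatch(r"\d{2}-\d{7}", value) is not None,
    error_message="EIN must look like NN-NNNNNNN.",
)

print(ein_rule.check({"ein": "12-3456789"}))
print(ein_rule.check({"ein": "123456789"}))
```

A missing field is treated as an empty string here, so it fails the format check and produces the same error message as a malformed value.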

Examples:

validate-condition

valid-check

If the condition fails:

invalid-condition

validation-rules-unsuccessful

Your Document AI parser is now ready to extract and validate data in SnapApp.


Thank you for following these steps to configure your SnapApp components effectively.

For help or support, email us at snapapp@bluevector.ai.

