Expanded Document Fields

When data is extracted from an ID document through OCR, data is returned within a document fields object as detailed here.

In addition to document fields, an expanded set of fields can be returned, providing a full list of each extracted field, the source of the extraction, and any corresponding transliterated values.

The following transliterations are currently supported:

Cyrillic and Arabic families of languages
Chinese
Korean
Vietnamese
Turkish languages

Configure Expanded Document Fields

Expanded fields are not returned by default. They must be enabled on the ID Document Text Extraction task.

For non-Latin documents, you must enable them separately. See here for guidance.

Node.js

    new RequestedTextExtractionTaskBuilder()
      .withCreateExpandedDocumentFields(true) // default is false
      .build();

Retrieving Expanded Document Fields

The expanded document fields object will not always be present. Expanded fields are only available within the session when a document has been captured and OCR (optical character recognition) has succeeded.

Data that is keyed in manually is not returned in the expanded fields section. It continues to be returned only within the standard document fields payload.

Node.js

    // Retrieve the expanded document fields media ID, then the media content
    idvClient.getSession(sessionId)
      .then((session) => {
        // Returns all resources in the session
        const resources = session.getResources();

        // Returns a collection of ID Documents
        const idDocuments = resources.getIdDocuments();

        idDocuments.forEach((idDocument) => {
          const expandedFields = idDocument.getExpandedDocumentFields();
          const expandedFieldsMediaId = expandedFields.getMedia().getId();

          // Retrieve data
          idvClient.getMediaContent(sessionId, expandedFieldsMediaId)
            .then((media) => {
              const buffer = media.getContent();
              const jsonData = JSON.parse(buffer);
              // handle jsonData here
            })
            .catch((error) => {
              // handle error
            });
        });
      })
      .catch((error) => {
        // handle error
      });
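Because the expanded document fields object is not always present, it may be worth collecting the media IDs defensively before fetching any content. The following is a minimal sketch only: `collectExpandedFieldsMediaIds` is a hypothetical helper, not part of the SDK, and it assumes that `getExpandedDocumentFields()` returns `null` or `undefined` when no expanded fields exist for a document.

```javascript
// Hypothetical helper, not part of the SDK.
// Assumption: getExpandedDocumentFields() returns null/undefined when a
// document has no expanded fields (for example, OCR did not succeed or the
// data was keyed in manually).
function collectExpandedFieldsMediaIds(idDocuments) {
  const mediaIds = [];
  for (const idDocument of idDocuments) {
    const expandedFields = idDocument.getExpandedDocumentFields();
    if (!expandedFields) continue; // nothing extracted for this document

    const media = expandedFields.getMedia();
    if (!media) continue; // no media attached to the expanded fields

    mediaIds.push(media.getId());
  }
  return mediaIds;
}
```

Each returned ID can then be passed to `idvClient.getMediaContent(sessionId, mediaId)` as in the sample above.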

Response

A JSON object is returned when retrieving the expanded document fields media. Some documents have multiple sources, e.g. a VIZ (visual inspection zone) and a barcode. A field that appears in more than one source is returned once per source, with the source identified in the response.

JSON

    {
        "fields": [
            {
                "name": "date_of_birth", // the name of the field
                "value": "1970-01-01", // the contents of the field
                "locale": "la", // the language/script the field is in on the document
                "source": "VIZ", // where this field has been extracted from: VIZ, BARCODE or MRZ
                "is_non_latin": true, // if the field includes non-Latin characters
                "is_transliteration": true // if the field is transliterated into Latin script
            },
            {
                "name": "full_name",
                "value": "MELISSA PETERSON",
                "locale": "la",
                "source": "MRZ"
            },
            ...
        ]
    }
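Since the same field can be returned once per source, grouping the entries by name is one way to compare what each source produced. This is a sketch under stated assumptions: `groupFieldsByName` is a hypothetical helper, and the sample values are made up for illustration in the shape of the response above.

```javascript
// Hypothetical helper: groups the entries of the "fields" array by field
// name so that values from different sources can be compared side by side.
function groupFieldsByName(expandedDocumentFields) {
  const grouped = new Map();
  for (const field of expandedDocumentFields.fields || []) {
    if (!grouped.has(field.name)) {
      grouped.set(field.name, []);
    }
    grouped.get(field.name).push({ source: field.source, value: field.value });
  }
  return grouped;
}

// Illustrative data only, shaped like the response above.
const sampleResponse = {
  fields: [
    { name: 'full_name', value: 'MELISSA PETERSON', locale: 'la', source: 'MRZ' },
    { name: 'full_name', value: 'MELISSA PETERSON', locale: 'la', source: 'VIZ' },
    { name: 'date_of_birth', value: '1970-01-01', locale: 'la', source: 'VIZ' },
  ],
};

const groupedFields = groupFieldsByName(sampleResponse);
for (const [name, entries] of groupedFields) {
  console.log(name, entries.map((entry) => entry.source).join(', '));
}
```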

Values

Field	Description	Always present
name	The name of the field. A full list is available here	✅
value	The data from the document field	✅
locale	The locale of the field. For any Latin script field this will be "la"; for non-Latin script this will be the detailed locale, e.g. "ja-JP" for Japan	✅
source	Where on the document this field has been extracted from. This can be VIZ, MRZ or BARCODE	✅
is_non_latin	Present and true if the field is in a non-Latin script	❌
is_transliteration	Present and true if the value has been transliterated into Latin script. Only some non-Latin fields can be transliterated	❌
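Because `is_non_latin` and `is_transliteration` are not always present, consuming code should treat a missing flag as false. The sketch below shows one way to pick the Latin-script values for a given field name. `latinValuesFor` is a hypothetical helper, the sample entries are invented for illustration, and the rule that an entry counts as Latin when it is either a transliteration or not flagged as non-Latin is an assumption about how the flags combine.

```javascript
// Optional flags: treat anything other than an explicit true as false.
function isNonLatin(field) {
  return field.is_non_latin === true;
}

function isTransliteration(field) {
  return field.is_transliteration === true;
}

// Hypothetical helper: returns the Latin-script values for a field name.
// Assumption: an entry is usable as Latin text when it is a transliteration
// or when it is not flagged as non-Latin.
function latinValuesFor(fields, name) {
  return fields
    .filter((field) => field.name === name)
    .filter((field) => isTransliteration(field) || !isNonLatin(field))
    .map((field) => field.value);
}

// Illustrative data only.
const sampleFields = [
  { name: 'full_name', value: '山田 太郎', locale: 'ja-JP', source: 'VIZ', is_non_latin: true },
  { name: 'full_name', value: 'YAMADA TARO', locale: 'la', source: 'VIZ', is_transliteration: true },
  { name: 'date_of_birth', value: '1970-01-01', locale: 'la', source: 'VIZ' },
];

console.log(latinValuesFor(sampleFields, 'full_name'));
```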

Example Document Extractions

See the code blocks below for examples of the expanded document fields. The document fields JSON is also included at the bottom for comparison.

UK Passport
Canada Driving Licence
Japan Driving Licence
China Driving Licence
    
 
expandedDocumentFields{...}
documentFields:{...}
