Azure OCR [printed text] is not reading the receipt lines in the right order

Application goal: read a receipt image and extract the store/organization name along with the total amount paid, then feed both to a web form for auto-filling and submission.
POST request: https://*.cognitiveservices.azure.com/vision/v2.0/recognizeText?{params}
GET request: https://*.cognitiveservices.azure.com/vision/v2.0/textOperations/{operationId}
However, when I get the results back, the line ordering is sometimes mixed up (see the picture below; the JSON response shows similar results).
Because of this mixing, the total is read as $0.88.
The same problem occurs in 2 of my 9 test receipts.
Q: Why does it work for some receipts, both similarly and differently structured, but not consistently for all of them? Also, any ideas on how to get around it?

I had a quick look at your case.
OCR Result
As you mentioned, the results are not ordered as you expected. I looked at the bounding box values and could not tell how the lines are ordered. You could try to consolidate the fields yourself based on those values, but there is a service that already does this for you.
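A minimal sketch of the do-it-yourself consolidation mentioned above, assuming each OCR line carries an eight-number boundingBox (x1, y1, ..., x4, y4) as in the recognizeText response. The names reading_order and box and the row_tolerance value are hypothetical choices for illustration, not part of the Azure API, and the tolerance would need tuning to the receipt's resolution:

```python
def reading_order(lines, row_tolerance=10):
    """Group OCR lines into visual rows, then join each row left to right."""
    def centre_y(line):
        ys = line["boundingBox"][1::2]  # every second value is a y coordinate
        return sum(ys) / len(ys)

    def left_x(line):
        return min(line["boundingBox"][0::2])

    rows = []
    for line in sorted(lines, key=centre_y):
        # Same row if the vertical centre is close to the previous line's centre.
        if rows and abs(centre_y(line) - centre_y(rows[-1][-1])) <= row_tolerance:
            rows[-1].append(line)
        else:
            rows.append([line])
    return [" ".join(l["text"] for l in sorted(row, key=left_x)) for row in rows]


def box(x, y, w, h):
    """Hypothetical helper: build an eight-number box from left, top, width, height."""
    return [x, y, x + w, y, x + w, y + h, x, y + h]


# Made-up lines in a scrambled order, similar to the symptom in the question.
sample = [
    {"text": "$0.88", "boundingBox": box(300, 81, 40, 15)},
    {"text": "Total", "boundingBox": box(10, 100, 50, 15)},
    {"text": "Tax", "boundingBox": box(10, 80, 30, 15)},
    {"text": "$9.11", "boundingBox": box(300, 102, 40, 15)},
]
print(reading_order(sample))  # ['Tax $0.88', 'Total $9.11']
```

This only pairs labels with amounts that sit on the same visual row; skewed or curled receipts would likely need a rotation correction first.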
Form Recognizer:
Using Form Recognizer with your image, I got the following results for your receipt.
As you can see below, understandingResults contains the total with its parsed value ("value": 9.11), the MerchantName ("Chick-fil-a") and other fields.
{
"status": "Succeeded",
"recognitionResults": [
{
"page": 1,
"clockwiseOrientation": 0.17,
"width": 404,
"height": 1226,
"unit": "pixel",
"lines": [
{
"boundingBox": [
108,
55,
297,
56,
296,
71,
107,
70
],
"text": "Welcome to Chick-fil-a",
"words": [
{
"boundingBox": [
108,
56,
169,
56,
169,
71,
108,
71
],
"text": "Welcome",
"confidence": "Low"
},
{
"boundingBox": [
177,
56,
194,
56,
194,
71,
177,
71
],
"text": "to"
},
{
"boundingBox": [
201,
56,
296,
57,
296,
71,
201,
71
],
"text": "Chick-fil-a"
}
]
},
...
OTHER LINES CUT FOR DISPLAY
...
]
}
],
"understandingResults": [
{
"pages": [
1
],
"fields": {
"Subtotal": null,
"Total": {
"valueType": "numberValue",
"value": 9.11,
"text": "$9.11",
"elements": [
{
"$ref": "#/recognitionResults/0/lines/32/words/0"
},
{
"$ref": "#/recognitionResults/0/lines/32/words/1"
}
]
},
"Tax": {
"valueType": "numberValue",
"value": 0.88,
"text": "$0.88",
"elements": [
{
"$ref": "#/recognitionResults/0/lines/31/words/0"
},
{
"$ref": "#/recognitionResults/0/lines/31/words/1"
},
{
"$ref": "#/recognitionResults/0/lines/31/words/2"
}
]
},
"MerchantAddress": null,
"MerchantName": {
"valueType": "stringValue",
"value": "Chick-fil-a",
"text": "Chick-fil-a",
"elements": [
{
"$ref": "#/recognitionResults/0/lines/0/words/2"
}
]
},
"MerchantPhoneNumber": {
"valueType": "stringValue",
"value": "+13092689500",
"text": "309-268-9500",
"elements": [
{
"$ref": "#/recognitionResults/0/lines/4/words/0"
}
]
},
"TransactionDate": {
"valueType": "stringValue",
"value": "2019-06-21",
"text": "6/21/2019",
"elements": [
{
"$ref": "#/recognitionResults/0/lines/6/words/0"
}
]
},
"TransactionTime": {
"valueType": "stringValue",
"value": "13:00:57",
"text": "1:00:57 PM",
"elements": [
{
"$ref": "#/recognitionResults/0/lines/6/words/1"
},
{
"$ref": "#/recognitionResults/0/lines/6/words/2"
}
]
}
}
}
]
}
More details on Form Recognizer: https://azure.microsoft.com/en-us/services/cognitive-services/form-recognizer/
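To feed the web form, the parsed values can be pulled straight out of understandingResults. A minimal sketch, assuming the response has already been decoded into a Python dict with the shape shown above; receipt_fields is a hypothetical helper name, and the inline response is a trimmed copy of the JSON in this answer:

```python
def receipt_fields(result):
    """Return {field name: parsed value} for the first receipt, skipping null fields."""
    understanding = result.get("understandingResults") or []
    if not understanding:
        return {}
    fields = understanding[0].get("fields", {})
    return {name: field["value"] for name, field in fields.items() if field is not None}


# Trimmed copy of the response above (JSON null becomes None in Python).
response = {
    "status": "Succeeded",
    "understandingResults": [
        {
            "pages": [1],
            "fields": {
                "Subtotal": None,
                "Total": {"valueType": "numberValue", "value": 9.11, "text": "$9.11"},
                "MerchantName": {
                    "valueType": "stringValue",
                    "value": "Chick-fil-a",
                    "text": "Chick-fil-a",
                },
            },
        }
    ],
}

form_data = receipt_fields(response)
print(form_data["MerchantName"], form_data["Total"])
```

Fields the service could not find (here Subtotal) come back as null, so the sketch drops them rather than passing None to the form.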


Terraform AWS Dashboard - Widgets from nested list

Terraform beginner here. I am trying to create some widgets from a nested list. Each group should get a text ("label") widget showing the group name, followed by the metric widgets for the canaries that belong to that group. So the dashboard should look as follows:
Group 1
widget1, widget2 etc.
Group 2
widget3, widget4 etc.
Variable value:
dashboard = [
{
name = "Group-1",
canaries = ["canary1", "canary2", "canary3"]
},
{
name = "Group-2",
canaries = ["canary4", "canary5"]
}
]
Attempt at building json:
locals {
body = [for group in var.dashboard :
#Create text widget for Group name
{
"height": 1,
"width": 24,
"y": 4,
"x": 0,
"type": "text",
"properties": {
"markdown": "\n# > [${group.name}]\n"
}
}
#Attempt to create underlying widgets for group
[for canary in group.canaries :
{
{
"height": 3,
"width": 6,
"y": 5,
"x": 0,
"type": "metric",
"properties": {
"metrics": [
[ "CloudWatchSynthetics", "Failed", "CanaryName", "${canary}", { "label": "Canary failures count", "region": "us-west-2" } ]
],
"title": "Failed canary runs",
"period": 60,
"region": "us-west-2",
"stat": "Sum",
"view": "singleValue",
"setPeriodToTimeRange": true
}
}
}
] #TF Doesn't like the inclusion of nested loop here or my syntax is incorrect.
]
}
Resource creation:
resource "aws_cloudwatch_dashboard" "canary_dashboard" {
dashboard_name = "Canary-Dashboard"
dashboard_body = jsonencode({
"widgets": concat(local.body)
})
}
When building body, Terraform complains with "Missing close bracket on index", but I have triple-checked that I am not missing a bracket or curly brace. How do I dynamically create the dashboard widgets from nested lists?
Edit
Including desired json output below as suggested by Jordan. In the end, there will be n number of groups, each having n number of canaries belonging to said group.
{
"widgets": [
{
"height": 1,
"width": 24,
"y": 4,
"x": 0,
"type": "text",
"properties": {
"markdown": "\n# Group1\n"
}
},
{
"height": 3,
"width": 6,
"y": 5,
"x": 6,
"type": "metric",
"properties": {
"metrics": [
[ "CloudWatchSynthetics", "Failed", "CanaryName", "Group1-Canary", { "label": "Canary failures count", "region": "us-west-2" } ]
],
"title": "Failed canary runs",
"period": 60,
"region": "us-west-2",
"stat": "Sum",
"view": "singleValue",
"setPeriodToTimeRange": true
}
},
{
"height": 1,
"width": 24,
"y": 4,
"x": 0,
"type": "text",
"properties": {
"markdown": "\n# Group2\n"
}
},
{
"height": 3,
"width": 6,
"y": 5,
"x": 6,
"type": "metric",
"properties": {
"metrics": [
[ "CloudWatchSynthetics", "Failed", "CanaryName", "Group2-Canary", { "label": "Canary failures count", "region": "us-west-2" } ]
],
"title": "Failed canary runs",
"period": 60,
"region": "us-west-2",
"stat": "Sum",
"view": "singleValue",
"setPeriodToTimeRange": true
}
}
]
}
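You're trying to do something in a for expression that Terraform doesn't allow (see where I've marked "HERE"). Each iteration of a for expression must produce exactly one value, but here an object is immediately followed by a second, nested for expression: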
locals {
body = [for group in var.dashboard :
#Create text widget for Group name
{
"height": 1,
"width": 24,
"y": 4,
"x": 0,
"type": "text",
"properties": {
"markdown": "\n# > [${group.name}]\n"
}
} <===== HERE
#Attempt to create underlying widgets for group
[for canary in group.canaries :
{
{
"height": 3,
"width": 6,
"y": 5,
"x": 0,
"type": "metric",
"properties": {
"metrics": [
[ "CloudWatchSynthetics", "Failed", "CanaryName", "${canary}", { "label": "Canary failures count", "region": "us-west-2" } ]
],
"title": "Failed canary runs",
"period": 60,
"region": "us-west-2",
"stat": "Sum",
"view": "singleValue",
"setPeriodToTimeRange": true
}
}
}
] #TF Doesn't like the inclusion of nested loop here or my syntax is incorrect.
]
}
If TF allowed you to do what you're trying to do, you'd end up with something like:
body = [
{
"height": 1,
"width": 24,
"y": 4,
"x": 0,
"type": "text",
"properties": {
"markdown": "\n# > [${group.name}]\n"
}
},
[
{
{
"height": 3,
"width": 6,
"y": 5,
"x": 0,
"type": "metric",
"properties": {
"metrics": [
[ "CloudWatchSynthetics", "Failed", "CanaryName", "${canary}", { "label": "Canary failures count", "region": "us-west-2" } ]
],
"title": "Failed canary runs",
"period": 60,
"region": "us-west-2",
"stat": "Sum",
"view": "singleValue",
"setPeriodToTimeRange": true
}
}
},
{
{
"height": 3,
"width": 6,
"y": 5,
"x": 0,
"type": "metric",
"properties": {
"metrics": [
[ "CloudWatchSynthetics", "Failed", "CanaryName", "${canary}", { "label": "Canary failures count", "region": "us-west-2" } ]
],
"title": "Failed canary runs",
"period": 60,
"region": "us-west-2",
"stat": "Sum",
"view": "singleValue",
"setPeriodToTimeRange": true
}
}
}
]
]
And I doubt that's what you're trying to do. If you can provide a sample of what you'd like the JSON to look like, we can show you how to achieve it.
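The desired JSON in the question's edit is a single flat list in which each group's text widget is followed by that group's metric widgets. A minimal Python sketch of that flattening, only to illustrate the target shape: build_widgets is a hypothetical helper, the fixed x/y values are copied from the question rather than computed, and in Terraform itself the built-in concat and flatten functions are the likely tools for combining the per-group lists:

```python
import json


def build_widgets(dashboard, region="us-west-2"):
    """Return one flat list: a text widget per group, then its metric widgets."""
    widgets = []
    for group in dashboard:
        widgets.append({
            "height": 1, "width": 24, "y": 4, "x": 0,
            "type": "text",
            "properties": {"markdown": f"\n# {group['name']}\n"},
        })
        for canary in group["canaries"]:
            widgets.append({
                "height": 3, "width": 6, "y": 5, "x": 0,
                "type": "metric",
                "properties": {
                    "metrics": [[
                        "CloudWatchSynthetics", "Failed", "CanaryName", canary,
                        {"label": "Canary failures count", "region": region},
                    ]],
                    "title": "Failed canary runs",
                    "period": 60,
                    "region": region,
                    "stat": "Sum",
                    "view": "singleValue",
                    "setPeriodToTimeRange": True,
                },
            })
    return widgets


# Same variable value as in the question.
dashboard = [
    {"name": "Group-1", "canaries": ["canary1", "canary2", "canary3"]},
    {"name": "Group-2", "canaries": ["canary4", "canary5"]},
]
body = json.dumps({"widgets": build_widgets(dashboard)}, indent=2)
print(body)
```

The key point is that the inner loop appends to the same list as the outer one, so no list ends up nested inside the widgets array.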

Azure Form Recognizer Name/Value Pairs

I am currently working with Azure Form Recognizer and had a question. I am using
https://<>.cognitiveservices.azure.com/formrecognizer/v2.0-preview/layout/analyzeResults/2e0a2322-65bb-4fd2-a3bf-98f70b36641e
The JSON returned seems to be basic OCR output. I was wondering if it's possible (easily)
to take this
{
"boundingBox": [
4.4033,
1.5114,
6.5483,
1.5114,
6.5483,
1.6407,
4.4033,
1.6407
],
"text": "Invoice For: First Up Consultants",
"words": [
{
"boundingBox": [
4.4033,
1.5143,
4.8234,
1.5143,
4.8234,
1.6155,
4.4033,
1.6155
],
"text": "Invoice",
"confidence": 1
},
{
"boundingBox": [
4.8793,
1.5143,
5.1013,
1.5143,
5.1013,
1.6154,
4.8793,
1.6154
],
"text": "For:",
"confidence": 1
},
{
"boundingBox": [
5.2048,
1.5130,
5.4927,
1.5130,
5.4927,
1.6151,
5.2048,
1.6151
],
"text": "First",
"confidence": 1
},
{
"boundingBox": [
5.5427,
1.5130,
5.7120,
1.5130,
5.7120,
1.6407,
5.5427,
1.6407
],
"text": "Up",
"confidence": 1
},
{
"boundingBox": [
5.7621,
1.5114,
6.5483,
1.5114,
6.5483,
1.6151,
5.7621,
1.6151
],
"text": "Consultants",
"confidence": 1
}
]
}
but return it as
{
"boundingBox": [
4.4033,
1.5114,
6.5483,
1.5114,
6.5483,
1.6407,
4.4033,
1.6407
],
"text": "Invoice For:",
"value": "First Up Consultants"
}
If this is not something that I can do in azure form recognizer, then no worries. I just wanted to see.
Thank you in advance!
Michael
It sounds like you're looking to extract semantic meaning from your document. In that case, you might want to look at using a custom Form Recognizer model.
You can start by training a custom model to extract key value pairs:
https://learn.microsoft.com/en-us/azure/cognitive-services/form-recognizer/quickstarts/curl-train-extract
Sample key value pair:
{
"key": {
"text": "Address:",
"boundingBox": [ 0.7972, 1.5125, 1.3958, 1.5125, 1.3958, 1.6431, 0.7972, 1.6431 ]
},
"value": {
"text": "1 Redmond way Suite 6000 Redmond, WA 99243",
"boundingBox": [ 0.7972, 1.6764, 2.15, 1.6764, 2.15, 2.2181, 0.7972, 2.2181 ]
},
"confidence": 0.86
}
Or you can train a custom model using labels that you provide:
https://learn.microsoft.com/en-us/azure/cognitive-services/form-recognizer/quickstarts/label-tool
Sample field output:
{
"total": {
"type": "string",
"valueString": "$22,123.24",
"text": "$22,123.24",
"boundingBox": [ 5.29, 3.41, 5.975, 3.41, 5.975, 3.54, 5.29, 3.54 ],
"page": 1,
"confidence": 1
}
}
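If training a custom model is more than you need, a purely textual heuristic can approximate the output shape asked for in the question. This is a sketch, not a Form Recognizer feature: split_label_value is a hypothetical helper that assumes the label is everything up to and including the first word ending in a colon, which will misfire on lines containing times or URLs:

```python
def split_label_value(line):
    """Split a layout line into label ("text") and "value" at the first 'word:'."""
    words = [word["text"] for word in line["words"]]
    for index, word in enumerate(words):
        if word.endswith(":"):
            return {
                "boundingBox": line["boundingBox"],
                "text": " ".join(words[: index + 1]),
                "value": " ".join(words[index + 1:]),
            }
    # No colon found: keep the whole line as text and leave the value empty.
    return {"boundingBox": line["boundingBox"], "text": line["text"], "value": None}


# The line from the question, with per-word bounding boxes omitted for brevity.
line = {
    "boundingBox": [4.4033, 1.5114, 6.5483, 1.5114, 6.5483, 1.6407, 4.4033, 1.6407],
    "text": "Invoice For: First Up Consultants",
    "words": [
        {"text": "Invoice"}, {"text": "For:"}, {"text": "First"},
        {"text": "Up"}, {"text": "Consultants"},
    ],
}
pair = split_label_value(line)
print(pair["text"], "->", pair["value"])
```

The bounding box is still that of the whole line; splitting it would mean merging the per-word boxes on each side of the colon.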

How do I return a JSON object along with totals in mongoose?

I have a database of exercises in a workout tracker, and when I do a find(), the result is this:
[
{
"_id": "5e9dacbb6512969974bd5b2d",
"day": "2020-04-10T14:07:55.905Z",
"exercises": [
{
"type": "resistance",
"name": "Bicep Curl",
"duration": 20,
"weight": 100,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b2e",
"day": "2020-04-11T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Lateral Pull",
"duration": 20,
"weight": 300,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b2f",
"day": "2020-04-12T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Push Press",
"duration": 25,
"weight": 185,
"reps": 8,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b30",
"day": "2020-04-13T14:07:55.916Z",
"exercises": [
{
"type": "cardio",
"name": "Running",
"duration": 25,
"distance": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b31",
"day": "2020-04-14T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Bench Press",
"duration": 20,
"weight": 285,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b32",
"day": "2020-04-15T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Bench Press",
"duration": 20,
"weight": 300,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b33",
"day": "2020-04-16T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Quad Press",
"duration": 30,
"weight": 300,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b34",
"day": "2020-04-17T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Bench Press",
"duration": 20,
"weight": 300,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b35",
"day": "2020-04-18T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Military Press",
"duration": 20,
"weight": 300,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b36",
"day": "2020-04-19T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Bench",
"duration": 30,
"distance": 2
}
]
}
]
Then I need to get total sums of statistics from each exercise, so I used mongoose aggregate to give me this data:
[
{
"_id": null,
"totalDuration": 230,
"totalWeight": 2070,
"totalSets": 32,
"totalReps": 78,
"totalDistance": 6
}
]
I want to combine these two results in one GET request, ideally doing something similar to a push where I just push the totals at the end of the first JSON object. How do I achieve this?
Something like this:
// Combine the find() result and the aggregate totals into one response object
function mergeResults(resultFromFindQuery, totalSums) {
  return {
    mongoFindresult: resultFromFindQuery,
    totalSums: totalSums
  };
}
Then send the returned object as the response of your GET route. Both results are now in the same variable.

How to insert a json object inside embedded array object in MongoDB?

I have the following doc stored in MongoDB:
{
"_id": 2,
"template_name": "AllahabadBank",
"description": "Form For NewCustomer Application In Banking System",
"handwritten": "true",
"file_path": "./serverData/batchProcessedFiles/AllahabadBank/input",
"annotated_data": [{
"page_num": "page-01.jpg",
"entities": [{
"label": "CIFNo",
"type": "text",
"boundary_box": {
"x1": 325,
"x2": 861,
"y1": 324,
"y2": 360
}
}],
"num_of_pages": 12
}]
}
I want to insert the JSON data below into annotated_data. Please suggest a MongoDB query or Python code to do this.
{
"page_num": "page-02.jpg",
"entities": [
{
"label": "CustomerName",
"type": "text",
"boundary_box": {
"x1": 559,
"x2": 1615,
"y1": 382,
"y2": 440
}
}
]
}
This can be done with an update query.
Below is the Python code to do it using pymongo. Replace __colName__ with your collection (update_one is used because Collection.update was removed in PyMongo 4).
__colName__.update_one({'_id':2},
{'$push':{'annotated_data':{"page_num": "page-02.jpg",
"entities": [{
"label": "CustomerName",
"type": "text",
"boundary_box": {
"x1": 559,
"x2": 1615,
"y1": 382,
"y2": 440
}
}]
}}})
The Mongo command for the above is as follows:
db.__colName__.update({'_id':2},
{'$push':{'annotated_data':{"page_num": "page-02.jpg",
"entities": [{
"label": "CustomerName",
"type": "text",
"boundary_box": {
"x1": 559,
"x2": 1615,
"y1": 382,
"y2": 440
}
}]
}}})
You can do it in 2 ways:
1) MongoDB update method
db.collection.update({"_id":2},
{
$push: {
"annotated_data": {
"page_num": "page-02.jpg",
"entities": [
{
"label": "CustomerName",
"type": "text",
"boundary_box": {
"x1": 559,
"x2": 1615,
"y1": 382,
"y2": 440
}
}
]
}
}
})
2) Retrieve the document, append to the array, and write it back (Python code). Unlike $push, this read-modify-write is not atomic.
doc = db.collection.find_one({'_id':2})
doc["annotated_data"].append({
"page_num": "page-02.jpg",
"entities": [
{
"label": "CustomerName",
"type": "text",
"boundary_box": {
"x1": 559,
"x2": 1615,
"y1": 382,
"y2": 440
}
}
]
})
db.collection.replace_one({'_id': doc['_id']}, doc)  # Collection.save was removed in PyMongo 4

Computer Vision REST API format

I am currently using the trial version of the Computer Vision API in Java. I took the sample code from the website and successfully got a JSON response.
However, the format of the JSON I got is quite different from the one shown on the demo page.
Example of my JSON response:
"regions": [
{
"boundingBox": "21,16,304,451",
"lines": [
{
"boundingBox": "28,16,288,41",
"words": [
{
"boundingBox": "28,16,288,41",
"text": "NOTHING"
}
]
},
Whereas the demo page shows:
{
"lines": [
{
"boundingBox": [
122,
122,
401,
85,
404,
229,
143,
233
],
Looking at the bounding box format, we can clearly see the difference.
The response you got comes from the Computer Vision API's OCR operation. Its documented example response looks like this:
{
"language": "en",
"textAngle": -2.0000000000000338,
"orientation": "Up",
"regions": [
{
"boundingBox": "462,379,497,258",
"lines": [
{
"boundingBox": "462,379,497,74",
"words": [
{
"boundingBox": "462,379,41,73",
"text": "A"
},
{
"boundingBox": "523,379,153,73",
"text": "GOAL"
},
{
"boundingBox": "694,379,265,74",
"text": "WITHOUT"
}
]
},
{
"boundingBox": "565,471,289,74",
"words": [
{
"boundingBox": "565,471,41,73",
"text": "A"
},
{
"boundingBox": "626,471,150,73",
"text": "PLAN"
},
{
"boundingBox": "801,472,53,73",
"text": "IS"
}
]
},
{
"boundingBox": "519,563,375,74",
"words": [
{
"boundingBox": "519,563,149,74",
"text": "JUST"
},
{
"boundingBox": "683,564,41,72",
"text": "A"
},
{
"boundingBox": "741,564,153,73",
"text": "WISH"
}
]
}
]
}
]
}
The response on the demo page, by contrast, comes from calling the Computer Vision API's Recognize Text operation and then Get Recognize Text Operation Result to fetch the outcome. Its documented example response looks like this:
{
"status": "Succeeded",
"recognitionResult": {
"lines": [
{
"boundingBox": [
202,
618,
2047,
643,
2046,
840,
200,
813
],
"text": "Our greatest glory is not",
"words": [
{
"boundingBox": [
204,
627,
481,
628,
481,
830,
204,
829
],
"text": "Our"
},
{
"boundingBox": [
519,
628,
1057,
630,
1057,
832,
518,
830
],
"text": "greatest"
},
{
"boundingBox": [
1114,
630,
1549,
631,
1548,
833,
1114,
832
],
"text": "glory"
},
{
"boundingBox": [
1586,
631,
1785,
632,
1784,
834,
1586,
833
],
"text": "is"
},
{
"boundingBox": [
1822,
632,
2115,
633,
2115,
835,
1822,
834
],
"text": "not"
}
]
},
{
"boundingBox": [
420,
1273,
2954,
1250,
2958,
1488,
422,
1511
],
"text": "but in rising every time we fall",
"words": [
{
"boundingBox": [
423,
1269,
634,
1268,
635,
1507,
424,
1508
],
"text": "but"
},
{
"boundingBox": [
667,
1268,
808,
1268,
809,
1506,
668,
1507
],
"text": "in"
},
{
"boundingBox": [
874,
1267,
1289,
1265,
1290,
1504,
875,
1506
],
"text": "rising"
},
{
"boundingBox": [
1331,
1265,
1771,
1263,
1772,
1502,
1332,
1504
],
"text": "every"
},
{
"boundingBox": [
1812,
1263,
2178,
1261,
2179,
1500,
1813,
1502
],
"text": "time"
},
{
"boundingBox": [
2219,
1261,
2510,
1260,
2511,
1498,
2220,
1500
],
"text": "we"
},
{
"boundingBox": [
2551,
1260,
3016,
1258,
3017,
1496,
2552,
1498
],
"text": "fall"
}
]
},
{
"boundingBox": [
1612,
903,
2744,
935,
2738,
1139,
1607,
1107
],
"text": "in never failing ,",
"words": [
{
"boundingBox": [
1611,
934,
1707,
933,
1708,
1147,
1613,
1147
],
"text": "in"
},
{
"boundingBox": [
1753,
933,
2132,
930,
2133,
1144,
1754,
1146
],
"text": "never"
},
{
"boundingBox": [
2162,
930,
2673,
927,
2674,
1140,
2164,
1144
],
"text": "failing"
},
{
"boundingBox": [
2703,
926,
2788,
926,
2790,
1139,
2705,
1140
],
"text": ","
}
]
}
]
}
}
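To compare the two formats side by side, the OCR string can be expanded into the eight-number form. A minimal sketch, assuming the OCR boundingBox string is "left,top,width,height" and the Recognize Text array lists four corner points clockwise from the top-left; ocr_box_to_polygon is a hypothetical helper, not an SDK function:

```python
def ocr_box_to_polygon(box):
    """Convert an OCR 'left,top,width,height' string to [x1, y1, ..., x4, y4]."""
    left, top, width, height = (int(value) for value in box.split(","))
    right, bottom = left + width, top + height
    # Corners in clockwise order: top-left, top-right, bottom-right, bottom-left.
    return [left, top, right, top, right, bottom, left, bottom]


# The word box from the question's OCR response.
print(ocr_box_to_polygon("28,16,288,41"))  # [28, 16, 316, 16, 316, 57, 28, 57]
```

The reverse direction loses information, since the Recognize Text polygon can be rotated while the OCR box is always axis-aligned.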
