Azure Form Recognizer Not Finding Content with Python on Databricks

I am executing the following Python on Databricks with the relevant Cognitive Form recognizer libraries:
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential
credential = AzureKeyCredential("aaa6123af5b843a38044538d95584c3d")
endpoint = "https://myformrecognizr.cognitiveservices.azure.com/"
form_recognizer_client = FormRecognizerClient(endpoint, credential)
# Read the PDF from the mounted lake path.
with open("/dbfs/mnt/lake/RAW/export/Picturehouse.pdf", "rb") as fd:
    form = fd.read()
poller = form_recognizer_client.begin_recognize_content(form)
form_pages = poller.result()
for content in form_pages:
    # Print every cell of every table found on the page.
    for table in content.tables:
        print("Table found on page {}:".format(table.page_number))
        print("Table location {}:".format(table.bounding_box))
        for cell in table.cells:
            print("Cell text: {}".format(cell.text))
            print("Location: {}".format(cell.bounding_box))
            print("Confidence score: {}\n".format(cell.confidence))
    # Print any selection marks found on the page.
    if content.selection_marks:
        print("Selection marks found on page {}:".format(content.page_number))
        for selection_mark in content.selection_marks:
            print("Selection mark is '{}' within bounding box '{}' and has a confidence of {}".format(
                selection_mark.state,
                selection_mark.bounding_box,
                selection_mark.confidence
            ))
The pdf form looks like the following:
The library recognizes:
Cell text: Item
Cell text: Qty
Cell text: Seat Allocation
Cell text: Subtotal
Cell text: Adult
Cell text: 1
Cell text: D-11
Cell text: 14.50
But it doesn't recognize the following text from the pdf:
You can go straight to the screen by showing your e-ticket to an
usher. Alternatively, you can collect your tickets at Box Office at
least 15 minutes before the advertised start time of the film or
event. You need your Booking Reference and/or payment card to help us
find your booking. You can print this page by clicking the "Print This
Page" link above.
Is that by design? Or am I missing something in my code?

Unfortunately, that is by design. Form Recognizer works with pre-trained models that can recognize key-value pairs, text, and tables in the document uploaded as input. Table content is recognized wherever it appears in the file, even when it sits in the middle of a large amount of paragraph text.
For more details, please refer to these links:
https://www.drware.com/extract-data-from-pdfs-using-form-recognizer-with-code-or-without/
https://www.youtube.com/watch?v=iBQO4QdUp6A&t=10s
https://github.com/tomweinandy/form_recognizer_demo
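The code in the question prints only `tables` and `selection_marks`, so paragraph text never reaches the output. A minimal sketch of reading the remaining text, assuming each page returned by `begin_recognize_content` also exposes a `lines` list whose items have a `text` attribute (worth checking against the SDK version installed on your cluster). The helper name `collect_line_text` and the stand-in page objects are made up for illustration:

```python
from types import SimpleNamespace


def collect_line_text(form_pages):
    """Map each page number to the text of the lines recognized on it."""
    text_by_page = {}
    for page in form_pages:
        # Assumption: each page exposes `lines`, and each line exposes `text`.
        # A missing or empty attribute is treated as "no lines on this page".
        lines = getattr(page, "lines", None) or []
        text_by_page[page.page_number] = [line.text for line in lines]
    return text_by_page


# Stand-in objects so the sketch runs without calling the service.
# With the real client you would pass `poller.result()` instead.
sample_pages = [
    SimpleNamespace(
        page_number=1,
        lines=[
            SimpleNamespace(text="You can go straight to the screen by showing your e-ticket to an"),
            SimpleNamespace(text="usher. Alternatively, you can collect your tickets at Box Office at"),
        ],
    )
]

for page_number, lines in collect_line_text(sample_pages).items():
    print("Text found on page {}:".format(page_number))
    for text in lines:
        print("  " + text)
```

If `lines` turns out to be empty for your document, the text is probably not being returned by this call at all, and the links above are the better place to look.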

Related

Selenium/Python - continue scrolling down after clicking 'load more'

I want to scrape terms and their details from the SAP Glossary website.
I can only get 50 terms right now, because I couldn't figure out how to click 'load more' and then continue scrolling down to scrape more terms.
I noticed the 'load more' button has to change color to orange before it becomes clickable.
from selenium import webdriver
from selenium.webdriver.common.by import By
# The driver setup is not shown in the question; Chrome is assumed here.
driver = webdriver.Chrome()
page_url = "https://help.sap.com/glossary/?locale=en-US&search=CRM"
driver.get(page_url)
driver.maximize_window()
element = driver.find_elements(by=By.XPATH, value='//a[@role="menuitem"]')
load_more = driver.find_elements(by=By.CSS_SELECTOR, value='button.motion-button')
detail = []
c = driver.find_elements(by=By.TAG_NAME, value='p')
for i in range(51):
    element[i].click()
    detail.append(c[0].text)
    print(i, c[0].text)
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
I found a video that covers exactly what I need. It's not about the 'Load more' button... you need to find the JSON file.
https://www.youtube.com/watch?v=qqNufBruvUc
The following code should meet your requirement. While the 'load more' button exists, click it to load more data. After all the data has loaded, use 'find_elements' to get the element collection.
from time import sleep
from clicknium import clicknium as cc
if not cc.chrome.extension.is_installed():
    cc.chrome.extension.install_or_update()
tab = cc.chrome.open("https://help.sap.com/glossary/?locale=en-US&search=CRM")
load_more = tab.find_element_by_css_selector('button.motion-button')
# Keep clicking 'load more' until the button no longer exists.
while tab.is_existing_by_css_selector('button.motion-button'):
    load_more.click()
    sleep(1)
# All terms are loaded now, so collect and visit each one.
elements = tab.find_elements_by_xpath('//a[@role="menuitem"]')
for element in elements:
    element.click()
    print(element.get_text())
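The same "click until the button is gone" loop can also be written against plain Selenium, which is what the question uses. This is only a sketch: the helper name `click_load_more_until_gone`, the `max_clicks` safety limit, and the `FakePage` stand-in are made up for illustration, and the fake exists only so the loop can be run without a browser. With a real driver you would pass the `driver` object from the question.

```python
import time


def click_load_more_until_gone(driver, selector="button.motion-button",
                               pause=1.0, max_clicks=200):
    """Click the button matching `selector` until it disappears.

    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
        buttons = driver.find_elements("css selector", selector)
        if not buttons:
            break  # the button is gone, so everything has been loaded
        # Scroll to the bottom so the button is in view before clicking.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        buttons[0].click()
        clicks += 1
        time.sleep(pause)  # crude wait for the next batch to render
    return clicks


class FakeButton:
    """Hypothetical stand-in for a WebElement."""

    def __init__(self, page):
        self.page = page

    def click(self):
        self.page.remaining -= 1


class FakePage:
    """Hypothetical stand-in for a WebDriver with `remaining` batches to load."""

    def __init__(self, remaining):
        self.remaining = remaining

    def find_elements(self, by, value):
        return [FakeButton(self)] if self.remaining > 0 else []

    def execute_script(self, script):
        pass


print(click_load_more_until_gone(FakePage(remaining=3), pause=0))  # prints 3
```

On the real page the button may exist before it is clickable (the question mentions it has to turn orange first), so a fixed `sleep` may not be enough. Selenium's `WebDriverWait` with `expected_conditions.element_to_be_clickable` is the usual replacement for the pause.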

Changing table column width using Google-Slide-API and python

I have a Spreadsheet document and by using Google-Sheet-API I'm fetching the data from it. Then by using Google-Slide-API I'm creating a Slide document with 'n' slides. On each slide I'm creating a 2-column table with 'n' rows. Every first column on every slide always contains data from the Spreadsheet's A column. The second columns contain data from the other Spreadsheet's columns (from B to 'n'). Then I'm changing the text size. So far so good.
Now, when I try to adjust the columns width of each table on each slide it doesn't do anything which is an issue because some tables contain more information and therefore the tables don't fit the slide. This is the part of the code that doesn't work:
for i in range(number_of_slides):
    regs = [
        {'updateTableColumnProperties': {
            'objectId': tableID[i],
            'columnIndices': [j],
            'tableColumnProperties': {
                'columnWidth': {'magnitude': mag[j], 'unit': 'PT'}
            },
            'fields': 'columnWidth'
        }
        } for j in range(2)]
    SLIDES.presentations().batchUpdate(body={'requests': reqs},
                                       presentationId=deckID).execute()
The tables always remain the same with or without this part, so it doesn't have any effect whatsoever. The code doesn't return any errors or messages.
I believe your goal is as follows.
You have a Google Slides presentation.
The presentation has several slides, and each slide has a table with 2 columns.
You want to change the width of both columns.
You want to achieve this using googleapis for Python.
You have already been able to use the batchUpdate method of the Google Slides API.
Modification points:
I think that your request body is correct, but I would like to propose one modification. In your script, the batchUpdate method is called inside a loop. With batchUpdate, the updateTableColumnProperties requests for all slides in the presentation can be sent in a single API call.
Although I'm not sure about the values of mag or the rest of your script, how about the following modified script as a sample for achieving your goal? I'd be glad if it helps you understand the Slides API.
Modified script:
About creds, please use your authorization script. Also, you can see it at Quickstart for python.
from googleapiclient.discovery import build
service = build('slides', 'v1', credentials=creds)
PRESENTATION_ID = '###'  # Please set Google Slides ID.
magnitude = [100, 300]  # Please set the widths for the columns A and B in each table. In this sample, 100 and 300 PT are set.
# 1. Retrieve an object for each slides.
presentation = service.presentations().get(presentationId=PRESENTATION_ID).execute()
# 2. Create a request body for the batchUpdate method.
slides = presentation.get('slides')
requests = []
for slide in slides:
    pe = slide.get('pageElements')
    if pe:
        for pageElement in pe:
            t = pageElement.get('table')
            if t:
                for i, m in enumerate(magnitude):
                    requests.append({
                        'updateTableColumnProperties': {
                            'objectId': pageElement['objectId'],
                            'columnIndices': [i],
                            'tableColumnProperties': {
                                'columnWidth': {'magnitude': m, 'unit': 'PT'}
                            },
                            'fields': 'columnWidth'
                        }
                    })
# 3. Request the request body.
service.presentations().batchUpdate(body={'requests': requests}, presentationId=PRESENTATION_ID).execute()
Note:
When above script is run, the widths of columns "A" and "B" of the table in each slide are modified. In this sample script, 100 and 300 PT are set for the columns "A" and "B", respectively. About this, please modify for your actual situation.
References:
Method: presentations.get
Method: presentations.batchUpdate
UpdateTableColumnPropertiesRequest
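Since the question reports that the tables look unchanged, it can help to read the widths back after the update instead of judging by eye. A minimal sketch, assuming the `presentations.get` response lists each table's columns under `tableColumns` with a `columnWidth` holding `magnitude` and `unit`, and that widths come back in EMU (12,700 EMU per point). The function name and the sample dictionary are made up for illustration. With the real API you would pass the result of `service.presentations().get(...).execute()`.

```python
EMU_PER_POINT = 12700  # 1 pt = 12,700 EMU


def table_column_widths_pt(presentation):
    """Return {table objectId: [column widths in points]} for every table."""
    widths = {}
    for slide in presentation.get("slides", []):
        for element in slide.get("pageElements", []):
            table = element.get("table")
            if not table:
                continue  # not a table (shape, image, ...)
            column_widths = []
            for column in table.get("tableColumns", []):
                dimension = column.get("columnWidth", {})
                magnitude = dimension.get("magnitude", 0)
                # Convert EMU to points; values already in PT are kept as they are.
                if dimension.get("unit") == "EMU":
                    magnitude = magnitude / EMU_PER_POINT
                column_widths.append(round(magnitude, 2))
            widths[element["objectId"]] = column_widths
    return widths


# Hand-written sample shaped like the assumed API response.
sample_presentation = {
    "slides": [
        {"pageElements": [
            {"objectId": "table_1", "table": {"tableColumns": [
                {"columnWidth": {"magnitude": 1270000, "unit": "EMU"}},
                {"columnWidth": {"magnitude": 3810000, "unit": "EMU"}},
            ]}},
            {"objectId": "shape_1", "shape": {}},
        ]},
        {},  # a slide without page elements
    ]
}

print(table_column_widths_pt(sample_presentation))  # {'table_1': [100.0, 300.0]}
```

Comparing the output before and after the batchUpdate call shows whether the request was applied at all, which separates "the request did nothing" from "the table still does not fit the slide".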

I need help in Python with displaying the contents of a 2D Set into a Tkinter Textbox

Disclaimer: I have only begun to learn about Python. I took a crash course just to learn the very basics about a month ago and the rest of my efforts to learn have all been research thru Google and looking at solutions here in Stack Overflow.
I am trying to create an application that will read all PDF files stored in a folder and extract their filenames, page numbers, and the contents of the first page, and store this information into a 2D set. Once this is done, the application will create a tkinter GUI with 2 listboxes and 1 text box.
The application should display the PDF filenames in the first listbox, and the corresponding page numbers of each file in the second listbox. Both listboxes are synched in scrolling.
The text box should display the text contents on the first page of the PDF.
What I want to happen is that each time I click a PDF filename in the first listbox with the mouse or with up or down arrow keys, the application should display the contents of the first page of the selected file in the text box.
This is how my GUI looks and how it should function
https://i.stack.imgur.com/xrkvo.jpg
I have been successful in all other requirements so far except the part where when I select a filename in the first listbox, the contents of the first page of the PDF should be displayed in the text box.
Here is my code for populating the listboxes and text box. The contents of my 2D set pdfFiles is [['PDF1 filename', 'PDF1 total pages', 'PDF1 text content of first page'], ['PDF2 filename', 'PDF2 total pages', 'PDF2 text content of first page'], ... etc.
===========Setting the Listboxes and Textbox=========
scrollbar = Scrollbar(list_2)
scrollbar.pack(side=RIGHT, fill=Y)
list_1.config(yscrollcommand=scrollbar.set)
list_1.bind("<MouseWheel>", scrolllistbox2)
list_2.config(yscrollcommand=scrollbar.set)
list_2.bind("<MouseWheel>", scrolllistbox1)
txt_3 = tk.Text(my_window, font='Arial 10', wrap=WORD)
txt_3.place(relx=0.5, rely=0.12, relwidth=0.472, relheight=0.86)
scrollbar = Scrollbar(txt_3)
scrollbar.pack(side=RIGHT, fill=Y)
list_1.bind("<<ListboxSelect>>", CurSelect)
============Populating the Listboxes with the content of the 2D Set===
i = 0
while i < count:
    list_1.insert(tk.END, pdfFiles[i][0])
    list_2.insert(tk.END, pdfFiles[i][1])
    i = i + 1
============Here is my code for CurSelect function========
def CurSelect(evt):
    values = [list_1.get(idx) for idx in list_1.curselection()]
    print(", ".join(values)) ????
========================
The print command above is just my test command to show that I have successfully extracted the selected item in the listbox. What I need now is to somehow link that information to its corresponding page content in my 2D list and display it in the text box.
Something like:
1) select the filename in the listbox
2) link the selected filename to the filenames stored in the pdfFilename 2D set
3) once filename is found, identify the corresponding text of the first page
4) display the text of the first page of the selected file in the text box
I hope I am making sense. Please help.
You don't need much to finish it. You just need some small things:
1. Get the selected item of your listbox:
selected_indexes = list_1.curselection()
first_selected = selected_indexes[0] # it's possible to select multiple items
2. Get the corresponding PDF text:
pdf_text = pdfFiles[first_selected][2]
3. Change the text of your Text widget: (from https://stackoverflow.com/a/20908371/8733066)
txt_3.delete("1.0", tk.END)
txt_3.insert(tk.END, pdf_text)
so replace your CurSelect(evt) method with this:
def CurSelect(evt):
    selected_indexes = list_1.curselection()
    first_selected = selected_indexes[0]
    pdf_text = pdfFiles[first_selected][2]
    txt_3.delete("1.0", tk.END)
    txt_3.insert(tk.END, pdf_text)
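One thing to watch for: `curselection()` returns an empty tuple when nothing is selected, and indexing it with `[0]` then raises an IndexError. This can happen if the handler fires while the selection is being cleared. A small sketch of a guarded lookup, kept separate from the widgets so it can be tried without a window. The helper name `first_page_text` and the sample data are made up for illustration:

```python
def first_page_text(selected_indexes, pdf_files):
    """Return the first-page text for the first selected row, or None."""
    if not selected_indexes:
        return None  # nothing selected, e.g. the selection was just cleared
    row = selected_indexes[0]
    if row >= len(pdf_files):
        return None  # index does not match any stored file
    return pdf_files[row][2]


# Same layout as pdfFiles in the question: [filename, total pages, first-page text]
pdf_files = [
    ["a.pdf", 3, "First page of a.pdf"],
    ["b.pdf", 12, "First page of b.pdf"],
]

print(first_page_text((1,), pdf_files))  # First page of b.pdf
print(first_page_text((), pdf_files))    # None

# In the Tkinter handler this could be used as:
# def CurSelect(evt):
#     pdf_text = first_page_text(list_1.curselection(), pdfFiles)
#     if pdf_text is not None:
#         txt_3.delete("1.0", tk.END)
#         txt_3.insert(tk.END, pdf_text)
```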

Interactive labeling of images in jupyter notebook

I have a list of pictures:
pictures = {im1,im2,im3,im4,im5,im6}
where im1 through im6 are six images.
I want to assign the pictures to labels (1,2,3,4 etc.)
For instance, here pictures 1 to 3 belong to label 1, picture 4 belongs to label 2, picture 5 to label 3, and picture 6 to label 4.
-> label = {1,1,1,2,3,4}
Since I need to see the images when I label them, I need a method to do that while labeling them. I was thinking of creating an array of images:
And then I define the ranges by clicking on the first and last picture belonging to the same labels, so for example:
What do you think? Is this somehow possible?
I would like to assign different labels to different ranges of pictures.
For instance: When one has finished selecting the first label one could indicate it by a Double-click and then do the selection of the second label range, then Double-click, then do the selection of the third label range, then Double-click, then do the selection of the fourth label range, etc.
It does not have to be double-clicking to change the selection of the labels; it could also just be a button or any other idea that you might have.
In the end one should have the list of labels.
Essentially, most of the interaction you are looking for boils down to being able to display images, and detect clicks on them in real time. As that is the case, you can use the jupyter widgets (aka ipywidgets) module to achieve most (if not all) of what you are looking for.
Take a look at the button widget which is described here with explanation on how to register to its click event. The problem - we can't display an image on a button, and I didn't find any way to do this within the ipywidgets documentation. There is an image widget, but it does not provide an on_click event. So construct a custom layout, with a button underneath each image:
import functools
import ipywidgets as widgets
COLS = 4
ROWS = 2
IMAGES = ...
IMG_WIDTH = 200
IMG_HEIGHT = 200
def on_click(index, button):
    # ipywidgets calls the handler with the clicked button as its argument,
    # so it arrives after the index bound by functools.partial below.
    print('Image %d clicked' % index)
rows = []
for row in range(ROWS):
    cols = []
    for col in range(COLS):
        index = row * COLS + col
        image = widgets.Image(
            value=IMAGES[index], width=IMG_WIDTH, height=IMG_HEIGHT
        )
        button = widgets.Button(description='Image %d' % index)
        # Bind the click event to the on_click function, with our index as argument
        button.on_click(functools.partial(on_click, index))
        # Create a vertical layout box, image above the button
        box = widgets.VBox([image, button])
        cols.append(box)
    # Create a horizontal layout box, grouping all the columns together
    rows.append(widgets.HBox(cols))
# Create a vertical layout box, grouping all the rows together
result = widgets.VBox(rows)
You can technically also write a custom widget to display an image and listen for a click, but I simply don't believe it's worth your time and effort.
Good luck!
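To get from click events to the label list the question asks for, the click handler needs a little state. Here is one possible sketch: two clicks (first and last picture) close a range, and every closed range gets the next label. The class name `RangeLabeler` and its attributes are made up for illustration. It is plain Python, so it can be tried without any widgets.

```python
class RangeLabeler:
    """Turn pairs of clicks (first image, last image) into per-image labels."""

    def __init__(self, n_images):
        self.labels = [None] * n_images  # None means "not labeled yet"
        self.next_label = 1
        self.range_start = None  # index of the first click of a pair

    def click(self, index, button=None):
        # `button` is accepted so the method can be registered as an
        # on_click callback via functools.partial(labeler.click, index).
        if self.range_start is None:
            self.range_start = index  # first click: remember where the range starts
            return
        # Second click: label everything between the two clicks, inclusive.
        first, last = sorted((self.range_start, index))
        for i in range(first, last + 1):
            self.labels[i] = self.next_label
        self.next_label += 1
        self.range_start = None


labeler = RangeLabeler(6)
# Simulate the clicks from the question: pictures 1 to 3, then 4, 5 and 6 alone.
for first, last in [(0, 2), (3, 3), (4, 4), (5, 5)]:
    labeler.click(first)
    labeler.click(last)
print(labeler.labels)  # [1, 1, 1, 2, 3, 4]
```

Closing a range with the second click avoids the need for a double-click or a separate "next label" button. A single picture is labeled by clicking it twice.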
The qsl package provides widgets that do this. For your case, the following code would allow you to label images in batches. Full disclosure, qsl is a project I started because I, like you, wanted to label images from inside Jupyter notebooks.
import qsl
from IPython.display import display
labeler = qsl.MediaLabeler(
    items=[
        {"target": "https://i.stack.imgur.com/cML6z.jpg"},
        {"target": "https://i.stack.imgur.com/6EVAP.jpg"},
        {"target": "https://i.stack.imgur.com/CAxUw.jpg"},
        {"target": "https://i.stack.imgur.com/8fhan.jpg"},
        {"target": "https://i.stack.imgur.com/eMXn5.jpg"},
        {"target": "https://i.stack.imgur.com/YFBfM.jpg"}
    ],
    # Optional, you can also configure the labeler from
    # the UI.
    config={
        "image": [
            {
                "name": "Type",
                "options": [
                    {"name": "Foo"},
                    {"name": "Bar"}
                ]
            }
        ]
    },
    # Optional, set to 1 if you want to label
    # one image at a time.
    batch_size=4,
    # Optionally, save labels to JSON. You
    # can also get the labels using `labeler.items`.
    jsonpath="labels.json"
)
display(labeler)
This generates a UI that looks like this.
Here is a Google Colab notebook that shows how to do this in Google Colab.

Adding a badge to an xpages applayout title bar node item

Is it possible to add a simple badge to a title bar node item? Each tab in my layout represents a separate application. My users would like to see a number next to the name on the tab to indicate whether there are documents in that application needing their attention. If no documents, then no badge, just the application name by itself. If there are documents, then display the application name and a badge for the number. I'm not finding a way to include the span tag in a PageLinkNode or BasicLinkNode.
PageLinkNode (xe:pageTreeNode) label can only be plain text.
You can use the image property though:
<xe:this.titleBarTabs>
    <xe:pageTreeNode
        page="..."
        label="Application A"
        image="#{javascript: var nr = 5;
            nr > 0 ? ('badge' + Math.min(nr, 10) + '.gif') : ''}">
    </xe:pageTreeNode>
</xe:this.titleBarTabs>
Add to Resources/Images ten pictures badge1.gif, badge2.gif, ... badge10.gif with the numbers as pictures. badge10.gif would be a 9+ picture.
As an alternative, you could provide the number of documents as part of the label (e.g. "Your Application Name [23]") and convert it on client side onClientLoad event to HTML label + badge-span.

Resources