Extract data from tables stored in the form of an image and store in csv/excel file - conv-neural-network

I am trying to detect tables stored as images in a multipage PDF, extract the contents of each table, and store them in a CSV/Excel file.
I am able to detect and tabulate the contents of the tables where there are horizontal and vertical lines for row and column separation.
Eg, table with horizontal and vertical lines
However, if a table has no borders and no horizontal or vertical lines separating rows and columns, I am unable to detect the table or extract its contents.
Example of sample images:
table with no horizontal or vertical lines 1
table with no horizontal or vertical lines 2
There are two parts to this problem. First, I need to detect tables in a multipage PDF; second, I need to extract the contents of those tables and store them in a CSV/Excel file.
Is there any existing solution to this problem?

You could use a tool like this: https://extracttable.com/. I have used it before, and it seemed to extract table data accurately.
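If you would rather experiment yourself, one common heuristic for borderless tables is to infer rows and columns from whitespace alone. The following is a minimal sketch, not a description of how extracttable.com works. It assumes an OCR step (not shown) has already produced word positions as `(text, x_left, y_center)` tuples. The function names and the `row_gap` / `col_gap` pixel thresholds are hypothetical and would need tuning for real scans.

```python
import csv
import io


def cluster_1d(values, gap):
    """Group numbers into clusters; a jump larger than `gap` starts a new cluster."""
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= gap:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters


def words_to_table(words, row_gap=10, col_gap=40):
    """Arrange OCR words into a grid using only their positions.

    words: iterable of (text, x_left, y_center); this input format is an assumption.
    """
    words = list(words)
    row_clusters = cluster_1d([y for _, _, y in words], row_gap)
    col_clusters = cluster_1d([x for _, x, _ in words], col_gap)
    # Map every coordinate to the index of the cluster it landed in.
    row_of = {y: i for i, c in enumerate(row_clusters) for y in c}
    col_of = {x: j for j, c in enumerate(col_clusters) for x in c}
    table = [["" for _ in col_clusters] for _ in row_clusters]
    # Visit words row by row, left to right, so multi-word cells keep reading order.
    for text, x, y in sorted(words, key=lambda w: (row_of[w[2]], w[1])):
        i, j = row_of[y], col_of[x]
        table[i][j] = (table[i][j] + " " + text).strip()
    return table


def table_to_csv(table):
    """Serialise the grid as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(table)
    return buf.getvalue()


# Made-up word positions standing in for real OCR output.
sample_words = [
    ("Name", 10, 20), ("City", 200, 21),
    ("Ann", 10, 50), ("New", 200, 51), ("York", 235, 50),
]
sample_table = words_to_table(sample_words)
sample_csv = table_to_csv(sample_table)
```

This only works when columns are left-aligned and separated by clearly wider gaps than the spaces between words; right-aligned numbers or wrapped cells would need a more careful approach.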

Related

Hiding certain columns on an Excel table

I've been trying to hide table columns on my Excel spreadsheet. I could hide entire columns if my data were not in table form, but I cannot do that here because it would also hide the information underneath the table. For the purposes of this spreadsheet, that information needs to stay below the table. So I can't really convert the table, and I can't hide the columns that are irrelevant.
Does anyone have a solution for this (this seems like a basic problem but I'm relatively new to Excel)?
You don't mention whether the number of rows in the table above changes, but another option is to group the rows of the table (Data ---> GROUP) and then collapse them. Select ALL rows relevant to the table and then click GROUP. To the left of the row numbers you'll have a line to click (with a + or -) to expand or collapse the data. Visually, it will look as if only the data below is present, and you can set print ranges to cover only the data below.
Hope that helps
You can only hide full columns. If hiding the data in the table is important, then the data below needs to be moved to a different sheet. Or, if it only needs to be hidden when printed, then you can change the font color to match the background color.

Is there a way to shrink an Excel table to fit the data?

I have an Excel table that several other places in the spreadsheet use for various reasons, and then I realized that the table had bad data. I collected new data, and there were fewer rows in the new data set than in the previous table.
Is there a way I can simply shrink the table to reflect the new count of data?
I'm not at all sure I understand your requirement, but I'm guessing you want to reduce the size of a Table without deleting entire rows in your spreadsheet (because of content present elsewhere in the same rows as your Table data).
If so, merely select the area of your Table to be deleted and press Delete. If you want to remove the formatting that remains, select the angle icon shown at the extreme bottom right of your Table and drag it up to suit.
I am assuming the (Table) rows to be deleted are a contiguous block at the bottom of the Table.

dynamic linking of word and excel

This is my first question and it would have been nice to include an image, but it seems that I can't. I have seen some answers to my problem, but they always seem to go in one direction: Excel->Word. However, I want Word->Excel->Word as described below.
I have an Excel workbook that draws a graph. Some of the input data is pre-calculated, and one or two parameters are entered manually in a table in the spreadsheet.
I want to do the following.
The graph and the data entry table are to be displayed in Word. The table in Word is also part of the spreadsheet but must be displayed separately. I want to be able to enter the variables in the table in the Word document. These values are linked to the Excel sheet, so the graph will change and the updated version will be displayed in the Word document. I have attached two images: the data entry table, where the entry fields are those in green, and the resulting graph. Any help would be appreciated.
I can't see your pictures, so the following is theoretical.
1) Open Word
2) Insert Chart - this opens an instance of Excel which is directly linked to the Chart, which I'll refer to as ExcelChartData to avoid confusion.
3) Edit the range in ExcelChartData to match what you have in your Excel data workbook.
4) Link the ranges in ExcelChartData to the required ranges in your Excel data workbook.
Changing the values in your Excel data workbook will then change the chart.

Create charts with interlaced data

Is it possible to create charts using interlaced data in Excel 2007?
Example: 3 parts are in a test, and the app that collects data from them creates a CSV file with one line per part.
Example:
Part#,Voltage,Freq,Mode
001,3.453,6546,1000
002,3.542,6543,1000
003,3.484,6654,1000
001,3.453,6543,1000
002,3.642,6764,1000
...etc.
I would like to create a chart that plots the data from these three parts in their own series. I know that autofilter can be used to hide data from parts I don't want, but that only allows me to see one part at a time.
Furthermore, I would like to automate this process to create charts for an arbitrary number of parts. Is there a filter function I can use somewhere?
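If pre-processing the CSV outside Excel is an option, one approach is to pivot the interlaced rows so that each part gets its own column, which Excel can then chart as separate series. The following is a minimal Python sketch under that assumption. The function names `split_by_part` and `pivot_column` and the added `Sample` column are hypothetical, and the sample text is just the five rows shown above.

```python
import csv
import io
from collections import defaultdict


def split_by_part(csv_text, key="Part#"):
    """Group interlaced CSV rows by part number, preserving their order."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        part = row.pop(key)
        series[part].append(row)
    return dict(series)


def pivot_column(series, column):
    """Build a grid with one column per part for a single measurement.

    The first column is a running sample index; parts with fewer
    readings are padded with empty strings.
    """
    parts = sorted(series)
    longest = max(len(series[p]) for p in parts)
    grid = [["Sample"] + parts]
    for i in range(longest):
        cells = [series[p][i][column] if i < len(series[p]) else "" for p in parts]
        grid.append([i + 1] + cells)
    return grid


sample = """Part#,Voltage,Freq,Mode
001,3.453,6546,1000
002,3.542,6543,1000
003,3.484,6654,1000
001,3.453,6543,1000
002,3.642,6764,1000
"""

by_part = split_by_part(sample)
voltage_grid = pivot_column(by_part, "Voltage")
# voltage_grid can be written out with csv.writer and charted as three series.
```

Because the parts are discovered from the data, the same code handles an arbitrary number of parts without changes.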

Different results exporting to CSV or Excel

I have a simple report that I want to export to a CSV file. There is only the detail line, grouped by one field, with no group header and a group footer for totals. The problem is that when I export to CSV format, the total row for a group is listed in front of every record.
If I export to Excel and then save as a CSV file, the total row is where it belongs. However, one field is spread across 3 columns, and those columns are then "merged and centered", which adds two commas in the middle of the line. In addition, one column is added at the beginning of the record and two at the end, for 3 more extra commas.
It would be easy enough to write a macro to "clean up" the spreadsheet and export it as a CSV file for my end users. However, corporate "insecurity" will not allow the end users to have macros.
Any help, suggestions, pointers to where else to look greatly appreciated.
cheers
bob
The CSV generated by any standard reporting tool uses a flat data structure and hence repeats the group-level data for every record.
The XLS files generated by reporting tools are typically meant to be opened in Excel, and it is default Excel behaviour to add extra commas for every merged cell when saving as CSV.
The best way is to create a report layout whose columns have equal data lengths, even for the header; i.e., while formatting the report, do not center the header with a larger length, bold, italics, etc. Put it in the first column and match its length to the data in the detail record.
This way you would get a report that does not look presentable in XLS but gives you the required data in the CSV.
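If changing the report layout is not possible and a post-processing step outside Excel is acceptable, the extra commas from merged cells can also be removed by dropping every column that is empty in all rows. The following is a minimal Python sketch under that assumption. The function name `drop_empty_columns` is hypothetical, and the sample text is made up to mimic the pattern described in the question (one extra column at the start, two in the middle, two at the end).

```python
import csv
import io


def drop_empty_columns(csv_text):
    """Remove columns that contain no data in any row of the CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    width = max((len(r) for r in rows), default=0)
    # Pad short rows so every row has the same number of cells.
    rows = [r + [""] * (width - len(r)) for r in rows]
    keep = [j for j in range(width) if any(r[j].strip() for r in rows)]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(
        [[r[j] for j in keep] for r in rows]
    )
    return out.getvalue()


# Made-up export with empty columns left behind by merged cells.
messy = ",A,,,B,,\n,1,,,2,,\n"
cleaned = drop_empty_columns(messy)
```

A column is only dropped when it is empty in every row, so a genuinely blank value in an otherwise populated column is left in place.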
