Import pdf into specific worksheet - excel

There are lots of topics about this, but I wasn't able to find an answer to my question. I want to select a PDF file and import all the text from it into a specific sheet, let's call it sheet2. Please note that there is a new PDF file every day, so it cannot be read from a fixed location; the file has to be selected manually each day.
Any ideas?

If you can waive the limitation of using only VBA, you could use iText (combined with Apache POI). iText is more than capable of extracting the text from a PDF file, and the Apache POI library allows you to generate Office documents (such as Excel workbooks).

Related

How can I extract text from PDF to Excel

I have a scanned image that I converted to a PDF file. The image contains a table (rows and columns), and I want to extract the text from that table into an Excel file. Any ideas? Is there a good website, tool, or program I can use?
I tried a lot of websites to extract the text, but none of them worked.
Do you have Microsoft Excel? If you do, first convert the PDF to a JPEG. Then, in Microsoft Excel:
1. Create a new document.
2. Go to the Data tab.
3. Choose "Data From Picture".
4. Choose "Picture From File".
5. Follow the on-screen instructions to finish converting the picture to a table.
You'll also have the option of correcting any inaccuracies before adding the data to your spreadsheet.
That's all!

Convert excel file to PDF using Java API

I have an ordinary Excel file that contains several graphical elements.
I am reading this file using the POI API in Java, and I am able to convert it to a PDF table using the iText jar.
However, the formatting is not carried over into the PDF (e.g., merged cells end up in a single column, and other formatting and settings are lost).
A simple PDF table is created, like below.
However, when I convert with MS Excel and save as PDF, I get the output below, which keeps all the formatting details and looks perfect.
How do I retain the same format as in Excel? I have done some googling on this and found some methods, such as the OpenOffice API, but all of them convert files on the local machine. What if I send my Tomcat build to a client machine where OpenOffice is not installed? I need a solution for that. Any help would be useful.

Macro which searches keywords in PDF and gives page number

I want an Excel macro that searches for words in a PDF and returns the page number where each word is found. I have 20 words that I want to search for in the PDF. I have put the keywords in column A of the Excel spreadsheet, and I want to populate the page numbers in column B. Please note that I am currently using Adobe Reader XI, so please help me with code that also works with Adobe Reader XI.
This is more of a direction and not an answer.
Try searching for command-line tools that export OCR data into a text file. I've looked at them before, and a few gave me the option of looking at a particular page of a PDF. All of the tools I looked at require a purchase (I was trying to OCR a barcode and could not find a free tool for that), but there are some free ones out there.
That said, using Excel will make this project harder. I would look at using PowerShell or some other scripting language and exporting the results to a CSV file.
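As a rough sketch of that scripting approach (Python is used here in place of PowerShell), the snippet below assumes the PDF text has already been extracted into one string per page by whatever OCR or text-extraction tool you end up with; that step is not shown. The function names `find_keyword_pages` and `write_csv` and the sample page texts are made up for illustration.

```python
import csv
import io


def find_keyword_pages(pages, keywords):
    """Map each keyword to the 1-based page numbers whose text contains it.

    `pages` is assumed to be a list with one already-extracted text string
    per PDF page. Matching is a case-insensitive substring search.
    """
    hits = {}
    for keyword in keywords:
        needle = keyword.casefold()
        hits[keyword] = [
            number
            for number, text in enumerate(pages, start=1)
            if needle in text.casefold()
        ]
    return hits


def write_csv(hits, stream):
    """Write one row per keyword; page numbers are joined with ';'."""
    writer = csv.writer(stream)
    writer.writerow(["keyword", "pages"])
    for keyword, numbers in hits.items():
        writer.writerow([keyword, ";".join(str(n) for n in numbers)])


# Hypothetical sample data standing in for the extracted page texts.
pages = [
    "Invoice total due",
    "Shipping address and invoice number",
    "Terms",
]
hits = find_keyword_pages(pages, ["invoice", "terms", "refund"])

buffer = io.StringIO()  # replace with open("results.csv", "w", newline="") to write a file
write_csv(hits, buffer)
print(buffer.getvalue())
```

The resulting CSV opens directly in Excel, with the keywords in the first column and the matching page numbers in the second.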
Hope this helps.

Reading custom data format into Excel VBA

I have a file with the extension *.qub that I would like to read into Excel. The file contains spacecraft instrument data. The file has two parts: an SFDU-defined text header followed by binary data.
Ideally, we would like the user to be able to access the files using Excel's built-in File-->Open and File-->Save functionality to import/export this *.qub format.
What is the best way to create a custom file reader/writer in Excel?
Thank you!

save a cfdocument as an excel file

Is there a workaround to use the cfdocument tag to save a page/file as an Excel sheet instead of a PDF file?
I already have a process set up to make pdf files and email them out and would like to give my customers the option of getting an excel file instead. It would be nice if I could reuse the code I already have instead of having to rewrite it in POI or something like that.
The type of data within a PDF is not usually the same type of data that makes sense for Excel. That being said, there are multiple other ways to create Excel spreadsheets.
In ColdFusion 9, it's native. Just use the cfspreadsheet tag. In CF8... well, it looks like you have POI. So use that. ;) Ben Nadel also has a nice wrapper for POI, so you can consider that too.
The thing is - you will not be able to go from CFDOCUMENT to a spreadsheet since it is really a different type of data.
In ColdFusion 9 use the cfspreadsheet tag and/or spreadsheet functions. That creates a real Excel file.
In ColdFusion 8 and below the easiest way is to use the html table > Excel hack/trick. Put your data in a standard html table, save them in a file with a .xls extension and email them to your users. When the user opens the file Excel will convert the html table to an Excel spreadsheet. You could also send the content to the browser by adding at the top of the page. With this method make sure that you are only sending an html table for best results.
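To make the HTML table trick concrete, here is a minimal sketch of the file-based variant. It is written in Python rather than ColdFusion purely for illustration; the helper name `rows_to_html_table`, the sample rows, and the `report.xls` file name are assumptions, not part of the original setup.

```python
import html
import os
import tempfile


def rows_to_html_table(rows):
    """Render a list of rows as a bare HTML table, escaping cell values."""
    lines = ["<table>"]
    for row in rows:
        cells = "".join(f"<td>{html.escape(str(value))}</td>" for value in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical sample data; the first row serves as the header row.
rows = [["Customer", "Amount"], ["Acme & Co", 125.5]]
table = rows_to_html_table(rows)

# Save the HTML under a .xls name so that Excel opens it as a spreadsheet.
path = os.path.join(tempfile.gettempdir(), "report.xls")
with open(path, "w", encoding="utf-8") as handle:
    handle.write(table)

print(table)
```

The file contains nothing but the table, in line with the advice above. Newer versions of Excel may warn that the file format and extension do not match before opening it.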
