Convert word document to excel - excel

I have a Word document that needs to be converted to a table.
The catch is that the document contains a thousand pages, and each page needs to be an individual cell in the Excel sheet. When I copy and paste from Word, each line gets converted to one cell, which I don't want. I need all the content between two page breaks to be part of one cell.
To give some background, I basically need to create a CSV from the Word file such that each page of the document is one value, hence I am trying to create a table.
Is there a way to automate this?

Found my solution here:
https://superuser.com/questions/747197/how-do-i-copy-word-tables-into-excel-without-splitting-cells-into-multiple-rows
It basically involves replacing the line breaks in my Word file with 'pilcrow' (¶) characters, and then doing the reverse in Excel.
One important thing though: the article says to type 'alt+0010' (the key combination for a line break) when replacing the pilcrows in Excel. However, that did not work for me. Ctrl+J does the trick though; it inserts a line break character into the Excel replace box.
Cheers :)
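For anyone who would rather script this than use find and replace, here is a minimal Python sketch of the same idea. It assumes the document has already been saved as plain text and that each page break shows up in that text as a form-feed character ("\f"). Both of those are assumptions, and pages_to_csv is just a name made up for this example. The csv module quotes any field that contains line breaks, so each page should stay in a single cell.

```python
import csv
import io

def pages_to_csv(text, page_break="\f"):
    """Return CSV text with one row (one cell) per page of `text`.

    Assumes pages are separated by `page_break` (form feed by default).
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for page in text.split(page_break):
        page = page.strip()
        if page:  # skip empty pages
            writer.writerow([page])
    return out.getvalue()

# Two "pages": the first one has two lines, the second has one.
sample = "Page one, line 1\nPage one, line 2\fPage two"
csv_text = pages_to_csv(sample)
print(csv_text)
```

Writing csv_text to a .csv file and opening it in Excel should give one cell per page, with the line breaks kept inside the cell.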

Related

office 365 excel csv hyperlink not displaying correctly when imported to excel [duplicate]

Can Excel interpret the URLs in my CSV as hyperlinks? If so, how?
You can actually do this and have Excel show a clickable link. Use this format in the CSV file:
=HYPERLINK("URL")
So the CSV would look like:
1,23.4,=HYPERLINK("http://www.google.com")
However, I'm trying to get some links with commas in them to work properly and it doesn't look like there's a way to escape them and still have Excel make the link clickable.
Does anyone know how?
When embedding the HYPERLINK function, you need to watch the quotes. Below is an example of a CSV file that lists an error and a link to the documentation for the method that failed. (A bit esoteric, but that's what I am working on.)
"Details","Failing Method (click to view)"
"Method failed","=HYPERLINK(""http://some_url_with_documentation"",""Method_name"")"
I read all of these answers and some others but it still took a while to work it out in Excel 2014.
The result in the csv should look like this
"=HYPERLINK(""http://www.Google.com"",""Google"")"
Note: If you are trying to set this from MSSQL server then
'"=HYPERLINK(""http://www.' + baseurl + '.com"",""' + baseurl + '"")"' AS url
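If the CSV is generated by a script, the quote doubling does not have to be done by hand. The sketch below uses Python's standard csv module, which wraps a field in quotes and doubles any embedded quotes whenever the field contains a comma or a quote. The helper name hyperlink_formula is made up for this example.

```python
import csv
import io

def hyperlink_formula(url, label):
    """Build the HYPERLINK formula as it should appear inside the cell."""
    return f'=HYPERLINK("{url}","{label}")'

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
# The writer adds the outer quotes and doubles the inner ones.
writer.writerow(["Details", hyperlink_formula("http://www.Google.com", "Google")])
csv_line = out.getvalue()
print(csv_line)
```

The printed line has the same doubled-quote shape as the hand-written examples above.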
You can URL-encode the commas inside the URL so that the URL is not split across multiple cells.
Just replace each comma with %2c:
http://www.xyz.com/file,comma.pdf
becomes
=hyperlink("http://www.xyz.com/file%2ccomma.pdf")
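As a small sketch of that replacement in Python (encode_commas is a made-up helper name, and only commas are touched, nothing else in the URL is encoded):

```python
def encode_commas(url):
    """Replace every comma in the URL with its percent-encoded form."""
    return url.replace(",", "%2c")

encoded = encode_commas("http://www.xyz.com/file,comma.pdf")
formula = f'=hyperlink("{encoded}")'
print(formula)
```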
Yes, but it's not possible to link them automatically. CSV files are just text files - whatever opens and reads them is responsible for allowing you to click the link.
As to how Excel seems to handle CSV files - everything between commas is interpreted as if it already had been typed into the cell. Therefore, the CSV file containing ="http://google.com",=A1 will display as http://google.com,http://google.com in Excel. It's important to note, however, that hyperlinks in Excel are metadata, and not the result of anything in the actual cell (ie, a hyperlinked cell to Google still contains http://google.com not <a>http://google.com</a> or anything of that sort.)
Since that's the case, and all metadata is lost when converting to a CSV, it's impossible to tell Excel you wish for something to be hyperlinked merely by changing the cell value. Normally, Excel interprets your input when you hit 'Enter' and links URLs then, but since CSV data is not being entered, but rather already exists, this does not happen.
Your best bet is to write some sort of addon or macro to run when you open up a CSV which parses every cell and hyperlinks them if they match a URL format.
Use this format:
=HYPERLINK(""<URL>"";""<LABEL>"")
e.g.:
=HYPERLINK(""http://stackoverflow.com"";""I love stackoverflow!"")
P.S. The same format works in LibreOffice Calc as well.
"=HYPERLINK(\"\" " + "http://www.mywebsite.com"+ "\"\")"
Use this format before writing to CSV.
As described above, "=HYPERLINK(""http://www.google.com"", ""Google"")" is what worked for me.
However, in Excel Version 2204 (Click-to-Run), I couldn't have leading white space before the quoted field.
For example:
FirstName, "=HYPERLINK(""http://www.google.com"", ""Google"")" fails (space after the comma)
FirstName,"=HYPERLINK(""http://www.google.com"", ""Google"")" succeeds
The issue for me was that, because a .CSV file is by its nature comma separated, any commas in the text file are interpreted as separators. It worked for me when I used tab characters as separators and saved the file as .TXT, so that when opening it in Excel you choose the tab character as the delimiter rather than ','.
In the text file …
## ensure that the file is TAB separated
Item 1 A file Name data.txt
Item 2 Col 2 =HYPERLINK("http://www.ilexuk.com","ILEX")
"ILEX" is then shown in the cell and "http://www.ilexuk.com" is the hyperlink for the cell.

Copy incorrect words in excel

I need to find and copy a word (or words) in a string. The condition is that the word is misspelled. Essentially, I want to copy all the words that would get a wiggly red underline in a browser, MS Word, etc.
I am doing this to extract brand names from hundreds of thousands of free-text cells. Since brand names are usually not dictionary words (for searchability and identifiability), this approach would help find the majority of them.
It doesn't have to be an excel functionality, I am open to any tool that works.
Moving them directly into Excel is tedious, as shown by the link in the previous answer. If you would like a generated list of the misspelled words, follow the instructions on this site:
http://www.techrepublic.com/blog/microsoft-office/a-word-macro-that-highlights-and-lists-misspelled-words/
The code copies the misspelled words into a new document for you, so they will be isolated from your original document. Then you can apply any formatting or data analyses if you need it.
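Since the question is open to any tool, here is a rough Python sketch of the same idea outside Word. This is not the macro from the link; it simply flags every word that is missing from a set of known words. The tiny dictionary set and the sample sentence are made up for illustration, and a real run would need a proper word list.

```python
import re

def unknown_words(text, dictionary):
    """Return the words in `text` that are not in `dictionary` (case-insensitive)."""
    words = re.findall(r"[A-Za-z']+", text)
    return [w for w in words if w.lower() not in dictionary]

# Hypothetical word list; replace with a real dictionary file.
dictionary = {"i", "bought", "a", "new", "phone", "and", "case"}
found = unknown_words("I bought a new Xyphone and Zentro case", dictionary)
print(found)  # ['Xyphone', 'Zentro']
```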

excel breaks content in row to another row

I have an Excel sheet that I exported from a website, and I have noticed that in some rows the content jumps to a new line. I have searched online but found no credible answer to my problem.
What causes this, and how can it be fixed?
I have even tried copying the cells one by one to put them on the same line, but I can't keep doing that.
Here is a link to my file:
download
so that you can see what I am talking about.
The address field in your file contains newlines in certain records. I suggest you open the file in Notepad and join these lines together before importing the file (make sure you turn word wrap off to see the lines correctly).
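If there are too many records to join by hand, the same clean-up can be sketched in Python. This assumes the export is a CSV in which fields containing newlines are wrapped in double quotes; if the file is not quoted that way, the broken lines cannot be told apart from real rows and this will not help. The column names in the sample are made up.

```python
import csv
import io

def flatten_newlines(csv_text):
    """Rewrite CSV text so that no field contains a line break."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Collapse any run of whitespace, including line breaks, to one space.
        writer.writerow([" ".join(field.split()) for field in row])
    return out.getvalue()

sample = 'name,address\nAnn,"12 High St\nSpringfield"\n'
cleaned = flatten_newlines(sample)
print(cleaned)
```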

How do I stop MS Word from auto-left-aligning new paragraphs generated from linked Excel objects?

I have created a form letter using an Excel spreadsheet as a forming tool connected to a database, and I use paste-link to connect the results to an MS Word document.
Each section of the document draws from a single cell, which uses a formula to assemble its text from several other cells, based on logic that depends on the data from the database queries.
All of this functions perfectly well.
The problem arises when the generated blocks of text from Excel include two carriage-returns in a row, creating what MS Word thinks is a new paragraph (and technically it is). The rest of the letter is justified, and I have attempted to set justified text as the default alignment. But no matter what I try, any newly formed paragraphs generated inside of linked text from Excel will be left-aligned.
For this form letter to function properly it must have justified text throughout. Inconsistent formatting won't be accepted by management.
To be clear, I have attempted to modify the settings of the "Normal" style of the document in Word, as well as creating a new style based on Normal called "Justified" and setting that as the default by selecting it and clicking "Change Styles" -> "Set as Default".
The first paragraph of any given block will always remain justified-aligned, it is only subsequent, newly-created (as far as MS Word knows) paragraphs that aren't. So I suspect I am just not setting the default properly or...I don't know, something.
I tried linking as unformatted text but that, for some maddening reason, includes QUOTATION MARKS bookending the text! I'm baffled and frustrated.
Please help. I don't like to look the fool at work.
While I still do not know how to make Word insert new paragraphs into linked blocks of text without left-aligning them, I have a working solution to my particular problem.
By forcing my spreadsheet to create blocks of text with the maximum number of paragraphs, then forcibly justifying the output in MS Word, I was able to ensure that, as long as I close the document between updates, the text blocks will only shrink in size rather than grow. This way, Word does not treat the updated text as a "new" paragraph, as there was already a paragraph in that block.
I saved the Word document with this overabundance of paragraphs, and put the Excel spreadsheet back the way it was.

Excel - Variable number of leading zeros in variable length numbers?

The format of our member numbers has changed several times over the years, such that 00008, 9538, 746, 0746, 00746, 100125, and various other permutations are valid, unique and need to be retained. Exporting from our database into the custom Excel template needed for a mass update strips the leading zeros, such that 00746 and 0746 are both truncated to 746.
Inserting the apostrophe trick, or formatting as text, does not work in our case, since the data seems to be already altered by the time we open it in Excel. Formatting as zip won't work since we have valid numbers less than five digits in length that cannot have zeros added to them. And I am not having any luck with "custom" formatting as that seems to require either adding the same number of leading zeros to a number, or adding enough zeros to every number to make them all the same length.
Any clues? I wish there was some way to set Excel to just take what it's given and leave it alone, but that does not seem to be the case! I would appreciate any suggestions or advice. Thank you all very much in advance!
UPDATE - thanks everybody for your help! Here are some more specifics. We are using a 3rd party membership management app -- we cannot access the database directly, we need to use their "query builder" tool to get the data we want to mass update. Then we export using their "template" format, which is called XLSX but there must be something going on behind the scenes, because if we try to import a regular old Excel, we get an error. Only their template works.
The data is formatted okay in the database, because all of the numbers show correctly in the web-based management tool. Also, if I export to CSV, save it as a .txt and import it into Excel, the numbers show fine.
What I have done is similar to ooo's explanation below -- I exported the template with the incorrect numbers, then exported as CSV/txt, and copied / pasted THOSE numbers into the template and re-imported. I did not get an error, which is something I guess, but I will not be able to find out if it was successful until after midnight! :-(
Assuming the data is not corrupt in the database, then try and export from the database to a csv or text file.
The following can then be done to ensure the import is formatted correctly.
For a text file with a comma delimiter:
In Excel, go to Data > From Text, select Delimited, then click Next.
In step 3 of the import wizard, for each column/field you want as text, highlight the column and select Text.
The data should then be imported as text and retain its leading zeros.
Again, all of this assumes the database contains non-corrupt data and you are able to export a simple text or csv file. It also assumes you have Excel 2010 but it can be done with minor variation across all versions.
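To check what the exported file really contains before Excel touches it, a short Python sketch like the one below can help. The csv module reads every field as a string, which is the scripted counterpart of choosing Text in the import wizard. The column name member_id and the sample rows are made up for illustration.

```python
import csv
import io

# Stand-in for the exported CSV file; the column names are hypothetical.
sample = "member_id,name\n00746,Ann\n0746,Bob\n746,Cy\n"

rows = list(csv.DictReader(io.StringIO(sample)))
ids = [row["member_id"] for row in rows]
print(ids)  # every value is still a string, leading zeros included

# Converting to int is what loses the distinction between the three IDs.
as_numbers = {int(member_id) for member_id in ids}
print(as_numbers)
```

If the leading zeros are present at this stage, the export is fine and the loss happens when Excel parses the file.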
Hopefully, @ooo's answer works for you. I'm providing another answer mainly for informational purposes, and because I don't feel like dealing with the constraints on comments.
One thing to understand is that Excel is very aggressive about treating "numeric-looking" data as actual numbers. If you were to open the CSV by double-clicking and letting Excel do its thing (rather than using ooo's careful procedure), those numbers would still have come up as numbers (no leading zeros). As you've found, one way to counteract this is to append clearly nonnumeric characters onto your data (before Excel gets its grubby hands on it), to really convince Excel that what it's dealing with is text.
Now, if the thing that uploads to their software is a file ending in .xlsx, then most likely it is the current Excel format (a compressed XML document, used by Excel 2007 and later). I suppose by "regular old Excel" you mean .xls (which still works with the newer Excels in "compatibility mode").
So in case what you've tried so far doesn't work, there are still avenues to explore before resorting to appending characters to the end of your data. (I'll update this answer as needed.)
You're on the right track with the apostrophe.
You'll need to store your numbers in Excel as text at the time they are added to the file.
What are you using to create the original Excel file / export from the database?
This is likely where your focus needs to be.
For example, one approach is to modify the database export so that it adds the ' prefix before each number, so that Excel knows to treat it as text.
I use the formula =TEXT(cell,"# of zeros of the field") to add leading zeros.
For example, cell C2 has 12345 and I need it to be 10 characters long. I would use =TEXT(C2,"0000000000").
The result will be 0000012345.
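The same padding can be sketched in Python with str.zfill, in case the numbers are being prepared outside Excel. The function name pad_member_number is made up for this example.

```python
def pad_member_number(value, width=10):
    """Left-pad the value with zeros up to `width` characters."""
    return str(value).zfill(width)

padded = pad_member_number(12345)
print(padded)  # 0000012345
```

Note that padding to a fixed width turns 746, 0746 and 00746 into the same string, so this only helps when every number is meant to have the same length.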

Resources