I have a CSV file which is filled automatically by a Java program. One line contains the following text when I open the file in Notepad++:
-LRB- from the PMI Practice Standard for Work Breakdown Structures , Oct 2000 -RRB- '',"no","f1_FRAG:1.0","f2_specialChar:1.0","f3:15.0","f4:7.0","f5:0.0","f6:2.0","f7:0.0","f8:3.7612001156935624","f9:7.0","f10:1.0","f11:1.0","f12:0.0","f13:0.0","f14:0.0,"f15_ROOT:1.0","f16_specialChar:1.0","f17_NOTHING:1.0","f18_IN:1.0""
But when I open it in Excel, there are two problems:
1) When I click on the cell, I see a #NAME? error, and any click on the page causes an error. I can't even close the Excel window normally. I also sometimes see something like =A228 or =B223 when I click on the cell. It seems to be read as a formula, but it actually isn't.
2) The row is not shown completely. I can't see this part when I open the file in Excel:
",f15_ROOT:1.0","f16_specialChar:1.0","f17_NOTHING:1.0","f18_IN:1.0"".
Any help is appreciated.
Since the row starts with a - (minus sign), Excel is expecting a formula.
Manually, you could either:
add an ' (apostrophe) at the beginning of the line (which tells Excel that the cell contains text), or
format the cell as text: Right-click the cell → Format Cells → Number tab → Text
Ideally, to prevent this issue in the future, the Java program which generates the .CSV file should be changed to enclose text fields with " double quotation marks.
Oddly, that is the only field in your example that isn't surrounded by double quotes.
"-LRB- from the PMI Practice Standard for Work Breakdown Structures , Oct 2000 -RRB- ''","no","f1_FRAG:1.0","f2_specialChar:1.0","f3:15.0","f4:7.0","f5:0.0","f6:2.0","f7:0.0","f8:3.7612001156935624","f9:7.0","f10:1.0","f11:1.0","f12:0.0","f13:0.0","f14:0.0,"f15_ROOT:1.0","f16_specialChar:1.0","f17_NOTHING:1.0","f18_IN:1.0""
At a minimum, double quotes should be used around any field that begins with a symbol or contains a comma (like the one above).
1997,Ford,E350,"Super, luxurious truck"
The double quotes will be recognized and removed by most apps that open CSV files.
Any field may be quoted (that is, enclosed within double-quote characters). Some fields must be quoted, as specified in the following rules.
"1997","Ford","E350"
Fields with embedded commas or double-quote characters must be quoted.
1997,Ford,E350,"Super, luxurious truck"
Each of the embedded double-quote characters must be represented by a pair of double-quote characters.
1997,Ford,E350,"Super, ""luxurious"" truck"
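On the Java side, the quoting rules above can be sketched roughly like this. It is only a minimal illustration: the class and method names are made up for this example, and it quotes every field rather than deciding field by field. It also only covers the CSV quoting itself; how Excel then interprets a field that begins with a minus sign is a separate matter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not taken from the asker's program.
class CsvQuoting {
    // Quote one field: double every embedded double quote, then wrap in quotes.
    static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Build one CSV record from raw field values, quoting all of them.
    static String toCsvLine(List<String> fields) {
        return fields.stream()
                .map(CsvQuoting::quote)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(toCsvLine(Arrays.asList(
                "1997", "Ford", "E350", "Super, \"luxurious\" truck")));
    }
}
```

Quoting everything is the simplest choice for a generator, since any field may be quoted and readers strip the quotes again.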
More about Comma Separated Value files:
Wikipedia: CSV Files - Basic Rules
RFC 2046 Standard
RFC4180 Standard
Surprisingly, I can't find any reference document from Microsoft that mentions starting text cells with an apostrophe. (I guess it's a secret, so if anyone asks, you didn't hear it from me.) :-)
The reason you are getting the #NAME? error specifically is that Excel assumes you're trying to enter a formula (because of the minus sign) but doesn't recognize the name of the function ("LRB").
I have a question; hopefully you can help. I am using MS Excel 2016 and have data like that in the file attached below.
FILTERXML works for some cells and pulls the data, but for other cells with similar data it returns empty.
I tried cleaning the cell data so that it matches the cells where the formula does return a result, but it still did not work.
Your help will be greatly appreciated.
=IFERROR(FILTERXML("<a><b>"&SUBSTITUTE(SUBSTITUTE($A2,B$1&":","<r/>"),CHAR(10),"</b><b>")&"</b></a>","//b[r]["&COUNTIF($B$1:B$1,B$1)&"]"),"")
https://drive.google.com/file/d/1Xv4BUh8sObhVirqYK2fsLfwsydvrs4rv/view?usp=sharing
Your formula filters XML using XPath. In XML, some characters have a special meaning and must be escaped. For example, your text Pharmacy: 1ST AID PHARMACY & SURGICAL SUPPLIE must be written as Pharmacy: 1ST AID PHARMACY &amp; SURGICAL SUPPLIE in XML. This is because & marks the start of an entity in XML and needs to be escaped when it does not have that meaning.
So you must substitute every & with &amp;.
=IFERROR(FILTERXML("<a><b>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE($A2,"&","&amp;"),B$1&":","<r/>"),CHAR(10),"</b><b>")&"</b></a>","//b[r]["&COUNTIF($B$1:B$1,B$1)&"]"),"")
But there are other such characters. The character <, for example, must be written as &lt;. This is because < marks the start of a tag in XML and needs to be escaped when it does not have that meaning. Replace & first, so that the & in &lt; is not escaped a second time.
=IFERROR(FILTERXML("<a><b>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE($A2,"&","&amp;"),"<","&lt;"),B$1&":","<r/>"),CHAR(10),"</b><b>")&"</b></a>","//b[r]["&COUNTIF($B$1:B$1,B$1)&"]"),"")
Maybe there are others too, so the SUBSTITUTE chain gets longer and longer.
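For comparison, here is the same escaping written as a small Java sketch (Java is used purely for illustration; the class and method names are made up). XML predefines five entities, for &, <, >, " and ', and the ampersand has to be replaced first so that the ampersands introduced by the other replacements are not escaped again.

```java
// Hypothetical helper showing the order of replacements.
class XmlEscape {
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")   // must come first
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;")
                   .replace("'", "&apos;");
    }

    public static void main(String[] args) {
        System.out.println(escapeXml("Pharmacy: 1ST AID PHARMACY & SURGICAL SUPPLIE"));
    }
}
```

In the worksheet formula the same rule applies: the innermost SUBSTITUTE runs first, so that is where the & replacement belongs.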
I am trying to parse a CSV file which uses a single quote as the text qualifier. The problem is that some values enclosed in single quotes themselves contain a single quote,
e.g.:
'Fri, 24 Feb 2017 17:44:57 +0700','th01ham000tthxs','/','','Writer's Tools Data','7.1.0.0',
I am struggling to parse the file because, after this row, all of the remaining rows get displaced.
I tried OpenCSV and univocity-parsers but had no luck.
If I open the above row in Excel and specify a single quote as the text qualifier, it gives the correct result without any displacement of rows.
If you are using Java, the JRecord library should handle the file.
How it works: if a field starts with a quote (e.g. ,'), it specifically looks for an odd number of quotes (', ''', ''''' and so on) followed by either a comma or an end-of-line marker. This approach breaks down if:
The embedded quote is the last character in a field i.e. 'Field with quote '',
White space between the quote and comma i.e. 'Field' , or , '
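To make the heuristic concrete, here is a rough stand-alone sketch of the idea. It is not JRecord's actual code; the class name, method name and the way non-closing quotes are kept are all assumptions made for this illustration. A run of quotes closes the field only if the run has odd length and is followed by the separator or the end of the line; otherwise pairs are read as one literal quote and a leftover single quote is kept as-is.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a simplified take on the "odd number of quotes" rule.
class LenientQuoteSplitter {
    static List<String> split(String line, char quote, char sep) {
        List<String> fields = new ArrayList<>();
        int n = line.length();
        int i = 0;
        while (i <= n) {
            StringBuilder field = new StringBuilder();
            if (i < n && line.charAt(i) == quote) {
                i++; // skip the opening quote
                boolean closed = false;
                while (i < n && !closed) {
                    char c = line.charAt(i);
                    if (c != quote) {
                        field.append(c);
                        i++;
                        continue;
                    }
                    // Measure the run of consecutive quotes starting here.
                    int run = 0;
                    while (i + run < n && line.charAt(i + run) == quote) {
                        run++;
                    }
                    int after = i + run;
                    boolean atBoundary = after == n || line.charAt(after) == sep;
                    closed = atBoundary && run % 2 == 1;
                    // Pairs become one literal quote; a stray quote that
                    // does not close the field is kept literally.
                    int literal = closed ? (run - 1) / 2 : (run + 1) / 2;
                    for (int k = 0; k < literal; k++) {
                        field.append(quote);
                    }
                    i = after;
                }
            } else {
                // Unquoted field: read up to the next separator.
                while (i < n && line.charAt(i) != sep) {
                    field.append(line.charAt(i++));
                }
            }
            fields.add(field.toString());
            i++; // step over the separator (or past the end of the line)
        }
        return fields;
    }
}
```

With the row from the question, split(row, '\'', ',') keeps Writer's Tools Data together as one field; the trailing comma yields a final empty field. As noted in the list above, a field whose embedded quote is its last character is still misread, because the two quotes look like an escaped pair.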
Here is the line in ReCsvEditor
Also in the ReCsvEditor when editing the file, if you select Generate >>> Java Code >>> ... it will generate Java/JRecord Code to read the file.
Disclaimer: I am the author of JRecord / ReCsvEditor. Also, the ReCsvEditor Generate function is new and needs more work.
Try configuring univocity-parsers to handle the unescaped quote according to your scenario. 'Writer's Tools Data' has an unescaped quote. From your input, I can see you want to use STOP_AT_CLOSING_QUOTE as the strategy to work around these values.
Add this line to your code and it should work fine:
parserSettings.setUnescapedQuoteHandling(UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE);
Hope this helps.
I am entering a string in a Userform in Excel-VBA from the user side of the form. I would want to know how to enter the Long Hyphen.
The Small Hyphen would be Shift + - (the minus sign button next to 0).
How would you enter the Long Hyphen on the form as I am doing a string match in my VBA code on the back-end? It can be entered with Alt+0150, but is there another simpler way?
If the option of entering the Long Hyphen doesn't work then I will handle this value on the back-end through a find and replace method or something in VBA.
There is only one "Hyphen"... It's part of a ᖴᗩᗰᎥᒪƳ ᴏҒ ᑕᕼᗩᖇᗩᑕ丅ᴇᖇᔕ called Dashes.
Examples:
Hyphen [-] (-)
Minus sign [−] (−)
En dash [–] (–)
Em dash [—] (—)
Your browser might render them differently, but the fonts above are supposed to be [Consolas or Courier 13px] and (Arial or Helvetica 15px). While they all look much the same in this font, those are four different characters.
The characters can be copied and pasted directly from here (- − – —) into Excel (but not into the VBA Editor), or can be produced, along with 136,686 other Unicode characters, either:
with a worksheet formula, using the UNICHAR function:
=UNICHAR(9733) 'produces a [★] star character.
programmatically with VBA, using the ChrW function:
Range("A1") = ChrW(9743) 'puts a wee [☏] rotary phone in cell A1.
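If the matching ends up being done in back-end code instead, the same ideas carry over. The sketch below uses Java purely as an illustration (the class and method names are made up): it looks up a character's code point, builds a character from a code point, and folds the minus sign, en dash and em dash into a plain hyphen before comparing strings. The character typed with Alt+0150 is the en dash, U+2013 (decimal 8211).

```java
// Hypothetical helper; Unicode escapes are used so the source stays plain ASCII.
class Dashes {
    // Code point of the first character (comparable to AscW for these characters).
    static int codeOf(String s) {
        return s.codePointAt(0);
    }

    // Character for a code point (comparable to ChrW or UNICHAR).
    static String charOf(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Replace minus sign, en dash and em dash with a plain hyphen.
    static String normalizeDashes(String s) {
        return s.replace('\u2212', '-')
                .replace('\u2013', '-')
                .replace('\u2014', '-');
    }

    public static void main(String[] args) {
        System.out.println(codeOf("\u2013"));                // code of the en dash
        System.out.println(normalizeDashes("2010\u20132020"));
    }
}
```

Normalizing before the comparison means users can type whichever dash is easiest and the string match still succeeds.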
More about dashes:
What they are
How to use them
"Stealing" Unicode characters from websites
There are all sorts of handy Unicode symbols โ so many that it can be hard to find "just" the right one.
However you can (and will!) find other Unicode symbols on web pages that you want to use programmatically. All you need is the symbol's code, which can be determined easily:
Copy/paste the symbol from your browser (being careful to copy only the single character; no spaces, etc.) into a cell in Excel
then go to VBA's Immediate Window (Alt+F11 → Ctrl+G) and type:
MsgBox AscW([a1])
...(Where A1 is the cell with the character). Hit Enter, and the character's Unicode code will display.
You could also use Windows' built-in Character Map utility, or one of many third-party browser plug-ins. You can even paste a symbol directly into Google to learn more about it:
Finally, no discussion on the topic of Unicode would be complete without links to:
The Unicode Consortium (unicode.org), and
ȶɦɛ 𝓒𝓸𝓸𝓵 𝕱𝖆𝖓𝖈𝖞 Ⓣⓔⓧⓣ 𝔾𝕖𝕟𝕖𝕣𝕒𝕥𝕠𝕣 (Yes, those are all plain text characters.)
...and if you're looking for a unique gift, or just want your name to go down in history for something that really matters, you can even adopt a Unicode character, starting at $100 USD!
Special thanks to Vinton Cerf for adopting Leonard Nimoy's "Live Long and Prosper" Unicode symbol. 🖖
I just got a CSV input file to be processed, which has an equal-sign before the first delimiting quote, and wondered if this is valid and has any purpose. Example (simplified):
"2"
"3"
="4"
After reading some postings like this one I experimented with a CSV like this:
"2"
"3"
="A1+A2"
and:
"2"
"3"
"=A1+A2"
It seems that both Excel and LibreOffice silently ignore the equal-sign before the quote, and nicely treat the equal-sign after the quote as the flag for a formula. However, I could not find any documentation about this.
(For Excel, this CSV needs to be saved with the .txt extension, and opened with control-O)
I am inclined to call the CSV with equal-sign before the open quote as an error that is easy to deal with when reading this file, but still wondering if there is more to say about this.
This is used by Excel to avoid the loss of leading zeros.
For example, if you have a field in your csv file like this: 0123456, Excel will treat it as a number and lose the leading zero.
Saving it as ="0123456" solves this problem.
Using "0123456" won't help either, because quotes are not there to indicate a text field, but to escape possible delimiters inside fields.
Just like having sep=; on the first line to make Excel use the right separator, the ="" is also 'non-standard', or better: Excel-specific, because there is no real standard for CSV files.
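If you generate such a file yourself, the wrapping is a one-liner. The sketch below is only an illustration in Java with made-up names, and it assumes the value contains no double quotes or separator characters (such as a postal code or an ID made of digits).

```java
// Hypothetical helper for the Excel-specific ="..." form.
class ExcelTextField {
    // Wrap a value as ="value" so Excel keeps it as text, leading zeros included.
    // Assumption: the value contains no double quotes or separators.
    static String asExcelText(String value) {
        return "=\"" + value + "\"";
    }

    public static void main(String[] args) {
        System.out.println(asExcelText("0123456"));
    }
}
```

Other CSV readers will see the equals sign as part of the data, so this form only makes sense for files that are meant to be opened in Excel.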
Excel isn't ignoring the = in ="4" or ="A1 + A2"; it is treating it as a constant formula.
If you open the csv file that looks like:
"2"
"3"
="4"
="A1+A2"
"=A1+A2"
in Excel the result looks like:
Note how A3 holds the formula ="4" rather than just the number 4.
There is no official standard for CSV. As it says at Comma-separated values,
An official standard for the CSV file format does not exist, but RFC 4180 provides a de facto standard for many aspects of it.
Looking at the RFC 4180, a field is either escaped or non-escaped. The escaped field has a BNF defined like this:
escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
Since the equals sign is not a part of the escaped characters, it may be like the "Free Parking" in Monopoly: The rules say nothing regarding it, but the de facto standard is to place $500 under it.
If you have a .csv file with the following contents:
"2"
"3"
="4"
and open it in Excel, you will see:
As you can see, Excel discards the double quotes on the first two items and converts the third item into a formula.
That is how Excel functions.
If you want to get the exact text into Excel (retaining the double quotes), you could use a macro.
How do I get the last character of a string using an Excel function?
No need to apologize for asking a question! Try using the RIGHT function. It returns the last n characters of a string.
=RIGHT(A1, 1)
=RIGHT(A1)
is quite sufficient (where the string is contained in A1).
Similar in nature to LEFT, Excel's RIGHT function extracts a substring from a string starting from the right-most character:
SYNTAX
RIGHT( text, [number_of_characters] )
Parameters or Arguments
text
The string that you wish to extract from.
number_of_characters
Optional. It indicates the number of characters that you wish to extract starting from the right-most character. If this parameter is omitted, only 1 character is returned.
Applies To
Excel 2016, Excel 2013, Excel 2011 for Mac, Excel 2010, Excel 2007, Excel 2003, Excel XP, Excel 2000
Since number_of_characters is optional and defaults to 1 it is not required in this case.
However, there have been many issues with trailing spaces and if this is a risk for the last visible character (in general):
=RIGHT(TRIM(A1))
might be preferred.
Looks like the answer above was a little incomplete. Try the following:
=RIGHT(A2,(LEN(A2)-(LEN(A2)-1)))
Obviously, this is for cell A2...
This uses a combination of RIGHT and LEN. LEN is the length of the string, and here we subtract all but one from it, so the expression LEN(A2)-(LEN(A2)-1) always evaluates to 1. If you wanted the last two characters, you'd change the -1 to -2, and so on.
After the length has been determined and the portion of that which is required - then the Right command will display the information you need.
This works well combined with an IF statement - I use this to find out if the last character of a string of text is a specific character and remove it if it is. See, the example below for stripping out commas from the end of a text string...
=IF(RIGHT(A2,(LEN(A2)-(LEN(A2)-1)))=",",LEFT(A2,(LEN(A2)-1)),A2)
Just another way to do this:
=MID(A1, LEN(A1), 1)