I have a command button that inserts data into a worksheet from a userform, then saves it as a .csv.
We then load the CSV data using another userform. However, the problem arises when a carriage return is typed into a text box and inserted. Obviously the first solution I can think of is to stop such characters being entered; however, is there a better solution?
@Miguel has the right idea (his comment is on the main question thread).
This approach involves defining a list of integers corresponding to the ASCII codes of the characters you're having trouble with. Line feed (10) will definitely be in there, and you can decide whether to include carriage return (13), comma (44), or double quote (34).
Const ListOfSpecialChars = "10,13,34,44"
You'll need an 'encoding' proc that accepts a string and outputs a string.
It would transform text in this kind of fashion:
ab,cd"ef -> ab<<44>>cd<<34>>ef
This would be achieved by splitting the const and looping through each of the constituents, executing a replace:
For Each splitCharVal In Split(ListOfSpecialChars, ",")
    stringToEncode = Replace(stringToEncode, Chr(splitCharVal), "<<" & splitCharVal & ">>")
Next
You'll also need a 'decoding' proc that does the opposite, which I'll let you work out.
So, when saving a file to CSV, you'll need to loop through the cells of each row in turn, encoding the text found within, then writing out a row to the file.
When reading in a row from the encoded CSV, you'll need to run the decode operation prior to writing out the text to the worksheet.
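For illustration, here's a minimal sketch of that encode/decode pair, written in Python rather than VBA (the Split/Replace logic maps across directly); the `<<n>>` token format matches the example above:

```python
# ASCII codes to escape: LF, CR, double quote, comma.
LIST_OF_SPECIAL_CHARS = "10,13,34,44"

def encode(text):
    """Replace each special character with a <<code>> token."""
    for code in LIST_OF_SPECIAL_CHARS.split(","):
        text = text.replace(chr(int(code)), "<<" + code + ">>")
    return text

def decode(text):
    """Reverse the encoding: turn each <<code>> token back into its character."""
    for code in LIST_OF_SPECIAL_CHARS.split(","):
        text = text.replace("<<" + code + ">>", chr(int(code)))
    return text

print(encode('ab,cd"ef'))  # ab<<44>>cd<<34>>ef
```

Because the tokens contain only letters, digits, and angle brackets, the encoded text round-trips safely through any CSV writer and reader.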
Related
I am importing a dataset from Excel using the built-in import data wizard. However, when viewing the data in SAS, cells with newlines have all line feeds (Alt+Enter) replaced with a period (.).
For example, in excel:
"Example text
with new line"
will be read in by SAS as:
"Example text.with new line"
Usually line feeds or carriage returns are replaced by spaces; the hex code for a line feed (if you format the text as hex) is 0A. When I convert the text in Excel to hex using a formula, the line feeds indeed show up as 0A.
However, the hex code for the period in my imported text (what used to be a line break in Excel) is 2E rather than the expected 0A. This prevents me from differentiating them from normal full stops, so there's no obvious workaround. Has anyone else come across this issue? Is there an option to change or set the default line-feed replacement character in SAS?
My import code (variables replaced with 'text' for simplicity) for reference:
data work.table;
    length
        text $ 50;
    label
        text = "Text";
    format
        text $CHAR50.;
    informat
        text $CHAR50.;
    infile 'path/to/file'
        lrecl=1000
        encoding='LATIN9'
        termstr=CRLF
        dlm='7F'x
        missover
        dsd;
    input
        text $CHAR50.;
run;
SAS Viewer will not render so-called non-printables (characters <= '1F'x) and does not display carriage-return characters as a line break.
Example:
Excel cell with two line breaks in the data value
Imported with
proc import datafile='sample.xlsx' out=work.have dbms=xlsx replace;
run;
and viewed in standard SAS data set viewer (viewtable) appear to have lost the new lines.
Rest assured they are still there.
proc print data=have;
var text / style = [fontsize=14pt];
format text $hex48.;
run;
I would not recommend using the Import Wizard; there are far better tools nowadays. EG's import wizard is unique in SAS tools in how it works, and really was meant only to supply a way for data analysts who were not programmers to quickly bring in data; it's not robust enough for production work.
In this case, what's happening is that SAS's method for reading the data in is very rudimentary. What it does is convert it to a delimited file, and it doesn't handle LF characters very cleanly there. Instead of keeping them, which would be possible but is riskier (remember, this has to work for any incoming file), what it does is convert those to periods.
You'll see that in the notes in the program it generates:
Some characters embedded within the spreadsheet data were
translated to alternative characters so as to avoid transmission
errors.
It's referring to the LF character in that case.
The only way to get around this that I'm aware of is to either:
Convert the file to CSV from Excel yourself, and then read it in
Use ACCESS to PC FILES (via PROC IMPORT, or the checkbox in the import wizard)
Either of those will allow you to read in your line feed characters.
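As a sanity check on the "convert to CSV yourself" route: a properly quoted CSV preserves embedded line feeds, and any RFC 4180-aware reader will recover them. A small Python sketch, illustrative only, with file contents mirroring the Excel example above:

```python
import csv
import io

# One header row plus one data row whose quoted "text" field
# contains an embedded line feed (the Alt+Enter case).
raw = 'id,text\n1,"Example text\nwith new line"\n'

rows = list(csv.reader(io.StringIO(raw)))
print(rows[1])  # the line feed survives inside the quoted field
```

The same file read through SAS with a DATA step (DSD and proper quoting) keeps the '0A'x character, which is exactly what the wizard's period substitution destroys.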
I am using Excel for Mac 2016 on macOS Sierra. Although I have been successfully copying and pasting CSV files into Excel for some time now, recently they have begun to behave in an odd way. When I paste the data, the content of each row is split across many columns. Whereas before one cell could contain many words, now each cell seems to hold only one word, so the content of what would normally be one cell is spread over many cells, making some rows of data span up to 100 columns!
I have tried Data tab >> From Text, which takes me through a Text Import Wizard. There I choose Delimited >> Choose Delimiters: untick the 'Space' box (the 'Tab' box stays ticked) >> column data format 'General' >> Finish. Following this process imports the data into its correct columns. It works, but it's a lot of work to get there!
Question: Is there any way to change the default settings of Delimiters, so that the 'Space' delimiter does not automatically divide the data?
I found an answer! It has to do with the "Text to Columns" function:
The way to fix this behavior is:
Select a non-empty cell
Do Data -> Text to Columns
Make sure to choose Delimited
Click Next >
Enable the Tab delimiter, disable all the others
Clear Treat consecutive delimiters as one
Click Cancel
Now try pasting your data again
I did the opposite regarding "consecutive delimiters"!
I put a tick in the box next to "Treat consecutive delimiters as one", and THEN it worked.
Choose delimiter directly in CSV file to open in Excel
For Excel to be able to read a CSV file with a field separator used in a given CSV file, you can specify the separator directly in that file. For this, open your file in any text editor, say Notepad, and type the below string before any other data:
To separate values with comma: sep=,
To separate values with semicolon: sep=;
To separate values with a pipe: sep=|
In a similar fashion, you can use any other character for the delimiter - just type the character after the equality sign.
For example, to correctly open a semicolon-delimited CSV in Excel, we explicitly indicate that the field separator is a semicolon by putting sep=; on the first line of the file.
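As an illustration, here's a short Python sketch that writes such a file (the filename and rows are made up for the demo). Note that the sep= hint is an Excel-specific convention, not part of the CSV standard, so other tools will see it as a stray first line:

```python
# Write a semicolon-delimited file with a "sep=;" hint on the first
# line so Excel picks the right delimiter when opening it directly.
rows = [["name", "city"], ["Alice", "Paris"]]
with open("demo_sep.csv", "w", encoding="utf-8", newline="") as f:
    f.write("sep=;\n")
    for row in rows:
        f.write(";".join(row) + "\n")
```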
I have a file that is extracted from a system in a format over which I have no control. It is CSV (UTF-8) and there is a column that contains carriage returns (I'm not sure exactly what they are, but they were originally uploaded from an Excel file with information in the same column but with CTRL+ENTER to change line in the column).
Excel interprets them and changes line, so it creates a new line that does not fit with the headers and is not handled properly when the file is saved as a .csv file again. The text is within text delimiters ", so I'm guessing there should be a way to specify the handling of the text inside this delimiter.
I know there's probably no easy way to do that, but is there some VBA code I could use to handle this?
Here is an example of what it looks like when it is saved as a CSV:
HEADER, GoalLibraryEntry, GUID, PARENT_GUID, LOCALE name, etc.
ADD, GoalLibraryEntry, 710103, 710100, en_US, "Test:,,,,,,,
a) Test1,,,,,,,,,,,
b) Test2,,,,,,,,,,,
c) Blabla,,,,,,,,,,,
d) Blablabla,,,,,,,,,,
e) Test5",,,,,,,,,,,
My problem is very similar to this one : Generating CSV file for Excel, how to have a newline inside a value
However, I've tried saving in ANSI before opening it in Excel without success. I also tried to add UTF-8 BOM at the start, but Excel does not seem to interpret it in any way.
What you're most likely seeing is multiline cells created when the user presses Alt+Enter on the keyboard.
The worksheet function you're looking for is CLEAN, which you can also call from VBA.
=CLEAN(A1)
This removes non-printable characters from a cell's text.
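If you'd rather pre-process the file outside Excel, a rough Python equivalent of CLEAN (which drops control characters with codes 0-31) might look like this. Note that, just as with CLEAN, deleting a line feed joins the surrounding words together, so mapping control characters to spaces is often the better choice:

```python
def clean(text):
    """Rough equivalent of Excel's CLEAN: drop control characters 0-31."""
    return "".join(ch for ch in text if ord(ch) > 31)

def clean_to_space(text):
    """Variant that keeps word boundaries: map control characters to spaces."""
    return "".join(ch if ord(ch) > 31 else " " for ch in text)

print(clean("Test:\na) Test1"))           # Test:a) Test1
print(clean_to_space("Test:\na) Test1"))  # Test: a) Test1
```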
I am kind of new to PowerBuilder and I'd like to know if it is possible to keep the "visible" value of a column name when using the SaveAs() method of my DataWindow. Currently, my report shows columns like "Numéro PB" or "Poste 1-3", but when I save, it uses the database names, i.e. "no_pb" and "pos_1_3".
As I am working on a deployed application, I have to make my changes and implementations as user-friendly as possible, and the users won't understand any of that.
I already use the dw2xls api to save an exact copy of the report, but they want to have an option saving only the raw data, and I don't think I can achieve it using their API.
Also, I was asked not to use the Excel OLE object to do it...
Anyone's got an idea?
Thanks,
Michael
dw.SaveAs(<string with filename and path>, CSV!, TRUE) saves the DataWindow data as a comma-separated-value text file with the first row containing the column headers (the database names from the DataWindow painter).
To set the column headings in a saveas you could first access the data with
any la_dwdata[] // declare array
la_dwdata = dw_1.Object.Data // get all data for all rows in dw_1 in the Primary! buffer
From here you would create an output file: first a header line containing the column names you want, then the data from the array converted to strings (you loop through the array). If you insert commas between the values and name the file with a .CSV extension, it will load into Excel. Since this approach will also include any non-visible data, you may need extra logic to exclude those columns if the users shouldn't see them.
So now you have a string consisting of lines of data separated by tabs, each ending in a CRLF. You create your header string with the user-friendly column names in the form 'blah,blah,blah~r~n' (three values separated by commas with a CRLF at the end).
Now you parse the string obtained from dw_1.Object.Data to find the first line, strip it off, and replace it with the header string you created. You can then use Replace to turn the remaining tabs into commas. Finally, save the string to a file with a .CSV extension and it will load into Excel.
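The string surgery described above (swap the first line for the friendly header, then turn tabs into commas) can be sketched in Python for clarity; the column values and friendly names here are hypothetical:

```python
# Tab-separated text as assembled from dw_1.Object.Data, with the
# database column names on the first line; values are made up.
raw = "no_pb\tpos_1_3\r\n12\t7\r\n34\t9\r\n"

# The user-friendly header line, comma separated, with a trailing CRLF.
header = "Numéro PB,Poste 1-3\r\n"

# Strip the original first line, prepend the friendly header,
# then replace the remaining tabs with commas.
body = raw.split("\r\n", 1)[1]
csv_text = header + body.replace("\t", ",")
print(csv_text)
```

In PowerBuilder the same three steps are a Pos/Mid to drop the first line and a Replace loop for the tabs.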
This assumes that your display columns match your raw columns. Create a DataStore ds_head and set your report DataWindow as its DataObject (no data retrieved). I'm calling the DataWindow with the report you want to save dw_report. You'll want to delete the two temporary files when you're done. You may need to specify EncodingUTF8! or some other encoding instead of ANSI, depending on the data in the DataWindow. Note: Excel will open this CSV, but some other programs may not like it because the header row has a trailing comma.
ds_head.saveAsFormattedText("file1.csv", EncodingANSI!, ",")
dw_report.saveAs("file2.csv", CSV!, FALSE, EncodingANSI!)
run("cmd /c copy file1.csv + file2.csv output.csv")
I have a table in Access I am exporting to Excel, using VBA code for the export (because I actually create a separate Excel file every time the client_id changes, which produces 150 files). Unfortunately I lose the leading zeroes when I do this with DoCmd.TransferSpreadsheet. I was able to resolve this by looping through the records and writing each cell one at a time (formatting the cell before I write to it), but that leads to 8 hours of run time. Using DoCmd.TransferSpreadsheet, it runs in an hour (but then I lose the leading zeroes). Is there any way to tell Excel to treat every cell as text when using the TransferSpreadsheet command? Can anybody think of another way to do this that won't take 8 hours? Thanks!
Prefix the Excel value with an apostrophe (') character. The apostrophe tells Excel to treat the data as text. As in:
0001 'Excel treats as number and strips leading zeros
becomes
'0001 'Excel treats as text
You will probably need to create an expression field to prefix the field with the apostrophe, as in;
SELECT "'" & [FIELD] FROM [TABLE]
As an alternative to my other suggestion, have you played with Excel's Import External Data command? Using Access VBA, you can loop through your clients, open a template Excel file, import the data (i.e. pull instead of push) with your client as a criteria, and save it with a unique name for each client.
What if you:
In your source table, change the column type to string.
Loop through your source table and prefix each value with an "x".
If the Excel data is meant to be read by a human being, you can get creative: hide the data column and add a 'display' column that references the data column but strips the "x".