I'm using the following code:
import csv

plugara = ['CNPJ', 56631781000177, 21498104000148, 3914296000144, 28186370000184]
plugara = map(str, plugara)
with open('result.csv', 'w') as f:
    wr = csv.writer(f, dialect='excel')
    wr.writerows(plugara)
The result I'm getting puts a comma between every character, so each character ends up in its own column. I would like the output without those commas, with each value kept whole in a single cell.
Any ideas?
The writerows method that you're calling expects its argument to be an iterable containing rows. Each row should itself be iterable, with its values being the items in the row. In your case, the values are strings, which can be iterated over to give their individual characters. Unfortunately, that's not what you intended!
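You can see why each character lands in its own cell by iterating over one of the strings yourself; this is only an illustrative check, not part of your original code:
>>> row = "56631781000177"
>>> list(row)  # iterating a string yields one character at a time
['5', '6', '6', '3', '1', '7', '8', '1', '0', '0', '0', '1', '7', '7']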
Exactly how to fix this issue depends on what you want the output to be.
If you want your output to consist of a single row, then just change the call from writerows to writerow (note the missing s). The writerow method writes only a single row, rather than trying to write several of them at once.
On the other hand, if you want many rows, with just one item in each one (forming a single column), then you'll need to transform your data a little bit. Rather than directly passing in your list of strings, you need to produce an iterable of rows with one item in them (perhaps 1-tuples). Try something like this:
wr.writerows((item,) for item in plugara)
This call uses a generator expression to transform each string from plugara into a 1-tuple containing the string. This should produce the output you want.
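Putting it together, here is a minimal end-to-end sketch of that second variant, with one value per row in a single column; the filename and the newline='' argument are my assumptions rather than part of your original script:

import csv

plugara = ['CNPJ', 56631781000177, 21498104000148, 3914296000144, 28186370000184]
plugara = map(str, plugara)

# newline='' avoids blank lines between rows on some platforms (Python 3)
with open('result.csv', 'w', newline='') as f:
    wr = csv.writer(f, dialect='excel')
    # wr.writerow(plugara) would instead give one row with all the values
    wr.writerows((item,) for item in plugara)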
Python 3.4
I've got an Excel file with some messy organization, but one thing is for sure:
I need EVERYTHING except the stuff that appears before the very first comma in every single line, the comma included.
Example:
Print command of the file gives me this:
Word1 Funky,Left Side,UDLRDURLUDRUDLUR
Nothing (because not) exists lol extraline,Right Side,RBRGBRGBRGRBGRBGBR
What I want to get is this:
Left Side,UDLRDURLUDRUDLUR
Right Side,RBRGBRGBRGRBGRBGBR
I'd also like to make that into a dictionary:
dictionary = {"Left Side":"UDLRDURLUDRUDLUR", "Right Side":"RBRGBRGBRGRBGRBGBR",}
So basically I want to get rid of everything up to and including the first comma, make the second part (which ends at the second comma) the key, and the third part (which runs to the end of the line) the value.
What would be the easiest way to execute this?
Suppose s contains the string to be examined:
s = "word1,Left Side,UDLRDURLUDRUDLUR"
There are a number of ways to get rid of everything up to and including the first comma. You can use
Slicing coupled with find: s[s.find(',')+1:]
This expression will yield the desired result if the string s contains at least one comma, but it will yield the entire string if the string does not contain any commas.
Split coupled with indexing: s.split(',',1)[1]
This expression will yield the desired result if the string s contain at least one comma, but it will raise IndexError if the string does not contain any commas.
Regular expressions, but that's overkill here.
Other techniques, but those are also overkill here.
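Since you also want a dictionary keyed on the middle field, here is a rough sketch built on the split approach; the filename cheats.txt and the decision to skip lines without two commas are my assumptions:

cheats = {}
with open('cheats.txt') as f:                     # placeholder name for your input file
    for line in f:
        parts = line.rstrip('\n').split(',', 2)   # at most three pieces
        if len(parts) < 3:
            continue                              # ignore lines without two commas
        _, key, value = parts
        cheats[key] = value

# e.g. {'Left Side': 'UDLRDURLUDRUDLUR', 'Right Side': 'RBRGBRGBRGRBGRBGBR'}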
I have been using nlapiCreateFile() and nlapiSubmitFile() to create a CSV file from an array, and I have run into two problems I can't seem to figure out. When the CSV file is saved, Excel prints each element of the array into its own cell like it should, but it prints them all on the same row (1a, 1b, 1c, 1d, etc.). I would rather have the array print downwards in a single column (1a, 2a, 3a, 4a, etc.) if possible, but I'm not sure how to approach this.
var file1 = nlapiCreateFile('names.csv', 'CSV', names);
file1.setFolder(295767);
nlapiSubmitFile(file1);
The second thing I can't seem to figure out: if I wanted to print a second array in the same file, how would I approach that? For example, the names array in the first column and another array in the second column.
Instead of using an array for the third parameter of nlapiCreateFile(), try using a string that uses a comma as the column delimiter and \n as the line separator.
I know this is rather old, but for those who are trying to figure this out, you want to use the join method:
var file1 = nlapiCreateFile('names.csv', 'CSV', names.join("\n"));
file1.setFolder(295767);
nlapiSubmitFile(file1);
join returns a string built from the array's elements, separated by the provided parameter. In this case we want one column containing all of the elements, so we choose \n, the newline character.
For appending to a file, I believe you will have to load the file again, which will return an nlobjFile object: https://debugger.sandbox.netsuite.com/app/help/helpcenter.nl?fid=section_N3066995.html#bridgehead_N3067099
Then you can add to it, and then submit again.
I have several variables of the form:
1 gdppercap
2 19786,97
3 20713,737
4 20793,163
5 23070,398
6 5639,175
I have copy-pasted the data into Stata, and it thinks they are strings. So far I have tried:
destring gdppercap, generate(gdppercap_n)
but get
gdppercap contains nonnumeric characters; no generate
And:
encode gdppercap, gen(gdppercap_n)
but get a variable numbered from 1 to 1055 regardless of the previous value.
Also I've tried:
gen gdppercap_n = real(gdppercap)
But get:
(1052 missing values generated)
Can you help me? As far as I can tell, Stata does not like the fact that the variable contains fractional numbers.
If I understand you correctly, the interpretation as string arises from one and possibly two facts:
The variable name may be echoed in the first observation. If so, that's text and it's inconsistent with a numeric variable. The root problem there is likely to be a copy-and-paste operation that copied too much. Stata typically gives you a choice when importing by copy-and-paste of whether the first row of what you copied is to be treated as variable names or as data, and you need the first choice, so that column headers become variable names, not data. It may be best to go back and do the copy-and-paste correctly. However, Stata can struggle with multiple header lines in a spreadsheet. Alternatively, use import excel, not a copy-and-paste. Alternatively, drop in 1 to remove the first observation, provided that it consistently is superfluous.
Commas indicate decimal places. destring can easily cope with this: see the help for its dpcomma option. Stata has no objection to fractions; that would be absurd. The problem is that you need to flag your use of commas.
Note that
destring is a wrapper for real(), so real() is not a way round this.
encode is for mapping genuine categorical variables to integers, as you discovered, and as its help does explain. It is not for fixing data input errors.
You can write a loop to convert the comma to a period. I don't know your data exactly, but imagine you have a variable gdppercap with values like 1234,343 that you want to become 1234.343 before you destring.
For example:
forvalues x = 1(1)10 {
    * if position `x' holds a comma, swap it for a period
    * (assumes the comma falls within the first 10 characters)
    replace gdppercap = substr(gdppercap, 1, `x'-1) + "." + substr(gdppercap, `x'+1, .) ///
        if substr(gdppercap, `x', 1) == ","
}
I have a file composed of many strings. For each string, I want to create substrings of length 4 and then compare each substring with a dictionary of words from another SPSS file. For example, if I have the string "transport" I want to create a list of 4-letter strings (e.g., 'tran', 'rans', 'ansp', etc.). For each of these 4-letter strings, I want to know if it exists in another file with a long list of words. Here is my syntax in SPSS:
*rawNonword is the name of the string in my first file.
compute chars = char.length(rawNonword).
string holder (A50).
loop #i = 1 to chars-4.
compute holder = char.substr(rawNonword, #i, 4).
*here I would like to compare holder with the strings in another file.
end loop.
execute.
I realize that the merge and match functions are normally used in SPSS, but it seems as if I can't use them inside a loop. I believe this problem is fairly easy in python, but I need to do this task in SPSS. Is there an easy function in SPSS that will return a value of 1 or true if the 4-letter string exists in another file?
Certainly easier to do using the Python plugin with the extendedTransforms.vlookup function, but in traditional syntax, you could create a variable holding all the four-letter fragments, sort both files, and use a TABLE match with MATCH FILES using that variable as the key.
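For anyone who does take the Python route, here is a rough plain-Python sketch of the lookup logic itself (not the SPSS plugin API); the file name words.txt and the one-word-per-line format are my assumptions:

# words.txt stands in for the word list exported from the second file
with open('words.txt') as f:
    words = {line.strip().lower() for line in f}

def has_known_fragment(raw_nonword, length=4):
    # True if any substring of the given length appears in the word set
    s = raw_nonword.lower()
    return any(s[i:i + length] in words
               for i in range(len(s) - length + 1))

print(has_known_fragment('transport'))   # checks 'tran', 'rans', 'ansp', ...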
I'm trying to load the following dataset:
Afghanistan,5,1,648,16,10,2,0,3,5,1,1,0,1,1,1,0,green,0,0,0,0,1,0,0,1,0,0,black,green
Albania,3,1,29,3,6,6,0,0,3,1,0,0,1,0,1,0,red,0,0,0,0,1,0,0,0,1,0,red,red
Algeria,4,1,2388,20,8,2,2,0,3,1,1,0,0,1,0,0,green,0,0,0,0,1,1,0,0,0,0,green,white
...
Problem is it contains both integers and strings.
I found some information on how to extract only the integers, but I haven't been able to see whether there's any way to get all of the data.
My question is: is that possible?
If that is not possible, is there any way to find the numbers on each line and throw everything else away, without having to pick out specific columns?
I ask specifically because it seems I cannot use str2num on a whole line at a time.
Almost anything is possible, you just have to define your goal accurately.
Assuming that your database is stored as a text file, you can parse it line by line using textread, and then apply regexp to filter only the numerical fields (this does not require having prior knowledge about the columns):
C = textread('database.txt', '%s', 'delimiter', '\n');
C = cellfun(@(x)regexp(x, '\d+', 'match'), C, 'Uniform', false);
The result here is a cell array of cell arrays of strings, where each string corresponds to a numerical field in a specific line.
Since the numbers are still stored as strings, you'd probably need to convert them to actual numerical values. There's a multitude of ways to do that, but you can use str2num in a tricky way: it can convert delimited strings into an array of numbers. This means that if you concatenate all strings in a specific line back into one string, and put spaces in between, you can apply str2num on all of them at once, like so:
C = cellfun(@(x)str2num(sprintf('%s ', x{:})), C, 'Uniform', false);
The resulting C is a cell array of vectors, each vector containing the values of all numerical fields in the corresponding line. To access a specific vector, you can use curly braces ({}). For instance, to access the numbers of the second line, you would use C{2}.
All the non-numerical fields are discarded in the process of parsing, of course. If you want to keep them as well, you should use a different regular expression with regexp.
Good luck!