I want to output a multidimensional list to a CSV file.
Currently, I am creating a new DataFrame object and converting that to CSV. I am aware of the csv module, but I can't figure out how to use it without manual input. The populate method lets the user choose how many rows and columns they want, so the data variable will usually be of the form [[x1, y1, z1], [x2, y2, z2], ...]. Any help is appreciated.
from populator import populate
from pandas import DataFrame
data = populate()
df = DataFrame(data)
df.to_csv('output.csv')
A CSV file is just text: the values in each row are separated by commas, and the rows are separated by newlines. You can build that string yourself like so:
data = [[1, 2, 4], ['A', 'AB', 2], ['P', 23, 4]]
data_string = '\n'.join([', '.join(map(str, row)) for row in data])
with open('data.csv', 'wb') as f:
    f.write(data_string.encode())
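Joining strings by hand works for simple values, but it does not quote fields that themselves contain commas, quotes, or newlines. As a minimal sketch of the csv-module route mentioned in the question, assuming the same nested-list shape (the file name `data_csv_module.csv` is only a placeholder), `csv.writer.writerows` writes each inner list as one row with no manual input:

```python
import csv

data = [[1, 2, 4], ['A', 'AB', 2], ['P', 23, 4]]

# newline='' lets the csv module control line endings itself
with open('data_csv_module.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(data)  # one CSV row per inner list

# Read the file back to check the round trip; every value comes back as a string
with open('data_csv_module.csv', newline='') as f:
    rows = list(csv.reader(f))

print(rows)
```

Because the writer handles quoting, a value such as 'A,B' would be written as a single quoted field rather than being split into two columns.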
I have 10 .txt files, each containing a string.
A.txt: "This is a cat"
B.txt: "This is a dog"
.
.
J.txt: "This is an ant"
I want to read all of these files and put their contents into a 2D array.
[['This', 'is', 'a', 'cat'],['This', 'is', 'a', 'dog']....['This', 'is', 'an', 'ant']]
from glob import glob
import numpy as np
for filename in glob('*.txt'):
    with open(filename) as f:
        data = np.genfromtxt(filename, dtype=str)
It's not working the way I want. Any help will be greatly appreciated.
You are generating a separate NumPy array for each text file and not saving any of them. How about appending each file's words to a list like so and converting it to a NumPy array afterwards?
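from glob import glob
import numpy as np
data = []
for filename in glob('*.txt'):
    with open(filename) as f:
        # split the file's text on whitespace into a list of words
        data.append(f.read().split())
data = np.array(data)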
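To try the list-then-convert idea in isolation, here is a sketch that first creates three small files in a temporary directory (the contents and the temporary directory are stand-ins for the real A.txt ... J.txt) and then reads them back. Two assumptions are worth flagging: glob does not promise any particular order, so the sketch sorts the file names, and if the files had different word counts, recent NumPy versions would need `np.array(data, dtype=object)` to build an array from rows of unequal length.

```python
import os
import tempfile
from glob import glob

# Stand-in files; the real ones would already exist on disk
samples = {
    'A.txt': 'This is a cat',
    'B.txt': 'This is a dog',
    'J.txt': 'This is an ant',
}

with tempfile.TemporaryDirectory() as folder:
    for name, text in samples.items():
        with open(os.path.join(folder, name), 'w') as f:
            f.write(text)

    data = []
    # sorted() makes the row order predictable
    for filename in sorted(glob(os.path.join(folder, '*.txt'))):
        with open(filename) as f:
            data.append(f.read().split())  # split on whitespace into words

print(data)
```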
Is there any way to generate random letters in Python? I've come across some code that makes it possible to generate random letters from a-z.
For instance, the below code generates the following output.
import pandas as pd
import numpy as np
import string
ran1 = np.random.random(5)
print(ran1)
[0.79842166 0.9632492 0.78434385 0.29819737 0.98211011]
ran2 = string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
However, I want to generate random letters with input as the number of random letters (example 3) and the desired output as [a, f, c]. Thanks in advance.
Convert the string of letters to a list and then use numpy.random.choice. You'll get an array back, but you can make that a list if you need.
import numpy as np
import string
np.random.seed(123)
list(np.random.choice(list(string.ascii_lowercase), 10))
#['n', 'c', 'c', 'g', 'r', 't', 'k', 'z', 'w', 'b']
As you can see, the default behavior is to sample with replacement. You can change that behavior if needed by adding the parameter replace=False.
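As a short sketch of the replace=False variant for the three-letter case from the question (the variable names here are illustrative only): without replacement no letter can appear twice, which also means the requested size cannot exceed 26.

```python
import string
import numpy as np

letters = list(string.ascii_lowercase)

# replace=False draws without replacement, so no letter can repeat
sample = list(np.random.choice(letters, 3, replace=False))
print(sample)
```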
Here is an idea modified from https://pythontips.com/2013/07/28/generating-a-random-string/
import string
import random
def random_generator(size=6, chars=string.ascii_lowercase):
    return ''.join(random.choice(chars) for x in range(size))
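The function above returns a single string, while the question asks for a list such as [a, f, c]. A possible standard-library variant is sketched below; `random_letters` is a hypothetical helper name, not something from the linked article. `random.choices` samples with replacement, and `random.sample` samples without it.

```python
import random
import string

def random_letters(n, chars=string.ascii_lowercase):
    """Return a list of n letters drawn with replacement (hypothetical helper)."""
    return random.choices(chars, k=n)

letters = random_letters(3)

# random.sample draws without replacement, so all three letters differ
unique_letters = random.sample(string.ascii_lowercase, 3)

print(letters, unique_letters)
```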
I have a large CSV file and don't want to load it fully into memory. I only need the column names from this file. How can I load just those?
Try this, which reads only the header and the first data row:
pd.read_csv(file_name, nrows=1).columns.tolist()
If you pass nrows=0 to read_csv, it loads only the header row and no data:
In[8]:
import pandas as pd
import io
t="""a,b,c,d
0,1,2,3"""
pd.read_csv(io.StringIO(t), nrows=0)
Out[8]:
Empty DataFrame
Columns: [a, b, c, d]
Index: []
After which accessing attribute .columns will give you the columns:
In[10]:
pd.read_csv(io.StringIO(t), nrows=0).columns
Out[10]: Index(['a', 'b', 'c', 'd'], dtype='object')
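If pandas is not otherwise needed, the same header-only read can be sketched with the standard library csv module, which consumes the file line by line. This assumes the first line is the header; `io.StringIO` stands in for a real file opened with `open(path, newline='')`.

```python
import csv
import io

t = """a,b,c,d
0,1,2,3"""

# io.StringIO stands in for an open file object
with io.StringIO(t) as f:
    reader = csv.reader(f)
    columns = next(reader)  # reads only the first line

print(columns)  # ['a', 'b', 'c', 'd']
```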
I was trying to load a .csv file from my desktop in an IPython notebook, but it fails with an "invalid syntax" error.
Here is my code and the file I used:
data = np.loadtxt("C:/Users/rj/Desktop/data.csv",dtype={ 'formats':('S10', 'f8','f8','f8','f8', 'f8','f8','f8','f8')},delimiter=',')
The data.csv file contains:
24-Dec-15,378.45,380.9,384.75,377.6,382.35,382.4,382.39,4568751
28-Dec-15,382.4,384.9,395,383.75,394.85,394,391.54,7166351
29-Dec-15,394,392.9,397.5,388.75,390.7,391.85,392.95,7359611
30-Dec-15,391.85,392,395,390.5,394,393.45,393.11,4866177
31-Dec-15,393.45,394,395.75,389.15,391.6,391.3,391.85,6410622
01-Jan-16,391.3,392.5,403,373,401.8,401.9,398.24,4377363
04-Jan-16,401.9,400,400.1,375.05,376.15,377.05,383.74,7822660
05-Jan-16,377.05,381.05,382.45,372.1,373,374.45,377.36,6901068
06-Jan-16,374.45,374.25,375.5,364.6,365,365.9,370.04,7211230
07-Jan-16,365.9,356.25,358,338.1,344.8,343.55,347.83,11782307
08-Jan-16,343.55,345.6,355.85,345.6,353.9,353.35,351.97,8770370
The error is:
File "<ipython-input-13-177939f245ba>", line 21
... 'formats':('S10', 'f8','f8','f8','f8', 'f8','f8','f8','f8')},delimiter=',')
^
SyntaxError: invalid syntax
How do I correct the syntax?
How about:
import numpy as np
names = ['date', 'a', 'b', 'c', 'e', 'f', 'g', 'h', 'i']
formats = ['S10', 'f8','f8','f8','f8', 'f8','f8','f8','f8']
data = np.loadtxt('C:/Users/rj/Desktop/data.csv',
                  dtype=list(zip(names, formats)), delimiter=',')
Of course you'd probably prefer more meaningful names. Python 2 doesn't need list(...) on the zip object.
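To show what the resulting structured array looks like, here is a small sketch that feeds two rows from the question's data.csv to np.loadtxt through `io.StringIO` instead of a file path. The field names are placeholders, and the date format is swapped to 'U10' as an assumption: on Python 3, 'S10' stores the dates as bytes, whereas 'U10' keeps them as text.

```python
import io
import numpy as np

# Two rows copied from the question's data.csv
text = """24-Dec-15,378.45,380.9,384.75,377.6,382.35,382.4,382.39,4568751
28-Dec-15,382.4,384.9,395,383.75,394.85,394,391.54,7166351"""

# Placeholder field names; 'U10' keeps the date column as text
names = ['date', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7', 'c8']
formats = ['U10'] + ['f8'] * 8

data = np.loadtxt(io.StringIO(text),
                  dtype=list(zip(names, formats)), delimiter=',')

print(data['date'])   # the date column
print(data['c1'][0])  # first value of the first numeric column
```

Each field can then be selected by name, for example `data['c8']` for the last column.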