Reading CSV column values and append to List in Python - python-3.x

I'd like to read a column from a CSV file and store those values in a list.
The CSV file currently looks like this:
Names
Tom
Ryan
John
The result that I'm looking for is
['Tom', 'Ryan', 'John']
Below is the code that I've written.
import csv
import pandas as pd
import time
# Declarations
UserNames = []
# Open a csv file using pandas
data_frame = pd.read_csv("analysts.csv", header=1, index_col=False)
names = data_frame.to_string(index=False)
# print(names)
# Iteration
for name in names:
    UserNames.append(name)
print(UserNames)
So far the result is as follows
['T', 'o', 'm', ' ', '\n', 'R', 'y', 'a', 'n', '\n', 'J', 'o', 'h', 'n']
Any help would be appreciated.
Thanks in advance

Hi, instead of converting your DataFrame to a string, you could convert the column directly to a list like this:
import pandas as pd
df = pd.read_csv("analyst.csv", header=0)
names = df["Name"].to_list()
print(names)
Output: ['tom', 'tim', 'bob']
Csv File:
Name,
tom,
tim,
bob,
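I wasn't sure what your CSV actually looks like, so you may have to adjust the arguments to read_csv. Note that header=1 in your code tells pandas to use the second line of the file as the header, so with header=0 the first line ("Names") is used instead.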
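The loop in the question produced single characters because to_string returns one long string, and iterating over a string yields its characters one at a time. If pandas is not needed for anything else, the standard-library csv module can read the column as well. The following is only a minimal sketch: it assumes the file has a single header named Names, as shown in the question, it reads from an in-memory string rather than analysts.csv so that it runs as-is, and read_column is a hypothetical helper name.

```python
import csv
import io

def read_column(handle, column):
    """Return every value of one named column as a list of strings."""
    reader = csv.DictReader(handle)  # the first row is used as the header
    return [row[column] for row in reader]

# Stand-in for open("analysts.csv", newline=""); contents copied from the question.
sample = io.StringIO("Names\nTom\nRyan\nJohn\n")
user_names = read_column(sample, "Names")
print(user_names)  # ['Tom', 'Ryan', 'John']
```

With a real file, the same function can be called inside a with open(...) block in place of the StringIO object.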

Related

Most efficient way to convert Python multidimensional list to CSV file?

I want to output a multidimensional list to a CSV file.
Currently, I am creating a new DataFrame object and converting that to CSV. I am aware of the csv module, but I can't seem to figure out how to do that without manual input. The populate method allows the user to choose how many rows and columns they want. Basically, the data variable will usually be of form [[x1, y1, z1], [x2, y2, z2], ...]. Any help is appreciated.
from populator import populate
from pandas import DataFrame
data = populate()
df = DataFrame(data)
df.to_csv('output.csv')
CSVs are nothing but comma separated strings for each column and new-line separated for each row, which you can do like so:
data = [[1, 2, 4], ['A', 'AB', 2], ['P', 23, 4]]
data_string = '\n'.join([', '.join(map(str, row)) for row in data])
with open('data.csv', 'wb') as f:
    f.write(data_string.encode())
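Joining the values by hand works for simple data, but it breaks as soon as a field contains a comma, a quote character, or a newline. The csv module handles that quoting automatically and needs no manual input. A minimal sketch, assuming data has the [[x1, y1, z1], ...] shape described in the question; rows_to_csv is a hypothetical helper name, and an in-memory buffer stands in for the output file:

```python
import csv
import io

def rows_to_csv(rows, handle):
    """Write a list of rows (each a list of values) as CSV to an open text handle."""
    writer = csv.writer(handle)
    writer.writerows(rows)  # one CSV line per inner list

data = [[1, 2, 4], ['A', 'AB', 2], ['P', 23, 4]]

# For a real file use: with open('output.csv', 'w', newline='') as f: rows_to_csv(data, f)
buffer = io.StringIO()
rows_to_csv(data, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Opening the file with newline='' is what the csv documentation recommends, since the writer emits its own line endings.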

Read multiple text files to 2D numpy array in Python

I have 10 txt files. Each of them with strings.
A.txt: "This is a cat"
B.txt: "This is a dog"
.
.
J.txt: "This is an ant"
I want to read these multiple files and put it in 2D array.
[['This', 'is', 'a', 'cat'],['This', 'is', 'a', 'dog']....['This', 'is', 'an', 'ant']]
from glob import glob
import numpy as np
for filename in glob('*.txt'):
    with open(filename) as f:
        data = np.genfromtxt(filename, dtype=str)
It's not working the way I want. Any help will be greatly appreciated.
You are generating a separate NumPy array for each text file and not saving any of them. How about appending each file's words to a list and converting that list to a NumPy array afterwards?
from glob import glob
import numpy as np
data = []
for filename in glob('*.txt'):
    with open(filename) as f:
        # split() breaks the file's text into a list of words
        data.append(f.read().split())
data = np.array(data)

How to generate random letters in python?

Is there any way to generate random letters in Python? I've come across code that can generate random letters from a-z.
For instance, the below code generates the following output.
import pandas as pd
import numpy as np
import string
ran1 = np.random.random(5)
print(ran1)
# [0.79842166 0.9632492 0.78434385 0.29819737 0.98211011]
ran2 = string.ascii_lowercase
# 'abcdefghijklmnopqrstuvwxyz'
However, I want to generate random letters with input as the number of random letters (example 3) and the desired output as [a, f, c]. Thanks in advance.
Convert the string of letters to a list and then use numpy.random.choice. You'll get an array back, but you can make that a list if you need.
import numpy as np
import string
np.random.seed(123)
list(np.random.choice(list(string.ascii_lowercase), 10))
#['n', 'c', 'c', 'g', 'r', 't', 'k', 'z', 'w', 'b']
As you can see, the default behavior is to sample with replacement. You can change that behavior if needed by adding the parameter replace=False.
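If NumPy is not otherwise required, sampling without replacement is also available in the standard library as random.sample. This is a swapped-in alternative to numpy.random.choice with replace=False, not the same call. A minimal sketch; random_letters and its seed parameter are hypothetical names chosen for illustration:

```python
import random
import string

def random_letters(count, seed=None):
    """Return `count` distinct lowercase letters in random order."""
    rng = random.Random(seed)  # a seed makes the result repeatable
    # sample() never picks the same position twice, so letters are unique
    return rng.sample(string.ascii_lowercase, count)

letters = random_letters(3)
print(letters)
```

Because the letters are distinct, asking for more than 26 raises a ValueError; use random.choices instead if repeats are acceptable.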
Here is an idea modified from https://pythontips.com/2013/07/28/generating-a-random-string/
import string
import random
def random_generator(size=6, chars=string.ascii_lowercase):
    return ''.join(random.choice(chars) for x in range(size))

How to load Only column names from csv file (Pandas)?

I have a large CSV file and don't want to load it fully into memory. I only need the column names from this file. How can I load just those?
Try this:
pd.read_csv(file_name, nrows=1).columns.tolist()
If you pass nrows=0 to read_csv, it will load only the header row:
In[8]:
import pandas as pd
import io
t="""a,b,c,d
0,1,2,3"""
pd.read_csv(io.StringIO(t), nrows=0)
Out[8]:
Empty DataFrame
Columns: [a, b, c, d]
Index: []
After which accessing attribute .columns will give you the columns:
In[10]:
pd.read_csv(io.StringIO(t), nrows=0).columns
Out[10]: Index(['a', 'b', 'c', 'd'], dtype='object')
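The same result can be had without pandas by reading only the first line with the csv module, which stops after the header and never touches the rest of the file. A minimal sketch, assuming the first line of the file is the header row; read_header is a hypothetical helper name, and an in-memory string stands in for the large file:

```python
import csv
import io

def read_header(handle):
    """Return the first row of a CSV as a list of column names."""
    # next() pulls exactly one row; the default [] covers an empty file
    return next(csv.reader(handle), [])

# Stand-in for open(file_name, newline=""); same sample data as above.
sample = io.StringIO("a,b,c,d\n0,1,2,3\n")
columns = read_header(sample)
print(columns)  # ['a', 'b', 'c', 'd']
```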

error in importing .csv file in ipython notebook

I was trying to load a .csv file from my desktop in an IPython notebook, but it fails with an invalid syntax error.
Here is my code and the file I used:
data = np.loadtxt("C:/Users/rj/Desktop/data.csv",dtype={ 'formats':('S10', 'f8','f8','f8','f8', 'f8','f8','f8','f8')},delimiter=',')
The data.csv file contains:
24-Dec-15,378.45,380.9,384.75,377.6,382.35,382.4,382.39,4568751
28-Dec-15,382.4,384.9,395,383.75,394.85,394,391.54,7166351
29-Dec-15,394,392.9,397.5,388.75,390.7,391.85,392.95,7359611
30-Dec-15,391.85,392,395,390.5,394,393.45,393.11,4866177
31-Dec-15,393.45,394,395.75,389.15,391.6,391.3,391.85,6410622
01-Jan-16,391.3,392.5,403,373,401.8,401.9,398.24,4377363
04-Jan-16,401.9,400,400.1,375.05,376.15,377.05,383.74,7822660
05-Jan-16,377.05,381.05,382.45,372.1,373,374.45,377.36,6901068
06-Jan-16,374.45,374.25,375.5,364.6,365,365.9,370.04,7211230
07-Jan-16,365.9,356.25,358,338.1,344.8,343.55,347.83,11782307
08-Jan-16,343.55,345.6,355.85,345.6,353.9,353.35,351.97,8770370
The error is:
File "<ipython-input-13-177939f245ba>", line 21
... 'formats':('S10', 'f8','f8','f8','f8', 'f8','f8','f8','f8')},delimiter=',')
^
SyntaxError: invalid syntax
How can I correct the syntax?
How about:
import numpy as np
names = ['date', 'a', 'b', 'c', 'e', 'f', 'g', 'h', 'i']
formats = ['S10', 'f8','f8','f8','f8', 'f8','f8','f8','f8']
data = np.loadtxt('C:/Users/rj/Desktop/data.csv',
                  dtype=list(zip(names, formats)), delimiter=',')
Of course, you'd probably prefer more meaningful names. In Python 2, zip already returns a list, so the list(...) wrapper is unnecessary there.
