How to extract the contents of the mth column of the nth row from a CSV file using Python

I created a CSV file and was able to add headers for it. I tried using loc to extract the contents but to no avail.
I want to get e as an output or use it in code for something.
The code I've used is as follows:
import pandas as pd
import csv
with open("boo.csv", "w") as f:
    writer = csv.writer(f)
    writer.writerow(('a','b', 'c'))
df = pd.read_csv("boo.csv", header=None)
df.to_csv("boo.csv", header=["alpha", "beta", "gamma"], index=False)
with open('boo.csv','a') as f:
    writer=csv.writer(f)
    writer.writerow(('c','d','e'))
    writer.writerow(('f','g','h'))
print(df.loc[(df["alpha"]=='c')]["gamma"])
Upon running this code, I'm getting a KeyError for alpha. Please help with this. I'm pretty new to handling CSV files and pandas.
Thank you. :)
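The KeyError comes from the order of operations: df is created with header=None before any header exists, so its columns are labelled 0, 1 and 2. df.to_csv writes the names alpha, beta and gamma to the file, but df itself is unchanged, and the rows appended afterwards never reach it. Below is a minimal sketch of one way around this, assuming it is acceptable to write the header together with the data and to read the file once at the end. The temporary directory and the names by_label, by_position, n and m are illustrative and not from the question.

```python
import csv
import os
import tempfile

import pandas as pd

# Write to a temporary directory so the sketch does not touch a real boo.csv.
path = os.path.join(tempfile.mkdtemp(), "boo.csv")

# Write the header and every data row first.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(("alpha", "beta", "gamma"))
    writer.writerow(("a", "b", "c"))
    writer.writerow(("c", "d", "e"))
    writer.writerow(("f", "g", "h"))

# Read only after the file is complete; the first line becomes the column labels.
df = pd.read_csv(path)

# By label: the "gamma" value of the first row whose "alpha" is 'c'.
by_label = df.loc[df["alpha"] == "c", "gamma"].iloc[0]

# By position: row n, column m, both zero-based and not counting the header.
n, m = 1, 2
by_position = df.iloc[n, m]

print(by_label, by_position)  # e e
```

Here n and m are zero-based and the header line is not counted, so n = 1, m = 2 picks the third field of the second data row.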

Related

How to get specific column value from .csv Python3?

I have a .csv file with Bitcoin price and market data, and I want to get the 5th and 7th columns from the last row in the file. I have worked out how to get the last row, but I'm not sure how to extract columns (values) 5 and 7 from it. Code:
with open('BTCAUD_data.csv', mode='r') as BTCAUD_data:
    writer = csv.reader(BTCAUD_data, delimiter=',')
    data = list(BTCAUD_data)[-1]
    print(data)
Edit: How would I also add column names, and would adding them help me? (I have already manually put the names into individual columns in the first line of the file itself)
Edit #2: Forget about the column names, they are unimportant. I still don't have a working solution. I have a vague idea that I'm not actually reading the file as a list, but rather as a string. (This means when I subscript the data variable, I get a single digit, rather than an item in a list) Any hints to how I read the line as a list?
Edit #3: I have got everything working to expectations now, thanks for everyone's help :)
Your code never uses the csv-reader. You can do so like this:
import csv
# This creates a file with demo data
with open('BTCAUD_data.csv', 'w') as f:
    f.write(','.join(f"header{u}" for u in range(10)) + "\n")
    for l in range(20):
        f.write(','.join(f"line{l}_{c}" for c in range(10)) + "\n")
# This reads and processes the demo data
with open('BTCAUD_data.csv', 'r', newline="") as BTCAUD_data:
    reader = csv.reader(BTCAUD_data, delimiter=',')
    # 1st line is header
    header = next(reader)
    # skip through the file, row will be the last line read
    for row in reader:
        pass
print(header)
print(row)
# each row is a list and you can index into it;
# the 5th and 7th columns are at zero-based indexes 4 and 6
print(header[4], header[6])
print(row[4], row[6])
Output:
['header0', 'header1', 'header2', 'header3', 'header4', 'header5', 'header6', 'header7', 'header8', 'header9']
['line19_0', 'line19_1', 'line19_2', 'line19_3', 'line19_4', 'line19_5', 'line19_6', 'line19_7', 'line19_8', 'line19_9']
header4 header6
line19_4 line19_6
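A possible variation, if you would rather not write the skip loop by hand: collections.deque with maxlen=1 consumes the reader and keeps only the final row. This is only a sketch. The in-memory sample stands in for BTCAUD_data.csv and its values are made up, and it assumes the 5th and 7th columns are wanted, which are zero-based indexes 4 and 6.

```python
import csv
import io
from collections import deque

# Made-up stand-in for BTCAUD_data.csv: a header line plus three data rows.
sample = io.StringIO(
    "c0,c1,c2,c3,c4,c5,c6\n"
    "a0,a1,a2,a3,a4,a5,a6\n"
    "b0,b1,b2,b3,b4,b5,b6\n"
    "z0,z1,z2,z3,z4,z5,z6\n"
)

reader = csv.reader(sample)
header = next(reader)  # first line holds the column names

# A deque limited to one element discards every row except the last one read.
last_row = deque(reader, maxlen=1).pop()

# 5th and 7th columns are at zero-based indexes 4 and 6.
fifth, seventh = last_row[4], last_row[6]
print(fifth, seventh)  # z4 z6
```

If the file contains only the header line, the deque is empty and pop() raises IndexError, so real code may want to check for that case.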
Alternatively, pandas is well suited to handling CSV files.
import pandas as pd
df = pd.read_csv('filename')
df.column_name gives the corresponding column. For example, if the CSV file has a column named Year, df.Year gives you that column.

Python3 Extracting only emails from csv file

I have written a working script that extracts information from a .csv file. However, it prints every line in full instead of just the emails, even though I wrote the code to look specifically for # symbols.
#!/bin/python3
import re
def print_csv():
    in_file = open('sample._data.csv', 'rt')
    for line in in_file:
        if re.findall(r'(.*)#(.*).(.*)', line):
            print(line)
print_csv()
Here's a sample of the output:
"Carlee","Boulter","Tippett, Troy M Ii","8284 Hart St","Abilene","Dickinson","KS",67410,"785-347-1805","785-253-7049","carlee.boulter#hotmail.com","http://www.tippetttroymii.com"
"Thaddeus","Ankeny","Atc Contracting","5 Washington St #1","Roseville","Placer","CA",95678,"916-920-3571","916-459-2433","tankeny#ankeny.org","http://www.atccontracting.com"
"Jovita","Oles","Pagano, Philip G Esq","8 S Haven St","Daytona Beach","Volusia","FL",32114,"386-248-4118","386-208-6976","joles#gmail.com","http://www.paganophilipgesq.com"
"Alesia","Hixenbaugh","Kwikprint","9 Front St","Washington","District of Columbia","DC",20001,"202-646-7516","202-276-6826","alesia_hixenbaugh#hixenbaugh.org","http://www.kwikprint.com"
"Lai","Harabedian","Buergi & Madden Scale","1933 Packer Ave #2","Novato","Marin","CA",94945,"415-423-3294","415-926-6089","lai#gmail.com","http://www.buergimaddenscale.com"
"Brittni","Gillaspie","Inner Label","67 Rv Cent","Boise","Ada","ID",83709,"208-709-1235","208-206-9848","bgillaspie#gillaspie.com","http://www.innerlabel.com"
"Raylene","Kampa","Hermar Inc","2 Sw Nyberg Rd","Elkhart","Elkhart","IN",46514,"574-499-1454","574-330-1884","rkampa#kampa.org","http://www.hermarinc.com"
"Flo","Bookamer","Simonton Howe & Schneider Pc","89992 E 15th St","Alliance","Box Butte","NE",69301,"308-726-2182","308-250-6987","flo.bookamer#cox.net","http://www.simontonhoweschneiderpc.com"
"Jani","Biddy","Warehouse Office & Paper Prod","61556 W 20th Ave","Seattle","King","WA",98104,"206-711-6498","206-395-6284","jbiddy#yahoo.com","http://www.warehouseofficepaperprod.com"
"Chauncey","Motley","Affiliated With Travelodge","63 E Aurora Dr","Orlando","Orange","FL",32804,"407-413-4842","407-557-8857","chauncey_motley#aol.com","http://www.affiliatedwithtravelodge.com"
What I'm trying to do is get the output to look like a list of emails. I have trouble with filtering out the other content from the csv file.
As mentioned above, you should be able to use the built-in csv library. If the file is a CSV, it should have a structured format, and even if it doesn't have column names, you can pull a value by column position. In your sample data the email is the 11th field, which is index 10. Please check out the official Python docs for the csv module.
>>> import csv
>>> with open('sample._data.csv', newline='') as csvfile:
...     reader = csv.reader(csvfile, delimiter=',', quotechar='"')
...     for row in reader:
...         print(row[10])
# output:
carlee.boulter#hotmail.com
tankeny#ankeny.org
joles#gmail.com
alesia_hixenbaugh#hixenbaugh.org
lai#gmail.com
bgillaspie#gillaspie.com
rkampa#kampa.org
flo.bookamer#cox.net
jbiddy#yahoo.com
chauncey_motley#aol.com
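If the column position is not guaranteed, another option is to test each parsed field against a pattern instead of testing the whole line. The sketch below is an illustration only: EMAIL_LIKE is a deliberately simple, hypothetical pattern rather than a full email validator, extract_emails is a made-up helper name, and the '#' separator is kept exactly as it appears in the sample data.

```python
import csv
import io
import re

# Two rows copied from the sample output above.
SAMPLE = '''"Carlee","Boulter","Tippett, Troy M Ii","8284 Hart St","Abilene","Dickinson","KS",67410,"785-347-1805","785-253-7049","carlee.boulter#hotmail.com","http://www.tippetttroymii.com"
"Thaddeus","Ankeny","Atc Contracting","5 Washington St #1","Roseville","Placer","CA",95678,"916-920-3571","916-459-2433","tankeny#ankeny.org","http://www.atccontracting.com"
'''

# Hypothetical pattern: local part, '#', then a domain with at least one dot.
EMAIL_LIKE = re.compile(r"[\w.+-]+#[\w-]+(\.[\w-]+)+")

def extract_emails(lines):
    """Return every field that consists entirely of an email-like token."""
    found = []
    for row in csv.reader(lines):
        for field in row:
            # fullmatch rejects fields such as "5 Washington St #1"
            if EMAIL_LIKE.fullmatch(field):
                found.append(field)
    return found

emails = extract_emails(io.StringIO(SAMPLE))
print(emails)  # ['carlee.boulter#hotmail.com', 'tankeny#ankeny.org']
```

The original script printed whole lines because print(line) outputs the line that was tested, not the part the regex matched. Matching per field avoids that.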

How can I convert a pipe delimited text file to a csv with each data in individual columns using python

Good day,
I am currently trying to create a program in Python 2.7 that takes a pipe-delimited input text file and converts it into an Excel CSV with each value in its own column.
I have tried the following:
import pandas as pd
import easygui
def convert():
    file = easygui.fileopenbox()
    data = pd.read_csv(file, header=None, skiprows=0)
    export = data.to_csv(r'C:\Users\micha\Desktop\Python Tools\output.csv') #index=True, header=True)
print(convert())
This program yields this:
If anyone can help it would be much appreciated.
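One likely cause is that pd.read_csv splits on commas by default, so each pipe-delimited line ends up in a single column. Passing sep="|" to read_csv should split the fields. The same conversion can also be sketched with only the standard csv module, as below. This is written for Python 3 rather than 2.7, pipe_to_csv is a hypothetical helper name, and the sample text is invented because the original input is not shown.

```python
import csv
import io

def pipe_to_csv(src, dst):
    """Copy pipe-delimited rows from src to dst as comma-separated rows."""
    reader = csv.reader(src, delimiter="|")
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        # Strip stray spaces around each field before writing it out.
        writer.writerow([field.strip() for field in row])

# Invented sample input; the real file layout is an assumption.
pipe_text = "name|city|score\nAnn|Oslo|3\nBob|Lima, Peru|5\n"

out = io.StringIO()
pipe_to_csv(io.StringIO(pipe_text), out)
print(out.getvalue())
```

With real files, open the input and output with newline="" and pass the file objects in place of the StringIO buffers. Fields that contain a comma are quoted automatically by csv.writer, so they remain a single column when opened in Excel.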

How to check if a CSV file is empty and to add data to it in Python?

I have created a CSV file and it is currently empty. My code checks whether the CSV file contains data. If it doesn't, it adds data to it. If it does, it doesn't do anything. This is what I have tried so far:
import pandas as pd
df = pd.read_csv("file.csv")
if df.empty:
    #code for adding in data
else:
    pass #do nothing
But when implemented, I got the error:
pandas.errors.EmptyDataError: No columns to parse from file
Is there a better way to check if the CSV file is empty or not?
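You can catch the exception instead:
import pandas as pd
try:
    # file.csv is an empty CSV file
    df = pd.read_csv('file.csv')
except pd.errors.EmptyDataError:
    # the file is empty: code for adding data goes here
    pass
else:
    pass  # the file already has data: do nothing, as the question asks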
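Another option that avoids parsing altogether is to check the file size before writing. The following is a rough sketch under the assumption that "empty" means zero bytes (a file holding only a header line would count as non-empty). write_if_empty is a hypothetical helper name, and the rows are made-up sample data.

```python
import csv
import os
import tempfile

def write_if_empty(path, rows):
    """Write rows to path only when the file is missing or has zero bytes."""
    if os.path.exists(path) and os.path.getsize(path) > 0:
        return False  # the file already has content; leave it alone
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return True

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.csv")
    open(path, "w").close()  # create an empty file, as in the question

    first = write_if_empty(path, [("name", "age"), ("Dan", 30)])
    second = write_if_empty(path, [("ignored", "row")])

    with open(path, newline="") as f:
        contents = list(csv.reader(f))

print(first, second, contents)
# True False [['name', 'age'], ['Dan', '30']]
```

The second call returns False because the first call already filled the file.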

CSV to Pythonic List

I'm trying to convert a CSV file into a Python list. I have strings organized in columns, and I need an automated way to turn them into a list.
My code works with pandas, but I only see them again as plain text.
import pandas as pd
data = pd.read_csv("Random.csv", low_memory=False)
dicts = data.to_dict().values()
print(data)
So the final result should be something like this: ('Dan', 'Zac', 'David')
You can do this with the csv module in Python:
import csv
with open('Random.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    # each row becomes a list of strings; build the list before the file closes
    your_list = list(reader)
print(your_list)
If you really want a list, try this:
import pandas as pd
data = pd.read_csv('Random.csv', low_memory=False, header=None).iloc[:,0].tolist()
This produces
['Dan', 'Zac', 'David']
If you want a tuple instead, just cast the list:
data = tuple(pd.read_csv('Random.csv', low_memory=False, header=None).iloc[:,0].tolist())
And this produces
('Dan', 'Zac', 'David')
I assumed that you use commas as separators in your csv and your file has no header. If this is not the case, just change the params of read_csv accordingly.
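For a version without pandas, the rows returned by csv.reader can be transposed with zip so that each column becomes a tuple. This is a sketch under the same assumptions as above (comma-separated, no header). The in-memory sample stands in for Random.csv, and its second column is made up.

```python
import csv
import io

# Made-up stand-in for Random.csv: no header, names in the first column.
sample = io.StringIO("Dan,10\nZac,20\nDavid,30\n")

rows = list(csv.reader(sample))

# zip(*rows) turns a list of rows into a sequence of column tuples.
columns = list(zip(*rows))
names = columns[0]

print(names)  # ('Dan', 'Zac', 'David')
```

Use list(names) if a list is needed instead of a tuple. Note that zip stops at the shortest row, so ragged files would lose trailing fields.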
