Python 3: how to split a key in a dictionary in two

This is my first post, so if I miss something, let me know.
I'm doing a CS50 beginner python course, and I'm stuck with a problem.
Long story short, the problem is to open a csv file, and it looks like this:
name,house
"Abbott, Hannah",Hufflepuff
"Bell, Katie",Gryffindor
.....
So I would like to put the rows into a dictionary (which I did), but the problem now is that I'm supposed to split the "name" key in two.
Here is my code, but it doesn't work:
before = []
....
with open(sys.argv[1]) as file:
    reader = csv.reader(file)
    for name, house in reader:
        before.append({"name": name, "house": house})
# here i would love to split the key "name" in "last", "first"
for row in before[1:]:
    last, first = name.split(", ")
Any advice?
Thank you in advance.

After you have the dictionary with complete name, you can split the name as below:
before = [{"name": "Abbott, Hannah", "house": "Hufflepuff"}]
# Before split
print(before)
for item in before:
    # Go through each item in the before list and split the name
    last, first = item["name"].split(', ')
    # Add new keys for last and first name
    item["last"] = last
    item["first"] = first
    # Remove the full name entry
    item.pop("name")
# After split
print(before)
You can also do the split in the first pass, i.e. store the last and first names directly instead of the full name.
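A minimal sketch of that first-pass variant, assuming the file has the name,house header shown in the question. The in-memory sample below stands in for the real file, and csv.DictReader is swapped in for csv.reader so the header row does not need to be skipped by hand:

```python
import csv
import io

# Stand-in for the CSV file from the question (sample rows only).
sample = io.StringIO(
    'name,house\n"Abbott, Hannah",Hufflepuff\n"Bell, Katie",Gryffindor\n'
)

students = []
for row in csv.DictReader(sample):  # the header row supplies the keys
    # Split "Last, First" while reading, so the full name is never stored.
    last, first = row["name"].split(", ")
    students.append({"first": first, "last": last, "house": row["house"]})

print(students)
```

With a real file, the io.StringIO object would be replaced by the handle returned from open(sys.argv[1]).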

Related

Python: Trouble indexing a list from .split()

I'm currently working on a folder rename program that will crawl a directory, and rename specific words to their abbreviated version. These abbreviations are kept in a dictionary. When I try to replace mylist[mylist.index(w)] with the abbreviation, it replaces the entire list. The list shows 2 values, but it is treating them like a single index. Any help would be appreciated, as I am very new to Python.
My current test environment has the following:
c:\test\Accounting 2018
My expected result when this is completed, is c:\test\Acct 2018
import os
keyword_dict = {
    'accounting': 'Acct',
    'documents': 'Docs',
    'document': 'Doc',
    'invoice': 'Invc',
    'invoices': 'Invcs',
    'operations': 'Ops',
    'administration': 'Admin',
    'estimate': 'Est',
    'regulations': 'Regs',
    'work order': 'WO'
}
path = 'c:\\test'
def format_path():
    for kw in os.walk(path, topdown=False):
        #split the output to separate the '\'
        usable_path = kw[0].split('\\')
        #pull out the last folder name
        string1 = str(usable_path[-1])
        #Split this output based on ' '
        mylist = [string1.lower().split(" ")]
        #Iterate through the folders to find any values in dictionary
        for i in mylist:
            for w in i:
                if w in keyword_dict.keys():
                    mylist[i.index(w)] = keyword_dict.get(w)
        print(mylist)
format_path()
When I use print(mylist) prior to the index replacement, I get ['accounting', '2018'], and print(mylist[0]) returns the same result.
After the index replacement, print(mylist) returns ['Acct']; the '2018' is gone as well.
Why is it treating the list values as a single index?
I didn't test the following, but it should point you in the right direction. First, though, I'm not sure that splitting on spaces is the way to go: "Accounting 2018" could also show up as accounting2018 or accounting_2018, so a regular expression would be more robust. Anyway, here is a slightly modified version of your code:
import os
keyword_dict = {
    'accounting': 'Acct',
    'documents': 'Docs',
    'document': 'Doc',
    'invoice': 'Invc',
    'invoices': 'Invcs',
    'operations': 'Ops',
    'administration': 'Admin',
    'estimate': 'Est',
    'regulations': 'Regs',
    'work order': 'WO'
}
path = 'c:\\test'
def format_path():
    for kw in os.walk(path, topdown=False):
        #split the output to separate the '\'
        usable_path = kw[0].split('\\')
        #pull out the last folder name
        string1 = str(usable_path[-1])
        #Split this output based on ' '
        mylist = string1.lower().split(" ") #Remove [] since you are creating a list within a list for no reason
        #Iterate through the folders to find any values in dictionary
        for i in range(0,len(mylist)):
            abbreviation=keyword_dict.get(mylist[i],'')
            if abbreviation!='': #abbreviation exists so overwrite it
                mylist[i]=abbreviation
        new_path=" ".join(mylist) #create new path (i.e. ['Acct', '2018'] ==> Acct 2018)
        usable_path[len(usable_path)-1]=new_path #replace the last item in the original path then rejoin the path
        print("\\".join(usable_path))
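To see why the extra brackets matter, here is a small sketch that needs no directory on disk. The folder name is the one from the question and the dictionary is a trimmed copy; the variable names nested and flat are only illustrative. With the brackets, the outer list has a single element (the inner list), so assigning to index 0 replaces that whole inner list with one string:

```python
keyword_dict = {"accounting": "Acct"}  # trimmed copy of the question's dictionary

folder = "Accounting 2018"

nested = [folder.lower().split(" ")]  # one element, which is itself a list
flat = folder.lower().split(" ")      # two elements, one per word

# Replace known words in place and leave unknown ones untouched.
for i, word in enumerate(flat):
    flat[i] = keyword_dict.get(word, word)

new_name = " ".join(flat)
print(len(nested), len(flat), new_name)
```

Using keyword_dict.get(word, word) falls back to the original word, which avoids the separate check for an empty abbreviation.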
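What you need is:
import re, os
# keyword_dict is the dictionary defined in the question
regex = "|".join(keyword_dict.keys())
repl = lambda x : keyword_dict.get(x.group().lower())
path = 'c:\\test'
# pass re.I as flags=; the fourth positional argument of re.sub is count, not flags
[re.sub(regex, repl, i[0], flags=re.I) for i in os.walk(path)]
Make sure the above produces the names you expect (so far it is working as expected) before you actually rename anything.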
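The regular-expression idea can be tried on plain strings before touching the file system. The sketch below is one possible arrangement rather than the answerer's exact code: the helper name abbreviate is made up for illustration, the dictionary is a trimmed copy, and sorting the keys longest-first is an added precaution so that "documents" is tried before "document".

```python
import re

keyword_dict = {"accounting": "Acct", "documents": "Docs", "document": "Doc"}

# Longest keys first, so the alternation prefers "documents" over "document".
keys_longest_first = sorted(keyword_dict, key=len, reverse=True)
pattern = re.compile(
    "|".join(re.escape(k) for k in keys_longest_first),
    flags=re.IGNORECASE,
)

def abbreviate(name):
    """Return name with every known keyword replaced by its abbreviation."""
    return pattern.sub(lambda m: keyword_dict[m.group().lower()], name)

print(abbreviate("Accounting Documents 2018"))
print(abbreviate("accounting_2018"))
```

Because the pattern has no word boundaries, it also matches inside names such as accounting2018 or accounting_2018, which is the case the first answer was worried about.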

How do i get my code to append to the end of a specific csv row

Here is my code:
import csv
with open("Grades.txt", "r") as file:
    reader = csv.reader(file)
    for row in reader:
        if name == row[0]:
            with open("Grades.txt", "a") as file:
                writer = csv.writer(file)
                writer.writerow(grade)
The variables name and grade have already been defined in an earlier function. I have a text file with a list of names, so the code checks whether the name (John) is in the text file and is then supposed to write the grade (A) next to the name, with a comma separating them. The problem is that my code writes the grade one or two lines below the entire list of names. If I could get it to write at the end of the name, it would just be shown as (JohnA) with no separation. I'm clueless about how to go about fixing this. I would appreciate it if you could correct my code to do what I need. The variable name is an input from a login in a different function, so the input is different every time. New names may also be added through my sign-up function, so the similar question doesn't help.
for example say my text file looked like this:
John
Sam
Bob
And the grade Sam got was an A. How would I append the A grade to the end of Sam's name, with a comma separating the name and the grade?
Sorry, but I don't see how your code example could do the job you describe.
import csv
students = [["Anne", "A"], ["Emily", "B"]]
with open("grades.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for row in students:
        writer.writerow(row)
You must pass a tuple or a list as the row to csv.writer. From your description it sounds as if you write to that file twice, but I don't see that happening in the code you posted.
I hope this helps a little. Sorry, at the moment I can't comment...
New:
What I want to say is that you should combine the names and grades in your main program and then write them to the file. This is how I would solve your task.
names = ["John", "Sam", "Bob"]
grades = ["A", "B", "C"]
names_grades = zip(names, grades)
for row in names_grades:
    print(row)
Each resulting row can then be written to your file easily.
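Since a text file cannot be edited in the middle by appending, one way to get "Sam,A" onto Sam's own line is to read every row, change the matching one in memory and write the whole file back. The sketch below assumes that approach. The helper name add_grade is made up for illustration, and the demo runs on a temporary copy of the three-name example so that no real Grades.txt is touched:

```python
import csv
import os
import tempfile

def add_grade(path, name, grade):
    """Rewrite the file at path, appending grade to the row that starts with name."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    for row in rows:
        if row and row[0] == name:
            row.append(grade)
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)

# Demo on a temporary file holding the example names from the question.
demo_path = os.path.join(tempfile.mkdtemp(), "Grades.txt")
with open(demo_path, "w", newline="") as f:
    f.write("John\nSam\nBob\n")

add_grade(demo_path, "Sam", "A")

with open(demo_path, newline="") as f:
    result = f.read().splitlines()
print(result)
```

Opening the file in "a" mode, as in the question, can only ever add text after the last line, which is why the grade ended up below the list of names.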

Python iterate over specific column in csv , and replacing values

First sorry for my english ;)
I have a problem with a CSV file. The file contains many columns with many different features. I want to iterate over the column host_location and read the entry in each row. Every string that contains "London" or "london" should be turned into a binary value: 1 if the string contains "London" or "london", 0 otherwise.
I'm familiar with Java, but Python is new to me.
What I know so far about this problem:
I can't change the CSV file directly; I have to read it, change the values and write them to a new file.
My method so far:
listings = io.read_csv('../data/playground/neu.csv')
def Change_into_Binaryy():
    listings.loc[listings["host_location"] == ( "London" or
        "london"),"host_location"] = 1
    listings.to_csv("../data/playground/neu1.csv",index =False)
The code is from another Stack Overflow question, and I'm really not familiar with Python yet. The problem is that I can only use the equality operator and not something like contains in Java.
As a result, only the entries that are exactly "London" or "london" are changed to 1. But there are also entries like "London, Uk" that I want to change.
In addition, I don't know how to change the remaining entries to 0, because I don't know how to combine .loc with something like an if/else construct.
I also tried another solution:
def Change_into_Binary():
    for x in listings['host_location']:
        if "London" or "london" in x:
            x = 1
        else:
            x = 0
    listings.to_csv("../data/playground/neu1.csv",index =False)
But this does not work either; in this case the entries are not changed.
Thanks for your answers
from csv import DictReader, DictWriter
# newline='' lets the csv module handle line endings itself
with open('infile.csv', 'r', newline='') as infile, open('outfile.csv', 'w', newline='') as outfile:
    reader = DictReader(infile)
    writer = DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        # substring test on the lower-cased value, so "London, Uk" also counts
        if 'london' in row['host_location'].lower():
            row['host_location'] = 1
        else:
            row['host_location'] = 0
        writer.writerow(row)
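The second attempt in the question fails for two separate reasons, which the following sketch tries to make visible. The sample values are made up. First, "London" or "london" in x is parsed as "London" or ("london" in x), and a non-empty string is always truthy. Second, assigning to the loop variable x only rebinds a local name and never changes the column.

```python
locations = ["London, UK", "london", "Paris, France", ""]  # made-up sample values

# The condition from the question is true for every value, even the empty string.
always_true = [bool("London" or "london" in x) for x in locations]

# Lower-casing first lets a single substring test cover both spellings.
flags = [1 if "london" in x.lower() else 0 for x in locations]

print(always_true)
print(flags)
```

In pandas itself, something along the lines of listings["host_location"].str.contains("london", case=False, na=False).astype(int) should produce the same 0/1 column in one step, though that line is untested here.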

convert data string to list

I'm having some trouble processing some input.
I am reading data from a log file and storing the different values according to the name.
So my input string consists of an IP, a name, a time and a data value.
A log line looks like this, with the fields separated by tabs (\t):
134.51.239.54 Steven 2015-01-01 06:09:01 5423
I'm reading in the values using this code:
loglines = file.splitlines()
data_fields = loglines[0] # IP NAME DATE DATA
for loglines in loglines[1:]:
    items = loglines.split("\t")
    ip = items[0]
    name = items[1]
    date = items[2]
    data = items[3]
This works quite well, but I need to extract all names into a list and I haven't found a working solution.
When I use print name I get:
Steven
Max
Paul
I do need a list of the names like this:
['Steven', 'Max', 'Paul',...]
There is probably a simple solution that I haven't figured out yet. Can anybody help?
Thanks
Just create an empty list and add the names as you loop through the file.
Also note that if that file is very large, file.splitlines() is probably not the best idea, as it reads the entire file into memory -- and then you basically copy all of that by doing loglines[1:]. Better use the file object itself as an iterator. And don't use file as a variable name, as it shadows the type.
with open("some_file.log") as the_file:
    data_fields = next(the_file)      # consumes first line
    all_the_names = []                # this will hold the names
    for line in the_file:             # loops over the rest
        items = line.rstrip("\n").split("\t")  # drop the trailing newline before splitting
        ip, name, date, data = items  # you can put all this in one line
        all_the_names.append(name)    # add the name to the list of names
Alternatively, you could use zip and map to put it all into one expression (using that loglines data), but you rather shouldn't do that. In Python 3 the zip result has to be wrapped in list() before it can be indexed:
list(zip(*map(lambda s: s.split('\t'), loglines[1:])))[1]
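Another option, sketched here under the assumption that every line has the four tab-separated fields from the question, is to let the csv module do the splitting. Only the first data row below comes from the question; the other two rows and the header text are made up to complete the example:

```python
import csv
import io

# Stand-in for the log file: a header line plus tab-separated records.
log_text = (
    "IP\tNAME\tDATE\tDATA\n"
    "134.51.239.54\tSteven\t2015-01-01 06:09:01\t5423\n"
    "134.51.239.55\tMax\t2015-01-01 06:10:12\t17\n"
    "134.51.239.56\tPaul\t2015-01-01 06:11:40\t998\n"
)

reader = csv.reader(io.StringIO(log_text), delimiter="\t")
header = next(reader)                  # skip the header line
names = [row[1] for row in reader if row]  # second field of every non-empty row

print(names)
```

With a real file, io.StringIO(log_text) would be replaced by an open file object, ideally opened with newline="".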

reading data from a file and storing them in a list of lists Python

I have a file data.txt containing the following lines:
I would like to extract the lines of this file into a list of lists: each line becomes a list, and all of them are contained in ListOfLines, which is a list of lists.
When a cell has no data, I just want it to be -1.
This is what I have tried so far:
from random import randint
ListOfLines=[]
with open(r"C:\data.txt",'r') as file:
    data = file.readlines()
    for line in data :
        y = line.split()
        ListOfLines.append(y)
with open(r"C:\output.txt",'a') as output:
    for x in range(0, 120):
        # 'item' represents one line
        for item in ListOfLines :
            item[2] = randint(1, 1000)
            for elem in item :
                output.write(str(elem))
                output.write(' ')
            output.write('\n')
        output.write('------------------------------------- \n')
How can I improve my program so that it contains less code and runs faster?
Thank you in advance :)
Well, sharing your sample data as an image doesn't make it easy to work with. Presented like this, I won't even bother, and I assume others feel the same.
However, data = file.readlines() first loads the whole content of the file into a list, and then you iterate over that list. You could iterate directly with 'for line in file:' instead. That improves it a little.
You haven't mentioned what you want to do in the output part, which seems quite messy.
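For the reading part, a compact sketch is possible once the file layout is known. Since the sample data was only posted as an image, the layout below (tab-separated cells, with an empty cell meaning "no data") is purely an assumption, and the rows are made up:

```python
import csv
import io

# Made-up stand-in for data.txt; the real layout was only shown as an image.
sample = io.StringIO("1\t10\t100\n2\t\t200\n3\t30\t\n")

# One inner list per line; empty cells become -1, other cells stay as strings.
list_of_lines = [
    [cell if cell.strip() else -1 for cell in row]
    for row in csv.reader(sample, delimiter="\t")
]

print(list_of_lines)
```

Note that line.split() without arguments, as used in the question, collapses runs of whitespace, so empty cells cannot be detected that way. An explicit delimiter is needed for the -1 substitution to work.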
