Opening a text file and adding to a dictionary - python-3.x

1) I'm looking to open a text file with values separated by colons like this:
Name : Daniel
Age : 12
Gender : Male
...
How do I open this text file in Python and add everything to a dictionary so that it ends up like this:
dictionary={"Name":"Daniel","Age":"12","Gender":"Male"...}
2) I then want the user to be able to search for a key, let's say "Name" and then the program outputs "Daniel". How can I do this?

A suggestion:
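# Assumes all "key : value" pairs sit on the first line, separated by spaces
with open("file.txt", "r") as file:
    lines = file.readlines()
output_dict = {}
# Drop the trailing newline, then turn " : " into a plain space
line = lines[0].strip().replace(" : ", " ")
words = line.split(' ')
# Words now alternate key, value, key, value, ...
for i in range(0, len(words) - 1, 2):
    output_dict[words[i]] = words[i + 1]
print(output_dict)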

And what is the separator between lines?
If it is a line break, you can use file.readlines():
yourFile = open("file.txt", "r")
lines = yourFile.readlines()
yourFile.close()
output = {}
for line in lines:
    # BE CAREFUL! Use [:-2] if the file has Windows line breaks (\r\n)
    # You can also do l = line.replace('\r', "").replace("\n", "")
    # which is better, because it is cross-platform and cross-format
    l = line[:-1]
    output[l.split(':')[0]] = l.split(':')[1]
Explanation:
yourFile.readlines() reads the file and returns it as a list like ["line1\n", "line2\n"].
We loop over the lines and cut the line break character(s) from each one (\n, \r\n or \r, depending on the OS; it should be only \n, but check).
For each line:
We split the string at the colon: "Name:Daniel".split(":") returns ['Name', 'Daniel']
We add it to the dictionary with the dictionary['key'] = 'value' syntax
It should work, but be careful: the spaces around the colon stay!
To remove them, use string.replace: "Name : Daniel".replace(" ", "") returns "Name:Daniel".
To get the name back once you have the dictionary, nothing could be simpler: dictionary["Name"] returns "Daniel".
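Both parts of the question (building the dictionary, then searching it by key) can be sketched together as below. This is only a minimal sketch: the helper names parse_pairs and lookup are made up for illustration, and it assumes one "key : value" pair per line.

```python
def parse_pairs(lines):
    """Build a dict from an iterable of "key : value" lines (hypothetical helper)."""
    result = {}
    for line in lines:
        if ":" not in line:
            continue  # skip blank or malformed lines such as "..."
        key, value = line.split(":", 1)  # split on the first colon only
        result[key.strip()] = value.strip()  # trim spaces and the newline
    return result


def lookup(data, key):
    """Return the value stored under key, or None if the key is absent."""
    return data.get(key)


# In-memory sample standing in for the lines of file.txt
sample = ["Name : Daniel\n", "Age : 12\n", "Gender : Male\n"]
data = parse_pairs(sample)
print(lookup(data, "Name"))  # prints Daniel
```

With a real file, passing the open file object works the same way, since iterating over a file yields its lines: with open("file.txt") as f: data = parse_pairs(f). The key to search for could come from input().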

Related

extract words from a text file and print next line

sample input
in parsing a text file .txt = ["'blah.txt'", "'blah1.txt'", "'blah2.txt'" ]
the expected output in another text file out_path.txt
blah.txt
blah1.txt
blah2.txt
The code I tried just appends "[]" to the input file. I also tried a Perl one-liner to replace the double and single quotes.
read_out_fh = open('out_path.txt',"r")
for line in read_out_fh:
    for word in line.split():
        curr_line = re.findall(r'"(\[^"]*)"', '\n')
        print(curr_line)
This happens because the contents of a file are read as a string, not as a list, even if the text is formatted like a list. That is why you get [] from re.findall. Note also that re.findall is called on the literal string '\n' rather than on the line, so it can never match.
With for line in read_out_fh: you iterate over the text of the file, not over the elements of a list, which is why you are not getting the desired output.
So I wrote something that first transforms the string into a list, eliminating the double and single quotes as you mentioned, and then writes the result to a new file, example.txt.
Note: change the file names to match your files.
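read_out_fh = open('file.txt', "r")
# Open the output once, so later lines do not overwrite earlier ones
with open("example.txt", "w") as output:
    for line in read_out_fh:
        # Drop the newline first, then the brackets and both kinds of quotes
        words = line.strip().strip("[]").replace('"', '').replace("'", '').split(", ")
        for word in words:
            output.write(word + '\n')
read_out_fh.close()
example.txt (output file):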
blah.txt
blah1.txt
blah2.txt
The code below works for the example you gave in the question:
# Content of textfile.txt:
asdasdasd=["'blah.txt'", "'blah1.txt'", "'blah2.txt'"]asdasdasd
# Code:
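import re
read_in_fh = open('textfile.txt', "r")
write_out_fh = open('out_path.txt', "w")
for line in read_in_fh:
    # Capture everything between the square brackets
    find_list = re.findall(r'\[(".*?"*)\]', line)
    if not find_list:
        continue  # skip lines without a bracketed list
    for element in find_list[0].split(","):
        # Remove both kinds of quotes and the surrounding spaces
        element_formatted = element.replace('"', '').replace("'", "").strip()
        write_out_fh.write(element_formatted + "\n")
read_in_fh.close()
write_out_fh.close()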
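Since the bracketed text in the sample happens to be a valid Python list literal, another option is to let ast.literal_eval parse it instead of a regular expression. A minimal sketch, assuming every bracketed part really is a well-formed list of quoted strings (the function name extract_names is made up here):

```python
import ast


def extract_names(line):
    """Return the file names from a line containing a list literal (hypothetical helper)."""
    start = line.find("[")
    end = line.rfind("]")
    if start == -1 or end == -1:
        return []  # no bracketed list on this line
    # Safely evaluate only the "[...]" part as a Python literal
    items = ast.literal_eval(line[start:end + 1])
    # Each item still carries its inner quotes, e.g. "'blah.txt'"
    return [item.strip("'\"") for item in items]


line = """asdasdasd=["'blah.txt'", "'blah1.txt'", "'blah2.txt'"]asdasdasd"""
print("\n".join(extract_names(line)))
```

ast.literal_eval only accepts literals, so it raises an exception on malformed input rather than executing anything; the regex approach above is more forgiving if the text is not strictly valid Python.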

Splitting text file in Python - delimiter issue

I am trying to split a file on the delimiter "}., but the delimiter is not found, and as a result I get only one new file with the same content as the original. The code is:
with open('okladki_200_01') as fp:
    contents = fp.read()
i = 1
for entry in contents.split('"}.'):
    f = open("okladka_%s" % i, "w+")
    f.write(entry)
    f.close()
    i += 1
Can you help, please?
EDIT:
The content of the file is like:
{"base64Image":"/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEB\nAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/2wBDAQEBAQEBAQEBAQEBAQEBAQEBAQEB\nAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/wAARCAusFMADASIA\nAhEBAxEB/8QAHwAAAgIBBQEBAAAAAAAAAAAAAgQAAwUBBgcICQoL/8QAaRAAAQEFBAcDBwgHBQYD\nAwEZAwIBBBESEwAhIiMFFDEyM0FDUVNhBiRCY3GBkQcIFTRSc6GxRGKDk8HR8FRyo+HxCRYlZLPD\ndILTFzWEkp [...] 3aIiVoL1pmNQxjWr27\nPBnhatT94NfdwDzDBz9aSP/Z\n","elementHashcode":-1794239528,"imageOrientation":6,"type":"BOOK"}
{"base64Image":"/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEB\nAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEB
And I think I just found the problem... HxD viewer displays 0x0A ASCII character as a dot, but it is New Line. So I should look for '"}\n'
Move contents.split('"}.') into its own variable.
lines = contents.split('"}.')
for entry in lines:
    ...
Code:
with open('textfile') as fp:
    contents = fp.read()
i = 1
# If the "." shown by the hex viewer is really a newline, split on '"}\n' instead
lines = contents.split('"}.')
for entry in lines:
    f = open("textfile_%s" % i, "w+")
    f.write(entry)
    f.close()
    i += 1
Do you actually need to check for brackets? In your case it seems like your input file is already formatted with 1 content = 1 line, so our delimiter could be \n instead and we can use readlines().
Here is a possible solution:
with open('okladki_200_01') as fp:
    lines = fp.readlines()  # this is a list of strings.
i = 1
for line in lines:
    entry = line.lstrip("{").rstrip("}\n")  # some clean-up.
    f = open("okladka_%s" % i, "w+")
    f.write(entry)
    f.close()
    i += 1
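Following the asker's own finding that the "dot" is really a newline (0x0A), the split can be sketched with '"}\n' as the delimiter. This is a sketch under that assumption; split_entries is a made-up name, and it puts the closing '"}' back on each piece because str.split removes the delimiter.

```python
def split_entries(contents, delimiter='"}\n'):
    """Split contents on delimiter, restoring the closing '"}' (hypothetical helper)."""
    entries = []
    for piece in contents.split(delimiter):
        if not piece.strip():
            continue  # ignore the empty piece after the last delimiter
        if not piece.rstrip().endswith('"}'):
            piece += '"}'  # split() dropped the delimiter, so add the close back
        entries.append(piece)
    return entries


# Tiny stand-in for the real file contents
sample = '{"type":"BOOK"}\n{"type":"CD"}\n'
for i, entry in enumerate(split_entries(sample), start=1):
    print("okladka_%s" % i, entry)
```

In the real script, each entry would be written to its own file inside the loop, for example with open("okladka_%s" % i, "w") as f: f.write(entry).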

Find items in a text file that is a concatenated string of capitalized words that begin with a certain capital letter in python

I am trying to pull input names that get saved to a text file. I need to pull them by a capital letter that the user enters. For example, the saved text file contains the names DanielDanClark, and I need to pull the names that begin with D. I am stuck at this part:
for i in range(num):
    print("Name",i+1," >> Enter the name:")
    n=input("")
    names+=n
file=open("names.txt","w")
file.write(names)
lookUp=input("Did you want to look up any names?(Y/N)")
x= ord(lookUp)
if x == 110 or x == 78:
    quit()
else:
    letter=input("Enter the first letter of the names you want to look up in uppercase:")
    file=open("names.txt","r")
    fileNames=[]
    file.list()
    for letter in file:
        fileNames.index(letter)
    fileNames.close()
I know that the last 4 lines are probably way wrong. It is what I tried in my last failed attempt.
Let's break down your code block by block.
num = 5
names = ""
for i in range(num):
    print("Name",i+1," >> Enter the name:")
    n=input("")
    names+=n
I took the liberty of giving num a value of 5 and names a value of "", just so the code will run. This block has no problems and will create a string called names with all the input taken. You might consider putting a delimiter in, which makes it easier to read your data back. A suggestion would be to use \n, which is a line break, so that when you write the file you actually have one name on each line. Example:
num = 5
names = ""
for i in range(num):
    print("Name",i+1," >> Enter the name:")
    n = input()
    names += n + "\n"  # one name per line
Now you are going to write the file:
file=open("names.txt","w")
file.write(names)
In this block you forget to close the file. It is also better to fully specify the pathname of the file. Example:
file = open(r"c:\somedir\somesubdir\names.txt","w")
file.write(names)
file.close()
or, even better, using with:
with open(r"c:\somedir\somesubdir\names.txt","w") as openfile:
    openfile.write(names)
In the following block you ask whether the user wants to look up a name, and exit if not:
lookUp=input("Did you want to look up any names?(Y/N)")
x= ord(lookUp)
if x == 110 or x == 78:
    quit()
First, you are using quit(), which should not be used in production code; you really should use sys.exit(), which means you need to import the sys module. You then take the numeric value of the answer, N or n, and check it in an if statement. You do not need ord(); you can use a string comparison directly in your if statement. Example:
import sys
lookup = input("Did you want to look up any names?(Y/N)")
if lookup.lower() == "n":
    sys.exit()
Then you proceed to look up the requested data in the else: block of the previous if statement:
else:
    letter=input("Enter the first letter of the names you want to look up in uppercase:")
    file=open("names.txt","r")
    fileNames=[]
    file.list()
    for letter in file:
        fileNames.index(letter)
    fileNames.close()
This does not really work either, and this is where the \n delimiter comes in handy. When a text file is opened, you can use a for line in file block to go through the file line by line, and with the \n delimiter added in your first block, each line is a name. You also go wrong in the for letter in file block; it does not do what you think it does. It actually loops over each line in the file and overwrites letter, regardless of what you typed in the input earlier. Here is a working example:
letter = input("Enter the first letter of the names you want to look up in uppercase:")
result = []
with open(r"c:\somedir\somesubdir\names.txt", "r") as openfile:
    for line in openfile:  ## loop through the file line by line
        line = line.strip('\n')  ## get rid of the delimiter
        if line and line[0].lower() == letter.lower():  ## skip empty lines, compare the first (zero) character
            result.append(line)  ## append to result
print(result)  ## do something with the result
Putting it all together:
import sys
num = 5
names = ""
for i in range(num):
    print("Name",i+1," >> Enter the name:")
    n = input("")
    names += n + "\n"
with open(r"c:\somedir\somesubdir\names.txt","w") as openfile:
    openfile.write(names)
lookup = input("Did you want to look up any names?(Y/N)")
if lookup.lower() == "n":
    sys.exit()
letter = input("Enter the first letter of the names you want to look up in uppercase:")
result = []
with open(r"c:\somedir\somesubdir\names.txt", "r") as openfile:
    for line in openfile:
        line = line.strip('\n')
        if line and line[0].lower() == letter.lower():
            result.append(line)
print(result)
One caveat I'd like to point out: when you create the file, you open it in w mode, which creates a new file every time and therefore overwrites any previous file. If you would like to append to a file, open it in a mode, which appends to an existing file or creates a new file when it does not exist.
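The difference between w and a mode, together with the lookup by first letter, can be sketched as follows. The function names add_names and names_starting_with are made up for this example, and the demo writes to a temporary directory rather than to the path used above.

```python
import os
import tempfile


def add_names(path, names):
    """Append names to path, one per line ("a" mode creates the file if needed)."""
    with open(path, "a") as f:
        for name in names:
            f.write(name + "\n")


def names_starting_with(path, letter):
    """Return the names in path whose first letter matches, ignoring case."""
    with open(path, "r") as f:
        names = [line.strip() for line in f]
    return [n for n in names if n and n[0].lower() == letter.lower()]


# Demo in a temporary directory so no real file is touched
path = os.path.join(tempfile.mkdtemp(), "names.txt")
add_names(path, ["Daniel", "Dan"])
add_names(path, ["Clark"])  # the second call appends instead of overwriting
print(names_starting_with(path, "D"))  # prints ['Daniel', 'Dan']
```

Had the file been opened with "w" in add_names, the second call would have replaced the first two names and only Clark would remain.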

file reading in python, need help for homework

Write a function func(infilepath) that reads the file whose file path is infilepath, and prints the number of times each character(excluding newline characters) appeared in the file, in sorted order of the characters.
Any help would be greatly appreciated !
This won't be the exact answer, but enough to get you started!
First, open a file:
f = open("file.txt", "r")
Then read lines
lines = f.readlines()
Define a dictionary. Split each line by spaces, then increment the count by one if the item is already present in the dictionary; otherwise initialize it to 1.
chars = {}
lines = [line.strip() for line in lines]
for line in lines:
    # Note: this counts space-separated items; loop over the line itself to count characters
    line = line.split(" ")
    for i in line:
        if i not in chars:
            chars[i] = 1  # first occurrence
        else:
            chars[i] += 1
More about file handling: https://github.com/thewhitetulip/build-app-with-python-antitextbook/blob/master/manuscript/06-file-handling.md
More about sets/lists/dictionaries: https://github.com/thewhitetulip/build-app-with-python-antitextbook/blob/master/manuscript/04-list-set-dict.md
Some practical examples to get you thinking: https://github.com/thewhitetulip/build-app-with-python-antitextbook/blob/master/manuscript/13-examples.md
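For comparison, here is one way the counting itself could be sketched with collections.Counter from the standard library, counting individual characters rather than space-separated items. The helper char_counts and the output format are assumptions for illustration; the assignment does not specify how the counts should be printed.

```python
import os
import tempfile
from collections import Counter


def char_counts(infilepath):
    """Return (character, count) pairs sorted by character, newlines excluded."""
    with open(infilepath, "r") as f:
        counts = Counter(f.read())  # count every character in the file
    counts.pop("\n", None)  # drop newline characters if present
    return sorted(counts.items())


def func(infilepath):
    """Print each character and how many times it appears."""
    for char, count in char_counts(infilepath):
        print(repr(char), count)


# Demo on a small temporary file
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as f:
    f.write("ab\nba a")
func(path)
```

repr() is used when printing so that a space shows up visibly as ' ' instead of as an empty-looking column.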

Use Python to parse comma separated string with text delimiter coming from stdin

I have a csv file that is being fed to my Python script via stdin.
This is a comma separated file with quotations as text delimiter.
Here is an example line:
457,"Last,First",NYC
My script so far, splits each line by looking for commas, but how do I make it aware of the text delimiter quotes?
My current script:
import sys
for line in sys.stdin:
    line = line.strip()
    fields = line.split(',')  # naive split: also breaks on commas inside quotes
    print(fields)
The code splits the name into two since it does not recognize the quotations enclosing that text field. I need the name to remain as a single element.
If it matters, the data is being fed through stdin within a hadoop-streaming program.
Thanks!
Well, you could do it more manually, with something like this:
import sys
for line in sys.stdin:
    row = []
    enclosed = False
    word = ''
    for character in line.strip():
        if character == '"':
            enclosed = not enclosed  # toggle on every quote
        elif character == ',' and not enclosed:
            row.append(word)  # a comma outside quotes ends the field
            word = ''
        else:
            word += character
    row.append(word)  # the last field has no trailing comma
    print(row)
I haven't tested it or thought about it for too long, but it seems to me it could work. Someone more fluent in Python syntax could probably find something better for doing the trick, though ;)
Attempting to answer my own question. If I read right, it may be possible to send a streaming input into csv reader like so:
import csv
import sys
for line in csv.reader(sys.stdin):
    print(line)
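To check that csv.reader really keeps the quoted field together, the example line from the question can be fed through it from an in-memory stream. A small sketch: io.StringIO stands in for sys.stdin here, and parse_rows is a made-up helper name.

```python
import csv
import io


def parse_rows(stream):
    """Parse CSV text from any file-like object into a list of rows."""
    return list(csv.reader(stream))


# io.StringIO stands in for sys.stdin in this demo
sample = io.StringIO('457,"Last,First",NYC\n')
rows = parse_rows(sample)
print(rows)  # prints [['457', 'Last,First', 'NYC']]
```

In the streaming script, parse_rows(sys.stdin) would behave the same way, although iterating over csv.reader(sys.stdin) directly avoids holding all rows in memory.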
