Nested for loop in python doesn't iterate fully - python-3.x

I have the following code to replace a single value on the 26th line of a 150-line data file. The problem is with the nested for loop: from the second iteration of the outer loop onwards, the inner loop doesn't execute. Instead, the loop skips to its last line.
n= int(input("Enter the Number of values: "))
arr = []
print ('Enter the Fz values (in Newton): ')
for _ in range(n):
    x=(input())
    arr.append(x)
print ('Array: ', arr)
os.chdir ("D:\Work\python")
f = open("datanew.INP", "r+")
f1 = open("data.INP", "w+")
for i in range(len(arr)):
    str2 = arr[i]
    for j,line in enumerate(f):
        if j == 25 :
            new = line.split()
            linenew = line.replace(new[1],str2)
            print (linenew)
            f1.write(linenew)
        else:
            f1.write(line)
    print(arr[i], 'is replaced')
f.close()
f1.close()

The issue is that your code is looping over a file. On the first pass through, the end of file is reached. After that, there is no data left in the file to read, so the next loop has nothing to iterate over.
Instead, you might try reading over the whole file and storing the data in a list, then looping over that list. (Alternatively, you could eliminate the loops and access the 26th item directly.)
Here is some simple code to read from one file, replace the 26th line, and write to another file:
f = open("data.INP", "r")  # For simple reading you don't need the '+' added to the 'r'
the_data = f.readlines()
f.close()
the_data[25] = 'new data\n'  # Python uses 0-based indexing, so the 26th line is index 25
f1 = open("updated_data.out", "w")  # Assuming you want to write a new file, leave off the '+' here; 'w+' would also open the file for reading, which isn't needed
for l in the_data:
    f1.write(l)
f1.close()
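The exhaustion described above is easy to see in isolation. The sketch below uses an in-memory io.StringIO object as a stand-in for a real file (an assumption made only so the example runs anywhere); a file object returned by open() behaves the same way when iterated.

```python
import io

# An in-memory stand-in for a real file; objects from open() behave the same way.
f = io.StringIO("line 1\nline 2\nline 3\n")

first_pass = [line for line in f]    # consumes the file to the end
second_pass = [line for line in f]   # nothing left, so this list is empty

f.seek(0)                            # rewind to the start of the file
third_pass = [line for line in f]    # all three lines are available again

print(len(first_pass), len(second_pass), len(third_pass))
```

Calling f.seek(0) at the top of the outer loop would let the inner loop run again, although in the original code every pass would then write another full copy of the file to the same output file. Reading the lines once into a list is usually the simpler route.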

Related

How do I count all occurrences of a phrase in a text file using regular expressions?

I am reading in multiple files from a directory and attempting to find how many times a specific phrase (in this instance "at least") occurs in each file (not just whether it occurs, but how many times it occurs in each text file). My code is as follows:
import glob
import os
path = 'D:/Test'
k = 0
for filename in glob.glob(os.path.join(path, '*.txt')):
    if filename.endswith('.txt'):
        f = open(filename)
        data = f.read()
        data.split()
        data.lower()
        S = re.findall(r' at least ', data, re.MULTILINE)
        count = []
        if S == True:
            for S in data:
                count.append(data.count(S))
            k= k + 1
            print("'{}' match".format(filename), count)
        else:
            print("'{}' no match".format(filename))
print("Total number of matches", k)
At this moment I get no matches at all. I can count whether or not there is an occurrence of the phrase but am not sure why I can't get a count of all occurrences in each text file.
Any help would be appreciated.
regards
You can get rid of the regex entirely; the count method of string objects is enough, and much of the other code can be simplified as well.
You're also not changing data to lower case: data.lower() returns a new lowercase string and leaves data untouched, so the result is thrown away. Note how I use data = data.lower() to actually change the variable.
Try this code:
import glob
import os
path = r'c:\script\lab\Tests'  # raw string, so the backslashes are taken literally
k = 0  # counts the files that contain the phrase
substring = ' at least '
for filename in glob.glob(os.path.join(path, '*.txt')):
    if filename.endswith('.txt'):
        with open(filename) as f:  # the file is closed automatically
            data = f.read()
        data = data.lower()
        S = data.count(substring)
        if S:
            k = k + 1
            print("'{}' match".format(filename), S)
        else:
            print("'{}' no match".format(filename))
print("Total number of matches", k)
If anything is unclear feel free to ask!
You make multiple mistakes in your code. data.split() and data.lower() have no effect at all, since they both do not modify data but return a modified version. However, you don't assign the return value to anything, so it is lost.
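The point about discarded return values can be checked in a few lines. This is a minimal sketch; the sample string and the name unchanged are made up for illustration.

```python
data = "At Least once"

data.lower()              # the lowercase copy is discarded; data is unchanged
unchanged = data

data = data.lower()       # rebinding the name keeps the lowercase copy

print(unchanged, "->", data)
```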
Also, you should always close a resource (e.g. a file) when you don't need it anymore.
Also, you store every string found by re.findall in a list S, which you never use properly afterwards. Keeping it would be pointless anyway, because it would just contain the string you are looking for, repeated once per occurrence. You can just take the list that is returned by re.findall and compute its length. This gives you the number of times the phrase occurs in the text. Then you just increase your counter variable k by that amount and move on to the next file. You can still have your print statements by simply printing the temporary num_found variable.
import re
import glob
import os
path = 'D:/Test'
k = 0
for filename in glob.glob(os.path.join(path, '*.txt')):
    if filename.endswith('.txt'):
        f = open(filename)
        text = f.read()
        f.close()
        # search the text that was just read
        num_found = len(re.findall(r' at least ', text, re.MULTILINE))
        k += num_found
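Both answers match the literal string ' at least ' with a space on each side, so an occurrence at the start of a line or before a comma is missed. The sketch below is one possible variation, not either answerer's method: it assumes that case-insensitive, whole-word matching is what is wanted, and the helper name count_phrase is hypothetical.

```python
import re

def count_phrase(text, phrase):
    """Count case-insensitive, whole-word occurrences of phrase in text."""
    # re.escape guards against regex metacharacters in the phrase;
    # \b matches at word boundaries, so "at least," and "At least" both count.
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

sample = "At least one. We need at least two,\nat least."
print(count_phrase(sample, "at least"))
```

Inside the file loop, num_found = count_phrase(text, "at least") could replace the re.findall line.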

How can I print the line index of a specific word in a text file?

I was trying to find a way to print the biggest word from a txt file, its size and its line index. I managed to get the first two done but can't quite figure out how to print the line index. Can anyone help me?
def BiggestWord():
    list_words = []
    with open('song.txt', 'r') as infile:
        lines = infile.read().split()
        for i in lines:
            words = i.split()
            list_words.append(max(words, key=len))
    biggest_word = str(max(list_words, key=len))
    print biggest_word
    print len(biggest_words)
    FindWord(biggest_word)
def FindWord(biggest_word):
You don't need to do another loop through your list of largest words from each line. Every for-loop increases function time and complexity, and it's better to avoid unnecessary ones when possible.
As one of the options, you can use Python's built-in function enumerate to get an index for each line from the list of lines, and instead of adding each line maximum to the list, you can compare it to the current max word.
def get_largest_word():
    # Setting initial variable values
    current_max_word = ''
    current_max_word_length = 0
    current_max_word_line = None
    with open('song.txt', 'r') as infile:
        lines = infile.read().splitlines()
    for line_index, line in enumerate(lines):
        words = line.split()
        if not words:
            continue  # skip blank lines; max() fails on an empty list
        max_word_in_line = max(words, key=len)
        max_word_in_line_length = len(max_word_in_line)
        if max_word_in_line_length > current_max_word_length:
            # updating the largest word value with a new maximum word
            current_max_word = max_word_in_line
            current_max_word_length = max_word_in_line_length
            current_max_word_line = line_index + 1  # line number starting from 1
    print(current_max_word)
    print(current_max_word_length)
    print(current_max_word_line)
    return current_max_word, current_max_word_length, current_max_word_line
P.S.: This function doesn't decide what to do with words of the same maximum length: when several words tie, it simply keeps the first one it finds. You would need to adjust the code if another choice is wanted.
P.P.S.: This example is in Python 3, so change the snippet to work in Python 2.7 if needed.
With the limited amount of info I'm working with, this is the best solution I could think of. Assuming that the lines are separated by a newline character ('\n'), you could do:
def FindWord(largest_word):
    with open('song.txt', 'r') as infile:
        lines = infile.read().splitlines()
    linecounter = 1
    for i in lines:
        if largest_word in i.split():  # check the current line, not the whole list
            return linecounter
        linecounter += 1
You can use enumerate in your for loop to get the index of the current line, and sorted with a lambda key to get the longest word:
def longest_word_from_file(filename):
    list_words = []
    with open(filename, 'r') as input_file:
        for index, line in enumerate(input_file):
            words = line.split()
            if words:  # skip blank lines; max() fails on an empty list
                list_words.append((max(words, key=len), index))
    sorted_words = sorted(list_words, key=lambda x: -len(x[0]))
    longest_word, line_index = sorted_words[0]
    return longest_word, line_index
Are you aware that there can be:
many 'largest' words with the same length
several lines that contain word(s) with the biggest length
Here is the code that finds ONE largest word and returns a LIST of the numbers of the lines that contain the word:
# build a dictionary:
# line_num: largest_word_in_this_line
# line_num: largest_word_in_this_line
# etc...
# !!! actually, a line can contain several largest words
list_words = {}
with open('song.txt', 'r') as infile:
    for i, line in enumerate(infile.read().splitlines()):
        if line.split():  # skip blank lines; max() fails on an empty list
            list_words[i] = max(line.split(), key=len)
# get the largest word from values of the dictionary
# !!! there can be several different 'largest' words with the same length
largest_word = max(list_words.values(), key=len)
# get a list of numbers of lines (keys of the dictionary) that contain the largest word
lines = list(filter(lambda key: list_words[key] == largest_word, list_words))
print(lines)
If you want to get all lines that have words with the same biggest length you need to modify the last two lines in my code this way:
lines = list(filter(lambda key: len(list_words[key]) == len(largest_word), list_words))
print(lines)
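The ties discussed above can also be handled in a single pass that keeps every word of the maximum length together with its line number. This is only a sketch under stated assumptions: the function name longest_words is hypothetical, line numbers start at 1, and the input is any iterable of lines (an open file would work as well as the sample list).

```python
def longest_words(lines):
    """Return (max_length, [(line_number, word), ...]) for every longest word."""
    best_length = 0
    hits = []
    for line_number, line in enumerate(lines, start=1):
        for word in line.split():          # blank lines yield no words
            if len(word) > best_length:
                best_length = len(word)    # new maximum: discard earlier hits
                hits = [(line_number, word)]
            elif len(word) == best_length:
                hits.append((line_number, word))
    return best_length, hits

song = ["la la la", "", "hello world", "sweet dream"]
print(longest_words(song))
```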

How to print 1st string of file's line during second iteration in python

My file contents are:
ttsighser66
dagadfgadgadgfadg
dafgad
fgadfgad
ttsighser63
sadfsadf
asfdas
My code
file=open("C:\\file.txt","r")
cont = []
for i in file:
    dd = i.strip("\n")
    cont.append(dd)
    cc = ",".join(cont)
    if "tt" in i:
        cc = ",".join(cont[:-1])
        print(cont[-1], cc)
        cont = []
My code generates the output below:
ttsighser66
ttsighser63 dagadfgadgadgfadg,dafgad,fgadfgad
But I want the output in the format below:
ttsighser66,dagadfgadgadgfadg,dafgad,fgadfgad
ttsighser63,sadfsadf,asfdas
file=open("file.txt","r")
cont = []
for i in file:
    dd = i.strip("\n")
    cont.append(dd)
    #print('cc',cont)
    if "ttsighser" in i and len(cont) != 1:
        cc = ",".join(cont[:-1])
        print(cc)
        cont = []
        cont.append(dd)
print(",".join(cont))
If you don't need to store any strings to a list and just need to print strings, you could try this instead.
with open("file.txt", "r") as f:
    line_counter = 0
    file_lines = f.readlines()
    for i in file_lines:
        dd = i.strip()
        if "tt" in dd:
            print("{0}{1}".format("\n" if line_counter > 0 else "", dd), end="")
        else:
            print(",{0}".format(dd), end="")
        line_counter += 1
    print("")
The reason why your code displays
ttsighser66
ttsighser63 dagadfgadgadgfadg,dafgad,fgadfgad
instead of
ttsighser66,dagadfgadgadgfadg,dafgad,fgadfgad
ttsighser63,sadfsadf,asfdas
is that when you first encounter 'ttsighser66', it is appended to cont. Then, since 'ttsighser66' contains 'tt', we proceed to the conditional branch.
In the conditional branch, cc = ",".join(cont[:-1]) joins every string in cont except the last one. However, since cont holds only 'ttsighser66' at that point, cont[:-1] gives us [] (an empty list), so ",".join(cont[:-1]) is an empty string and cc is empty. As a result, print(cont[-1], cc) displays just ttsighser66.
On the second output line, ttsighser63 dagadfgadgadgfadg,dafgad,fgadfgad gets displayed because by then cont holds the three lines read after 'ttsighser66' plus 'ttsighser63' itself, so the values read before 'ttsighser63' are printed after it.
The remaining strings are not displayed because, according to your code, another string containing 'tt' would have to be read before the strings in cc could be displayed.
Essentially, you require a pair of strings containing 'tt' to display the strings between the pair.
Additional remark: The line cc = ",".join(cont) in your code is redundant, since its value is either overwritten inside the conditional branch or never used.
Version 1 (all data in a list of strings, printed once at the end)
fp=open("file.txt", "r")
data = []
for line in fp:
    if "tt" in line:
        data.append(line.strip())
    else:
        data.append(data.pop() + "," + line.strip())
fp.close()
for line in data:  # print each joined record on its own line
    print(line)
Version 2 (all data in a single string, printed once at the end)
fp=open("file.txt","r")
data = ""
for line in fp:
    if "tt" in line:
        data += "\n" + line.strip()
    else:
        data += ","+line.strip()
fp.close()
data = data[1:]
print (data)
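The grouping idea behind these answers can be separated from file handling and printing. The sketch below is a variation rather than any answerer's code: it assumes a record starts at a line that begins with the marker (str.startswith instead of the substring test used above), and the names group_records and marker are hypothetical.

```python
def group_records(lines, marker="tt"):
    """Join each marker line with the lines that follow it, comma-separated."""
    groups = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue                      # ignore blank lines
        if text.startswith(marker) or not groups:
            groups.append([text])         # start a new record
        else:
            groups[-1].append(text)       # continue the current record
    return [",".join(group) for group in groups]

sample = ["ttsighser66\n", "dagadfgadgadgfadg\n", "dafgad\n", "fgadfgad\n",
          "ttsighser63\n", "sadfsadf\n", "asfdas\n"]
for record in group_records(sample):
    print(record)
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample list.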

Python program for json files

I want to search for a particular keyword in a .json file and print 10 lines above and below the line in which the keyword is present.
Note - the keyword might be present more than once in the file.
So far I have made this:
import sys
from collections import deque
from itertools import chain, islice
with open('loggy.json', 'r') as f:
    last_lines = deque(maxlen=5)
    for ln, line in enumerate(f):
        if "out_of_memory" in line:
            print(ln)
            sys.stdout.writelines(chain(last_lines, [line], islice(f, 5)))
            last_lines.append(line)
            print("Next Error")
    print("No More Errors")
The problem with this is that the number of times it prints the line containing the keyword is equal to the number of times the keyword has been found.
Also, it is only printing 5 lines below the match, whereas I want it to print five lines above it as well.
If the JSON file is being misused to store a really large amount of information, then processing it on the fly may be better. In that case, keep the history lines in a list that is truncated whenever it grows beyond a given limit. Then use a counter that indicates how many lines must still be displayed after a match is observed:
#!python3
def print_around_pattern(pattern, fname, numlines=10):
    """Prints the lines with the pattern from the fname text file.
    The pattern is a string, numlines is the number of lines printed before
    and after the line with the pattern (with default value 10).
    """
    history = []
    cnt = 0
    with open(fname, encoding='utf8') as fin:
        for n, line in enumerate(fin):
            history.append(line)             # append the line
            history = history[-numlines-1:]  # keep only the tail, including last line
            if pattern in line:
                # Print the separator and the history lines including the pattern line.
                print('\n{!r} at the line {} ----------------------------'.format(
                    pattern, n+1))
                first = n + 2 - len(history)  # 1-based number of the oldest history line
                for k, h in enumerate(history, first):
                    print('{:03d}: {}'.format(k, h), end='')
                cnt = numlines               # set the counter for the next lines
            elif cnt > 0:
                # The counter indicates we want to see this line.
                print('{:03d}: {}'.format(n+1, line), end='')
                cnt -= 1                     # decrement the counter
if __name__ == '__main__':
    print_around_pattern('out_of_memory', 'loggy.json')
    ##print_around_pattern('out_of_memory', 'loggy.json', 3) # three lines before and after
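The same before-and-after idea can be written with collections.deque, which the question already reached for. The sketch below is a variation under stated assumptions, not the answer's code: the name lines_with_context is hypothetical, it returns (line number, text) pairs instead of printing, and it emits each line at most once even when matches sit close together.

```python
from collections import deque

def lines_with_context(lines, pattern, context=5):
    """Return (line_number, text) pairs within `context` lines of any match."""
    before = deque(maxlen=context)   # most recent lines not yet emitted
    remaining_after = 0              # lines still owed after the last match
    result = []
    for number, line in enumerate(lines, start=1):
        if pattern in line:
            result.extend(before)    # flush buffered lines preceding the match
            before.clear()
            result.append((number, line))
            remaining_after = context
        elif remaining_after > 0:
            result.append((number, line))
            remaining_after -= 1
        else:
            before.append((number, line))
    return result

log = ["line %d" % i for i in range(1, 11)]
log[4] = "error: out_of_memory"
for number, text in lines_with_context(log, "out_of_memory", context=2):
    print(number, text)
```

A deque with maxlen drops the oldest entry automatically, so no manual slicing of the history is needed.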

Get filename from user and convert the number into list

So far, I have this:
def main():
    bad_filename = True
    l =[]
    while bad_filename == True:
        try:
            filename = input("Enter the filename: ")
            fp = open(filename, "r")
            for f_line in fp:
                a=(f_line)
                b=(f_line.strip('\n'))
                l.append(b)
                print (l)
            bad_filename = False
        except IOError:
            print("Error: The file was not found: ", filename)
main()
This is my program, and this is what I get when I print:
['1,2,3,4,5']
['1,2,3,4,5', '6,7,8,9,0']
['1,2,3,4,5', '6,7,8,9,0', '1.10,2.20,3.30,0.10,0.30']
But instead I need to get:
[1,2,3,4,5]
[6,7,8,9,0.00]
[1.10,2.20,3.30,0.10,0.30]
Each line of the file is a series of numbers separated by commas, but to Python they are just characters. You need one more conversion step to turn your string into a list of numbers. First split on commas to create a list of strings, each of which represents a number. Then use what is called a "list comprehension" (or a for loop) to convert each string into a number:
b = f_line.strip('\n').split(',')
c = [float(v) for v in b]
l.append(c)
If you really want to reset the list each time through the loop (your desired output shows only the last line) then instead of appending, just assign the numerical list to l:
b = f_line.strip('\n').split(',')
l = [float(v) for v in b]
List comprehension is a shorthand way of saying:
l = []
for v in b:
l.append(float(v))
You don't need the variable a, nor the extra parentheses around the values assigned to a and b.
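The desired output in the question mixes whole numbers and decimals, while float() turns every value into a float. The sketch below shows one way to keep integers as int; it is an illustration only, and the helper names parse_number and parse_line are hypothetical.

```python
def parse_number(token):
    """Convert a numeric string to int when possible, otherwise to float."""
    try:
        return int(token)
    except ValueError:
        return float(token)

def parse_line(line):
    """Split a comma-separated line into a list of numbers."""
    return [parse_number(tok) for tok in line.strip().split(",") if tok.strip()]

file_lines = ["1,2,3,4,5\n", "6,7,8,9,0\n", "1.10,2.20,3.30,0.10,0.30\n"]
rows = [parse_line(s) for s in file_lines]
for row in rows:
    print(row)
```

Inside the loop from the question, l.append(parse_line(f_line)) would build the same list of rows.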
