Compare two files and display missing result - python-3.x

A list of IP addresses is downloaded to a file and renamed to Old_file. As days go by, the device gets updated with more IPs (or some are deleted). Therefore, I download a new list of IP addresses to another file named New_file.
I then want to compare these two files and see what does not match.
Old_file = ['1.1.1.1',
            '1.1.1.2',
            '1.1.1.3',
            '1.1.1.4',
            '1.1.1.6']
New_file = ['1.1.1.1',
            '1.1.1.2',
            '1.1.1.3',
            '1.1.1.5',
            '1.1.1.6']
The result needs to be 1.1.1.4, and stop there. But never from Old_file, e.g. 1.1.1.5 (we need the results from New_file only).
I really hope this explains it.
Thanks in advance
Tony

For a simple element-wise comparison, you could do
def get_first_unequal(s0, s1):
    for e0, e1 in zip(s0, s1): # assumes sequences are of equal length!
        if e0 != e1:
            print(f"unequal elements: '{e0}' vs. '{e1}'!")
            return (e0, e1)
    return None # all equal
a = ['a', 'b', 'c']
b = ['a', 'b', 'd']
get_first_unequal(a, b)
# unequal elements: 'c' vs. 'd'!
# ('c', 'd')
# --> to get a list of all unequal pairs, you could also use
# [(e0, e1) for (e0, e1) in zip(s0, s1) if e0 != e1]
If you want something more sophisticated, as mentioned in the comments, difflib might be the way to go. For example, to compare two sequences (the lists of strings you read from the two text files you want to compare):
import difflib
a = ['a', 'b', 'c']
b = ['s', 'b', 'c', 'd']
delta = difflib.context_diff(a, b)
for d in delta:
    print(d)
gives
*** 1,3 ****
! a
b
c
--- 1,4 ----
! s
b
c
+ d
To check the difference between two strings, you could do something like this (borrowing from here):
a = 'string1'
b = 'string 2'
delta = difflib.ndiff(a, b)
print(f"a -> b: {a} -> {b}")
for i, d in enumerate(delta):
    if d[0] == ' ': # no difference
        continue
    elif d[0] == '-':
        print(f"Deleted '{d[-1]}' from position {i}")
    elif d[0] == '+':
        print(f"Added '{d[-1]}' to position {i-1}")
gives
a -> b: string1 -> string 2
Deleted '1' from position 6
Added ' ' to position 6
Added '2' to position 7
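For the IP lists in the question, where each entry is one address and the order may not matter, a set-based lookup is another option besides a positional comparison. This is only a minimal sketch: the helper name missing_from is hypothetical, and which direction to compare depends on which file you treat as the reference.

```python
def missing_from(reference, other):
    """Return the items of `reference` that do not appear in `other`, keeping order."""
    other_set = set(other)  # set membership tests are fast
    return [ip for ip in reference if ip not in other_set]

old_ips = ['1.1.1.1', '1.1.1.2', '1.1.1.3', '1.1.1.4', '1.1.1.6']
new_ips = ['1.1.1.1', '1.1.1.2', '1.1.1.3', '1.1.1.5', '1.1.1.6']

print(missing_from(old_ips, new_ips))  # ['1.1.1.4']
print(missing_from(new_ips, old_ips))  # ['1.1.1.5']
```

When reading from the actual files, each list could be built with something like [line.strip() for line in f if line.strip()], assuming one address per line.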

If you're assuming that both files should be exactly identical, you can just iterate over the characters of the first and compare them to the second. I.e.
# check that they're the same length first
if len(Old_file) != len(New_file):
    print('not the same!')
else:
    for indx, char in enumerate(Old_file):
        try:
            # actually compare the characters
            old_char = char
            new_char = New_file[indx]
            assert(old_char == new_char)
        except IndexError:
            # the new file is shorter than the old file
            print('not the same!')
            break # kill the loop
        except AssertionError:
            # the characters do not match
            print('not the same!')
            break # kill the loop
It's worth noting that there are faster ways to do this. You could look into performing a checksum, though it wouldn't tell you which parts are different only that they are different. If the files are large, the performance of doing the check one character at a time will be quite bad -- in that case you can try instead to compare blocks of data at a time.
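As a rough sketch of the checksum idea, the standard hashlib module can hash each file in blocks, so large files never need to be held in memory at once. The function names and the block size here are assumptions chosen for illustration, and as noted, equal digests only tell you the files match, not where they differ.

```python
import hashlib

def file_digest(path, block_size=65536):
    """Return the SHA-256 hex digest of a file, reading it block by block."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read until it returns b''
        for block in iter(lambda: f.read(block_size), b''):
            h.update(block)
    return h.hexdigest()

def files_identical(path_a, path_b):
    """True if both files have the same SHA-256 digest."""
    return file_digest(path_a) == file_digest(path_b)
```

The standard library also offers filecmp.cmp(path_a, path_b, shallow=False), which compares file contents directly.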
Edit: re-reading your original question, you could definitely do this with a while loop. If you did, I would suggest basically the same strategy of checking each individual character. In that case you would manually need to increment the indx of course.

Related

Looking through a list of lists for specific values

I need to read a .txt file and find a specific pattern of T's, namely T's arranged in a cross pattern.
Here's what I've done so far, and its output when I print is below:
def find_treasure(mapfile):
    lst = []
    with open(mapfile, 'r') as rf:
        for line in rf:
            lst.append(line.split())
    print(lst)
Output
My initial idea was to use 2 for loops to go through each item in the list and then look at each letter/character in the item itself, but I kept getting list index out of range errors, or it did not work at all.
for i in range(len(lst)):
    for j in range(len(lst[i])):
        if lst[i][j] == 'T':
            print('WHy')
        else:
            print('why am i here why')
Do you guys have any advice?
EDIT: Sample input:
WWWWWWWWWWWWWWWWWWWWWWW.TTT..^^^^...WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...T..^^^^....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWW..T.....^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWWW........^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^....T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^^.....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^......WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW....^^......WWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^.....WWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW............WWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWW....T......WWWWWWWWWWWWW
WWWWW...WWWWWWWWWWWWW..T.T.....WWWWWWWWWWWWW
WWWW..TTT.WWWWWWWWWWW...T.....WWWWWWWWWWWWWW
WWWWW.......WWWWWWWWWWW......WWWWWWWWWWWWWWW
WWWWWWWW...T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWW....WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWW.T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWW.WWWWWWWWWW.....WWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW....T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW.TTT..WWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW..T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...WWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
W: WATER
T: TREE
.: GRASS
^: MOUNTAIN
And the expected output is: (21,25)
This is not a complete answer, but I see two problems in your code.
First, as Tadhg mentioned, your find_treasure does not return any value; that could be causing the range errors.
Once you correct that, your other block remains. The reason you are reaching your why am i here why statement is that the split() method without a separator argument only splits on whitespace. If you want to separate each character of the line, you should use lst.append(list(line)); this creates a matrix with all the elements of your input, which can be accessed as mat[i][j].
I hope this helps you =).
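To make the difference concrete, here is a small illustration on a made-up row (the sample string is hypothetical, not taken from the map file). Note that list(line) also keeps the trailing newline unless it is stripped first.

```python
line = "WW.T.\n"  # hypothetical sample row, including the trailing newline

by_split = line.split()               # no whitespace inside, so one single item
by_chars = list(line)                 # one item per character, newline included
by_chars_clean = list(line.rstrip('\n'))

print(by_split)        # ['WW.T.']
print(by_chars)        # ['W', 'W', '.', 'T', '.', '\n']
print(by_chars_clean)  # ['W', 'W', '.', 'T', '.']
```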
I'm assuming by "T's arranged in a cross pattern", you mean this:
*T*
TTT
*T*
Where * is anything but a T.
So to identify a cross pattern centered at the location lst[i][j], all the indices surrounding it must be equal to T.
def isCrossAt(lst, i, j):
    return lst[i - 1][j] == 'T' and \
           lst[i + 1][j] == 'T' and \
           lst[i][j - 1] == 'T' and \
           lst[i][j + 1] == 'T' and \
           lst[i][j] == 'T'
This means that you only need to check for crosses centered at the second through the second-last row, and the second through the second-last column.
def findCrosses(lst):
    for i in range(1, len(lst) - 1):
        row = lst[i]
        for j in range(1, len(row) - 1):
            # Copy the isCrossAt logic here to save a function call
            foundCross = lst[i - 1][j] == 'T' and \
                         lst[i + 1][j] == 'T' and \
                         lst[i][j - 1] == 'T' and \
                         lst[i][j + 1] == 'T' and \
                         lst[i][j] == 'T'
            if foundCross:
                return (i, j)
Let's test this using your string.
lst = """WWWWWWWWWWWWWWWWWWWWWWW.TTT..^^^^...WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...T..^^^^....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWW..T.....^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWWW........^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^....T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^^.....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^......WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW....^^......WWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^.....WWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW............WWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWW....T......WWWWWWWWWWWWW
WWWWW...WWWWWWWWWWWWW..T.T.....WWWWWWWWWWWWW
WWWW..TTT.WWWWWWWWWWW...T.....WWWWWWWWWWWWWW
WWWWW.......WWWWWWWWWWW......WWWWWWWWWWWWWWW
WWWWWWWW...T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWW....WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWW.T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWW.WWWWWWWWWW.....WWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW....T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW.TTT..WWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW..T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...WWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW""".split('\n')
# Now lst is a list of strings, but that doesn't matter
# because we can obtain characters in a string just like elements in a list
# duck typing FTW!
findCrosses(lst)
# Out: (21, 25)
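The function above returns the first cross it finds and relies on neighbouring rows being long enough. If a map could contain several crosses or rows of unequal length, a variant along these lines might help. It is only a sketch, and the name find_all_crosses and the small test grid are made up for illustration.

```python
def find_all_crosses(grid):
    """Return the (row, col) centre of every plus-shaped group of 'T's."""
    crosses = []
    for i in range(1, len(grid) - 1):
        for j in range(1, len(grid[i]) - 1):
            # Skip positions where the row above or below is too short.
            if j >= len(grid[i - 1]) or j >= len(grid[i + 1]):
                continue
            arms = (grid[i][j], grid[i - 1][j], grid[i + 1][j],
                    grid[i][j - 1], grid[i][j + 1])
            if all(c == 'T' for c in arms):
                crosses.append((i, j))
    return crosses

grid = [
    "..T..",
    ".TTT.",
    "..T..",
]
print(find_all_crosses(grid))  # [(1, 2)]
```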
I'm afraid the error you are getting is not caused by any of the code you have shared. Your loop works perfectly well (other than line.split() splitting on whitespace, of which there is none in your file; you probably want list(line), or just line, to get every character).
This script runs without error, demonstrating that your issue is in some other part of your code:
import io
mock_file = io.StringIO("""WWWWWWWWWWWWWWWWWWWWWWW.TTT..^^^^...WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...T..^^^^....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWW..T.....^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWWW........^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^....T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^^.....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^......WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW....^^......WWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^.....WWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW............WWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWW....T......WWWWWWWWWWWWW
WWWWW...WWWWWWWWWWWWW..T.T.....WWWWWWWWWWWWW
WWWW..TTT.WWWWWWWWWWW...T.....WWWWWWWWWWWWWW
WWWWW.......WWWWWWWWWWW......WWWWWWWWWWWWWWW
WWWWWWWW...T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWW....WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWW.T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWW.WWWWWWWWWW.....WWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW....T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW.TTT..WWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW..T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...WWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
W: WATER
T: TREE
.: GRASS
^: MOUNTAIN""")
def find_treasure(mapfile):
    lst = []
    with mapfile as rf:
        for line in rf:
            lst.append(list(line))
    print(lst)
    for i in range(len(lst)):
        for j in range(len(lst[i])):
            if lst[i][j] == 'T':
                print('WHy')
            else:
                print('why am i here why')
find_treasure(mock_file)
Because of this I would verify that the lst variable is the same in the 2 mentioned sections of code, because range errors would only happen in something different from what you have shown.

How to Sort Alphabets

Input : abcdABCD
Output : AaBbCcDd
ms=[]
n = input()
for i in n:
    ms.append(i)
ms.sort()
print(ms)
It gives me ABCDabcd.
How to sort this in python?
Without having to import anything, you could probably do something like this:
arr = "abcdeABCDE"
temp = sorted(arr, key = lambda i: (i.lower(), i))
result = "".join(temp)
print(result) # AaBbCcDdEe
The key will take in each element of arr and sort it first by lower-casing it, then if it ties, it will sort it based on its original value. It will group all similar letters together (A with a, B with b) and then put the capital first.
Use a sorting key:
ms = "abcdABCD"
sorted_ms = sorted(ms, key=lambda letter:(letter.upper(), letter.islower()))
# sorted_ms = ['A', 'a', 'B', 'b', 'C', 'c', 'D', 'd']
sorted_str = ''.join(sorted_ms)
# sorted_str = 'AaBbCcDd'
Why this works:
You can specify the criteria by which to sort using the key argument of the sorted function or the list.sort() method. It expects a function or lambda that takes the element in question and returns a new value by which to sort it. If that value is a tuple, then the first element takes precedence; if it's equal, then the second element, and so on.
So, the lambda I provided here returns a 2-tuple:
(letter.upper(), letter.islower())
letter.upper() as the first element here means that the strings are going to be sorted lexicographically, but case-insensitively (as it will sort them as if they were all uppercase). Then, I use letter.islower() as the second element, which is True if the letter was lowercase and False otherwise. When sorting, False comes before True, which means that if you give a capital letter and a lowercase letter, the capital letter will come first.
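To see the keys that the lambda produces, the short illustration below prints them for two of the letters and checks the two comparison rules the explanation relies on. The variable names are only for this example.

```python
ms = "abcdABCD"

# Each letter is mapped to an (uppercase form, is-lowercase flag) tuple.
keys = [(letter.upper(), letter.islower()) for letter in ms]
print(keys[0], keys[4])  # ('A', True) ('A', False)

# Tuples compare element by element, and False sorts before True.
print(('A', False) < ('A', True))  # True
print(sorted([True, False]))       # [False, True]
```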
Try this:
>>> s = 'abcdABCD'
>>> ''.join(sorted(s, key=lambda x: x.lower()))
'aAbBcCdD'

Why does it show me this result (list index out of range)?

Write a function called stop_at_z that iterates through a list of strings. Using a while loop, append each string to a new list until the string that appears is “z”. The function should return the new list.
def stop_at_z(str):
    d = 0
    x=[]
    str1 = list(str)
    while True :
        if str1[d] != 'Z' :
            x.append(str1[d])
            d+=1
        if str1[d] == 'Z' :
            break
    return x
You're getting this error because d keeps increasing past the end of the list if there is no uppercase 'Z' in the string. Instead, you should stay in the while loop only while the end of the input has not been reached:
def stop_at_z(inputstr):
    d = 0
    x=[]
    str1 = list(inputstr)
    while d<len(inputstr) :
        if str1[d] == 'z' :
            break
        else:
            x.append(str1[d])
        d+=1
    return x
Note that you can achieve the same thing using takewhile() from the itertools module:
from itertools import takewhile
def stop_at_z(inputstr):
    return list(takewhile(lambda i: i != 'z', inputstr))
print(stop_at_z("hello wzrld"))
Output:
['h', 'e', 'l', 'l', 'o', ' ', 'w']
The way you are doing it, searching for "z" is case-sensitive. Try something like:
if str1[d].strip().lower() == "z":
It strips off leading and trailing whitespace and then converts the str1 element to lower case (both of these simply return the modified string, so the original is unchanged) and compares it to a lower case z.
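Put together with a bounded while loop, the case-insensitive check might look like the sketch below. It assumes the input is a list of strings, as the exercise states, and the parameter and variable names are chosen here for illustration.

```python
def stop_at_z(strings):
    """Collect items until one equals 'z', ignoring case and surrounding whitespace."""
    result = []
    index = 0
    while index < len(strings):  # never index past the end of the list
        if strings[index].strip().lower() == "z":
            break
        result.append(strings[index])
        index += 1
    return result

print(stop_at_z(["a", "b", " Z ", "c"]))  # ['a', 'b']
print(stop_at_z(["a", "b"]))              # ['a', 'b']
```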
What if the string 'z' is never in the list?
Then it keeps on increasing the index and eventually runs into an error.
Just restricting the loop to the length of the list should help.
def stop_at_z(str):
    d = 0
    x=[]
    str1 = list(str)
    for d in range(0,len(str1)) :
        if str1[d] != 'z' : # the task asks for a lowercase 'z'
            x.append(str1[d])
        else:
            break
    return x
Basically, we needed to have a list that could have all the characters until we get "z". One way we could do that is we first convert the string into a list and iterate that list and add every character to a new list ls until we get "z". But the problem is we may get a string that doesn't have "z" so we need to iterate till the length of that list. I hope it is clear.
def stop_at_z(s):
    ls = []
    idx = 0
    x = list(s)
    while idx<len(x):
        if x[idx]=="z":
            break
        ls.append(x[idx])
        idx+=1
    return ls
It's my first time posting here, but I use this while loop:
def stop_at_z(input_list):
    print (input_list)
    output_list=[]
    index=0
    while index< len(input_list):
        if input_list[index] != "z":
            output_list.append(input_list[index])
            index+=1
        else:
            break
    return output_list

How to remove a character nested in a list?

I am given a sample string AABCAAADA. I then split it into 3 parts: AAB, CAA, ADA.
I have nested these 3 elements into a list. In each part, I should check whether a duplicate character is present and delete the duplicate character. I know strings are immutable, but is there any trick to do that?
Below is the sample approach I tried but I am unable to use del and pop method to delete that duplicate character.
s='AABCAAADA'
x = int(input())
l=[]
#for i in range(0,len(s),x):
for j in range(0,len(s),3):
    l.append(s[j:j+3])
j=0
for i in range(0,len(s)//x):
    for j in range(0,len(l[j])-1):
        if(l[i][j] == l[i][j+1]):
            pass
            #need to remove the (j+1)th term if it is duplicate
The output should be AB, CA, AD.
delete duplicate character in nested list
from functools import reduce
l = ['AAB','CAA','ADA']
print([''.join(reduce(lambda a, b: a if b in a else a + b, s, '')) for s in l])
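Or, for Python 3.7+, where dicts are guaranteed to preserve insertion order (CPython 3.6 already does so as an implementation detail):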
print([''.join({a: 1 for a in s}) for s in l])
Both output:
['AB', 'CA', 'AD']
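A closely related variant uses dict.fromkeys, which also keeps the first occurrence of each character in order on Python 3.7+. The sketch below starts from the original string and splits it into chunks first; the names size, parts and deduped are chosen here for illustration.

```python
s = 'AABCAAADA'
size = 3

# Split the string into consecutive chunks of `size` characters.
parts = [s[i:i + size] for i in range(0, len(s), size)]

# dict.fromkeys keeps only the first occurrence of each character, in order.
deduped = [''.join(dict.fromkeys(part)) for part in parts]

print(parts)    # ['AAB', 'CAA', 'ADA']
print(deduped)  # ['AB', 'CA', 'AD']
```

Strings stay immutable throughout: each result is a new string built from the old one.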

Python 3.xx - Deleting consecutive numbers/letters from a string

I actually need help evaluating what is going on with the code which I wrote.
It is meant to function like this:
input: remove_duple('WubbaLubbaDubDub')
output: 'WubaLubaDubDub'
another example:
input: remove_duple('aabbccdd')
output: 'abcd'
I am still a beginner and I would like to know both what is wrong with my code and an easier way to do it. (There are some lines in the code which were part of my efforts to visualize what was happening and debug it)
def remove_duple(string):
    to_test = list(string)
    print (to_test)
    icount = 0
    dcount = icount + 1
    for char in to_test:
        if to_test[icount] == to_test[dcount]:
            del to_test[dcount]
            print ('duplicate deleted')
            print (to_test)
            icount += 1
        elif to_test[icount] != to_test[dcount]:
            print ('no duplicated deleted')
            print (to_test)
            icount += 1
    print ("".join(to_test))
Don't modify a list (e.g. del to_test[dcount]) that you are iterating over. Your iterator will get screwed up. The appropriate way to deal with this would be to create a new list with only the values you want.
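A tiny illustration of what goes wrong, using a made-up list: deleting inside the loop shifts the remaining items to the left, so the iterator skips one of them, whereas building a new list does not have this problem.

```python
items = ['a', 'a', 'a', 'b']
for index, item in enumerate(items):
    if item == 'a':
        del items[index]  # shifts later items left; the next one is skipped
print(items)  # ['a', 'b']  (one 'a' survived)

original = ['a', 'a', 'a', 'b']
kept = [item for item in original if item != 'a']  # build a new list instead
print(kept)  # ['b']
```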
A fix for your code could look like:
In []:
def remove_duple(s):
    new_list = []
    for i in range(len(s)-1): # one less than length to avoid IndexError
        if s[i] != s[i+1]:
            new_list.append(s[i])
    if s: # handle passing in an empty string
        new_list.append(s[-1]) # need to add the last character
    return "".join(new_list) # return it (print it outside the function)
remove_duple('WubbaLubbaDubDub')
Out[]:
'WubaLubaDubDub'
As you are looking to step through the string, sliding 2 characters at a time, you can do that simply by zipping the string with itself shifted by one, and keeping the first character if the 2 characters are not equal, e.g.:
In []:
import itertools as it
def remove_duple(s):
    return ''.join(x for x, y in it.zip_longest(s, s[1:]) if x != y)
remove_duple('WubbaLubbaDubDub')
Out[]:
'WubaLubaDubDub'
In []:
remove_duple('aabbccdd')
Out[]:
'abcd'
Note: you need itertools.zip_longest() or you will drop the last character. The default fillvalue of None is fine for a string.
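Another option from the same module is itertools.groupby, which groups runs of equal consecutive items; keeping one key per run collapses the duplicates. This is offered as an alternative sketch, not as the method used above.

```python
from itertools import groupby

def remove_duple(s):
    """Collapse each run of identical consecutive characters to a single one."""
    return ''.join(key for key, _group in groupby(s))

print(remove_duple('WubbaLubbaDubDub'))  # WubaLubaDubDub
print(remove_duple('aabbccdd'))          # abcd
```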
