I am confused about this code, can someone explain? - python-3.x

Changing duplicate characters in a string to ) and non-duplicate characters to (.
I have tried two for loops, but it doesn't work. I am a beginner in coding, so I can't understand this complex code. Can someone explain it?
def duplicate_encode(word):
    return (lambda w: ''.join(('(', ')')[c in w[:i] + w[i+1:]] for i, c in enumerate(w)))(word.lower())
print(duplicate_encode("rEcede"))
Input: "Mercedes Bench"
Output: ()())()((()()(

As said in a comment, I think this is bad coding practice and should be avoided. But it can serve as an example of code reading. So I'll give it a try here. (First you should read about lambda if you're not familiar with it.)
First, look at the matching parentheses and try to find the "deepest" parts:
1. The top one is: lambda w: ''.join(('(', ')')[c in w[:i] + w[i+1:]] for i, c in enumerate(w)) applied to word.lower().
2. Then we have ('(', ')')[c in w[:i] + w[i+1:]] for i, c in enumerate(w) in place of the three dots inside ''.join(...).
3. enumerate(w), where w is a string, produces an enumerate object that can be iterated to get tuples of the form (i, c), where i is the index of the letter c. Try running for x in enumerate(w): print(x) for different strings w in order to get a feel for it.
4. The expression ('(', ')')[c in w[:i] + w[i+1:]] for i, c in enumerate(w) then produces a generator object by iterating through the tuples of indices and letters of w. The generator yields only ')' and '(' characters, which ''.join(...) then concatenates into the final output string. Let's break it down further.
5. [c in w[:i] + w[i+1:]] will always evaluate to either [True] or [False] (see 6 as to why). Now, ('(', ')')[False] will return '(' and ('(', ')')[True] will return ')' (something I learned right now by typing it out to see what happens). This works because bool is a subclass of int, so False indexes like 0 and True like 1.
6. For any letter in w there will be a tuple in the generator object (see point 4), (i, c). The [c in w[:i] + w[i+1:]] will first take two substrings of w. The first one includes all the letters before position i (where the current letter is located) and the second includes all the letters after the current letter. These two substrings are then concatenated. The c in part then checks whether the current letter is in the resulting string, effectively checking whether the letter c also appears somewhere else in the string. For example, for w = 'aba' and the second tuple from enumerate('aba'), that is (1, 'b'), w[:i] will be equal to 'aba'[:1], which is 'a', and w[i+1:] will be equal to 'aba'[2:], which is also 'a'. Concatenated, we get the string 'aa', and thus [c in w[:i] + w[i+1:]], which in this case is equal to ['b' in 'aa'], will evaluate to [False], hence resulting in '('.
7. Effectively the lambda part is just a function that, for each letter at a given position, checks whether the same letter is present in a modified string with the letter removed from that position. It is then applied to the argument word.lower(), which just ensures that case is ignored (e.g., 'A' and 'a' are counted as the same letter).
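To make the steps above concrete, here is a small sketch that prints the intermediate values for w = 'aba'. The variable names (pairs, rest, is_dup, symbols) are only for illustration and are not part of the original code.

```python
w = 'aba'

# enumerate yields (index, letter) pairs
pairs = list(enumerate(w))
print(pairs)  # [(0, 'a'), (1, 'b'), (2, 'a')]

symbols = []
for i, c in pairs:
    rest = w[:i] + w[i+1:]   # the string with position i removed
    is_dup = c in rest       # True if c occurs somewhere else
    # bool is a subclass of int, so False picks item 0 and True picks item 1
    symbols.append(('(', ')')[is_dup])
    print(i, c, repr(rest), is_dup)

encoded = ''.join(symbols)
print(encoded)  # )()
```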

This code replicates exactly what the lambda function does (pass it word.lower() if you want case to be ignored, as in the original call). Separating the logic into distinct statements makes it easier to follow. Uncomment the print statements to see the whole process in detail.
def simple_duplicate_encode(word):
    output = ""
    for i, c in enumerate(word):
        # print(i,c)
        i1 = word[:i]
        i2 = word[i+1:]
        # print(":{} = {}".format(i, word[:i]))
        # print("{}: = {}".format(i+1, word[i+1:]))
        is_duplicated = c in i1 + i2 # Check to see if the character c is in the rest of the string
        # print("Is duplicated:{}".format(is_duplicated))
        character = ('(',')')[is_duplicated] # If is_duplicated = True the value is 1, else 0
        # print(character)
        output += character
    return output
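As an alternative sketch (not the approach used in the question), the same result can be obtained by counting every character once with collections.Counter instead of re-scanning the rest of the string for each letter. The function name duplicate_encode_counted is made up for this example.

```python
from collections import Counter

def duplicate_encode_counted(word):
    # Hypothetical name; lowercases first so 'E' and 'e' count as the same letter
    w = word.lower()
    counts = Counter(w)  # maps each character to its number of occurrences
    return ''.join(')' if counts[c] > 1 else '(' for c in w)

print(duplicate_encode_counted("rEcede"))          # ()()()
print(duplicate_encode_counted("Mercedes Bench"))  # ()())()((()()(
```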

Related

Looking through a list of lists for specific values

I need to read a .txt file and find a specific pattern of T's, namely T's arranged in a cross pattern.
Here's what I've done so far, and its output when I print is below:
def find_treasure(mapfile):
    lst = []
    with open(mapfile, 'r') as rf:
        for line in rf:
            lst.append(line.split())
    print(lst)
Output
My initial idea was to use 2 for loops to go through each item in the list and then look at each letter/character in the item itself, but I kept getting list index range errors, or it didn't work at all.
for i in range(len(lst)):
    for j in range(len(lst[i])):
        if lst[i][j] == 'T':
            print('WHy')
        else:
            print('why am i here why')
Do you guys have any advice?
EDIT: Sample input:
WWWWWWWWWWWWWWWWWWWWWWW.TTT..^^^^...WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...T..^^^^....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWW..T.....^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWWW........^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^....T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^^.....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^......WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW....^^......WWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^.....WWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW............WWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWW....T......WWWWWWWWWWWWW
WWWWW...WWWWWWWWWWWWW..T.T.....WWWWWWWWWWWWW
WWWW..TTT.WWWWWWWWWWW...T.....WWWWWWWWWWWWWW
WWWWW.......WWWWWWWWWWW......WWWWWWWWWWWWWWW
WWWWWWWW...T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWW....WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWW.T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWW.WWWWWWWWWW.....WWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW....T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW.TTT..WWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW..T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...WWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
W: WATER
T: TREE
.: GRASS
^: MOUNTAIN
And the expected output is: (21,25)
This is not a complete answer, but I see two problems in your code.
First, as Tadhg mentioned, your find_treasure does not return any value, which could be causing the range errors.
Once you correct that, your other block remains. The reason you are reaching your why am i here why statement is that the split() method, without a separator argument, only splits on whitespace. If you want to separate each character of the line, you should use lst.append(list(line)); this creates a matrix with all the elements of your input, which can be accessed with mat[][].
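A quick sketch of the difference, using a made-up line rather than the real map file. The strip() call is my own addition here, to drop the trailing newline that list(line) would otherwise keep as an extra element.

```python
line = "WW.T.\n"

# No whitespace inside the line, so split() returns the whole line as one item
whole = line.split()
print(whole)   # ['WW.T.']

# list() breaks the string into individual characters
chars = list(line.strip())
print(chars)   # ['W', 'W', '.', 'T', '.']

# Hypothetical two-line map, built the way described above
mat = [list(row.strip()) for row in ["WW.T.\n", ".TTT.\n"]]
print(mat[1][2])  # T
```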
I hope this helps you =).
I'm assuming by "T's arranged in a cross pattern", you mean this:
*T*
TTT
*T*
Where * is anything but a T.
So to identify a cross pattern centered at the location lst[i][j], the element at lst[i][j] and its four direct neighbours (above, below, left and right) must all be equal to T.
def isCrossAt(lst, i, j):
    return lst[i - 1][j] == 'T' and \
        lst[i + 1][j] == 'T' and \
        lst[i][j - 1] == 'T' and \
        lst[i][j + 1] == 'T' and \
        lst[i][j] == 'T'
This means that you only need to check for crosses centered at the second through the second-last row, and the second through the second-last column.
def findCrosses(lst):
    for i in range(1, len(lst) - 1):
        row = lst[i]
        for j in range(1, len(row) - 1):
            # Copy the isCrossAt logic here to save a function call
            foundCross = lst[i - 1][j] == 'T' and \
                lst[i + 1][j] == 'T' and \
                lst[i][j - 1] == 'T' and \
                lst[i][j + 1] == 'T' and \
                lst[i][j] == 'T'
            if foundCross:
                return (i, j)
Let's test this using your string.
lst = """WWWWWWWWWWWWWWWWWWWWWWW.TTT..^^^^...WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...T..^^^^....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWW..T.....^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWWW........^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^....T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^^.....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^......WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW....^^......WWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^.....WWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW............WWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWW....T......WWWWWWWWWWWWW
WWWWW...WWWWWWWWWWWWW..T.T.....WWWWWWWWWWWWW
WWWW..TTT.WWWWWWWWWWW...T.....WWWWWWWWWWWWWW
WWWWW.......WWWWWWWWWWW......WWWWWWWWWWWWWWW
WWWWWWWW...T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWW....WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWW.T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWW.WWWWWWWWWW.....WWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW....T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW.TTT..WWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW..T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...WWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW""".split('\n')
# Now lst is a list of strings, but that doesn't matter
# because we can obtain characters in a string just like elements in a list
# duck typing FTW!
findCrosses(lst)
# Out: (21, 25)
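If the rows of the map can have different lengths, indexing the row above or below may raise an IndexError. The following is only a sketch under that assumption: find_all_crosses is a hypothetical variant that skips columns missing in a neighbouring row and collects every cross instead of returning the first one.

```python
def find_all_crosses(rows):
    """Return the (row, col) centre of every plus-shaped group of 'T'."""
    found = []
    for i in range(1, len(rows) - 1):
        for j in range(1, len(rows[i]) - 1):
            # Skip columns that do not exist in a shorter neighbouring row
            if j >= len(rows[i - 1]) or j >= len(rows[i + 1]):
                continue
            if (rows[i][j] == 'T' and rows[i - 1][j] == 'T'
                    and rows[i + 1][j] == 'T' and rows[i][j - 1] == 'T'
                    and rows[i][j + 1] == 'T'):
                found.append((i, j))
    return found

sample = [
    "..T..",
    ".TTT.",
    "..T",      # deliberately shorter than the other rows
    ".....",
]
print(find_all_crosses(sample))  # [(1, 2)]
```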
I'm afraid the error you are getting is not caused by any of the code you have shared. Your loop works perfectly well, other than line.split() splitting on whitespace, of which there is none in your file; you probably want list(line), or just line, to get every character.
This script runs without error, demonstrating that your issue is in some other part of your code:
import io
mock_file = io.StringIO("""WWWWWWWWWWWWWWWWWWWWWWW.TTT..^^^^...WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...T..^^^^....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWW..T.....^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWWW........^^^^..T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^....T.WWWWWWW
WWWWWWWWWWWWWWWWWWWW........^^^......WWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^^.....WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW.....^^^......WWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW....^^......WWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW......^.....WWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW............WWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWW....T......WWWWWWWWWWWWW
WWWWW...WWWWWWWWWWWWW..T.T.....WWWWWWWWWWWWW
WWWW..TTT.WWWWWWWWWWW...T.....WWWWWWWWWWWWWW
WWWWW.......WWWWWWWWWWW......WWWWWWWWWWWWWWW
WWWWWWWW...T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWW....WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWW.T.WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWW.WWWWWWWWWW.....WWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWW....T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW.TTT..WWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWW..T..WWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWW...WWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
W: WATER
T: TREE
.: GRASS
^: MOUNTAIN""")
def find_treasure(mapfile):
    lst = []
    with mapfile as rf:
        for line in rf:
            lst.append(list(line))
    print(lst)
    for i in range(len(lst)):
        for j in range(len(lst[i])):
            if lst[i][j] == 'T':
                print('WHy')
            else:
                print('why am i here why')
find_treasure(mock_file)
Because of this I would verify that the lst variable is the same in the 2 mentioned sections of code, because range errors would only happen in something different from what you have shown.

Unable to Reverse the text using 'for' Loop Function

I want to reverse a string using a loop inside a function. But when I use the following code, it outputs the exact same string again, although it is supposed to reverse the string. I can't figure out why.
def reversed_word(word):
    x=''
    for i in range(len(word)):
        x+=word[i-len(word)]
        print(i-len(word))
    return x
a=reversed_word('APPLE')
print(a)
If you look at the output of your debug statement (the print in the function), you'll see you're using the indexes -5 through -1.
Since negative indexes specify the distance from the end of the string, -5 is the A, -4 is the first P, and so on. And, since you're appending these in turn to an originally empty string, you're just adding the letters in the same order they appear in the original.
To add them in the other order, you can simply use len(word) - i - 1 as the index, giving the sequence (len-1) .. 0 (rather than -len .. -1, which equates to 0 .. (len-1)):
def reversed_word(word):
    result = ""
    for i in range(len(word)):
        result += word[len(word) - i - 1]
    return result
Another alternative is to realise you don't need to use an index at all since iterating over a string gives it to you one character at a time. However, since it gives you those characters in order, you need to adjust how you build the reversed string, by prefixing each character rather than appending:
def reverse_string(word):
    result = ""
    for char in word:
        result = char + result
    return result
This builds up the reversed string (from APPLE) as A, PA, PPA, LPPA and ELPPA.
Of course, you could also go fully Pythonic:
def reverse_string(word):
    return "".join([word[i] for i in range(len(word) - 1, -1, -1)])
This uses a list comprehension to create a list of the characters in the original string (in reverse order, starting at the last valid index len(word) - 1) and then joins that list into a single string (with an empty separator).
Probably not something I'd hand in for classwork (unless I wanted to annoy the marker) but you should be aware that that's how professional Pythonistas usually tackle the problem.
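For completeness, two other spellings that are commonly seen in Python code, shown here only as a sketch and not as part of the answer above: an extended slice with a step of -1, and the built-in reversed() combined with join.

```python
word = "APPLE"

# Extended slice with a step of -1 walks the string backwards
by_slice = word[::-1]
print(by_slice)      # ELPPA

# reversed() yields the characters last to first; join glues them together
by_reversed = "".join(reversed(word))
print(by_reversed)   # ELPPA
```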
Let's say your word is python.
Your loop will then iterate over the values 0 through 5, since len(word) == 6.
When i is 0, i-len(word) is -6 (note carefully that this value is negative). You'll note that word[-6] is the character six places to the left from the end of the string, which is p.
Similarly, when i is 1, i-len(word) is -5, and word[i-len(word)] is y.
This pattern continues for each iteration of your loop.
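A short sketch of that pattern, printing each loop value next to the negative index it produces. The last column shows that word[i - len(word)] is always the same character as word[i], which is why the string comes out unchanged.

```python
word = "python"

for i in range(len(word)):
    neg = i - len(word)  # runs from -6 up to -1
    print(i, neg, word[neg], word[neg] == word[i])

# Every negative index lands on the same character as the positive one
unchanged = "".join(word[i - len(word)] for i in range(len(word)))
print(unchanged)  # python
```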
It looks like you intend to use positive indices to step backward through the string with each iteration. To obtain this behavior, try using the expression len(word)-i-1 to index your string, or, equivalently, iterate over the indices in reverse order:
def reversed_word(word):
    result = ''  # avoids shadowing the built-in reversed()
    for i in range(len(word)-1, -1, -1):
        result += word[i]
    return result
print(reversed_word("apple"))

How to remove a character nested in a list?

I am given a sample string AABCAAADA. I then split it into 3 parts: AAB, CAA, ADA.
I have nested these 3 elements into a list. In each part, I should check whether a duplicate character is present and delete the duplicate character. I know strings are immutable, but is there any trick to do that?
Below is the sample approach I tried but I am unable to use del and pop method to delete that duplicate character.
s='AABCAAADA'
x = int(input())
l=[]
#for i in range(0,len(s),x):
for j in range(0,len(s),3):
    l.append(s[j:j+3])
j=0
for i in range(0,len(s)//x):
    for j in range(0,len(l[j])-1):
        if(l[i][j] == l[i][j+1]):
            pass
            #need to remove the (j+1)th term if it is duplicate
The output should be AB, CA, AD.
delete duplicate character in nested list
from functools import reduce
l = ['AAB','CAA','ADA']
print([''.join(reduce(lambda a, b: a if b in a else a + b, s, '')) for s in l])
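Or, for Python 3.7+ (where dicts are guaranteed to keep insertion order; CPython 3.6 already does so as an implementation detail):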
print([''.join({a: 1 for a in s}) for s in l])
Both output:
['AB', 'CA', 'AD']
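Putting the splitting and the de-duplication together, a minimal sketch could look like the following. dedupe_chunks is a hypothetical helper name, and it relies on dict.fromkeys preserving insertion order, so it assumes Python 3.7 or later.

```python
def dedupe_chunks(s, size):
    # Hypothetical helper: cut s into pieces of `size` characters
    chunks = [s[i:i + size] for i in range(0, len(s), size)]
    # dict.fromkeys keeps only the first occurrence of each character, in order
    return [''.join(dict.fromkeys(chunk)) for chunk in chunks]

print(dedupe_chunks('AABCAAADA', 3))  # ['AB', 'CA', 'AD']
```

Because strings are immutable, each piece is rebuilt as a new string rather than edited in place.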

How to determine if two elements from a list appear consecutively in a string? Python

I am trying to solve a problem that can be modelled most simply as follows.
I have a large collection of letter sequences. The letters come from two lists: (1) member list (2) non-member list. The sequences are of different compositions and lengths (e.g. AQFG, CCPFAKXZ, HBODCSL, etc.). My goal is to insert the number '1' into these sequences when any 'member' is followed by any two 'non-members':
Rule 1: Insert '1' after the first member letter that is followed
by 2 or more non-members letters.
Rule 2: Insert not more than one '1' per sequence.
The 'Members': A, B, C, D
'Non-members': E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z
In other words, once a member letter is followed by 2 non-member letters, insert a '1'. In total, only one '1' is inserted per sequence. Examples of what I am trying to achieve are this:
AQFG ---> A1QFG
CCPFAKXZ ---> CC1PFAKXZ
BDDCCA ---> BDDCCA1
HBODCSL ---> HBODC1SL
ABFCC ---> ABFCC
ACKTBB ---> AC1KTBB # there is no '1' to be inserted after BB
I assume the code will be something like this:
members = ['A','B','C','D']
non_members = ['E','F','G','H','I','J','K','L','M','N',
               'O','P','Q','R','S','T','U','V','W','X','Y','Z']
strings = ['AQFG', 'CCPFAKXZ', 'BDDCCA', 'HBODCSL', 'ABFCC']
for i in members:
    if i in strings:
        if member is followed by 2 non-members: # Struggling here
            i.insert(index_member, '1')
            return i
        return ''
EDIT
I have found that one solution could be to generate a list of all permutations of two 'non-member' items using itertools.permutations(non_members, 2), and then test for their presence in the string.
But is there a more elegant solution for this problem?
Generating all permutations is going to explode the number of things you are checking. You need to change how you are iterating, to something like:
members = ...
non_members = ...
s = 'AQFG'
out = ""
look = 2
for i in range(len(s)-look):
    out += s[i]
    if (s[i] in members and
            s[i+1] in non_members and
            s[i+2] in non_members):
        out += '1' + s[i+1:]
        break
else:
    out = s  # no match found: keep the string unchanged
This way you only need to go through the target string once, and you don't need to generate permutations. This method could also be extended to look ahead further than two letters.
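Wrapped into a function, the same single-pass idea might look like the sketch below. The name insert_marker and the MEMBERS constant are invented for this example, and it assumes that every letter not in the member set counts as a non-member.

```python
MEMBERS = set('ABCD')  # assumption: every other letter is a non-member

def insert_marker(s, members=MEMBERS, look=2):
    """Insert '1' after the first member followed by `look` non-members."""
    for i in range(len(s) - look):
        following = s[i + 1:i + 1 + look]
        if s[i] in members and all(ch not in members for ch in following):
            return s[:i + 1] + '1' + s[i + 1:]
    return s  # no position qualifies, so the string is returned unchanged

for seq in ['AQFG', 'CCPFAKXZ', 'HBODCSL', 'ABFCC', 'ACKTBB']:
    print(seq, '->', insert_marker(seq))
```

Note that this sketch leaves BDDCCA unchanged, whereas the question's examples show BDDCCA1; that case does not follow from Rule 1 as stated and would need extra handling.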
I believe this can also be done via regex.
import re
s = 'AQFG'
x = re.sub(r'([ABCD])([EFGHIJKLMNOPQRSTUVWXYZ])',r'\g<1>1\2',s)
print(x)
This will print A1QFG
Sorry. I missed that. re.sub can take an optional count parameter that can stop after the given number of replacements are made.
s = 'HBODCSL'
x = re.sub(r'([ABCD]+)([EFGHIJKLMNOPQRSTUVWXYZ])',r'\g<1>1\2',s,count=1)
print(x)
This will print HB1ODCSL
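Neither pattern above checks that two non-members follow the member, which is why HBODCSL becomes HB1ODCSL rather than the HBODC1SL asked for in the question. One possible adjustment, offered as a sketch rather than as the answerer's method, is a lookahead that requires two non-member letters without consuming them. The names PATTERN and insert_marker_re are made up here.

```python
import re

# A member letter, then a lookahead requiring two non-member letters (E-Z).
# The lookahead only checks the following letters; it does not consume them.
PATTERN = re.compile(r'([ABCD])(?=[E-Z]{2})')

def insert_marker_re(s):
    # count=1 stops after the first replacement (Rule 2)
    return PATTERN.sub(r'\g<1>1', s, count=1)

for seq in ['AQFG', 'CCPFAKXZ', 'HBODCSL', 'ABFCC', 'ACKTBB']:
    print(seq, '->', insert_marker_re(seq))
```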

Algorithm for generating all string combinations

Say I have a list of strings, like so:
strings = ["abc", "def", "ghij"]
Note that the length of a string in the list can vary.
The way you generate a new string is to take one letter from each element of the list, in order. Examples: "adg" and "bfi", but not "dch" because the letters are not in the same order in which they appear in the list. So in this case where I know that there are only three elements in the list, I could fairly easily generate all possible combinations with a nested for loop structure, something like this:
for i in strings[0]:
    for ii in strings[1]:
        for iii in strings[2]:
            print(i+ii+iii)
The issue arises for me when I don't know how long the list of strings is going to be beforehand. If the list is n elements long, then my solution requires n for loops to succeed.
Can anyone point me towards a relatively simple solution? I was thinking of a DFS-based solution where I turn each letter into a node and create a connection between all letters in adjacent strings, but this seems like too much effort.
In Python, you would use itertools.product
e.g.:
>>> import itertools
>>> for comb in itertools.product("abc", "def", "ghij"):
...     print(''.join(comb))
adg
adh
adi
adj
aeg
aeh
...
Or, using an unpack:
>>> words = ["abc", "def", "ghij"]
>>> print('\n'.join(''.join(comb) for comb in itertools.product(*words)))
(same output)
The algorithm used by product is quite simple, as can be seen in its source code (Look particularly at function product_next). It basically enumerates all possible numbers in a mixed base system (where the multiplier for each digit position is the length of the corresponding word). A simple implementation which only works with strings and which does not implement the repeat keyword argument might be:
def product(words):
    if words and all(len(w) for w in words):
        indices = [0] * len(words)
        while True:
            # Change ''.join to tuple for a more accurate implementation
            yield ''.join(w[indices[i]] for i, w in enumerate(words))
            for i in range(len(indices), 0, -1):
                if indices[i - 1] == len(words[i - 1]) - 1:
                    indices[i - 1] = 0
                else:
                    indices[i - 1] += 1
                    break
            else:
                break
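The mixed-base idea can also be illustrated directly: treat a plain integer n as a number whose digits each have a base equal to the length of the corresponding word, and decode it with divmod. This is only a sketch of the idea, not how itertools implements it, and nth_combination is a hypothetical name.

```python
import itertools

def nth_combination(words, n):
    """Decode n as a mixed-base number; the last word is the fastest digit."""
    letters = []
    for w in reversed(words):
        n, digit = divmod(n, len(w))
        letters.append(w[digit])
    return ''.join(reversed(letters))

words = ["abc", "def", "ghij"]
total = 1
for w in words:
    total *= len(w)  # 3 * 3 * 4 = 36 combinations

generated = [nth_combination(words, n) for n in range(total)]
print(generated[:5])  # ['adg', 'adh', 'adi', 'adj', 'aeg']
print(generated == [''.join(c) for c in itertools.product(*words)])  # True
```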
From your solution it seems that you need as many for loops as there are strings. For each character you generate in the final string, you need a for loop to go through the list of possible characters. To do that, you can write a recursive solution. Every time you go one level deeper in the recursion, you run just one for loop. You have as many levels of recursion as there are strings.
Here is an example in python:
strings = ["abc", "def", "ghij"]
def rec(generated, k):
    if k==len(strings):
        print(generated)
        return
    for c in strings[k]:
        rec(generated + c, k+1)
rec("", 0)
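If the results are needed as values rather than printed, the same recursion can be written as a generator. This is a sketch with a made-up function name (combinations); it takes the list as a parameter instead of reading a global.

```python
def combinations(strings, prefix=""):
    """Yield every string built from one letter per element, in order."""
    if not strings:
        yield prefix
        return
    first, rest = strings[0], strings[1:]
    for c in first:
        # Recurse one level per remaining string
        yield from combinations(rest, prefix + c)

result = list(combinations(["abc", "def", "ghij"]))
print(len(result), result[0], result[-1])  # 36 adg cfj
```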
Here's how I would do it in Javascript (I assume that every string contains no duplicate characters):
function getPermutations(arr)
{
    return getPermutationsHelper(arr, 0, "");
}
function getPermutationsHelper(arr, idx, prefix)
{
    var foundInCurrent = [];
    for(var i = 0; i < arr[idx].length; i++)
    {
        var str = prefix + arr[idx].charAt(i);
        if(idx < arr.length - 1)
        {
            foundInCurrent = foundInCurrent.concat(getPermutationsHelper(arr, idx + 1, str));
        }
        else
        {
            foundInCurrent.push(str);
        }
    }
    return foundInCurrent;
}
Basically, I'm using a recursive approach. My base case is when I have no more words left in my array, in which case I simply add prefix + c to my array for every c (character) in my last word.
Otherwise, I try each letter in the current word, and pass the prefix I've constructed on to the next word recursively.
For your example array, I got:
adg adh adi adj aeg aeh aei aej afg afh afi afj bdg bdh bdi
bdj beg beh bei bej bfg bfh bfi bfj cdg cdh cdi cdj ceg ceh
cei cej cfg cfh cfi cfj

Resources