Strange behavior on lambda and Dict.items() - python-3.x

I'm trying to remove duplicate words from a string with the code below:
from functools import reduce
from collections import Counter
import re
if __name__ == '__main__':
    sentence = 'User Key Account Department Account Start Date'
    result = reduce(
        lambda sentence, word: re.sub(rf'{word}\s*', '', sentence, count=1),
        filter(lambda x: x[0] if x[1] > 1 else '',
               Counter(sentence.split()).items()),
        sentence
    )
    import pdb
    pdb.set_trace()
    print(result)
    # User Key Department Account Start Date
But it does not print the expected result. The strange part is the filter. If I list only the filtered results:
[el for el in filter(lambda x: x[0] if x[1] > 1 else '', Counter(sentence.split()).items())]
# [('Account', 2)]
Despite what is specified in lambda, x[0].
If I pass a truthy value in the else clause:
[el for el in filter(lambda x: x[0] if x[1] > 1 else ['foo'], Counter(sentence.split()).items())]
# [('User', 1), ('Key', 1), ('Account', 2), ('Department', 1), ('Start', 1), ('Date', 1)]
What am I missing here?
I'd like to do the following:
[el for el in filter(lambda key,value: key if value > 1 else '', Counter(sentence.split()).items())]
And get Account. But it raises *** TypeError: <lambda>() missing 1 required positional argument: 'value'
It works fine using list comprehension:
[key for key, value in Counter(sentence.split()).items() if value > 1]
# ['Account']

I'm not sure exactly what you're trying to do, but I will explain what's actually happening.
Consider the expression filter(lambda x: x[0] if x[1] > 1 else '', Counter(sentence.split()).items()).
The first argument to filter is a predicate. This is a function which takes one input (x) and returns a value which is interpreted as a Boolean.
In this case, let's consider the predicate lambda x : x[0] if x[1] > 1 else '' - we will write this as P for shorthand. We will assume we call this function on an ordered pair (a, b) such that a is a string and b is a number.
Then we see that P((a, b)) = a if b > 1 else ''.
So if b > 1, then P((a, b)) evaluates to a. This value is then interpreted as a Boolean (even though it's a string) because P serves as a predicate.
When we interpret some "container" data-type like a String as a Boolean, we interpret the container to be "true-like" if it is non-empty and "false-like" if it is empty. So in this case, a will be interpreted as True when a is non-empty and False when a is empty.
On the other hand, when b <= 1, P((a, b)) will evaluate to '' which is then interpreted as False (because it's the empty string).
So P((a, b)) is a string which, when interpreted as a Boolean, is equal to b > 1 and (a is non-empty).
So when we call filter(P, seq), where seq is a sequence of pairs (a, b), a a string and b a number, we see that we will keep exactly those pairs (a, b) where b > 1 and a is non-empty.
This is indeed what happens.
However, it seems that what you want to happen is to only keep the items which occur more than once while ignoring their count. To do this, you need a combination of map and filter. You would want
map(lambda x: x[0], filter(lambda x: x[1] > 1, Counter(sentence.split()).items()))
This first keeps only the pairs (a, b) where b > 1. It then takes each remaining pair (a, b) and keeps only the a.
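A minimal sketch of the points above, reusing the sentence from the question. It shows that filter uses the lambda's return value only as a truth test and still yields the original (word, count) pairs, so the words have to be extracted in a separate step. The names counts, pairs, duplicates and result are illustrative, and re.escape is an addition of this sketch (not in the question) to guard against words containing regex metacharacters.

```python
from collections import Counter
from functools import reduce
import re

sentence = 'User Key Account Department Account Start Date'
counts = Counter(sentence.split())

# filter() only tests the truthiness of the lambda's result;
# it still yields the original (word, count) pairs.
pairs = list(filter(lambda x: x[0] if x[1] > 1 else '', counts.items()))
print(pairs)  # [('Account', 2)]

# Keep the pairs with count > 1, then map each pair to its word.
duplicates = list(map(lambda x: x[0],
                      filter(lambda x: x[1] > 1, counts.items())))
print(duplicates)  # ['Account']

# Remove the first occurrence of each duplicated word.
result = reduce(
    lambda text, word: re.sub(rf'{re.escape(word)}\s*', '', text, count=1),
    duplicates,
    sentence,
)
print(result)  # User Key Department Account Start Date
```

filter also calls its function with exactly one argument per item, which is why the two-parameter lambda key, value: ... in the question raises a TypeError.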

Related

creating a list of tuples based on successive items of initial list [duplicate]

I sometimes need to iterate a list in Python looking at the "current" element and the "next" element. I have, till now, done so with code like:
for current, next in zip(the_list, the_list[1:]):
    # Do something
This works and does what I expect, but is there a more idiomatic or efficient way to do the same thing?
Some answers to this problem can simplify by addressing the specific case of taking only two elements at a time. For the general case of N elements at a time, see Rolling or sliding window iterator?.
The documentation for 3.8 provides this recipe:
import itertools
def pairwise(iterable):
    "s -> (s0, s1), (s1, s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)
For Python 2, use itertools.izip instead of zip to get the same kind of lazy iterator (zip will instead create a list):
import itertools
def pairwise(iterable):
    "s -> (s0, s1), (s1, s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return itertools.izip(a, b)
How this works:
First, two parallel iterators, a and b, are created (the tee() call), both pointing to the first element of the original iterable. The second iterator, b, is moved 1 step forward (the next(b, None) call). At this point a points to s0 and b points to s1. Both a and b can traverse the original iterator independently: the zip function (izip in Python 2) takes the two iterators and makes pairs of the returned elements, advancing both iterators at the same pace.
Since tee() can take an n parameter (the number of iterators to produce), the same technique can be adapted to produce a larger "window". For example:
def threes(iterator):
    "s -> (s0, s1, s2), (s1, s2, s3), (s2, s3, s4), ..."
    a, b, c = itertools.tee(iterator, 3)
    next(b, None)
    next(c, None)
    next(c, None)
    return zip(a, b, c)
Caveat: If one of the iterators produced by tee advances further than the others, then the implementation needs to keep the consumed elements in memory until every iterator has consumed them (it cannot 'rewind' the original iterator). Here it doesn't matter because one iterator is only 1 step ahead of the other, but in general it's easy to use a lot of memory this way.
Roll your own!
def pairwise(iterable):
    it = iter(iterable)
    a = next(it, None)
    for b in it:
        yield (a, b)
        a = b
Starting in Python 3.10, this is the exact role of the pairwise function:
from itertools import pairwise
list(pairwise([1, 2, 3, 4, 5]))
# [(1, 2), (2, 3), (3, 4), (4, 5)]
or simply pairwise([1, 2, 3, 4, 5]) if you don't need the result as a list.
I’m just putting this out there; I’m very surprised no one has thought of enumerate().
for index, thing in enumerate(the_list):
    if index < len(the_list) - 1:  # stop before the last element, which has no successor
        current, next_ = thing, the_list[index + 1]
        # do something
Since the_list[1:] actually creates a copy of the whole list (excluding its first element), and in Python 2 zip() creates a list of tuples immediately when called, in total three copies of your list are created. If your list is very large, you might prefer (Python 2)
from itertools import izip, islice
for current_item, next_item in izip(the_list, islice(the_list, 1, None)):
    print(current_item, next_item)
which does not copy the list at all. In Python 3 the built-in zip is already lazy, so zip(the_list, islice(the_list, 1, None)) has the same effect.
Iterating by index can do the same thing:
#!/usr/bin/python
the_list = [1, 2, 3, 4]
for i in xrange(len(the_list) - 1):
    current_item, next_item = the_list[i], the_list[i + 1]
    print(current_item, next_item)
Output:
(1, 2)
(2, 3)
(3, 4)
I am really surprised nobody has mentioned the shorter, simpler and most importantly general solution:
Python 3:
from itertools import islice
def n_wise(iterable, n):
    return zip(*(islice(iterable, i, None) for i in range(n)))
Python 2:
from itertools import izip, islice
def n_wise(iterable, n):
    return izip(*(islice(iterable, i, None) for i in xrange(n)))
It works for pairwise iteration by passing n=2, but can handle any higher number:
>>> for a, b in n_wise('Hello!', 2):
...     print(a, b)
...
H e
e l
l l
l o
o !
>>> for a, b, c, d in n_wise('Hello World!', 4):
...     print(a, b, c, d)
...
H e l l
e l l o
l l o  
l o   W
o   W o
  W o r
W o r l
o r l d
r l d !
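One caveat worth noting: n_wise slices the same object n times, so it relies on the argument being re-iterable (a list, string or tuple). With a one-shot iterator such as a generator, the slices compete for the same items. Below is a sketch of an alternative that consumes its input only once; sliding_window is a hypothetical helper name, and the deque-based approach is this sketch's choice, not something from the answer above.

```python
from collections import deque
from itertools import islice

def n_wise(iterable, n):
    # Same as the Python 3 version above; fine for sequences.
    return zip(*(islice(iterable, i, None) for i in range(n)))

def sliding_window(iterable, n):
    """Yield overlapping n-tuples from any iterable, consuming it once."""
    it = iter(iterable)
    window = deque(islice(it, n - 1), maxlen=n)
    for item in it:
        window.append(item)  # the oldest item drops out automatically
        yield tuple(window)

expected = [(1, 2), (2, 3), (3, 4)]
print(list(n_wise([1, 2, 3, 4], 2)) == expected)                # True: a list can be re-sliced
print(list(n_wise(iter([1, 2, 3, 4]), 2)) == expected)          # False: the slices share one iterator
print(list(sliding_window(iter([1, 2, 3, 4]), 2)) == expected)  # True
```

If the input is shorter than n, sliding_window yields nothing.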
As of 16 May 2020, this is a simple import:
from more_itertools import pairwise
for current, nxt in pairwise(your_iterable):
    print(f'Current = {current}, next = {nxt}')
Docs for more-itertools
Under the hood this code is the same as that in the other answers, but I much prefer imports when available.
If you don't already have it installed then:
pip install more-itertools
Example
For instance, if you had the Fibonacci sequence, you could calculate the ratios of successive pairs as:
from more_itertools import pairwise
fib = [1, 1, 2, 3, 5, 8, 13]
for current, nxt in pairwise(fib):
    ratio = current / nxt
    print(f'Current = {current}, next = {nxt}, ratio = {ratio}')
As others have pointed out, itertools.pairwise() is the way to go on recent versions of Python. However, for 3.8+, a fun and somewhat more concise (compared to the other solutions that have been posted) option that does not require an extra import comes via the walrus operator:
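def pairwise(iterable):
    it = iter(iterable)  # accept lists and other iterables, not only iterators
    a = next(it, None)   # the default avoids an error on empty input
    yield from ((a, a := b) for b in it)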
A basic solution:
def neighbors(seq):
    i = 0
    while i + 1 < len(seq):
        yield (seq[i], seq[i + 1])
        i += 1
for (x, y) in neighbors(the_list):
    print(x, y)
Pairs from a list using a list comprehension
the_list = [1, 2, 3, 4]
pairs = [[the_list[i], the_list[i + 1]] for i in range(len(the_list) - 1)]
for [current_item, next_item] in pairs:
    print(current_item, next_item)
Output:
(1, 2)
(2, 3)
(3, 4)
code = '0016364ee0942aa7cc04a8189ef3'
# Getting the current and next item
print [code[idx]+code[idx+1] for idx in range(len(code)-1)]
# Getting the pair
print [code[idx*2]+code[idx*2+1] for idx in range(len(code)/2)]

How to multiply 2 input lists in python

Please help me understand how to code the following task in Python using input
Programming challenge description:
Write a short Python program that takes two arrays a and b of length n
storing int values, and returns the dot product of a and b. That is, it returns
an array c of length n such that c[i] = a[i] · b[i], for i = 0,...,n−1.
Test Input:
List1's input ==> 1 2 3
List2's input ==> 2 3 4
Expected Output: 2 6 12
Note that the dot product is defined in mathematics to be the sum of the elements of the vector c you want to build.
That said, here is a possibility using zip:
c = [x * y for x, y in zip(a, b)]
And the mathematical dot product would be:
sum(x * y for x, y in zip(a, b))
If the lists are read from the keyboard, they will be read as strings, so you have to convert them before applying the code above.
For instance, for space-separated input such as 1 2 3:
a = [int(s) for s in input().split()]
b = [int(s) for s in input().split()]
c = [x * y for x, y in zip(a, b)]
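To make the distinction between the two results concrete, here is a small sketch using the test input from the question. The helper names parse_ints, elementwise_product and dot_product are illustrative only, and the input is assumed to be whitespace-separated; strings stand in for input() so the example runs without a keyboard.

```python
def parse_ints(line):
    """Turn a whitespace-separated string such as '1 2 3' into [1, 2, 3]."""
    return [int(token) for token in line.split()]

def elementwise_product(a, b):
    # What the challenge asks for: c[i] = a[i] * b[i].
    return [x * y for x, y in zip(a, b)]

def dot_product(a, b):
    # The mathematical dot product: the sum of the element-wise products.
    return sum(x * y for x, y in zip(a, b))

a = parse_ints("1 2 3")  # stands in for input()
b = parse_ints("2 3 4")
print(*elementwise_product(a, b))  # 2 6 12
print(dot_product(a, b))           # 20
```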
Using for loops and appending
list_c = []
for a, b in zip(list_a, list_b):
    list_c.append(a*b)
And now the same, but in the more compact list comprehension syntax
list_c = [a*b for a, b in zip(list_a, list_b)]
From iPython
>>> list_a = [1, 2, 3]
>>> list_b = [2, 3, 4]
>>> list_c = [a*b for a, b in zip(list_a, list_b)]
>>> list_c
[2, 6, 12]
The zip function packs the lists together, element-by-element:
>>> list(zip(list_a, list_b))
[(1, 2), (2, 3), (3, 4)]
And we use tuple unpacking to access the elements of each tuple.
This reads the input and uses map and lambda to compute the result. If you want to print the result separated by spaces (not as a list), use the last line:
list1, list2 = [], []
list1 = list(map(int, input().rstrip().split()))
list2 = list(map(int, input().rstrip().split()))
result_list = list(map(lambda x,y : x*y, list1, list2))
print(*result_list)
I came up with two solutions. Both of them are the kind expected in an introductory Python course:
#OPTION 1: We use the concatenation operator between lists.
def dot_product_noappend(list_a, list_b):
    list_c = []
    for i in range(len(list_a)):
        list_c = list_c + [list_a[i]*list_b[i]]
    return list_c
print(dot_product_noappend([1,2,3],[4,5,6])) #FUNCTION CALL TO SEE RESULT ON SCREEN
#OPTION 2: we use the append method
def dot_product_append(list_a, list_b):
    list_c = []
    for i in range(len(list_a)):
        list_c.append(list_a[i]*list_b[i])
    return list_c
print(dot_product_append([1,2,3],[4,5,6])) #FUNCTION CALL TO SEE RESULT ON SCREEN
Just note that the first method requires that you wrap the product of integers in a list before you can concatenate it to list_c. You do that by using square brackets ([list_a[i]*list_b[i]] instead of list_a[i]*list_b[i]). Also note that brackets are not necessary in the second method, because append takes the element itself rather than a list.
I have added the two function calls with the values you provided, for you to see that it returns the correct result. Choose whatever function you like the most.

Compare lists with multiple elements

I have a list of tuples as follows: s=[(1,300),(250,800),(900,1000),(1200,1300),(1500,2100)]
I need to compare the upper limit of each tuple with the lower limit of the next tuple. If the lower limit of the next tuple is less than the upper limit of the previous one, then it should throw an error; otherwise it should pass.
Example:
s=[(1,300),(250,800),(900,1000),(1200,1300),(1500,2100)] - This should throw an error as 250<300. If it fails for any one, it should throw an error immediately.
s=[(1,300),(350,800),(900,1000)] - This should not throw error as 350>300.
I have tried something like this:
s=[(1,300),(250,800),(900,1000)]
s= (sorted(s))
print(s)
def f(mytuple, currentelement):
    return mytuple[mytuple.index(currentelement) + 1]
for i in s:
    j = f(s,i)
    if i[0]<j[1]:
        print("fail")
    else:
        print("pass")
But it's not working. Help me out here.
zip() combines lists (or any iterables) to a new iterable. It stops when the shortest list is exhausted. Imagine:
a = [1, 2, 3, 4]
b = ['a', 'b', 'c']
zipped = zip(a, b) # Gives: [(1, 'a'), (2, 'b'), (3, 'c')]
# 4 is skipped, because there is no element remaining in b
We can use this to get all pairs in s in an elegant, easy-to-read form:
s=[(1,300),(250,800),(900,1000)]
s= (sorted(s))
pairs = zip(s, s[1:]) # zip s from index 0 with s from index 1
Now that we have pairs in the form of ((a0, a1), (b0, b1)) you can easily compare if a1 > b0 in a loop:
for a,b in pairs:
    if a[1] > b[0]:
        print("fail")
    else:
        print("pass")
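Since the question asks for an error to be thrown immediately at the first failure, the same zip pairing can be wrapped in a function that raises instead of printing. This is only a sketch: the name check_no_overlap and the choice of ValueError are assumptions, not part of the question or the answer above.

```python
def check_no_overlap(intervals):
    """Raise ValueError at the first pair of neighbouring intervals that overlap."""
    ordered = sorted(intervals)
    # Pair each interval with its successor: (s0, s1), (s1, s2), ...
    for (_, prev_upper), (next_lower, _) in zip(ordered, ordered[1:]):
        if next_lower < prev_upper:
            raise ValueError(
                f"lower limit {next_lower} is below previous upper limit {prev_upper}"
            )

check_no_overlap([(1, 300), (350, 800), (900, 1000)])  # passes silently

try:
    check_no_overlap([(1, 300), (250, 800), (900, 1000), (1200, 1300), (1500, 2100)])
except ValueError as err:
    print("fail:", err)
```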
Two problems I see:
1) You're running into an out-of-bounds error, as the last element (900,1000) tries to check the following element, which does not exist.
You can skip the last element by adding [:-1] to your loop.
2) In addition, your "if" condition seems to be backwards. You seem to want to compare i[1] with j[0] instead of i[0] with j[1].
s=[(1,300),(250,800),(900,1000)]
s= (sorted(s))
print(s)
def f(mytuple, currentelement):
    return mytuple[mytuple.index(currentelement) + 1]
for i in s[:-1]:
    j = f(s,i)
    if i[1]>j[0]:
        print("fail")
    else:
        print("pass")
See How to loop through all but the last item of a list? for more details.

Everytime I run this code it says that numpy.ndarray has not attribute 'index'

When I run this code, it reports that the numpy.ndarray object has no attribute 'index'. I'm trying to write a function that, if the given number is in the array, returns the position of that number in the array.
a = np.c_[np.array([1, 2, 3, 4, 5])]
x = int(input('Type a number'))
def findelement(x, a):
    if x in a:
        print (a.index(x))
    else:
        print (-1)
print(findelement(x, a))
Please use np.where instead of list.index.
import numpy as np
a = np.c_[np.array([1, 2, 3, 4, 5])]
x = int(input('Type a number: '))
def findelement(x, a):
    if x in a:
        print(np.where(a == x)[0][0])
    else:
        print(-1)
print(findelement(x, a))
Result:
Type a number: 3
2
None
Note np.where returns the indices of elements in an input array where the given condition is satisfied.
You should check out np.where and np.argwhere.
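The trailing None in the result above appears because findelement prints the index but returns nothing, and the outer print then shows that missing return value. A sketch of a variant that returns the index instead, assuming a hypothetical name find_element and the same column-vector array as in the question:

```python
import numpy as np

def find_element(x, a):
    """Return the row index of the first occurrence of x in a, or -1 if absent."""
    rows = np.where(a == x)[0]  # row indices where the condition holds
    return int(rows[0]) if rows.size else -1

a = np.c_[np.array([1, 2, 3, 4, 5])]  # column vector of shape (5, 1)
print(find_element(3, a))  # 2
print(find_element(9, a))  # -1

# np.argwhere returns one (row, column) index pair per match.
print(np.argwhere(a == 3))  # [[2 0]]
```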

Using the function Map, count the number of words that start with ‘S’ in list in Python3

I'd like to get the total count of elements in a list starting with 'S' using only the map function and a lambda expression. What I've tried wraps the result in list(), which is not what I want.
Below is the code I've tried, which does not do what I want.
input_list = ['San Jose', 'San Francisco', 'Santa Fe', 'Houston']
desireList = list(map(lambda x: x if x[0] == 'S' else '', input_list))
desireList.remove('')
print(len(desireList))
It's more Pythonic to use sum with a generator expression for your purpose:
sum(w.startswith('S') for w in input_list)
or:
sum(f == 'S' for f, *_ in input_list)
or if you still would prefer to use map and lambda:
sum(map(lambda x: x[0] == 'S', input_list))
With your sample input, all of the above would return: 3
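These one-liners work because Python treats True as 1 and False as 0 when summing. A short sketch with the sample input; the variable names are illustrative, and the empty-string case is an extra assumption added here to show where x[0] and startswith differ.

```python
input_list = ['San Jose', 'San Francisco', 'Santa Fe', 'Houston']

# True counts as 1 and False as 0 when summed.
count_index = sum(map(lambda x: x[0] == 'S', input_list))
count_startswith = sum(map(lambda x: x.startswith('S'), input_list))
print(count_index, count_startswith)  # 3 3

# startswith() tolerates empty strings, whereas x[0] would raise IndexError.
with_empty = input_list + ['']
count_with_empty = sum(map(lambda x: x.startswith('S'), with_empty))
print(count_with_empty)  # 3
```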
You can try this:
count = list(map(lambda x:x[0]=='S',input_list)).count(True)
Here's an alternate approach
list( map( lambda x : x[0].lower() , input_list ) ).count('s')
Generate a list of 1st characters per item in the list, and count the number of 's' characters in that list.
