Let's say I have a simple Python list that contains the type of each expense, and I want to iterate over these expenses with a for loop. At each iteration, if the index yields the correct expense type, a counter is advanced by 1. I can easily write this with the code below, but it is not using a fast-running for loop.
array = ['Groceries', 'Restaurant', 'Groceries', 'Misc', 'Bills']
sum = 0
for i in range(len(array)):
    if array[i] == 'Groceries':
        sum += 1
Is there a more pythonic way to write this loop that accelerates the execution? I have seen examples that would look something like the code snippet below. NOTE: The code snippet below does not work, it is just an example of an accelerated format that I have seen before, but do not fully understand.
sum = [sum + 1 for i in array if array[i] == 'Groceries']
If it's just about counts, try collections.Counter:
from collections import Counter
array = ['Groceries', 'Restaurant', 'Groceries', 'Misc', 'Bills']
counts = Counter(array)
print(counts)
# Counter({'Groceries': 2, 'Bills': 1, 'Restaurant': 1, 'Misc': 1})
print(counts['Groceries'])
# 2
for i in range(len(array)):
is definitely NOT the Pythonic way of iterating over an array. It is Visual Basic thinking, from which you should free yourself.
If you want to iterate over an array, just iterate over it as follows:
array = ['Groceries', 'Restaurant', 'Groceries', 'Misc', 'Bills']
for eachItem in array:
    ...
What you do in the loop is up to you. If you want to count how many groceries are in the list, then you can do this:
array = ['Groceries', 'Restaurant', 'Groceries', 'Misc', 'Bills']
groceriesTotal = 0
for eachItem in array:
    if eachItem == 'Groceries':
        groceriesTotal = groceriesTotal + 1
This is simple, clear and pythonic enough to be readable by others.
You seem to think you need a list comprehension for this. But list comprehensions produce lists, and you want a scalar. Try array.count("Groceries").
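As a minimal sketch of the scalar-producing options mentioned above (the names `count_direct` and `count_generator` are only illustrative): `list.count` does the counting in one call, and `sum` over a generator expression is probably the closest working relative of the comprehension in the question.

```python
array = ['Groceries', 'Restaurant', 'Groceries', 'Misc', 'Bills']

# list.count returns how many elements compare equal to the argument.
count_direct = array.count('Groceries')

# A generator expression yields 1 for every match; sum adds them up.
count_generator = sum(1 for item in array if item == 'Groceries')

print(count_direct, count_generator)  # 2 2
```

Both forms avoid indexing with `range(len(array))`; which one is faster on a given dataset is something to measure rather than assume.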
I am implementing an algorithm which might affect the size of some array, and I need to iterate through the entire array. Basically a 'for x in arrayname' would not work because it does not update if the contents of arrayname are changed in the loop. I came up with an ugly solution which is shown in the following example:
import numpy as np

test = np.array([1, 2, 3])
N = len(test)
ii = 0
while ii < N:
    N = len(test)  # refresh the bound, since test may have grown
    print(test[ii])
    if test[ii] == 2:
        test = np.append(test, 4)
    ii += 1
I am wondering whether a cleaner solution exists.
Thanks in advance!
Assuming all the elements are going to be added at the end and no elements are being deleted you could store the new elements in a separate list:
master_list = [1,2,3]
curr_elems = master_list
while len(curr_elems) > 0: # keep looping over new elements added
    new_elems = []
    for item in curr_elems: # loop over the current list of elements, initially the list but then all the added elements on second run etc
        if should_add_element(item):
            new_elems.append(generate_new_element(item))
    master_list.extend(new_elems) # add all the new elements to our master list
    curr_elems = new_elems # and prep to iterate over the new elements for next iteration of the while loop
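To make the pattern above runnable, here is a small sketch with placeholder rules. The bodies of `should_add_element` and `generate_new_element` are hypothetical (the answer does not define them); they are chosen only so that the loop terminates.

```python
def should_add_element(item):
    # Hypothetical rule: only even numbers below 10 spawn a new element.
    return item % 2 == 0 and item < 10

def generate_new_element(item):
    # Hypothetical rule: the new element is the old one plus 4.
    return item + 4

master_list = [1, 2, 3]
curr_elems = list(master_list)  # copy, so the two lists stay independent
while curr_elems:
    # Only the elements added in the previous pass are examined again.
    new_elems = [generate_new_element(item)
                 for item in curr_elems if should_add_element(item)]
    master_list.extend(new_elems)
    curr_elems = new_elems

print(master_list)  # [1, 2, 3, 6, 10]
```

Each element is inspected exactly once, so the loop ends as soon as a pass produces no new elements.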
The while loop seems the best solution. As the condition is re-evaluated at each iteration, you don’t need to reset the length of the list in the loop, you can do it inside the condition:
import random
l = [1, 2, 3, 4, 5]
i = 0
while i < len(l):
    if random.choice([True, False]):
        del l[i]
    else:
        i += 1
    print(f'{l=}')
This example gives a blueprint for a more complex algorithm. Of course, in this simple case, it could be coded more simply with a filter, or like this:
l = [1, 2, 3, 4, 5]
[x for x in l if random.choice([True, False])]
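For the filter variant mentioned above, a minimal sketch could look like this. The predicate `keep` is a hypothetical, deterministic stand-in for the random choice, so that the result is reproducible.

```python
l = [1, 2, 3, 4, 5]

def keep(x):
    # Hypothetical rule standing in for the random choice: keep odd numbers.
    return x % 2 == 1

# filter returns a lazy iterator in Python 3, so wrap it in list().
filtered = list(filter(keep, l))
print(filtered)  # [1, 3, 5]
```

Unlike the `del l[i]` loop, this builds a new list and leaves `l` untouched.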
You might want to check this related post for more creative solutions: How to remove items from a list while iterating?
Is it possible to convert this into a list comprehension? For example, I have a list v. In the source code below, v = dictionary.keys()
v = ["naive", "bayes", "classifier"]
I have the following nested list t.
t = [["naive", "bayes"], ["lol"]]
The expected output O should be:
O = [[1, 1, 0], [0, 0, 0]]
1 if the dictionary contains the word and 0 if not. I'm creating a spam/ham feature matrix. Due to the large dataset, I'd like to convert the code below into a list comprehension for a faster iteration.
ham_feature_matrix = []
for each_file in train_ham:
    feature_vector = [0] * len(dictionary)
    for each_word in each_file:
        for d, dicword in enumerate(dictionary.keys()):
            if each_word == dicword:
                feature_vector[d] = 1
    ham_feature_matrix.append(feature_vector)
I couldn't test this, but this translates as:
ham_feature_matrix = [[[int(each_word == dicword) for dicword in dictionary] for each_word in each_file] for each_file in train_ham]
[int(each_word == dicword) for dicword in dictionary] is the part which changes the most compared to your original code.
Basically, since you're iterating on the words of the dictionary, you don't need enumerate to set the matching slots to 1. The comprehension builds the list with the result of the comparison which is 0 or 1 when converted to integers. You don't need to get the keys since iterating on a dictionary iterates on the keys by default.
The rest of the loops are trivial.
The issue I'm seeing here is that you're iterating over a dictionary to create a list of 0/1 flags, but before Python 3.7 the order of a dictionary isn't guaranteed, so on older versions you can get a different column order from run to run (like in your original code) unless you sort the items somehow.
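As a sketch that reproduces the expected output O from the question, assuming the goal is one row per file: test each dictionary word for membership in the file, rather than comparing word by word. Nesting a third comprehension over each_word would instead yield one row per word. The name `feature_matrix` is illustrative; `v` and `t` are the sample data from the question.

```python
v = ["naive", "bayes", "classifier"]   # dictionary words, in a fixed order
t = [["naive", "bayes"], ["lol"]]      # one list of words per file

feature_matrix = [
    [int(word in file_words) for word in v]
    for file_words in map(set, t)   # a set makes each membership test O(1) on average
]
print(feature_matrix)  # [[1, 1, 0], [0, 0, 0]]
```

Using a list for `v` fixes the column order regardless of the Python version.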
I am using Python 3.2.0 and numpy. I would like to check whether one array lies in between two other specified arrays. I would appreciate it if you could suggest a function, or a few of them together. Any help is appreciated, as it is a school project and I need to submit it quickly.
If you mean how the last item of arr1 is less than all items of input_arr, and the first item of arr2 is greater than all those in input_arr, you can do this, with "biggest" of arr1 and "smallest" of arr2:
biggest = arr1[len(arr1)-1]
smallest = arr2[0]
between = True
for item in input_arr:
    if not (biggest < item and smallest > item):
        between = False
        break
Alternatively, you could change the < and/or the > to <= or >= if you allow equals to (so [1,3,4],[4,6,8],[8,17,18] is True)
This assumes that each list is sorted in ascending order. If they're not, you'll have to loop through arr1 to find the largest number and arr2 to find the smallest first.
biggest = arr1[0]  # start from a real element so negative values work too
for item in arr1:
    if item > biggest:
        biggest = item
smallest = arr2[0]
for item in arr2:
    if item < smallest:
        smallest = item
Use this as a skeleton guide, don't just copy and paste. If you don't understand it and therefore can't construct your own version, you probably need to do some sort of online course (e.g. Codecademy). In the meantime, copy the second bit first if you need it, then the first.
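The two loops above can also be expressed with the built-ins `max`, `min` and `all`. This is a compact sketch of the same strict comparison, not a replacement for understanding the loops; the function name `is_between` is made up for illustration.

```python
def is_between(arr1, input_arr, arr2):
    # True when every item of input_arr lies strictly between
    # the largest value of arr1 and the smallest value of arr2.
    biggest = max(arr1)
    smallest = min(arr2)
    return all(biggest < item < smallest for item in input_arr)

print(is_between([1, 3, 4], [5, 6, 7], [8, 17, 18]))  # True
print(is_between([1, 3, 4], [4, 6, 8], [8, 17, 18]))  # False
```

Changing `<` to `<=` in the chained comparison makes the second call return True, matching the "allow equals" variant described above.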
If you had 3 arrays,
#lower bound
In[1]: small = np.zeros((3,3))
#Array we are testing
In[2]: test = np.ones((3,3))
#Upper bound
In[3]: large = np.ones((3,3))*2
then you could do a logical_and and compare the sum of the resulting Boolean array against the array's size:
In[4]: np.logical_and(small<=test,test<=large).sum() == test.size
Out[4]: True
I'm trying to write a code that will return common values from a dictionary based on a list of words.
Example:
inp = ['here','now']
dict = {'here':{1,2,3}, 'now':{2,3}, 'stop':{1, 3}}
for val in inp.intersection(D):
    lst = D[val]
    print(sorted(lst))
output: [2, 3]
The input inp may contain any one or all of the above words, and I want to know what values they have in common. I just cannot seem to figure out how to do that. Please, any help would be appreciated.
The easiest way to do this is to just count them all, and then make a dict of the values that are equal to the number of sets you intersected.
To accomplish the first part, we do something like this:
answer = {}
for word in inp:
    for itm in dict[word]:  # iterate over the set stored for this word
        if itm in answer:
            answer[itm] += 1
        else:
            answer[itm] = 1
To accomplish the second part, we just have to iterate over answer and build an array like so:
answerArr = []
for i in answer:
    if answer[i] == len(inp):
        answerArr.append(i)
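Since the values in the question are already sets, the same result can be obtained with `set.intersection`. The sketch below is one possible way to do it; `data` and `common_values` are illustrative names, and skipping words that are not keys is an assumption about the desired behaviour.

```python
inp = ['here', 'now']
# Named `data` rather than `dict` to avoid shadowing the built-in type.
data = {'here': {1, 2, 3}, 'now': {2, 3}, 'stop': {1, 3}}

def common_values(words, mapping):
    # Collect the set for every word that is actually a key of the mapping.
    sets = [mapping[w] for w in words if w in mapping]
    if not sets:
        return []  # nothing to intersect
    return sorted(set.intersection(*sets))

print(common_values(inp, data))  # [2, 3]
```

With a single word in `inp`, the function simply returns that word's values in sorted order.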
I'm not certain that I understood your question perfectly, but I think this is what you meant, albeit in a very simple way:
inp = ['here','now']
dict = {'here':{1,2,3}, 'now':{2,3}, 'stop':{1, 3}}
output = []
for item in inp:
    output.append(dict[item])
for item in output:
    occurances = output.count(item)
    if occurances <= 1:
        output.remove(item)
print(output)
This should output the items from the dict which occur in more than one input. If you want them to be common to all of the inputs, just change the <= 1 to the number of inputs given.
Say I have a list of strings, like so:
strings = ["abc", "def", "ghij"]
Note that the length of a string in the list can vary.
The way you generate a new string is to take one letter from each element of the list, in order. Examples: "adg" and "bfi", but not "dch" because the letters are not in the same order in which they appear in the list. So in this case where I know that there are only three elements in the list, I could fairly easily generate all possible combinations with a nested for loop structure, something like this:
for i in strings[0]:
    for ii in strings[1]:
        for iii in strings[2]:
            print(i + ii + iii)
The issue arises for me when I don't know how long the list of strings is going to be beforehand. If the list is n elements long, then my solution requires n for loops to succeed.
Can anyone point me towards a relatively simple solution? I was thinking of a DFS-based solution where I turn each letter into a node and create a connection between all letters in adjacent strings, but this seems like too much effort.
In Python, you would use itertools.product
e.g.:
>>> import itertools
>>> for comb in itertools.product("abc", "def", "ghij"):
...     print(''.join(comb))
...
adg
adh
adi
adj
aeg
aeh
...
Or, using an unpack:
>>> words = ["abc", "def", "ghij"]
>>> print('\n'.join(''.join(comb) for comb in itertools.product(*words)))
(same output)
The algorithm used by product is quite simple, as can be seen in its source code (Look particularly at function product_next). It basically enumerates all possible numbers in a mixed base system (where the multiplier for each digit position is the length of the corresponding word). A simple implementation which only works with strings and which does not implement the repeat keyword argument might be:
def product(words):
    if words and all(len(w) for w in words):
        indices = [0] * len(words)
        while True:
            # Change ''.join to tuple for a more accurate implementation
            yield ''.join(w[indices[i]] for i, w in enumerate(words))
            for i in range(len(indices), 0, -1):
                if indices[i - 1] == len(words[i - 1]) - 1:
                    indices[i - 1] = 0
                else:
                    indices[i - 1] += 1
                    break
            else:
                break
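To make the mixed-base idea concrete, here is a small sketch that decodes a plain integer counter into one combination with `divmod`. The helper `nth_combination` is a hypothetical name and is not part of itertools; it only illustrates that counting from 0 upward visits the combinations in the same order as `itertools.product`.

```python
import itertools

def nth_combination(words, n):
    # Decode n as a mixed-base number: the rightmost word is the
    # least significant digit, and each digit's base is that word's length.
    letters = []
    for w in reversed(words):
        n, digit = divmod(n, len(w))
        letters.append(w[digit])
    return ''.join(reversed(letters))

words = ["abc", "def", "ghij"]

total = 1
for w in words:
    total *= len(w)  # 3 * 3 * 4 = 36 combinations

decoded = [nth_combination(words, n) for n in range(total)]
expected = [''.join(c) for c in itertools.product(*words)]
print(decoded[:5])          # ['adg', 'adh', 'adi', 'adj', 'aeg']
print(decoded == expected)  # True
```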
From your solution it seems that you need as many for loops as there are strings. For each character you generate in the final string, you need a for loop to go through the list of possible characters. To do that you can write a recursive solution. Every time you go one level deeper in the recursion, you run just one for loop. You have as many levels of recursion as there are strings.
Here is an example in Python:
strings = ["abc", "def", "ghij"]
def rec(generated, k):
    if k == len(strings):
        print(generated)
        return
    for c in strings[k]:
        rec(generated + c, k + 1)
rec("", 0)
Here's how I would do it in Javascript (I assume that every string contains no duplicate characters):
function getPermutations(arr)
{
    return getPermutationsHelper(arr, 0, "");
}
function getPermutationsHelper(arr, idx, prefix)
{
    var foundInCurrent = [];
    for(var i = 0; i < arr[idx].length; i++)
    {
        var str = prefix + arr[idx].charAt(i);
        if(idx < arr.length - 1)
        {
            foundInCurrent = foundInCurrent.concat(getPermutationsHelper(arr, idx + 1, str));
        }
        else
        {
            foundInCurrent.push(str);
        }
    }
    return foundInCurrent;
}
Basically, I'm using a recursive approach. My base case is when I have no more words left in my array, in which case I simply add prefix + c to my array for every c (character) in my last word.
Otherwise, I try each letter in the current word, and pass the prefix I've constructed on to the next word recursively.
For your example array, I got:
adg adh adi adj aeg aeh aei aej afg afh afi afj bdg bdh bdi
bdj beg beh bei bej bfg bfh bfi bfj cdg cdh cdi cdj ceg ceh
cei cej cfg cfh cfi cfj