Need help working with lists within lists - python-3.x

I'm taking a programming class and have our first assignment. I understand how it's supposed to work, but apparently I haven't hit upon the correct terms to search to get help (and the book is less than useless).
The assignment is to take a provided data set (names and numbers) and perform some manipulation and computation with it.
I'm able to get the names into a list, and know the general format of what commands I'm giving, but the specifics are evading me. I know that you refer to the numbers as names[0][1], names[1][1], etc, but not how to refer to just that record that is being changed. For example, we have to have the program check if a name begins with a letter that is Q or later; if it does, we double the number associated with that name.
This is what I have so far, with ??? indicating where I know something goes, but not sure what it's called to search for it.
It's homework, so I'm not really looking for answers, but guidance to figure out the right terms to search for my answers. I already found some stuff on the site (like the statistics functions), but just can't find everything the book doesn't even mention.
names = [("Jack",456),("Kayden",355),("Randy",765),("Lisa",635),("Devin",358),("LaWanda",452),("William",308),("Patrcia",256)]
length = len(names)
count = 0
while True
count < length:
if ??? > "Q" # checks if first letter of name is greater than Q
??? # doubles number associated with name
count += 1
print(names) # self-check
numberNames = names # creates new list
import statistics
mean = statistics.mean(???)
median = statistics.median(???)
print("Mean value: {0:.2f}".format(mean))
alphaNames = sorted(numberNames) # sorts names list by name and creates new list
print(alphaNames)

First of all, you need to iterate over your names list. To do so, use a for loop:
for person in names:
    print(person)
But names is a list of tuples, so you need to get the person's name by accessing the first item of the tuple. You do this just like you would with a list:
name = person[0]
score = person[1]
Finally, to get the ASCII code of a character, use the ord() function. That is helpful for checking whether a name starts with Q or a later letter.
print(ord('A'))
print(ord('Q'))
print(ord('R'))
This should be enough information to get you started.
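To tie the pieces above together, here is a minimal sketch of comparing the code of a name's first letter against ord('Q'). The shortened names list and the q_or_later list are only for illustration, and it assumes every name starts with an uppercase letter.

```python
# Shortened sample data, for illustration only.
names = [("Jack", 456), ("Randy", 765), ("Lisa", 635)]

threshold = ord("Q")  # code of the letter Q (81)

q_or_later = []  # hypothetical list, used here just to show the result
for person in names:
    name = person[0]
    # name[0] is the first letter; compare its code with the code of 'Q'
    if ord(name[0]) >= threshold:
        q_or_later.append(name)

print(q_or_later)  # ['Randy']
```

Comparing the letters directly (name[0] >= "Q") gives the same result, since Python compares single characters by their code points.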

I see a few parts to your question, so I'll try to separate them out in my response.
check if first letter of name is greater than Q
Hopefully this will help you with the syntax here. Like list, str also supports element access by index with the [] syntax.
$ names = [("Jack",456),("Kayden",355)]
$ names[0]
('Jack', 456)
$ names[0][0]
'Jack'
$ names[0][0][0]
'J'
$ names[0][0][0] < 'Q'
True
$ names[0][0][0] > 'Q'
False
double number associated with name
$ names[0][1]
456
$ names[0][1] * 2
912
"how to refer to just that record that is being changed"
We are trying to update the value associated with the name.
In keeping with my previous code examples, that means we want to update the value at index 1 of the tuple stored at index 0 in the list called names.
However, tuples are immutable so we have to be a little tricky if we want to use the data structure you're using.
$ names = [("Jack",456), ("Kayden", 355)]
$ names[0]
('Jack', 456)
$ tpl = names[0]
$ tpl = (tpl[0], tpl[1] * 2)
$ tpl
('Jack', 912)
$ names[0] = tpl
$ names
[('Jack', 912), ('Kayden', 355)]
Do this for all tuples in the list
We need to do this for the whole list; it looks like you were onto that with your while loop. Your counter variable for indexing the list is named count, so just use that to index a specific tuple: names[count][0] for the count-th name or names[count][1] for the count-th number.
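Putting the indexing and the tuple replacement together, the loop could look roughly like the sketch below. It is one possible shape, not the only one: the sample data is shortened, and it uses >= rather than > because the assignment says "Q or later".

```python
# Shortened sample data, for illustration only.
names = [("Jack", 456), ("Randy", 765), ("William", 308)]

count = 0
while count < len(names):
    name, number = names[count]  # unpack the tuple at this index
    if name[0] >= "Q":           # first letter is Q or later
        # tuples are immutable, so build a new one and store it back
        names[count] = (name, number * 2)
    count += 1

print(names)  # [('Jack', 456), ('Randy', 1530), ('William', 616)]
```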
using statistics for calculating mean and median
I recommend looking at the documentation for a module when you want to know how to use it. Here is an example for mean:
mean(data)
Return the sample arithmetic mean of data.
$ mean([1, 2, 3, 4, 4])
2.8
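Since mean and median expect a sequence of numbers, the numbers have to be pulled out of the tuples first. A small sketch of that step, assuming the same list-of-tuples layout (the names numbers, mean_value and median_value are made up for this example):

```python
import statistics

# Shortened sample data, for illustration only.
names = [("Jack", 456), ("Kayden", 355), ("Randy", 765)]

# Collect just the second item of every tuple.
numbers = [number for _, number in names]

mean_value = statistics.mean(numbers)
median_value = statistics.median(numbers)

print("Mean value: {0:.2f}".format(mean_value))
print("Median value: {0}".format(median_value))
```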
Hopefully these examples help you with the syntax for continuing your assignment, although this could turn into a long discussion.
The title of your post is "Need help working with lists within lists" ... well, your code example uses a list of tuples
$ names = [("Jack",456),("Kayden",355)]
$ type(names)
<class 'list'>
$ type(names[0])
<class 'tuple'>
$ names = [["Jack",456], ["Kayden", 355]]
$ type(names)
<class 'list'>
$ type(names[0])
<class 'list'>
Notice the difference between the [] and ().
If you are free to structure the data however you like, then I would recommend using a dict (read: dictionary).
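As a rough illustration of the dict idea, assuming each name is unique (a dict cannot hold the same key twice), the doubling step becomes a direct update by name. The variable name scores is just for this sketch.

```python
# Names as keys, numbers as values (shortened sample data).
scores = {"Jack": 456, "Kayden": 355, "Randy": 765}

for name in scores:
    if name[0] >= "Q":     # first letter is Q or later
        scores[name] *= 2  # values can be updated in place

print(scores)  # {'Jack': 456, 'Kayden': 355, 'Randy': 1530}
```

Updating existing values while looping over the keys is fine; only adding or removing keys during the loop would be a problem.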

I know that you refer to the numbers as names[0][1], names[1][1], etc, but
not how to refer to just that record that is being changed. For
example, we have to have the program check if a name begins with a
letter that is Q or later; if it does, we double the number associated
with that name.
It's not entirely clear what else you have to do in this assignment, but regarding your concerns above, to reference the i-th "record that is being changed" in your names list, simply use names[i]. So, if you want to access the first record in names, simply use names[0], since indexing in Python begins at zero.
Since each element in your list is a tuple (which can also be indexed), using constructs like names[0][0] and names[0][1] are ways to index the values within the tuple, as you pointed out.
I'm unsure why you're using while True if you're trying to iterate through each name and check whether it begins with "Q". It seems like a for loop would be better, unless your class hasn't gotten there yet.
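For reference, a for loop that still gives you the index of the current record might look like the sketch below. It uses the built-in enumerate; the late_indexes list is a made-up name that only records which positions matched.

```python
# Shortened sample data, for illustration only.
names = [("Jack", 456), ("Randy", 765), ("William", 308)]

late_indexes = []  # hypothetical: positions of names starting with Q or later
for i, (name, number) in enumerate(names):
    if name[0] >= "Q":
        late_indexes.append(i)  # names[i] is the record to change

print(late_indexes)  # [1, 2]
```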
As for checking whether the first letter is 'Q', str (string) objects are indexed similarly to lists and tuples. To access the first letter in a string, for example, see the following:
>>> my_string = 'Hello'
>>> my_string[0]
'H'
If you give more information, we can help guide you with the statistics piece, as well. But I would first suggest you get some background around mean and median (if you're unfamiliar).

Related

Generate a list of strings from another list using python random and eliminate duplicates

I have the following list:
original_list = [('Anger', 'Envy'), ('Anger', 'Exasperation'), ('Joy', 'Zest'), ('Sadness', 'Suffering'), ('Joy', 'Optimism'), ('Surprise', 'Surprise'), ('Love', 'Affection')]
I am trying to create a random list comprising of the 2nd element of the tuples (of the above list) using the random method in such a way that duplicate values appearing as the first element are only considered once.
That is, the final list I am looking at, will be:
random_list = [Exasperation, Suffering, Optimism, Surprise, Affection]
So, in the new list random_list, the strings Envy and Zest are eliminated (because their first elements, Anger and Joy, appear twice in the original list). The process has to randomize the result, i.e. each run would produce a different list of five elements.
Could somebody show me how to do it?
You can use a dictionary to filter the duplicates from original_list (shuffled beforehand with random.sample):
import random
original_list = [
("Anger", "Envy"),
("Anger", "Exasperation"),
("Joy", "Zest"),
("Sadness", "Suffering"),
("Joy", "Optimism"),
("Surprise", "Surprise"),
("Love", "Affection"),
]
out = list(dict(random.sample(original_list, len(original_list))).values())
print(out)
Prints (for example):
['Optimism', 'Envy', 'Surprise', 'Suffering', 'Affection']
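The one-liner works because a dict keeps only the last value assigned to each key. Spelled out as a loop, the same idea looks roughly like this (the names shuffled and chosen are only for this sketch):

```python
import random

original_list = [
    ("Anger", "Envy"),
    ("Anger", "Exasperation"),
    ("Joy", "Zest"),
    ("Sadness", "Suffering"),
    ("Joy", "Optimism"),
    ("Surprise", "Surprise"),
    ("Love", "Affection"),
]

# random.sample with the full length returns a shuffled copy.
shuffled = random.sample(original_list, len(original_list))

chosen = {}
for first, second in shuffled:
    # A repeated key overwrites the earlier value, so each first
    # element ends up with exactly one (randomly decided) partner.
    chosen[first] = second

out = list(chosen.values())
print(out)
```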

Defining a function to find the unique palindromes in a given string

I'm kinda new to Python. I'm trying to define a function that, when called, outputs only the unique words in a string that are palindromes.
I used casefold() to make it case-insensitive and set() to print only uniques.
Here's my code:
def uniquePalindromes(string):
    x=string.split()
    for i in x:
        k=[]
        rev= ''.join(reversed(i))
        if i.casefold() == rev.casefold():
            k.append(i.casefold())
            print(set(k))
        else:
            return
I've tried to run this line
print( uniquePalindromes('Hanah asked Sarah but Sarah refused') )
The expected output should be ['hanah','sarah'] but it's returning only {'hanah'} as the output. Please help.
Your logic is sound, and your function is mostly doing what you want it to. Part of the issue is how you're returning things - all you're doing is printing the set of each individual word. For example, when I take your existing code and do this:
>>> print(uniquePalindromes('Hannah Hannah Alomomola Girafarig Yes Nah, Chansey Goldeen Need log'))
{'hannah'}
{'alomomola'}
{'girafarig'}
None
hannah, alomomola, and girafarig are the palindromes I would expect to see, but they're not given in the format I expect. For one, they're being printed, instead of returned, and for two, that's happening one-by-one.
And the function is returning None, and you're trying to print that. This is not what we want.
Here's a fixed version of your function:
def uniquePalindromes(string):
    x=string.split()
    k = [] # note how we put it *outside* the loop, so it persists across each iteration without being reset
    for i in x:
        rev= ''.join(reversed(i))
        if i.casefold() == rev.casefold():
            k.append(i.casefold())
            # the print statement isn't what we want
        # no need for an else statement - the loop will continue anyway
    # now, once all elements have been visited, return the set of unique elements from k
    return set(k)
Now it returns roughly what you'd expect: a single set with multiple words, instead of printing multiple sets with one word each. Then, we can print that set.
>>> print(uniquePalindromes("Hannah asked Sarah but Sarah refused"))
{'hannah'}
>>> print(uniquePalindromes("Hannah and her friend Anna caught a Girafarig and named it hannaH"))
{'anna', 'hannah', 'girafarig', 'a'}
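If you want something more compact, the same logic can be sketched as a set comprehension with slice reversal. This is an alternative spelling of the function above, not a different algorithm; the name unique_palindromes is made up here.

```python
def unique_palindromes(text):
    """Return the set of case-insensitive palindromic words in text."""
    words = (word.casefold() for word in text.split())
    # word[::-1] is the word reversed
    return {word for word in words if word == word[::-1]}


result = unique_palindromes("Hannah asked Sarah but Sarah refused")
print(result)  # {'hannah'}
```

Note that "Sarah" is not a palindrome (reversed it reads "haraS"), so it is correctly left out.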
They are not going to like me on here if I give you some tips, but try dividing the number of characters (that aren't whitespace) by 2. If the two halves are not the same size, you must be dealing with an odd number of letters. That means you should be able to traverse the palindrome downwards from the middle and upwards from the middle, comparing those letters as you go and using the middle point as a "jump off" point. Hope this helps.
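One way to read that tip is the sketch below, which starts next to the middle and walks outwards in both directions. The function name is_palindrome and the index arithmetic are my own interpretation of the description, so treat it as an illustration only.

```python
def is_palindrome(word):
    """Check a single word by walking outwards from its middle."""
    letters = word.casefold()
    n = len(letters)
    # For an odd length the middle letter is skipped (the "jump off" point);
    # for an even length the two pointers start on the two middle letters.
    left = n // 2 - 1
    right = (n + 1) // 2
    while left >= 0:
        if letters[left] != letters[right]:
            return False
        left -= 1
        right += 1
    return True


print(is_palindrome("Hannah"))     # True
print(is_palindrome("Girafarig"))  # True
print(is_palindrome("Sarah"))      # False
```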

How can I transform the string of characters back into words?

I've been trying to learn Python for the past two months or so, but I'm really only now getting my hands dirty with it, so I thank you in advance for your patience and insight.
I was working on a project where I was cleaning the names in a dataset. That means filtering out the names of apps that have foreign characters (that is to say, ord(character) > 127).
However, it turns out that this approach removed too many legitimate apps since the emojis in those were coming back as out of that range.
The workaround is to allow up to one foreign character. So it's pretty straightforward for that part; I can simply scan the characters of the names in each list. The part I'm having trouble with is telling Python where in the loop to add a name to the "cleaned" list (the final version of app names having <=1 error). (The requirements are actually different in my project, but I'm trying to keep it as simple as possible in this example.)
To simplify the problem a bit, I was working on a dummy list. I have included that for you.
Where do I add the code so that after that final iteration of each name, the name is added to the list entitled cleanedNameList to only append names with <=1 foreign character?
When I've tried appending a 'clean' name to the list before (a name that had <=1 foreign characters in it), it also sometimes adds the ones with more than three foreign characters. I think this is due in part to me not knowing where to put the exception counter.
nameList = ['うErick', 'とうきhine', 'Charliと']
cleanedNameList = []
exceptions = 0
for name in nameList:
    print('New name', name, 'being evaluated!')
    exceptions = 0
    for char in name:
        print(char, 'being evaluated')
        ascii_value = ord(char)
        if ascii_value < 127:
            continue
        elif ascii_value > 127:
            exceptions+=1
            print(exceptions, 'exception(s) added for', name)
#where would I add append.cleanedNamesList(name) ?
So, TL;DR: how do I scan a list of names, and once done scanning the list, add those names to a new list only IF they have <=1 foreign character.
def canAllow(s):
    return sum((1 for char in s if ord(char)>127), 0) <= 1

cleanList = [name for name in nameList if canAllow(name)]
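To answer the "where does the append go" part directly: it belongs after the inner loop has finished, at the indentation level of the outer loop, so the counter is complete before it is checked. A sketch in the shape of the question's loop (variable names are snake_case versions of the ones in the question):

```python
name_list = ['うErick', 'とうきhine', 'Charliと']
cleaned_name_list = []

for name in name_list:
    exceptions = 0  # reset the counter for every new name
    for char in name:
        if ord(char) > 127:
            exceptions += 1
    # The inner loop is done here, so the count for this name is final.
    if exceptions <= 1:
        cleaned_name_list.append(name)

print(cleaned_name_list)  # ['うErick', 'Charliと']
```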

Find a specific item from a list using python

I have a list of 20000 Products with their Description
This shows the variety of the products
I want to be able to write a code that searches a particular word say 'TAPA'
and give a output of all the TAPAs
I found "Find a specific word from a list in python", but it uses startswith, which only matches at the beginning of each item, for example:
new = [x for x in df1['A'] if x.startswith('00320')]
## output ['00320671-01 Guide rail 25N/1660', '00320165S02 - Miniature rolling table']
How can I search for a match starting at the second or third letter, or anywhere else in the item?
P.S. The list consists of strings, integers, and floats.
You can use string.find(substring) for this purpose. So in your case this should work:
new = [x for x in df1['A'] if x.find('00320') != -1]
The find() method returns the lowest index of the substring found else returns -1.
To know more about usage of find() refer to Geeksforgeeks.com - Python String | find()
Edit 1:
As suggested by @Thierry in comments, a cleaner way to do this is:
new = [x for x in df1['A'] if '00320' in x]
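Since the question mentions that the data mixes strings, integers and floats, note that both find() and in would fail on a non-string item. One hedged way around that is to convert each item with str() before testing. The items list below is made-up sample data standing in for df1['A'].

```python
# Made-up sample data with mixed types, for illustration only.
items = [
    '00320671-01 Guide rail 25N/1660',
    'TAPA 00320',
    12345,
    3.14,
    'Miniature rolling table',
]

# str(x) makes the substring test safe for ints and floats too.
matches = [x for x in items if '00320' in str(x)]

print(matches)  # ['00320671-01 Guide rail 25N/1660', 'TAPA 00320']
```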
You can use the built-in functions of Pandas to find partial string matches and generate lists:
new = df1['A'][df1['A'].astype(str).str.contains('00320')].tolist()
An advantage of pandas str.contains() is that the use of regex is possible.

Python: iterate through list and check for matching sub-string in specific parts of string

For all the strings in a list of strings, if either of the first two characters of a string matches (in any order), then check whether either of the last two characters matches in a specific order. If so, I will add an edge between two vertices in graph G.
Example:
d = ['BEBC', 'ABRC']
since the 'B' in the first two characters and the 'C' in the second two characters match, I will add an edge. I'm fairly new to Python and what I have come up with through previous searches seems overly verbose:
for i in range(0,len(d)-1):
    for j in range(0,len(d)-1):
        if (d[i][0] in d[j+1][:2] or d[i][1] in d[j+1][:2]) and \
                (d[i][2] in d[j+1][2] or d[i][3] in d[j+1][3]):
            G.add_edge(d[i],d[j+1])
The next step on this is to come up with a faster way to iterate through since there will probably only be 1 to 3 edges connecting each node, so 90% of the iteration test will come back false. Suggestions would be welcome!
Since you know that the last character of each list item absolutely has to match in the same position, it's less expensive to check for that first. Otherwise the code does work it doesn't need to. Using timeit, you can measure the difference in calculation time after making a few changes, such as checking the last characters first:
import timeit
d = ['BEBC', 'ABRC']
def test1():
    # compare the last characters by value (==), not by identity (is)
    if d[0][-1] == d[1][-1]:
        for i in range(0,2):
            if d[0][i] in d[1][:2]:
                return (d[0],d[1])
print(test1())
print(timeit.timeit(stmt=test1, number=1000000))
Result:
('BEBC', 'ABRC')
2.3587113980001959
Original Code:
d = ['BEBC', 'ABRC']
def test2():
    for i in range(0,len(d)-1):
        for j in range(0,len(d)-1):
            if (d[i][0] in d[j+1][:2] or d[i][1] in d[j+1][:2]) and \
                    (d[i][2] in d[j+1][2] or d[i][3] in d[j+1][3]):
                return(d[i],d[j+1])
print(test2())
print(timeit.timeit(stmt=test2, number=1000000))
Result:
('BEBC', 'ABRC')
3.1525327970002763
Now let's take the last list value and change it so that the last character C does not match:
d = ['BEBC', 'ABRX']
New Code:
None
0.766526217000318
Original:
None
2.963771982000253
This is where it's obviously going to pay off in regard to the order of iterating items — especially considering if 90% of the iteration checks could come back false.
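For a list with more than two items, the same "cheap check first" idea could be sketched as below. It follows the condition in the question's code (third or fourth character equal in the same position, plus a shared character among the first two), visits each pair once with itertools.combinations, and collects the pairs in a plain list instead of calling G.add_edge. The helper name shares_edge and the extra sample strings are assumptions for illustration.

```python
from itertools import combinations


def shares_edge(a, b):
    """Return True if strings a and b should be connected."""
    # Cheap positional check first: most pairs are rejected here.
    if a[2] != b[2] and a[3] != b[3]:
        return False
    # Only now look for a common character among the first two.
    return bool(set(a[:2]) & set(b[:2]))


# Sample data extended beyond the question's two strings.
d = ['BEBC', 'ABRC', 'XYZC', 'ABRX']

# combinations yields each unordered pair exactly once.
edges = [(a, b) for a, b in combinations(d, 2) if shares_edge(a, b)]

print(edges)  # [('BEBC', 'ABRC'), ('ABRC', 'ABRX')]
```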
