Python whois like function - python-3.x

Okay so I have a file called 'whois.txt' which contains
["96363612", "#a2743, coil, charge"]
["12101258", "#a0272, climate, vault"]
["83157521", "sith"]
["33907120", "#a1321, missile, wired"]
["55553768", "#a2722, legal, illegal"]
["22686400", "#a5619, mindless, #a5637, bank"]
["97436430", "jedi, #a5770, charge, lantern, #a9491, legal"]
["91645905", "sith"]
["89514799", "lantern, #a2563, #a2693"]
["19658307", "Umbrechu"]
["56112504", "#a0473, lantern, kryptonian"]
["12195491", "riyoken"]
["53281943", "#a5135, gateway, jedi"]
["76515035", "#a4023, gateway, wired"]
["79444876", "#a2716, loyalty"]
What I'm doing here is using json and using the first numbers as an ID and the accounts that are associated with the ID are linked by ', '. So using python I am using this code to try to get all the accounts that are associated
def getWhois(self):
x = []
f = open('whois.txt','r')
for line in f.readlines():
rid,names = json.loads(line.strip())
x.append([rid,names])
return x
def recvWhois(self,user):
returned = self.getWhois()
x = []
for data in returned:
rid,names = data[0],data[1]
if user in names:
x.append(names)
matches = list(set(', '.join(x).split(', ')))
return matches
So what that is doing is getting the matches of a user you are searching but I want to search the users in those matches also, I have done this but It feels Like I would have to do this an infinite amount of times of researching matches that are pulled so if I were to do self.recvWhois('missile') It would pull "['missile', 'wired', '#a1321']" I would then try to search all of those accounts to link more, and by now you probably see my problem because I would have to do that x amount of times depending on how many matches there are linked to the previous matched accounts If any of you have a solution to my problem it would be very appreciated.

First i would suggest to maintain an index for searching. You could use a search engine but a python map can also serve as a poor man's search engine. So idea is to have an inverted index where the usernames points to records to which they belong. For searching all linked accounts you can write a memoized recursive function which will cut down the infinite recursive paths. Also in case you have large no. of records you can limit recursion to a predefined maximum level.

It is really hard to tell what you are trying to do, but I think you are making it too complicated. Your data structure lends itself to a dictionary. Why not load it using rid as the key and names as the values?

Related

How can I transform the string of characters back into words?

I've been trying to learn Python for the past two months or so, but I'm really only now getting my hands dirty with it, so I thank you in advance for your patience and insight.
I was working on a project where I was cleaning the names in a dataset. That means filtering out the names of the apps who have foreign characters (that is to say, ord(character) > 127.
However, it turns out that this approach removed too many legitimate apps since the emojis in those were coming back as out of that range.
The workaround is to allow up to one foreign character. So it's pretty straightforward for that part; I can simply scan the characters of the names in each list. The part I'm having trouble with is telling Python where in the loop to add a name to the "cleaned" list (the final version of app names having <=1 one error. (The requirements are actually different in my project, but I'm trying to keep it as simple as possible in this example.)
To simplify the problem a bit, I was working on a dummy list. I have included that for you.
Where do I add the code so that after that final iteration of each name, the name is added to the list entitled cleanedNameList to only append names with <=1 foreign character?
When I've tried appending a 'clean' name to the list before (a name that had <=1 foreign characters in it), it also sometimes adds the ones with more than three foreign characters. I think this is due in part to me not knowing where to put the exception counter.
nameList = ['うErick', 'とうきhine', 'Charliと']
cleanedNameList = []
exceptions = 0
for name in nameList:
print('New name', name, 'being evaluated!')
exceptions = 0
for char in name:
print(char, 'being evaluated')
ascii_value = ord(char)
if ascii_value < 127:
continue
elif ascii_value > 127:
exceptions+=1
print(exceptions, 'exception(s) added for', name)
#where would I add append.cleanedNamesList(name) ?
So, TL;DR: how do I scan a list of names, and once done scanning the list, add those names to a new list only IF they have <=1 foreign character.
def canAllow(s):
return sum((1 for char in s if ord(char)>127), 0) <= 1
cleanList = [name for name in nameList if canAllow(name)]

Finding Close String Matches - valuing sub string word matches higher

I'm trying to find close string matches (context - searching for a discord user from user input).
Atm, I'm trying out the difflib. It works ok, but seems to return some funny results sometimes. Eg. if someone's name contains a word, searching that word may result in something that seems far off instead of that name.
I think that's just because of how get_close_matches works. Could I be suggested some other libraries to try out? (Not sure how to quantify what I'm after, but maybe I want a searcher that gives a higher score to names containing a similar word to the search term)
user_names = []
for member in server.members:
if member.name is not None: user_names.append(member.name)
if member.nick is not None: user_names.append(member.nick)
user_name = difflib.get_close_matches(user_msg, user_names, n = 1, cutoff = 0.2)
I've used https://github.com/seatgeek/fuzzywuzzy for this in the past which provides a few options out of the box from single words to tokenizing and sorting larger strings.

Concatenating FOR loop output

I am very new to Python (first week of active use). I have some bash scripting experience but have decided to learn Python.
I have a variable of multiple strings which I am using to build a URL in FOR loop. The output of URL is JSON and I would like to concatenate complete output into one file.
I will put random URL for privacy reasons.
The code looks like this:
==================
numbers = ['24246', '83367', '37643', '24245', '24241', '77968', '63157', '76004', '71665']
for id in numbers:
restAPI = s.get(urljoin(baseurl, '/test/' + id + '&test2'))
result = restAPI.json
==================
the problem is that if I do print(result) I will get only output of last iteration, i.e. www.google.com/test/71665&test2
Creating a list by adding text = [] worked (content was concatenated) but I would like to keep the original format.
text = []
for id in numbers:
restAPI = s.get(urljoin(baseurl, '/test/' + id + '&test2'))
Does anyone have idea how to do this
When the for loop ends, the variable assigned inside the for loop only keeps the last value. I.e. Every time your code for loops through, the restAPI variable gets reset each time.
If you wanted to keep each URL, you could append to a list outside the scope of the for loop every time, i.e.
restAPI = s.get(urljoin(baseurl, ...
url_list.append(restApi.json)
Or if you just wanted to print...
for id in numbers:
restAPI = s.get(urljoin(baseurl, ...
print(restAPI.json)
If you added them to a list, you could perform seperate functions with the new list of URLs.
If you think there might be duplicates, feel free to use a set() instead (which automatically removes the dupes inside the iterable as new values are added). You can use set_name.add(restAPI.json)
To be better, you could implement a dict and assign the id as the key and the json object as the value. So you could:
dict_obj = dict()
for id in numbers:
restAPI = s.get(urljoin(baseurl, ...
dict_obj[id] = restAPI.json
That way you can query the dictionary later in the script.
Note that if you're querying many URLs, storing the JSON's in memory might be intensive depending on your hardware.

Need help working with lists within lists

I'm taking a programming class and have our first assignment. I understand how it's supposed to work, but apparently I haven't hit upon the correct terms to search to get help (and the book is less than useless).
The assignment is to take a provided data set (names and numbers) and perform some manipulation and computation with it.
I'm able to get the names into a list, and know the general format of what commands I'm giving, but the specifics are evading me. I know that you refer to the numbers as names[0][1], names[1][1], etc, but not how to refer to just that record that is being changed. For example, we have to have the program check if a name begins with a letter that is Q or later; if it does, we double the number associated with that name.
This is what I have so far, with ??? indicating where I know something goes, but not sure what it's called to search for it.
It's homework, so I'm not really looking for answers, but guidance to figure out the right terms to search for my answers. I already found some stuff on the site (like the statistics functions), but just can't find everything the book doesn't even mention.
names = [("Jack",456),("Kayden",355),("Randy",765),("Lisa",635),("Devin",358),("LaWanda",452),("William",308),("Patrcia",256)]
length = len(names)
count = 0
while True
count < length:
if ??? > "Q" # checks if first letter of name is greater than Q
??? # doubles number associated with name
count += 1
print(names) # self-check
numberNames = names # creates new list
import statistics
mean = statistics.mean(???)
median = statistics.median(???)
print("Mean value: {0:.2f}".format(mean))
alphaNames = sorted(numberNames) # sorts names list by name and creates new list
print(alphaNames)
first of all you need to iter over your names list. To do so use for loop:
for person in names:
print(person)
But names are a list of tuples so you will need to get the person name by accessing the first item of the tuple. You do this just like you do with lists
name = person[0]
score = person[1]
Finally to get the ASCII code of a character, you use ord() function. That is going to be helpful to know if name starts with a Q or above.
print(ord('A'))
print(ord('Q'))
print(ord('R'))
This should be enough informations to get you started with.
I see a few parts to your question, so I'll try to separate them out in my response.
check if first letter of name is greater than Q
Hopefully this will help you with the syntax here. Like list, str also supports element access by index with the [] syntax.
$ names = [("Jack",456),("Kayden",355)]
$ names[0]
('Jack', 456)
$ names[0][0]
'Jack'
$ names[0][0][0]
'J'
$ names[0][0][0] < 'Q'
True
$ names[0][0][0] > 'Q'
False
double number associated with name
$ names[0][1]
456
$ names[0][1] * 2
912
"how to refer to just that record that is being changed"
We are trying to update the value associated with the name.
In theme with my previous code examples - that is, we want to update the value at index 1 of the tuple stored at index 0 in the list called names
However, tuples are immutable so we have to be a little tricky if we want to use the data structure you're using.
$ names = [("Jack",456), ("Kayden", 355)]
$ names[0]
('Jack', 456)
$ tpl = names[0]
$ tpl = (tpl[0], tpl[1] * 2)
$ tpl
('Jack', 912)
$ names[0] = tpl
$ names
[('Jack', 912), ('Kayden', 355)]
Do this for all tuples in the list
We need to do this for the whole list, it looks like you were onto that with your while loop. Your counter variable for indexing the list is named count so just use that to index a specific tuple, like: names[count][0] for the countth name or names[count][1] for the countth number.
using statistics for calculating mean and median
I recommend looking at the documentation for a module when you want to know how to use it. Here is an example for mean:
mean(data)
Return the sample arithmetic mean of data.
$ mean([1, 2, 3, 4, 4])
2.8
Hopefully these examples help you with the syntax for continuing your assignment, although this could turn into a long discussion.
The title of your post is "Need help working with lists within lists" ... well, your code example uses a list of tuples
$ names = [("Jack",456),("Kayden",355)]
$ type(names)
<class 'list'>
$ type(names[0])
<class 'tuple'>
$ names = [["Jack",456], ["Kayden", 355]]
$ type(names)
<class 'list'>
$ type(names[0])
<class 'list'>
notice the difference in the [] and ()
If you are free to structure the data however you like, then I would recommend using a dict (read: dictionary).
I know that you refer to the numbers as names[0][1], names[1][1], etc, but
not how to refer to just that record that is being changed. For
example, we have to have the program check if a name begins with a
letter that is Q or later; if it does, we double the number associated
with that name.
It's not entirely clear what else you have to do in this assignment, but regarding your concerns above, to reference the ith"record that is being changed" in your names list, simply use names[i]. So, if you want to access the first record in names, simply use names[0], since indexing in Python begins at zero.
Since each element in your list is a tuple (which can also be indexed), using constructs like names[0][0] and names[0][1] are ways to index the values within the tuple, as you pointed out.
I'm unsure why you're using while True if you're trying to iterate through each name and check whether it begins with "Q". It seems like a for loop would be better, unless your class hasn't gotten there yet.
As for checking whether the first letter is 'Q', str (string) objects are indexed similarly to lists and tuples. To access the first letter in a string, for example, see the following:
>>> my_string = 'Hello'
>>> my_string[0]
'H'
If you give more information, we can help guide you with the statistics piece, as well. But I would first suggest you get some background around mean and median (if you're unfamiliar).

Python3 access previously created object

Im new to programming in general and I need some help for accessing a previously created instance of Class. I did some search on SO but I could not find anything... Maybe it's just because I should not try to do that.
for s in servers:
c = rconprotocol.Rcon(s[0], s[2],s[1])
t = threading.Thread(target=c.connect)
t.start()
c.messengers(allmessages, 10)
Now, what can I do if I want to call a function on "c" ?
Thanks, Hugo
You're creating several different objects that you briefly name c as you go through the loop. If you want to be able to access more than the last of them, you'll need to save them somewhere that won't be overwritten. Probably the best approach is to use a list to hold the successive values, but depending on your specific needs another data structure might make sense too (for instance, using a dictionary you could look up each value by a specific key).
Here's a trivial adjustment to your current code that will save the c values in a list:
c_list = []
for s in servers:
c = rconprotocol.Rcon(s[0], s[2],s[1])
t = threading.Thread(target=c.connect)
t.start()
c.messengers(allmessages, 10)
c_list.append(c)
Later you can access any of the c values with c_list[index], or by iterating with for c in c_list.
A slightly more Pythonic version might use a list comprehension rather than append to create the list (this also shows what a loop over c_list later one might look like):
c_list = [rconprotocol.Rcon(s[0], s[2],s[1]) for s in servers]
for c in c_list:
t = threading.Thread(target=c.connect)
t.start()
c.messengers(allmessages, 10)

Resources