Concatenating FOR loop output - python-3.x

I am very new to Python (first week of active use). I have some bash scripting experience but have decided to learn Python.
I have a variable of multiple strings which I am using to build a URL in FOR loop. The output of URL is JSON and I would like to concatenate complete output into one file.
I will put random URL for privacy reasons.
The code looks like this:
==================
numbers = ['24246', '83367', '37643', '24245', '24241', '77968', '63157', '76004', '71665']
for id in numbers:
restAPI = s.get(urljoin(baseurl, '/test/' + id + '&test2'))
result = restAPI.json
==================
the problem is that if I do print(result) I will get only output of last iteration, i.e. www.google.com/test/71665&test2
Creating a list by adding text = [] worked (content was concatenated) but I would like to keep the original format.
text = []
for id in numbers:
restAPI = s.get(urljoin(baseurl, '/test/' + id + '&test2'))
Does anyone have idea how to do this

When the for loop ends, the variable assigned inside the for loop only keeps the last value. I.e. Every time your code for loops through, the restAPI variable gets reset each time.
If you wanted to keep each URL, you could append to a list outside the scope of the for loop every time, i.e.
restAPI = s.get(urljoin(baseurl, ...
url_list.append(restApi.json)
Or if you just wanted to print...
for id in numbers:
restAPI = s.get(urljoin(baseurl, ...
print(restAPI.json)
If you added them to a list, you could perform seperate functions with the new list of URLs.
If you think there might be duplicates, feel free to use a set() instead (which automatically removes the dupes inside the iterable as new values are added). You can use set_name.add(restAPI.json)
To be better, you could implement a dict and assign the id as the key and the json object as the value. So you could:
dict_obj = dict()
for id in numbers:
restAPI = s.get(urljoin(baseurl, ...
dict_obj[id] = restAPI.json
That way you can query the dictionary later in the script.
Note that if you're querying many URLs, storing the JSON's in memory might be intensive depending on your hardware.

Related

List, tuples or dictionary, differences and usage, How can I store info in python

I'm very new in python (I usually write in php). I want to understand how to store information in an associative array, and if you can explain me whats the difference of "tuples", "arrays", "dictionary" and "list" will be wonderful (I tried to read different source but I still not caching it).
So This is my code:
#!/usr/bin/python3.4
import csv
import string
nidless_keys = dict()
nidless_keys = ['test_string1','test_string2'] #this contain the string to
# be searched in linesreader
data = {'type':[],'id':[]} #here I want to store my information
with open('path/to/csv/file.csv',newline="") as csvfile:
linesreader = csv.reader(csvfile,delimiter=',',quotechar="|")
for row in linesreader: #every line in this csv have a url like
#www.test.com/?test_string1&id=123456
current_row_string = str(row)
for needle in nidless_keys:
current_needle = str(needle)
if current_needle in current_row_string:
data[current_needle[current_row_string[-8:]]) += 1 # also I
#need to count per every id how much rows there are.
In conclusion:
my_data_stored = [current_needle][current_row_string[-8]]
current_row_string[-8] is a url which the last 8 digit of the url is an ID.
So the array should looks like this at the end of the script:
test_string1 = 123456 = 20
= 256468 = 15
test_string2 = 123155 = 10
Edit 1:
Which type I need here to store the information?
Can you tell me how to resolve this script?
It seems you want to count how many times an ID in combination with a test string occurs.
There can be multiple ID/count combinations associated with every test string.
This suggests that you should use a dictionary indexed by the test strings to store the results. In that dictionary I would suggest to store collections.Counter objects.
This way, you would have to add a special case when a key in the results dictionary isn't found to add an empty Counter. This is a common problem, so there is a specialized form of dictionary in the collections module called defaultdict.
import collections
import csv
# Using a tuple for the keys so it cannot be accidentally modified
keys = ('test_string1', 'test_string2')
result = collections.defaultdict(collections.Counter)
with open('path/to/csv/file.csv',newline="") as csvfile:
linesreader = csv.reader(csvfile,delimiter=',',quotechar="|")
for row in linesreader:
for key in keys:
if key in row:
id = row[-6:] # ID's are six digits in your example.
# The first index is into the dict, the second into the Counter.
result[key][id] += 1
There is an even easier way, by using regular expressions.
Since you seem to treat every row in a CSV file as a string, there is little need to use the CSV reader, so I'll just read the whole file as text.
import re
with open('path/to/csv/file.csv') as datafile:
text = datafile.read()
pattern = r'\?(.*)&id=(\d+)'
The pattern is a regular expression. This is a large topic in and of itself, so I'll only cover briefly what it does. (You might also want to check out the relevant HOWTO) At first glance it looks like complete gibberish, but it is actually a complete language.
In looks for two things in a line. Anything between ? and &id=, and a sequence of digits after &id=.
I'll be using IPython to give an example.
(If you don't know it, check out IPython. It is great for trying things and see if they work.)
In [1]: import re
In [2]: pattern = r'\?(.*)&id=(\d+)'
In [3]: text = """www.test.com/?test_string1&id=123456
....: www.test.com/?test_string1&id=123456
....: www.test.com/?test_string1&id=234567
....: www.test.com/?foo&id=234567
....: www.test.com/?foo&id=123456
....: www.test.com/?foo&id=1234
....: www.test.com/?foo&id=1234
....: www.test.com/?foo&id=1234"""
The text variable points to the string which is a mock-up for the contents of your CSV file.
I am assuming that:
every URL is on its own line
ID's are a sequence of digits.
If these assumptions are wrong, this won't work.
Using findall to extract every match of the pattern from the text.
In [4]: re.findall(pattern, test)
Out[4]:
[('test_string1', '123456'),
('test_string1', '123456'),
('test_string1', '234567'),
('foo', '234567'),
('foo', '123456'),
('foo', '1234'),
('foo', '1234'),
('foo', '1234')]
The findall function returns a list of 2-tuples (that is key, ID pairs). Now we just need to count those.
In [5]: import collections
In [6]: result = collections.defaultdict(collections.Counter)
In [7]: intermediate = re.findall(pattern, test)
Now we fill the result dict from the list of matches that is the intermediate result.
In [8]: for key, id in intermediate:
....: result[key][id] += 1
....:
In [9]: print(result)
defaultdict(<class 'collections.Counter'>, {'foo': Counter({'1234': 3, '123456': 1, '234567': 1}), 'test_string1': Counter({'123456': 2, '234567': 1})})
So the complete code would be:
import collections
import re
with open('path/to/csv/file.csv') as datafile:
text = datafile.read()
result = collections.defaultdict(collections.Counter)
pattern = r'\?(.*)&id=(\d+)'
intermediate = re.findall(pattern, test)
for key, id in intermediate:
result[key][id] += 1
This approach has two advantages.
You don't have to know the keys in advance.
ID's are not limited to six digits.
A brief summary of the python data types you mentioned:
A dictionary is an associative array, aka hashtable.
A list is a sequence of values.
An array is essentially the same as a list, but limited to basic datatypes. My impression is that they only exists for performance reasons, don't think I've ever used one. If performance is that critical to you, you probably don't want to use python in the first place.
A tuple is a fixed-length sequence of values (whereas lists and arrays can grow).
Lets take them one by one.
Lists:
List is a very naive kind of data structure similar to arrays in other languages in terms of the way we write them like:
['a','b','c']
This is a list in python , but seems very similar to array structure.
However there is a very large difference in the way lists are used in python and the usual arrays.
Lists are heterogenous in nature. This means that we can store any kind of data simultaneously inside it like:
ls = [1,2,'a','g',True]
As you can see, we have various kinds of data within a list and is a valid list.
However, one important thing about them is that we can access the list items using zero based indices. So we can write:
print ls[0],ls[3]
output: 1 g
Dictionary:
This datastructure is similar to a hash map data structure. It contains a (key,Value) pair. An empty dictionary looks like:
dc = {}
Now, to store a key,value pair, e.g., ('potato',3),(tomato,5), we can do as:
dc['potato'] = 3
dc['tomato'] = 5
and we saved the data in the dictionary dc.
The important thing is that we can even store another data structure element like a list within a dictionary like:
dc['list1'] = ls , where ls is the list defined above.
This shows the power of using dictionary.
In your case, you have difined a dictionary like this:
data = {'type':[],'id':[]}
This means that your dictionary will consist of only two keys and each key corresponds to a list, which are empty for now.
Talking a bit about your script, the expression :
current_row_string[-8:]
doesn't make a sense. The index should have been -6 instead of -8 that would give you the id part of the current row.
This part is the id and should have been stored in a variable say :
id = current_row_string[-6:]
Further action can be performed as seen the answer given by Roland.

Proper Syntax for List Comprehension Involving an Integer and a Float?

I have a List of Lists that looks like this (Python3):
myLOL = ["['1466279297', '703.0']", "['1466279287', '702.0']", "['1466279278', '702.0']", "['1466279268', '706.0']", "['1466279258', '713.0']"]
I'm trying to use a list comprehension to convert the first item of each inner list to an int and the second item to a float so that I end up with this:
newLOL = [[1466279297, 703.0], [1466279287, 702.0], [1466279278, 702.0], [1466279268, 706.0], [1466279258, 713.0]]
I'm learning list comprehensions, can somebody please help me with this syntax?
Thank you!
[edit - to explain why I asked this question]
This question is a means to an end - the syntax requested is needed for testing. I'm collecting sensor data on a ZigBee network, and I'm using an Arduino to format the sensor messages in JSON. These messages are published to an MQTT broker (Mosquitto) running on a Raspberry Pi. A Redis server (also running on the Pi) serves as an in-memory message store. I'm writing a service (python-MQTT client) to parse the JSON and send a LoL (a sample of the data you see in my question) to Redis. Finally, I have a dashboard running on Apache on the Pi. The dashboard utilizes Highcharts to plot the sensor data dynamically (via a web socket connection between the MQTT broker and the browser). Upon loading the page, I pull historical chart data from my Redis LoL to "very quickly" populate the charts on my dashboard (before any realtime data is added dynamically). I realize I can probably format the sensor data the way I want in the Redis store, but that is a problem I haven't worked out yet. Right now, I'm trying to get my historical data to plot correctly in Highcharts. With the data properly formatted, I can get this piece working.
Well, you could use ast.literal_eval:
from ast import literal_eval
myLOL = ["['1466279297', '703.0']", "['1466279287', '702.0']", "['1466279278', '702.0']", "['1466279268', '706.0']", "['1466279258', '713.0']"]
items = [[int(literal_eval(i)[0]), float(literal_eval(i)[1])] for i in myLOL]
Try:
import json
newLOL = [[int(a[0]), float(a[1])] for a in (json.loads(s.replace("'", '"')) for s in myLOL)]
Here I'm considering each element of the list as a JSON, but since it's using ' instead of " for the strings, I have to replace it first (it only works because you said there will be only numbers).
This may work? I wish I was more clever.
newLOL = []
for listObj in myLOL:
listObj = listObj.replace('[', '').replace(']', '').replace("'", '').split(',')
newListObj = [int(listObj[0]), float(listObj[1])]
newLOL.append(newListObj)
Iterates through your current list, peels the string apart into a list by replace un-wanted string chracters and utilizing a split on the comma. Then we take the modified list object and create another new list object with the values being the respective ints and floats. We then append the prepared newListObj to the newLOL list. Considering you want an actual set of lists within your list. Your previously documented input list actually contains strings, which look like lists.
This is a very strange format and the best solution is likely to change the code which generates that.
That being said, you can use ast.literal_eval to safely evaluate the elements of the list as Python tokens:
>>> lit = ast.literal_eval
>>> [[lit(str_val) for str_val in lit(str_list)] for str_list in myLOL]
[[1466279297, 703.0], [1466279287, 702.0], [1466279278, 702.0], [1466279268, 706.0], [1466279258, 713.0]]
We need to do it twice - once to turn the string into a list containing two strings, and then once per resulting string to convert it into a number.
Note that this will succeed even if the strings contain other valid tokens. If you want to validate the format too, you'd want to do something like:
>>> def process_str_list(str_list):
... l = ast.literal_eval(str_list)
... if not isinstance(l, list):
... raise TypeError("Expected list")
... str_int, str_float = l
... return [int(str_int), float(str_float)]
...
>>> [process_str_list(str_list) for str_list in myLOL]
[[1466279297, 703.0], [1466279287, 702.0], [1466279278, 702.0], [1466279268, 706.0], [1466279258, 713.0]]
Your input consists of a list of strings, where each string is the string representation of a list. The first task is to convert the strings back into lists:
import ast
lol2 = map(ast.literal_eval, mylol) # [['1466279297', '703.0'], ...]
Now, you can simply get int and float values from lol2:
newlol = [[int(a[0]), float(a[1])] for a in lol2]

Updating dictionary - Python

total=0
line=input()
line = line.upper()
names = {}
(tag,text) = parseLine(line) #initialize
while tag !="</PLAY>": #test
if tag =='<SPEAKER>':
if text not in names:
names.update({text})
I seem to get this far and then draw a blank.. This is what I'm trying to figure out. When I run it, I get:
ValueError: dictionary update sequence element #0 has length 8; 2 is required
Make an empty dictionary
Which I did.
(its keys will be the names of speakers and its values will be how many times s/he spoke)
Within the if statement that checks whether a tag is <SPEAKER>
If the speaker is not in the dictionary, add him to the dictionary with a value of 1
I'm pretty sure I did this right.
If he already is in the dictionary, increment his value
I'm not sure.
You are close, the big issue is on this line:
names.update({text})
You are trying to make a dictionary entry from a string using {text}, python is trying to be helpful and convert the iterable inside the curly brackets into a dictionary entry. Except the string is too long, 8 characters instead of two.
To add a new entry do this instead:
names.update({text:1})
This will set the initial value.
Now, it seems like this is homework, but you've put in a bit of effort already, so while I won't answer the question I'll give you some broad pointers.
Next step is checking if a value already exists in the dictionary. Python dictionaries have a get method that will retrieve a value from the dictionary based on the key. For example:
> names = {'romeo',1}
> print names.get('romeo')
1
But will return None if the key doesn't exist:
> names = {'romeo',1}
> print names.get('juliet')
None
But this takes an optional argument, that returns a different default value
> names = {'romeo',2}
> print names.get('juliet',1)
1
Also note that your loop as it stands will never end, as you only set tag once:
(tag,text) = parseLine(line) #initialize
while tag !="</PLAY>": #test
# you need to set tag in here
# and have an escape clause if you run out of file
The rest is left as an exercise for the reader...

Python3 access previously created object

Im new to programming in general and I need some help for accessing a previously created instance of Class. I did some search on SO but I could not find anything... Maybe it's just because I should not try to do that.
for s in servers:
c = rconprotocol.Rcon(s[0], s[2],s[1])
t = threading.Thread(target=c.connect)
t.start()
c.messengers(allmessages, 10)
Now, what can I do if I want to call a function on "c" ?
Thanks, Hugo
You're creating several different objects that you briefly name c as you go through the loop. If you want to be able to access more than the last of them, you'll need to save them somewhere that won't be overwritten. Probably the best approach is to use a list to hold the successive values, but depending on your specific needs another data structure might make sense too (for instance, using a dictionary you could look up each value by a specific key).
Here's a trivial adjustment to your current code that will save the c values in a list:
c_list = []
for s in servers:
c = rconprotocol.Rcon(s[0], s[2],s[1])
t = threading.Thread(target=c.connect)
t.start()
c.messengers(allmessages, 10)
c_list.append(c)
Later you can access any of the c values with c_list[index], or by iterating with for c in c_list.
A slightly more Pythonic version might use a list comprehension rather than append to create the list (this also shows what a loop over c_list later one might look like):
c_list = [rconprotocol.Rcon(s[0], s[2],s[1]) for s in servers]
for c in c_list:
t = threading.Thread(target=c.connect)
t.start()
c.messengers(allmessages, 10)

Python whois like function

Okay so I have a file called 'whois.txt' which contains
["96363612", "#a2743, coil, charge"]
["12101258", "#a0272, climate, vault"]
["83157521", "sith"]
["33907120", "#a1321, missile, wired"]
["55553768", "#a2722, legal, illegal"]
["22686400", "#a5619, mindless, #a5637, bank"]
["97436430", "jedi, #a5770, charge, lantern, #a9491, legal"]
["91645905", "sith"]
["89514799", "lantern, #a2563, #a2693"]
["19658307", "Umbrechu"]
["56112504", "#a0473, lantern, kryptonian"]
["12195491", "riyoken"]
["53281943", "#a5135, gateway, jedi"]
["76515035", "#a4023, gateway, wired"]
["79444876", "#a2716, loyalty"]
What I'm doing here is using json and using the first numbers as an ID and the accounts that are associated with the ID are linked by ', '. So using python I am using this code to try to get all the accounts that are associated
def getWhois(self):
x = []
f = open('whois.txt','r')
for line in f.readlines():
rid,names = json.loads(line.strip())
x.append([rid,names])
return x
def recvWhois(self,user):
returned = self.getWhois()
x = []
for data in returned:
rid,names = data[0],data[1]
if user in names:
x.append(names)
matches = list(set(', '.join(x).split(', ')))
return matches
So what that is doing is getting the matches of a user you are searching but I want to search the users in those matches also, I have done this but It feels Like I would have to do this an infinite amount of times of researching matches that are pulled so if I were to do self.recvWhois('missile') It would pull "['missile', 'wired', '#a1321']" I would then try to search all of those accounts to link more, and by now you probably see my problem because I would have to do that x amount of times depending on how many matches there are linked to the previous matched accounts If any of you have a solution to my problem it would be very appreciated.
First i would suggest to maintain an index for searching. You could use a search engine but a python map can also serve as a poor man's search engine. So idea is to have an inverted index where the usernames points to records to which they belong. For searching all linked accounts you can write a memoized recursive function which will cut down the infinite recursive paths. Also in case you have large no. of records you can limit recursion to a predefined maximum level.
It is really hard to tell what you are trying to do, but I think you are making it too complicated. Your data structure lends itself to a dictionary. Why not load it using rid as the key and names as the values?

Resources