I have used "\n" for string concatenation in my Python code, but it isn't working: the literal "\n" shows up along with the content in the list data.
data = []
context_data = [('What is the available storage capability', '124578'), ('what is the available budget set', '12587'), ('what is the available budget set', '12587')]
for part in context_data:
    s = "User : " + part[0] + " \nUlta : " + part[1]
    data.append(s)
print(data)
['User : What is the available storage capability \nUlta : 124578', 'User : what is the available budget set \nUlta : 12587', 'User : what is the available budget set \nUlta : 12587']
It is working; the newline really is in the string. When you print a list, Python shows the repr of each element, so the newline appears as the escape sequence \n. Print the strings themselves instead of the list and the newline renders.
Code:
context_data = [('What is the available storage capability', '124578'), ('what is the available budget set', '12587'), ('what is the available budget set', '12587')]
for part in context_data:
    s = "User : " + part[0] + " \nUlta : " + part[1]
    print(s)
Output:
User : What is the available storage capability
Ulta : 124578
User : what is the available budget set
Ulta : 12587
User : what is the available budget set
Ulta : 12587
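The difference can be reproduced in isolation. This sketch, using one hypothetical question/answer pair in the same format as above, shows that the newline is genuinely stored in the string and that only the list's repr displays it escaped:

```python
data = []
# One hypothetical question/answer pair, mirroring the format above
for question, answer in [("What is the available storage capability", "124578")]:
    data.append("User : " + question + " \nUlta : " + answer)

# Printing the list shows each string's repr, so the newline is escaped:
print(data)

# Printing the strings themselves renders the newline:
print("\n".join(data))
```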
We have code that reads some electricity meter data, which we want to push to BigQuery so it can be visualized in Data Studio. We tried using a Cloud Function, but the code generates streaming data and the Cloud Function times out, so this may not be the right use case for Cloud Functions.
from datetime import datetime

from pyemvue.enums import Scale, Unit

# `vue` is assumed to be an authenticated pyemvue.PyEmVue instance
def test():
    def print_recursive(usage_dict, info, depth=0):
        for gid, device in usage_dict.items():
            for channelnum, channel in device.channels.items():
                name = channel.name
                if name == 'Main':
                    name = info[gid].device_name
                d = datetime.now()
                t = d.strftime("%x") + ' ' + d.strftime("%X")
                print(d.strftime("%x"), d.strftime("%X"))
                res = {'Gid': gid,
                       'ChannelNumber': channelnum[0],
                       'Name': channel.name,
                       'Usage': channel.usage,
                       'unit': 'kwh',
                       'Timestamp': t
                       }
                global resp
                resp = res
                print(resp)
                return resp

    devices = vue.get_devices()
    deviceGids = []
    info = {}
    for device in devices:
        if not device.device_gid in deviceGids:
            deviceGids.append(device.device_gid)
            info[device.device_gid] = device
        else:
            info[device.device_gid].channels += device.channels

    device_usage_dict = vue.get_device_list_usage(deviceGids=deviceGids,
                                                  instant=datetime.utcnow(),
                                                  scale=Scale.SECOND.value,
                                                  unit=Unit.KWH.value)
    print_recursive(device_usage_dict, info)
This generates electricity consumption data in real time.
Can anyone suggest which GCP service would be ideal here? Based on my research it seems Pub/Sub => BigQuery. My question is: can we programmatically ingest data into Pub/Sub? If yes, what are the prerequisites?
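Yes, the google-cloud-pubsub client library lets you publish from Python. Below is a minimal sketch, assuming a hypothetical project my-gcp-project and topic meter-readings (the topic must already exist and application-default credentials must be configured); the payload-building part mirrors the res dict in the code above:

```python
import json
from datetime import datetime, timezone

def build_payload(gid, channelnum, name, usage):
    # Serialize one meter reading; Pub/Sub message bodies are bytes.
    record = {
        "Gid": gid,
        "ChannelNumber": channelnum,
        "Name": name,
        "Usage": usage,
        "unit": "kwh",
        "Timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record).encode("utf-8")

def publish_reading(payload, project_id="my-gcp-project", topic_id="meter-readings"):
    # Requires the google-cloud-pubsub package and GCP credentials;
    # project_id and topic_id here are placeholders.
    from google.cloud import pubsub_v1
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)
    future = publisher.publish(topic_path, data=payload)
    return future.result()  # blocks until the message is accepted

payload = build_payload("12345", 1, "Main", 0.0042)
```

From there a BigQuery subscription (or a Dataflow template) can move the messages into a table without a Cloud Function timing out.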
I wrote a script to import the characterization factors of the LCIA method EF 3.0 (adapted) into Brightway. I think it works fine, as I see the right characterization factors in the Activity Browser (e.g. for the Climate Change method), but when I run calculations with the method, the results are not the same as in Simapro (where I got the CSV import file from); for instance, the result is 0 for the Climate Change method. Do you know what the issue could be?
It seems that the units are different, but that is also the case for the other methods already available in Brightway.
Besides, I saw on another question that a method to import EF 3.0 would be implemented; is it available yet?
Thank you very much for your help.
Code of the importation script :
import brightway2 as bw
import csv
import uuid
from bw2data import mapping
from bw2data.utils import recursive_str_to_unicode


class import_method_EF:
    '''Class for importing the EF method from a Simapro export CSV file to Brightway.'''

    def __init__(
        self,
        project_name,
        name_file,
    ):
        self.project_name = project_name
        self.name_file = name_file
        self.db_biosphere = bw.Database('biosphere3')
        # Dictionary mapping the Simapro categories to the ecoinvent categories
        self.dict_categories = {'high. pop.': 'urban air close to ground',
                                'low. pop.': 'low population density, long-term',
                                'river': 'surface water',
                                'in water': 'in water',
                                '(unspecified)': '',
                                'ocean': 'ocean',
                                'indoor': 'indoor',
                                'stratosphere + troposphere': 'lower stratosphere + upper troposphere',
                                'low. pop., long-term': 'low population density, long-term',
                                'groundwater, long-term': 'ground-, long-term',
                                'agricultural': 'agricultural',
                                'industrial': 'industrial',
                                }
        # Dictionary of the ecoinvent unit abbreviations
        self.dict_units = {'kg': 'kilogram',
                           'kWh': 'kilowatt hour',
                           'MJ': 'megajoule',
                           'p': 'p',
                           'unit': 'unit',
                           'km': 'kilometer',
                           'my': 'meter-year',
                           'tkm': 'ton kilometer',
                           'm3': 'cubic meter',
                           'm2': 'square meter',
                           'kBq': 'kilo Becquerel',
                           'm2a': 'm2a',  # to be modified
                           }

    def importation(self):
        """
        Makes the importation from the Simapro CSV file to Brightway.
        """
        # Set the current project
        bw.projects.set_current(self.project_name)
        self.data = self.open_CSV(self.name_file, [])
        list_methods = []
        new_flows = []
        for i in range(len(self.data)):
            if self.data[i] == ['Name']:
                name_method = self.data[i+1][0]
            if self.data[i] == ['Impact category']:
                list_flows = []
                j = 4
                while len(self.data[i+j]) > 1:
                    biosphere_code = self.get_biosphere_code(self.data[i+j][2], self.data[i+j][1], self.data[i+j][0].lower())
                    if biosphere_code == 0:
                        if self.find_if_already_new_flow(i+j, new_flows)[0]:
                            code = self.find_if_already_new_flow(i+j, new_flows)[1]
                            list_flows.append((('biosphere3', code), float(self.data[i+j][4].replace(',', '.'))))
                        else:
                            code = str(uuid.uuid4())
                            while (self.db_biosphere.name, code) in mapping:
                                code = str(uuid.uuid4())
                            new_flows.append({'amount': float(self.data[i+j][4].replace(',', '.')),
                                              'CAS number': self.data[i+j][3],
                                              'categories': (self.data[i+j][0].lower(), self.dict_categories[self.data[i+j][1]]),
                                              'name': self.data[i+j][2],
                                              'unit': self.dict_units[self.data[i+j][5]],
                                              'type': 'biosphere',
                                              'code': code})
                            list_flows.append((('biosphere3', code), float(self.data[i+j][4].replace(',', '.'))))
                    else:
                        list_flows.append((('biosphere3', biosphere_code), float(self.data[i+j][4].replace(',', '.'))))
                    j += 1
                list_methods.append({'name': self.data[i+1][0],
                                     'unit': self.data[i+1][1],
                                     'flows': list_flows})
        new_flows = recursive_str_to_unicode(dict([self._format_flow(flow) for flow in new_flows]))
        if new_flows:
            print('new flows :', len(new_flows))
            self.new_flows = new_flows
            biosphere = bw.Database(self.db_biosphere.name)
            biosphere_data = biosphere.load()
            biosphere_data.update(new_flows)
            biosphere.write(biosphere_data)
            print('biosphere_data :', len(biosphere_data))
        for i in range(len(list_methods)):
            method = bw.Method((name_method, list_methods[i]['name']))
            method.register(**{'unit': list_methods[i]['unit'],
                               'description': ''})
            method.write(list_methods[i]['flows'])
            print(method.metadata)
            method.load()

    def open_CSV(self, CSV_file_name, list_rows):
        '''
        Opens a CSV file and gets a list of the rows.
        :param: CSV_file_name = str, name of the CSV file (must be in the working directory)
        :param: list_rows = list, list to get the rows
        :return: list_rows = list, list of the rows
        '''
        # Open the CSV file and read it
        with open(CSV_file_name, 'rt') as csvfile:
            data = csv.reader(csvfile, delimiter=';')
            # Write every row in the list
            for row in data:
                list_rows.append(row)
        return list_rows

    def get_biosphere_code(self, simapro_name, simapro_cat, type_biosphere):
        """
        Gets the Brightway code of a biosphere process given in a Simapro format.
        :param: simapro_name = str, name of the biosphere process in a Simapro format.
        :param: simapro_cat = str, category of the biosphere process (ex: high. pop., river, etc.)
        :param: type_biosphere = str, type of the biosphere process (ex: Emissions to water, etc.)
        :return: 0 if the process is not found in biosphere, the code otherwise
        """
        if 'GLO' in simapro_name or 'RER' in simapro_name:
            simapro_name = simapro_name[:-5]
        if '/m3' in simapro_name:
            simapro_name = simapro_name[:-3]
        # Search in the biosphere database, depending on the category
        if simapro_cat == '':
            act_biosphere = self.db_biosphere.search(simapro_name, filter={'categories': (type_biosphere,)})
        else:
            act_biosphere = self.db_biosphere.search(simapro_name, filter={'categories': (type_biosphere, self.dict_categories[simapro_cat])})
        # Why did I do this? ...
        for act in act_biosphere:
            if simapro_cat == '':
                if act['categories'] == (type_biosphere,):
                    return act['code']
            else:
                if act['categories'] == (type_biosphere, self.dict_categories[simapro_cat]):
                    return act['code']
        return 0

    def _format_flow(self, cf):
        # TODO
        return (self.db_biosphere.name, cf['code']), {
            'exchanges': [],
            'categories': cf['categories'],
            'name': cf['name'],
            'type': ("resource" if cf["categories"][0] == "resource"
                     else "emission"),
            'unit': cf['unit'],
        }

    def find_if_already_new_flow(self, n, new_flows):
        """
        Checks whether the flow at row n has already been added to new_flows.
        """
        for k in range(len(new_flows)):
            if new_flows[k]['name'] == self.data[n][2]:
                return True, new_flows[k]['code']
        return False, 0
Edit: I made a modification in the get_biosphere_code method and it works better (it was not finding some biosphere flows), but I still have important differences between the results I get in Brightway and the results I get in Simapro. My investigations led me to the following observations:
there are some differences in ecoinvent activities, especially in the lists of biosphere flows (likely a source of differences in results): some are missing in Brightway, and also in the ecoSpold data used for the importation, compared to the data in Simapro
the LCA calculation does not seem to handle subcategories the same way: for example, the biosphere flow Carbon dioxide, fossil (air,) is in the list of characterization factors for the Climate Change method, and looking at the inventory in the Simapro LCA results, all the Carbon dioxide, fossil flows to air contribute to the Climate Change impact, no matter what their subcategory is. Brightway does not work this way and only takes into account flows that match exactly, which leads to important differences in the results.
In LCA there's no agreement on elementary flows and archetypical emission scenarios / context (https://doi.org/10.1007/s11367-017-1354-3), and implementations of the impact assessment methods differ (https://www.lifecycleinitiative.org/portfolio_category/lcia/).
It is not unusual that the same activity and the same impact assessment method return different results in different software. There are some attempts to improve current practice (see e.g. https://github.com/USEPA/LCIAformatter).
Trial 1:
result, data = mail.uid("STORE", str(message_id), "+X-GM-LABELS", '"\\Trash"')
Output:
BAD [b'Command Argument Error. 11']
Trial 2:
result, data = mail.uid('STORE', str(message_id) , '+FLAGS', '(\\Deleted)')
print("Deleted the mail : " , result ,"-", details_log[4])
result, data = mail.uid('EXPUNGE', str(message_id))
print("result",result)
print("data",data)
Output:
Deleted the mail : OK
result NO
data [b'EXPUNGE failed.']
Issue: after EXPUNGE, I even tried closing and logging out of the connection, but the mail still doesn't get deleted.
I know this post is old, but for anyone who reads it later on:
When using imaplib's select function to choose a mailbox to view (in my case, the "Inbox" mailbox), I had the readonly argument set to True to be safe, but this blocked me from deleting emails in Microsoft Outlook. I set it to False and was able to delete emails with the store and expunge methods:
conn.select("Inbox", readonly=False)
# modify search query as you see fit
typ, data = conn.search(None, "FROM", "scammer@whatever.com")
for email in data[0].split():
    conn.store(email, "+FLAGS", "\\Deleted")
conn.expunge()
conn.close()
conn.logout()
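For the UID-based flow in the question, here is a hedged sketch of the same approach (host, credentials, and sender address are placeholders). Note that plain EXPUNGE is the portable call; UID EXPUNGE requires the server to advertise UIDPLUS, which is a likely cause of the b'EXPUNGE failed.' response above:

```python
import imaplib

def uids_from_search(search_data):
    # IMAP SEARCH returns a list like [b"1 2 3"]; split it into
    # the individual uids.
    return search_data[0].split() if search_data and search_data[0] else []

def delete_from_sender(host, user, password, sender):
    # host/user/password/sender are placeholders; the key point is
    # readonly=False, without which STORE/EXPUNGE cannot modify the mailbox.
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    conn.select("Inbox", readonly=False)
    typ, data = conn.uid("SEARCH", None, "FROM", f'"{sender}"')
    for uid in uids_from_search(data):
        conn.uid("STORE", uid, "+FLAGS", "(\\Deleted)")
    conn.expunge()  # plain EXPUNGE; UID EXPUNGE needs UIDPLUS support
    conn.close()
    conn.logout()
```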
This is my first post, I've been lurking for a while.
Some context to my question;
I'm working with the Stripe API to pull transaction data and match these with booking numbers from another API source. (property reservations --> funds received for reconciliation)
I started by just making calls to the API and sorting the data in place using Python 3, but it started to get very complicated, so I thought I should persist the data in a MongoDB running on localhost. I began to do this, but decided that storing the sorted data was still just as complicated and the request times were getting quite long; maybe I should pull all the Stripe data, store it locally, and then query whatever I need.
So here I am, with a bunch of code I've written for both and still not a lot of progress. I'm a bit lost about the next move and feel like I should pick a path and stick with it. I'm a little unsure what "best practice" is when working with APIs; usually I would turn to YouTube, but I haven't found a video that covers this specific scenario. The amount of data being pulled from the API would be around 100 kB per request.
Here is the original code, which would grab each query. Recently I've learnt I can use the expand feature (I think that's what it's called) so I don't need to dig down so many levels in my for loop.
The goal was to get just the metadata, which contains the booking reference numbers that can then be matched against a response from my property management system's API. My code is a bit embarrassing; I've kind of just learnt it over the last little while in my downtime from work.
import csv
import datetime
import os
import pymongo
import stripe

"""
We need to find a Valid reservation_ref or reservation_id in the booking.com
Metadata. Then we need to match this to a property ID from our list of
properties in the book file.
"""

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
stripe_payouts = mydb["stripe_payouts"]

stripe.api_key = "sk_live_thisismyprivatekey"
r = stripe.Payout.list(limit=4)
payouts = []

for data in r['data']:
    if data['status'] == 'paid':
        p_id = data['id']
        amount = data['amount']
        meta = []
        txn = stripe.BalanceTransaction.list(payout=p_id)
        amount_str = str(amount)
        amount_dollar = str(amount / 100)
        txn_len = len(txn['data'])
        for x in range(txn_len):
            if x != 0:
                charge = (txn['data'][x]['source'])
                if charge.startswith("ch_"):
                    meta_req = stripe.Charge.retrieve(charge)
                    meta = list(meta_req['metadata'])
                elif charge.startswith("re_"):
                    meta_req = stripe.Refund.retrieve(charge)
                    meta = list(meta_req['metadata'])
        if stripe_payouts.find({"_id": p_id}).count() == 0:
            payouts.append(
                {
                    "_id": str(p_id),
                    "payout": str(p_id),
                    "transactions": txn['data'],
                    "metadata": {
                        charge: [meta]
                    }
                }
            )

# TODO: Add error exception to check for po id already in the database.
if len(payouts) != 0:
    x = stripe_payouts.insert_many(payouts)
    print("Inserted into Database ", len(x.inserted_ids), x.inserted_ids)
else:
    print("No entries made")
"_id": str(p_id),
"payout": str(p_id),
"transactions": txn['data'],
"metadata": {
charge: [meta]
This last section doesn't work properly; this is about where I stopped and started pulling all the data and storing it in MongoDB locally.
I appreciate it if you've read this wall of text this far.
Thanks
EDIT:
I'm unsure what the best practice is for adding additional information, but I've adjusted the code below per the answer given. I'm now getting a KeyError when trying to insert the entries into the database. I feel like it's duplicating keys somehow.
payouts = []

def add_metadata(payout_id, transaction_type):
    transactions = stripe.BalanceTransaction.list(payout=payout_id, type=transaction_type, expand=['data.source'])
    for transaction in transactions.auto_paging_iter():
        meta = [transaction.source.metadata]
        if stripe_payouts.Collection.count_documents({"_id": payout_id}) == 0:
            payouts.append(
                {
                    transaction.id: transaction
                }
            )

for data in r['data']:
    p_id = data['id']
    add_metadata(p_id, 'charge')
    add_metadata(p_id, 'refund')

# TODO: Add error exception to check for po id already in the database.
if len(payouts) != 0:
    x = stripe_payouts.insert_many(payouts)
    # print(payouts)
    print("Inserted into Database ", len(x.inserted_ids), x.inserted_ids)
else:
    print("No entries made")
To answer your high level question. If you're frequently accessing the same data and that data isn't changing much then it can make sense to try to keep your local copy of the data in sync and make your frequent queries against your local data.
No need to be embarrassed by your code :) we've all been new at something at some point.
Looking at your code I noticed a few things:
Rather than fetching all payouts and then using an if statement to skip everything except paid ones, you can pass a filter to query only paid payouts:
r = stripe.Payout.list(limit=4, status='paid')
You mentioned the expand [B] feature of the API, but didn't use it, so I wanted to share how you can do that here with an example. In this case, you're making 1 API call to get the list of payouts, then 1 API call per payout to get the transactions, then 1 API call per charge or refund to get its metadata. That's 1 + n + (n × m) calls for n payouts with m charges or refunds each, which is a pretty big number. To cut this down, let's pass expand=['data.source'] when fetching transactions, which will include all of the metadata about the charge or refund along with the transaction.
transactions = stripe.BalanceTransaction.list(payout=p_id, expand=['data.source'])
Fetching the BalanceTransaction list like this will only work as long as your results fit on one "page" of results. The API returns paginated [A] results, so if you have more than 10 transactions per payout, this will miss some. Instead, you can use an auto-pagination feature of the stripe-python library to iterate over all results from the BalanceTransaction list.
for transaction in transactions.auto_paging_iter():
I'm not quite sure why we're skipping over index 0 with if x != 0: so that may need to be addressed elsewhere :D
I didn't see how or where amount_str or amount_dollar was actually used.
Rather than determining the type of the object by checking the ID prefix like ch_ or re_ you'll want to use the type attribute. Again in this case, it's better to filter by type so that you only get exactly the data you need from the API:
transactions = stripe.BalanceTransaction.list(payout=p_id, type='charge', expand=['data.source'])
I'm unable to test because I lack the same database that you have, but wanted to share a refactoring of your code that you may consider.
r = stripe.Payout.list(limit=4, status='paid')
payouts = []

for data in r['data']:
    p_id = data['id']
    amount = data['amount']
    meta = []
    amount_str = str(amount)
    amount_dollar = str(amount / 100)

    transactions = stripe.BalanceTransaction.list(payout=p_id, type='charge', expand=['data.source'])
    for transaction in transactions.auto_paging_iter():
        meta = list(transaction.source.metadata)
    if stripe_payouts.find({"_id": p_id}).count() == 0:
        payouts.append(
            {
                "_id": str(p_id),
                "payout": str(p_id),
                "transactions": transactions,
                "metadata": {
                    charge: [meta]
                }
            }
        )

    transactions = stripe.BalanceTransaction.list(payout=p_id, type='refund', expand=['data.source'])
    for transaction in transactions.auto_paging_iter():
        meta = list(transaction.source.metadata)
    if stripe_payouts.find({"_id": p_id}).count() == 0:
        payouts.append(
            {
                "_id": str(p_id),
                "payout": str(p_id),
                "transactions": transactions,
                "metadata": {
                    charge: [meta]
                }
            }
        )

# TODO: Add error exception to check for po id already in the database.
if len(payouts) != 0:
    x = stripe_payouts.insert_many(payouts)
    print("Inserted into Database ", len(x.inserted_ids), x.inserted_ids)
else:
    print("No entries made")
Here's a further refactoring using functions defined to encapsulate just the bit adding to the database:
r = stripe.Payout.list(limit=4, status='paid')
payouts = []

def add_metadata(payout_id, transaction_type):
    transactions = stripe.BalanceTransaction.list(payout=payout_id, type=transaction_type, expand=['data.source'])
    for transaction in transactions.auto_paging_iter():
        meta = list(transaction.source.metadata)
    if stripe_payouts.find({"_id": payout_id}).count() == 0:
        payouts.append(
            {
                "_id": str(payout_id),
                "payout": str(payout_id),
                "transactions": transactions,
                "metadata": {
                    charge: [meta]
                }
            }
        )

for data in r['data']:
    p_id = data['id']
    add_metadata(p_id, 'charge')
    add_metadata(p_id, 'refund')

# TODO: Add error exception to check for po id already in the database.
if len(payouts) != 0:
    x = stripe_payouts.insert_many(payouts)
    print("Inserted into Database ", len(x.inserted_ids), x.inserted_ids)
else:
    print("No entries made")
[A] https://stripe.com/docs/api/pagination
[B] https://stripe.com/docs/api/expanding_objects
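On the KeyError in the EDIT: each appended document reuses the payout's _id, and charge is never defined inside the function, so inserts either collide or crash. A hedged sketch of one way to assemble a single document per payout, keyed by source id; the sample transactions below are hypothetical, shaped like expanded BalanceTransactions:

```python
def build_payout_doc(payout_id, transactions):
    # transactions: list of dicts shaped like expanded BalanceTransactions,
    # i.e. each has a "source" dict with an "id" and its "metadata".
    # Group metadata by source id so the document has one unique _id.
    metadata = {}
    for txn in transactions:
        source = txn["source"]
        metadata[source["id"]] = source.get("metadata", {})
    return {
        "_id": str(payout_id),
        "payout": str(payout_id),
        "transactions": transactions,
        "metadata": metadata,
    }

# Hypothetical expanded transactions for illustration:
sample = [
    {"id": "txn_1", "source": {"id": "ch_1", "metadata": {"reservation_ref": "R-100"}}},
    {"id": "txn_2", "source": {"id": "re_1", "metadata": {"reservation_ref": "R-101"}}},
]
doc = build_payout_doc("po_123", sample)
```

With one document per payout, a pymongo upsert (update_one with upsert=True) would also avoid the pre-insert count check entirely.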
I have a CSV file and I'm trying to convert it into a corpus so I can use tm_map later and then apply some clustering.
I read the file
data <- read.csv("data.csv", header = TRUE, sep = ",",stringsAsFactors = FALSE)
Turn what I need into corpus
corp <- Corpus(VectorSource(data$text))
This is the outcome for the metadata
> meta(corp[[1]])
author : character(0)
datetimestamp: 2019-09-20 20:48:45
description : character(0)
heading : character(0)
id : 1
language : en
origin : character(0)
Then I try to add the author info, so I can add the date and title afterwards, like this
> for(i in 1:length(corp)) {
+ corp[[i]]$meta$author == data$author[i]
+ }
but I keep on getting this
> abstract[[1]]$meta$author
character(0)
> meta(abstract[[1]], tag = 'author')
character(0)
when
> data$author[1]
[1] "Juan Vásquez Córdoba"
How can I add the right metadata info to my Corpus?
I found the answer: the corpus object must be created this way:
corp <- VCorpus(VectorSource(data$text))
With the V (VCorpus instead of Corpus), everything works out.