Add/Update Firebase Using a node.js Script

I have arbitrary JSON that is sensibly laid out like this:
[
  {
    "id": 100,
    "name": "Buckeye, AZ",
    "status": "OPEN",
    "address": {
      "street": "416 S Watson RD",
      "city": "Buckeye"
      ...
    }
  }
]
I've written a node.js script like this as a proof of concept (the reason I'm using node is that the JS API seems better supported than REST or Ruby for this; I could be wrong):
http = require('http')
Firebase = require('firebase')

all_sites_url = "http://supercharge.info/service/supercharge/allSites"
firebase_url = "https://tesla-supercharger.firebaseio.com/"

http.get(all_sites_url, (res) ->
  body = ""
  res.on "data", (chunk) ->
    body += chunk
    return
  res.on "end", ->
    response = JSON.parse(body)
    all_sites = response
    send_to_firebase(response)
    return
  return
).on "error", (e) ->
  console.log "Got error: ", e
  return

send_to_firebase = (response) ->
  firebase_ref = new Firebase(firebase_url)
  for charger in response
    console.log charger
    new_child = firebase_ref.push()
    new_child.set {id: charger.id, data: charger}, (error) ->
      if error
        console.log "Data could not be saved #{error}"
      else
        console.log "Data saved successfully"
The result is a node keyed by a unique id generated by Firebase, with an id child and a data child. The data child has the expected information like name, status, etc.
What I'd prefer is to generate a key-value pair. E.g., for an id of 100:
- 100
  - name
  - address
    - street
    - city
    - etc.

So my first question is how to accomplish this, or whether it is even sensible.
After the first run, this data (call it the data from the external server) will already be in Firebase, and a mobile app will have added some fields of its own that are not present in the server data. The next time I fetch data from the external server, I want to update the things that have changed that the server would know about, like status, without tampering with things only the mobile devices would know about, like remote_observations.
I know I'm seeming a bit dense here, but I'm trying to put together a sensible data model that will be updatable from that server using a CRON job and incrementally updatable from a bunch of mobile devices.
Any help is much appreciated.
UPDATE: I have found that this works for getting the structure I want:
send_to_firebase = (response) ->
  firebase_ref = new Firebase(firebase_url)
  for charger in response
    firebase_ref.child(charger.id).update charger, (error) ->
      if error
        console.log "Data could not be saved #{error}"
      else
        responses_pending += 1
        console.log "Data saved successfully : #{responses_pending} pending"
  firebase_ref.on 'value', ->
    console.log "value received rp=#{responses_pending}"
    process.exit() if (responses_pending -= 1) < 1

So the code I settled on is this:
http = require('http')
Firebase = require('firebase')

firebase_url = '/path/to/your/firebase'

# code to get JSON of the form:
# {
#   "id":100,
#   "name":"Buckeye, AZ",
#   "status":"OPEN",
#   "address":{"street":"416 S Watson RD",
#              "city":"Buckeye",
#              "state":"AZ",
#              "zip":"85326",
#              "country":"USA"},
#   ... etc.
# }

# Asynchronous get of JSON hash from some server or other.
get_my_fine_JSON().on 'complete', (response) ->
  send_to_firebase(response)

send_to_firebase = (response) ->
  firebase_ref = new Firebase(firebase_url)
  length = response.length
  for charger in response
    firebase_ref.child(charger.id).update charger, (error) ->
      if error
        console.log "Data could not be saved #{error}"
      else
        console.log "Data saved successfully"
      process.exit() if (length -= 1) is 0
Discussion:
The idea was to have a Firebase structure like this:
- 100
  - address
    - street: "123 Main Street"
    - etc.
That's reason 1 why id is pulled up to be the primary key. Reason 2 is so that I can uniquely identify an object pulled off the external server as the "same" one in my Firebase and apply any updates necessary.
Epiphany 1: Update is more like upsert. If the key is there, whatever hash you supply replaces matching values. If it's not there, then Firebase happily adds it. Which is way cool because it covers both the push and patch cases.
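To make Epiphany 1 concrete, here is a minimal sketch of the same merge-or-create behaviour using the Firebase Admin SDK for Python (a different SDK than the legacy JS client used above); the "chargers" path, the keys, and the credentials file are placeholders, only the database URL is taken from the post:

import firebase_admin
from firebase_admin import credentials, db

# Placeholder service-account file; the databaseURL is the one from the post.
cred = credentials.Certificate("service-account.json")
firebase_admin.initialize_app(cred, {"databaseURL": "https://tesla-supercharger.firebaseio.com/"})

chargers = db.reference("chargers")

# First run: /chargers/100 does not exist yet, so update() creates it.
chargers.child("100").update({"name": "Buckeye, AZ", "status": "OPEN"})

# A mobile client later adds a field the cron job knows nothing about.
chargers.child("100").update({"remote_observations": "busy at noon"})

# Next cron run: update() only touches the keys it is given,
# so remote_observations survives while status is refreshed.
chargers.child("100").update({"status": "CLOSED"})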
Epiphany 2: This process will hang waiting for events if nothing tells it to stop. That's why the countdown index, length, is decremented until the code has upserted (for lack of a better term) each item.
Observation 1: Doing this in node.js is super fast compared with REST using Python or Ruby. And this upsert stuff is wicked cool if I'm understanding it right.
Observation 2: There isn't a ton of wisdom out there as of this writing regarding writing node shell scripts to do this kind of stuff. Maybe it's a good idea, maybe a bad one. I don't know.
Observation 3: Because of the asynchronous nature of node and the Firebase Javascript API (both GOOD THINGs), terminating a process before the last bit is done can be tricky because your process has to hang on just long enough to complete its last request/response with Firebase. This is, as mentioned before, done in the completion handler of the update. Otherwise we wouldn't necessarily be complete when the process exited.
Caveat 1: Related to observation 2, this could be a bad idea, but I haven't been able to find resources that speak to the problem.
Caveat 2: This could be a horrid abuse or misunderstanding of the Firebase update API. I am reporting observed behavior in the limited case of my specific data. YMMV.
Caveat 3: I'm hoping the process lifetime is as I suggest it is in observation 3.
A note to the decaffeinated: The Javascript for this is so trivially different that it shouldn't be too tough to translate. Or go to js2coffee and paste the Coffeescript into the right pane to get real Javascript in the left pane that you can tune.

Related

Elastic Search via python gives wrong count

I'm new to Python and I need to connect to Kibana via Python; we're using Kibana 7.4.1. The requirement is just to get the count of hits.
Due to some restrictions I need to use Python 3.6 only. I've added the elasticsearch and elasticsearch-dsl libraries.
I'm able to connect to Kibana via the client, but I'm getting the wrong hits count.
Code:
from elasticsearch import Elasticsearch
from elasticsearch_dsl import MultiSearch, Search
from elasticsearch_dsl.query import QueryString, Range, SimpleQueryString
es = Elasticsearch(['host2', 'host2'], http_auth=('usr', 'pass'), port=9200)
s = Search(using=es, index='c*')
s.filter(SimpleQueryString(query="tags:prod AND severity:INFO AND service: finder AND msg:* is processed"))
s.filter(Range(** {'#timestamp': {'gte': 'now-5m', 'lt': 'now'}}))
response = s.execute()
print("Got %d Hits:" % response['hits']['total']['value']) # Always coming as 1000 so this is wrong
Can I get some help with this, please?
First of all, a little clarification: you are connecting to Elasticsearch, not Kibana (Kibana is a client, like the program you are writing).
You are always getting 10,000 as the result because your index has more than 10,000 matching hits and, by default, Elasticsearch stops counting the total at 10,000. This is documented behavior: since computing the exact count is expensive in the general case, it is only done when requested. To obtain the right number of results you have two possibilities:
- set the query parameter track_total_hits to true
- use the count API.
track_total_hits
You can add this extra parameter to the search object as follows:
s = Search(using=es, index='c*')
s = s.extra(track_total_hits=True)
<the rest of your code>
Count API approach
Instead of invoking the execute() function, you can simply use the count() function. Note that filter() returns a new Search object, so the result has to be assigned back to s (this applies to the code in the question as well):
s = Search(using=es, index='c*')
s = s.filter(SimpleQueryString(query="tags:prod AND severity:INFO AND service: finder AND msg:* is processed"))
s = s.filter(Range(**{'#timestamp': {'gte': 'now-5m', 'lt': 'now'}}))
response = s.count()
print("Got %d Hits:" % response)
Kind regards

When working with the Stripe API, is it better to sort each request or store locally and perform queries?

This is my first post, I've been lurking for a while.
Some context to my question:
I'm working with the Stripe API to pull transaction data and match these with booking numbers from another API source (property reservations --> funds received for reconciliation).
I started by just making calls to the API and sorting the data in place using Python 3, but it started to get very complicated, so I thought I should persist the data in a MongoDB running on localhost. I began to do this, but decided that storing the sorted data was still just as complicated and the request times were getting quite long, so I thought maybe I should pull all the Stripe data, store it locally, and then query whatever I needed.
So here I am, with a bunch of code I've written for both approaches and still not a lot of progress. I'm a bit lost about the next move. I feel like I should probably pick a path and stick with it. I'm a little unsure what best practice is when working with APIs; usually I would turn to YouTube, but I haven't been able to find a video that covers this specific scenario. The amount of data being pulled from the API would be around 100 kB per request.
Here is the original code, which grabs each query. Recently I've learnt I can use the expand feature (I think that's what it's called) so I don't need to dig down so many levels in my for loop.
The goal was to get just the metadata, which contains the booking reference numbers that can then be matched against a response from my property management system's API. My code is a bit embarrassing; I've kinda just learnt it over the last little while in my downtime from work.
import csv
import datetime
import os
import pymongo
import stripe

"""
We need to find a Valid reservation_ref or reservation_id in the booking.com Metadata. Then we need to match this to a property ID from our list of properties in the book file.
"""

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
stripe_payouts = mydb["stripe_payouts"]

stripe.api_key = "sk_live_thisismyprivatekey"
r = stripe.Payout.list(limit=4)
payouts = []

for data in r['data']:
    if data['status'] == 'paid':
        p_id = data['id']
        amount = data['amount']
        meta = []
        txn = stripe.BalanceTransaction.list(payout=p_id)
        amount_str = str(amount)
        amount_dollar = str(amount / 100)
        txn_len = len(txn['data'])

        for x in range(txn_len):
            if x != 0:
                charge = (txn['data'][x]['source'])
                if charge.startswith("ch_"):
                    meta_req = stripe.Charge.retrieve(charge)
                    meta = list(meta_req['metadata'])
                elif charge.startswith("re_"):
                    meta_req = stripe.Refund.retrieve(charge)
                    meta = list(meta_req['metadata'])

        if stripe_payouts.find({"_id": p_id}).count() == 0:
            payouts.append(
                {
                    "_id": str(p_id),
                    "payout": str(p_id),
                    "transactions": txn['data'],
                    "metadata": {
                        charge: [meta]
                    }
                }
            )

# TODO: Add error exception to check for po id already in the database.
if len(payouts) != 0:
    x = stripe_payouts.insert_many(payouts)
    print("Inserted into Database ", len(x.inserted_ids), x.inserted_ids)
else:
    print("No entries made")
"_id": str(p_id),
"payout": str(p_id),
"transactions": txn['data'],
"metadata": {
charge: [meta]
This last section doesn't work properly, this is kinda where I stopped and starting calling all the data and storing it in mongodb locally.
I appreciate it if you've read this wall of text this far.
Thanks
EDIT:
I'm unsure what the best practice is for adding additional information, but I've messed with the code below per the answer given. I'm now getting a KeyError when trying to insert the entries into the database. I feel like it's duplicating keys somehow.
payouts = []

def add_metadata(payout_id, transaction_type):
    transactions = stripe.BalanceTransaction.list(payout=payout_id, type=transaction_type, expand=['data.source'])
    for transaction in transactions.auto_paging_iter():
        meta = [transaction.source.metadata]
        if stripe_payouts.Collection.count_documents({"_id": payout_id}) == 0:
            payouts.append(
                {
                    transaction.id: transaction
                }
            )

for data in r['data']:
    p_id = data['id']
    add_metadata(p_id, 'charge')
    add_metadata(p_id, 'refund')

# TODO: Add error exception to check for po id already in the database.
if len(payouts) != 0:
    x = stripe_payouts.insert_many(payouts)
    # print(payouts)
    print("Inserted into Database ", len(x.inserted_ids), x.inserted_ids)
else:
    print("No entries made")
To answer your high-level question: if you're frequently accessing the same data and that data isn't changing much, then it can make sense to keep a local copy of the data in sync and run your frequent queries against your local data.
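For example, if you do go the local-copy route, one common pattern is to upsert each payout into MongoDB keyed on its Stripe id, so a repeated cron run updates the existing document instead of inserting a duplicate. A rough pymongo sketch (the database and collection names follow your snippet; the stored fields are just an example, not a prescription):

import pymongo
import stripe

stripe.api_key = "sk_live_..."  # placeholder

client = pymongo.MongoClient("mongodb://localhost:27017/")
stripe_payouts = client["mydatabase"]["stripe_payouts"]

# Upsert keyed on the payout id: inserts the first time a payout is seen,
# updates the same document on every later sync run.
for payout in stripe.Payout.list(limit=100, status="paid").auto_paging_iter():
    stripe_payouts.update_one(
        {"_id": payout.id},
        {"$set": {"amount": payout.amount, "arrival_date": payout.arrival_date, "status": payout.status}},
        upsert=True,
    )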
No need to be embarrassed by your code :) we've all been new at something at some point.
Looking at your code I noticed a few things:
Rather than fetching all payouts and then using an if statement to skip everything except paid ones, you can pass another filter so the API only returns paid payouts.
r = stripe.Payout.list(limit=4, status='paid')
You mentioned the expand [B] feature of the API but didn't use it, so I wanted to share how you can do that here with an example. In this case, you're making 1 API call to get the list of payouts, then 1 API call per payout to get the transactions, then 1 API call per charge or refund to get its metadata. That's roughly 1 + (n payouts) + (n × m charges or refunds) API calls, which adds up quickly. To cut this down, let's pass expand=['data.source'] when fetching transactions, which will include all of the metadata about the charge or refund along with the transaction.
transactions = stripe.BalanceTransaction.list(payout=p_id, expand=['data.source'])
Fetching the BalanceTransaction list like this will only work as long as your results fit on one "page" of results. The API returns paginated [A] results, so if you have more than 10 transactions per payout, this will miss some. Instead, you can use an auto-pagination feature of the stripe-python library to iterate over all results from the BalanceTransaction list.
for transaction in transactions.auto_paging_iter():
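To make the pagination point concrete, here is a small self-contained sketch of that call on its own (the API key and payout id are placeholders); auto_paging_iter() keeps fetching further pages behind the scenes until the listing is exhausted:

import stripe

stripe.api_key = "sk_live_..."  # placeholder

# List every charge-type balance transaction in one payout, letting the library
# follow pagination transparently, with the source charge expanded so its
# metadata is already present on each item.
transactions = stripe.BalanceTransaction.list(
    payout="po_XXXXXXXXXXXX",  # placeholder payout id
    type="charge",
    expand=["data.source"],
)
for transaction in transactions.auto_paging_iter():
    print(transaction.id, dict(transaction.source.metadata))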
I'm not quite sure why we're skipping over index 0 with if x != 0: so that may need to be addressed elsewhere :D
I didn't see how or where amount_str or amount_dollar was actually used.
Rather than determining the type of the object by checking the ID prefix like ch_ or re_ you'll want to use the type attribute. Again in this case, it's better to filter by type so that you only get exactly the data you need from the API:
transactions = stripe.BalanceTransaction.list(payout=p_id, type='charge', expand=['data.source'])
I'm unable to test because I lack the same database that you have, but wanted to share a refactoring of your code that you may consider.
r = stripe.Payout.list(limit=4, status='paid')
payouts = []

for data in r['data']:
    p_id = data['id']
    amount = data['amount']
    meta = []
    amount_str = str(amount)
    amount_dollar = str(amount / 100)

    transactions = stripe.BalanceTransaction.list(payout=p_id, type='charge', expand=['data.source'])
    for transaction in transactions.auto_paging_iter():
        meta = list(transaction.source.metadata)
        if stripe_payouts.find({"_id": p_id}).count() == 0:
            payouts.append(
                {
                    "_id": str(p_id),
                    "payout": str(p_id),
                    "transactions": transactions,
                    "metadata": {
                        transaction.source.id: [meta]  # key on the expanded charge id (the original charge variable no longer exists here)
                    }
                }
            )

    transactions = stripe.BalanceTransaction.list(payout=p_id, type='refund', expand=['data.source'])
    for transaction in transactions.auto_paging_iter():
        meta = list(transaction.source.metadata)
        if stripe_payouts.find({"_id": p_id}).count() == 0:
            payouts.append(
                {
                    "_id": str(p_id),
                    "payout": str(p_id),
                    "transactions": transactions,
                    "metadata": {
                        transaction.source.id: [meta]  # key on the expanded refund id
                    }
                }
            )

# TODO: Add error exception to check for po id already in the database.
if len(payouts) != 0:
    x = stripe_payouts.insert_many(payouts)
    print("Inserted into Database ", len(x.inserted_ids), x.inserted_ids)
else:
    print("No entries made")
Here's a further refactoring that uses a function to encapsulate just the bit that adds to the database:
r = stripe.Payout.list(limit=4, status='paid')
payouts = []

def add_metadata(payout_id, transaction_type):
    transactions = stripe.BalanceTransaction.list(payout=payout_id, type=transaction_type, expand=['data.source'])
    for transaction in transactions.auto_paging_iter():
        meta = list(transaction.source.metadata)
        if stripe_payouts.find({"_id": payout_id}).count() == 0:
            payouts.append(
                {
                    "_id": str(payout_id),
                    "payout": str(payout_id),
                    "transactions": transactions,
                    "metadata": {
                        transaction.source.id: [meta]  # key on the expanded charge/refund id
                    }
                }
            )

for data in r['data']:
    p_id = data['id']
    add_metadata(p_id, 'charge')
    add_metadata(p_id, 'refund')

# TODO: Add error exception to check for po id already in the database.
if len(payouts) != 0:
    x = stripe_payouts.insert_many(payouts)
    print("Inserted into Database ", len(x.inserted_ids), x.inserted_ids)
else:
    print("No entries made")
[A] https://stripe.com/docs/api/pagination
[B] https://stripe.com/docs/api/expanding_objects

Avoid memory leaks with promises and loop in coffee-script (no await)

I am currently trying to perform some operations using promises in a loop but I ended up with huge memory leaks.
My problem is exactly the one pointed out in this article, but unlike that author, I am writing in coffee-script (yes, with a hyphen, which means CoffeeScript 1.12 and not the latest version). Thus, I am not able to use the "await" keyword (this is an educated guess, since each time I try to use it I get an "await is not defined" error).
This is my original code (with memory leaks):
recursiveFunction: (next = _.noop) ->
  _data = @getSomeData()

  functionWithPromise(_data).then (_enrichedData) =>
    @doStuffWithEnrichedData(_enrichedData)
    @recursiveFunction()
  .catch (_err) =>
    @log.error _err.message
    @recursiveFunction()
So according to the article I linked, I would have to do something like this:
recursiveFunction: (next = _.noop) ->
  _data = @getSomeData()
  _enrichedData = await functionWithPromise(_data)
  @recursiveFunction()
But then again, I am stuck because I can't use the "await" keyword. What would be the best approach then?
EDIT:
Here is my real original code. What I am trying to achieve is a face-detection application. This function is located in a lib, and I am using a "Service" variable to expose variables between libs. To get frames from the webcam, I am using opencv4nodejs.
faceapi = require('face-api.js')
tfjs = require('@tensorflow/tfjs-node')
(...)

# Analyse the new frame
analyseFrame: (next = _.noop) ->
  # Skip if not capturing
  return unless Service.isCapturing

  # get frame
  _frame = Service.videoCapture.getFrame()

  # get frame date, and
  @currentFrameTime = Date.now()

  # clear old faces in history
  @refreshFaceHistory(@currentFrameTime)

  # convert frame to a tensor
  try
    _data = new Uint8Array(_frame.cvtColor(cv.COLOR_BGR2RGB).getData().buffer)
    _tensorFrame = tfjs.tensor3d(_data, [_frame.rows, _frame.cols, 3])
  catch _err
    @log.error "Error instantiating tensor !!!"
    @log.error _err.message

  # find faces on frames
  faceapi.detectAllFaces(_tensorFrame, @faceDetectionOptions).then (_detectedFaces) =>
    @log.debug _detectedFaces

    # fill face history with detected faces
    _detectedFaces = @fillFacesHistory(_detectedFaces)

    # draw boxes on image
    Service.videoCapture.drawFaceBoxes(_frame, _detectedFaces)

    # Get partial time
    Service.frameDuration = Date.now() - @currentFrameTime

    # write latency on image
    Service.videoCapture.writeLatency(_frame, Service.frameDuration)

    # show image
    Service.faceRecoUtils.showImage(_frame)

    # Call next
    _delayNextFrame = Math.max(0, 1000/@options.fps - Service.frameDuration)
    setTimeout =>
      # console.log "Next frame : #{_delayNextFrame}ms - TOTAL : #{_frameDuration}ms"
      @analyseFrame()
    , (_delayNextFrame)
The solution was to dispose of the tensor copy sent to detectAllFaces.
faceapi.detectAllFaces(_tensorFrame, @faceDetectionOptions).then (_detectedFaces) =>
  (...)
  _tensorFrame.dispose()
  (...)

Spark stream data from IBM MQ

I want to stream data from IBM MQ. I have tried out this code I found on Github.
I am able to stream data from the queue, but each time it streams it takes all the data from it. I just want the data that is currently being pushed into the queue. I looked on many sites but didn't find the correct solution.
In Kafka we had something like KafkaStreamUtils for streaming the near-real-time data. Is there anything similar to that in IBM MQ so that it streams only the latest data?
The sample in the link you provided shows that it calls the following method to receive from IBM MQ:
CustomMQReciever(String host, int port, String qm, String channel, String qn)
If you review CustomMQReciever here, you can see that it is only browsing the messages on the queue. This means the messages will still be on the queue, and the next time you connect you will receive the same messages:
MQQueueBrowser browser = (MQQueueBrowser) qSession.createBrowser(queue);
If you want to remove the messages from the queue, you need to call a method that consumes them instead of just browsing them. Below is an example of changes to CustomMQReciever.java that should accomplish what you want:
Under initConnection(), change the above code to the following so that messages are removed from the queue:
MQMessageConsumer consumer = (MQMessageConsumer) qSession.createConsumer(queue);
Get rid of:
enumeration= browser.getEnumeration();
Under receive() change the following:
while (!isStopped() && enumeration.hasMoreElements() )
{
receivedMessage= (JMSMessage) enumeration.nextElement();
String userInput = convertStreamToString(receivedMessage);
//System.out.println("Received data :'" + userInput + "'");
store(userInput);
}
To something like this:
while (!isStopped() && (receivedMessage = consumer.receiveNoWait()) != null)
{
String userInput = convertStreamToString(receivedMessage);
//System.out.println("Received data :'" + userInput + "'");
store(userInput);
}

GML room_goto() Error, Expecting Number

I'm trying to make a game that chooses a room from a pool of rooms using GML, but I get the following error:
FATAL ERROR in action number 3 of Create Event for object obj_control:
room_goto argument 1 incorrect type (5) expecting a Number (YYGI32)
at gml_Object_obj_control_CreateEvent_3 (line 20) - room_goto(returnRoom)
pool = ds_list_create()
ds_list_insert(pool, 0, rm_roomOne)
ds_list_insert(pool, 1, rm_roomTwo)
ds_list_insert(pool, 2, rm_roomThree)
ds_list_insert(pool, 3, rm_roomFour)
var returnIndex;
var returnRoom;
returnIndex = irandom(ds_list_size(pool))
returnRoom = ds_list_find_value(pool, returnIndex)
if (ds_list_size(pool) == 0){
    room_goto(rm_menu_screen)
}else{
    room_goto(returnRoom)
}
I don't understand why I'm getting an error message saying it's expecting a number.
This is weird indeed... I think this should actually work.. But I have no GM around to test :(
For now you can also solve this using "choose". This saves you the list (and saves memory, because you're not cleaning up the list by deleting it, so it stays resident in memory):
room_goto(choose(rm_roomOne, rm_roomTwo, rm_roomThree, rm_roomFour));
choose basically does exactly what you're looking for. Might not be the best way to go if you're re-using the group of items though.
