Firebase Bucket Download File taking a lot of time - node.js

I'm developing an app using Firebase as a BaaS.
I'm seeing timing problems when I upload an image (90KB or less) and it triggers a Cloud Function.
My trigger fires when the upload finishes:
exports.uploadControl = functions.storage.object().onFinalize((req, res) => {
    uploadControl.handler(req, res);
    return 0;
});
And inside uploadControl I have:
return mkdirp(tempLocalDir).then(() => {
    console.log('1. mkDirp - OK!');
    console.log('2. Download starts...');
    return bucket.file(filePath).download();
}).then((content) => {
    console.log('3. Download ends');
    return 0;
});
This code works fine, but the problem is the time spent between steps 2 and 3: it takes 24 seconds or more.
How can I solve this? Is there a problem in my code, or is there a Firebase setting that would fix it?
Thanks.

Two things are wrong here:
- The onFinalize() callback doesn't receive req and res objects the way HTTP triggers do. It receives the object metadata as its first argument. Read the documentation for details.
- Background triggers like this one must return a promise that resolves when all the work is complete. Otherwise Cloud Functions will shut the work down prematurely, since it has no idea when it's finished. If you kick off the work from another function, the trigger should return that function's promise.
exports.uploadControl = functions.storage.object().onFinalize(object => {
    return uploadControl.handler(object);
});
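For reference, here is a minimal sketch of what a handler wired up this way could look like, so the promise chain from the question is returned end to end. This assumes firebase-admin and a promise-returning mkdirp; the handler body and temp paths are illustrative, not from the original post:

const admin = require('firebase-admin');
const mkdirp = require('mkdirp'); // assumed to return a promise, as in the question
const os = require('os');
const path = require('path');

admin.initializeApp();

// Hypothetical handler; the real uploadControl.handler isn't shown in the question.
function handler(object) {
    const filePath = object.name; // path of the uploaded file in the bucket
    const tempLocalDir = path.join(os.tmpdir(), 'images'); // illustrative scratch dir
    const tempLocalFile = path.join(tempLocalDir, path.basename(filePath));
    const bucket = admin.storage().bucket(object.bucket);

    // Returning the whole chain tells Cloud Functions when the work is done.
    return mkdirp(tempLocalDir)
        .then(() => bucket.file(filePath).download({ destination: tempLocalFile }))
        .then(() => console.log('Downloaded to', tempLocalFile));
}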


Firebase Firestore transactions incredibly slow (3-4 minutes)

Edit: Removing irrelevant code to improve readability
Edit 2: Reducing example to only uploadGameRound function and adding log output with times.
I'm working on a mobile multiplayer word game and was previously using the Firebase Realtime Database with fairly snappy performance, apart from the cold starts. Saving an updated game and setting stats would take at most a few seconds.

Recently I decided to switch to Firestore for my game data and player stats / top lists, primarily because of the more advanced queries and the automatic scaling with no need for manual sharding. I've now got things working on Firestore, but the time it takes to save an updated game and update a number of stats is just ridiculous: I'm clocking an average of 3-4 minutes before the game is updated, the stats are added, and everything is available in the database for other clients and viewable in the web interface.

I'm guessing and hoping that this is because of something I've messed up in my implementation, but the transactions all go through and there are no warnings or anything else to go on, really. Looking at the Cloud Functions log, the total time from function call to the completion log statement appears to be a bit more than a minute, but that log doesn't appear until after the same 3-4 minute wait for the data.
Here's the code as it is. If someone has time to have a look and maybe spot what's wrong I'd be hugely grateful!
This function is called from Unity client:
exports.uploadGameRound = functions.https.onCall((roundUploadData, response) => {
    console.log("UPLOADING GAME ROUND. TIME: ");
    var d = new Date();
    var n = d.toLocaleTimeString();
    console.log(n);

    // CODE REMOVED FOR READABILITY. JUST PREPARING SOME VARIABLES TO USE BELOW. NOTHING HEAVY, NO DATABASE TRANSACTIONS. //

    // Get a new write batch
    const batch = firestoreDatabase.batch();

    // Save game info to activeGamesInfo
    var gameInfoRef = firestoreDatabase.collection('activeGamesInfo').doc(gameId);
    batch.set(gameInfoRef, gameInfo);

    // Save game data to activeGamesData
    const gameDataRef = firestoreDatabase.collection('activeGamesData').doc(gameId);
    batch.set(gameDataRef, { gameDataCompressed: updatedGameDataGzippedString });

    if (foundWord !== undefined && foundWord !== null) {
        const wordId = foundWord.timeStamp + "_" + foundWord.word;
        // Save word to allFoundWords
        const wordRef = firestoreDatabase.collection('allFoundWords').doc(wordId);
        batch.set(wordRef, foundWord);
        exports.incrementNumberOfTimesWordFound(gameInfo.language, foundWord.word);
    }

    console.log("COMMITTING BATCH. TIME: ");
    var d = new Date();
    var n = d.toLocaleTimeString();
    console.log(n);

    // Commit the batch
    batch.commit().then(result => {
        return gameInfoRef.update({ roundUploaded: true }).then(function (result2) {
            console.log("DONE COMMITTING BATCH. TIME: ");
            var d = new Date();
            var n = d.toLocaleTimeString();
            console.log(n);
            return;
        });
    });
});
Again, any help with understanding this weird behaviour massively appreciated!
Ok, so I found the problem now and thought I should share it:
Simply adding a return statement before the batch commit fixed the function and reduced the time from 4 minutes to less than a second:
RETURN batch.commit().then(result => {
    return gameInfoRef.update({ roundUploaded: true }).then(function (result2) {
        console.log("DONE COMMITTING BATCH. TIME: ");
        var d = new Date();
        var n = d.toLocaleTimeString();
        console.log(n);
        return;
    });
});
Your function isn't returning a promise that resolves with the data to send to the client app. In the absence of a returned promise, it returns immediately, with no guarantee that any pending asynchronous work will complete.
Calling then on a single promise isn't enough to handle all the promises here. You likely have lots of async work going on, between commit() and other functions like incrementNumberOfTimesWordFound. You will need to handle all of the promises correctly and make sure your overall function returns a single promise that resolves when all that work is complete.
I strongly suggest taking some time to learn how promises work in JavaScript; this is crucial to writing effective functions. Without a full understanding, things will appear to happen incorrectly, or not at all, in strange ways.
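To illustrate that advice, here is a hedged sketch of how the function above could be restructured so everything funnels into one returned promise. It assumes incrementNumberOfTimesWordFound returns a promise and can be called directly rather than through exports; the variable preparation is elided as in the question:

exports.uploadGameRound = functions.https.onCall((roundUploadData, context) => {
    // ... prepare gameId, gameInfo, updatedGameDataGzippedString, foundWord as before ...
    const batch = firestoreDatabase.batch();
    const gameInfoRef = firestoreDatabase.collection('activeGamesInfo').doc(gameId);
    batch.set(gameInfoRef, gameInfo);
    const gameDataRef = firestoreDatabase.collection('activeGamesData').doc(gameId);
    batch.set(gameDataRef, { gameDataCompressed: updatedGameDataGzippedString });

    const pending = [];
    if (foundWord !== undefined && foundWord !== null) {
        const wordId = foundWord.timeStamp + "_" + foundWord.word;
        batch.set(firestoreDatabase.collection('allFoundWords').doc(wordId), foundWord);
        // Assumption: this helper returns a promise; collect it instead of
        // firing and forgetting.
        pending.push(incrementNumberOfTimesWordFound(gameInfo.language, foundWord.word));
    }

    // Return ONE promise that resolves only when every piece of async work is
    // done; whatever it resolves with is sent back to the Unity client.
    return batch.commit()
        .then(() => gameInfoRef.update({ roundUploaded: true }))
        .then(() => Promise.all(pending))
        .then(() => ({ roundUploaded: true }));
});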

Should I save data to DB asynchronously

I'm using Node, Express, and Postgres.
I'm not sure whether what I'm trying to do is good practice or a very big mistake: saving data to the database asynchronously after I have already returned a result to the client.
I tried to demonstrate it with console.log to check whether my server is blocked during the save.
Below you can see the /status route and the /statusB route.
app.get("/statusB", async (req, res) => {
return res.status(200).send("testAAA");
});
app.get("/status", async (req, res) => {
const userStats = await UserController.getData("id")
const x = test();
return res.status(200).send(userStats);
});
async function test() {
return new Promise(() => {
for (let x = 0; x < 10000; x++) {
setTimeout( () => {
console.log(x)
}, 5000);
}
})
}
What I want to happen is this: I send /status and right after it I send /statusB. I expect the output to be:
- /status returns the userStats data
- /statusB returns 'testAAA'
- the counter runs asynchronously
But the actual output is:
- /status returns the userStats data
- the counter runs
- /statusB returns 'testAAA' only after the counter has finished
The console.log is only a test to find out whether I can fetch and save data to the database asynchronously; the real code would do that instead of logging.
Depends on your business case.
If it's alright for your customer to get a 200 OK status code even if the saving might actually have failed, then sure, you can do it asynchronously after you've responded.
In other cases, you'll want to do the saving within the request and only respond after you're sure everything is safe and sound.
It depends on your logic. If, for example, you want to return the saved resource to the client, you should wait (async/await or a callback) until the data is saved to the database. But if, for example, you just want to log an action without returning anything to the frontend, you can save it asynchronously.
Yes, you should save data to the DB asynchronously; that fits the way Node.js works, since its non-blocking I/O is designed to let the server keep handling new client requests while the save is still in flight. BUT if your business logic relies on returning the answer from the DB to the client, you should wait for the save before responding, and maybe think about workarounds, or choose another runtime, if that becomes a problem.
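To make that trade-off concrete, here is a minimal Express sketch of the respond-first approach; saveStats is a hypothetical stand-in for the real Postgres write, while UserController and logger come from the question's code:

app.get("/status", async (req, res) => {
    const userStats = await UserController.getData("id");
    res.status(200).send(userStats);

    // Fire-and-forget: the client already has its response.
    // The .catch is essential; otherwise a failed save becomes an
    // unhandled promise rejection instead of a log entry.
    saveStats(userStats).catch((err) => logger.error("stats save failed", err));
});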

Firebase Function returning connection error

I am using Firebase Functions database triggers.
My function:
exports.function_name = functions.database
    .ref("/transact/{entry_type}/{id1}/{id2}/trigger_cross_entry_function")
    .onCreate((data, context) => {
        console.log("trigger_cross_entry_function value", data.val());
        if (data.val() == true) {
            console.log("Function Calling parent data");
            return data.ref.parent.once('value').then(function (snapshot) {
            }, function (error) {
                // The Promise was rejected.
                console.error("THIS IS THE ERROR:" + error);
            });
        } else {
            console.log("Function Returned");
            return 0;
        }
    });
Whenever I want to trigger this function I write trigger_cross_entry_function to that particular path from the mobile app. Everything works fine and the function is called as expected.
If I try to do the same thing again after some time, it gives me:
Function execution took 16 ms, finished with status: 'connection error'
If I restart the function it works perfectly again, and if I trigger it continuously there is no issue. But once I leave it idle and trigger it again after some time, I get this error.
I am on the Firebase pay-as-you-go plan. Also, as you can see, I am returning all the promises correctly, and the function is triggered every time, but it doesn't get into the function body; it just exits with the connection error.
What might be the cause of this? I have spent a day trying to find it.
I'm experiencing the same issue. It looks like Google has changed the status of the service to down:
https://status.firebase.google.com/incident/Functions/18046
https://status.firebase.google.com/
So it's most likely a problem on Google's side.

Using redis as cache as REST Api user (in order to save Api requests)

I am an API user and I have only a limited number of requests available for a high-traffic website (~1k concurrent visitors). In order to save API requests, I would like to cache the responses for specific requests which are unlikely to change. However, I want to refresh this Redis key (the API response) at least every 15 seconds. I wonder what the best approach for this would be?
My ideas:
- I thought the TTL field would be handy for this scenario: just set a TTL of 15s for the key. When I query the key and it's not present, I request the data again from the API. The problem: since this is a high-traffic website, I would expect around 20-30 incoming requests before I've got a response from the API, and that would lead to 20-30 requests to the API within a few ms. So I would need to "pause" all incoming requests until there is an API response.
- My second idea was to refresh the key every 15s. I could set up a background task which runs every 15s, or I could check in my controller on each page request whether the key needs a refresh. I would prefer the latter, but then I would need to track the age of the Redis key myself, which seems very expensive and is not a built-in feature?
What would you suggest for this use case?
My controller code:
function players(req, res, next) {
    redisClient.getAsync('leaderboard:players').then((playersLeaderboard) => {
        if (!playersLeaderboard) {
            // We need to get a fresh copy of the playersLeaderboard
        }
        res.set('Cache-Control', 's-maxage=10, max-age=10')
        res.render('leaderboards/players', { playersLeaderboard: playersLeaderboard })
    }).catch((err) => {
        logger.error(err)
    })
}
Simply fetch and cache the data when the Node.js server starts, then set an interval of 15 seconds to fetch fresh data and update the cache. Avoid using a TTL for this use case.
function fetchResultsFromApi(cb) {
    apiFunc((err, result) => {
        // do some error handling
        // cache result in redis without ttl
        cb();
    });
}

fetchResultsFromApi(() => {
    app.listen(port);
    setInterval(() => {
        fetchResultsFromApi(() => {});
    }, 15000);
});
Pros:
- Very simple to implement
- No queuing of client requests required
- Super fast response times

Cons:
- The cache update might not execute/complete exactly every 15 seconds; it can be off by a few milliseconds here and there. I assume that won't make much difference for what you're doing, and you can always shorten the interval so the cache updates in under 15 seconds.
I guess this is more of an architecture question than the typical "help, my code doesn't work" kind.
Let me paraphrase your requirements.
Q: I would like to cache the responses of some HTTP requests which are unlikely to change, and I would like these cached responses to be refreshed every 15 seconds. Is it possible?
A: Yes it is, and you're going to be thankful that JavaScript is single-threaded, because it makes this quite straightforward.
Some fundamental knowledge here: Node.js is an event-driven framework, which means that at any one point in time it executes only one piece of code, all the way until it is done. If an async call is encountered along the way, it fires the call and adds an event to the event loop saying "call back when a response is received". When the current routine is finished, it pops the next event from the queue and runs it.
Based on this knowledge, we can achieve the goal by building a function that fires off only one async call to update the cached response whenever it expires. If an async call is already in flight, requests simply park their callbacks in a queue, so you never make multiple async calls to fetch the same new result.
I'm not familiar with the async module, so I have provided a pseudo-code example using promises instead.
Pseudo code:
var fetch_in_flight = false; // true while an API call is running
var fetch_queue = [];        // promises parked while waiting for that call
var cached_result = {
    "cached_result_1": {
        "result": "test",
        "expiry": 1501477638 // epoch time 15s in the future
    }
};

var get_cached_result = function (lookup_key) {
    if (cached_result.hasOwnProperty(lookup_key)) {
        if (!result_expired(cached_result[lookup_key].expiry)) {
            // Not expired, safe to use the cached result
            return new Promise(function (resolve) {
                resolve(cached_result[lookup_key].result);
            });
        }
        else {
            // Expired: fetch a fresh result, or wait on the in-flight fetch
            return update_result();
        }
    }
};

var update_result = function () {
    if (!fetch_in_flight) {
        // No other request is retrieving an updated result yet.
        fetch_in_flight = true;
        return new Promise(function (resolve, reject) {
            // Call your API to get the result. When it responds:
            resolve("Your result");
            // Inform the parked requests that an updated response is ready.
            fetch_queue.forEach(function (parked) {
                parked.resolve("Your result");
            });
            fetch_queue = [];
            fetch_in_flight = false;
            // Compute the new expiry epoch time and update cached_result here.
        });
    }
    else {
        // A fetch is already in flight: create a promise and park it in the queue.
        return new Promise(function (resolve, reject) {
            fetch_queue.push({
                resolve: resolve,
                reject: reject
            });
        });
    }
};

get_cached_result("cached_result_1").then(function (result) {
    // reply with the result
});
Note: as the label "pseudo code" suggests, this is not an actual working solution, but the concept is there.
Something worth noting: setInterval is one way to go, but it doesn't guarantee that the function gets called exactly at the 15-second mark; the API only ensures that something will happen after the expected time.
The proposed solution, on the other hand, ensures that as soon as the cached result has expired, the very next request triggers a fetch, and the requests that follow wait for that initial fetch to return.
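For comparison, here is a compact sketch of the same single-flight idea written with async/await: concurrent callers share one in-flight promise instead of a hand-rolled queue. fetchFromApi is a placeholder for the real rate-limited API call:

let cached = null;   // { result, expiry } with expiry in epoch milliseconds
let inFlight = null; // promise shared by all callers while a fetch is running

async function getCachedResult() {
    if (cached && Date.now() < cached.expiry) {
        return cached.result; // still fresh, serve from cache
    }
    if (!inFlight) {
        // The first caller after expiry triggers exactly one API request.
        inFlight = fetchFromApi()
            .then((result) => {
                cached = { result, expiry: Date.now() + 15000 };
                return result;
            })
            .finally(() => { inFlight = null; });
    }
    return inFlight; // everyone else awaits the same request
}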

Getting data pushed to an array outside of a Promise

I'm using https://github.com/Haidy777/node-youtubeAPI-simplifier to grab some information from a playlist of Bounty Killers. This library seems to use promises via Bluebird (https://github.com/petkaantonov/bluebird), which I don't know much about. Looking up the beginner's guide for Bluebird gives http://bluebirdjs.com/docs/beginners-guide.html, which literally just shows:
This article is partially or completely unfinished. You are welcome to create pull requests to help completing this article.
I am able to set up the library
var ytapi = require('node-youtubeapi-simplifier');
ytapi.setup('My Server Key');
As well as list some information about Bounty Killers
ytdata = [];
ytapi.playlistFunctions.getVideosForPlaylist('PLCCB0BFBF2BB4AB1D')
    .then(function (data) {
        for (var i = 0, len = data.length; i < len; i++) {
            ytapi.videoFunctions.getDetailsForVideoIds([data[i].videoId])
                .then(function (video) {
                    console.log(video);
                    // ytdata.push(video); <- Push a Bounty Killer Video
                });
        }
    });
// console.log(ytdata); This gives []
Basically, the above pulls the full playlist (normally there would be some pagination here depending on the length), then takes the data from getVideosForPlaylist, iterates the list, and calls getDetailsForVideoIds for each YouTube video. All good here.
The issue arises when getting data out of this. I would like to push each video object onto the ytdata array, and I'm unsure whether the empty array at the end is due to scoping or to things running out of sync, such that console.log(ytdata) gets called before the API calls have finished.
How can I get each Bounty Killer video into the ytdata array so it's available globally?
console.log(ytdata) gets called before the API calls are finished
Spot on; that's exactly what's happening here: the API calls are async. Once you're using async functions, you must go the async way if you want to deal with the returned data. Your code could be written like this:
var ytapi = require('node-youtubeapi-simplifier');
ytapi.setup('My Server Key');

// this function returns a promise you can "wait" on
function getVideos() {
    return ytapi.playlistFunctions
        .getVideosForPlaylist('PLCCB0BFBF2BB4AB1D')
        .then(function (videos) {
            // extract all videoIds
            var videoIds = videos.map(video => video.videoId);
            // getDetailsForVideoIds is called with an array of videoIds
            // and returns a promise; one API call is enough
            return ytapi.videoFunctions.getDetailsForVideoIds(videoIds);
        });
}

getVideos().then(function (ydata) {
    // this is the only place ydata is full of data
    console.log(ydata);
});
I made use of ES6's arrow function in videos.map(video => video.videoId); that should work if your Node.js is v4+.
Your console.log(ytdata) sits immediately after your for loop, but that data is NOT available until each promise has resolved, and attempting to access it beforehand will give you an empty array. (Your current console.log isn't working because that code executes before the promises are resolved; only code inside a then block is executed after its promise resolves.)
If you NEED the data available now or as soon as possible, and the requests for the videos are taking a long time, can you request one video at a time, on demand, or on a separate thread (using a web worker, maybe)? Can you implement caching?
Can you make the requests up front, behind the scenes, before the user even visits the page? (Not sure this is a good idea, but it is an idea.)
Can you use video thumbnails (like YouTube does), so that when a thumbnail is clicked you start streaming and playing the video?
Some ideas... hope this helps.
ytdata = [];
ytapi.playlistFunctions.getVideosForPlaylist('PLCCB0BFBF2BB4AB1D')
    .then(function (data) {
        // THE CODE INSIDE THIS THEN BLOCK IS EXECUTED WHEN ALL THE VIDEO IDS
        // HAVE BEEN RETRIEVED AND ARE AVAILABLE.
        // YOU COULD SAVE THESE TO A DATASTORE IF YOU WANT.
        for (var i = 0, len = data.length; i < len; i++) {
            var videoIds = [data[i].videoId];
            ytapi.videoFunctions.getDetailsForVideoIds(videoIds)
                .then(function (video) {
                    // THE CODE INSIDE THIS THEN BLOCK IS EXECUTED WHEN THE DETAILS
                    // HAVE BEEN DOWNLOADED FOR THE videoIds PROVIDED.
                    // AGAIN, YOU CAN DO WHATEVER YOU WANT WITH THESE DETAILS.
                    // NOW THAT THE DATA IS AVAILABLE YOU MIGHT WANT TO HIDE THE
                    // LOADING ICON AND RENDER THE PAGE. A DATA STORE WOULD PROVIDE
                    // FASTER ACCESS, BUT YOU WOULD NEED TO UPDATE THE CACHE EVERY
                    // SO OFTEN.
                    // ytdata.push(video); <- Push a Bounty Killer Video
                });
            // ANOTHER DETAILS REQUEST IS KICKED OFF ON EACH ITERATION OF THE FOR LOOP.
        }
        // ALL THE REQUESTS HAVE BEEN KICKED OFF WHEN THE FOR LOOP HAS COMPLETED;
        // EACH ONE DELIVERS ITS DATA IN ITS OWN THEN BLOCK ABOVE.
    });
// This is executed immediately, before the YouTube API has responded:
// console.log(ytdata); This gives []
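If the details did have to be fetched one video at a time, a Promise.all over the per-video promises would still let you wait for everything before reading the array. A hedged sketch using only the calls from the question:

ytapi.playlistFunctions.getVideosForPlaylist('PLCCB0BFBF2BB4AB1D')
    .then(function (data) {
        // One promise per video; Promise.all resolves when every call has finished.
        var detailPromises = data.map(function (item) {
            return ytapi.videoFunctions.getDetailsForVideoIds([item.videoId]);
        });
        return Promise.all(detailPromises);
    })
    .then(function (ytdata) {
        // Only here is the full array safe to use.
        console.log(ytdata);
    });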
