Don't get tweets with the stream API in Node - node.js

I'm working with Twit to get a nice wrapper around the Twitter API. I have a cron job to get all the tweets on a particular hashtag. It increments a counter every time there's a new tweet and, at the end of the period, saves it into the database (MongoDB). The only problem is, it always returns 0.
Here is the code:
new cronJob('00 */5 * * * *', function() { // start parsing 5mn after call, and every 5mn then
    var stream = T.stream('statuses/filter', { track: 'hashtag' });
    var counter = 0;
    var date = new Date();
    var collection = client.collection("TweetsNumber");

    stream.on('tweet', function (tweet) {
        console.log(tweet);
        counter += 1;
    });

    collection.insert({ Date: date, CrawledTweets: counter, Channel: "someChannel" });
    console.log(counter + " tweets saved in DB");
}, null, true, "Europe/Paris");
According to the docs, the "stream.on" handler is called every time there's a new tweet. I track a trending topic to be sure to have data, but it looks like the handler is never called, and I really don't know why.
Hope you can help. Have a great day!
EDIT: T is already created in another part of the program and works fine with other functionality. Same for client, which is my DB connection.
EDIT: Thanks to Shodan, it works now, see the GitHub issue. Thanks a lot!

Are the tweets logged to your console?
If yes, then it is not a Twit problem, since it does exactly what you told it to.
If I read your code correctly, you create a cron job which fires once every 5 minutes.
It connects a new local stream, which should log to the console and increase the counter for the next 5 minutes.
It inserts into the global client.collection("TweetsNumber"), with the local variable counter still having a value of 0.
It runs console.log(counter + " tweets saved in DB"), again with the local variable counter having a value of 0.
The function then exits and is started fresh in 5 minutes.
stream.on should continue to fire whenever a tweet comes along for the next 5 minutes and increase the counter, BUT the counter is never used again by collection.insert or the second console.log.
This is because the insert and the log run immediately, and every restart of the function creates new local variables for everything and logs the initial values again.
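For illustration, a rough sketch of what the asker seems to want instead (one document per 5-minute window): keep the stream open for the whole window, then stop it and insert the final count once. This assumes Twit's stream.stop() and the same cron pattern as above:
new cronJob('00 */5 * * * *', function () {
    var stream = T.stream('statuses/filter', { track: 'hashtag' });
    var counter = 0;
    var date = new Date();
    var collection = client.collection("TweetsNumber");

    stream.on('tweet', function (tweet) {
        counter += 1; // count tweets while the window is open
    });

    // After 5 minutes, close the stream and save the final count once
    setTimeout(function () {
        stream.stop();
        collection.insert({ Date: date, CrawledTweets: counter, Channel: "someChannel" }, function (err) {
            if (err) return console.error(err);
            console.log(counter + " tweets saved in DB");
        });
    }, 5 * 60 * 1000);
}, null, true, "Europe/Paris");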

You set var counter = 0 and then immediately insert and console.log() it, which means the 'tweet' event never gets a chance to fire and increment counter before the value is saved. You might want to do something like this:
new cronJob('00 */5 * * * *', function() { // start parsing 5mn after call, and every 5mn then
    var stream = T.stream('statuses/filter', { track: 'hashtag' });
    var counter = 0;
    var date = new Date();
    var collection = client.collection("TweetsNumber");

    stream.on('tweet', function (tweet) {
        console.log(tweet);
        counter += 1; // increment first so the saved value includes this tweet
        collection.insert({ Date: date, CrawledTweets: counter, Channel: "someChannel" });
        console.log(counter + " tweets saved in DB");
    });
}, null, true, "Europe/Paris");

Related

Discord JS Scheduled events

I am trying to make a Discord bot that will scrape a group Google Calendar and remind people of upcoming events. I can get the calendar data no problem. The thing I don't understand is how to send a scheduled message on a Discord server via discord.js. This won't be at a set time because it will change based on the start time of the calendar event. I'm trying to read the documentation for GuildScheduledEvent here, but I can't figure out how to implement it.
I've already tried doing it from a cron task, but that won't work because the event time is subject to change.
What I have so far is just a bot that will send messages when I run the script. I would really like it to be automatic via a scheduled event.
let upcomingEvents = []; // array of calendar events
const gcpClient = authorize().then(listEvents); // getting the calendar data
const client = new Client({ intents: [GatewayIntentBits.Guilds] });

client.once(Events.ClientReady, c => {
    console.log('Ready! Logged in as ', c.user.tag);
    const channel = client.channels.cache.get('1049384497017266228');
    upcomingEvents.forEach(element => {
        channel.send(`${element.title} on ${element.readabledate}`);
    });
});

client.login(TOKEN);
Again, I don't really know how to implement the Scheduled event logic.
Any help would be greatly appreciated.
As far as I understand, the GuildScheduledEvent class doesn't represent what you need; it's for guild events like the ones explained here: https://support.discord.com/hc/en-us/articles/4409494125719-Scheduled-Events
What you need is the 'cron' package from npm (https://www.npmjs.com/package/cron).
I modified your code to schedule a message for each upcomingEvents entry.
var CronJob = require('cron').CronJob;

let upcomingEvents = []; // array of calendar events
const gcpClient = authorize().then(listEvents);

// method to convert date values to cron expressions
const dateToCron = (date) => {
    const minutes = date.getMinutes();
    const hours = date.getHours();
    const days = date.getDate();
    const months = date.getMonth() + 1;
    const dayOfWeek = date.getDay();
    return `${minutes} ${hours} ${days} ${months} ${dayOfWeek}`;
};

function scheduleMessage(channel, cronExpression, msgToSend) {
    var job = new CronJob(
        cronExpression, // cron expression that describes when the function below is executed
        function() {
            channel.send(msgToSend); // insert here what you want to do at the given time
        },
        null,
        true,
        'America/Los_Angeles' // insert your server time zone here
    );
}

client.once(Events.ClientReady, c => {
    console.log('Ready! Logged in as ', c.user.tag);
    // look the channel up only after the client is ready
    const channel = client.channels.cache.get('1049384497017266228');
    upcomingEvents.forEach(element => {
        // assumes element.readabledate parses as a Date; use the event's start Date if you have one
        scheduleMessage(channel, dateToCron(new Date(element.readabledate)), `${element.title} on ${element.readabledate}`);
    });
});

client.login(TOKEN);
To get the correct cron expressions for each date value you need to convert it first, as answered in this post: https://stackoverflow.com/a/67759706/11884183
You might need to adjust some things like time zone and cronjob behavior.
If you want to keep the created cron jobs up to date, you can stop them and recreate them at an interval.
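For example, a rough sketch of that refresh loop (illustrative only; refreshUpcomingEvents is a hypothetical function standing in for however the calendar data is re-fetched, and channel comes from the ready handler as above):
let scheduledJobs = [];

function rebuildJobs(channel) {
    // stop every existing job before recreating them from the fresh calendar data
    scheduledJobs.forEach(job => job.stop());
    scheduledJobs = [];

    upcomingEvents.forEach(element => {
        const job = new CronJob(
            dateToCron(new Date(element.readabledate)),
            () => channel.send(`${element.title} on ${element.readabledate}`),
            null,
            true,
            'America/Los_Angeles'
        );
        scheduledJobs.push(job);
    });
}

// re-fetch the calendar and rebuild the jobs every hour (the interval is arbitrary)
setInterval(() => {
    refreshUpcomingEvents().then(() => rebuildJobs(channel)); // refreshUpcomingEvents is hypothetical
}, 60 * 60 * 1000);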

Firebase Firestore transactions incredibly slow (3-4 minutes)

Edit: Removing irrelevant code to improve readability
Edit 2: Reducing example to only uploadGameRound function and adding log output with times.
I'm working on a mobile multiplayer word game and was previously using the Firebase Realtime Database with fairly snappy performance apart from the cold starts. Saving an updated game and setting stats would take at most a few seconds. Recently I made the decision to switch to Firestore for my game data and player stats / top lists, primarily because of the more advanced queries and the automatic scaling with no need for manual sharding. Now I've got things working on Firestore, but the time it takes to save an updated game and update a number of stats is just ridiculous. I'm clocking an average of 3-4 minutes before the game is updated, the stats are added and everything is available in the database for other clients and viewable in the web interface. I'm guessing and hoping that this is because of something I've messed up in my implementation, but the transactions all go through and there are no warnings or anything else to go on, really. Looking at the Cloud Functions log, the total time from function call to completion log statement appears to be a bit more than a minute, but that log doesn't appear until after the same 3-4 minute wait for the data.
Here's the code as it is. If someone has time to have a look and maybe spot what's wrong, I'd be hugely grateful!
This function is called from the Unity client:
exports.uploadGameRound = functions.https.onCall((roundUploadData, response) => {
    console.log("UPLOADING GAME ROUND. TIME: ");
    var d = new Date();
    var n = d.toLocaleTimeString();
    console.log(n);

    // CODE REMOVED FOR READABILITY. JUST PREPARING SOME VARIABLES TO USE BELOW. NOTHING HEAVY, NO DATABASE TRANSACTIONS. //

    // Get a new write batch
    const batch = firestoreDatabase.batch();

    // Save game info to activeGamesInfo
    var gameInfoRef = firestoreDatabase.collection('activeGamesInfo').doc(gameId);
    batch.set(gameInfoRef, gameInfo);

    // Save game data to activeGamesData
    const gameDataRef = firestoreDatabase.collection('activeGamesData').doc(gameId);
    batch.set(gameDataRef, { gameDataCompressed: updatedGameDataGzippedString });

    if (foundWord !== undefined && foundWord !== null) {
        const wordId = foundWord.timeStamp + "_" + foundWord.word;
        // Save word to allFoundWords
        const wordRef = firestoreDatabase.collection('allFoundWords').doc(wordId);
        batch.set(wordRef, foundWord);
        exports.incrementNumberOfTimesWordFound(gameInfo.language, foundWord.word);
    }

    console.log("COMMITTING BATCH. TIME: ");
    var d = new Date();
    var n = d.toLocaleTimeString();
    console.log(n);

    // Commit the batch
    batch.commit().then(result => {
        return gameInfoRef.update({ roundUploaded: true }).then(function (result2) {
            console.log("DONE COMMITTING BATCH. TIME: ");
            var d = new Date();
            var n = d.toLocaleTimeString();
            console.log(n);
            return;
        });
    });
});
Again, any help with understanding this weird behaviour is massively appreciated!
OK, so I found the problem and thought I should share it:
Simply adding a return statement before the batch commit fixed the function and reduced the time from 4 minutes to less than a second:
return batch.commit().then(result => { // <-- the added return is the fix
    return gameInfoRef.update({ roundUploaded: true }).then(function (result2) {
        console.log("DONE COMMITTING BATCH. TIME: ");
        var d = new Date();
        var n = d.toLocaleTimeString();
        console.log(n);
        return;
    });
});
Your function isn't returning a promise that resolves with the data to send to the client app. In the absence of a returned promise, it will return immediately, with no guarantee that any pending asynchronous work will terminate correctly.
Calling then on a single promise isn't enough to handle all the promises here. You likely have lots of async work going on, between commit() and other functions like incrementNumberOfTimesWordFound. You will need to handle all of the promises correctly and make sure your overall function returns a single promise that resolves only when all of that work is complete.
I strongly suggest taking some time to learn how promises work in JavaScript - this is crucial to writing effective functions. Without a full understanding, things will appear to go wrong, or not happen at all, in strange ways.
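As a rough sketch only (it assumes incrementNumberOfTimesWordFound is refactored into a plain helper that returns a promise, and that the client only needs a success flag), the end of the function could collect every pending piece of work into a single returned promise:
    // Collect every pending piece of asynchronous work
    const pendingWork = [];

    if (foundWord !== undefined && foundWord !== null) {
        // assumption: this helper returns a promise instead of being called as a separate export
        pendingWork.push(incrementNumberOfTimesWordFound(gameInfo.language, foundWord.word));
    }

    // Commit the batch, then flag the round as uploaded
    pendingWork.push(
        batch.commit().then(() => gameInfoRef.update({ roundUploaded: true }))
    );

    // Returning this promise keeps the function alive until everything has finished;
    // its resolved value is what the Unity client receives
    return Promise.all(pendingWork).then(() => ({ success: true }));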

Inserting data into mongodb using time comparison

Suppose I have this code running on 3 different servers, all using a single shared database.
setInterval(function() {
    if (userArray) {
        var report = mongoose.connection.db.collection('report');
        report.insert({ datenow: new Date(), userlist: userArray }, function(err, doc) {
            if (err) throw err;
        });
    }
}, 600000);
So, this piece of code runs every 10 minutes on every server, but I want only one of them to insert the data into the database. Since the data is the same, it is currently getting inserted 3 times.
How do I check whether the data has already been inserted into the database by one of the other servers?
I tried making an incrementing count variable, inserting it into the database and using it as a unique ID to check whether it already exists. If it exists, I don't insert the data. But if I have to restart a server for some reason, the count resets to its initial value, so this doesn't seem like a viable solution.
So, how do I approach this problem? I am guessing I have to compare times somehow?
IMO, you should use a cron expression instead of an interval, and use the execution time as the primary key of your report when you perform the insertion into the database.
Explanation
A cron expression guarantees that your script executes at the same wall-clock times on every server. If you use the cron expression 00 */10 * * * * (every 10 minutes), your script will execute at exactly 11:00:00, 11:10:00, 11:20:00, and so on, on every server you have.
So you can use this execution time as the key for your reports, and it will prevent multiple insertions of the same report.
Libs
You can use this lib to use cron with Node.js: node-cron
Example
var CronJob = require('cron').CronJob;

new CronJob('* * * * * *', function() {
    console.log('You will see this message every second');
}, null, true, 'America/Los_Angeles');
I hope this will help you.
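To make the "execution time as primary key" idea concrete, here is a rough sketch (an illustration only, reusing the variables from the question): using the scheduled tick as the document _id means MongoDB's unique index on _id rejects the duplicate inserts coming from the other two servers:
var CronJob = require('cron').CronJob;

new CronJob('00 */10 * * * *', function() {
    if (!userArray) return;

    // All three servers compute the same key for the same tick,
    // e.g. "2017-05-01T11:10" (seconds dropped, since ticks land on a whole minute)
    var key = new Date().toISOString().slice(0, 16);

    var report = mongoose.connection.db.collection('report');
    report.insert({ _id: key, datenow: new Date(), userlist: userArray }, function(err) {
        // error code 11000 = duplicate key: another server already wrote this report
        if (err && err.code !== 11000) throw err;
    });
}, null, true);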
So I ended up comparing timestamps and checking whether anything had been inserted in the last 10 minutes. This is the solution I came up with.
setInterval(function() {
    var currDate = new Date().getTime();
    if (keyPairNameArray) {
        var overallreport = mongoose.connection.db.collection('overallreport');
        // Fetch the most recently inserted report
        overallreport.find({}).sort({ _id: -1 }).limit(1).toArray(function(err, res) {
            if (res.length > 0) {
                var dbDate = new Date(res[0].datenow).getTime();
                var diffDate = currDate - dbDate;
                // Only insert if the latest report is older than 10 minutes
                if (diffDate >= 600000) {
                    overallreport.insert({ datenow: new Date(), userlist: keyPairNameArray }, function(err, doc) {
                        if (err) throw err;
                    });
                }
            } else {
                // No reports yet: insert the first one
                overallreport.insert({ datenow: new Date(), userlist: keyPairNameArray }, function(err, doc) {
                    if (err) throw err;
                });
            }
        });
    }
}, 610000);

Creating scheduled jobs (polling) with NodeJS

In my NodeJS app I need to send requests every 2-3 seconds to a third-party service. I have a database of objects, each containing the URL to request, and when the response comes in I link that response with my object.
Right now it looks like this:
// Getting objects from DB and calling ask function
objectsFromDB.find(function(err, data) {
    if (!err) {
        for (var i = 0; i < data.length; i++) {
            var object = data[i];
            // Calling ask function
            startAsking(object);
        }
    }
});

// Start asking objects ...
function startAsking(object) {
    var intervalId = setInterval(function() {
        console.log("Asking " + object.name);
        // ...
        // Processing and linking response with object
    }, config.INTERVAL);
    arrayOfIntervals.push(intervalId);
};
So, now I need to stop the job for just one of the objects. How can I do this?
I see that I can save the intervalId, but what happens if that intervalId doesn't match up with the object?
I also see many libraries like:
Agenda
nschedule
node-cron
But I think all of these libraries are oriented towards scheduling jobs with large intervals, and I don't know how to stop one of the jobs.
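One way to keep the interval and the object from getting out of sync (a rough sketch, assuming each object has a unique _id field): store the interval IDs in a map keyed by object ID, so a single object's polling can be stopped with clearInterval:
var intervalsByObjectId = {};

function startAsking(object) {
    var intervalId = setInterval(function() {
        console.log("Asking " + object.name);
        // ... processing and linking response with object
    }, config.INTERVAL);
    // key the interval by the object's id so it can be looked up later
    intervalsByObjectId[object._id] = intervalId;
}

function stopAsking(object) {
    var intervalId = intervalsByObjectId[object._id];
    if (intervalId) {
        clearInterval(intervalId);
        delete intervalsByObjectId[object._id];
    }
}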

Node.js/ Make redis call async

I use the redis client in Node.js:
var db = require("redis");
var dbclient = db.createClient();
I load the DB in the following way:
dbclient.zrange("cache", -1000000000000000, +1000000000000000, function(err, replies) {
    logger.info("Go to cache");
    for (var i = 0; i < replies.length; i++) {
        (function(i) {
            // Do some commands with the result
        })(i);
    }
});
I notice that when my application starts, it takes ~30 seconds for this DB query to execute. During that time, no other Express requests are served.
How can I solve this issue? Why is it not asynchronous?
If you're worried about blocking the event loop for that long, then you should use ZSCAN.
Use the COUNT option to get more data on every call, but remember:
While SCAN does not provide guarantees about the number of elements returned at every iteration, it is possible to empirically adjust the behavior of SCAN using the COUNT option
And at every result, use the setImmediate function to move on to the next SCAN.
var cursor = '0';

function doscan() {
    dbclient.zscan("cache", cursor, "COUNT", "100", function(err, replies) {
        logger.info("Go to cache");

        // replies[0] is the next cursor, replies[1] is a flat array of members and scores
        cursor = replies[0];
        var items = replies[1];

        for (var i = 0; i < items.length; i++) {
            (function(i) {
                // Do some commands with the result
            })(i);
        }

        if (cursor === '0') {
            return console.log('Iteration complete');
        }

        // Yield back to the event loop before the next SCAN step
        setImmediate(function() { doscan(); });
    });
}
As stdob said, the only part of your code that is not asynchronous is the for loop. Personally, I would create a child process and run the DB query, as well as whatever you decide to do with the output, there. If that doesn't work for your application, you may just have to delay starting your program for those 30 seconds or handle the cache differently.
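A bare-bones sketch of that child-process idea (illustrative only; cache-worker.js is a hypothetical script that would run the zrange and the heavy processing, then report back):
// main.js
var fork = require('child_process').fork;

var worker = fork('./cache-worker.js'); // hypothetical worker script

worker.on('message', function(result) {
    // the Express event loop stays free while the worker does the heavy lifting
    console.log('cache worker finished, items processed:', result.count);
});

// cache-worker.js would contain roughly:
//   var dbclient = require("redis").createClient();
//   dbclient.zrange("cache", -1000000000000000, +1000000000000000, function(err, replies) {
//       // ... heavy processing ...
//       process.send({ count: replies.length });
//   });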
You can use Iced-CoffeeScript for that, for example:
https://github.com/Terebinth/britvic/blob/master/server/models/hive_iced.iced#L101
