I need to delete documents in a collection 7 days after creation, unless the "confirmed" field equals true. I am creating an index like on the screenshot but it does not work. I am using Node.js on the server, if it matters.
You can use the node-rules npm package (https://www.npmjs.com/package/node-rules) to do this.
You can define a rule whose condition checks whether the document is at least 7 days old (and not confirmed), and whose consequence deletes the document.
var RuleEngine = require("node-rules");

/* Creating Rule Engine instance */
var R = new RuleEngine();

/* Add a rule. node-rules binds the fact being evaluated to `this`,
   so use regular functions here rather than arrow functions. */
var rule = {
    "condition": function(R) {
        // match documents created at least 7 days ago that are not confirmed
        // (assumes the document carries a createdAt timestamp in milliseconds)
        var sevenDays = 7 * 24 * 60 * 60 * 1000;
        R.when(Date.now() - this.createdAt >= sevenDays && this.confirmed !== true);
    },
    "consequence": function(R) {
        // flag the document for deletion and stop further rule processing
        this.shouldDelete = true;
        R.stop();
    }
};
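A sketch of wiring the rule up (R.register and R.execute are the node-rules API; the actual delete call is left to whatever MongoDB driver you use):

R.register(rule);
// `doc` is a plain object loaded from your collection
R.execute(doc, function(result) {
    if (result.shouldDelete) {
        // e.g. collection.deleteOne({ _id: result._id })
    }
});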
You need a cron job to achieve this.
Try something like this (I'm using node-cron, but you can use any cron library you want):
import cron from 'node-cron'
import Collection from 'models/YourCollection'

cron.schedule('0 12 * * *', () => { // execute every day at 12:00
  const lastWeek = new Date();
  lastWeek.setDate(lastWeek.getDate() - 7);
  // created more than 7 days ago and not confirmed
  Collection.deleteMany({ created_at: { $lte: lastWeek }, confirmed: { $ne: true } });
});
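For completeness: MongoDB can also expire such documents natively with a TTL index plus a partial filter, which is presumably what the screenshot in the question attempts. A sketch for the mongo shell, assuming a createdAt field (note that partialFilterExpression only supports equality-style operators, so documents would need an explicit confirmed: false rather than a missing field):

db.collection.createIndex(
  { createdAt: 1 },
  {
    expireAfterSeconds: 604800, // 7 days
    partialFilterExpression: { confirmed: false }
  }
)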
I have a DynamoDB table with the following items:

{
    "jobId": <job1>,
    "cron": "* 5 * * *"
},
{
    "jobId": <job2>,
    "cron": "* 8 * * *"
}
I need to scan for items whose next execution time, based on the cron string, is within the next 5 minutes of the current time.
Is there a way I can convert the cron to a valid next execution time while scanning?
I am using Node.js in AWS Lambda and the cron-parser npm library to extract the next execution time from the cron string.
Note that scanning the full table will slow down over time. You may want to consider some other data store or structure to store this data.
That said, something like this could work:
// assumes the AWS SDK v2 DocumentClient and the cron-parser package
const AWS = require('aws-sdk');
const cronParser = require('cron-parser');
const client = new AWS.DynamoDB.DocumentClient();

const results = await client.scan({ TableName: 'tableName' }).promise();
const cronItems = results.Items;
const now = new Date();
const fiveMinMillis = 5 * 60 * 1000;
const within5Mins = cronItems.filter((item) => {
  // filtering the items (rather than bare intervals) keeps each jobId
  // attached to its next execution time
  const interval = cronParser.parseExpression(item.cron, { currentDate: now });
  const timeUntil = interval.next().toDate().getTime() - now.getTime();
  return timeUntil < fiveMinMillis;
});
Note you will actually need to call scan(...) iteratively until the response no longer includes a LastEvaluatedKey attribute; see the DynamoDB documentation on paginating Scan results for details.
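A minimal sketch of that pagination loop (assumes the same client as above):

async function scanAll(params) {
  const items = [];
  let lastKey;
  do {
    const page = await client.scan({ ...params, ExclusiveStartKey: lastKey }).promise();
    items.push(...page.Items);
    lastKey = page.LastEvaluatedKey;
  } while (lastKey);
  return items;
}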
After reading the docs on ServerValue.TIMESTAMP, I was under the impression that once the object hits the database, the timestamp placeholder evaluates once and remains the same, but this was not the case for me:
// Example on Node:
> const db = f.FIREBASE_APP.database();
> const timestamp = f.FIREBASE_APP.database.ServerValue.TIMESTAMP;
> const ref = db.ref('/test');
> ref.on(
... 'child_added',
... function(snapshot) {
..... console.log(`Timestamp from listener: ${snapshot.val().timestamp}`);
..... }
... )
> var child_key = "";
> ref.push({timestamp: timestamp}).then(
... function(thenable_ref) {
..... child_key = thenable_ref.key;
..... }
... );
Timestamp from listener: 1534373384299
> ref.child(child_key).once('value').then(
... function(snapshot) {
..... console.log(`Timestamp after querying: ${snapshot.val().timestamp}`);
..... }
... );
> Timestamp after querying: 1534373384381
> 1534373384299 < 1534373384381
true
The timestamp delivered to the on listener is different from the value returned when querying later.
Is this like this by design and I just missed some parts of the documentation? If this is the case, when does the ServerValue.TIMESTAMP stabilize?
I am building a CQRS/ES library on the Realtime Database, and just wanted to avoid the expected_version (or sequence numbers) of events.
UPDATE
The proof for Frank's explanation below:
/* `db`, `ref` and `timestamp` are defined above,
and the test path ("/test") has been deleted
from DB beforehand to avoid noise.
*/
> ref.on(
... 'child_added',
... function(snapshot) {
..... console.log(`Timestamp from listener: ${snapshot.val().timestamp}`);
..... }
... )
> ref.on(
... 'value',
... function(snapshot) {
..... console.log(snapshot.val());
..... }
... )
> ref.push({timestamp: timestamp}); null;
Timestamp from listener: 1534434409034
{ '-LK2Pjd8FS_L8hKqIpiE': { timestamp: 1534434409034 } }
{ '-LK2Pjd8FS_L8hKqIpiE': { timestamp: 1534434409114 } }
Bottom line: if you need to rely on immutable server-side timestamps, keep this in mind or work around it.
When you perform the ref.push({timestamp: timestamp}), the Firebase client immediately makes an estimate of the timestamp on the client and fires an event for that locally. It then sends the command off to the server.
Once the Firebase client receives the response from the server, it checks whether the actual timestamp is different from its estimate. If it is indeed different, the client fires reconciliatory events.
You can see this most easily by attaching your value listener before setting the value. You'll see it fire with both the initial estimated value and the final value from the server.
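A minimal sketch of that two-phase firing (the path is assumed; db is a Realtime Database handle as in the question):

const ref = db.ref('/demo');
ref.on('value', (snapshot) => console.log(snapshot.val()));
ref.set({ timestamp: f.FIREBASE_APP.database.ServerValue.TIMESTAMP });
// logs once with the client's local estimate,
// then again with the reconciled server value (if it differs)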
Also see:
How to use the Firebase server timestamp to generate date created?
Trying to convert Firebase timestamp to NSDate in Swift
firebase.database.ServerValue.TIMESTAMP return an Object
CAVEAT: After wasting another day, the ultimate solution is not to use Firebase server timestamps at all if you have to compare them in a use case similar to the one below. When events come in fast enough, the second 'value' update may not trigger at all.
One solution to the double-update condition Frank describes in his answer, to get the final server timestamp value, is (1) to embed an on('value', ...) listener inside an on('child_added', ...) callback and (2) to remove the on('value', ...) listener as soon as the specific use case permits.
> const db = f.FIREBASE_APP.database();
> const ref = db.ref('/test');
> const timestamp = f.FIREBASE_APP.database.ServerValue.TIMESTAMP;
> ref.on(
    'child_added',
    function(child_snapshot) {
      console.log(`Timestamp in 'child_added': ${child_snapshot.val().timestamp}`);
      ref.child(child_snapshot.key).on(
        'value',
        function(child_value_snapshot) {
          // Do a timestamp comparison here and remove the `on('value',...)`
          // listener here, but keep in mind:
          // + it will fire TWICE when a new child is added
          // + but only ONCE for previously added children!
          console.log(`Timestamp in embedded 'event': ${child_value_snapshot.val().timestamp}`);
        }
      )
    }
  )
// One child was already in the bank, when above code was invoked:
Timestamp in 'child_added': 1534530688785
Timestamp in embedded 'event': 1534530688785
// Adding a new event:
> ref.push({timestamp: timestamp});null;
Timestamp in 'child_added': 1534530867511
Timestamp in embedded 'event': 1534530867511
Timestamp in embedded 'event': 1534530867606
In my CQRS/ES case, events get written into the "/event_store" path, and the 'child_added' listener updates the accumulated state whenever new events come in, where each event has a ServerValue.TIMESTAMP. The listener compares the new event's timestamp with the state's to decide whether the new event should be applied or already has been (this mostly matters when the server has been restarted and the internal in-memory state is being rebuilt). Link to the full implementation, but here's a shortened outline of how single/double firing has been handled:
event_store.on(
  'child_added',
  function(event_snapshot) {
    const event_id = event_snapshot.key;
    const event_ref = event_store.child(event_id);
    event_ref.on(
      'value',
      function(event_value_snapshot) {
        const event_timestamp = event_value_snapshot.val().timestamp;
        if (event_timestamp <= state_timestamp) {
          // === 1 =======
          event_ref.off();
          // =============
        } else {
          var next_state = {};
          if (event_id === state.latest_event_id) {
            next_state["timestamp"] = event_timestamp;
            Object.assign(state, next_state);
            db.ref("/state").child(stream_id).update(state);
            // === 2 =======
            event_ref.off();
            // =============
          } else {
            next_state = event_handler(event_snapshot, state);
            next_state["latest_event_id"] = event_id;
            Object.assign(state, next_state);
          }
        }
      }
    );
  }
);
When the server is restarted, on('child_added', ...) goes through all events already in the "/event_store", attaching on('value', ...) dynamically to each child and comparing the events' timestamps to the current state's.
If the event is older than the current state (event_timestamp <= state_timestamp), the only action is detaching the 'value' listener. This callback fires only once, because the ServerValue.TIMESTAMP placeholder has already been resolved in the past.
Otherwise the event is newer, meaning it hasn't been applied to the current state yet and its ServerValue.TIMESTAMP hasn't been evaluated either, so the callback fires twice. To handle the double update, the "=== 2 ===" block above saves the child's key (event_id) to the state (as latest_event_id) and compares it to the incoming event's key.
What's the query or some other quick way to delete all the documents matching the where condition in a collection?
I want something like DELETE * FROM c WHERE c.DocumentType = 'EULA' but, apparently, it doesn't work.
Note: I'm not looking for any C# implementation for this.
This is a bit old, but I just had the same requirement and found a concrete example of what @Gaurav Mantri wrote about.
The stored procedure script is here:
https://social.msdn.microsoft.com/Forums/azure/en-US/ec9aa862-0516-47af-badd-dad8a4789dd8/delete-multiple-docdb-documents-within-the-azure-portal?forum=AzureDocumentDB
Go to the Azure portal, grab the script from above and make a new stored procedure in the database->collection you need to delete from.
Then, right at the bottom of the stored procedure pane, underneath the script textarea, is a place to put in the parameter. In my case I just wanted to delete all, so I used:
SELECT c._self FROM c
I guess yours would be:
SELECT c._self FROM c WHERE c.DocumentType = 'EULA'
Then hit 'Save and Execute'. Voila, some documents get deleted. After I got it working in the Azure Portal I switched over to Azure DocumentDB Studio and got a better view of what was happening, i.e. I could see I was throttled to deleting 18 at a time (returned in the results). For some reason I couldn't see this in the Azure Portal.
Anyway, pretty handy even if limited to a certain number of deletes per execution. Executing the sp is also throttled, so you can't just mash the keyboard. I think I would just delete and recreate the collection unless I had a manageable number of documents to delete (thinking <500).
Props to Mimi Gentz @Microsoft for sharing the script in the link above.
HTH
I want something like DELETE * FROM c WHERE c.DocumentType = 'EULA'
but, apparently, it doesn't work.
Deleting documents this way is not supported. You would need to first select the documents using a SELECT query and then delete them separately. If you want, you can write the code for fetching & deleting in a stored procedure and then execute that stored procedure.
I wrote a script to list all the documents and delete all the documents; it can be modified to delete only selected documents as well.
var docdb = require("documentdb");
var async = require("async");

var config = {
  host: "https://xxxx.documents.azure.com:443/",
  auth: {
    masterKey: "xxxx"
  }
};

var client = new docdb.DocumentClient(config.host, config.auth);
var messagesLink = docdb.UriFactory.createDocumentCollectionUri("xxxx", "xxxx");

var listAll = function(callback) {
  var spec = {
    query: "SELECT * FROM c",
    parameters: []
  };
  client.queryDocuments(messagesLink, spec).toArray((err, results) => {
    callback(err, results);
  });
};

var deleteAll = function() {
  listAll((err, results) => {
    if (err) {
      console.log(err);
    } else {
      async.forEach(results, (message, next) => {
        client.deleteDocument(message._self, err => {
          if (err) {
            console.log(err);
            next(err);
          } else {
            next();
          }
        });
      });
    }
  });
};

var task = process.argv[2];
switch (task) {
  case "listAll":
    listAll((err, results) => {
      if (err) {
        console.error(err);
      } else {
        console.log(results);
      }
    });
    break;
  case "deleteAll":
    deleteAll();
    break;
  default:
    console.log("Commands:");
    console.log("listAll deleteAll");
    break;
}
And if you want to do it in C#/Dotnet Core, this project may help: https://github.com/lokijota/CosmosDbDeleteDocumentsByQuery. It's a simple Visual Studio project where you specify a SELECT query, and all the matches will be a) backed up to file; b) deleted, based on a set of flags.
Create a stored procedure in the collection and execute it, passing a SELECT query with the condition for what to delete. The major reason to use this stored proc is its use of the continuation token, which reduces RUs to a huge extent and costs less.
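A condensed sketch of such a stored procedure (modeled on the widely circulated bulkDelete sample; the SELECT query is passed in as the parameter, and the caller re-runs the sproc until the returned count is 0):

function bulkDeleteSproc(query) {
    var collection = getContext().getCollection();
    var response = getContext().getResponse();
    var deleted = 0;

    queryAndDelete();

    function queryAndDelete() {
        var accepted = collection.queryDocuments(collection.getSelfLink(), query, {},
            function(err, docs) {
                if (err) throw err;
                if (docs.length > 0) deleteDocs(docs);
                else response.setBody(deleted); // nothing left to delete
            });
        if (!accepted) response.setBody(deleted); // out of time; re-run the sproc
    }

    function deleteDocs(docs) {
        var doc = docs.shift();
        var accepted = collection.deleteDocument(doc._self, {},
            function(err) {
                if (err) throw err;
                deleted++;
                if (docs.length > 0) deleteDocs(docs);
                else queryAndDelete(); // query again for more matches
            });
        if (!accepted) response.setBody(deleted);
    }
}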
Here is a Python script which can be used to delete data from a partitioned Cosmos collection. It deletes documents id by id based on the result-set data.
Identify the data that needs to be deleted before the step below (this example pulls the ids from a Spark sqlContext):
from pydocumentdb import document_client

res_list = "select id from id_del"
res_id = [{"id": x["id"]} for x in sqlContext.sql(res_list).rdd.collect()]

config = {
    "Endpoint": "Use EndPoint",
    "Masterkey": "UseKey",
    "WritingBatchSize": "5000",
    "DOCUMENTDB_DATABASE": "Database",
    "DOCUMENTDB_COLLECTION": "collection-core"
}
# Initialize the Python DocumentDB client (once, outside the loop)
client = document_client.DocumentClient(config['Endpoint'], {'masterKey': config['Masterkey']})

for row in res_id:
    # use a SQL-based query to get the document, looping through partitions
    query = {'query': "SELECT c.id FROM c where c.id = " + "'" + row["id"] + "'"}
    print(query)
    options = {}
    options['enableCrossPartitionQuery'] = True
    options['maxItemCount'] = 1000
    result_iterable = client.QueryDocuments('dbs/Database/colls/collection-core', query, options)
    results = list(result_iterable)
    print('DOCS TO BE DELETED : ' + str(len(results)))
    if len(results) > 0:
        for i in range(0, len(results)):
            docID = results[i]['id']
            print("docID :" + docID)
            options = {}
            options['enableCrossPartitionQuery'] = True
            options['maxItemCount'] = 1000
            options['partitionKey'] = docID  # assumes id doubles as the partition key value
            client.DeleteDocument('dbs/Database/colls/collection-core/docs/' + docID, options=options)
            print('deleted Partition:' + docID)
I want to include node scheduling code in my Sails app, but I don't know where to put it.
I tried putting the code in my config/bootstrap.js, but it doesn't run. The code is:
sails.on('lifted', function() {
  var schedule = require('node-schedule');
  var j = schedule.scheduleJob({hour: 0, minute: 1, dayOfWeek: 0}, function() {
    console.log('Time for tea!');
  });
});
I want to know where to put this code. The main condition is that it runs every time my Sails server lifts.
Sorry if the code isn't well formatted, since I am on my tablet. This was surprisingly easy. I used node-cron and put the following code in my services folder as a file called Cron.js. I do not believe I had to place any code anywhere else to start the job.
/*
https://github.com/ncb000gt/node-cron
Read up on cron patterns here: http://crontab.org/
When specifying your cron values you'll need to make sure that your values fall within the ranges. For instance, some crons use a 0-7 range for the day of week, where both 0 and 7 represent Sunday. We do not.
Seconds: 0-59
Minutes: 0-59
Hours: 0-23
Day of Month: 1-31
Months: 0-11
Day of Week: 0-6
How to check if a cron pattern is valid:
try {
  new CronJob('invalid cron pattern', function() {
    console.log('this should not be printed');
  })
} catch(ex) {
  console.log("cron pattern not valid");
}
constructor(cronTime, onTick, onComplete, start, timezone, context) - Of note, the first parameter here can be a JSON object that has the below names and associated types (see examples above).
cronTime - [REQUIRED] - The time to fire off your job. This can be in the form of cron syntax or a JS Date object.
onTick - [REQUIRED] - The function to fire at the specified time.
onComplete - [OPTIONAL] - A function that will fire when the job is complete, i.e. when it is stopped.
start - [OPTIONAL] - Specifies whether to start the job just before exiting the constructor. By default this is set to false. If left at default you will need to call job.start() in order to start the job (assuming job is the variable you set the cronjob to).
timeZone - [OPTIONAL] - Specify the timezone for the execution. This will modify the actual time relative to your timezone.
context - [OPTIONAL] - The context within which to execute the onTick method. This defaults to the cronjob itself, allowing you to call this.stop(). However, if you change this you'll have access to the functions and values within your context object.
start - Runs your job.
stop - Stops your job.
*/
var CronJob = require('cron').CronJob;
var job = new CronJob({
  cronTime: '00 30 11 * * 1-5',
  onTick: function() {
    // Runs every weekday (Monday through Friday)
    // at 11:30:00 AM. It does not run on Saturday
    // or Sunday.
  },
  start: false,
  timeZone: "America/Los_Angeles"
});
job.start();
As you rightly said, config/bootstrap.js is one place where you can have your job schedulers.
The below cron will execute at 00 hrs 00 mins 00 secs daily.
From the left:
The first 00 is seconds and can hold values (00-59)
The second 00 is minutes and can hold values (00-59)
The third 00 is hours and can hold values (00-23)
The fourth position * is the day of month and can hold values (1-31)
The fifth position * is the month and can hold values (0-11)
The sixth position * is the day of week and can hold values (0-6)
module.exports.bootstrap = function(cb) {
  try {
    var CronJob = require('cron').CronJob;
    new CronJob('00 00 00 * * *', function() {
      sails.controllers.controllerName.functionName(sails.request, sails.response, sails.next, function(err, data) {
        console.log(err, "err");
      });
    }, null, true);
  }
  catch(ex) {
    console.log("cron pattern not valid");
  }
  cb();
};
Note: the controller function you invoke this way should not contain res.json or any other res.* call.
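A sketch of a cron-safe controller function matching the call above (file and function names assumed; the result goes to the callback rather than the response):

// api/controllers/ControllerNameController.js
module.exports = {
  functionName: function(req, res, next, done) {
    // do the scheduled work here, then report back through the callback
    done(null, { ok: true });
  }
};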
I've been trying to create a script that takes info from a Google Spreadsheet and creates Google Calendar events. I'm new to this, so bear with my lack of in-depth coding knowledge!
I initially used this post to create the code:
Create Google Calendar Events from Spreadsheet but prevent duplicates
I then worked out that it was timing out due to the number of rows on the spreadsheet, and wasn't creating eventIDs to avoid the duplicates. I got an answer here to work that out!
Google Script that creates Google Calendar events from a Google Spreadsheet - "Exceeded maximum execution time"
And now I've realised that it's overwriting the formulas I have in the spreadsheet, which auto-complete into each row, as follows:
Row 12 - =if(E4="","",E4+1) // Row 13 - =if(C4="","",C4+1) // Row 18 - =if(B4="","","WHC - "&B4) // Row 19 - =if(B4="","","Docs - "&B4)
Does anyone have any idea how I can stop it doing this?
/**
 * Adds a custom menu to the active spreadsheet, containing menu items
 * for invoking the exportWHCs() and exportDocs() functions.
 * The onOpen() function, when defined, is automatically invoked whenever the
 * spreadsheet is opened.
 * For more information on using the Spreadsheet API, see
 * https://developers.google.com/apps-script/service_spreadsheet
 */
function onOpen() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet();
  var entries = [{
    name: "Export WHCs",
    functionName: "exportWHCs"
  },
  {
    name: "Export Docs",
    functionName: "exportDocs"
  }];
  sheet.addMenu("Calendar Actions", entries);
};
/**
 * Export events from spreadsheet to calendar
 */
function exportWHCs() {
  // Check whether the script runs for the first time or not;
  // if so, create the recurring trigger and the script property used
  // as a counter of processed items, so later runs can resume the task.
  if (PropertiesService.getScriptProperties().getKeys().length == 0) {
    PropertiesService.getScriptProperties().setProperties({'itemsprocessed': 0});
    ScriptApp.newTrigger('exportWHCs').timeBased().everyMinutes(5).create();
  }
  // Initialize all variables when we start a new task
  var itemsProcessed = Number(PropertiesService.getScriptProperties().getProperty('itemsprocessed'));
  var startTime = new Date().getTime();
  var sheet = SpreadsheetApp.getActiveSheet();
  var headerRows = 4; // Number of rows of header info (to skip)
  var range = sheet.getDataRange();
  var data = range.getValues();
  var calId = "flightcentre.com.au_pma5g2rd5cft4lird345j7pke8#group.calendar.google.com";
  var cal = CalendarApp.getCalendarById(calId);
  for (i in data) {
    if (i < headerRows) continue; // Skip header row(s)
    if (i < itemsProcessed) continue; // Skip rows handled in a previous run
    var row = data[i];
    var date = new Date(row[12]); // event date (column 13)
    var title = row[18]; // event title (column 19)
    var tstart = new Date(row[15]);
    tstart.setDate(date.getDate());
    tstart.setMonth(date.getMonth());
    tstart.setFullYear(date.getFullYear());
    var tstop = new Date(row[16]);
    tstop.setDate(date.getDate());
    tstop.setMonth(date.getMonth());
    tstop.setFullYear(date.getFullYear());
    var id = row[17]; // eventId (column 18)
    // Check if event already exists, update it if it does
    try {
      var event = cal.getEventSeriesById(id);
    }
    catch (e) {
      // do nothing - we just want to avoid the exception when the event doesn't exist
    }
    if (!event) {
      var newEvent = cal.createEvent(title, tstart, tstop).addEmailReminder(5).getId();
      row[17] = newEvent; // Update the data array with the event ID
    }
    else {
      event.setTitle(title);
    }
    if (new Date().getTime() - startTime > 240000) { // if > 4 minutes, save progress and stop
      var processed = Number(i) + 1; // for-in keys are strings, so convert before adding
      PropertiesService.getScriptProperties().setProperties({'itemsprocessed': processed});
      range.setValues(data);
      MailApp.sendEmail(Session.getEffectiveUser().getEmail(), 'progress sheet to cal', 'items processed: ' + processed);
      return;
    }
  }
  // Record all event IDs to the spreadsheet
  range.setValues(data);
}
/**
 * Export events from spreadsheet to calendar
 */
function exportDocs() {
  // Check whether the script runs for the first time or not;
  // if so, create the recurring trigger and the script property used
  // as a counter of processed items, so later runs can resume the task.
  if (PropertiesService.getScriptProperties().getKeys().length == 0) {
    PropertiesService.getScriptProperties().setProperties({'itemsprocessed': 0});
    ScriptApp.newTrigger('exportDocs').timeBased().everyMinutes(5).create();
  }
  // Initialize all variables when we start a new task
  var itemsProcessed = Number(PropertiesService.getScriptProperties().getProperty('itemsprocessed'));
  var startTime = new Date().getTime();
  var sheet = SpreadsheetApp.getActiveSheet();
  var headerRows = 4; // Number of rows of header info (to skip)
  var range = sheet.getDataRange();
  var data = range.getValues();
  var calId = "flightcentre.com.au_pma5g2rd5cft4lird345j7pke8#group.calendar.google.com";
  var cal = CalendarApp.getCalendarById(calId);
  for (i in data) {
    if (i < headerRows) continue; // Skip header row(s)
    if (i < itemsProcessed) continue; // Skip rows handled in a previous run
    var row = data[i];
    var date = new Date(row[13]); // event date (column 14)
    var title = row[19]; // event title (column 20)
    var tstart = new Date(row[15]);
    tstart.setDate(date.getDate());
    tstart.setMonth(date.getMonth());
    tstart.setFullYear(date.getFullYear());
    var tstop = new Date(row[16]);
    tstop.setDate(date.getDate());
    tstop.setMonth(date.getMonth());
    tstop.setFullYear(date.getFullYear());
    var id = row[20]; // eventId (column 21)
    // Check if event already exists, update it if it does
    try {
      var event = cal.getEventSeriesById(id);
    }
    catch (e) {
      // do nothing - we just want to avoid the exception when the event doesn't exist
    }
    if (!event) {
      var newEvent = cal.createEvent(title, tstart, tstop).addEmailReminder(5).getId();
      row[20] = newEvent; // Update the data array with the event ID
    }
    else {
      event.setTitle(title);
    }
    if (new Date().getTime() - startTime > 240000) { // if > 4 minutes, save progress and stop
      var processed = Number(i) + 1; // for-in keys are strings, so convert before adding
      PropertiesService.getScriptProperties().setProperties({'itemsprocessed': processed});
      range.setValues(data);
      MailApp.sendEmail(Session.getEffectiveUser().getEmail(), 'progress sheet to cal', 'items processed: ' + processed);
      return;
    }
  }
  // Record all event IDs to the spreadsheet
  range.setValues(data);
}
You have two ways to solve that problem.
First possibility: update your sheet with array data only on the columns that have no formulas, proceeding as in this other post; but in your case (with multiple columns to skip) it will rapidly become tricky.
Second possibility (the one I would personally choose, because I'm not a "formula fan"): do what your formulas do in the script itself, i.e. translate the formulas into array-level operations.
Following your example, =if(E4="","",E4+1) would become something like data[n][4] = data[n][4] == '' ? '' : data[n+1][4]; if I understood the logic (but I'm not so sure...).
EDIT
There is actually a third solution that is even simpler (go figure why I didn't think about it in the first place...). You could save the ranges that have formulas; for example, if column M has formulas you want to keep, use:
var formulM = sheet.getRange('M1:M').getFormulas();
and then, at the end of the function (after the global setValues()), rewrite the formulas using:
sheet.getRange('M1:M').setFormulas(formulM);
to restore all the previous formulas... as simple as that. Repeat for every column where you need to keep the formulas.
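Put together, a sketch of how that fits into the export functions above (this assumes column M is the formula column to preserve, and reuses sheet, range and data from exportWHCs):

var formulaRange = sheet.getRange('M1:M');
var savedFormulas = formulaRange.getFormulas(); // snapshot before writing
range.setValues(data);                          // the global write that clobbers formulas
formulaRange.setFormulas(savedFormulas);        // restore the formulas afterwards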