How to run a job through Queue in ArangoDB

I am moving from ArangoDB 2.5.7 to ArangoDB 3.1.7. I have managed to make everything work except the jobs. I have looked at the documentation and I don't understand whether I have to create a separate service just for that.
So, I have a Foxx application, myApp.
manifest.json
{
  "name": "myApp",
  "version": "0.0.1",
  "author": "Deepak",
  "files": {
    "/static": "static"
  },
  "engines": {
    "arangodb": "^3.1.7"
  },
  "scripts": {
    "setup": "./scripts/setup.js",
    "myJob": "./scripts/myJob.js"
  },
  "main": "index.js"
}
index.js
'use strict';
module.context.use('/one', require('./app'));
app.js
const createRouter = require('@arangodb/foxx/router');
const controller = createRouter();
module.exports = controller;
const queues = require('@arangodb/foxx/queues');
const queue = queues.create('myQueue', 2);
queue.push({mount: "/myJob", name: "myJob"}, {"a": 4}, {"allowUnknown": true});
myJob.js
const argv = module.context.argv;
var obj = argv[0];
console.log('obj:'+obj);
I get following error:
Job failed:
ArangoError: service not found
Mount path: "/myJob".
I am not sure if I have to extract myJob into an external service. Can you help me? I don't see a complete example of how to do it.

To answer your question:
You do not have to extract the job script into a new service. You can specify the mount point of the current service by using module.context.mount.
You can find more information about the context object in the documentation: https://docs.arangodb.com/3.1/Manual/Foxx/Context.html
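Applied to the app.js from the question, the push would look roughly like this (only the mount reference changes):

'use strict';
const queues = require('@arangodb/foxx/queues');
const queue = queues.create('myQueue', 2);

// reference the script by the current service's mount point
// instead of the hard-coded "/myJob"
queue.push({mount: module.context.mount, name: 'myJob'}, {a: 4});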
By the way, it's probably not a good idea to arbitrarily create jobs at mount-time. The common use case for the queue is to create jobs in route handlers as a side-effect of incoming requests (e.g. to dispatch a welcome e-mail on signup).
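For illustration, a route handler that enqueues the job as a side effect of a request could look like this (a sketch; the /enqueue route and its response are made up for this example):

// app.js (hypothetical route)
const createRouter = require('@arangodb/foxx/router');
const queues = require('@arangodb/foxx/queues');
const router = createRouter();
const queue = queues.create('myQueue', 2);

router.post('/enqueue', function (req, res) {
  // the request body becomes module.context.argv[0] in myJob.js
  queue.push({mount: module.context.mount, name: 'myJob'}, req.json());
  res.json({queued: true});
});

module.exports = router;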
If you create a job at mount-time (e.g. in your main file or a file required by it), the job will be created whenever the file is executed, which will be at least once for each Foxx thread (by default ArangoDB uses multiple Foxx threads to handle parallel requests), or, when development mode is enabled, once per request(!).
Likewise if you create a job in your setup script it will be created whenever the setup script is executed, although this will only happen in one thread each time (but still once per request when development mode is active).
If you need e.g. a periodic job that lives alongside your service, you should put it in a unique queue and only create it in your setup script after checking whether it already exists.
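A minimal sketch of that pattern in scripts/setup.js (assuming queue.all() accepts a script filter, as described in the Foxx queues documentation):

'use strict';
const queues = require('@arangodb/foxx/queues');

// create() returns the existing queue if one with this name already exists
const queue = queues.create('myQueue', 2);
const script = {mount: module.context.mount, name: 'myJob'};

// only enqueue the job if no job for this script exists yet
if (queue.all(script).length === 0) {
  // e.g. repeat hourly, 24 times; see the repeat* options in the Scripts docs
  queue.push(script, {a: 4}, {repeatTimes: 24, repeatDelay: 60 * 60 * 1000});
}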
On the changes in the queue API:
The queue API changed in 2.6 due to a serious issue with the old API that would frequently result in pending jobs not being properly rescheduled when the ArangoDB daemon was restarted after a job had been pushed to the queue.
Specifically, ArangoDB 2.6 introduced so-called script-based (rather than function-based) job types: https://docs.arangodb.com/3.1/Manual/ReleaseNotes/UpgradingChanges26.html#foxx-queues
Support for the old function-based job types was dropped in ArangoDB 2.7 and the cookbook recipe was updated to reflect script-based job types: https://docs.arangodb.com/2.8/cookbook/FoxxQueues.html
A more detailed description of the new queue can be found in the documentation: https://docs.arangodb.com/3.1/Manual/Foxx/Scripts.html

Related

How to implement a pull-queue using Cloud Tasks in Node.js

I am trying to implement a pull-queue using Cloud Tasks + App Engine standard environment in Node.js. So basically I am trying to lease tasks from a queue. The problem is that I can only find examples in other languages, and I can find no mention of creating or leasing tasks for pull queues in the GCP Node.js documentation.
Please tell me this is possible and I do not need to start using a different language in my project only to implement a pull-queue mechanism.
Here is a link to the equivalent Python documentation
--- edit ---
I managed to find a reference in the types that allowed me to do this:
import { v2beta2 } from "@google-cloud/tasks";

const client = new v2beta2.CloudTasksClient();

const [{ tasks }] = await client.leaseTasks({
  parent: client.queuePath(project, location, "my-pull-queue"),
  maxTasks: 100,
});
...but it is giving me some odd quota error:
Error: Failed to lease tasks from the my-pull-queue queue: 8
RESOURCE_EXHAUSTED: Quota exceeded for quota metric 'Alpha API
requests' and limit 'Alpha API requests per minute (should be 0 unless
whitelisted)' of service 'cloudtasks.googleapis.com' for consumer
'project_number:xxx'.
I can hardly find sources referencing this type of quota error, but it seems to stem from APIs that are not made public yet and can only be used when access is granted explicitly (which would explain the whitelisting).
Another thing I find very odd is that there seem to be two beta clients, v2beta2 and v2beta3, but only the v2beta2 types define methods for leasing a task. Both beta APIs define types for creating a pull-queue task.
I just found this statement that pull-queues are not supported in Node.js.
https://github.com/googleapis/nodejs-tasks/issues/123#issuecomment-445090253

Blob-triggered Azure Function doesn't process only one blob at a time anymore

I have written a blob-triggered function that uploads data to a CosmosDB database using the Gremlin API, using Azure Functions version 2.0. Whenever the function is triggered, it reads the blob, extracts the relevant information, and then queries the database to upload the data.
However, when all files are uploaded to blob storage at the same time, the Function processes all files at once, which results in more requests than the database can handle. To avoid this, I ensured that the Azure Function would only process one file at a time by setting batchSize to 1 in the host.json file:
{
  "extensions": {
    "queues": {
      "batchSize": 1,
      "maxDequeueCount": 1,
      "newBatchThreshold": 0
    }
  },
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "version": "2.0"
}
This worked perfectly fine for 20 files at a time.
Now we are trying to process 300 files at a time, and this setting doesn't seem to work anymore: the Function processes all the files at the same time again, which results in the database not being able to handle all the requests.
What am I missing here? Is there some scaling issue I'm not aware of?
From here:
If you want to avoid parallel execution for messages received on one queue, you can set batchSize to 1. However, this setting eliminates concurrency as long as your function app runs only on a single virtual machine (VM). If the function app scales out to multiple VMs, each VM could run one instance of each queue-triggered function.
You need to combine this with the app setting WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT when you run on the Consumption plan.
Or, according to the docs, the better way would be through the Function property functionAppScaleLimit: https://learn.microsoft.com/en-us/azure/azure-functions/event-driven-scaling#limit-scale-out
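For example, on a Consumption plan both settings can be applied with the Azure CLI roughly along these lines (a sketch; the app and resource group names are placeholders):

# cap scale-out via the app setting
az functionapp config appsettings set --name <FUNCTION_APP> --resource-group <RESOURCE_GROUP> --settings WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT=1

# or set the functionAppScaleLimit site property described in the linked docs
az resource update --resource-type Microsoft.Web/sites -g <RESOURCE_GROUP> -n <FUNCTION_APP>/config/web --set properties.functionAppScaleLimit=1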
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT would work, of course.
You can also scale to multiple Function App instances within one host; then you can have fewer hosts and a higher FUNCTIONS_WORKER_PROCESS_COUNT per host. Cost implications would depend on your plan.
Note that all workers within a host share resources, so this is recommended for more IO-bound workloads.

Service Fabric Application PackageDeployment Operation Time Out exception

I have a Service Fabric cluster, and 3 nodes are created on 3 systems which are interconnected. I am able to connect to each of the nodes. These nodes are created on Windows Server. These Windows Server VMs are on-premises.
I am trying to manually deploy my package to my cluster/one of the nodes, and I am getting an Operation Timeout exception. I have used the commands below for deployment.
Service Fabric PowerShell commands:
Copy-ServiceFabricApplicationPackage -ApplicationPackagePath 'c:\sample\etc' -ApplicationPackagePathInImageStore 'abc.app.portaltype'
After executing the above command, it runs for 2-3 minutes and throws an Operation Timeout exception. My package size is almost 250 MB and approximately 15,000 files exist in the package. I then passed an extra parameter, -TimeoutSec 600 (10 minutes), explicitly in the above command, and it executed successfully and the package was copied to the Service Fabric image store.
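In other words, the copy command that eventually succeeded looked roughly like this:

Copy-ServiceFabricApplicationPackage -ApplicationPackagePath 'c:\sample\etc' -ApplicationPackagePathInImageStore 'abc.app.portaltype' -TimeoutSec 600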
Register-ServiceFabricApplicationType -ApplicationPathInImageStore 'abc.app.portaltype'
After executing the Copy-ServiceFabricApplicationPackage command, I executed the above Register-ServiceFabricApplicationType command to register my application type in the cluster, but it also throws an Operation Timeout exception. I then passed the extra parameter -TimeoutSec 600 (10 minutes) explicitly in that command too, but no luck; it throws the same operation timeout exception.
Just to verify whether the operation timeout issue is caused by the number of files in the package, I created a simple empty Service Fabric ASP.NET Core app, created its package, and tried to deploy it to the same server using the above commands. It deployed within a fraction of a second and works smoothly.
Does anybody have any idea how to overcome this Service Fabric operation timeout issue?
How do I handle the operation timeout issue if the package contains a large set of files?
Any help/suggestion would be very appreciated.
Thanks,
If this is taking longer than the 10-minute default maximum, it's probably one of the following issues:
Large application packages (>100s of MB)
Slow network connections
A large number of files within the application package (>1000s).
The following workarounds should help you.
Add the following settings to your cluster config:
"fabricSettings": [
{
"name": "NamingService",
"parameters": [
{
"name": "MaxOperationTimeout",
"value": "3600"
},
]
}
]
Also add:
"fabricSettings": [
{
"name": "EseStore",
"parameters": [
{
"name": "MaxCursors",
"value": "32768"
},
]
}
]
There are a couple of additional features which are currently rolling out. For these to be present and functional, you need to be sure that the client is at least 2.4.28 and the runtime of your cluster is at least 5.4.157. If you're staying up to date, these should already be present in your environment.
For Register-ServiceFabricApplicationType you can specify the -Async flag, which handles the registration asynchronously, reducing the required timeout to just the time needed to send the command rather than to process the application package. You can also query the status of the registration with Get-ServiceFabricApplicationType. 5.5 fixes some issues with these commands, so if they aren't working for you, you'll have to wait for that release to hit your environment.
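For example, reusing the image store path from the question (a sketch; assumes your client and cluster runtime are at or above the versions mentioned above):

# provision asynchronously, then poll for the provisioning status
Register-ServiceFabricApplicationType -ApplicationPathInImageStore 'abc.app.portaltype' -Async
Get-ServiceFabricApplicationType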

Make scheduled tasks with node-schedule (using forever) persist after a restart

I want to develop a Node.js program that will be executed at a specific time using a job scheduler (node-schedule).
This program runs in the background using forever (a Node.js module).
Here's the content of my app.js:
var schedule = require('node-schedule');
~
~
var id = request.body.id;
var scheduled = schedule.scheduledJobs;
if (scheduled[id] != null) {
  // Tasks
} else {
  scheduled[id].cancel;
  delete scheduled[id];
}
But if app.js is killed for any reason, the schedule objects are lost, and sometimes app.js is restarted by forever.
How can I handle the node-schedule objects so they survive a restart?
I faced a similar problem recently and there are two solutions:
1. Use actual cron
2. Use a database
I solved my problem by using a database. Each time you create an event, save it to the database. In your app.js file, when the application starts, add a function that reads the database and re-creates the scheduled events accordingly (as sketched below).
The first option is better if you do not have dynamic tasks, i.e. you do not create new tasks or they are always the same.
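A minimal sketch of the database approach, assuming a connected MongoDB handle in db (any datastore works the same way; the jobs collection and runTask() are hypothetical placeholders):

var schedule = require('node-schedule');

// on startup: re-create every job that was persisted before the restart
function restoreJobs(db) {
  db.collection('jobs').find({ runAt: { $gt: new Date() } }).toArray(function (err, jobs) {
    if (err) throw err;
    jobs.forEach(function (job) {
      schedule.scheduleJob(job.id, new Date(job.runAt), function () {
        runTask(job); // hypothetical task handler
      });
    });
  });
}

// when a new task is requested: persist it first, then schedule it
function addJob(db, id, runAt) {
  db.collection('jobs').insertOne({ id: id, runAt: runAt }, function (err) {
    if (err) throw err;
    schedule.scheduleJob(id, new Date(runAt), function () {
      runTask({ id: id, runAt: runAt });
    });
  });
}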

Starting a scheduling service in sails.js with forever from within sails with access to all waterline models

I have a standalone scheduling service set to execute some logic every hour. I want to start this service with forever right after Sails starts, and I am not sure what the best way to do that is.
// services/Scheduler.js
sails.load(function () {
  setInterval(logicFn, config.schedulingInterval);
});
Sails can execute bootstrap logic in the config.bootstrap module, and I'll be using the forever-monitor Node module:
var forever = require('forever-monitor'),
    scheduler = new (forever.Monitor)(schedulerPath, {
      max: 20,
      silent: true,
      args: []
    });

module.exports.bootstrap = function (cb) {
  scheduler.start();
  cb();
};
What if the service fails and restarts for whatever reason? Would it have access to all the Waterline models again? How do I ensure it works as intended every time?
As brittonjb said in the comments, a simple solution is to use the cron module for scheduling.
You can specify a function for it to call at whatever interval you wish; this function could be defined within /config/bootstrap.js or it could be defined somewhere else (e.g. mail.dailyReminders() if you have a mail service with a dailyReminders method).
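For example, with the cron package in /config/bootstrap.js (a sketch; the hourly pattern and the commented Mail.dailyReminders() call are placeholders):

// config/bootstrap.js
var CronJob = require('cron').CronJob;

module.exports.bootstrap = function (cb) {
  // run at minute 0 of every hour; the fourth argument starts the job immediately
  new CronJob('0 * * * *', function () {
    // services and Waterline models are available here, e.g.:
    // Mail.dailyReminders();
  }, null, true);

  cb();
};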
Please please please, always share your sails.js version number! This is really important for people googling questions/answers!
There are many ways to go about doing this. However, for those that want the "sails.js" way, there are hooks for newer sails.js versions.
See this issue thread on GitHub; specifically, after the issue is closed, some very helpful solutions are provided by users. The latest is shared by "scott-wyatt", commented on Dec 28, 2014:
https://github.com/balderdashy/sails/issues/2092
