In my Lotus Notes workflow application, I have a scheduled server agent (it runs every five minutes). When users act on a document, a server-side agent is also triggered (this agent modifies that document on the server). In production, we are receiving many complaints that processing is incomplete or sometimes does not happen at all. I checked the server configuration and found that only 4 agents can run concurrently. Since this is a global application with over 50,000 users, the only thing I can blame for these issues is the volume of agent runs, but I'm not sure I'm correct (I'm a developer and lack knowledge in this area). Can someone help me confirm whether my reasoning about simultaneous agents is correct and help me understand how to solve this? Please also point me to references. Thank you in advance!
Important thing to remember.
Scheduled agents on the server will only run one agent from the same database at any given time!
So if you have Database A with agents X (every 5 minutes) and Y (every 10 minutes), it will first run X. Once X completes, whichever is scheduled next (X or Y) will run. It will never run X and Y at the same time if they are in the same database.
This is intended behaviour to stop possible deadlocks within the database agents.
You also have a schedule queue, which limits the number of agent runs that can be queued. For example, if Agent X is scheduled every 5 minutes but takes 10 minutes to complete, the schedule queue will slowly fill up and eventually run out of space.
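To make the backlog concrete, here is a minimal Python sketch (not Domino code) that models one agent which is queued at a fixed interval and never overlaps with itself. The function name `simulate_backlog` and the numbers are illustrative assumptions; the real Agent Manager queue has its own limits and behaviour.

```python
def simulate_backlog(interval_min, runtime_min, horizon_min):
    """Return the number of queued-but-not-started runs at each schedule tick.

    Assumption: one run is queued every `interval_min` minutes, each run takes
    `runtime_min` minutes, and only one run executes at a time.
    """
    scheduled = list(range(0, horizon_min + 1, interval_min))

    # Work out when each queued run can actually start.
    starts = []
    free_at = 0  # time at which the single execution slot becomes free
    for t in scheduled:
        start = max(t, free_at)
        starts.append(start)
        free_at = start + runtime_min

    # At every tick, count runs that were queued but have not started yet.
    backlog = []
    for t in scheduled:
        queued = sum(1 for s in scheduled if s <= t)
        started = sum(1 for s in starts if s <= t)
        backlog.append(queued - started)
    return backlog


# Every 5 minutes, 10 minutes per run: the waiting list keeps growing.
print(simulate_backlog(5, 10, 30))
# Every 5 minutes, 3 minutes per run: nothing ever waits.
print(simulate_backlog(5, 3, 30))
```

In this model the backlog only stays flat when the run time is no longer than the interval, which is the condition the paragraph above describes.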
So how do you work around this? There are a few ways.
Option 1: Use Program Documents on the server.
Set the agent's schedule to "Never" and have a program document execute the agent with this command:
tell amgr run "dir/database.nsf" 'agentName'
You will be able to run agents in <5 minute schedule.
You can run multiple agents in the same database.
You have to be aware of what the agent is interacting with, and code for it to handle other agents or itself running at the same time.
There can be serious performance implications in doing this. You need to be aware of what is going on in the server and how it would impact it.
If you have lots of databases, the program document list becomes messy and hard to maintain.
Agents via "Tell AMGR" will not be terminated if they exceed the agent execution time allowed on the server. They have to be manually killed.
There is no easy way to determine which agents are running or have run.
Option 2: Create an agent which calls out to web agents.
You will be able to run agents in <5 minute schedule.
You can run multiple agents in the same database.
You have slightly better control of what runs via another agent.
You need HTTP running on the server.
There are performance implications in doing this, and again you need to be aware of how it will interact with the system if multiple instances or other agents run at the same time.
Agents will not be terminated if they exceed the agent execution time allowed on the server.
You will need to allow concurrent web agents/web services on the server or you can potentially hang the server.
Option 3: Change from scheduled to another trigger.
For example "When new mail arrives". Overall this is the better option of the three.
In closing, I would say that you should rarely use "Execute every 5 mins", unless it is a critical agent that isn't going to be executed by multiple users across different databases.
I have been looking for a time based persistent scheduler. I looked into some applications (Agenda, node-cron, node-schedule). But I couldn't find anything that satisfies my criteria.
My application sends out reminders to our customers based on their event timings. I am hesitant to run a regular cron job because I would have to run it every 15 minutes or so, and each run would make a database call. I am trying not to use resources unnecessarily.
In addition, I am already running a lot of cron jobs. In this case, when the job is completed, I want the cron job to be cancelled or finished, not to live in memory until the next server restart.
I tried the libraries mentioned above (agenda, node-cron, node-schedule) by setting exact timestamps. But the cron job lives on forever even after the job is completed, and if I restart the server, all the scheduled jobs are gone. So persistence is also an issue I am facing.
My server uses Node.js. If there are any other languages or tools that would make this work, I am all ears.
Looking forward to your help.
I tried following this solution, but it is for one predefined event. In my case, the number of reminders to be sent out is dynamic, and jobs have to be scheduled on the fly.
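One way to picture the requirements listed above (jobs added on the fly, surviving a restart, and disappearing once they have run) is a small job table on disk. The following is only a sketch in Python using the standard `sqlite3` module, not a Node.js library; the class name `ReminderStore`, the table layout, and the method names are all assumptions made for illustration.

```python
import sqlite3


class ReminderStore:
    """One-shot jobs kept in a SQLite file so they survive a process restart."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id INTEGER PRIMARY KEY, run_at REAL NOT NULL, payload TEXT NOT NULL)"
        )
        self.db.commit()

    def schedule(self, run_at, payload):
        """Add a job on the fly and return its id."""
        cur = self.db.execute(
            "INSERT INTO jobs (run_at, payload) VALUES (?, ?)", (run_at, payload)
        )
        self.db.commit()
        return cur.lastrowid

    def next_due_time(self):
        """Timestamp of the earliest pending job, or None if nothing is pending."""
        return self.db.execute("SELECT MIN(run_at) FROM jobs").fetchone()[0]

    def run_due(self, now, handler):
        """Run every job whose time has come, deleting each one after it runs."""
        rows = self.db.execute(
            "SELECT id, payload FROM jobs WHERE run_at <= ? ORDER BY run_at", (now,)
        ).fetchall()
        for job_id, payload in rows:
            handler(payload)
            self.db.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
            self.db.commit()
        return len(rows)
```

A worker loop built on this could sleep until `next_due_time()` instead of polling every 15 minutes, and on startup it would simply reopen the same file and carry on with whatever rows are left.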
I have to run a utility periodically, say, every minute.
So I have two options: Spring Boot's @Scheduled or the crontab of the Linux box that we deploy the artifact to.
My question is: which should I use?
What are the pros and cons of each solution? Please also suggest any other solution.
I don't have many points for comparing these two; I can only go by a situation I am facing right now. I just built a new endpoint and am doing performance and stress testing for it in production. I have yet to decide the cron schedule times, and they may need slight tweaking after more observation. Scheduling via @Scheduled requires me to redeploy or restart the application every time I make a change.
Application restart generally takes more time than crontab edit.
Other than this, a few points considering availability and scalability:
Setting only via crontab on a single server would mean a single point of failure if the server goes down.
Setting via @Scheduled could mean the same.
If you have multiple instances of the server, the endpoint could get triggered twice, which you may not want. The worst case is when scaling up happens much later: you wrote the @Scheduled endpoint long ago, when the application was deployed on a single server, and then forgot about it. As soon as scaling up happens, the process will start getting triggered twice.
So neither of these seems to be the best in terms of availability and scalability.
In such situations, a distributed cron management system (I have heard about Rundeck) is ideally needed. It manages which of the available servers is called to hit the desired endpoint and, if needed, calls the next server in case the first one is down.
If an investigation is needed, the Rundeck logs can be checked to find the server that was actually called.
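The "call one server, fall back to the next if it is down, and log which one was used" idea can be sketched in a few lines of Python. This is not Rundeck's API or its actual algorithm; the function `trigger_once`, the server names, and the `call` callback are hypothetical and only illustrate the behaviour described above.

```python
def trigger_once(servers, call):
    """Trigger the job on the first server that responds.

    `servers` is an ordered list of server names. `call` is a function that
    performs the request and raises ConnectionError when the server is down.
    Returns (server_that_ran_the_job, log_lines).
    """
    log = []
    for server in servers:
        try:
            call(server)
        except ConnectionError as exc:
            # This server is down: record it and try the next one.
            log.append(f"{server}: unavailable ({exc})")
            continue
        log.append(f"{server}: job triggered")
        return server, log
    log.append("no server available; job not triggered")
    return None, log
```

Because exactly one server is called per run, the endpoint is not triggered twice when more instances are added, and the returned log lines show which server actually did the work.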
Where can I find a great online reference on Lotus Notes agents? I am currently having problems with simultaneous agents and with understanding agents: how they work, best practices, etc. Thanks in advance!
I am currently having problems with simultaneous agents
Based on this comment I take it you are running a scheduled agent?
The way scheduled agents work is that only one agent from a particular database can run at one time, even if you have multiple Agent Manager (AMGR) threads. Also, agents cannot run more often than every 5 minutes. The UI will let you enter a lower number, but it will change it.
Another factor to take into account is how long your agent runs. If it runs for longer than the interval you set up, you will end up backlogging the runs. Also, the server can be configured to kill agents that run over a certain time, so you need to make sure the agent finishes within that timeframe.
To bypass all this, you can execute an agent from the Domino console as follows.
tell amgr run "database.nsf" 'agentName'
This will run in its own thread outside of the scheduler. Because of this, you can create a program document to execute an agent at intervals of less than 5 minutes, and to run multiple agents within the same database.
Doing this is dangerous, however, as you have to be aware of a number of issues.
As the agent is outside the control of the scheduler, you can't kill it as you would in the scheduler.
Running multiple threads can tie up more processes. So while the scheduler will backlog everything if the agent runs longer than the schedule, doing the same with a program document will crash the server.
You need to be aware of what the agent is doing in the database so that it won't interfere with any other agents in the same database, and can cope if it is run twice in parallel.
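As a language-neutral illustration of "coping if it is run twice in parallel", the Python sketch below skips a run when another one is already in progress. It is not LotusScript and does not use any Domino API; `run_exclusive` and the module-level lock are assumptions for illustration, and an agent would need an equivalent mechanism that works across its own processes or threads.

```python
import threading

# Shared by every caller in this process; held while a run is in progress.
_run_lock = threading.Lock()


def run_exclusive(job):
    """Run `job` unless another call is already running; return True if it ran."""
    if not _run_lock.acquire(blocking=False):
        return False  # another run is in progress, so skip this one
    try:
        job()
        return True
    finally:
        _run_lock.release()
```

Skipping the overlapping run, rather than waiting for the lock, keeps late runs from piling up behind a slow one.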
For more reading material on this:
Improving Agent Manager Performance.
Agent Manager trouble shooting.
Troubleshooting Agents (Old material but still relevant)
... and related tech notes:
Title: How to run two agents concurrently in the same database using a wrapper agent
Title: How to run multiple agents in the same database using a Program document
I want to create a web crawler that takes the content of some website and saves it in blob storage. What is the right way to do that on Azure? Should I start a Worker role and use the Thread.Sleep method to make it run once a day?
I also wonder, if I use this Worker Role, how it would work if I create two instances of it. I noticed using the "Compute Emulator UI" that the command "Trace.WriteLine" works on both instances at the same time. Can someone clarify this point?
I created the same crawler using PHP and set a cron job to start the script once a day, but it took 6 hours to grab the whole content; that's why I want to use Azure.
As of January 2014, Microsoft has introduced Azure WebJobs, where you can create a project (a console application, for example) and run it as a scheduled task (one-time or recurring). This is the right way to do it.
Considering that a worker role is basically Windows Server 2008, you can run the same code you'd run on-premises.
Consider, though, that there are several reasons why a role instance might reboot: OS updates, crash, etc. In these cases, it's possible you'd lose the work being done. So... you can handle this in a few ways:
Queue. Place a message on a command queue. If it's a once-a-day task, you can just push the message on the queue when done processing the previous message. Note that you can put an invisibility timeout on the message, so it doesn't appear for a day. In the event of failure during processing, the message will re-appear on the queue and a different instance can pick it up. You can also modify the message as you go, to keep track of your status.
Scheduler. Just make sure there's only one instance running (by way of a mutex). An easy way to do this is to attempt to obtain a write-lock on a blob (there can only be one).
One thing to consider is breaking up your web crawl into separate tasks (URLs?) and placing those individually on the queue. With this, you'd be able to scale, running either multiple instances or, potentially, multiple threads in the same instance (since web crawling is likely to be a blocking operation, rather than a CPU- and bandwidth-intensive one).
A single worker role running once a day is probably the best approach. I would not use thread sleep though, since you may want to restart the instance, and then it may, depending on your programming, start earlier or later than one day. What about putting the task command as a message on the Azure Queue, dequeuing it once it has been picked up by a worker role, and then adding a new task command to the Azure Queue once the current one is finished?
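The queue behaviour both answers rely on (a message that stays hidden for a day, and a message that reappears if the worker dies before deleting it) can be modelled in memory. The sketch below is plain Python and is not the Azure Storage SDK; the class `VisibilityQueue`, its method names, and the explicit `now` arguments are assumptions chosen so the semantics are easy to follow.

```python
class VisibilityQueue:
    """In-memory model of a queue whose messages can be hidden for a while."""

    def __init__(self):
        self._messages = {}  # message id -> [visible_at, body]
        self._next_id = 0

    def put(self, body, now, delay=0):
        """Add a message that becomes visible `delay` seconds after `now`."""
        self._next_id += 1
        self._messages[self._next_id] = [now + delay, body]
        return self._next_id

    def get(self, now, lease=60):
        """Return (id, body) of a visible message and hide it for `lease` seconds."""
        for msg_id, entry in self._messages.items():
            if entry[0] <= now:
                # If the worker never deletes it, the message reappears later.
                entry[0] = now + lease
                return msg_id, entry[1]
        return None

    def delete(self, msg_id):
        """Remove a message once its work has finished successfully."""
        self._messages.pop(msg_id, None)
```

A worker would `get` the crawl message, do the work, `delete` it, and then `put` the next one with a one-day `delay`. If the instance reboots mid-crawl, the lease expires and another instance picks the same message up.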
We have a web application built using 3.5 with SQL Server as the database. It is quite big and is used by around 300 super users to manage around 5000 staff.
We are now adding SMS functionality to the application, which means users will be able to send and receive SMS messages. Every two minutes, the third party's SMS server is pinged to check whether there are any new messages. Outgoing SMS messages are also held in a queue and sent at intervals of 15 to 30 minutes.
I want this checking and sending process to run in the background of the application all the time, even if the user closes the browser window.
I need some advice on how to do this.
Will using a thread achieve this, do I need to create a Windows service for it, or are there other options?
More information:
I want to execute a task on a timer. What will happen if I close the browser window? The task won't be completed, will it?
For example, I am saving 10 records to the database at 5-minute intervals, which means that every 5 minutes, when the timer tick event fires, a record is inserted into the database.
How do I run this task if I close the browser window?
I tried looking at a Windows service, but how do I pass a generic collection of data to it for processing?
There really is no thread-or-service choice: a service can be (and usually is!) multithreaded, and a thread can start a service.
There are three basic choices:
Somehow start another thread running when a user logs in -- this is probably a very poor choice for what you want, as you cannot really keep it running once the user session is lost.
Write a fully fledged Windows service which starts on OS startup and continues running until the server is shut down. You can make it dependent on the SQL Server service, so it starts after the DB is available. This is the "best" solution but may be overkill for your purposes. Also, you need to know the services API to write it properly, as you need to respond correctly to shutdown and status requests.
You can schedule your task periodically using either the Windows scheduler or, preferably, the scheduler which is built into SQL Server. I think this would be the most suitable option for your needs.
Distinguish between what the browser is doing and what's happening server-side.
Your web app sits server-side waiting for requests from whatever browsers may be running, and services those requests. In servicing them, I guess it may well put messages on a queue and look in a database for any new messages.
You want the daemon processor, which talks to the third-party SMS, to be triggered by time rather than by browser function. Either of your suggestions would work:
A completely independent service could run and work against the queues and database.
Your web app, which I assume is already a service, could spawn a thread.
In either case we have a few technical questions of avoiding any race conditions between the browser-request processing and the daemon - but databases and queueing systems can deal with that.
So I would decide between stand-alone daemon and background thread like this:
Which is easier to implement? I'm a Java EE developer, and I know that in my app server I have an API for specifying code to be run according to a timer; the API deals with the threading issues. So for me that's very easy. I don't know what you have available. Timers are not quite as trivial as they may appear, so having a reliable API is beneficial. If this were a more complex requirement, where the daemon code was gnarly and might possibly interfere with the web app code, then I might prefer to keep it conspicuously separate.
Which is easier to deploy and administer? Deploy separate Web App and daemon, or deploy one thing. In the Java EE world we could have a single Enterprise Application with all the code, so that's a single thing to deploy, start and control.
One other thing to consider: scaling and resilience. You might choose to have more than one copy of your web app running, either to provide fail-over capabilities or just because you need the extra power. In that case, how many daemons would you have? Would it be a problem to have two daemons running? You might need some extra code to mediate between two daemons, for example by logging the time of the last work in the database, so each daemon can say "Oh, my buddy already did the 10:30 job, I'll go back to sleep."
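The "my buddy already did the 10:30 job" check can be sketched with a table whose primary key is the time slot, so that only the first daemon to insert a row for a slot wins. This is a Python illustration using the standard `sqlite3` module rather than Java EE or SQL Server code; the table `job_runs`, the helper names, and the daemon labels are hypothetical.

```python
import sqlite3


def open_run_log(path=":memory:"):
    """Open the shared log of which daemon handled which time slot."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS job_runs ("
        "slot TEXT PRIMARY KEY, daemon TEXT NOT NULL)"
    )
    db.commit()
    return db


def claim_slot(db, slot, daemon):
    """Try to record that `daemon` is doing the job for `slot`; True if it won."""
    try:
        with db:  # commits on success, rolls back if the insert fails
            db.execute(
                "INSERT INTO job_runs (slot, daemon) VALUES (?, ?)", (slot, daemon)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # another daemon already claimed this slot
```

Each daemon would call `claim_slot` before doing the work for a given time and go back to sleep when it returns False. The uniqueness constraint does the mediation, so the daemons never need to talk to each other directly.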