Is there any workaround in Hybris for manually pausing hotfolders?
Here is the story and the problem: in production we have two data import servers for our application. Both servers use the same shared hotfolder for multiple different hotfolder configurations, so they import several different incoming files day by day. The production deployment for these servers differs from that of the other (non data import) servers of the application, because we have to wait for them to finish the file import currently in progress, which means manually checking the /processing folder over and over again until no files are left. Our goal is to skip this manual step and instead "tell" Hybris to stop processing after the imports that are currently in progress.
Is there any OOTB implementation to do this?
AFAIK, no, there is not. You can probably fix your issue by creating a temporary "work in progress" file when the import starts and deleting it when the import ends. You can then automate the manual check with a method that checks whether the "work in progress" file exists, as sketched below.
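For the automated check itself, a small polling helper on the deployment side could be enough. This is only a minimal sketch; the /hotfolder paths, the marker file name and the polling interval are assumptions you would adapt to your own setup:

import os
import time

PROCESSING_DIR = "/hotfolder/processing"   # hypothetical shared hotfolder path
MARKER = "/hotfolder/import.inprogress"    # hypothetical "work in progress" file

def wait_until_imports_finish(poll_seconds=30):
    # Block until the marker is gone and nothing is left in /processing.
    while os.path.exists(MARKER) or os.listdir(PROCESSING_DIR):
        time.sleep(poll_seconds)

wait_until_imports_finish()
# ...now it should be safe to continue deploying the data import servers

The import job would create and delete the marker around each run; the deployment script then just calls the helper instead of someone watching /processing by hand.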
I have a Google Apps Script that builds a new CSV file any time someone edits one of our shared Google Sheets (via a trigger). The file gets saved to a dedicated Shared Drive folder (which is first cleaned out by trashing all existing files before the updated one is written). This part works splendidly.
I need to take that CSV and consume it in SSIS so I can datestamp it and load it into an MSSQL table for historical tracking purposes, but aside from paying for some third-party apps (e.g. CData, COZYROC), I can't find a way to do this. Eventually this package will be deployed on our SQL Agent to run on a daily schedule, so it will be attached to a service account that won't have any sort of access to the Google Shared Drive. If I can get that CSV over to one of the shared folders on our SQL server, I will be golden... but that is what I am struggling with.
If something via Apps Script isn't possible, can someone direct me on how I might programmatically get an Excel spreadsheet to open, refresh its data, then save itself and close? I can get the data I need into Excel straight out of the Google Sheet using a Power Query, but I need it to refresh itself in an unattended process on a daily schedule.
I found that CData actually has a tool called Sync which got us what we needed here. There is a limited-options free version of the tool (which they claim is "free forever") that runs as a local service. On a set schedule, it can query all sorts of sources, including Google Sheets, and will write out to various destinations.
The free version is limited in which sources and destinations you can use (though there are quite a few), and it only allows 2 connection definitions. That said, you can define multiple source files, but only 1 source type (e.g. I can define 20 different Google Sheets to use in 20 different jobs, but can only use Google Sheets as my source).
I have it set up to read my shared Google Sheet and output the CSV to our server's share. An SSIS project reads the local CSV, processes it as needed, and then writes to our SQL server. It seems to work pretty well if you don't mind having an additional service running and don't need a range of different sources and destinations.
Link to their Sync landing page: https://www.cdata.com/sync/
Use the Buy Now button and load the free version into your cart, then proceed to check out. They will email you a key and a link to download it from.
I have 3 agents in Lotus; these agents just update different CSV files on a shared drive. Based on their logs, they are running, but each only takes a second. Checking the CSV files, they are not being updated.
I've tried adjusting the schedule time
Tried other servers
Changed the target
Disabled/re-enabled the agent
Made a copy of the agent
I haven't edited the code.
My workaround is to run these agents manually. That actually updates the CSV files, and it takes at least 5 minutes for the agents to finish running, which is expected. These agents just suddenly stopped running as scheduled.
As Torsten mentioned, your Domino server does not have enough permissions. By default it runs as Local System, which does not have access to any network shares.
See this technote before it disappears: https://www.ibm.com/support/pages/domino-server-unable-locate-mapped-drives-when-started-windows-service
I have a workflow pipeline that periodically generates data files on a Linux server, and also a cleanup service that removes data files older than a week.
However, sometimes I find that a newly generated data file is missing, even though it is definitely no older than a week. I'm not sure whether this is a logic bug in the cleanup service or whether another program deleted it. Currently I don't have any idea how to investigate this issue. Is there any method to log all file deletion activity along with the process ID and process name?
Thanks in advance.
How can we use a Solution Package (WSP) in MOSS 2007 to synchronize lists from one server to another?
Have a look at this tool: Content Migration Wizard
It allows you to copy lists from farm to farm using the Migration API. You can also script it to run automatically.
Copying data / schema from one server to another is not supported and requires custom code.
Is it really necessary that the items 'exist' on both servers? It sounds error-prone to me. Maybe it's possible to simply 'aggregate' the items on one server by using a web service or an RSS feed.
If copying is required then I would create a SharePoint job that runs every x minutes/hours to do the synchronization. Let the custom job communicate with the web services on the other server.
Note: since your job only runs every x minutes, the synchronization is not realtime!
Be careful with large workloads. Make sure you don't stress your server by trying to synchronize 10,000 items every minute.
The situation is as follows:
A series of remote workstations collect field data and send it to a server via FTP. The data is sent as a CSV file, which is stored in a unique directory for each workstation on the FTP server.
Each workstation sends a new update every 10 minutes, causing the previous data to be overwritten. We would like to somehow concatenate or store this data automatically. The workstations' processing is limited and cannot be extended, as they are embedded systems.
One suggestion offered was to run a cron job on the FTP server; however, the terms of service only allow cron jobs at 30-minute intervals, as it is shared hosting. Given the number of workstations uploading and the 10-minute interval between uploads, the 30-minute limit between cron runs looks like it might be a problem.
Is there any other approach that might be suggested? The available server-side scripting languages are perl, php and python.
Upgrading to a dedicated server might be necessary, but I'd still like to get input on how to solve this problem in the most elegant manner.
Most modern Linux systems support inotify to let your process know when the contents of a directory have changed, so you don't even need to poll.
Edit: with regard to the comment below from Mark Baker:
"Be careful though, as you'll be notified as soon as the file is created, not when it's closed. So you'll need some way to make sure you don't pick up partial files."
That will happen with the inotify watch you set on the directory level - the way to make sure you then don't pick up the partial file is to set a further inotify watch on the new file and look for the IN_CLOSE event so that you know the file has been written to completely.
Once your process has seen this, you can delete the inotify watch on this new file, and process it at your leisure.
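As an illustration, pyinotify (mentioned further down) can deliver that close event directly. This is only a sketch with an assumed upload directory, and it watches the directory for IN_CLOSE_WRITE, which gives the same "file fully written" signal without needing a second per-file watch:

import pyinotify

UPLOAD_DIR = "/home/ftp/uploads"   # hypothetical upload directory

class Handler(pyinotify.ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        # Fires only when a file opened for writing has been closed,
        # so partial uploads are never picked up here.
        print("complete file:", event.pathname)
        # process/concatenate the CSV at event.pathname

wm = pyinotify.WatchManager()
wm.add_watch(UPLOAD_DIR, pyinotify.IN_CLOSE_WRITE, rec=True)  # rec=True covers per-workstation subdirectories
notifier = pyinotify.Notifier(wm, Handler())
notifier.loop()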
You might consider a persistent daemon that keeps polling the target directories:
grab_lockfile() or exit();        # e.g. flock() a lockfile; exit if another instance already holds it

while (1) {
    if (new_files()) {            # placeholder: check the upload directories for fresh files
        process_new_files();      # placeholder: concatenate/archive the new CSVs
    }
    sleep(60);
}
Then your cron job can just try to start the daemon every 30 minutes. If the daemon can't grab the lockfile, it just dies, so there's no worry about multiple daemons running.
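If you end up writing this in Python instead, the lockfile guard is just a non-blocking flock. A minimal sketch of the same pattern, where the lock path and the two helper functions are placeholders for your own logic:

import fcntl
import sys
import time

def new_files():
    return False      # placeholder: check the upload directories for fresh files

def process_new_files():
    pass              # placeholder: concatenate/archive the new CSVs

lock_file = open("/tmp/ftp_poller.lock", "w")   # hypothetical lock path
try:
    # Non-blocking exclusive lock: raises if another instance already holds it.
    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
except OSError:
    sys.exit(0)   # another copy is running; just die, as described above

while True:
    if new_files():
        process_new_files()
    time.sleep(60)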
Another approach to consider would be to submit the files via HTTP POST and then process them via a CGI. This way, you guarantee that they've been dealt with properly at the time of submission.
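A rough sketch of such a CGI in Python, assuming the workstations could be pointed at an HTTP endpoint; the form field name, destination directory and timestamped naming scheme are all made up for the example:

#!/usr/bin/env python3
import cgi
import os
import time

UPLOAD_DIR = "/home/site/data"   # hypothetical destination directory on the server

form = cgi.FieldStorage()
item = form["datafile"] if "datafile" in form else None   # "datafile" is an assumed field name

print("Content-Type: text/plain")
print("")

if item is not None and item.filename:
    # Timestamp each upload so the previous data is no longer overwritten.
    name = "%d_%s" % (int(time.time()), os.path.basename(item.filename))
    with open(os.path.join(UPLOAD_DIR, name), "wb") as out:
        out.write(item.file.read())
    print("stored as " + name)
else:
    print("no file received")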
The 30-minute limitation is pretty silly really. Starting processes on Linux is not an expensive operation, so if all you're doing is checking for new files there's no good reason not to do it more often than that. We have cron jobs that run every minute and they don't have any noticeable effect on performance. However, I realise it's not your rule, and if you're going to stick with that hosting provider you don't have a choice.
You'll need a long-running daemon of some kind. The easy way is to just poll regularly, and that's probably what I'd do. Inotify, which notifies you as soon as a file is created, is a better option.
You can use inotify from perl with Linux::Inotify, or from python with pyinotify.
Be careful though, as you'll be notified as soon as the file is created, not when it's closed. So you'll need some way to make sure you don't pick up partial files.
With polling it's less likely you'll see partial files, but it will happen eventually and will be a nasty hard-to-reproduce bug when it does happen, so better to deal with the problem now.
If you're looking to stay with your existing FTP server setup then I'd advise using something like inotify or a daemonized process to watch the upload directories. If you're OK with moving to a different FTP server, you might take a look at pyftpdlib, which is a Python FTP server library.
I've been part of the pyftpdlib dev team for a while, and one of the more common requests was for a way to "process" files once they've finished uploading. Because of that we created an on_file_received() callback method that is triggered on completion of an upload (see issue #79 on our issue tracker for details).
If you're comfortable in Python then it might work out well for you to run pyftpdlib as your FTP server and run your processing code from the callback method. Note that pyftpdlib is asynchronous and not multi-threaded, so your callback method can't be blocking. If you need to run long-running tasks I would recommend a separate Python process or thread be used for the actual processing work.
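A minimal sketch of that setup; the account, port, home directory and the process_csv() helper are placeholders, and a worker thread is just one way to keep the callback from blocking as described above:

import threading
from pyftpdlib.authorizers import DummyAuthorizer
from pyftpdlib.handlers import FTPHandler
from pyftpdlib.servers import FTPServer

def process_csv(path):
    # Placeholder for your concatenation / archiving logic.
    pass

class UploadHandler(FTPHandler):
    def on_file_received(self, file):
        # Called by pyftpdlib once an upload has completed ("file" is the absolute path).
        # Hand the real work off to a thread so the async loop is not blocked.
        threading.Thread(target=process_csv, args=(file,)).start()

authorizer = DummyAuthorizer()
authorizer.add_user("station1", "secret", "/srv/ftp/station1", perm="elradfmw")  # placeholder account
UploadHandler.authorizer = authorizer

server = FTPServer(("0.0.0.0", 2121), UploadHandler)
server.serve_forever()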