CruiseControl.Net Nant copy/move failure error - cruisecontrol.net

I've recently taken over responsibility of a Cruise Control continuous integration server, although I know very little about either Cruise Control or Nant. But, such is life.
One of the regular build jobs this is supposed to do is to execute a Nant script that backs up files and data from one of live servers to a backup server. I've discovered that this has been failing pretty much as far back as user interface will let me see.
The error message is always the same:
Build Error: NAnt.Core.BuildException
Cannot copy '[filename].bak' to '[server]'.
But it doesn't always fail at exactly the same spot.
The Nant script that's executing is pretty much several iterations of this copy code:
<copy todir="${backup.root}\{dirname}">
<fileset basedir="s:">
<include name="**/*" />
</fileset>
</copy>
Although some of the commands are 'move' rather than 'copy'.
The fact this happens at different points in the scripts suggests to me that this is either down to a timeout, or to the script being unable to access files that in use by the system when the script is running. But I've never been able to get a successful execution through, no matter what time of day I set it to run.
The error messages are not particularly helpful in identifying what the problem actually is. And googling them is not particularly enlightening. I'm not expecting a solution to this (though one would be nice) - but it'd be enormously helpful if I could just get some pointers on where to look next in terms of identifying the problem.

As this is a backup you could set the copy task to proceed on failure until you identify the problem. Obviously you don't want to leave it that way permanently.
See http://nant.sourceforge.net/release/0.85/help/tasks/copy.html
I would add the verbose='true' and failonerror='false' attributes to the copy task and see if that helps.
Setting overwrite based on your scenario may also help.

First of all extract the Nant section to a sandbox Nant file and run Nant on your own from command line so you don't have to wait to test until the scheduled backup time each day. Basically setup a testbed for ease of testing.
Then run it and see where it fails. Then shorten the nant task till it works no matter how little work it's doing. Now you should know what first causes it to fail. Focus on that.
I've used Nant a fair amount and:
Cannot copy '[filename].bak' to '[server]'
makes me think some properties aren't being resolved. Why would someone name a file [filename].bak? 'filename' looks like the name of a property that doesn't exist in Nant. Just going by gut feel here.

Related

When running a shell script with loops, operation just stops

NOTICE: Feedback on how the question can be improved would be great as I am still learning, I understand there is no code because I am confident it does not need fixing. I have researched online a great deal and cannot seem to find the answer to my question. My script works as it should when I change the parameters to produce less outputs so I know it works just fine. I have debugged the script and got no errors. When my parameters are changed to produce more outputs and the script runs for hours then it stops. My goal for the question below is to determine if linux will timeout a process running over time (or something related) and, if, how it can be resolved.
I am running a shell script that has several for loops which does the following:
- Goes through existing files and copies data into a newly saved/named file
- Makes changes to the data in each file
- Submits these files (which number in the thousands) to another system
The script is very basic (beginner here) but so long as I don't give it too much to generate, it works as it should. However if I want it to loop through all possible cases which means I will generates 10's of thousands of files, then after a certain amount of time the shell script just stops running.
I have more than enough hard drive storage to support all the files being created. One thing to note however is that during the part where files are being submitted, if the machine they are submitted to is full at that moment in time, the shell script I'm running will have to pause where it is and wait for the other machine to clear. This process works for a certain amount of time but eventually the shell script stops running and won't continue.
Is there a way to make it continue or prevent it from stopping? I typed control + Z to suspend the script and then fg to resume but it still does nothing. I check the status by typing ls -la to see if the file size is increasing and it is not although top/ps says the script is still running.
Assuming that you are using 'Bash' for your script - most likely, you are running out of 'system resources' for your shell session. Also most likely, the manner in which your script works is causing the issue. Without seeing your script it will be difficult to provide additional guidance, however, you can check several items at the 'system level' that may assist you, i.e.
review system logs for errors about your process or about 'system resources'
check your docs: man ulimit (or 'man bash' and search for 'ulimit')
consider removing 'deep nesting' (if present); instead, create work sets where step one builds the 'data' needed for the next step, i.e. if possible, instead of:
step 1 (all files) ## guessing this is what you are doing
step 2 (all files)
step 3 (all files
Try each step for each file - Something like:
for MY_FILE in ${FILE_LIST}
do
step_1
step_2
step_3
done
:)
Dale

Shake: Signal whether anything had to be rebuilt at all

I use shake to build a bunch of static webpages, which I then have to upload to a remote host, using sftp. Currently, the cronjob runs
git pull # get possibly updated sources
./my-shake-system
lftp ... # upload
I’d like to avoid running the final command if shake did not actually rebuild anything. Is there a way to tell shake “Run command foo, after everything else, and only if you changed something!”?
Or alternatively, have shake report whether it did something in the process exit code?
I guess I can add a rule that depends on all possibly generated file, but that seems to be redundant and error prone.
Currently there is no direct/simple way to determine if anything built. It's also not such a useful concept as for simpler build systems, as certain rules (especially those that define storedValue to return Nothing) will always "rerun", but then very quickly decide they don't need to run the rules that depend on them. To Shake, that is the same as rerunning. I can think of a few approaches, which one is best probably depends on your situation:
Tag the interesting rules
You could tag each interesting rule (one that produces something that needs uploading) with a function that writes to a specific file. If that specific file exists, then you need to upload. This might work slightly better, as if you do multiple Shake runs, and in the first something changes but the second nothing does, the file will still be present. If it makes sense, use an IORef instead of a file.
Use profiling
Shake has quite advanced profiling. If you pass shakeProfile=["output.json"] it will produce a JSON file detailing what built and when. Runs are indexed by an Int, with 0 for the most recent run, and any runs that built nothing are excluded. If you have one rule that always fires (e.g. write to a dummy file with alwaysRerun) then if anything fired at the same time, it rebuilt.
Watch the .shake.database file size
Shake has a database, stored under shakeFiles. Each uninteresting run it will grow by a fairly small amount (~100 bytes) - but a fixed size given your system. If it changes in size by a greater amount, then it did something interesting.
Of these approaches, tagging the interesting rules is probably the simplest and most direct (although does run the risk of you forgetting to tag something).

How to find the time when a Puppet manifest is executed

I'm wondering if anyone knows a good way to get the date and time when a portion of code in a Puppet manifest is actually executed. Sometimes my manifests take a long time to run, and I need to schedule a task to occur soon after the end of the run, no matter when that occurs.
I have tried the time() function, setting a variable using generate() (using the date function on the Puppet master), and even creating a custom fact, but everything I've tried gets evaluated when the manifests are parsed on the server, rather than when they actually execute on the client.
Any ideas? The clients are all Windows, FWIW.
Thanks in advance!
I am not sure I understand what you mean, but you can't get this information during catalog compilation (obviously), so you can't use it to change the way the catalog will be applied.
If you need to trigger another process on the same host, then you should use any IPC mechanism you have available. You can exec anything, and have it happen just after any other resources is applied, so it is just a matter of finding the proper command.

Cucumber: Each feature passes individually, but not together

I am writing a Rails 3.1 app, and I have a set of three cucumber feature files. When run individually, as with:
cucumber features/quota.feature
-- or --
cucumber features/quota.feature:67 # specifying the specific individual test
...each feature file runs fine. However, when all run together, as with:
cucumber
...one of the tests fails. It's odd because only one test fails; all the other tests in the feature pass (and many of them do similar things). It doesn't seem to matter where in the feature file I place this test; it fails if it's the first test or way down there somewhere.
I don't think it can be the test itself, because it passes when run individually or even when the whole feature file is run individually. It seems like it must be some effect related to running the different feature files together. Any ideas what might be going on?
It looks like there is a coupling between your scenarios. Your failing scenario assumes that system is in some state. When scenarios run individually system is in this state and so scenario passes. But when you run all scenarios, scenarios that ran previously change this state and so it fails.
You should solve it by making your scenarios completely independent. Work of any scenario shouldn't influence results of other scenarios. It's highly encouraged in Cucumber Book and Specification by Example.
I had a similar problem and it took me a long time to figure out the root cause.
I was using #selenium tags to test JQuery scripts on a selenium client.
My page had an ajax call that was sending a POST request. I had a bug in the javascript and the post request was failing. (The feature wasn't complete and I hadn't yet written steps to verify the result of the ajax call.)
This error was recorded in Capybara.current_session.server.error.
When the following non-selenium feature was executed a Before hook within Capybara called Capybara.reset_sessions!
This then called
def reset!
driver.reset! if #touched
#touched = false
raise #server.error if #server and #server.error
ensure
#server.reset_error! if #server
end
#server.error was not nil for each scenario in the following feature(s) and Cucumber reported each step as skipped.
The solution in my case was to fix the ajax call.
So Andrey Botalov and Doug Noel were right. I had carry over from an earlier feature.
I had to keep debugging until I found the exception that was being raised and investigate what was generating it.
I hope this helps someone else that didn't realise they had carry over from an earlier feature.

How can I setup a system to tell me if a cron job is NOT running fine?

This is more of an "general architecture" problem. If you have a cron job (or even a Windows scheduled task) running periodically, its somewhat simple to have it send you an email / text message that all is well, but how do I get informed when everything is NOT okay? Basically, if the job doesn't run at its scheduled time or Windows / linux has its own set of hangups that prevent the task from running...?
Just seeking thoughts of people who've faced this situation before and come up with interesting solutions...
A way I've done it in the past is to simply put at the top of each script (say, checkUsers.sh):
touch /tmp/lastrun/checkUsers.sh
then have another job that runs periodically that uses find to locate all those "marker" files in tmp/lastrun that are older than a day.
You can fiddle with the timings, having /tmp/lastrun/hour/ and tmp/lastrun/day/ to separate jobs that have different schedules.
Note that this won't catch scripts that have never run since they will never create the initial file for find-ing. To alleviate that, you can either:
create that file manually when creating the cron job (won't handle situations where someone inadvertently deletes the marker file); or
maintain a list of required marker files somewhere so that you can detect when they're missing as well as outdated.
And, if your cron job is not a script, put the touch directly into crontab:
0 4 * * * ( touch /tmp/lastrun/daily/checkUsers ; /usr/bin/checkUsers )
It's a lot easier to validate a simple find script than to validate every one of your cron jobs.

Resources