Condor on Win7: connection issue (Errno 10054)

I have installed Condor 8.2.0 on several Win7 (32/64-bit) computers according to this guide: http://www.slideshare.net/gtelzur/condor8-win-install All the services run on the same machine, so I rule out a physical network interruption.
Whenever a job is created/submitted, it stays idle. A detailed look at the log files reveals the following issue (ShadowLog):
07/07/14 08:10:47 (1.1) (PID1): **** condor_shadow (condor_SHADOW) pid PID1 EXITING WITH STATUS 107
07/07/14 08:10:47 (1.0) (PID2): condor_read() failed: recv(fd=540) returned -1, errno = 10054 , reading 5 bytes from startd slot1#mycomputer.mydomain.local.
07/07/14 08:10:47 (1.0) (PID2): IO: Failed to read packet header
07/07/14 08:10:47 (1.0) (PID2): Can no longer talk to condor_starter <192.168.25.120:56186>
I couldn't find more details about an I/O exception with ID 10054, and beyond that Google gives no useful hints when I search for "Condor IO: Failed to read packet header".
Do you have a clue what could fix the issue?

I had the same issue and it was fixed when I reinstalled Condor in C:\Condor (it was in D:\Condor).
Note that with Condor 8.2.1 I ran into an unrelated problem: I had to edit the condor_config file and remove one $ in the line CONDOR_HOST = $$(FULL_HOSTNAME), as otherwise there was a parsing error.

When you see
condor_read() failed: .... reading 5 bytes from .....
In one of the log files, that usually means that the other side of the connection hung up, so you should look in the log file for the other side of the conversation. In this case, that would be the StarterLog.slot1 on mycomputer.mydomain.local (or possibly just the StarterLog, if the problem happens very early).
Usually when a daemon hangs up, the reason for the hang up is in the log, and very often the problem is that the other side of the conversation isn't authorized. See configuration values that match ALLOW_* to see what is authorized.
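For example, a hypothetical condor_config fragment that authorizes every machine in the pool's domain; the host pattern here is illustrative, not taken from the question:

```
ALLOW_READ   = *.mydomain.local
ALLOW_WRITE  = *.mydomain.local
ALLOW_DAEMON = $(ALLOW_WRITE)
```

After editing, run condor_reconfig so the running daemons pick up the new values.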

Related

Linux Capabilities: CAP_SYS_NICE=ep results in AttachNotSupportedException on JDK11+

Currently running on a Fedora machine with JDK 11.0.15, and Java capabilities set to "cap_sys_nice+ep".
The command that I am testing is "jcmd JFR.start ....". However, I end up getting the error:
AttachNotSupportedException: Unable to open socket file tmp/attach_pid1234: target process 1234 doesn't respond within 10000 ms
Process 1234 was launched via JDK11.0.15 java with the CAP_SYS_NICE capability set.
However, when I attempt to do a Java Flight Recording, I always get the error above. jcmd does not have any capabilities set.
What really puzzles me is the fact that, given the same process run using JDK11.0.15, if I utilize jcmd from JDK8 or JDK7, everything works fine.
The only solution I have found is giving the JDK 11.0.15 jcmd the CAP_SYS_PTRACE=ep capability. Could anyone explain what exactly is going on in the background? Thanks.

Snakemake cannot write metadata

I am having trouble getting snakemake-minimal=7.8.5 to run on Windows 10. I can execute rules, but snakemake terminates due to an error regarding the metadata:
Failed to set marker file for job started ([Errno 2] No such file or directory: 'C:\\test\\project\\.snakemake\\incomplete\\cnVucy9leHBlcmltZW50XzAzL2RmX2ludGVuc2l0aWVzX3Byb3RlaW5Hcm91cHNfbG9uZ18yMDE3XzIwMThfMjAxOV8yMDIwX04wNTAxNV9NMDQ1NDcvUV9FeGFjdGl2ZV9IRl9YX09yYml0cmFwX0V4YWN0aXZlX1Nlcmllc19zbG90XyM2MDcwLzE0X2V4cGVyaW1lbnRfMDNfZGF0YS5pcHluYg=='). Snakemake will work, but cannot ensure that output files are complete in case of a kill signal or power loss. Please ensure write permissions for the directory C:\test\project\.snakemake
I tried the following to troubleshoot:
- changing the folder: Documents, the user folder, and, as above, the root of my C drive
- manipulating the security settings: Controlled Folder Access (ransomware protection), see the discussion; it is deactivated
- if I delete the .snakemake folder it is re-created upon execution, so I assume I have write access; however, some security setting seems to disallow the long filename with the hash
- I tried the same workflow on a different Windows 10 machine, and there I don't get the error, so I assume it is some Windows issue
Did anyone encounter the same error and find a solution?
I agree it is due to the length of the filename. The default maximum path length on Windows is 260 characters (MAX_PATH), and the path you pasted is 262 characters long. You can edit the registry to allow longer paths (the LongPathsEnabled value under HKLM\SYSTEM\CurrentControlSet\Control\FileSystem). Also consider opening an issue with Snakemake to improve the documentation or otherwise address this on Windows machines.
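The arithmetic can be checked directly; the directory and the base64 file name below are copied from the error message in the question (Node.js is used here only as a calculator):

```javascript
// Length of the failing path from the Snakemake error message.
var dir = "C:\\test\\project\\.snakemake\\incomplete\\";
var name =
    "cnVucy9leHBlcmltZW50XzAzL2RmX2ludGVuc2l0aWVzX3Byb3RlaW5Hcm91cHNfbG9uZ18yMDE3XzIwMThfMjAxOV8yMDIwX04wNTAxNV9NMDQ1NDcvUV9FeGFjdGl2ZV9IRl9YX09yYml0cmFwX0V4YWN0aXZlX1Nlcmllc19zbG90XyM2MDcwLzE0X2V4cGVyaW1lbnRfMDNfZGF0YS5pcHluYg==";

console.log(dir.length + name.length); // 262, just over the default 260-char MAX_PATH
```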

How to avoid fatal error or unknown error in powershell on windows server 2016

Sometimes I get an unknown error in PowerShell on Windows Server 2016. The error behaves randomly, but every time it is a "Fatal error" like this:
Unknown type: 541
#
# Fatal error in, line 0
# unreachable code
#
#
#
#FailureMessage Object: 0000007BD8BFF1F0
What is this and how do I fix it? I run Node.js on this server. A memory leak? A disk error?
P.S. On a Linux server I don't get any errors, but I need to run my code on Windows Server 2016. Sorry if my question is stupid or a duplicate.
Have you set something like the following in your code?
$ErrorActionPreference = "Stop"
This will stop your process on any error. In any case, the best practice is to use a try/catch block to handle errors and, at the very least, display where the error occurred and what the message was. Do you use try/catch blocks in your script?
In your case I suspect a bad memory access or a conflict with something else, but without more details it is difficult to say.
I found the problem spot: the CPU.

Octave cannot plot: cygoctave-1.dll loaded to different address

I am trying to use Octave on Windows 7, 64-bit. I installed cygwin64 with octave, gnuplot and x11. However, when I started the X server, opened Octave, and tried to plot, it came up with this:
octave:1> plot(1:10)
0 [main] octave-3.6.4 5560 child_info_fork::abort: C:\cygwin64\bin\cygoctave-1.dll: Loaded to different address: parent(0xF30000) != child(0xE90000)
error: popen2: process creation failed -- Resource temporarily unavailable
error: called from:
error: /usr/share/octave/3.6.4/m/plot/private/__gnuplot_open_stream__.m at line 30, column 44
error: /usr/share/octave/3.6.4/m/plot/__gnuplot_drawnow__.m at line 72, column 19
Would anyone please help a little bit here?
Thank you!
-Shawn
Got it solved. I got the answer from the cygwin mailing list, as follows:
The problem is, the hash algorithm used by ld to compute a default DLL
load address is not exactly bullet proof, not even with such a big
address space we have now available for DLLs. It still requires to
run rebase to be on the safe side.
However, I just found a problem in the 64 distro which results in not
running autorebase as part of an update. This should be fixed soon.
For the time being, stop all Cygwin processes, start a naked dash and
run /usr/bin/rebaseall.
All credit goes to Corinna.

nodejs - jade ReferenceError: process is not defined

The project was generated by WebStorm's Express template, and the npm dependencies have been installed.
The resulting page is OK when I run the application, but the console always says:
'ReferenceError: process is not defined'
Why does this happen? I am on Win7, 64-bit.
I finally found where the issue originates. It's not Jade or Express; it's Uglify-JS, which is a dependency of transformers, which is a dependency of Jade, which is often a dependency of Express.
I have experienced this issue in the WebStorm IDE by JetBrains on Windows 7 and Windows 8 (both 64-bit).
I already went ahead and fixed the issue in this pull request.
All I needed to do was include process in the node vm context object. After doing that I received a new error:
[ReferenceError: Buffer is not defined]
All I had to do was include Buffer in the vm context object as well and I no longer get those silly messages.
I still don't fully understand why it only happens during debugging but in my extremely limited experience I've come to find that the node vm module is a fickle thing, or at least the way some people use it is.
Edit: It's a bug in the node vm module itself. I figured out how to reproduce it and I figured out why it only happens during debugging.
This bug only happens if you include the third (file) argument to vm.runInContext(code, context, file);. All the documentation says about this argument is that it is optional and is only used in stack traces. Right off the bat you can now see why it only happens during debugging. However, when you pass in this argument some funny behavior begins to occur.
To reproduce the error (note that the file argument must be passed in or this error never occurs at all):
1. The file argument must end with ".js" and must contain at least one forward slash or double backslash. Since this argument is expected to be a file path, it makes sense that the presence of these might trigger some other functionality.
2. The code you pass in (first argument) must not begin with a function. If it begins with a function, the error does not occur. So far it seems that beginning the code with anything but a function will generate the reference error. Don't ask me why this argument has any effect on whether or not the error shows up, because I have no idea.
You can fix the error by including process in the context object that you pass to vm.createContext(contextObject);.
var vm = require("vm");

var context = vm.createContext({
    console: console,
    process: process
});
If your file path argument is well-formed (matches the requirements in #1) then including process in the context will get rid of the error message; that is, unless your file path does not point to an actual file in which case you see the following:
{ [Error: ENOENT, no such file or directory 'c:\suatils.js']
errno: 34,
code: 'ENOENT',
path: 'c:\\test.js',
syscall: 'open' }
Pointing it to an actual file will get rid of this error.
I'm going to fork the node repository and see if I can improve this function and the way it behaves, then maybe I'll submit a pull request. At the very least I will open a ticket for the node team.
Edit 2: I've determined that it is an issue with WebStorm specifically. When WebStorm starts the node process we get this issue. If you debug from the command line there is no problem.
Video: http://youtu.be/WkL9a-TVHNY?hd=1
Try this: set the WebStorm debugger to break on all unhandled exceptions, then run the app in debug mode. I think you will find the ReferenceError is being thrown from the referenced fs.js.
More specifically, fs.js line 684:
fs.statSync = function(path) {
  nullCheck(path);
  return binding.stat(pathModule._makeLong(path)); // <- the highlighted failing line
};
These were my findings using the same dev environment as you (Win 64, WebStorm, Node, etc.).
From there you can use WebStorm's Evaluate Expression to re-run that line of code and see exactly why it is failing.
I had the same error coming up, and it was because I had the following at the top of my file:
const argv = require("minimist")(process.argv.slice(2));
const process = require("child_process");
The const process declaration on the second line shadows the global process for the whole file, so the first line reads it before it has been initialized. Changing the second line to use a different variable name resolved the issue.
