file command generating 'invalid argument' - linux

I have a Perl script that traverses a set of directories. When it hits one particular file it blows up with Invalid argument, and I want to be able to skip it programmatically. I thought I could start by finding out the file type with the file command, but that fails too:
$ file /sys/devices/virtual/net/br-ex/speed
/sys/devices/virtual/net/br-ex/speed: ERROR: cannot read `/sys/devices/virtual/net/br-ex/speed' (Invalid argument)
If I print out the mode of the file with the perl or python stat function it tells me 33060 but I'm not sure what all the bits mean and I'm hoping a particular one would tell me not to try to look inside. Any suggestions?

To understand the mode number you got from stat, convert it to octal (in python, oct(...)).
33060 in decimal is 100444 in octal. The leading digits (100) encode the file type, here a regular file; the last three digits (444) are the permissions. The first of those is for the file owner, the second for the group and the third for everyone else.
Each of these digits (in your case all are 4) is 3 binary bits in this order: read-write-execute.
Since owner, group and other all have 4, each translates to 100 in binary, which means only the read bit is on: all three can read the file and nothing else.
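The decoding described above can be checked with Python's standard stat module. This is a small sketch that only uses the mode value 33060 quoted in the question; nothing here touches the actual file.

```python
import stat

mode = 33060  # st_mode value reported in the question

print(oct(mode))                 # 0o100444
print(stat.S_ISREG(mode))        # True: the type bits say "regular file"
print(stat.S_ISDIR(mode))        # False: not a directory
print(oct(stat.S_IMODE(mode)))   # 0o444: permission bits only
print(stat.filemode(mode))       # -r--r--r--
```

stat.filemode gives the same string that ls -l would show, which is often the quickest way to read a raw mode.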
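As far as file permissions go, you should have been able to read /sys/devices/virtual/net/br-ex/speed.
There are two usual reasons for a read to fail despite the permission bits:
- Either the path is a directory (directories require execute permission to look inside), although the leading 100 in the octal mode marks this one as a regular file.
- Or it's a special file, which can be tested using the -f flag in perl or bash, or using os.path.isfile(...) in python. Files under /sys are backed by kernel code rather than stored data, so a read can fail with an error such as Invalid argument even when the mode looks ordinary.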
Anyhow, you can use the following links to filter files & directories according to their permissions in the 3 languages you mentioned:
ways to test permissions in perl.
ways to test permissions in python.
ways to test permissions in bash.
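Since the mode bits alone do not predict whether a read will succeed, one pragmatic way to skip such files is simply to attempt the read and catch the failure. The sketch below is one possible approach, not the only one; the helper name can_read and the temporary files are made up for the demonstration.

```python
import os
import tempfile

def can_read(path):
    """Return True if a byte can be read from path, False if the OS refuses."""
    try:
        with open(path, "rb") as handle:
            handle.read(1)
        return True
    except OSError:
        # Covers EINVAL ("Invalid argument") as well as EACCES, EISDIR, ENOENT...
        return False

# Demonstration on throwaway paths: a readable file, a directory, a missing file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w") as f:
        f.write("ok\n")
    checks = [can_read(sample), can_read(tmp), can_read(os.path.join(tmp, "missing"))]

print(checks)
```

A directory walker could call can_read(path) before processing each entry and move on when it returns False.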

Not related to this particular case, but I hit the same error when I ran file on a malicious ELF (Linux executable) file. In that case it was because the program headers of the ELF were intentionally corrupted. The source code of the file command makes this clear: it checks the ELF headers and bails out with the same error if they cannot be read:
/*
 * Loop through all the program headers.
 */
for ( ; num; num--) {
    if (pread(fd, xph_addr, xph_sizeof, off) <
        CAST(ssize_t, xph_sizeof)) {
        file_badread(ms);
        return -1;
    }
TL;DR: The file command checks not only the magic bytes; it also performs further checks to validate a file type.


How to check the available free space in Haxe?

I don't find a way to check the free space available in a device using Haxe, Openfl, Lime or another library.
I would like to avoid downloading data that will exceed the size recommended for an app on each device.
What do you do to check that?
Try creating a file of that size! Then either delete it or reopen and write (not append) over its contents.
I don't know whether all platforms Haxe supports will work fine with this trick, but this algorithm is reported to work in many places and languages (I personally tested it in Ruby and saw the same suggestion for C++/.NET). To check whether X bytes of disk space are available:
open a new file for writing
seek X-1 bytes from the beginning
write a byte of data (whatever you want, 0, 42...)
close the file (probably unrelated to the task at hand, but don't forget to do that anyway)
If there's insufficient disk space, you'll likely get an exception at some point in this algorithm. You'll have to find out what errors to expect and process them properly.
Using ihx I've found this is working and requires nothing but Haxe Standard Library:
haxe interactive shell v0.3.4
type "help" for help
>> import sys.io.*;
>> var f = File.write('loca', true)
sys.io.FileOutput : { __f => #abstract }
>> f.seek(39999, FileSeek.SeekBegin)
Void : null
>> f.writeByte(0)
Void : null
>> f.close()
Void : null
After these manipulations, I had a file named loca of exactly 40000 bytes in my working directory.
By the way, be careful when doing things like these in ihx since it re-runs the entire session with the last entered line appended each time.
Ongoing experimentation:
However, when there's insufficient disk space, it may not fail with errors. In this case you'll have to check the real size with sys.FileSystem.stat(path).size. And don't forget to delete the file if there's not enough space.
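For comparison, here is the same probe written as a small Python sketch, including the size check and the cleanup mentioned above. The function name probe_space and the file name are placeholders. One caveat that is an assumption about the target platform: on filesystems that support sparse files, seeking and writing one byte may not allocate the skipped blocks, so a successful probe is a hint rather than a guarantee.

```python
import os
import tempfile

def probe_space(path, nbytes):
    """Try to create a file of nbytes at path; report whether it reached that size."""
    ok = False
    try:
        with open(path, "wb") as handle:
            handle.seek(nbytes - 1)   # seek X-1 bytes from the beginning
            handle.write(b"\0")       # write a single byte
        ok = os.stat(path).st_size == nbytes
    except OSError:
        ok = False                    # e.g. disk full or quota exceeded
    finally:
        if os.path.exists(path):
            os.remove(path)           # always delete the probe file
    return ok

with tempfile.TemporaryDirectory() as tmp:
    probe_path = os.path.join(tmp, "loca")
    result = probe_space(probe_path, 40000)
    leftover = os.path.exists(probe_path)

print(result, leftover)
```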

What does the function ttyn(3) return?

The man page is here: http://man.cat-v.org/unix-6th/3/ttyn
This example:
if (ttyn(0) == 'x') {
...
}
The man page says "x is returned if the indicated file does not correspond to a typewriter."
The indicated file would be argument 0, so the standard input, right?
And what is a typewriter? My keyboard?
What are you checking with this line?
if (ttyn(0) == 'x')
At that point in time, a typewriter (or teletype, or tty) was an RS-232 terminal connected to the computer via a serial port. The device entries in /dev corresponding to these ports were named /dev/tty0, /dev/tty1, /dev/ttya, etc. Each of those files was a character special file, as opposed to an ordinary file.
When a terminal was detected by the system, typically by being turned on or connected through a modem, the init process opened the device on file descriptors 0, 1, and 2 in a new process, and those file descriptors persisted through the login process, a user's shell, and any processes forked from the shell.
As you said in your question, file descriptor 0 is also called standard input.
The ttyn function calls fstat on its argument, which returns some info about the file such as its inode number, permissions, etc. ttyn then reads through /dev, looking at each file that starts with "tty", to see which one has the same inode number as ttyn's argument. When it finds a match, it returns the 4th character of the filename, which would be '0', '1', 'a', etc. If no matches are found, it returns 'x'.
There were generally a console and a few 8-port serial interfaces on a PDP-11, so there was no ttyx. You could also name devices in /dev anything you wanted, so it was easy to avoid /dev/ttyx being an actual device.
Commands like goto could use ttyn(0) != 'x' to determine whether the user was actually typing the command on a terminal.
Here is the default config file, /etc/ttys, used by init in V6. The console was tty8.
In V7 Unix, the functionality of ttyn was replaced by ttyname, which could accommodate longer device names, and isatty, which returned true if the file descriptor was a terminal device. The goto command was not present in V7.
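The lookup described above can be imitated in a few lines of Python. This is only an illustration of the algorithm as described in this answer, not the V6 source: the dev_dir parameter is an addition for the sketch, and the device number is compared along with the inode number. On a modern system, where terminals usually live under /dev/pts, it will mostly return 'x'; os.isatty and os.ttyname are the present-day equivalents of the V7 calls.

```python
import os
import tempfile

def ttyn(fd, dev_dir="/dev"):
    """Return the 4th character of the matching /dev/tty* name, or 'x' if none."""
    try:
        target = os.fstat(fd)
        names = sorted(os.listdir(dev_dir))
    except OSError:
        return "x"
    for name in names:
        if not name.startswith("tty") or len(name) < 4:
            continue
        try:
            entry = os.stat(os.path.join(dev_dir, name))
        except OSError:
            continue
        # Same inode on the same filesystem means the same file.
        if (entry.st_dev, entry.st_ino) == (target.st_dev, target.st_ino):
            return name[3]
    return "x"

# A regular file is never a terminal, so the lookup falls through to 'x'.
with tempfile.TemporaryFile() as tmp:
    code = ttyn(tmp.fileno())

print(code)
```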
I've never seen this library call before; I'm used to the more familiar ttyname. The man page doesn't state the return value explicitly, but based on what the text says, it would be the last character of the string returned by ttyname(3). So if stdin (fd 0) were connected to "/dev/tty2", the return value would be the char '2'. In C, you would be able to check it with:
if (ttyn(0) == '2') { ... }
Granted, the documentation is not clear. It also uses bad terminology; instead of "typewriter", it should be using "teletype" or "terminal", which are the accepted terms. Remember that stdin can be different from stdout; it is perfectly possible to run cat </dev/tty1 >/dev/tty2, assuming you have the permissions for it.

learnyounode 'My First I/O' example

This program puzzles me. The goal of this program is to count the number of newlines in a file and output it in command prompt. Learnyounode then runs their own check on the file and sees if their answer matches your answer.
So I start with the answer :
var fs = require('fs');
var filename = process.argv[2];
file = fs.readFileSync(filename);
contents = file.toString();
console.log(contents.split('\n').length - 1);
learnyounode verifies that this program correctly counts the number of new lines. But when I change the program to any of the following, it doesn't print out the same number as learnyounode prints out.
file = fs.readFileSync('C:/Nick/test.txt');
file = fs.readFileSync('test.txt');
Shouldn't nodejs readFileSync be able to input an address and read it correctly?
Lastly, this program is supposed to print out the number of newlines in a file. Why do both the correct program and learnyounode print out the same number, which differs from the number of newlines in my file, every time I run this program?
For example, the number of newlines in test.txt is 3. But running this program prints out a different number every time, like 45, 15, 2, etc. Yet at the same time, it is verified as a correct program by learnyounode because both their answers match! What is going on?
EDIT:
test.txt looks like this
ok
testing
123
So, I tried your program on my local machine and your program works fine. I am not an expert on learnyounode. I just tried it after your question but I think I understand how it works. As such, here are the answers to your questions:
Shouldn't nodejs readFileSync be able to input an address and read it correctly?
This method from nodejs is working fine. You can try printing the contents of the file and you'll see that there are no problems.
Why do both the correct program and learnyounode print out the same number that is different from the number of newlines every time I run this program?
learnyounode is running your program with a different filename as input each time. It verifies the output of your program by running its own copy of correct code against the same file.
But when I change the program to any of the following, it doesn't print out the same number as learnyounode prints out.
That is because at this point, your code is processing a fixed file whereas learnyounode is still processing different files on each iteration.
This tripped me up too. If you read the learnyounode instructions closely they explicitly say...
"The full path to the file to read will be provided as the first command-line argument."
This means they are providing the path to their own file.
When you use process.argv[2], this is passing in the 3rd array item (the learnyounode test txt file) into your script. If you run a console.log(process.argv); you'll see the full array object looks something like this:
[ '/usr/local/bin/node',
'/Users/user/pathstuff/learnyounode/firstio.js',
'/var/folders/41/p2jvc80j26l7nty0sk0zs1z40000gn/T/_learnyounode_1613.txt' ]
The validation numbers begin to mismatch when you substitute your own text file for theirs because your file always has 3 lines, whereas their tests keep passing in files of different lengths via process.argv.
Hope that helps.
When you use process.argv[2] under learnyounode, the argument is provided by learnyounode automatically, which is why verification prints a different number of lines (45, 15, 2, etc.) on each run.
If you remember the second challenge "BABYSTEPS" carefully this was given:
learnyounode will be supplying arguments to your program when you run
learnyounode verify program.js so you don't need to supply them yourself.
That's why the line count differs each time program.js is verified.
There are two different ways to run it.
If you run the program directly with node:
node program_name.js
then you need to add the path to a text file:
node program_name.js text_file.txt
In this case make sure that both files are in the same directory.
Or you can run it with the command:
learnyounode run program_name.js
and then a default text file will be provided by learnyounode. You can view the content of this text file by using
console.log(buffer)
Problem statement says
The full path to the file to read will be provided as the first
command-line argument.
So you've to pass the path/to/file as an argument.
Remember process.argv
To run the program against your own file, execute the .js file with
node program_name.js /path/to/text_file_name
rather than
learnyounode run program_name.js /path/to/text_file_name
With the first form, Node.js runs your program with the file you specify on the command line.
I hope this answer helps. :)

Issue with filepath name, possible corrupt characters

Perl and html, CGI on Linux.
Issue with file path name, being passed in a form field, to a CGI on server.
The issue is with the Linux file path, not the PC side.
I am using 2 programs,
1) A program written years ago: dynamic HTML generated in a Perl program and presented to the user as a form. I modified it by inserting the code needed to allow the user to select a file from their PC, to be placed on the Linux machine.
Because this program already knew the filepath, needed on the linux side, I pass this filepath in a hidden form field, to program 2.
2) CGI program on Linux side, to run when form on (1) is posted.
Strange issue.
The filepath that I pass, has a very strange issue.
I can extract it using
my $filepath = $query->param("serverfpath");
The above does populate $filepath with what looks like exactly the correct path.
But it fails, and not in a way that takes me to the file open error block, but such that the call to the CGI script gives an error.
However, if I populate $filepath with EXACTLY the same string, via hard coding it, it works, and my file successfully uploads.
For example:
$fpath1 = $query->param("serverfpath");
$fpath2 = "/opt/webhost/ims/DOCURVC/data";
A comparison of $fpath1 and $fpath2 reveals that they are exactly equal.
A length check of $fpath1 and $fpath2 reveals that they are exactly the same length.
I have tried many methods of cleaning the data in $fpath1.
I chomp it.
I remove any non standard characters.
$fpath1 =~ s/[^A-Za-z0-9\-\.\/]//g;
and this:
my $safe_filepath_characters = "a-zA-Z0-9_.-/";
$fpath1 =~ s/[^$safe_filepath_characters]//g;
But no matter what I do, using $fpath1 causes an error, using $fpath2 works.
What could be wrong with the data in the $fpath1, that would cause it to successfully compare to $fpath2, yet not be equal, visually look exactly equal, show as having the exact same length, but not work the same?
For the below file open block.
$upload_dir = $fpath1
causes complete failure of the CGI to load, as if it cannot find the CGI (which I know is sometimes caused by a syntax error in the CGI script).
$upload_dir = $fpath2
I get a successful file upload.
$upload_dir = ""
The call to the CGI does not fail; it executes the else block of the if below, as expected.
here is the file open block:
if (open ( UPLOADFILE, ">$upload_dir/$filename" ))
{
binmode UPLOADFILE;
while ( <$upload_filehandle> )
{
print UPLOADFILE;
}
close UPLOADFILE;
$msgstr="Done with Upload: upload_dir=$upload_dir filename=$filename";
}
else
{
$msgstr="ERROR opening for upload: upload_dir=$upload_dir filename=$filename";
}
What other tests should I be performing on $fpath1, to find out why it does not work the same as its hard-coded equivalent $fpath2
I did try character replacement, a single character at a time, from $fpath2 to $fpath1.
Even doing this with a single character, caused $fpath1 to have the same error as $fpath2, although the character looked exactly the same.
Is your CGI perhaps running perl with the -T (taint mode) switch (e.g., #!/usr/bin/perl -T)? If so, any value coming from untrusted sources (such as user input, URIs, and form fields) is not allowed to be used in system operations, such as open, until it has been untainted by using a regex capture. Note that using s/// to modify it in-place will not untaint the value.
if ($fpath1 =~ /^([A-Za-z0-9\-\.\/]*)$/) {
    $fpath1 = $1;    # the captured value is untainted
} else {
    die "Illegal character in fpath1";    # $1 is not trustworthy after a failed match
}
should work if taint mode is your issue.
But it fails, and not in a way that takes me to the file open error block, but such that the call to the CGI script gives an error.
Premature end of script headers? Try running the CGI from the command line:
perl your_upload_script.cgi serverfpath=/opt/webhost/ims/DOCURVC/data

How to get the name of a file acting as stdin/stdout?

I'm having the following problem. I want to write a program in Fortran90 which I want to be able to call like this:
./program.x < main.in > main.out
In addition to "main.out" (whose name I can set when calling the program), secondary outputs have to be written, and I want them to have a name similar to either "main.in" or "main.out" (they are not actually called "main"). However, when I use:
INQUIRE(UNIT=5,NAME=sInputName)
The content of sInputName becomes "Stdin" instead of the name of the file. Is there some way to obtain the names of the files that are linked to stdin/stdout when the program is called?
Unfortunately the point of I/O redirection is that your program doesn't have to know what the input/output files are. On Unix-based systems you cannot look at the command-line arguments, as the < main.in > main.out parts are processed by the shell, which uses these files to set up standard input and output before your program is invoked.
You have to remember that sometimes the standard input and output will not even be files, as they could be a terminal or a pipe. e.g.
./generate_input | ./program.x | less
So one solution is to redesign your program so that the output file is an explicit argument.
./program.x --out=main.out
That way your program knows the filename. The cost is that your program is now responsible for opening (and maybe creating) the file.
That said, on Linux systems you can actually find out where your standard file handles are pointing from the special /proc filesystem. There will be symbolic links in place for each file descriptor:
/proc/<process_id>/fd/0 -> standard_input
/proc/<process_id>/fd/1 -> standard_output
/proc/<process_id>/fd/2 -> standard_error
Sorry, I don't know Fortran, but a pseudocode way of checking the output file could be:
out_name = readLink( "/proc/"+getpid()+"/fd/1" )
if( isNormalFile( out_name ) )
...
Keep in mind what I said earlier: there is no guarantee this will actually be a normal file. It could be a terminal device, a pipe, a network socket, whatever... Also, I do not know what operating systems this works on other than Red Hat/CentOS Linux, so it may not be that portable. It is more of a diagnostic tool.
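As a concrete version of the pseudocode above, here is a short Python sketch (Python rather than Fortran, purely for illustration). It assumes a Linux-style /proc filesystem and uses /proc/self as a shortcut for the current process; the helper names fd_target and is_regular_file are made up for this example.

```python
import os
import stat
import tempfile

def fd_target(fd):
    """Return the path file descriptor fd points at, or None if /proc is unavailable."""
    try:
        return os.readlink(f"/proc/self/fd/{fd}")
    except OSError:
        return None

def is_regular_file(fd):
    """True if fd refers to an ordinary file rather than a terminal, pipe or socket."""
    return stat.S_ISREG(os.fstat(fd).st_mode)

# Demonstration with a temporary file standing in for a redirected stream.
with tempfile.NamedTemporaryFile() as tmp:
    target = fd_target(tmp.fileno())
    regular = is_regular_file(tmp.fileno())
    expected = os.path.realpath(tmp.name)

print(target, regular)
```

In a real program the same two calls would be made with fd 0 or fd 1, and the name would only be used when is_regular_file returns True.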
Maybe the intrinsic subroutines get_command and/or get_command_argument can be of help. They were introduced in Fortran 2003 and return, respectively, the full command line which was used to invoke the program and the specified argument.
