How can a shell script determine if a file is binary or text?

How can a shell script determine if a file is binary or text? - linux

Have a situation where I need a shell or bash script to determine if a file is binary or not. The issue here is that the linux environment does not have file available and the grep version is from busybox which doesn't support -I. I found a perl method (the perl version is old supports -e but not -E) of it working but it's slow. Does anyone have a faster method of determining if a file is binary? TIA!!
#!/bin/sh
is_text_file() {
# grep -qI '.' "$1" ### busy box grep doesn't support
perl -e 'exit((-B $ARGV[0])?1:0);' "$1" ### works but is slow
}
do_test_text_file_on_dir() {
for f in "$1"/*; do
[ -f "$f" ] || continue
if is_text_file "$f"; then
echo "$f" is not a binary file
fi
done
}
do_test_text_file_on_dir ~/testdir

Avoid the time it takes to repeatedly load perl by doing all the work in Perl.
#!/usr/bin/perl
for (#ARGV) {
stat($_)
or warn("Can't stat \"$_\": $!\n"), next;
-f _ && !-B _
or next;
print("\"$_\" isn't a binary file\n");
}
Usage:
do_test_text_file_on_dir ~/testdir/*
Note: !-B _ is equivalent to -T _ except for empty files.

Related

How to develop a Condition to close program only when log file has been updated in Bash Script [duplicate]

I want to run a shell script when a specific file or directory changes.
How can I easily do that?

You may try entr tool to run arbitrary commands when files change. Example for files:
$ ls -d * | entr sh -c 'make && make test'
or:
$ ls *.css *.html | entr reload-browser Firefox
or print Changed! when file file.txt is saved:
$ echo file.txt | entr echo Changed!
For directories use -d, but you've to use it in the loop, e.g.:
while true; do find path/ | entr -d echo Changed; done
or:
while true; do ls path/* | entr -pd echo Changed; done

I use this script to run a build script on changes in a directory tree:
#!/bin/bash -eu
DIRECTORY_TO_OBSERVE="js" # might want to change this
function block_for_change {
inotifywait --recursive \
--event modify,move,create,delete \
$DIRECTORY_TO_OBSERVE
}
BUILD_SCRIPT=build.sh # might want to change this too
function build {
bash $BUILD_SCRIPT
}
build
while block_for_change; do
build
done
Uses inotify-tools. Check inotifywait man page for how to customize what triggers the build.

Use inotify-tools.
The linked Github page has a number of examples; here is one of them.
#!/bin/sh
cwd=$(pwd)
inotifywait -mr \
--timefmt '%d/%m/%y %H:%M' --format '%T %w %f' \
-e close_write /tmp/test |
while read -r date time dir file; do
changed_abs=${dir}${file}
changed_rel=${changed_abs#"$cwd"/}
rsync --progress --relative -vrae 'ssh -p 22' "$changed_rel" \
usernam#example.com:/backup/root/dir && \
echo "At ${time} on ${date}, file $changed_abs was backed up via rsync" >&2
done

How about this script? Uses the 'stat' command to get the access time of a file and runs a command whenever there is a change in the access time (whenever file is accessed).
#!/bin/bash
while true
do
ATIME=`stat -c %Z /path/to/the/file.txt`
if [[ "$ATIME" != "$LTIME" ]]
then
echo "RUN COMMNAD"
LTIME=$ATIME
fi
sleep 5
done

Check out the kernel filesystem monitor daemon
http://freshmeat.net/projects/kfsmd/
Here's a how-to:
http://www.linux.com/archive/feature/124903

As mentioned, inotify-tools is probably the best idea. However, if you're programming for fun, you can try and earn hacker XPs by judicious application of tail -f .

Just for debugging purposes, when I write a shell script and want it to run on save, I use this:
#!/bin/bash
file="$1" # Name of file
command="${*:2}" # Command to run on change (takes rest of line)
t1="$(ls --full-time $file | awk '{ print $7 }')" # Get latest save time
while true
do
t2="$(ls --full-time $file | awk '{ print $7 }')" # Compare to new save time
if [ "$t1" != "$t2" ];then t1="$t2"; $command; fi # If different, run command
sleep 0.5
done
Run it as
run_on_save.sh myfile.sh ./myfile.sh arg1 arg2 arg3
Edit: Above tested on Ubuntu 12.04, for Mac OS, change the ls lines to:
"$(ls -lT $file | awk '{ print $8 }')"

Add the following to ~/.bashrc:
function react() {
if [ -z "$1" -o -z "$2" ]; then
echo "Usage: react <[./]file-to-watch> <[./]action> <to> <take>"
elif ! [ -r "$1" ]; then
echo "Can't react to $1, permission denied"
else
TARGET="$1"; shift
ACTION="$#"
while sleep 1; do
ATIME=$(stat -c %Z "$TARGET")
if [[ "$ATIME" != "${LTIME:-}" ]]; then
LTIME=$ATIME
$ACTION
fi
done
fi
}

Quick solution for fish shell users who wanna track a single file:
while true
set old_hash $hash
set hash (md5sum file_to_watch)
if [ $hash != $old_hash ]
command_to_execute
end
sleep 1
end
replace md5sum with md5 if on macos.

Here's another option: http://fileschanged.sourceforge.net/
See especially "example 4", which "monitors a directory and archives any new or changed files".

inotifywait can satisfy you.
Here is a common sample for it:
inotifywait -m /path -e create -e moved_to -e close_write | # -m is --monitor, -e is --event
while read path action file; do
if [[ "$file" =~ .*rst$ ]]; then # if suffix is '.rst'
echo ${path}${file} ': '${action} # execute your command
echo 'make html'
make html
fi
done

Suppose you want to run rake test every time you modify any ruby file ("*.rb") in app/ and test/ directories.
Just get the most recent modified time of the watched files and check every second if that time has changed.
Script code
t_ref=0; while true; do t_curr=$(find app/ test/ -type f -name "*.rb" -printf "%T+\n" | sort -r | head -n1); if [ $t_ref != $t_curr ]; then t_ref=$t_curr; rake test; fi; sleep 1; done
Benefits
You can run any command or script when the file changes.
It works between any filesystem and virtual machines (shared folders on VirtualBox using Vagrant); so you can use a text editor on your Macbook and run the tests on Ubuntu (virtual box), for example.
Warning
The -printf option works well on Ubuntu, but do not work in MacOS.

How can I test if a file could be marked as executable and run? [duplicate]

I am wondering what's the easiest way to check if a program is executable with bash, without executing it ? It should at least check whether the file has execute rights, and is of the same architecture (for example, not a windows executable or another unsupported architecture, not 64 bits if the system is 32 bits, ...) as the current system.

Take a look at the various test operators (this is for the test command itself, but the built-in BASH and TCSH tests are more or less the same).
You'll notice that -x FILE says FILE exists and execute (or search) permission is granted.
BASH, Bourne, Ksh, Zsh Script
if [[ -x "$file" ]]
then
echo "File '$file' is executable"
else
echo "File '$file' is not executable or found"
fi
TCSH or CSH Script:
if ( -x "$file" ) then
echo "File '$file' is executable"
else
echo "File '$file' is not executable or found"
endif
To determine the type of file it is, try the file command. You can parse the output to see exactly what type of file it is. Word 'o Warning: Sometimes file will return more than one line. Here's what happens on my Mac:
$ file /bin/ls
/bin/ls: Mach-O universal binary with 2 architectures
/bin/ls (for architecture x86_64): Mach-O 64-bit executable x86_64
/bin/ls (for architecture i386): Mach-O executable i386
The file command returns different output depending upon the OS. However, the word executable will be in executable programs, and usually the architecture will appear too.
Compare the above to what I get on my Linux box:
$ file /bin/ls
/bin/ls: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), stripped
And a Solaris box:
$ file /bin/ls
/bin/ls: ELF 32-bit MSB executable SPARC Version 1, dynamically linked, stripped
In all three, you'll see the word executable and the architecture (x86-64, i386, or SPARC with 32-bit).
Addendum
Thank you very much, that seems the way to go. Before I mark this as my answer, can you please guide me as to what kind of script shell check I would have to perform (ie, what kind of parsing) on 'file' in order to check whether I can execute a program ? If such a test is too difficult to make on a general basis, I would at least like to check whether it's a linux executable or osX (Mach-O)
Off the top of my head, you could do something like this in BASH:
if [ -x "$file" ] && file "$file" | grep -q "Mach-O"
then
echo "This is an executable Mac file"
elif [ -x "$file" ] && file "$file" | grep -q "GNU/Linux"
then
echo "This is an executable Linux File"
elif [ -x "$file" ] && file "$file" | grep q "shell script"
then
echo "This is an executable Shell Script"
elif [ -x "$file" ]
then
echo "This file is merely marked executable, but what type is a mystery"
else
echo "This file isn't even marked as being executable"
fi
Basically, I'm running the test, then if that is successful, I do a grep on the output of the file command. The grep -q means don't print any output, but use the exit code of grep to see if I found the string. If your system doesn't take grep -q, you can try grep "regex" > /dev/null 2>&1.
Again, the output of the file command may vary from system to system, so you'll have to verify that these will work on your system. Also, I'm checking the executable bit. If a file is a binary executable, but the executable bit isn't on, I'll say it's not executable. This may not be what you want.

Seems nobody noticed that -x operator does not differ file with directory.
So to precisely check an executable file, you may use
[[ -f SomeFile && -x SomeFile ]]

Testing files, directories and symlinks
The solutions given here fail on either directories or symlinks (or both). On Linux, you can test files, directories and symlinks with:
if [[ -f "$file" && -x $(realpath "$file") ]]; then .... fi
On OS X, you should be able to install coreutils with homebrew and use grealpath.
Defining an isexec function
You can define a function for convenience:
isexec() {
if [[ -f "$1" && -x $(realpath "$1") ]]; then
true;
else
false;
fi;
}
Or simply
isexec() { [[ -f "$1" && -x $(realpath "$1") ]]; }
Then you can test using:
if `isexec "$file"`; then ... fi

Also seems nobody noticed -x operator on symlinks. A symlink (chain) to a regular file (not classified as executable) fails the test.

First you need to remember that in Unix and Linux, everything is a file, even directories. For a file to have the rights to be executed as a command, it needs to satisfy 3 conditions:
It needs to be a regular file
It needs to have read-permissions
It needs to have execute-permissions
So this can be done simply with:
[ -f "${file}" ] && [ -r "${file}" ] && [ -x "${file}" ]
If your file is a symbolic link to a regular file, the test command will operate on the target and not the link-name. So the above command distinguishes if a file can be used as a command or not. So there is no need to pass the file first to realpath or readlink or any of those variants.
If the file can be executed on the current OS, that is a different question. Some answers above already pointed to some possibilities for that, so there is no need to repeat it here.

To test whether a file itself has ACL_EXECUTE bit set in any of permission sets (user, group, others) regardless of where it resides, i. e. even on a tmpfs with noexec option, use stat -c '%A' to get the permission string and then check if it contains at least a single “x” letter:
if [[ "$(stat -c '%A' 'my_exec_file')" == *'x'* ]] ; then
echo 'Has executable permission for someone'
fi
The right-hand part of comparison may be modified to fit more specific cases, such as *x*x*x* to check whether all kinds of users should be able to execute the file when it is placed on a volume mounted with exec option.

This might be not so obvious, but sometime is required to test the executable to appropriately call it without an external shell process:
function tkl_is_file_os_exec()
{
[[ ! -x "$1" ]] && return 255
local exec_header_bytes
case "$OSTYPE" in
cygwin* | msys* | mingw*)
# CAUTION:
# The bash version 3.2+ might require a file path together with the extension,
# otherwise will throw the error: `bash: ...: No such file or directory`.
# So we make a guess to avoid the error.
#
{
read -r -n 4 exec_header_bytes 2> /dev/null < "$1" ||
{
[[ -x "${1%.exe}.exe" ]] && read -r -n 4 exec_header_bytes 2> /dev/null < "${1%.exe}.exe"
} ||
{
[[ -x "${1%.com}.com" ]] && read -r -n 4 exec_header_bytes 2> /dev/null < "${1%.com}.com"
}
} &&
if [[ "${exec_header_bytes:0:3}" == $'MZ\x90' ]]; then
# $'MZ\x90\00' for bash version 3.2.42+
# $'MZ\x90\03' for bash version 4.0+
[[ "${exec_header_bytes:3:1}" == $'\x00' || "${exec_header_bytes:3:1}" == $'\x03' ]] && return 0
fi
;;
*)
read -r -n 4 exec_header_bytes < "$1"
[[ "$exec_header_bytes" == $'\x7fELF' ]] && return 0
;;
esac
return 1
}
# executes script in the shell process in case of a shell script, otherwise executes as usual
function tkl_exec_inproc()
{
if tkl_is_file_os_exec "$1"; then
"$#"
else
. "$#"
fi
return $?
}
myscript.sh:
#!/bin/bash
echo 123
return 123
In Cygwin:
> tkl_exec_inproc /cygdrive/c/Windows/system32/cmd.exe /c 'echo 123'
123
> tkl_exec_inproc /cygdrive/c/Windows/system32/chcp.com 65001
Active code page: 65001
> tkl_exec_inproc ./myscript.sh
123
> echo $?
123
In Linux:
> tkl_exec_inproc /bin/bash -c 'echo 123'
123
> tkl_exec_inproc ./myscript.sh
123
> echo $?
123

bash script porting issue related to script program

I am trying to port an existing bash script to Solaris and FreeBSD. It works fine on Fedora and Ubuntu.
This bash script uses the following set of commands to flush the output to the temporary file.
file=$(mktemp)
# record test_program output into a temp file
script -qfc "test_program arg1" "$file" </dev/null &
The script program does not have -qfc options on FreeBSD and Solaris. On Solaris and FreeBSD, script program only has -a option. I have done the following until now:
1) update to latest version of bash. This did not help.
2) Try to find out where exactly is the source code of "script" program is. I could not find it either.
Can somebody help me out here?

script is a standalone program, not part of the shell, and as you noticed, only the -a flag is available in all variants. The FreeBSD version supports something similar to -f (-F <file>) and doesn't need -c.
Here's an ugly but more portable solution:
buildsh() {
cat <<-!
#!/bin/sh
SHELL="$SHELL" exec \\
!
# Build quoted argument list
while [ $# != 0 ]; do echo "$1"; shift; done |
sed 's/'\''/'\'\\\\\'\''/g;s/^/'\''/;s/$/'\''/;!$s/$/ \\/'
}
# Build a shell script with the arguments and run it within `script`
record() {
local F t="$(mktemp)" f="$1"
shift
case "$(uname -s)" in
Linux) F=-f ;;
FreeBSD) F=-F ;;
esac
buildsh "$#" > "$t" &&
chmod 500 "$t" &&
SHELL="$t" script $F "$f" /dev/null
rm -f "$t"
sed -i '1d;$d' "$f" # Emulate -q
}
file=$(mktemp)
# record test_program output into a temp file
record "$file" test_program arg1 </dev/null &

List files greater than 100K in bash

I want to list the files recursively in the HOME directory. I'm trying to write my own script , so I should not use the command find or ls. My script is:
#!/bin/bash
minSize=102400;
printFiles() {
for x in "$1/"*; do
if [ -d "$x" ]; then
printFiles "$x";
else
size=$(wc -c "$x");
if [[ "$size" -gt "$minSize" ]]; then
echo "$size";
fi
fi
done
}
printFiles "/~";
So, the problem here is that when I run this script, the terminal throws Line 11: division by 0 and /home/gandalf/Videos/*: No such file or directory. I have not divided by any number, why I'm getting this error?. And the second one?
Alternatively, I can't use find or ls because I have to display the files one by one asking to the user if he want to see the next file or not. This is possible using the command find or ls or only can be done writing my own function?
Thanks.

size=$(wc -c "$x");
That's the line that is failing. When you run that wc command manually you should be able to see why:
$ wc -c /tmp/out
5 /tmp/out
The output contains not only the file size but also the file name. So you can't use $size with the -gt comparator on the next line. One way to fix that is to change the wc line to use cut (or awk, or sed, etc) to keep just the file size.
size=$(wc -c "$x" | cut -f1 -d " ")
A simpler alternative suggested by #mklement0:
size=$(wc -c < "$x")

How do I check if a file is executable on Linux [duplicate]

I am wondering what's the easiest way to check if a program is executable with bash, without executing it ? It should at least check whether the file has execute rights, and is of the same architecture (for example, not a windows executable or another unsupported architecture, not 64 bits if the system is 32 bits, ...) as the current system.

Seems nobody noticed that -x operator does not differ file with directory.
So to precisely check an executable file, you may use
[[ -f SomeFile && -x SomeFile ]]

Testing files, directories and symlinks
The solutions given here fail on either directories or symlinks (or both). On Linux, you can test files, directories and symlinks with:
if [[ -f "$file" && -x $(realpath "$file") ]]; then .... fi
On OS X, you should be able to install coreutils with homebrew and use grealpath.
Defining an isexec function
You can define a function for convenience:
isexec() {
if [[ -f "$1" && -x $(realpath "$1") ]]; then
true;
else
false;
fi;
}
Or simply
isexec() { [[ -f "$1" && -x $(realpath "$1") ]]; }
Then you can test using:
if `isexec "$file"`; then ... fi

Also seems nobody noticed -x operator on symlinks. A symlink (chain) to a regular file (not classified as executable) fails the test.

First you need to remember that in Unix and Linux, everything is a file, even directories. For a file to have the rights to be executed as a command, it needs to satisfy 3 conditions:
It needs to be a regular file
It needs to have read-permissions
It needs to have execute-permissions
So this can be done simply with:
[ -f "${file}" ] && [ -r "${file}" ] && [ -x "${file}" ]
If your file is a symbolic link to a regular file, the test command will operate on the target and not the link-name. So the above command distinguishes if a file can be used as a command or not. So there is no need to pass the file first to realpath or readlink or any of those variants.
If the file can be executed on the current OS, that is a different question. Some answers above already pointed to some possibilities for that, so there is no need to repeat it here.

To test whether a file itself has ACL_EXECUTE bit set in any of permission sets (user, group, others) regardless of where it resides, i. e. even on a tmpfs with noexec option, use stat -c '%A' to get the permission string and then check if it contains at least a single “x” letter:
if [[ "$(stat -c '%A' 'my_exec_file')" == *'x'* ]] ; then
echo 'Has executable permission for someone'
fi
The right-hand part of comparison may be modified to fit more specific cases, such as *x*x*x* to check whether all kinds of users should be able to execute the file when it is placed on a volume mounted with exec option.

This might be not so obvious, but sometime is required to test the executable to appropriately call it without an external shell process:
function tkl_is_file_os_exec()
{
[[ ! -x "$1" ]] && return 255
local exec_header_bytes
case "$OSTYPE" in
cygwin* | msys* | mingw*)
# CAUTION:
# The bash version 3.2+ might require a file path together with the extension,
# otherwise will throw the error: `bash: ...: No such file or directory`.
# So we make a guess to avoid the error.
#
{
read -r -n 4 exec_header_bytes 2> /dev/null < "$1" ||
{
[[ -x "${1%.exe}.exe" ]] && read -r -n 4 exec_header_bytes 2> /dev/null < "${1%.exe}.exe"
} ||
{
[[ -x "${1%.com}.com" ]] && read -r -n 4 exec_header_bytes 2> /dev/null < "${1%.com}.com"
}
} &&
if [[ "${exec_header_bytes:0:3}" == $'MZ\x90' ]]; then
# $'MZ\x90\00' for bash version 3.2.42+
# $'MZ\x90\03' for bash version 4.0+
[[ "${exec_header_bytes:3:1}" == $'\x00' || "${exec_header_bytes:3:1}" == $'\x03' ]] && return 0
fi
;;
*)
read -r -n 4 exec_header_bytes < "$1"
[[ "$exec_header_bytes" == $'\x7fELF' ]] && return 0
;;
esac
return 1
}
# executes script in the shell process in case of a shell script, otherwise executes as usual
function tkl_exec_inproc()
{
if tkl_is_file_os_exec "$1"; then
"$#"
else
. "$#"
fi
return $?
}
myscript.sh:
#!/bin/bash
echo 123
return 123
In Cygwin:
> tkl_exec_inproc /cygdrive/c/Windows/system32/cmd.exe /c 'echo 123'
123
> tkl_exec_inproc /cygdrive/c/Windows/system32/chcp.com 65001
Active code page: 65001
> tkl_exec_inproc ./myscript.sh
123
> echo $?
123
In Linux:
> tkl_exec_inproc /bin/bash -c 'echo 123'
123
> tkl_exec_inproc ./myscript.sh
123
> echo $?
123

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string