Print permissions from file arguments in Bash script - linux

I'm having trouble reading the permissions of file arguments. It looks like it has something to do with hidden files, but I'm not sure why.
Current Code:
#!/bin/bash
if [ $# = 0 ]
then
echo "Usage ./checkPerm filename [filename2 ... filenameN]"
exit 0
fi
for file in $@
do
ls -l | grep $file | cut -f1 -d' '
# Do Something
done
I can get the permissions for each input, but when a hidden file is run through the loop it re-prints the permissions of all files.
-bash-4.1$ ll test*
-rw-r--r-- 1 user joe 0 Nov 11 19:07 test1
-r-xr-xr-x 1 user joe 0 Nov 11 19:07 test2*
-r--r----- 1 user joe 0 Nov 11 19:07 test3
-rwxr-x--- 1 user joe 0 Nov 11 19:07 test4*
-bash-4.1$ ./checkPerm test*
-rw-r--r--
-rw-r--r--
-r-xr-xr-x
-r--r-----
-rwxr-x---
-r--r-----
-rw-r--r--
-r-xr-xr-x
-r--r-----
-rwxr-x---
-bash-4.1$
What is going on in the loop?

It's your grep:
ls -l | grep 'test2*'
This will grep out anything whose name starts with test, since you're basically asking for test followed by zero or more 2s, as specified by the 2* in the regular expression.
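A quick way to see this, using the four test files from above (test2* is quoted so grep receives it as a regex instead of the shell expanding it):
printf '%s\n' test1 test2 test3 test4 | grep 'test2*'
All four names are printed, because zero 2s is enough for a match.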
To get your intended result, simply remove your loop and replace it with this:
ls -l "$#" | cut -d' ' -f1
Or keep your loop, but remove the grep:
ls -l $file | cut -d' ' -f1
Also, technically, none of those files are hidden. Hidden files in bash start with ., like .bashrc.

When you run ls -l inside the loop and then grep the results, any file that merely contains test1 somewhere in its name is also selected by the grep, giving you extra results. You could see that by doing:
ls -l | grep test
and seeing that there are many more entries than the 4 you get with ls -l test*.
Inside your loop, you should probably use just:
ls -ld "$file" | cut -d' ' -f1

Related

How to only display owner of file when using ls command with special edge case

My objective is to find all files in a directory recursively and display only the file owner name so I'm able to use uniq to count the # of files a user owns in a directory. The command I am using is the following:
command = "find " + subdirectory.directoryPath + "/ -type f -exec ls -lh {} + | cut -f 3 -d' ' | sort | uniq -c | sort -n"
This command successfully displays only the owner of the file on each line, and allows me to count the # of times an owner name is repeated, hence getting the # of files they own in a subdirectory. cut uses ' ' as a delimiter and only keeps the 3rd column of ls, which is the owner of the file.
However, for my purpose there is this special edge case, where I'm not able to obtain the owner name if the following occurs.
-rw-r----- 1 31122918 group 20169510233 Mar 17 06:02
-rw-r----- 1 user1 group 20165884490 Mar 25 11:11
-rw-r----- 1 user1 group 20201669165 Mar 31 04:17
-rwxr-x--- 1 user3 group 20257297418 Jun 2 13:25
-rw-r----- 1 user2 group 20048291543 Mar 4 22:04
-rw-r----- 1 14235912 group 20398346003 Mar 10 04:47
The special edge cases are the #s as the owner you see above. The current command I'm using can detect user1, user2, and user3 perfectly, but because the numbers are placed all the way to the right, the command above doesn't detect the numbers and simply displays nothing. Example output is shown here:
1
1 user3
1 user2
1
2 user1
Can anyone help me parse the ls output so I'm able to detect these #'s when trying to only print the file owner column?
cut -d' ' won't capture the third field when it contains leading spaces -- each space is treated as the separator of another field.
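For example, with a made-up line that has a run of two spaces before the owner field:
line='-rw-r----- 1  user1 group 123 Mar 25 11:11'
printf '%s\n' "$line" | cut -d' ' -f3     # prints an empty line: field 3 is the empty string between the two spaces
printf '%s\n' "$line" | awk '{print $3}'  # prints user1, because awk collapses runs of whitespace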
Alternatives:
cut -c
123456789X123456789X123456789X123456789X123456789L0123456789X0123
-rw-r----- 1 31122918 group 20169510233 Mar 17 06:02
-rw-r----- 1 user1 group 20165884490 Mar 25 11:11
The data you seek is between characters 15 and 34 on each line, so you can say
cut -c14-39
perl/awk: other tools are adept at extracting data out of a line. Try one of
perl -lane 'print $F[2]'
awk '{print $3}'
Don't try to parse the output of ls. Use the stat command.
find dirname ! -user root -type f -exec stat --format=%U {} + | sort | uniq -c | sort -n
%U prints the owner username.
Merging multiple spaces
tr -s ' '
Get file users
ls -hl | tr -s ' ' | cut -f 3 -d' '
ls -hl | awk '{print $3}'
sudo find ./ ! -user root -type f -exec ls -lh {} + | tr -s ' ' | cut -f 3 -d' ' | sort | uniq -c | sort -n
You can use the below command to display only the owner of a directory or a file.
stat -c "%U" /path/of/the/file/or/directory
If you also want to print the group of a file or directory you can use %G as well.
stat -c "%U %G" /path/of/the/file/or/directory

Use Bash Perl to fetch substring off of command output

Let's presume the text I'm working with is (which is outputted by pecl install xdebug):
| - A list of all settings: https://xdebug.org/docs-settings.php |
| - A list of all functions: https://xdebug.org/docs-functions.php |
| - Profiling instructions: https://xdebug.org/docs-profiling2.php |
| - Remote debugging: https://xdebug.org/docs-debugger.php |
| |
| |
| NOTE: Please disregard the message |
| You should add "extension=xdebug.so" to php.ini |
| that is emitted by the PECL installer. This does not work for |
| Xdebug. |
| |
+----------------------------------------------------------------------+
running: find "/tmp/pear/temp/pear-build-defaultuserNxuIJy/install-xdebug-2.9.2" | xargs ls -dils
1078151 4 drwxr-xr-x 3 root root 4096 Feb 3 17:40 /tmp/pear/temp/pear-build-defaultuserNxuIJy/install-xdebug-2.9.2
1078337 4 drwxr-xr-x 3 root root 4096 Feb 3 17:40 /tmp/pear/temp/pear-build-defaultuserNxuIJy/install-xdebug-2.9.2/usr
1078338 4 drwxr-xr-x 3 root root 4096 Feb 3 17:40 /tmp/pear/temp/pear-build-defaultuserNxuIJy/install-xdebug-2.9.2/usr/local
1078339 4 drwxr-xr-x 3 root root 4096 Feb 3 17:40 /tmp/pear/temp/pear-build-defaultuserNxuIJy/install-xdebug-2.9.2/usr/local/lib
1078340 4 drwxr-xr-x 3 root root 4096 Feb 3 17:40 /tmp/pear/temp/pear-build-defaultuserNxuIJy/install-xdebug-2.9.2/usr/local/lib/php
1078341 4 drwxr-xr-x 3 root root 4096 Feb 3 17:40 /tmp/pear/temp/pear-build-defaultuserNxuIJy/install-xdebug-2.9.2/usr/local/lib/php/extensions
1078342 4 drwxr-xr-x 2 root root 4096 Feb 3 17:40 /tmp/pear/temp/pear-build-defaultuserNxuIJy/install-xdebug-2.9.2/usr/local/lib/php/extensions/no-debug-non-zts-20180731
1078336 2036 -rwxr-xr-x 1 root root 2084800 Feb 3 17:40 /tmp/pear/temp/pear-build-defaultuserNxuIJy/install-xdebug-2.9.2/usr/local/lib/php/extensions/no-debug-non-zts-20180731/xdebug.so
Build process completed successfully
Installing '/usr/local/lib/php/extensions/no-debug-non-zts-20180731/xdebug.so'
install ok: channel://pecl.php.net/xdebug-2.9.2
configuration option "php_ini" is not set to php.ini location
You should add "zend_extension=/usr/local/lib/php/extensions/no-debug-non-zts-20180731/xdebug.so" to php.ini
I want to extract this part out of the output and save it in a variable for later use:
zend_extension=/usr/local/lib/php/extensions/no-debug-non-zts-20180731/xdebug.so
I have attempted doing it like this with Perl without success:
echo $OUTPUT | perl -lne 'm/You should add "(.*)"/; print $1'
How do I get the substring dynamically with perl? What's the pattern that I need to use?
With the $OUTPUT text placed in a file output.txt
cat output.txt | perl -wnE'say $1 if /You should add "(zend_extension=.*)"/'
This uses the specifics of the shown text, in particular the seemingly unique zend_extension=... preface for the path, to distinguish the needed line from an earlier "You should add" pattern. Change as needed, to what is more suitable for your problem.
If the text is thrown at the one-liner as one string in your code then add -0777 flag to test.
Otherwise please clarify how that $OUTPUT comes about.
Tested with a bash script
#!/bin/bash
# Last modified: 2020 Feb 03 (12:58)
OUTPUT=$(cat "output.txt")
echo $OUTPUT | perl -wnE'say $1 if /You should add "(zend_extension=.*)"/'
where output.txt is a file with the text from the question, and the right line is printed.
You can use this perl:
perl -lne 'print $1 if /You should add "(?!extension=xdebug\.so)([^"]+)"/' <<< "$OUTPUT"
zend_extension=/usr/local/lib/php/extensions/no-debug-non-zts-20180731/xdebug.so
The negative lookahead (?!extension=xdebug\.so) makes the regex ignore the extension=xdebug.so line in the output.
Alternatively you may match You should add at the line start:
perl -lne 'print $1 if /^You should add "([^"]+)"/' <<< "$OUTPUT"
Probably OP meant to use
echo $OUTPUT | perl -ne 'm/You should add "(.*)"/ && print $1'
or
echo $OUTPUT | perl -ne 'print $1 if m/You should add "(.*)"/'
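If you would rather avoid perl entirely, bash's own regex matching can do the extraction as well (a sketch; it assumes $OUTPUT holds the pecl install xdebug text shown above):
re='You should add "(zend_extension=[^"]+)"'
if [[ $OUTPUT =~ $re ]]; then
    xdebug_ini_line="${BASH_REMATCH[1]}"   # variable name is only illustrative
    echo "$xdebug_ini_line"
fi
BASH_REMATCH[1] holds the captured zend_extension=... string.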

Sort files in directory and then printing the content

I need to write a script to sort filenames by the character that comes after the first "0" in the name. All the file names contain at least one 0.
Then the script should print the content of each file by that order.
I know I need to use sort and cat, but I can't figure out the sort part. This is as far as I've got:
#!/bin/bash
dir=$(pwd)
for n in $dir `ls | sort -u ` ; do
cat $n
done;
Assuming that
the first zero could be anywhere in the filename,
there could be several files with the same name after the zero,
you want to be able to handle any filename, including dotfiles and names containing newlines, and
you have GNU CoreUtils installed (standard on common distros),
you'll need to do something crazy like this (untested):
find . -mindepth 1 -maxdepth 1 -exec printf '%s\0' {} + | while IFS= read -r -d ''
do
printf '%s\0' "${REPLY#*0}"
done | sort --unique --zero-terminated | while IFS= read -r -d ''
do
for file in ./*"$REPLY"
do
[…]
done
done
Explanation:
Print all filenames NUL separated and read them back in to be able to do variable substitution on them.
Remove everything up to and including the first zero in the filename and print that.
Sort by the remainder of the filename, making sure to only print each unique suffix once.
Process each file ending with the (now sorted) suffix.
Take a look at this find + xargs that will correctly handle filenames with "funny characters":
find . -maxdepth 1 -type f -name '*0*' -print0 | sort -zt0 -k2 | xargs -0 cat
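To see how the -t0 -k2 key works (it sorts on everything after the first 0 in each name), here is a quick newline-separated check with a few of the sample names used below, assuming the C locale:
printf '%s\n' afile0zzz file0b file0123 | sort -t0 -k2
file0123
file0b
afile0zzz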
You could write a script that looks like this:
#!/bin/bash
# using "shopt -s nullglob" so that an empty directory won't give you a literal '*'.
shopt -s nullglob
# get a sorted directory listing
filelist=$(for i in .*0* *0*; do echo "$i"; done | sort -t0 -k2)
IFS=$(echo -en "\n\b")
# iterate over your sorted list
for f in $filelist
do
# just cat text files.
file $f | grep text > /dev/null 2>&1
if [ $? = 0 ]
then
cat $f
fi
done
Test:
[plankton@localhost SO_scripts]$ ls -l
total 40
-rw-r--r-- 1 plankton plankton 10 Sep 9 10:56 afile0zzz
-rw-r--r-- 1 plankton plankton 14 Sep 9 10:56 bfile xxx0yyy
-rwxr-xr-x 1 plankton plankton 488 Sep 9 10:56 catfiles.sh
-rw-r--r-- 1 plankton plankton 9 Sep 9 10:56 file0123
-rw-r--r-- 1 plankton plankton 9 Sep 9 10:56 file0124
-rw-r--r-- 1 plankton plankton 7 Sep 9 10:56 file0a
-rw-r--r-- 1 plankton plankton 8 Sep 9 10:56 file0aa
-rw-r--r-- 1 plankton plankton 7 Sep 9 10:56 file0b
-rw-r--r-- 1 plankton plankton 9 Sep 9 10:56 file0bbb
-rw-r--r-- 1 plankton plankton 18 Sep 9 10:56 files*_0asdf
[plankton@localhost SO_scripts]$ ./catfiles.sh
. is not a text file
.. is not a text file
Doing catfiles.sh
#/bin/bash
# using "shopt -s nullglob" so that an empty directory won't give you a literal '*'.
shopt -s nullglob
# get a sorted directory listing
filelist=$(for i in .* *; do echo "$i"; done | sort -t0 -k2)
IFS=$(echo -en "\n\b")
# iterate over your sorted list
for f in $(for i in .* *; do echo "$i"; done | sort -t0 -k2)
do
# just cat text files.
file $f | grep text > /dev/null 2>&1
if [ $? = 0 ]
then
echo "Doing $f"
cat $f
else
echo "$f is not a text file"
fi
done
Doing file0123
file0123
Doing file0124
file0124
Doing file0a
file0a
Doing file0aa
file0aa
Doing files*_0asdf
file with * in it
Doing file0b
file0b
Doing file0bbb
file0bbb
Doing bfile xxx0yyy
bfile xxx0yyy
Doing afile0zzz
afile0zzz
Updated as per PesaThe's suggestion of .*0* *0*.
dir=$(pwd)
for n in `ls -1 $dir | sort -t0 -k2`; do
cat $n
done;

Find regular expression matching condition

I have a set of files including a date in their name:
MERRA2_400.tavg1_2d_slv_Nx.20151229.SUB.nc
MERRA2_400.tavg1_2d_slv_Nx.20151230.SUB.nc
MERRA2_400.tavg1_2d_slv_Nx.20151231.SUB.nc
I want to select the files matching a condition on this date. In this example: date > 20151230
I tried things like:
find . -regex ".*.SUB.nc" | cut -d "." -f 4 | while read a; do if [ $a -ge 20151201 ]; then echo $a; fi; done
BUT:
1) This is returning only a part of the filename, whereas I would like to return the entire filename.
2) There may be a more elegant way than using while read/do
thanks in advance!
Rearranged, your code becomes:
#!/usr/bin/env bash
find . -regex ".*.SUB.nc" \
| rev | cut -d '.' -f 3 | rev \
| while read a; do
if [ $a -ge 20151201 ]; then
echo $a
fi
done
rev | cut -d '.' -f 3 | rev is used because, if you give an absolute path or the subdirectories have . in their names, the date won't be the 4th field; it will, however, always be the 3rd-last field.
This will give the output:
20151231
20151229
20151230
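As a quick sanity check of the 3rd-last-field extraction on a single path (dir.v1 is a made-up subdirectory with a dot in its name):
echo "./dir.v1/MERRA2_400.tavg1_2d_slv_Nx.20151229.SUB.nc" | rev | cut -d '.' -f 3 | rev
20151229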
To show the complete file names, replace echo $a with ls *$a*. Output:
MERRA2_400.tavg1_2d_slv_Nx.20151231.SUB.nc
MERRA2_400.tavg1_2d_slv_Nx.20151229.SUB.nc
MERRA2_400.tavg1_2d_slv_Nx.20151230.SUB.nc
I tested this script with file names whose dates are less than 20151201. For example MERRA2_400.tavg1_2d_slv_Nx.20151200.SUB.nc. The results are consistent.
Perhaps a more efficient way to accomplish your task is using a grep regex like:
find . -regex ".*.SUB.nc" | grep -E "201512(0[1-9]|[1-9][0-9])|201[6-9][0-9][0-9][0-9]"
This will work just fine.
find . -regex ".*.SUB.nc" | rev | cut -d '.' -f 3 | rev | while read a; do if [ $a -ge 20151201 ]; then echo `ls -R | grep $a` ;fi ;done
rev | cut -d '.' -f 3 | rev is used because, if you give an absolute path or the subdirectories have . in their names, the date won't be the 4th field; it will, however, always be the 3rd-last field.
ls -R | grep $a is used so that you can recursively find the name of the file.
Assume the files and directory structure are as follows:
[root@localhost temp]# ls -lrt -R
.:
total 8
-rw-r--r--. 1 root root 0 Apr 25 16:15 MERRA2_400.tavg1_2d_slv_Nx.20151231.SUB.nc
-rw-r--r--. 1 root root 0 Apr 25 16:15 MERRA2_400.tavg1_2d_slv_Nx.20151230.SUB.nc
-rw-r--r--. 1 root root 0 Apr 25 16:15 MERRA2_400.tavg1_2d_slv_Nx.20151229.SUB.nc
drwxr-xr-x. 2 root root 4096 Apr 25 16:32 temp.3
drwxr-xr-x. 3 root root 4096 Apr 25 17:13 temp2
./temp.3:
total 0
./temp2:
total 4
-rw-r--r--. 1 root root 0 Apr 25 16:27 MERRA2_400.tavg1_2d_slv_Nx.20151111.SUB.nc
-rw-r--r--. 1 root root 0 Apr 25 16:27 MERRA2_400.tavg1_2d_slv_Nx.20151222.SUB.nc
drwxr-xr-x. 2 root root 4096 Apr 25 17:13 temp21
./temp2/temp21:
total 0
-rw-r--r--. 1 root root 0 Apr 25 17:13 MERRA2_400.tavg1_2d_slv_Nx.20151333.SUB.nc
Running the above command gives:
MERRA2_400.tavg1_2d_slv_Nx.20151229.SUB.nc
MERRA2_400.tavg1_2d_slv_Nx.20151231.SUB.nc
MERRA2_400.tavg1_2d_slv_Nx.20151230.SUB.nc
MERRA2_400.tavg1_2d_slv_Nx.20151333.SUB.nc
MERRA2_400.tavg1_2d_slv_Nx.20151222.SUB.nc

Split texts into smaller texts of n number of words

I have a large number of texts (several thousand) in a txt format and would like to split them into 500-word long chunks and to save these chunks into separate folders.
< *.txt tr -c A-Za-z0-9 \\n | grep -v '^$' | split -l 500
can do the job but it splits texts to one word per line, whereas I would like to retain the original format.
I was wondering if there is a bash command or Python script to do this.
You should also be able to do that with csplit, but I had better luck with the perl solution found here: https://unix.stackexchange.com/questions/66513/how-can-i-split-a-large-text-file-into-chunks-of-500-words-or-so
Thanks to Joseph R.
$ cat generatewordchunks.pl
perl -e '
undef $/;
$file=<>;
while($file=~ /\G((\S+\s+){500})/gc)
{
$i++;
open A,">","chunk-$i.txt";
print A $1;
close A;
}
$i++;
if($file=~ /\G(.+)\Z/sg)
{
open A,">","chunk-$i.txt";
print A $1;
}
' $1
$ ./generatewordchunks.pl woord.list
$ ls -ltr
total 13
-rwxrwx--- 1 root vboxsf 5934 Jul 31 16:03 woord.list
-rwxrwx--- 1 root vboxsf 362 Jul 31 16:08 generatewordchunks.pl
-rwxrwx--- 1 root vboxsf 4203 Jul 31 16:11 chunk-1.txt
-rwxrwx--- 1 root vboxsf 1731 Jul 31 16:11 chunk-2.txt
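Since the question mentions several thousand texts and separate folders, a small wrapper around the script above could process them all (a sketch; the chunks_<name> folder layout is just an assumption):
for f in *.txt; do
    d="chunks_${f%.txt}"                            # one output folder per input text
    mkdir -p "$d"
    ( cd "$d" && ../generatewordchunks.pl "../$f" ) # chunk-N.txt files land in that folder
done
Each input file then gets its own folder of chunk-1.txt, chunk-2.txt, and so on.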
