Linux bash: reading lines and words from files

I apologize if this is a trivial question. I am learning how to use linux bash and this little task is giving me a headache...
So I need to write a script, let's call it count.sh. I want it to print, for each file in the working directory, the filename, the number of lines, and the number of words to the console:
test.txt 100 1023
someOtherfiles 10 233
So far, I know that the following gives me all the file names in the directory. And thanks to all who helped me, I now have this working version:
for f in *; do
    echo -n "$f"
    cat "$f" | wc -wl
done
I would really appreciate your help! Thanks ahead!
P.s. If you know great resources (links to tutorials) for learning about scripting and are willing to share them with me, please do. I think I really need to know these basics. Thanks again!

If you must have the file name as the first field in your output, try this:
for f in *; do
    if [ -f "$f" ]; then
        echo -n "$f"
        cat "$f" | wc -wl
    fi
done

for f in *; do
    if [[ -f $f ]]; then
        echo "$f $(wc -wl < "$f")"
    fi
done
[[ -f $f ]] processes only regular files (it excludes subdirectories) and also handles the case where the directory is empty, in which case * is by default left unexpanded, i.e. assigned to $f as is.
echo "$f $(wc -wl < "$f")" uses command substitution ($( ... )) to directly include the output from the enclosed command in the output string passed to echo.
Note that the reason < is used to feed the content of file $f to wc via stdin is that wc would otherwise append the name of the input file to its output (thanks, @R Sahu).
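For illustration, here is the difference in output (test.txt and its counts are made up for this example):
# hypothetical file test.txt with 100 lines and 1023 words
wc -wl test.txt      # ->    100   1023 test.txt   (file name appended)
wc -wl < test.txt    # ->    100   1023             (reading stdin, so no name to print)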

Related

extracting files that don't have a dir with the same name

Sorry for the odd title. I didn't know how to word it the right way.
I'm trying to write a script to split my wiki files into those that have directories with the same name and those that don't. I'll elaborate further.
here is my file system:
What I need to do is print a list of those files which have directories with the same name, and another list of those without.
So my ultimate goal is getting:
with dirs:
Docs
Eng
Python
RHEL
To_do_list
articals
without dirs:
orphan.txt
orphan2.txt
orphan3.txt
I managed to get those files with dirs. Here is my code:
getname () {
    file=$( basename "$1" )
    file2=${file%%.*}
    echo $file2
}

for d in Mywiki/* ; do
    if [[ -f $d ]]; then
        file=$(getname $d)
        for x in Mywiki/* ; do
            dir=$(getname $x)
            if [[ -d $x ]] && [ $dir == $file ]; then
                echo $dir
            fi
        done
    fi
done
but I'm stuck on getting those without. If this is the wrong way of doing this, please point out the right one.
Any help is appreciated. Thanks.
Here's a quick attempt.
for file in Mywiki/*.txt; do
    nodir=${file##*/}
    test -d "${file%.txt}" && printf "%s\n" "$nodir" || printf "%s\n" "$nodir" >&3
done >with 3>without
This shamelessly uses standard output for the non-orphans. Maybe more robustly open another separate file descriptor for that.
Also notice how everything needs to be quoted unless you specifically require the shell to do whitespace tokenization and wildcard expansion on the value of a token.
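A tiny illustration of why the quoting matters (the file name here is invented):
file='My Notes.txt'
test -d ${file%.txt}     # unquoted: expands to two words -> "test: too many arguments"
test -d "${file%.txt}"   # quoted: one argument, "My Notes", as intended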
That may not be the most efficient way of doing it, but you could take all the files, remove the extension, and then check whether there is no directory with that name.
Like this (untested code):
# getname is the helper function defined in the question above
for file in Mywiki/* ; do
    if [ -f "$file" ]; then
        dirname=$(getname "$file")
        if [ ! -d "Mywiki/$dirname" ]; then
            echo "$file"
        fi
    fi
done
To List all the files in current dir
list1=`ls -p | grep -v /`
To List all the files in current dir without extension
list2=`ls -p | grep -v / | sed 's/\.[^.]*$//'`
To List all the directories in current dir
list3=`ls -d */ | sed -e "s/\///g"`
Now you can get the desired directory listing using the intersection of list2 and list3 (see: Intersection of two lists in Bash).
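As a rough sketch of that intersection step (assuming the names contain no whitespace, and using comm, which needs sorted input):
# files without extension, and directory names, both sorted
files=$(ls -p | grep -v / | sed 's/\.[^.]*$//' | sort -u)
dirs=$(ls -d */ 2>/dev/null | sed 's#/$##' | sort -u)
# intersection: files that DO have a directory with the same name
comm -12 <(echo "$files") <(echo "$dirs")
# only in the first list: files that do NOT have a matching directory
comm -23 <(echo "$files") <(echo "$dirs")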

Bash scripting wanting to find a size of a directory and if size is greater than x then do a task

I have put the following together from a couple of other articles, but it does not seem to be working. What I am eventually trying to do is check the directory size, and if the directory has new content above a certain total size, have the script let me know.
#!/bin/bash
file=private/videos/tv
minimumsize=2
actualsize=$(du -m "$file" | cut -f 1)
if [ $actualsize -ge $minimumsize ]; then
echo "nothing here to see"
else
echo "time to sync"
fi
this is the output:
./sync.sh: line 5: [: too many arguments
time to sync
I am new to bash scripting so thank you in advance.
The error:
[: too many arguments
seems to indicate that either $actualsize or $minimumsize is expanding to more than one argument.
Change your script as follows:
#!/bin/bash
set -x # Add this line.
file=private/videos/tv
minimumsize=2
actualsize=$(du -m "$file" | cut -f 1)
echo "[$actualsize] [$minimumsize]" # Add this line.
if [ $actualsize -ge $minimumsize ]; then
echo "nothing here to see"
else
echo "time to sync"
fi
The set -x will echo commands before attempting to execute them, something which assists greatly with debugging.
The echo "[$actualsize] [$minimumsize]" will assist in trying to establish whether these variables are badly formatted or not, before the attempted comparison.
If you do that, you'll no doubt find that some arguments will result in a lot of output from the du -m command since it descends into subdirectories and gives you multiple lines of output.
If you want a single line of output for all the subdirectories aggregated, you have to use the -s flag as well:
actualsize=$(du -ms "$file" | cut -f 1)
If instead you don't want any of the subdirectories taken into account, you can take a slightly different approach, limiting the depth to one and tallying up all the sizes:
actualsize=$(find . -maxdepth 1 -type f -print0 | xargs -0 ls -l | awk '{s += $5} END {print int(s/1024/1024)}')
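Putting it together, a hedged version of the original script, with the size aggregated to a single line and the variables quoted in the comparison (path, threshold, and messages are the question's own):
#!/bin/bash
file=private/videos/tv
minimumsize=2
# -s makes du print one aggregated line, so actualsize is a single number
actualsize=$(du -ms "$file" | cut -f 1)
if [ "$actualsize" -ge "$minimumsize" ]; then
    echo "nothing here to see"
else
    echo "time to sync"
fi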

How do I search for a file based on what is output by a command running on that file

I am working on a project for one of my professors, and he asked me to sort a couple hundred .fits images based on their header files (specifically, which star they are images of). I think that grep would be the best way to do this; however, I can't seem to figure out how to use grep on the header.
I am entering:
ls | imhead *.fits | grep -E -r "PG\ 1104+243" *
to just list them out for now; once they are listed, I know how to copy them into a directory.
I am new to using grep, so I am unsure where my error lies. Any help would be greatly appreciated! Thanks!
Assuming that imhead will extract the headers of the .fits files as text, you can use a simple shell script to do it:
script.sh
#!/bin/bash
grep "$1" "$2" > /dev/null 2>&1 && echo "$2"
Note that + is a special character if you use extended regular expressions, i.e. if you pass -E as in the question. A plain grep without any options should do the trick here.
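For example (header.txt is a made-up stand-in for the extracted header text), these two calls match the same literal string:
grep    'PG 1104+243'  header.txt    # basic grep: + is literal
grep -E 'PG 1104\+243' header.txt    # extended grep: + must be escaped to stay literal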
Use find to exec the script on every *.fits file in the current folder:
find . -maxdepth 1 -name '*.fits' -exec ./script.sh 'PG 1104+243' {} \;
If you are going to copy/move/alter or do something with the files you find, you might be better off, in terms of complexity and ease of quoting, using a loop like this:
#!/bin/bash
find . -name \*.fits -print0 | while read -d '' -r file; do
    echo "Checking file: $file"
    imhead "$file" | grep -q 'PG 1104+243'
    if [ $? -eq 0 ]; then
        echo "Object matches: $file"
    fi
done

bash scripts list files in a directory

I'm writing a script that takes an argument which is a directory.
I want to be able to construct a list/array of all the files in that directory that have a certain extension, with the extension cut off.
For example, if I have a directory containing:
aaa.xx
bbb.yy
ccc.xx
and I'm searching for *.xx,
my list/array would be: aaa ccc.
I'm trying to use the code from the accepted answer in this example thread.
set tests_list=[]
for f in $1/*.bpt
do
echo $f
if [[ ! -f "$f" ]]
then
continue
fi
set tmp=echo $f | cut -d"." -f1
#echo $tmp
tests_list+=$tmp
done
echo ${tests_list[#]}
If I run this script, the loop only executes once, with $f set to tests_list=[]/*.bpt, which is weird since $f should be a file name in that directory, and it echoes an empty string.
I have verified that I'm in the correct directory and that the argument directory has files with the .bpt extension.
This should work for you:
for file in *.xx ; do echo "${file%.*}" ; done
To expand this to a script that takes an argument as a directory:
#!/bin/bash
dir="$1"
ext='xx'
for file in "$dir"/*."$ext"
do
echo "${file%.*}"
done
edit: switched ls with for - thanks @tripleee for the correction.
filear=( $(find path/ -name '*.xx') )
filears=()
for f in "${filear[@]}"; do filears[${#filears[@]}]=${f%.*}; done
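To check the result (still assuming the made-up path/ and the .xx extension from the snippet above):
printf '%s\n' "${filears[@]}"    # one stripped name per line
echo "${#filears[@]} entries"    # how many names were collected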

Renaming a set of files to 001, 002, ...

I originally had a set of images of the form image_001.jpg, image_002.jpg, ...
I went through them and removed several. Now I'd like to rename the leftover files back to image_001.jpg, image_002.jpg, ...
Is there a Linux command that will do this neatly? I'm familiar with rename but can't see anything to order file names like this. I'm thinking that since ls *.jpg lists the files in order (with gaps), the solution would be to pass the output of that into a bash loop or something?
If I understand right, you have e.g. image_001.jpg, image_003.jpg, image_005.jpg, and you want to rename to image_001.jpg, image_002.jpg, image_003.jpg.
EDIT: This is modified to put the temp file in the current directory. As Stephan202 noted, this can make a significant difference if temp is on a different filesystem. To avoid hitting the temp file in the loop, it now goes through image*
i=1; temp=$(mktemp -p .); for file in image*
do
    mv "$file" "$temp"
    mv "$temp" "$(printf "image_%0.3d.jpg" $i)"
    i=$((i + 1))
done
A simple loop (test with echo, execute with mv):
I=1
for F in *; do
    echo "$F" `printf image_%03d.jpg $I`
    #mv "$F" `printf image_%03d.jpg $I` 2>/dev/null || true
    I=$((I + 1))
done
(I added 2>/dev/null || true to suppress warnings about identical source and target files. If this is not to your liking, go with Matthew Flaschen's answer.)
Some good answers here already; but some rely on hiding errors, which is not a good idea (that assumes mv will only error because of a condition that is expected - what about all the other reasons mv might error?).
Moreover, it can be done a little shorter and should be better quoted:
for file in *; do
    printf -v sequenceImage 'image_%03d.jpg' "$((++i))"
    [[ -e $sequenceImage ]] || \
        mv "$file" "$sequenceImage"
done
Also note that you shouldn't capitalize your variables in bash scripts.
Try the following script:
numerate.sh
This code snippet should do the job:
./numerate.sh -d <your image folder> -b <start number> -L 3 -p image_ -s .jpg -o numerically -r
This does the reverse of what you are asking (taking files of the form *.jpg.001 and converting them to *.001.jpg), but can easily be modified for your purpose:
for file in *
do
    # in bash 3.2+ a quoted regex is matched literally, so keep the pattern in a variable
    re='(.*)\.([[:alpha:]]+)\.([[:digit:]]{3,})$'
    if [[ $file =~ $re ]]
    then
        mv "${BASH_REMATCH[0]}" "${BASH_REMATCH[1]}.${BASH_REMATCH[3]}.${BASH_REMATCH[2]}"
    fi
done
I was going to suggest something like the above using a for loop, an iterator, cut -f1 -d "_", then mv i i.iterator. It looks like it's already covered other ways, though.
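A hedged sketch of that idea (everything before the first "_" is treated as the prefix; echo is used instead of mv until the output looks right):
i=1
for f in *.jpg; do
    prefix=$(echo "$f" | cut -f1 -d'_')              # e.g. "image"
    printf -v target '%s_%03d.jpg' "$prefix" "$i"    # e.g. "image_001.jpg"
    [ "$f" = "$target" ] || echo mv -- "$f" "$target"
    i=$((i + 1))
done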
