Remove zeros inside file name - linux

I have file names like this:
1006_12_000123123_000023126.data
and I want them to look like this (I have around 300,000 files):
1006_12_123123_23126.png
I tried some of these solutions, but they are for file names like 00002323.jpg:
Bash command to remove leading zeros from all file names
You can use mv to rename:

for original_name in *.data; do
    # Determine the new file name from the original:
    # remove the runs of zeros and change the extension.
    new_name=$(echo "$original_name" | sed -e 's/_0*/_/g' -e 's/\.data$/.png/')
    mv "$original_name" "$new_name"
done
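
With around 300,000 files, spawning a sed process per file gets slow. As a sketch of an alternative (assuming bash with the extglob option enabled), the same renaming can be done with parameter expansion alone:

#!/bin/bash
shopt -s extglob                         # enable +(...) patterns
for original_name in *.data; do
    new_name=${original_name//_+(0)/_}   # strip each run of zeros after an underscore
    new_name=${new_name%.data}.png       # swap the extension
    mv -- "$original_name" "$new_name"
done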

Use this one-liner (it assumes file names without whitespace, and adds the extension change):
ls *.data | sed -e 'p;s/_0*/_/g;s/\.data$/.png/' | xargs -n2 mv
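
If the Perl-based rename utility is available (sometimes installed as prename or file-rename), a single invocation covers all the files; a sketch:

rename 's/_0+/_/g; s/\.data$/.png/' *.data

For very large file counts you may need to batch the arguments (e.g. via xargs) to stay under the argument-length limit.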

Related

Bash script to move first N files with specific name

I'm trying to move only 100 files with a specific extension (from the current directory to the parent directory), but the following attempt of mine does not work:
for file in $(ls -U | grep *.txt | tail -100)
do
mv $file ../
done
Can you point me to the correct approach?
Since you didn't quote *.txt, the shell expanded it to all the filenames ending in .txt. So your command is something like:
ls -U | grep file1.txt file2.txt file3.txt ... | tail -100
Since grep has filename arguments, it ignores its standard input. It outputs all the lines matching file1.txt in the remaining files. There are probably no matches, so nothing is piped to tail -100. And even if there were matches, the output would be the lines from the files, not filenames, so it wouldn't be useful for the mv command.
You can loop over the filenames directly, and use a counter variable to stop after 100 files.
counter=0
for file in *.txt
do
    if (( counter >= 100 )); then
        break
    fi
    mv "$file" ../
    ((counter++))
done
This avoids the pitfalls of parsing the output of ls.
This will do the job:
ls -U *.txt | tail -100 | while read filename; do mv "$filename" ../; done
The while read loop preserves spaces inside filenames (use IFS= read -r to also keep leading whitespace and backslashes intact).
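If filenames may even contain newlines, a null-delimited pipeline avoids parsing ls output altogether; a sketch assuming GNU coreutils and findutils (for head -z and mv -t):

find . -maxdepth 1 -name '*.txt' -print0 |
    head -z -n 100 |     # take 100 NUL-terminated names
    xargs -0 mv -t ../   # move them to the parent directory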
Run this in the text file directory:
#!/bin/bash
for txt_file in ./*.txt; do
    ((c++ == 100)) && break   # stop once 100 files have been moved
    mv "$txt_file" ../
done

Filter directories in piped input

I have a bash command that lists a number of files and directories. I want to remove everything that is not an existing directory. Is there any way I can do this without creating a script of my own? I.e., I want to use pre-existing programs available on Linux.
E.g. Given that I have this folder:
dir1/
dir2/
file.txt
I want to be able to run something like:
echo dir1 dir2 file.txt somethingThatDoesNotExist | xargs [ theCommandIAmLookingFor]
and get
dir1
dir2
It would be better if the command generating the putative paths used a better delimiter, but you might be looking for something like:
... | xargs -n 1 sh -c 'test -d "$0" && echo "$0"'
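Applied to the example input above, this prints only the existing directories:

echo dir1 dir2 file.txt somethingThatDoesNotExist |
    xargs -n 1 sh -c 'test -d "$0" && echo "$0"'
# dir1
# dir2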
You can use this command line with grep -v:
your_command | grep -vxFf <(printf '%s\n' */ | sed 's/.$//') -
This will filter out all the sub-directories in the current path from your list.
If instead you want to list only the existing directories, remove the -v:
your_command | grep -xFf <(printf '%s\n' */ | sed 's/.$//') -
Note that the glob */ matches all sub-directories in the current path with a trailing /, and sed removes that trailing /.
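If the paths arrive one per line, a plain shell loop does the same filtering without spawning a shell per path; a minimal sketch:

your_command | while IFS= read -r path; do
    [ -d "$path" ] && printf '%s\n' "$path"
done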

How to move files where the first line contains a string?

I am currently using the following command:
grep -l -Z -E '.*?FindMyRegex' /home/user/folder/*.csv | xargs -0 -I{} mv {} /home/destination/folder
This works fine. The problem is it uses grep on the entire file.
I would like to use the grep command on the FIRST line of the file only.
I have tried to use head -1 file | at the beginning, but it did not work.
A change I would make to your script (grep -l on a pipe prints (standard input) rather than a usable file name, so test the first line with grep -q and move the file directly):
for file in *.csv; do
    head -1 "$file" | grep -qE 'FindMyRegex' && mv "$file" /home/destination/folder
done
You can maybe try sed '1q' file.csv | grep ... to search for the regexp in the first line only.
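Spelled out as a loop (a sketch; sed '1q' prints the first line and quits, so the rest of each file is never read):

for f in /home/user/folder/*.csv; do
    sed '1q' "$f" | grep -qE 'FindMyRegex' && mv -- "$f" /home/destination/folder
done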
You don't need grep or find, as long as your file names don't have embedded whitespace.
I don't know an easy way off the top of my head to get sed to delimit with nulls.
mv $( for f in /home/user/folder/*.csv; do
          sed -ns '1 { /yourPattern/F; q; }' "$f"
      done ) /home/destination/folder/
EDIT
Rewrote with a loop. This will run a separate instance of sed to check each file, but at least it shouldn't read beyond the first line. It will fail syntactically if there are no hits.
You might need -E depending on your regex.
-n says don't print records from the files.
-s says treat each file as a distinct input - this is so the filenames aren't always the first one.
This does require GNU sed for the F.
gawk 'FNR == 1 {
    if ($0 ~ /PATTERN/) printf "mv %s %s\n", FILENAME, "/target"
    nextfile
}' /path/*.csv
First of all, the .*? in your regex .*?FindMyRegex doesn't make sense and can be removed (grep -E doesn't support non-greedy matching anyway).
The above awk (gawk) one-liner builds mv file target command lines for you. Check them, and if you are satisfied with them, pipe the output to sh and the commands will be executed.
Replace PATTERN with your regex pattern, and /target with the real target directory.
The one-liner assumes that the filenames don't contain special characters (e.g. spaces); if they do, add quotes around the names in the generated mv command.
Using GNU awk to find the filenames and piping them, NUL-delimited, into xargs:
gawk -v pattern="myRegex" '
FNR == 1 {if ($0 ~ pattern) printf "%s\0", FILENAME; nextfile}
' *.csv | xargs -0 echo mv -t destination
If it looks OK, remove "echo"
Try this Shellcheck-clean Bash code:
#! /bin/bash
shopt -s nullglob # Globs that match nothing expand to nothing
shopt -s dotglob # Globs match files whose names start with '.'
dest=/home/destination/folder
for file in *.csv ; do
head -n 1 -- "$file" | grep -qE '.*?FindMyRegex' && mv -- "$file" "$dest"
done
shopt -s nullglob prevents an error if there are no .csv files in the directory.
shopt -s dotglob ensures that files whose name starts with '.' are handled.
The -- in the options for head and mv ensures that files whose names begin with - are handled correctly.
The quotes in "$file" and "$dest" ensure that names that contain whitespace (actually $IFS) characters (including newlines) or glob metacharacters are handled correctly.
Note that the .*? in the regular expression is probably redundant, and may not do what you think it does (grep -E doesn't do non-greedy matching).

concatenate files and remove source file

I have this command to concatenate files matching a pattern, but I also want to remove the source files, and I want to avoid the case where a file that has just been created gets deleted without having been concatenated.
Sample file names:
start-2014-03-25-08-08.log
scheduled-2014-03-19-13-03.log
scheduled-2014-03-19-14-58.log
Command used:
ls -1 | sed -r "s/(.*)-[0-9]{4}(-[0-9]{2})+/cat \1* >> \1$(date +"-%Y-%m-%d-%H-%M")/" | uniq | cat
output is:
cat start* >> start-2014-03-26-12-26.log
cat scheduled* >> scheduled-2014-03-26-12-26.log
but I do want to remove the files once they have been appended. Since the files are large, appending takes a while, and in the meantime a new file matching the pattern may be created; I do not want to remove that one.
What would be the correct way?
Update
I have this now.
rm -f temp.files;ls -1 *.log > temp.files; cat temp.files | sed -r "s/(.*)-[0-9]{4}(-[0-9]{2})+\.log/cat \1* >> \1$(date +"-%Y-%m-%d-%H-%M").log/" | uniq | sh; xargs rm -rf < temp.files; rm -f temp.files
No temporary files needed.
ts=$(date +"%Y-%m-%d-%H-%M")
for f in *.log; do
    prefix=${f%%-*}
    cat "$f" >> "$prefix-$ts.log" && rm "$f"
done
Since it's possible for the loop to take more than a minute to run, I set ts outside the loop so that the same minute is always used. You can move that assignment inside the loop if you want different output files depending on when the concatenation actually takes place. Note also that the glob is expanded once, when the loop starts, so files created while the loop is running are left untouched, and the rm only runs if cat succeeded, so a file is never deleted without having been appended.
Since you generate that cat command using sed and later pipe it to sh, you could modify the sed expression to instruct sh to delete each file after it is appended successfully, i.e., change the replacement expression to:
cat \1* >> \1$(date +"-%Y-%m-%d-%H-%M").log \&\& rm -f &
from
cat \1* >> \1$(date +"-%Y-%m-%d-%H-%M").log
Note that you need to escape the & in the replacement (\&) in order to produce a literal &; an unescaped & stands for the entire match (the input filename in your case).
This would also obviate the need for the rm -rf < temp.files part of your command, since every file would be removed right after being appended.

Ordering a loop in bash

I have a bash script like this:
for d in /home/test/*
do
echo $d
done
Which outputs this:
/home/test/newer dir
/home/test/oldest dir
I'd like to order the folders by creation time so that the 'oldest dir' directory appears first in the list. I've tried ls and tree variations to no avail.
For example,
for d in `ls -d -c -1 $PWD/*`
Returns:
/home/test/oldest
dir
/home/test/newer
dir
Very close, but it does not respect the space in the directory name. My question: how do I get oldest dir on top while supporting the whitespace?
ls -d -c -r "$PWD"/* | while IFS= read -r line
do
    echo "$line"
done
ls -c sorts by status-change time, newest first; -r reverses that so the oldest entry comes first.
Another technique, kind of a Schwartzian transform:
stat -c $'%Z\t%n' /home/test/* | sort -n | cut -f2- |
while IFS= read -r filename; do
    # ... process "$filename" here ...
done
This solution is fragile with filenames containing newlines. Note also that %Z is the status-change time (ctime), not the creation time; most Linux filesystems don't expose creation time, though GNU stat's %W prints the birth time where the filesystem records it.
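With GNU coreutils, a null-delimited variant closes that hole; a sketch (assumes stat --printf, sort -z, and cut -z are available):

stat --printf '%Z\t%n\0' /home/test/* | sort -zn | cut -zf2- |
while IFS= read -r -d '' filename; do
    printf '%s\n' "$filename"    # ... process "$filename" here ...
done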
