How to split files into chunks and store the number of chunks, file name and size in JSON? - linux

I want to automate this process with a Linux shell script (bash script).
eg:
afile.txt
bfile.txt
cfile.txt
dfile.txt
efile.txt
ffile.txt
gfile.ps1
Only the .txt files should be divided into chunks of 1000 bytes,
eg:
afile.txt00
afile.txt01
afile.txt02
After that, I need to create a JSON file containing something like:
{"avaible_files":[["afile.txt",2056,"1.0",3],["bfile.txt",948,"2.0",1],["cfile.txt",1054,"1.001",2],["dfile.txt",3085,"3.0",4],["efile.txt",9685,"1.0.0",10],["efile.txt",6985,"1.0.2",7],["dfile.txt",65,"1.0.0",1],["ffile.txt",9996,"3.1.0",10],["gfile.txt",785,"2.0.0",1]]}
In the JSON, the data format is [file name, size, version, chunks].
Here the version is hard-coded text written inside the .txt file:
function M.version()
return "1.0"
end
Please help me write a bash script that will do this job.
Thanks in advance.

Here is the first part:
for FILE in *.txt; do split -b 1000 -d "$FILE" "$FILE"; done
Second part is less clear to me...
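One possible shape for the second part, as a minimal sketch rather than a definitive script. It assumes the version is the first quoted string on a "return" line (as in the M.version() example), that file names and versions contain no quotes or backslashes, and that the chunk count is the size divided by 1000, rounded up. The function names json_entry and build_json are made up for illustration.

```shell
#!/bin/bash
# Sketch: split every .txt file into 1000-byte chunks and print a JSON summary.
# Assumption: names and versions contain no quotes or backslashes (no JSON escaping is done).

chunk_size=1000

# Print one entry in the form ["name",size,"version",chunks].
json_entry() {
    local file=$1 size version chunks
    size=$(wc -c < "$file")
    size=$((size))    # normalise: some wc implementations pad the number with spaces
    # Assumption: the version is the first quoted string on a "return" line.
    version=$(sed -n 's/.*return[[:space:]]*"\([^"]*\)".*/\1/p' "$file" | head -n 1)
    # Ceiling division: 2056 bytes -> 3 chunks of at most 1000 bytes.
    chunks=$(( (size + chunk_size - 1) / chunk_size ))
    printf '["%s",%d,"%s",%d]' "$file" "$size" "$version" "$chunks"
}

# Split all .txt files in the current directory and print the JSON document.
build_json() {
    local sep='' file
    printf '{"avaible_files":['
    for file in *.txt; do
        [ -f "$file" ] || continue    # the glob matched nothing
        split -b "$chunk_size" -d "$file" "$file"
        printf '%s' "$sep"
        json_entry "$file"
        sep=','
    done
    printf ']}\n'
}

# Usage (from the directory holding the files): build_json > files.json
```

The chunk files (afile.txt00, afile.txt01, ...) do not end in .txt, so running the function again does not split them a second time.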

Related

Recursively appending names of all files in a directory with exif specific png meta data field (aesthetic_score) with linux / EXIFtool

I am trying to rename all png files located in a directory (recursively) so that the value of a specific metadata field is appended to the end of each file name.
The metadata field name is "aesthetic_score", with a value range from 1.0 to 9.0.
When I type:
exiftool -Aesthetic_score -G1 -s testn.png
the result is:
[PNG] Aesthetic_score : 7.0
This is the value I would like to append to the png file names, recursively within a directory.
Note that I would like to swap the word aesthetic for the word chad in the appended text, and that not all files will have this data field:
input file:
filename001.png (metadata aesthetic_score:7.0)
output:
filename001-chad-score-70.png
I tried to use Digikam and JExifToolGui-2.01, without success.
I am trying to perform this task in the cmd line, although other solutions are welcome. Thank you for your help.
This might work for you, though I can't really test it. Note that you need to remove the echo before the mv for it to actually rename the files rather than just show what it would do.
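while IFS= read -r name
do
newname=$(exiftool -G1 -s "$name"|awk '$2~/FileName/{name=$4}; $2~/Aesthetic_score/{basename=gensub(/^(.+)\....$/,"\\1","1",name);ext=gensub(/^.*\.(...)$/,"\\1","1",name);gsub(/\./,"",$4);print basename"-chad-score-"$4"."ext}')
# skip files without an Aesthetic_score field; keep the renamed file in its original directory
[ -n "$newname" ] && echo mv "$name" "$(dirname "$name")/$newname"
done < <(find . -iname '*.png')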
Basically the find at the very end finds all the pngs.
The while loop takes every name that find emits, passes each file through exiftool (using your specs) and parses the output using awk (gensub requires GNU awk), which then outputs the new name; that name is captured in the shell variable newname.
And finally the mv (without the echo) renames the files.
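The name transformation can also be done with bash parameter expansion alone, which avoids the awk parsing. The following is a sketch under assumptions, not a tested solution: chad_name and rename_scored_pngs are hypothetical helper names, and it relies on exiftool's -s3 option printing only the tag value (and nothing when the tag is absent).

```shell
#!/bin/bash
# Build "stem-chad-score-NN.ext" from a path and a score such as "7.0".
chad_name() {
    local path=$1 score=$2
    local ext=${path##*.}     # text after the last dot
    local stem=${path%.*}     # path without the extension
    printf '%s-chad-score-%s.%s\n' "$stem" "${score//./}" "$ext"
}

# Dry run over a directory tree: print the mv commands instead of running them.
rename_scored_pngs() {
    local dir=${1:-.} name score
    find "$dir" -iname '*.png' -print0 |
    while IFS= read -r -d '' name; do
        # Assumption: -s3 prints the bare value, or nothing if the field is missing.
        score=$(exiftool -s3 -Aesthetic_score "$name")
        [ -n "$score" ] || continue    # file has no score: leave it alone
        echo mv -- "$name" "$(chad_name "$name" "$score")"
    done
}
```

As with the loop above, drop the echo once the printed commands look right.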

balancing the bash calculations

We have a tool for cutting adaptors, https://github.com/vsbuffalo/scythe/blob/master/README.md, and we want to run it on every file in the raw folder and write the output for each file separately as OUT+File Name.
Something is wrong with the script I wrote, because it doesn't process each file separately, and the whole thing doesn't work properly. It just generates an empty file named OUT+files.
The expected operation looks like this:
take file1, use scythe on it, write output as OUTfile1
take file2 etc.
#!/bin/bash
FILES=/home/dave/raw/*
for f in $FILES
do
echo "Processing the $f file..."
/home/deve/scythe/scythe -a /home/dev/scythe/illumina_adapters.fa -o "OUT"+$f $f
done
Additionally, I noticed (testing for a single file) that the script uses only one core out of 130 available. Is there any way to improve it?
There is no string concatenation operator in shell. Use juxtaposition instead; it's "OUT$f", not "OUT"+$f.
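Since $f holds the full path, "OUT$f" would point into a directory that probably does not exist, so the sketch below prefixes only the base name. It also shows one way to run several files at once with xargs -P, which is a GNU/BSD extension rather than POSIX. The function names and the idea of passing the scythe and adapter paths as arguments are assumptions made for illustration; only the -a and -o options come from the question.

```shell
#!/bin/bash
# Name the output after the file itself, not its full path.
out_name() {
    printf 'OUT%s\n' "$(basename "$1")"
}

# One file at a time. Arguments: input dir, scythe binary, adapters file.
process_serial() {
    local dir=$1 scythe=$2 adapters=$3 f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        echo "Processing the $f file..."
        "$scythe" -a "$adapters" -o "$(out_name "$f")" "$f"
    done
}

# Several files at once. The optional fourth argument is the number of
# parallel jobs (4 is an arbitrary default).
process_parallel() {
    local dir=$1 scythe=$2 adapters=$3 jobs=${4:-4}
    find "$dir" -maxdepth 1 -type f -print0 |
        xargs -0 -P "$jobs" -I{} sh -c \
            '"$1" -a "$2" -o "OUT$(basename "$3")" "$3"' sh "$scythe" "$adapters" {}
}

# Usage: process_parallel /home/dave/raw /path/to/scythe /path/to/illumina_adapters.fa 16
```

This only spreads whole files across cores; it does not make a single scythe run any faster.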

bash - opening an image only when a corresponding text file exists

I ran into a problem in Bash while trying to open images based only on the information stored in .txt files about them. I am trying to sort a number of images by size or height and display them in sorted order, but if a .jpg in the folder has no .txt file with the same name, it should not be processed.
I have the sorting piece of my situation done, and am trying to figure out how I would go about opening only the images that have a .jpg extension WITH a .txt file.
I figured a solution would look like me putting every .jpg's name (without extension) in a list and then process through the list and run something like:
[if -f $filename.txt ]; then ~~~
but I ran into the problem of iterating without a for-loop, since otherwise all the pictures would open multiple times. My attempt was:
for i in *jpg; do
y=$y ${i%.jpg}
done
if[ -f $y.txt ] then
(sorting parts)
This only looked at the last filename in y, as it should, but I am trying to figure out a way to look at each filename separately and check whether the matching text file exists, in order to include it in the sorting.
Thanks so much for your help!
Collecting a list of file names in a single variable is an antipattern. You want to collect them in an array instead.
a=()
for f in *.jpg; do
# skip images that have no matching .txt file
if [ ! -e "${f%.jpg}".txt ]; then
continue
fi
a+=("$f")
done
# now do things with "${a[@]}"
Frequently, you don't really need to collect the files in an array -- just do everything you were doing inside the for loop to each individual file as you traverse the files.
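The per-file approach, without an array, could look like the following sketch. The function name jpgs_with_txt is made up for illustration, and the printf stands in for whatever should happen to each image.

```shell
#!/bin/bash
# Print every .jpg in a directory that has a .txt file with the same base name.
jpgs_with_txt() {
    local dir=${1:-.} f
    for f in "$dir"/*.jpg; do
        [ -e "$f" ] || continue                # the glob matched nothing
        [ -e "${f%.jpg}.txt" ] || continue     # no matching text file: skip
        printf '%s\n' "$f"                     # replace with the real per-image work
    done
}
```

The output can be fed to the sorting step, for example with jpgs_with_txt . | while IFS= read -r img; do ...; done.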
(And actually y=$y ${i%.jpg} doesn't append to y -- it sets y to itself for the duration of attempting to execute a file named i sans the .jpg extension, which would most likely fail in the vast majority of cases.)
I would do the file check first, so that find only reports files that have a corresponding text file. The following snippet displays only the jpg files that have a corresponding txt file:
find . -maxdepth 1 -name "*.jpg" -exec /bin/bash -c '[ -e "${0%.*}.txt" ] && echo "$0";' {} \;

Filtering text files in cmd?

Is there any way to filter a text file in Windows' CMD, as one would with awk in a shell script?
I have a somewhat large file and I only need the last column from each row. This would be extremely easy with awk, but I have no means of using that now.
Try this:
Get-Content .\test.csv | %{ $_.Split(',')[-1]; }
For more reference, check out this site:
http://windows-powershell-scripts.blogspot.in/2009/06/awk-equivalent-in-windows-powershell.html
This will return the term after the last comma on each line of a .csv file, for example:
@echo off
type "file.csv" | repl ".*,(.*)" "$1" >"newfile.txt"
This uses a helper batch file called repl.bat (by dbenham) - download from: https://www.dropbox.com/s/qidqwztmetbvklt/repl.bat
Place repl.bat in the same folder as the batch file or in a folder that is on the path.

Piping with multiple commands

Assume you have a file called “heading” as follows
echo "Permissions^V<TAB>^V<TAB>Size^V<TAB>^V<TAB>File Name" > heading
echo "-------------------------------------------------------" >> heading
Write a (single) set of commands that will create a report as follows:
make a list of the names, permissions and size of all the files in your current directory,
matching (roughly) the format of the heading you just created,
put the list of files directly following the heading, and
save it all into a file called “file.list”.
All this is to be done without destroying the heading file.
I need to be able to do this all in a pipeline without altering the file. I can't seem to do this without destroying the file. Can somebody please make a pipe for me?
You can use a command group:
{ cat heading; ls -l | sed 's/:/^V<tab>^V<tab>/g'; } > file.list
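A variant of the same command group that prints only permissions, size and name, as a rough sketch. It assumes the usual ls -l column layout (permissions in column 1, size in column 5, name in column 9) and file names without spaces; make_report is a hypothetical name.

```shell
#!/bin/bash
# Write the heading followed by the listing to file.list; heading is only read.
make_report() {
    {
        cat heading
        # Skip the "total" line, then print permissions, size and name,
        # separated by two tabs each to roughly match the heading.
        ls -l | awk 'NR > 1 { print $1 "\t\t" $5 "\t\t" $9 }'
    } > file.list
}
```

Because the redirection creates file.list before ls runs, file.list itself shows up in the listing.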
