(Linux) Recursively overwrite all files in folder with data from another file - linux

I find myself in a situation similar to this question:
Linux: Overwrite all files in folder with specified data?
The answers there work nicely; however, they are for typed-out text. Allow me to provide some context.
I have a Linux system with the following file structure (files and folders irrelevant to the question removed):
root/
empty.svg
svg/
257238.svg
297522.svg
a7yf872.svg
236y27fh.svg
38277.svg
... (~200 other .svg files with arbitrary names)
2903852.svg
The framework I am working with requires those .svg files to exist with those specific filenames, but it obviously does not care what SVG image they contain. I do not plan on using these files and they take up a hefty amount of space on disk, so I want to replace each of them with an empty SVG, namely the empty.svg file in my root directory, which is a 12x12 transparent SVG (124 bytes). That way the framework shouldn't error out the way it did when I simply overwrote the raw data of those SVGs with plaintext using the answer from the question linked above. I've tried to get creative with my basic Linux command-line knowledge, but with no success. How do I accomplish this?
TL;DR: How to recursively overwrite all files in a folder with the raw data of another file from Linux CLI?

Similar to the linked question, you can use the tee command, but instead of echo use cat to read and copy the file contents:
cat empty.svg | tee svg/257238.svg svg/297522.svg <etc>
But if there are a lot of files in the svg directory, a loop automates the previous command:
for f in svg/*; do
    if [[ "$f" == *.svg ]]; then
        cat empty.svg > "$f"
    fi
done
Here the output redirection (>) replaces each matching file's contents with the contents of empty.svg.
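If any of those .svg files end up in subdirectories of svg/, a find one-liner does the same thing recursively; just a sketch of the same idea:
find svg -type f -name '*.svg' -exec cp -f empty.svg {} \;
Each matching file is overwritten in place with the 124-byte contents of empty.svg.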

Related

shell script for opening the different files in a folder in linux

Hi everyone, this is Saikrishna. I need some help with Linux shell scripts. I need to open different types of files (mp3, mp4, jpg, etc.) that all live in the same folder. I tried the "gnome" command for this, but it opens only one file; I need to open all the files one after the other.
Is this possible in Linux? I need help with it.
You can list the files with ls and then use a while loop to open them one by one:
ls *.mp3 | while read -r file; do xdg-open "$file"; done
see this answer for more details.
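A plain glob loop is a possible alternative that avoids parsing ls output and copes with names containing spaces; a small sketch covering a few of the extensions mentioned (xdg-open is the same opener used above):
for file in *.mp3 *.mp4 *.jpg; do
    [ -e "$file" ] || continue    # skip patterns that matched nothing
    xdg-open "$file"
done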

Move files to different directories based on file name tokens

I am looking to write a script to move files from a directory:
/home/mydir/
To another directory based on tokens in the file name. I have a bunch of files named as such:
red_office_mike_2015_montreal_546968.ext
or
$color_$location_$name_$year_$city_$numbers.extension (files will be various movie files: mov, mp4, mkv, etc.)
I would like the script to move the files to the following location:
/dir/work/$color/$name
Then verify the file has successfully copied, and delete the original file once it has.
I would also love it if the script would create the destination directory if it does not already exist.
So in summary, I need a script that moves files based on underscore-separated tokens, creates the destination directory if it doesn't already exist, verifies the copy succeeded (maybe with a size check), then deletes the original file.
I am working on Linux and would prefer a bash script. The variable names I have given are generic, and I will incorporate some other things into the script; I'm just looking for help building the skeleton.
Thanks in advance for any help!
It's not a bash script, but Perl is much better at this kind of thing, and it is installed on all Linux systems:
while (<>) {
    chomp;
    $file = $_;
    # split the file name on underscores into its tokens
    ($colour, $location, $name, $year, $city, $numbers) = split(/_/, $file);
    $dest0 = "/dir/work/$colour";
    $dest1 = "$dest0/$name";
    mkdir($dest0) unless (-d $dest0);   # create /dir/work/$colour if needed
    mkdir($dest1) unless (-d $dest1);   # create .../$name if needed
    rename($file, "$dest1/$file");      # move the file into place
}
The script splits each input file name on the underscore character, creates the directories on the way to the destination, and then renames the file to its new path. rename takes care of all the copying and deleting for you; in fact, as long as source and destination are on the same filesystem, it just changes the directory entries without any copying at all.
UPDATE
The above version takes its input from a file containing a list of filenames to process. For an alternative version which processes all files in the current directory, replace the while line with
while(glob("*")) {
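For reference, one possible way to run the stdin version (assuming the code is saved as move.pl, a placeholder name, that /dir/work already exists, and that /home/mydir contains only the movie files):
cd /home/mydir && ls | perl /path/to/move.pl
The glob version from the update would instead be run as: cd /home/mydir && perl /path/to/move.pl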
I was able to fumble around online and come up with a for loop to do this task. I used cut and it made things simple. Here is what worked for me:
#!/bin/sh
cd "${1:-.}"
for f in *.*; do
    color=$(echo "$f" | cut -d'_' -f1)   # first underscore-separated token
    name=$(echo "$f" | cut -d'_' -f3)    # third token
    todir="/dir/work/$color/$name"
    mkdir -p "$todir"
    mv "$f" "$todir"
done
This worked perfectly and I hope it can help others who might need to create directories based on portions of filenames.
The line right under the shebang makes the script work either on the current working directory or on a directory you pass it as its first argument.
Thanks to those who chimed in on the original post. I'm new to scripting, so it took me a while to figure this stuff out. I love this site though, it is super helpful!
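For the verification step the question asks about (copy, check, then delete), a possible variation of the loop above swaps mv for cp plus cmp; this is only a sketch, and /dir/work is the placeholder path from the question:
#!/bin/sh
cd "${1:-.}"
for f in *.*; do
    color=$(echo "$f" | cut -d'_' -f1)
    name=$(echo "$f" | cut -d'_' -f3)
    todir="/dir/work/$color/$name"
    mkdir -p "$todir"
    cp "$f" "$todir/$f"
    if cmp -s "$f" "$todir/$f"; then      # byte-for-byte comparison
        rm "$f"
    else
        echo "copy of $f failed, keeping original" >&2
    fi
done
cmp -s is quiet and only sets the exit status, so the original is removed only when the copy is identical.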

wget: downloading all files in directories/subdirectories

Basically on a webpage there is a list of directories, and each of these has further subdirectories. The subdirectories contain a number of files, and I want to download to a single location on my linux machine one file from each subdirectory which has the specific sequence letters 'RMD' in it.
E.g., say the main webpage links to directories dir1, dir2, dir3..., and each of these has subdirectories dir1a, dir1b..., dir2a, dir2b... etc. I want to download files of the form:
webpage/dir1/dir1a/file321RMD210
webpage/dir1/dir1b/file951RMD339
...
webpage/dir2/dir2a/file416RMD712
webpage/dir2/dir2b/file712RMD521
The directories/subdirectories are not sequentially numbered like in the above example (that was just me making it simpler to read) so is there a terminal command that will recursively go through each directory and subdirectory and download every file with the letters 'RMD' in the file name?
The website in question is: http://atmos.nmsu.edu/PDS/data/mslrem_1001/DATA/
I hope that's enough information.
An answer with a lot of remarks:
If the website supports FTP, you are better off using @MichaelBaldry's answer. This answer aims to give a way to do it with wget (but this is less efficient for both server and client).
This only works if the website serves a directory listing; in that case you can use the -r flag (recursive retrieval: wget finds links in the fetched pages and downloads those pages as well).
The following method is inefficient for both server and client and can result in a huge load if the pages are generated dynamically. The website you mention furthermore specifically asks not to fetch data that way.
wget -e robots=off -r -k -nv -nH -l inf -R jpg,jpeg,gif,png,tif --reject-regex '(.*)\?(.*)' --no-parent 'http://atmos.nmsu.edu/PDS/data/mslrem_1001/DATA/'
with:
wget: the program you want to call;
-e robots=off: ignore the website's request not to fetch it automatically;
-r: download recursively;
-R jpg,jpeg,gif,png,tif: reject downloading media files (the small images);
--reject-regex '(.*)\?(.*)': do not follow or download query pages (sorted views of the index pages);
-l inf: keep recursing to an unlimited depth;
--no-parent: prevent wget from fetching links above the starting URL (for instance the .. link to the parent directory).
wget downloads the files breadth-first so you will have to wait a long time before it eventually starts fetching the real data files.
Note that wget has no way to guess the directory structure on the server side. It only finds links in the pages it has fetched and uses those to build up a dump of the "visible" files. It is possible that the webserver does not list all available files, in which case wget will fail to download them all.
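Since only the files whose names contain RMD are wanted, wget's accept list (-A) can be added to a command like the one above. This is just a sketch with the same caveats; index pages are still fetched for link extraction, but only matching files are kept:
wget -e robots=off -r -l inf -nv -nH --no-parent \
     -A '*RMD*' --reject-regex '(.*)\?(.*)' \
     'http://atmos.nmsu.edu/PDS/data/mslrem_1001/DATA/'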
I've noticed this site supports the FTP protocol, which is a far more convenient way of reading files and folders. (FTP is meant for transferring files, not web pages.)
Get an FTP client (there are lots of them about), open ftp://atmos.nmsu.edu/PDS/data/mslrem_1001/DATA/, and you can probably just highlight all the folders in there and hit download.
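If you would rather stay on the command line, wget can also walk an FTP listing recursively and keep only matching names; a sketch, assuming the same path is reachable over FTP as suggested above:
wget -r -np -nH -A '*RMD*' 'ftp://atmos.nmsu.edu/PDS/data/mslrem_1001/DATA/'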
One solution using saxon-lint:
saxon-lint --html --xpath 'string-join(//a/@href, "^M")' http://atmos.nmsu.edu/PDS/data/mslrem_1001/DATA/ |
awk '/SOL/{print "http://atmos.nmsu.edu/PDS/data/mslrem_1001/DATA/"$0}' |
while read url; do
    saxon-lint --html --xpath 'string-join(//a/@href, "^M")' "$url" |
    awk -vurl="$url" '/SOL/{print url$0}'
done |
while read url2; do
    saxon-lint --html --xpath 'string-join(//a/@href, "^M")' "$url2" |
    awk -vurl2="$url2" '/RME/{print url2$0}'
done |
xargs wget
Replace the "^M" with a literal carriage return (type Ctrl+V then Ctrl+M on Unix) or with \r\n on Windows.

"Spoof" File Extension In Bash

Is there a way to "spoof" the file extension of a file in bash for consumption by another program? I can think of doing some shell scripting and making lots of soft-links, but that isn't very scalable.
Let's imagine I have a program I'm trying to use that requires input files to be of a specific file extension, and it has no method of turning off this check.
You could make a fifo with the requisite extension and cat any other file into it; this works as long as the program reads the file sequentially, since a FIFO cannot be seeked. So, if your crazy program needs to see files that end in .funky, you can do this:
mkfifo file.funky
cat someotherfile > file.funky &
someprogram file.funky
Create a symbolic link for each file you want to have a particular extension, then pass the name of the symlink to the command.
For example suppose you have files with names of the form *.foo and you need to refer to them with extensions of .bar:
for file in *.foo ; do
    ln -s "$file" "_$$_$file.bar"
done
I precede each symlink name with _$$_ to avoid the possibility of colliding with an existing file name (you don't want to do ln -s file.foo file.bar if file.bar already exists).
With a little more programming, your script can keep track of which symlinks it created and, if you like, clean them up after executing the command.
This assumes, as you stated in the question, that the command can't be forced to accept a different extension.
You could, without too much difficulty, create a wrapper script that replaces the command in question, creating the symlinks, invoking the command, and cleaning up after itself automatically.
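For instance, a minimal sketch of such a wrapper, assuming the picky program is called someprogram and insists on a .bar extension (both names are placeholders from the example above):
#!/bin/bash
# Present every *.foo file to someprogram as a .bar symlink, then clean up.
shopt -s nullglob                 # let *.foo expand to nothing if there is no match
links=()
cleanup() { rm -f -- "${links[@]}"; }
trap cleanup EXIT                 # remove the symlinks however the script exits

for file in *.foo; do
    link="_$$_${file}.bar"        # same _$$_ prefix trick as above
    ln -s -- "$file" "$link"
    links+=("$link")
done

someprogram "${links[@]}"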

Linux command line - grabbing parts of the file path

I'm in the process of attempting to convert all my WAV files to FLAC files in such a way that my music directory for FLACs is identical to my music directory for WAVs.
At the moment I have my music archive set up, such that a typical album is here:
/directory1/directory2/directory3/Music/WAV/Artist/Album
So I would like a one-to-one correspondence for my FLAC files that looks as follows:
/directory1/directory2/directory3/Music/FLAC/Artist/Album.
I know that I will have to use find to list all the directories/subdirectories, something like:
find -type d -exec commands.sh {} \;
But how do I write the commands.sh file such that it will grab the Artist/Album part of the path in the WAV directory, mkdir the same /Artist/Album in the FLAC directory, and then output the flacs to the FLAC/Artist/Album directory?
I know the command for converting WAVs to FLACs with an output directory of your choice is:
flac -5 --out-prefix="/desired/output/path" *.wav
So I guess I'm just having trouble with grabbing/recreating the file paths!
This would be a whole lot easier in a scripting language like Ruby, Perl or Python. Something like this is a fairly straightforward starter project in any of those languages, and they have libraries for file finding and path manipulation that make it all pretty easy.
However, there are two POSIX utilities that can help with splitting apart pathnames: dirname and basename. I think those two and sed should let you do what you want.
Note that find prints paths relative to the starting point you give it (absolute if you start from an absolute path), and that by default -exec runs the command from the directory where find was invoked, not from the matched directory; see the man page (and the -execdir action) if you need different behaviour.
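Building on that, a rough sketch of what commands.sh could look like, using plain parameter expansion instead of sed. The WAV/FLAC roots are the example paths from the question, and the flac option is the one quoted there (some flac versions spell it --output-prefix, so check flac --help):
#!/bin/sh
# commands.sh -- intended to be called as:
#   find "$wav_root" -type d -exec sh /path/to/commands.sh {} \;
wav_root="/directory1/directory2/directory3/Music/WAV"
flac_root="/directory1/directory2/directory3/Music/FLAC"

dir="$1"                        # e.g. $wav_root/Artist/Album
rel="${dir#$wav_root/}"         # strip the WAV prefix -> Artist/Album
[ "$rel" = "$dir" ] && exit 0   # not under the WAV root, nothing to do
mkdir -p "$flac_root/$rel"      # recreate Artist/Album under FLAC

# only invoke flac when the directory actually holds .wav files
ls "$dir"/*.wav >/dev/null 2>&1 &&
    flac -5 --out-prefix="$flac_root/$rel/" "$dir"/*.wav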
