Sync without scanning individual files? [closed] - linux

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
Consider two directories:
/home/user/music/flac
/media/MUSIC/flac
I would like the second directory (destination; a USB drive) to contain the same files and structure as the first directory (master). There are 3600+ files (59G in total). Every file is scanned using unison, which is painfully slow. I would rather it compare based on file name, size, and modification time.
I think rsync might be better but the examples from the man pages are rather cryptic, and Google searches did not reveal any simple, insightful examples. I would rather not accidentally erase files in the master. ;-)
The master list will change over time: directories reorganized, new files added, and existing files updated (e.g., re-tagging). Usually the changes are minor; taking hours to complete a synchronization strikes me as sub-optimal.
What is the exact command to sync the destination directory with the master?
The command should copy new files, reorganize moved files (or delete then copy), and copy changed files (based on date). The destination files should have their timestamp set to the master's timestamp.

You can use rsync this way:
rsync -rtu --delete /home/user/music/flac/ /media/MUSIC/flac
Note the trailing slash on the source: it means "the contents of flac", so files land directly under /media/MUSIC/flac rather than in a nested flac/flac directory. -r recurses, -u skips destination files that are already up to date, and -t preserves modification times (which you asked for, and which the -u comparison relies on). --delete removes files that no longer exist in the source from the destination only, never from the master.
There are more options, but I think this way is sufficient for you. :-)
(I just did simple tests! Please test better!)

You can use plain old cp to copy new & changed files (as long as your filesystems have working timestamps):
cp -dpRuv /home/user/music/flac /media/MUSIC/
To delete files from the destination that don't exist at the source, you'll need to use find. Create a script /home/user/bin/remover.sh like so:
#!/bin/bash
CANONNAME="$PWD/$(basename "$1")"
RELPATH=$(echo "$CANONNAME" | sed -e "s#/media/MUSIC/flac/##")
SOURCENAME="/home/user/music/flac/$RELPATH"
if [ ! -f "$SOURCENAME" ]; then
echo "Removing $CANONNAME"
rm "$CANONNAME"
fi
Make it executable, then run it from find:
find /media/MUSIC/flac -type f -execdir /home/user/bin/remover.sh "{}" \;
The only thing this won't do is remove directories from the destination that have been removed in the source - if you want that too you'll have to make a third pass, with a similar find/script combination.
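For that third pass, a single find invocation is enough; this sketch (run here against a throwaway directory rather than your real /media/MUSIC/flac) deletes destination directories that have become empty:

```shell
# Sandbox: albumA keeps a file, albumB only contains an empty subdir.
dest=$(mktemp -d)
mkdir -p "$dest/albumA" "$dest/albumB/inner"
echo x > "$dest/albumA/track.flac"

# -delete implies -depth, so children are removed before their parents
# are tested; nested empty directories therefore collapse in one pass.
find "$dest" -mindepth 1 -type d -empty -delete
```

Against the real destination this would be find /media/MUSIC/flac -mindepth 1 -type d -empty -delete; it assumes an empty directory on the USB drive always means the source no longer has it.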

Related

split directory with 10000 files into 2 directories [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I have directory /logos which contains approximately 10000 png images. Can you please suggest some script to make two new folders /logos-1 and /logos-2 each one with half of the images from initial folder?
Thank you in advance <3
One approach could be to iterate over the files in the folder, keep a counter, and move the files to alternating directories on each iteration:
counter=0
mkdir -p logos-1
mkdir -p logos-2
for file in logos/*
do
[ -e "$file" ] || continue
echo mv "$file" "logos-$((counter++ % 2 + 1))/"
done
Remove the echo once the mv commands look appropriate.
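Here is the same idea rehearsed end to end in a temporary directory (with a portable counter increment, since $((counter++)) is a bash-ism):

```shell
cd "$(mktemp -d)"
mkdir logos logos-1 logos-2
touch logos/a.png logos/b.png logos/c.png logos/d.png logos/e.png

counter=0
for file in logos/*
do
    [ -e "$file" ] || continue
    mv "$file" "logos-$((counter % 2 + 1))/"
    counter=$((counter + 1))
done
# Five files alternate 1,2,1,2,1: three land in logos-1, two in logos-2.
```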
You can use rename, a.k.a. Perl rename and prename for that. I assume you don't really want the leading slashes and you aren't really working in the root directory - put them back if you are.
rename --dry-run -p -N 01 '$_ = join "", "logos-", $N++%2+1, "/$_"' *.png
Sample Output
'1.png' would be renamed to 'logos-2/1.png'
'10.png' would be renamed to 'logos-1/10.png'
'2.png' would be renamed to 'logos-2/2.png'
'3.png' would be renamed to 'logos-1/3.png'
'4.png' would be renamed to 'logos-2/4.png'
'5.png' would be renamed to 'logos-1/5.png'
'6.png' would be renamed to 'logos-2/6.png'
'7.png' would be renamed to 'logos-1/7.png'
'8.png' would be renamed to 'logos-2/8.png'
'9.png' would be renamed to 'logos-1/9.png'
You can remove the --dry-run if the output looks good. The -p means it will create any necessary directories/paths for you. If you aren't familiar with Perl that means:
"Set N=1. For each PNG file, make the new name (which we must store in special variable $_) equal to the result of joining the word logos- with a number alternating between 1 and 2, with a slash followed by whatever it was before ($_)."
You may find this alternative way of writing it easier:
rename --dry-run -N 01 '$_ = sprintf("logos-%d/$_", $N%2+1)' *.png
Using this tool confers several benefits:
you can do dry runs
you can calculate any replacement you like
you don't need to create directories
it will not clobber files if multiple inputs rename to the same output
On macOS, use homebrew and install with:
brew install rename

mkdir: cannot create directory '': No such file or directory [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 2 years ago.
My code is meant to find a /jar folder, loop through the .jar files, and create .xml and .trigger files with the same file name. Additionally, it creates an <ID>filename</ID> field within the XML file.
I am getting the error
mkdir: cannot create directory '': No such file or directory
I have the /tmp folder set up, including an /xml /jar and /trigger folder, so those not being there isn't the issue.
#!/bin/bash
jar_dir= /c/Users/hi/Desktop/Work/tmp/jar
xml_dir= /c/Users/hi/Desktop/Work/tmp/xml
trigger_dir= /c/Users/hi/Desktop/Work/tmp/trigger
# the following creates output directories if they don't exist
mkdir -p "${xml_dir}"
mkdir -p "${trigger_dir}"
# we start the for loop through all the files named `*.jar` located in the $jar_dir directory
for f in $(find ${jar_dir} -name "*.jar")
do
file_id=$(basename -s .jar ${f}) # extract the first part of the file name, excluding .jar
echo "<ID>${file_id}</ID>" > ${xml_dir}/${file_id}.xml
touch ${trigger_dir}/${file_id}.trigger # this one just creates an empty file at ${trigger_dir}/${file_id}.trigger
done
This command:
jar_dir= /c/Users/hi/Desktop/Work/tmp/jar
is exactly equivalent to
jar_dir="" /c/Users/hi/Desktop/Work/tmp/jar
and tells the shell to run the command /c/Users/hi/Desktop/Work/tmp/jar with the environment variable jar_dir set to the empty string (only for the duration of that command). If /c/.../jar is a directory, that should give you an error, too.
To assign the /c/.../jar string to the variable instead, lose the space.
The error message you get comes from mkdir trying to create a directory with an empty name. (It's a bit confusing though.)
For problems with shell scripts, it often helps to paste the script to https://www.shellcheck.net/, which recognizes most of the usual mistakes and can tell what to do.
See these posts on SO and unix.SE for discussion:
Command not found error in Bash variable assignment
Is it shell portable to run a command on the same line after variable assignment?
Spaces in variable assignments in shell scripts
xml_dir= /c/Users/hi/Desktop/Work/tmp/xml
You can't have a space after the equals sign. So
xml_dir=/c/Users/hi/Desktop/Work/tmp/xml
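For what it's worth, a corrected version of the whole script might look like the sketch below. It runs in a temporary sandbox here instead of your real /c/Users/... paths, and it also swaps the for-over-$(find) loop for a null-delimited read so file names with spaces survive:

```shell
# Sandbox stand-in for /c/Users/hi/Desktop/Work/tmp.
base=$(mktemp -d)
jar_dir="$base/jar"          # note: no space after '='
xml_dir="$base/xml"
trigger_dir="$base/trigger"

mkdir -p "$jar_dir" "$xml_dir" "$trigger_dir"
touch "$jar_dir/app.jar"     # sample input

# Null-delimited loop: safe for file names containing spaces.
find "$jar_dir" -name '*.jar' -print0 | while IFS= read -r -d '' f
do
    file_id=$(basename -s .jar "$f")
    echo "<ID>${file_id}</ID>" > "$xml_dir/$file_id.xml"
    touch "$trigger_dir/$file_id.trigger"
done
```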

Ubuntu - bulk file rename [duplicate]

This question already has answers here:
Rename multiple files based on pattern in Unix
(24 answers)
Closed 2 years ago.
I have a folder containing a sequence of files whose names bear the form filename-white.png. e.g.
images
arrow-down-white.png
arrow-down-right-white.png
...
bullets-white.png
...
...
video-white.png
I want to strip out the -white bit so the names are simply filename.png. I have played around, dry run with -n, with the Linux rename command. However, my knowledge of regexes is rather limited so I have been unable to find the right way to do this.
If you are in the directory above images, the command is
rename "s/-white\.png$/.png/" images/*
(The dot is escaped and the pattern anchored with $ so it only matches the literal -white.png suffix.) If your current directory is images, then run rename "s/-white\.png$/.png/" ./* instead. To do a dry run, just attach a -n like you said:
rename -n "s/-white\.png$/.png/" images/*
or
rename -n "s/-white\.png$/.png/" ./*
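If your rename turns out to be the util-linux version rather than the Perl one, a plain bash loop with parameter expansion does the same job without any regex (sketched here against a throwaway directory):

```shell
cd "$(mktemp -d)"
mkdir images
touch images/arrow-down-white.png images/bullets-white.png images/video-white.png

# "${f%-white.png}" strips the -white.png suffix; then re-append .png.
for f in images/*-white.png
do
    mv "$f" "${f%-white.png}.png"
done
```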

Linux/Cygwin recursively copy file change extension [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I'm looking for a way to recursively find files with extension X (.js) and make a copy of the file in the same directory with extension Y (.ts).
e.g. /foo/bar/foobar.js --> /foo/bar/foobar.js and /foo/bar/foobar.ts
/foo/bar.js --> /foo/bar.js and /foo/bar.ts etc etc
My due diligence:
I was thinking of using find & xargs & cp and brace expansion (cp foobar.{js,ts}) but xargs uses the braces to denote the list of files passed from xargs. This makes me sad as I just recently discovered the awesome-sauce that is brace expansion/substitution.
I feel like there has to be a one-line solution but I'm struggling to come up with one.
I've found ideas for performing the task: copying the desired to a new directory and then merging this directory with the new one; recursively run a renaming script in each directory; copy using rsync; use find, xargs and cpio.
As it stands it appears that running a renaming script like this is what I'll end up doing.
find . -name "*.js" -exec bash -c 'name="{}"; cp "$name" "${name%.js}.ts"' \;
Using find, you can execute a command directly on a file that you've found, by using the -exec option; you don't need to pipe it through xargs. It takes the command name followed by arguments to the command, followed by a single argument ;, which you have to escape to avoid the shell interpreting it. find will replace any occurrence of {} in the command name or arguments with the file found.
In order to call a command with the appropriate ending substituted, there are multiple approaches you can take, but a simple one is to use Bash's parameter expansion. You need to define a shell parameter that contains the name (in this case, I creatively chose name={}), and then you can use parameter expansion on it. ${variable%suffix} strips off suffix from the value of $variable; I then add .ts to the end, and have the name I'm looking for.
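One caveat with name="{}": the file name is substituted straight into the shell string, so a name containing quotes or $( ) would be parsed as shell code. A safer variant of the same command passes the name as a positional argument instead (demonstrated here in a throwaway directory):

```shell
cd "$(mktemp -d)"
mkdir -p foo/bar
echo 'code' > foo/bar/foobar.js
echo 'code' > foo/bar.js

# {} becomes $1 inside the inline script (the _ fills $0), so the
# file name is never interpreted as shell syntax.
find . -name '*.js' -exec bash -c 'cp "$1" "${1%.js}.ts"' _ {} \;
```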

how to separate source code and data while minimizing directory changes during working? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
This is a general software engineering problem about working on Linux. Suppose I have source code, mainly scripts. They manipulate text data, take text files as input and output. I am thinking about how to appropriately separate src code and data while minimizing directory changes during working. I see two possibilities:
mix code and data together. In this way, it minimizes directory transitions and eliminates the need to type paths to files while working. Most of the time I just call:
script1 data-in data-out # call script
vi data-out # view result
The problem is that as the number of code and data files grows, it looks messy facing a long list of both code and data files.
Separate code and data in two folders, say "src" and "data". When I am in "src" folder, doing the above actions would require:
script1 ../data/data-in ../data/data-out # call script
vi ../data/data-out # view result (or: cd ../data; vi data-out)
The extra typing of parent directories "../data" causes hassle, especially when there are lots of quick testings of scripts.
You might suggest I do it the other way around, in the data folder. But then similarly I need to call ../src/script1, again a hassle of typing prefix "../src". Yeah, we could add "src" to PATH. But what if there are dependencies among scripts across parent-child directories? e.g., suppose under "src" there are "subsrc/script2", and within script1, it calls "./subsrc/script2 ..."? Then calling script1 in "data" folder would throw error, because there is no "subsrc" folder under "data" folder.
Clean separation of code & data and minimizing directory changes seem to be conflicting requirements. Do you have any suggestions? Thanks.
I would use the cd - facility of the shell plus setting the PATH to sort this out — possibly with some scripts to help.
I'd ensure that the source directory, where the programs are built, is on my PATH, at the front. I'd cd into either the data directory or the source directory, (maybe capture the directory with d=$PWD for the data directory, or s=$PWD for the source directory), then switch to the other (and capture the directory name again). Now I can switch back and forth between the two directories using cd - to switch.
Depending on whether I'm in 'code work' or 'data work' mode, I'd work primarily in the appropriate directory. I might have a simple script to (cd $source_directory; make "$@") so that if I need to build something, I can do so by running the script. I can edit files in either directory with a minimum of fuss, either with a swift cd - plus vim, or with vim $other_dir/whichever.ext. Because the source directory is on PATH, I don't have to specify full paths to the commands in it.
I use an alias alias r="fc -e -" to repeat a command. For example, to repeat the last vim command, r v; the last make command, r m; and so on.
I do this sort of stuff all the time. The software I work on has about 50 directories for the full build, but I'm usually just working in a couple at a time. I have sets of scripts to rebuild the system based on where I'm working (chk.xyzlib and chk.pqrlib to build in the corresponding sets of directories, for example; two directories for each of the libraries). I prefer scripts to aliases; you can interpolate arguments more easily with scripts, whereas with aliases you can only append the arguments. The (cd $somewhere; make "$@") notation doesn't work with aliases.
It's a little more coding, but can you set environment variables from the command line to specify the data directory?
export DATA_INPUT_DIR=/path/to/data
export DATA_OUTPUT_DIR=/path/to/outfiles
Then your script can process files relative to these directories:
# Set variables at the top of your scripts:
in_dir="${DATA_INPUT_DIR:-.}"   # Default to current directory
out_dir="${DATA_OUTPUT_DIR:-.}" # Default to current directory
# 1st arg is input file. Prepend $in_dir unless the path is absolute.
infile="$1"
[ "${1::1}" == "/" ] || infile="$in_dir/$infile"
# 2nd arg is output file. Prepend $out_dir unless the path is absolute.
outfile="$2"
[ "${2::1}" == "/" ] || outfile="$out_dir/$outfile"
# Remainder of the script uses $infile and $outfile.
Of course, you could also open several terminal windows: some for working on the code and others for executing it. :-)
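As an end-to-end sketch of that pattern (script1, its tr body, and all paths below are made up for illustration, and everything runs in a temporary sandbox):

```shell
base=$(mktemp -d)
mkdir -p "$base/src" "$base/data"
echo "hello" > "$base/data/data-in"

# A toy script1 that upper-cases its input (hypothetical stand-in).
cat > "$base/src/script1" <<'EOF'
#!/bin/bash
in_dir="${DATA_INPUT_DIR:-.}"
out_dir="${DATA_OUTPUT_DIR:-.}"
infile="$1";  [ "${1::1}" = "/" ] || infile="$in_dir/$infile"
outfile="$2"; [ "${2::1}" = "/" ] || outfile="$out_dir/$outfile"
tr a-z A-Z < "$infile" > "$outfile"
EOF
chmod +x "$base/src/script1"

# With the variables exported, bare file names work from any directory.
export DATA_INPUT_DIR="$base/data"
export DATA_OUTPUT_DIR="$base/data"
"$base/src/script1" data-in data-out
```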