Check if same file exists in another directory using Bash

I'm new to bash and couldn't find an answer for this case, so I'd appreciate your help.
I'm trying to check whether the files in one directory also exist in another directory.
Let's say I have the path /home/public/folder/ (which contains several files)
and I want to check whether those files exist in /home/private/folder2.
I tried this:
for file in $firstPath/*
do
if [ -f $file ]; then
(ask if to over write etc.. rest of the code)
And also
for file in $firstPath/*
do
if [ -f $file/$secondPath ]; then
(ask if to over write etc.. rest of the code)
Neither works. In the first case it tests the files in the first path itself, so it always asks me whether I want to overwrite, even when the file doesn't exist in the second path.
In the second case, it never enters the if statement.
How could I fix that?

When you have a construct like for file in $firstPath/*, the value of $file is going to include the value of $firstPath, which does not exist within $secondPath. You need to strip the path in order to get the bare filename.
In traditional POSIX shell, the canonical way to do this was with an external tool called basename. You can, however, achieve what is generally thought to be equivalent functionality using Parameter Expansion, thus:
for file in "$firstPath"/*; do
    if [[ -f "$secondPath/${file##*/}" ]]; then
        # file exists, do something (":" is a no-op so the empty branch still parses)
        :
    fi
done
The ${file##*/} bit is the important part here. Per the documentation linked above, this means "the $file variable, with everything up to the last / stripped out." The result should be the same as what basename produces.
As a general rule, you should quote your variables in bash. In addition, consider using [[ instead of [ unless you're actually writing POSIX shell scripts which need to be portable. You'll have a more extensive set of tests available to you, and more predictable handling of variables. There are other differences too.
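To make the loop above easier to try out, here is a minimal sketch that wraps it in a function and prints the names found in both directories. The function name list_existing and its arguments are made up for illustration; they are not part of the question.

```bash
#!/usr/bin/env bash
# Print the name of every regular file in src_dir that also exists in dest_dir.
# list_existing and its argument names are hypothetical, for illustration only.
list_existing() {
    local src_dir=$1 dest_dir=$2 file name
    for file in "$src_dir"/*; do
        [[ -f $file ]] || continue      # skip subdirectories and an unmatched glob
        name=${file##*/}                # strip the leading path, like basename
        if [[ -f "$dest_dir/$name" ]]; then
            printf '%s\n' "$name"
        fi
    done
}

# Example call with the paths from the question:
# list_existing /home/public/folder /home/private/folder2
```

The overwrite prompt from the question would go where the printf is.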

Related

How can I match a string with optional characters in bash?

I have a small little script that deals with pipelines on Jenkins. It needs to be able to grab a file from a folder named after the pipeline name.
Most pipeline names follow this pattern: {Name}Pipeline/{Name}Pipeline.properties
However, a few pipelines have a three-digit version number appended, like so: {Name}Pipeline122/{Name}Pipeline122.properties
In my script, I have a line that stores the path to this properties file in a variable: APP_PROPERTY_FILE=/path/to/file/${NAME}Pipeline/${NAME}Pipeline.properties
Herein lies the problem! How can I allow bash to match pipeline names without the version number AND pipeline names with the version number?
Thanks!

I believe the user wants to select a file whose name has an optional element (a three-digit number) and store the file name in a shell variable.
Two challenges: (1) a regular assignment such as var=/path/to/something* does not perform pathname expansion, and (2) regular pattern matching does not support optional elements.
Possible solutions are an if-then-else or extended globs. Both solutions assume that one of the files exists.
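APP_PROPERTY_FILE=/path/to/file/${NAME}Pipeline/${NAME}Pipeline.properties
if [ ! -f "$APP_PROPERTY_FILE" ] ; then
    # Fall back to the versioned name; the question has the version in the folder name too.
    APP_PROPERTY_FILE=$(echo /path/to/file/${NAME}Pipeline[0-9][0-9][0-9]/${NAME}Pipeline[0-9][0-9][0-9].properties)
fi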
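Using extglob can also work. The option has to be enabled on its own line before the pattern is used, because bash parses a whole line before running any of it.
shopt -s extglob
APP_PROPERTY_FILE=$(echo /path/to/file/${NAME}Pipeline?([0-9][0-9][0-9])/${NAME}Pipeline?([0-9][0-9][0-9]).properties)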
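If neither file exists, the echo-based assignments above leave the unexpanded pattern in the variable. A rough sketch of one way around that is to test each candidate explicitly. The names find_properties, base_dir and name are invented for this example, and it assumes the version number appears in both the folder and the file name, as in the question.

```bash
#!/usr/bin/env bash
# Print the first existing properties file for a pipeline, preferring the
# unversioned name and falling back to a three-digit versioned one.
# find_properties, base_dir and name are hypothetical names for illustration.
find_properties() {
    local base_dir=$1 name=$2 candidate
    for candidate in \
        "$base_dir/${name}Pipeline/${name}Pipeline.properties" \
        "$base_dir/${name}Pipeline"[0-9][0-9][0-9]/"${name}Pipeline"[0-9][0-9][0-9].properties
    do
        # An unmatched glob stays as a literal pattern and fails this test.
        if [[ -f $candidate ]]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1    # nothing matched
}

# Example call:
# APP_PROPERTY_FILE=$(find_properties /path/to/file "$NAME") || echo "no properties file" >&2
```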

Linux Date not showing the date value sometimes

I have defined a variable inside one of my shell scripts to create a file name with the date value in it.
I used the "date +%Y%m%d" command to get the current date and stored it in the date_val variable.
And I have defined the filename variable as "${path}/sample_${date_val}.txt".
For a few days it created the file name properly as /programfiles/sample_20180308.txt
But today the file name was created without the date, as /programfiles/sample_.txt
When I execute the command "date +%Y%m%d" in Linux, it returns the correct value - 20180309.
Any idea why the file name was created without the date value? I did not modify anything in my script, so I'm wondering what might have gone wrong.
Sample excerpt of my script is given below for easy understanding :
EDITED
path=/programfiles
date_val=$(date +%Y%m%d )
file_name=${path}/sample_${date_val}.txt

Although incredibly unlikely, it's certainly possible for date to fail, based on the source code. Under the covers, it calls either clock_gettime() or gettimeofday(), both of which can fail.
The date program will also refuse to write anything to standard output if the time obtained from either of those two functions is out of range (which is possible if they fail).
It's also possible that the date program could "disappear" for various reasons, such as actually being hidden or permissions changed, or a shortage of resources like file handles when attempting to open the executable.
As mentioned, all these possibilities are a stretch, unlikely to happen in the real world.
If you want to handle the case where you get inadequate output from date, you can simply retry until you get a valid value, something like this (you could also add a limit to detect the case where it never succeeds):
todaysDate="$(date +%Y%m%d)"
# Retry until the value is exactly eight digits.
while [[ ! $todaysDate =~ ^[0-9]{8}$ ]] ; do
    sleep 1
    todaysDate="$(date +%Y%m%d)"
done
# todaysDate now guaranteed to be eight digits.
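The limit mentioned above could look something like the following sketch. The function name get_date_stamp and the max_tries parameter are invented here; the default of five attempts is an arbitrary choice.

```bash
#!/usr/bin/env bash
# Ask date for YYYYMMDD, retrying a limited number of times.
# get_date_stamp and max_tries are hypothetical names for illustration.
get_date_stamp() {
    local max_tries=${1:-5} try stamp
    for (( try = 1; try <= max_tries; try++ )); do
        stamp=$(date +%Y%m%d)
        if [[ $stamp =~ ^[0-9]{8}$ ]]; then
            printf '%s\n' "$stamp"
            return 0
        fi
        sleep 1     # wait before asking again
    done
    return 1        # date never produced eight digits
}

# Example use in the script from the question:
# if date_val=$(get_date_stamp 3); then
#     file_name=${path}/sample_${date_val}.txt
# else
#     echo "could not obtain a valid date" >&2
# fi
```

Checking the return status lets the script stop instead of silently creating sample_.txt.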

How to call a large list of paired files to be executed by a program in BASH?

I have a large directory of files (100+) that I'd like to pass through a program via the terminal.
The files are paired and all follow a naming scheme like such:
TS-8_S53_L001_R1_001.fastq
TS-8_S53_L001_R2_001.fastq
RS-9_S54_L001_R1_001.fastq
RS-9_S54_L001_R2_001.fastq
And the program execution looks like:
Seqprogram -i1 Blah_R1_001.fastq -i2 Blah_R2_001.fastq -o Blah_paired.fastq
All of these files are in one directory.
I'd like to be able to run the program on all of the files, pairing them in the proper order (R1 files are passed through -i1, R2 files through -i2, and the R1 and R2 files share the same base name), with the output file (-o) saved under the base name with some identifier attached ("_paired", etc.).
I've envisioned how I'd do this in Python; however, I am trying to get better with BASH.
I'm familiar with how one might pass multiple files to a single command, e.g., uncompressing all .gz files in a particular directory:
gunzip *.gz
But this command has two inputs, and the inputs must be ordered, so the wildcard scheme isn't sufficient.
Thanks

Use a wildcard to get one file of the pair, and then use parameter substitution to get the other corresponding filenames.
for i1 in *_R1_001.fastq; do
    i2=${i1/R1_001/R2_001}
    paired=${i1/R1_001/paired}
    Seqprogram -i1 "$i1" -i2 "$i2" -o "$paired"
done

The easiest way to do this is to match one of the three filename patterns and modify it to get the other two.
That is to say:
for r1file in *_R1_*.fastq; do
    r2file=${r1file/_R1_/_R2_}
    pairfile=${r1file%_R1_*}_paired.fastq
    Seqprogram -i1 "$r1file" -i2 "$r2file" -o "$pairfile"
done
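Neither loop checks that the R2 file is actually there. As a rough sketch, the pairing can be separated from the program call so that incomplete pairs are skipped and the list can be inspected first. The helper name list_pairs is made up for this example; the naming scheme is the one from the question.

```bash
#!/usr/bin/env bash
# Print one tab-separated line per complete pair: R1 file, R2 file, output name.
# list_pairs is a hypothetical helper; file naming follows the question.
list_pairs() {
    local dir=${1:-.} r1 name r2 out
    for r1 in "$dir"/*_R1_*.fastq; do
        [[ -f $r1 ]] || continue                    # glob matched nothing
        name=${r1##*/}                              # file name without the directory
        r2=$dir/${name/_R1_/_R2_}                   # mate file
        out=$dir/${name%_R1_*}_paired.fastq         # base name plus identifier
        if [[ ! -f $r2 ]]; then
            echo "skipping $r1: no matching R2 file" >&2
            continue
        fi
        printf '%s\t%s\t%s\n' "$r1" "$r2" "$out"
    done
}

# Feed each pair to the program:
# list_pairs . | while IFS=$'\t' read -r r1 r2 out; do
#     Seqprogram -i1 "$r1" -i2 "$r2" -o "$out"
# done
```

This assumes file names contain no tabs or newlines, which holds for the names shown in the question.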

Puppet existing directory to bool

Is it possible to use a function that checks whether a given directory/file exists and assigns the boolean return value to a variable?
I want to execute a part of my manifest only if a file/directory doesn't exist.
Greetings

You will have to create a custom fact to get that piece of information from the agent to your master. The easiest way to do this is an external fact. On Linux, this scriptlet would suffice.
#!/bin/sh
[ -d /the/directory/in/question ] || exit 0
echo 'my_directory=present'
You can then use the $my_directory fact in your manifests. It's missing if the directory is not yet there, and otherwise has the value 'present'.

Organize code in unix bash scripting

I am used to object oriented programming. Now, I have just started learning unix bash scripting via linux.
I have a unix script with me. I wanted to break it down into "modules" or preferably programs similar to "more", "ls", etc., and then use pipes to link all my programs together. E.g., "some input" myProg1 | myProg2 | myProg3.
I want to organize my code and make it look neater, instead of all in one script. Also, it will be easy to do testing and development.
Is it possible to do this, especially as a newbie ?

There are a few things you could look at, for example using aliases in bash and storing them either in .bashrc or in a separate file sourced by .bashrc;
that will make running commands easier.
Simple aliases are easy, and you can go further by expanding whole commands into aliases.
You can also look into using functions in your code; there are lots of bash scripts around that will help you make sense of functions.
It is also possible to pipe output, such as that of tail, into another script.
The thing with bash is its flexibility: if something starts to get too messy for bash, you can always write it in Perl, Java or any other language, call that from within your bash script, capture its output and do something else with it.
I'm unsure why you need all the pipes; anyway, here is something that may help:
./example.sh 20
function one starts with 20
In function 2 20 + 10 = 30
Function three returns 10 + 30 = 40
------------------------------------------------
------------------------------------------------
Local function variables global:
Result2: 30 - Result3: 40 - value2: 10 - value1: 20
The script:
example.sh
#!/bin/bash
# Value passed on the command line; the functions in shared.sh read it.
input=$1
source ./shared.sh
one
echo "------------------------------------------------"
echo "------------------------------------------------"
echo "Local function variables global:"
echo "Result2: $result2 - Result3: $result3 - value2: $value2 - value1: $value1"
shared.sh
# Variables set without "local" are global and stay visible to the caller.
function one() {
    value1=$input
    echo "function one starts with $value1"
    two
}
function two() {
    value2=10
    result2=$((value1 + value2))
    echo "In function 2 $value1 + $value2 = $result2"
    three
}
function three() {
    # value3 is local, so it is not visible outside this function.
    local value3=10
    result3=$((value2 + result2))
    echo "Function three returns $value2 + $result2 = $result3"
}
I think the pipes you mean can actually be functions, with each function calling the next, so that the value you give the script is passed through the functions.
Bash is pretty flexible about passing values around: as long as an earlier function has set a variable, the next function it calls can reuse it, and so can the main program.
I also split the functions out into a separate file so that another script can source it and use the same functions.
E2A: Thanks for the upvote. I have also decided to include this link:
http://tldp.org/LDP/abs/html/sample-bashrc.html
It is an awesome .bashrc to reuse; it has a lot of functions that also give some insight into how to simplify daily repetitive commands, such as those that require piping. An alias can be written to do all of them for you.

You can do one thing.
Just as a C program can be divided into a header file and a source file to reduce complexity, you can divide your bash script into two scripts, a header and a main script, though with some differences.
Header file - This contains all the common variables and functions that your main script will use.
Your script - This contains only function calls and other logic. You need to put "source <header-file path>" at the start of your script to make all the functions and variables declared in the header available to it.

Shell scripts have standard input and output like any other program on Unix, so you can use them in pipes. Splitting your scripts is a good solution because you can later use them in pipes with other commands.
I organize my Bash projects in the following way:
Each command is put in its own file
Reusable functions are kept in a library file, which is just an ordinary script containing only functions
All files are in the same directory, so commands can find the library with "$(dirname "$0")/library"
Configuration is stored in another file as environment variables
To keep things clear, you should not use global variables to communicate between functions and the main program.
I prepare a template for scripts with the following parts:
Header with name and copyright
Read configuration with source
Load library with source
Check parameters
Function to display help, which is called if asked for or if parameters are wrong
My best advice is: always write the help function, as the next person who will need it is ... yourself!
To install your project, you simply copy all the files and explain what to set in the configuration file.
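As a rough illustration of the template described above, here is a sketch of a small filter-style command with a help function, a parameter check and a reusable function. The names usage, add_prefix and main are invented for this example, and the configuration and library files are left out to keep it self-contained.

```bash
#!/usr/bin/env bash
# Sketch of a filter-style command: reads lines on stdin, writes to stdout.
# usage, add_prefix and main are hypothetical names for illustration.

# Help text, shown on request or when the parameters are wrong.
usage() {
    echo "Usage: ${0##*/} PREFIX < input" >&2
}

# Reusable function; in a larger project it would live in the library file.
add_prefix() {
    local prefix=$1 line
    while IFS= read -r line; do
        printf '%s%s\n' "$prefix" "$line"
    done
}

# Check parameters, then do the work.
main() {
    if [[ $# -ne 1 || $1 == -h || $1 == --help ]]; then
        usage
        return 1
    fi
    add_prefix "$1"
}

# In a real script the last line would be:
# main "$@"
```

Because the command only uses standard input and output, it can sit in a pipeline between other commands, which is the kind of chaining the question asks about.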
