How to create an alias with relative pwd + string - linux

I want to set an alias to switch between two WordPress instances on the CLI. Both have the same paths except for the names of their respective sites, e.g.:
srv/deployment/work/sitename1/wp-content/uploads/2018/
srv/deployment/work/sitename2/wp-content/uploads/2018/
How do I create an alias that takes the "pwd" of the current location and cds to exactly the same location on the other site?

How about a bash function instead of an alias? It gives you a little more freedom.
Save this bash function to a file like switchsite.sh. Modify the variables to your needs. Then load it into your bash with:
source switchsite.sh
If you are in /srv/deployment/work/sitename1/wp-content/uploads/2018, do
switchsite sitename2
and you will be in /srv/deployment/work/sitename2/wp-content/uploads/2018.
switchsite() {
    # modify this to reflect where your sites are located, no trailing slash
    local where_my_sites_are=/srv/deployment/work
    # modify this so it includes all characters that can be in a site name
    local pattern_matching_sitenames='[a-z0-9_-]'
    # this is the first argument when the function is called
    local newsite=$1
    # this replaces the site name in the current working directory
    # (\+ is a GNU sed extension)
    local newdir
    newdir=$(pwd | sed -n -e "s#\($where_my_sites_are\)/\($pattern_matching_sitenames\+\)/\(.*\)#\1/$newsite/\3#p")
    # an empty result means the current directory is not inside a site;
    # without this check, cd with no argument would jump to $HOME
    [ -n "$newdir" ] && cd "$newdir"
}
How it works: The sed expression splits the output of pwd into three parts: what is before the current site name, the current site name, and what comes after. Then sed puts it back together with the new site name. Just make sure the pattern can match all characters that could be in your site name. Research character classes for details.
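The same split-and-reassemble idea can also be sketched without sed, using bash pattern substitution. This is only a sketch under assumptions: swap_site_path and switchsite2 are hypothetical names, and unlike the function above this variant needs the current site name as well as the new one.

```bash
# Print PATH with the first "/FROM/" component replaced by "/TO/".
# swap_site_path is a hypothetical helper, not part of the answer above.
swap_site_path() {
    # A trailing slash is appended so that a site directory itself
    # (a path ending in /FROM) is matched too.
    local path="$1/" from="/$2/" to="/$3/"
    # Quoting "$from" makes it a literal string rather than a glob pattern.
    local swapped=${path/"$from"/"$to"}
    # Remove the trailing slash that was added for matching.
    printf '%s\n' "${swapped%/}"
}

# Hypothetical wrapper: jump to the same location under another site.
switchsite2() {
    local target
    target=$(swap_site_path "$PWD" "$1" "$2")
    if [ "$target" = "$PWD" ]; then
        echo "switchsite2: '$1' is not part of $PWD" >&2
        return 1
    fi
    cd -- "$target"
}
```

From /srv/deployment/work/sitename1/wp-content/uploads/2018, running switchsite2 sitename1 sitename2 should land in the matching directory under sitename2, provided it exists.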

Add the lines below to ~/.bash_aliases:
alias sitename1='cd /srv/deployment/work/sitename1/wp-content/uploads/2018/'
alias sitename2='cd /srv/deployment/work/sitename2/wp-content/uploads/2018/'
After that, run
source ~/.bash_aliases
Then you can simply type sitename1 or sitename2 from anywhere to switch to the respective directory.

Related

Bash script for loop variable returns all files in directory rather than a single file name [duplicate]

I would like to write the following function in bash:
go() {
cd "~/project/entry ${1}*"
}
What this would do is to cd into a project subdirectory with prefix entry (note space) and possibly a long suffix. I would only need to give it a partial name and it will complete the suffix of the directory name.
So, if for example, I have the following folders:
~/project/entry alpha some longer folder name
~/project/entry beta another folder name
~/project/entry gamma
I can run go b and it will put me into ~/project/entry beta another folder name.
The problem is, of course, that the wildcard doesn't expand inside double quotes. I cannot omit the quotes because then I will not be able to capture the spaces properly.
How do I get the wildcard to expand while at the same time preserving the spaces?
Move the quotes. Just don't quote the *. Probably also good not to quote the ~.
go() {
cd ~/"project/entry ${1}"*
}
That said, if this matches more than one directory, cd receives several arguments: older versions of bash use the first match and ignore the rest, while newer versions fail with a "too many arguments" error.
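If ambiguous prefixes are a concern, one possible variant collects the matches into an array first and refuses to guess. This is a sketch only, assuming the ~/project/"entry ..." layout from the question; the error messages are illustrative.

```bash
go() {
    # The trailing / restricts the glob to directories. The quoted part
    # keeps the space intact; ~ and * stay outside the quotes so they expand.
    local matches=( ~/"project/entry ${1}"*/ )

    # With no match, the glob stays unexpanded and is not a directory.
    if [ ! -d "${matches[0]}" ]; then
        echo "go: no entry matching '${1}'" >&2
        return 1
    fi

    # More than one match: list the candidates instead of picking one.
    if [ "${#matches[@]}" -gt 1 ]; then
        echo "go: '${1}' is ambiguous:" >&2
        printf '  %s\n' "${matches[@]}" >&2
        return 1
    fi

    cd -- "${matches[0]}"
}
```

With the folders from the question, go b changes into "entry beta another folder name", while go with an empty argument matches all three entries and is rejected.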

Iterate through files in a directory, create output files, linux

I am trying to iterate through every file in a specific directory (called sequences), and perform two functions on each file. I know that the functions (the 'blastp' and 'cat' lines) work, since I can run them on individual files. Ordinarily I would have a specific file name as the query, output, etc., but I'm trying to use a variable so the loop can work through many files.
(Disclaimer: I am new to coding.) I believe that I am running into serious problems with trying to use my file names within my functions. As it is, my code will execute, but it creates a bunch of extra unintended files. This is what I intend for my script to do:
Line 1: Iterate through every file in my "sequences" directory. (All of which end with ".fa", if that is helpful.)
Line 3: Recognize the filename as a variable. (I know, I know, I think I've done this horribly wrong.)
Line 4: Run the blastp function using the file name as the argument for the "query" flag, always use "database.faa" as the argument for the "db" flag, and output the result in a new file that has the same name as the initial file, but with ".txt" at the end.
Line 5: Output parts of the output file from line 4 into a new file that has the same name as the initial file, but with "_top_hits.txt" at the end.
for sequence in ./sequences/{.,}*;
do
echo "$sequence";
blastp -query $sequence -db database.faa -out ${sequence}.txt -evalue 1e-10 -outfmt 7
cat ${sequence}.txt | awk '/hits found/{getline;print}' | grep -v "#">${sequence}_top_hits.txt
done
When I ran this code, it gave me six new files derived from each file in the directory (and they were all in the same directory - I'd prefer to have them all in their own folders. How can I do that?). They were all empty. Their suffixes were, ".txt", ".txt.txt", ".txt_top_hits.txt", "_top_hits.txt", "_top_hits.txt.txt", and "_top_hits.txt_top_hits.txt".
If I can provide any further information to clarify anything, please let me know.
If you're only interested in *.fa files I would limit your input to only those matching files like this:
for sequence in sequences/*.fa;
do
I can propose the following improvements:
for fasta_file in ./sequences/*.fa # ";" is not necessary if you already have a new line for your "do"
do
    # ${variable%something} is $variable with the trailing "something" removed
    # basename path/to/file is the name of the file
    # without the full path
    # $(some command) allows you to use the result of the command as a string
    # Combining the above, we can form a string based on our fasta file
    # This string can be useful to name stuff in a clean manner later
    sequence_name=$(basename "${fasta_file%.fa}")
    echo "${sequence_name}"
    # Create a directory for the results for this sequence
    # -p option avoids a failure in case the directory already exists
    mkdir -p "${sequence_name}"
    # Define the name of the file for the results
    # (including our previously created directory in its path)
    blast_results="${sequence_name}/${sequence_name}_blast.txt"
    blastp -query "${fasta_file}" -db database.faa \
        -out "${blast_results}" \
        -evalue 1e-10 -outfmt 7
    # Define a file name for the top hits
    top_hits="${sequence_name}/${sequence_name}_top_hits.txt"
    # alternatively, using "%"
    #top_hits="${blast_results%_blast.txt}_top_hits.txt"
    # No need to cat: awk can take a file as argument
    awk '/hits found/{getline;print}' "${blast_results}" \
        | grep -v "#" > "${top_hits}"
done
I made more intermediate variables, with (hopefully) meaningful names.
I used \ to escape line ends and allow putting commands in several lines.
I hope this improves code readability.
I haven't tested. There may be typos.
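To see which names the blast_results and top_hits variables end up with before running blastp on real data, the naming logic can be tried on its own. A minimal sketch: output_paths is a hypothetical helper and protein_A.fa is a made-up file name.

```bash
# Print the result file names derived from one fasta file path.
# output_paths is a hypothetical helper; it does not run blastp.
output_paths() {
    local fasta_file=$1
    local sequence_name
    # Strip the ".fa" suffix, then the leading directories.
    sequence_name=$(basename "${fasta_file%.fa}")
    printf '%s\n' "${sequence_name}/${sequence_name}_blast.txt"
    printf '%s\n' "${sequence_name}/${sequence_name}_top_hits.txt"
}

output_paths ./sequences/protein_A.fa
# protein_A/protein_A_blast.txt
# protein_A/protein_A_top_hits.txt
```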
You should be using *.fa if you only want files with a .fa ending. Additionally, if you want to redirect your output to new folders you need to create those directories somewhere using
mkdir 'folder_name'
then you need to redirect your -o outputs to those files, something like this
'command' -o /path/to/output/folder
To help you test this script out, you can run each line one by one to test them. You need to make sure each line works by itself before combining.
One last thing: be careful with your use of semicolons. It should look something like this:
for filename in *.fa; do 'command'; done

string manipulation of Directory structure

Scenario: I have a script but no idea where I am in the directory tree. I need to resolve back to the nearest known location, UPROC[something].
What I have so far:
I have a script running in a directory for example:
/home/jim/query/UPROCL/test/bob/dircut.sh
Now, the only constant in this is that the directory I want will begin with UPROC... maybe not UPROCL, but definitely UPROC.
So I have written the following:
#!/bin/bash
#Absolute path for this script
SCRIPT=$(readlink -f "$0")
echo $SCRIPT
#Gets Path of script without script name
SCRIPTPATH=$(dirname "$SCRIPT")
echo $SCRIPTPATH
#Cuts everything after UPROC(.* is wildcard)/
CUTDOWN=$(sed 's/\(UPROC.*\/\).*/\1/' <<< $SCRIPTPATH)
echo $CUTDOWN
The only problem is that the output is:
/home/jim/query/UPROCL/test/bob/dircut.sh
/home/jim/query/UPROCL/test/bob
/home/jim/query/UPROCL/test/
Can someone tell me what is wrong with my sed command, as it is not cutting down to
/home/jim/query/UPROCL/
Because .* is greedy. You want to be more selective about which characters are allowed to follow "UPROC": any non-slash.
Not
sed 's/\(UPROC.*\/\).*/\1/'
but
sed -r 's,(UPROC[^/]*/).*,\1,'
Using different delimiters for the s/// command reduces the "leaning toothpick" problem.
Because the .* inside the \(...\) also matches everything up to the / at the end of test/.
You need [^/]* instead of .* so that it does not match any slashes.
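The difference between the two patterns can be checked directly on the path from the question. In this sketch -E is used instead of -r because both GNU and BSD sed accept it; the variable names are illustrative.

```bash
path=/home/jim/query/UPROCL/test/bob

# Greedy: .* runs up to the last slash, so "test/" is kept.
greedy=$(sed 's/\(UPROC.*\/\).*/\1/' <<< "$path")

# [^/]* cannot cross a slash, so the match ends at the first / after UPROC.
fixed=$(sed -E 's,(UPROC[^/]*/).*,\1,' <<< "$path")

echo "$greedy"   # /home/jim/query/UPROCL/test/
echo "$fixed"    # /home/jim/query/UPROCL/
```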
If you want to know which directory you are in, why not use pwd?
One thing which might be useful: the command pwd shows the value of the environment variable PWD (uppercase). In case you want to use the current directory as a value, you might use this.

Script shell for renaming and rearranging files

I would like to rearrange and rename files.
I have this tree structure of files :
ada/rda/0.05/alpha1_freeSurface.md
ada/rda/0.05/p_freeSurface.md
ada/rda/0.05/U_freeSurface.md
ada/rda/0.1/alpha1_freeSurface.md
ada/rda/0.1/p_freeSurface.md
ada/rda/0.1/U_freeSurface.md
I want the files to be renamed and rearranged like the structure below:
ada/rda/ada-0.05-alpha1.md
ada/rda/ada-0.05-p.md
ada/rda/ada-0.05-U.md
ada/rda/ada-0.1-alpha1.md
ada/rda/ada-0.1-p.md
ada/rda/ada-0.1-U.md
Using the perl rename (sometimes called prename) utility:
rename 's|ada/rda/([^/]*)/([^_]*).*|ada/rda/ada-$1-$2.md|' ada/rda/*/*
(Note: by default, some distributions install a rename command from the util-linux package. This command is incompatible. If you have such a distribution, see if the perl version is available under the name prename.)
How it works
rename takes a Perl expression as an argument. Here the argument consists of a single substitute command. The new name for the file is found by applying the substitute command to the old name. This allows us not only to give the file a new name but also to move it to a new directory, as above.
In more detail, the substitute command looks like s|old|new|. In our case, old is ada/rda/([^/]*)/([^_]*).*. This captures the number in group 1 and the beginning of the filename (the part before the first _) in group 2. The new part is ada/rda/ada-$1-$2.md. This creates the new file name using the two captured groups.
You can use the basename and dirname commands to reconstruct the new filename:
get_new_name()
{
    oldname=$1
    prefix=$(basename "$oldname" _freeSurface.md)
    dname=$(dirname "$oldname")
    basedir=$(dirname "$dname")
    dname=$(basename "$dname")
    echo "$basedir/ada-$dname-$prefix.md"
}
e.g. get_new_name ada/rda/0.05/alpha1_freeSurface.md will print ada/rda/ada-0.05-alpha1.md to the console.
Then, you can loop through all your files and use mv command to rename the files.
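The loop could look like the sketch below. It repeats get_new_name so that the block is self-contained. rename_all is a hypothetical name, and the default root directory ada/rda is taken from the question.

```bash
get_new_name() {
    local oldname=$1 prefix dname basedir
    prefix=$(basename "$oldname" _freeSurface.md)  # e.g. alpha1
    dname=$(dirname "$oldname")                    # e.g. ada/rda/0.05
    basedir=$(dirname "$dname")                    # e.g. ada/rda
    dname=$(basename "$dname")                     # e.g. 0.05
    echo "$basedir/ada-$dname-$prefix.md"
}

# Hypothetical driver: move every *_freeSurface.md file found one level
# below the given root (default: ada/rda) to its new name.
rename_all() {
    local root=${1:-ada/rda} f
    for f in "$root"/*/*_freeSurface.md; do
        [ -e "$f" ] || continue   # glob matched nothing
        # -n: never overwrite an existing file
        mv -n -- "$f" "$(get_new_name "$f")"
    done
}
```

Replacing mv with echo mv for a first run prints the planned moves without touching any file. The now-empty numbered directories are left in place.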

How do you format output string in bash script for input by another script?

I need to unzip a bunch of student assignment (jar) files so that I can use a script to submit the contents to the Moss (Stanford) plagiarism detection server. I did the same thing in Java, which was trivial, but I'm trying to re-implement it as a bash script.
I am trying to do the following:
Get a list of student names (each student has a directory).
In each student directory, sub-directories exist numbered from 1 to the latest submission. I need to get the directory with the highest number.
Each of those submission directories contains a jar file that I need. I copy each jar into a temp directory with the same name as the student and unzip it.
I need that temp directory listing formatted as a string in the form
/tempDir/studentName1/*.languageExt /tempDir/studentName2/*.languageExt
The student directory has the basic structure:
Student_Root_Directory:
Student1
Student2
Student1
Sub-Directories: 1 2 3 4 5
1: student1.jar
2: student1.jar
...
Student2
Sub-Directories: 1 2 3
1. student2.jar
...
To do the first 3 steps above I did:
#!/bin/bash
# Extract all jar files into a temp directory called /home/moss/tempJarFiles/studentName
# $1 is the command line argument that contains the path to the institution submission dir.
# $2 is the language extension: .c, .cpp, .java, .py
students=`ls $1`
student_dir=$1
languageExt=$2
mossDir="/home/moss"
tempDir="/home/moss/tempJarStorage"
for student in $students
do
latestSubmissionDir=`ls -t $student_dir/$student | head -1`
for jarDir in $latestSubmissionDir
do
mkdir $tempDir/$student
cp $student_dir/$student/$jarDir/*.jar $tempDir/$student
unzip -d $tempDir/$student/ -o -j $tempDir/$student/$student.jar *.$languageExt
rm $tempDir/$student/$student.jar
done
done
...which results in a number of student directories being created in a temp directory that contains only the unzipped contents for the student submissions.
I need the ls output of the new temp directories formatted as a string that contains:
/tempDir/studentName1/*.languageExt /tempDir/studentName2/*.languageExt
I have tried variations on
find "$tempDir" -iname "*.$languageExt" -printf "%p/*.$languageExt"
using iname and not - but I either have output that contains extra directory information such as $tempDir/*.languageExt (when I just need the subdirectories $tempDir/$studentName/*.languageExt) or I have output where the path for every source file is also listed such as:
$tempDir/$studentName/studentNameA.java
$tempDir/$studentName/studentNameB.java
when I only need
$tempDir/$studentName/*.java
I think this should be really easy and I'm just over thinking it. Any hints for improving the script also appreciated.
Here's a revised version of the script that may work:
#!/bin/bash
# Extract all jar files into a temp directory called /home/moss/tempJarStorage/studentName
# $1 is the command line argument that contains the path to the institution submission dir.
# $2 is the language extension: c, cpp, java, py
students_dir=$1
languageExt=$2
studentPathsT=( "$students_dir"/*/ )
mossDir='/home/moss'
tempDir='/home/moss/tempJarStorage'
for studentPathT in "${studentPathsT[@]}"; do
student=$(basename "$studentPathT")
mkdir "$tempDir/$student"
submissionDirsT=( "$studentPathT"*/ )
latestSubmissionDirT=${submissionDirsT[${#submissionDirsT[@]}-1]}
cp "$latestSubmissionDirT"*.jar "$tempDir/$student/"
unzip -d "$tempDir/$student/" -o -j "$tempDir/$student/*.jar" "*.$languageExt"
rm "$tempDir/$student"/*.jar
done
# Note that at this point `"$tempDir"/*/*.$languageExt` would expand
# to all extracted submission files, across all students.
# Finally, output each student's extracted files as an unexpanded glob à la
# /{tempDir}/{studentName1}/*.{languageExt}
for pT in "$tempDir"/*/; do
echo "$pT*.$languageExt"
# Note: If there is a chance that your filenames contain
# embedded newlines (rare in practice) using `echo` won't work properly
# as @Charles Duffy points out.
# If that is a concern, use
# printf '%s\0' "$pT*.$languageExt"
# and process the output with a utility that can process NUL characters
# as separators, such as `xargs -0`.
done
It avoids using ls and only uses pathname expansion and array variables so as to properly deal with paths that contain embedded spaces and other shell metacharacters.
The suffix T in variable names indicates that a particular path or array of paths is *T*erminated, i.e., that it ends in a /.
The assumption is that the numbered subdirectories do not go beyond 9, as the implicit lexical sorting of pathname expansion is relied upon; if the numbers go higher, explicit numerical sorting must be applied.
Note that the globs (pathname patterns) passed to unzip are intentionally double-quoted, as they should be interpreted by unzip, not the shell.
Note that, based on your original code, I've assumed that $languageExt does NOT start with . (e.g., cpp rather than .cpp), despite what your comment says.
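If the submission numbers can go beyond 9, the lexical ordering assumed above no longer puts the latest directory last (10 sorts before 2). One way to add explicit numerical sorting is sketched here. latest_submission_dir is a hypothetical helper, and it assumes the submission directories are named with plain non-negative integers.

```bash
# Print the highest-numbered sub-directory of a student directory.
# $1 is the student path, terminated with a / like studentPathT above.
latest_submission_dir() {
    local studentPathT=$1 dir name latest=-1
    for dir in "$studentPathT"*/; do
        name=$(basename "$dir")
        case $name in
            ''|*[!0-9]*) continue ;;   # skip names that are not plain numbers
        esac
        # Numeric comparison, so 10 ranks above 9.
        if [ "$name" -gt "$latest" ]; then
            latest=$name
        fi
    done
    [ "$latest" -ge 0 ] || return 1    # no numbered sub-directory found
    printf '%s%s/\n' "$studentPathT" "$latest"
}
```

In the script, the result could replace the array lookup, e.g. latestSubmissionDirT=$(latest_submission_dir "$studentPathT").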

Resources