sed in bash to overwrite the same file [duplicate] - linux

I want to remove the headers of a file and replace its content without headers in the same file.
Example: file_student
name age
XYS 24
RTF 56
The output should be:
XYS 24
RTF 56
The scenario is that I do not want to create any new file for this change. Can sed do this?
I tried:
sed 1d /tmp/file_student.txt |
hadoop fs -copyfromLocal /tmp/file_student.txt /tmp/file_student_no_header.txt
But that does not work. Any help is appreciated!

An extract from sed's man page:
-i[SUFFIX]
--in-place[=SUFFIX]
This option specifies that files are to be edited in-place. GNU
sed does this by creating a temporary file and sending output to
this file rather than to the standard output.
This option implies -s.
When the end of the file is reached, the temporary file is renamed
to the output file's original name. The extension, if supplied,
is used to modify the name of the old file before renaming the
temporary file, thereby making a backup copy.
This rule is followed: if the extension doesn't contain a '*',
then it is appended to the end of the current filename as a
suffix; if the extension does contain one or more '*' characters,
then each asterisk is replaced with the current filename. This
allows you to add a prefix to the backup file, instead of (or in
addition to) a suffix, or even to place backup copies of the
original files into another directory (provided the directory
already exists).
If no extension is supplied, the original file is overwritten
without making a backup.
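So you need to change your sed command to something like
sed -i 1d /tmp/file_student.txt
With -i, sed writes nothing to standard output, so there is nothing to pipe. Run the next command (for example your hadoop upload) on the edited file afterwards.
Hope this helps.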
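A minimal sketch of the in-place edit, run on a throwaway copy of the sample data from the question. The helper name strip_header and the .bak suffix are illustrative choices, not part of the original answer. Attaching the suffix directly to -i is accepted by both GNU and BSD sed.

```shell
# strip_header FILE: delete the first line of FILE in place,
# keeping the original as FILE.bak (hypothetical helper name).
strip_header() {
    sed -i.bak '1d' "$1"
}

# Demo on a temporary file holding the sample data from the question.
demo_file=$(mktemp)
printf 'name age\nXYS 24\nRTF 56\n' > "$demo_file"
strip_header "$demo_file"
cat "$demo_file"                      # header line is gone
rm -f "$demo_file" "$demo_file.bak"   # clean up the demo files
```

Drop the suffix (GNU sed: plain -i) if no backup copy is wanted.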

If you don't want to use sed, try tail - e.g. you have a file called xxx:
tail -n +2 xxx > xxx.tmp && mv xxx.tmp xxx

Recursively appending names of all files in a directory with exif specific png meta data field (aesthetic_score) with linux / EXIFtool

I am trying to rename all files in a directory (recursively) by appending a specific metadata field to the end of each PNG file name.
The metadata field name is "aesthetic_score", with values ranging from 1.0 to 9.0.
When I type:
exiftool -Aesthetic_score -G1 -s testn.png
the result is:
[PNG] Aesthetic_score : 7.0
This is how I would like to rename the PNG files recursively within a directory.
Note that I would like to use the word "chad" instead of "aesthetic" in the appended text, and not all files will have this data field:
input file:
filename001.png (metadata aesthetic_score:7.0)
output:
filename001-chad-score-70.png
I tried to use Digikam and JExifToolGui-2.01, without success.
I am trying to perform this task in the cmd line, although other solutions are welcome. Thank you for your help.
This might work for you, though I can't really test it. Note that you need to remove the echo before the mv for it to actually rename the files rather than just show what it would do.
while IFS= read -r name
do
newname=$(exiftool -G1 -s "$name" | awk '$2=="FileName"{name=$4} $2=="Aesthetic_score"{base=name; sub(/\.[^.]*$/,"",base); ext=name; sub(/^.*\./,"",ext); gsub(/\./,"",$4); print base"-chad-score-"$4"."ext}')
[ -n "$newname" ] && echo mv "$name" "$(dirname "$name")/$newname"
done < <(find . -iname '*.png')
The find at the very end finds all the PNGs.
The while loop takes every name find emits, passes each file through exiftool (using your specs), and parses the output with awk, which prints the new name; that gets captured in the shell variable newname. Files without an Aesthetic_score field produce no output and are skipped. Because awk splits on whitespace, this assumes file names without spaces.
And finally the mv (without the echo) renames the files, keeping each one in its original directory.
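The name construction can also be done with shell parameter expansion alone, which avoids parsing exiftool's tabular output. The following is a sketch under assumptions: chad_name and rename_dry_run are hypothetical helper names, exiftool is assumed to be installed, and -s3 is used so that exiftool prints only the tag value (nothing at all when the tag is missing).

```shell
# chad_name PATH SCORE: print PATH with "-chad-score-<digits>" inserted
# before the extension; the dot is dropped from SCORE (7.0 becomes 70).
chad_name() {
    score_digits=$(printf '%s' "$2" | tr -d '.')
    printf '%s-chad-score-%s.%s\n' "${1%.*}" "$score_digits" "${1##*.}"
}

# rename_dry_run [DIR]: print the mv commands for every PNG below DIR
# (default: current directory) that carries an Aesthetic_score tag.
# Assumes file names without embedded newlines.
rename_dry_run() {
    find "${1:-.}" -iname '*.png' | while IFS= read -r file; do
        score=$(exiftool -s3 -Aesthetic_score "$file")
        [ -n "$score" ] || continue   # tag missing: leave the file alone
        echo mv "$file" "$(chad_name "$file" "$score")"
    done
}
```

Calling rename_dry_run only prints the commands; removing the echo inside it would perform the renames.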

rename file using a loop [duplicate]

I have a lot of files with names in the same format (???_ideal.sdf) and I need to rename them all to ???.sdf, removing "_ideal" from the name. For example:
Files:
002_ideal.sdf
ERT_ideal.sdf
234_ideal.sdf
sCX_ideal.sdf
New files:
002.sdf
ERT.sdf
234.sdf
sCX.sdf
I thought of using a loop, but I don't know how to indicate that "_ideal" should be removed from the new file name.
For example:
for file in ???_ideal.sdf; mv $file what?
With your shown samples, please try the following. It only prints the mv commands on the terminal so you can check that they look right first; to perform the actual rename, remove echo.
for file in *_ideal.sdf
do
echo mv "$file" "${file/_ideal/}"
done
Based on this answer you can also write it as one line with the Perl-based rename:
rename 's/_ideal\.sdf$/.sdf/' *_ideal.sdf
The first parameter is the substitution that strips _ideal from the name; the second is the list of files to rename.
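The loop approach can also be written with suffix removal (${var%pattern}), which only ever touches the end of the name. This is a small sketch; new_name and rename_ideal are hypothetical helper names introduced for illustration.

```shell
# new_name FILE: print FILE with a trailing "_ideal.sdf" replaced by ".sdf".
# ${1%_ideal.sdf} removes the shortest match of the pattern from the end.
new_name() {
    printf '%s.sdf\n' "${1%_ideal.sdf}"
}

# rename_ideal DIR: rename every *_ideal.sdf file inside DIR.
rename_ideal() {
    for file in "$1"/*_ideal.sdf; do
        [ -e "$file" ] || continue    # the glob matched nothing
        mv -- "$file" "$(new_name "$file")"
    done
}
```

For example, rename_ideal . processes the current directory. Unlike ${file/_ideal/}, the suffix form leaves an "_ideal" earlier in the name untouched.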

How to concatenate a string value at the head of a text file [duplicate]

Real nit picky Linux question.
I have a text file, call it usrec. I also have a string variable 'var_a'.
I want to concatenate the string value, let's just say it's 'howdy', to the top of the text file.
So something like
echo $var_a | cat usrec > file_out
where it pipes the output from the echo $var_a as a file and adds it to the top of file_out and then adds the rest of the usrec file.
So if the usrec file contains just the line 'This is the second line.' then the contents of file_out should be:
howdy
This is the second line.
The problem is that's not what the command is doing, and I do not want to create a variable to store var_a in. This is running from a script and I don't want to create any extra flak to have to clean up afterwards.
I've tried other variations and I'm coming up empty.
Can anyone help me?
If you give cat any file names then it does not automatically read its standard input. In that case, you must use the special argument - in the file list to tell it to read the standard input, and where to include it in the concatenated output. Since apparently you want it to go at the beginning, that would be:
echo "$var_a" | cat - usrec > file_out
I would simply do:
echo "$var_a" > file_out
cat usrec >> file_out
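Both answers can be folded into one redirection with a command group. The sketch below uses a hypothetical helper name, prepend_line, and printf instead of echo so that text starting with a dash or containing backslashes is written unchanged. The destination must be a different file from the source, because the redirection truncates it before cat runs.

```shell
# prepend_line TEXT SRC DEST: write TEXT as the first line of DEST,
# followed by the contents of SRC. The braces group both commands so
# that a single redirection captures the output of each.
prepend_line() {
    { printf '%s\n' "$1"; cat -- "$2"; } > "$3"
}

# Demo with the names used in the question, inside a scratch directory.
demo_dir=$(mktemp -d)
printf 'This is the second line.\n' > "$demo_dir/usrec"
var_a=howdy
prepend_line "$var_a" "$demo_dir/usrec" "$demo_dir/file_out"
cat "$demo_dir/file_out"
rm -rf "$demo_dir"
```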

Iterate through files in a directory, create output files, linux

I am trying to iterate through every file in a specific directory (called sequences), and perform two functions on each file. I know that the functions (the 'blastp' and 'cat' lines) work, since I can run them on individual files. Ordinarily I would have a specific file name as the query, output, etc., but I'm trying to use a variable so the loop can work through many files.
(Disclaimer: I am new to coding.) I believe that I am running into serious problems with trying to use my file names within my functions. As it is, my code will execute, but it creates a bunch of extra unintended files. This is what I intend for my script to do:
Line 1: Iterate through every file in my "sequences" directory. (All of which end with ".fa", if that is helpful.)
Line 3: Recognize the filename as a variable. (I know, I know, I think I've done this horribly wrong.)
Line 4: Run the blastp function using the file name as the argument for the "query" flag, always use "database.faa" as the argument for the "db" flag, and output the result in a new file that is has the same name as the initial file, but with ".txt" at the end.
Line 5: Output parts of the output file from line 4 into a new file that has the same name as the initial file, but with "_top_hits.txt" at the end.
for sequence in ./sequences/{.,}*;
do
echo "$sequence";
blastp -query $sequence -db database.faa -out ${sequence}.txt -evalue 1e-10 -outfmt 7
cat ${sequence}.txt | awk '/hits found/{getline;print}' | grep -v "#">${sequence}_top_hits.txt
done
When I ran this code, it gave me six new files derived from each file in the directory (and they were all in the same directory - I'd prefer to have them all in their own folders. How can I do that?). They were all empty. Their suffixes were, ".txt", ".txt.txt", ".txt_top_hits.txt", "_top_hits.txt", "_top_hits.txt.txt", and "_top_hits.txt_top_hits.txt".
If I can provide any further information to clarify anything, please let me know.
If you're only interested in *.fa files I would limit your input to only those matching files like this:
for sequence in sequences/*.fa;
do
I can propose the following improvements:
for fasta_file in ./sequences/*.fa # ";" is not necessary if you already have a new line for your "do"
do
# ${variable%something} is $variable with the trailing
# string "something" removed
# basename path/to/file is the name of the file
# without the full path
# $(some command) allows you to use the result of the command as a string
# Combining the above, we can form a string based on our fasta file
# This string can be useful to name stuff in a clean manner later
sequence_name=$(basename ${fasta_file%.fa})
echo ${sequence_name}
# Create a directory for the results for this sequence
# -p option avoids a failure in case the directory already exists
mkdir -p ${sequence_name}
# Define the name of the file for the results
# (including our previously created directory in its path)
blast_results=${sequence_name}/${sequence_name}_blast.txt
blastp -query ${fasta_file} -db database.faa \
-out ${blast_results} \
-evalue 1e-10 -outfmt 7
# Define a file name for the top hits
top_hits=${sequence_name}/${sequence_name}_top_hits.txt
# alternatively, using "%"
#top_hits=${blast_results%_blast.txt}_top_hits.txt
# No need to cat: awk can take a file as argument
awk '/hits found/{getline;print}' ${blast_results} \
| grep -v "#" > ${top_hits}
done
I made more intermediate variables, with (hopefully) meaningful names.
I used \ to escape line ends and allow putting commands in several lines.
I hope this improves code readability.
I haven't tested. There may be typos.
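To check the path handling without running blastp, the name derivation can be tried on its own. This is only a sketch: output_paths is a hypothetical helper, and the example file name is made up. It prints the three names the loop would use for one input file.

```shell
# output_paths FASTA: print, one per line, the sequence name, the BLAST
# results path and the top-hits path derived from a FASTA file path.
output_paths() {
    # Strip the .fa suffix, then drop the leading directories.
    sequence_name=$(basename "${1%.fa}")
    printf '%s\n' \
        "$sequence_name" \
        "$sequence_name/${sequence_name}_blast.txt" \
        "$sequence_name/${sequence_name}_top_hits.txt"
}

output_paths ./sequences/example.fa
```

Quoting the expansions, as done here, keeps the derivation working for file names that contain spaces.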
You should be using *.fa if you only want files with a .fa ending. Additionally, if you want to redirect your output to new folders, you need to create those directories first using
mkdir 'folder_name'
then point the output option of each command (-out for blastp) at files inside them, something like this
'command' -out /path/to/output/folder/result.txt
To help you test this script, you can run each line one by one. Make sure each line works by itself before combining them.
One last thing: be careful with your use of semicolons. A one-line loop should look something like this:
for filename in *.fa; do 'command'; done

bash script zip filename parsing strangely

I'm trying to zip various files together (one of the included files is actually a zip itself) and name the resulting zip based on a handful of bash variables defined earlier. One of the variables used in the zip file name is being parsed from a #define in a config.h file. I successfully parsed together a .zip with the correct name, but when I tried to implement the same zip script in a slightly different situation I get erroneous zip names.
In Windows explorer, the erroneous zip name looks something like X1276N~E.ZIP
In Linux the zip appears with the intended name except with a question mark (which I've come to understand to be some sort of placeholder), e.g. foo-stuff-bar-9.1b?.zip
My current code trying to zip a file with name foo-stuff-bar-9.1b.zip:
foo_name=$1
bar_name=$2
rev_number=$(grep define[[:space:]]*SOME_NUMBER $directory/config.h | awk '{print $3;}'| tr -d '/"')
archive_name="$foo_name"-stuff-"$bar_name"-9."$rev_number"
zip "$archive_name".zip file1 file2 backup1.zip file3
So "foo_name" and "bar_name" are strings coming from the terminal when the script is run, "rev_number" is being parsed from config.h, and I'm formatting it all into "archive_name" before using it in the zip command.
I've tried all sorts of variations of quotation marks and brackets and I get the same weird name no matter what I try. I'm not sure where the error comes from, as I'm parsing from many sources. Any advice is much appreciated.
Per Marc B's suggestion, I piped the string to xxd -b to look at it byte by byte. It turned out that I was accidentally picking up an extra character at the end of $archive_name when scraping the config.h file.
I was able to fix this by piping my string through tr -d "[:cntrl:]" to remove any control characters that would give weird file names.
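The diagnosis and the fix can be reproduced in isolation. The sketch below assumes the stray byte was a carriage return, which is typical when config.h has Windows line endings; the post does not say which control character it actually was. The helper name strip_cntrl and the sample value 1b are illustrative, and od -c is used in place of xxd because it is available on any POSIX system.

```shell
# strip_cntrl: copy stdin to stdout with all control characters removed.
strip_cntrl() {
    tr -d '[:cntrl:]'
}

raw=$(printf '1b\r')          # simulated value scraped from a CRLF file
printf '%s' "$raw" | od -c    # the dump shows the trailing \r
clean=$(printf '%s' "$raw" | strip_cntrl)
printf '%s.zip\n' "foo-stuff-bar-9.$clean"
```

Note that [:cntrl:] also covers newlines and tabs, so this filter suits single-token values such as a revision string rather than multi-line text.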
