how to add a character before every line? - linux

I have a big txt file where I want to add a fasta symbol before every line as a new line. I tried with sed, I can add it before the line but not as a new line.
I have file like this
AAAAAAAAAACA
AAAAAAAAAACTTAT
AAAAAAAAACATGTGACTA
AAAAAAAAACTTATTCTTTTT
AAAAAAAACATGTGACT
And I want something like this
>
AAAAAAAAAACA
>
AAAAAAAAAACTTAT
>
AAAAAAAAACATGTGACTA
>
AAAAAAAAACTTATTCTTTTT
>
AAAAAAAACATGTGACT
Thanks,

you can use the sed command like this:
sed 's/^/>\n/' file.txt > file2.txt
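The \n in the replacement is understood by GNU sed but not necessarily by other sed implementations. As a hedged alternative, here is a small sketch that uses sed's i\ (insert) command in its multi-line form instead. The function name add_header_lines is made up for illustration.

```shell
# Hypothetical helper: print a ">" line before every input line.
add_header_lines() {
    # i\ inserts the text on the following line before each input line.
    sed 'i\
>' "$@"
}

printf 'AAAAAAAAAACA\nAAAAAAAAAACTTAT\n' | add_header_lines
```

It reads standard input or any files given as arguments, so add_header_lines file.txt > file2.txt should also work.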

SED
# If you want to edit the file in-place
sed -i -e 's/^/prefix/' file
# If you want to create a new file
sed -e 's/^/prefix/' file > file.new
How?
The basic form is sed 's/x/y/g', which replaces every x with y.
Use -e to pass the script to the command (it is a good habit to always use it).
Use -i to edit the file in place instead of printing the result to standard output.
AWK
awk '{print ">\n"$0}' file >> newFile
How?
Simple format : awk '{print $0}' file will print whole lines as it is.
Then just add the prefix you need, e.g. "xyz\n".
WHILE LOOP
while read line ;
do
echo -e ">\n$line" ;
done < file
With this you can process each line individually.
To handle only particular lines, feed the loop filtered input, e.g. done < <(grep "TT" file)
You can also add a condition inside the loop (if the line contains ..., then ...) and use echo "prefix$line" only for the matching lines.
Note: a plain read strips leading and trailing whitespace (spaces, tabs, or whatever characters are in IFS) and treats backslashes as escapes. Use while IFS= read -r line if each line must be kept intact.
Note:
If prefix contains /, you can use any other character not in prefix, or
escape the /, so the sed command becomes
's#^#/opt/workdir#'
# or
's/^/\/opt\/workdir/'
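The loop variant with a condition could be sketched roughly as follows. The function name add_header_if_match and the sample input are assumptions for illustration; printf is used instead of echo -e so that backslashes in the data are not interpreted.

```shell
# Hypothetical helper: put ">" on its own line before lines containing $1.
add_header_if_match() {
    pattern=$1
    # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
    # The || test also handles a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *"$pattern"*) printf '>\n%s\n' "$line" ;;
            *)            printf '%s\n' "$line" ;;
        esac
    done
}

printf '  AATT\nCCGG\n' | add_header_if_match TT
```

Only the line containing TT gets a header, and its leading spaces survive.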

$ awk '$0=">\n"$0' <(echo -e "foo\nbar")
>
foo
>
bar
# change in place (requires GNU awk 4.1 or later)
$ cat file
foo
bar
baz
$ awk -i inplace '$0=">\n"$0' file
$ cat file
>
foo
>
bar
>
baz
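If your awk has no -i inplace, the same effect can be had with a temporary file. This is only a sketch: the helper name prepend_header_in_place is hypothetical, and it assumes mktemp is available.

```shell
# Hypothetical helper: rewrite a file so each line is preceded by ">".
prepend_header_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    # Write to a temporary file first, then copy back over the original
    # so the file keeps its permissions.
    if awk '{ print ">"; print }' "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

demo=$(mktemp)
printf 'foo\nbar\n' > "$demo"
prepend_header_in_place "$demo"
cat "$demo"
```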

Related

Bash: redirect `cat` to file without newline

I'm sure this question has been answered somewhere, but while searching for it, I can't find my exact scenario. I have two files that I am concatenating into a single file, but I am also adding user input between the two files. The problem is that a newline is being inserted after the first file, the rest of it works as desired.
touch newFile
cat file1 > newFile
echo -n $userInput >> newFile
cat file2 >> newFile
How do I prevent or remove the newline when file1 is added to newFile? If I cat file1 there seems to be a newline added by cat but everything I see about cat says it doesn't do that. If I vim file1 there's not a blank line at the end of the file that would indicate the newline is a part of the file, so either cat is actually adding a newline, or the redirect > is doing it, or echo adds a newline at the beginning of its output, none of which would be desirable in this situation. One solution I saw was to use
cat file1 | tr -d '\n'
but that discards all the newlines in the file, also not desirable. So, to repeat my question:
How do I cat file1 into the new file and add user input without adding the newline between them?
(cat is not a requirement, but I am not familiar with printf, so if that's the solution then please elaborate on its use).
With these inputs:
userInput="Test Test Test"
echo "Line 1
Line 2
Line 3" >file1
echo "Line 4
Line 5
Line 6" >file2
I would do:
printf "%s%s%s" "$(cat file1)" "$userInput" "$(cat file2)" >newfile
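Redirecting to >newfile creates the file, so the separate touch from your first step is not needed, and a single printf makes the intent easier to see. Note that command substitution strips the trailing newlines of both files, so newfile ends without a final newline.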
I get:
$ cat newfile
Line 1
Line 2
Line 3Test Test TestLine 4
Line 5
Line 6
Like all other Unix tools, Vim considers \n a line terminator and not a line separator.
This means that a linefeed after the last piece of text will be considered part of the last line, and will not show an additional blank line.
If there is no trailing linefeed, Vim will instead show [noeol] in the status bar when the file is loaded:
foo
~
~
~
~
~
"file" [noeol] 1L, 3C 1,1 All
^---- Here
So no, the linefeed is definitely part of your file and not being added by bash in any way.
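To check this from the shell rather than from Vim, something like the following sketch can be used. The helper name ends_with_newline is made up, and it assumes a text file (no NUL bytes).

```shell
# Hypothetical helper: succeed if the file's last byte is a newline.
ends_with_newline() {
    # Command substitution drops a trailing newline, so the result is
    # empty when the last byte of a non-empty text file is a newline.
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

with_nl=$(mktemp);    printf 'foo\n' > "$with_nl"
without_nl=$(mktemp); printf 'foo'   > "$without_nl"
for f in "$with_nl" "$without_nl"; do
    if ends_with_newline "$f"; then
        echo "ends with newline"
    else
        echo "no final newline"
    fi
done
```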
If you want to strip all trailing linefeeds, you can do this as a side effect of command substitution:
printf '%s' "$(<file1)" >> newfile
touch newFile
echo -n "$(cat file1)" > newFile
echo -n "$userInput" >> newFile
cat file2 >> newFile
That did the trick.
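For reference, the same idea can be wrapped in a function that strips only the newline at the end of the first file and leaves the second file untouched. This is a sketch; concat_with_insert and the temporary paths are illustrative names.

```shell
# Hypothetical helper: file1 (minus trailing newlines) + text + file2.
concat_with_insert() {
    # $(cat ...) strips the trailing newline(s) of the first file only.
    printf '%s%s' "$(cat "$1")" "$2"
    # cat copies the second file unchanged, including its final newline.
    cat "$3"
}

dir=$(mktemp -d)
printf 'Line 1\nLine 2\nLine 3\n' > "$dir/file1"
printf 'Line 4\nLine 5\nLine 6\n' > "$dir/file2"
concat_with_insert "$dir/file1" "Test Test Test" "$dir/file2" > "$dir/newFile"
cat "$dir/newFile"
```

Using printf rather than echo -n also avoids surprises when the user input starts with a dash or contains backslashes.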

Linux: Append variable to end of line using line number as variable

I am new to shell scripting. I am using ksh.
I have this particular line in my script, which I use to append the text in variable q to the end of the line whose number is given by variable a.
sed -i ''$a's#$#'"$q"'#' test.txt
Now the variable q can contain a large amount of text, with all sorts of special characters, such as !##$%^&*()_+:"<>.,/;'[]= etc etc, no exceptions. For now, I use a couple of sed commands in my script to remove any ' and " in this text (sed "s/'/ /g" | sed 's/"/ /g'), but still when I execute the above command I get the following error
sed: -e expression #1, char 168: unterminated `s' command
Any sed, awk, perl, suggestions are very much appreciated
The difficulty here is to quote (escape) the substitution separator characters # in the sed command:
sed -i ''$a's#$#'"$q"'#' test.txt
For example, if q contains # it will not work. The # will terminate the replacement pattern prematurely. Example: q='a#b', a=2, and the command expands to
sed -i 2s#$#a#b# test.txt
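which does not append a#b to the end of line 2: sed takes the second # as the end of the replacement (a) and then tries to read b# as flags, which typically fails with an error.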
This can be solved by escaping the # characters in q:
sed -i '2s#$#a\#b#' test.txt
However, this escaping could be cumbersome to do in shell.
Another approach is to add a level of indirection. Here is an example using a Perl one-liner. First, q and a are passed to the script as quoted arguments. Then, within the script, they are assigned to the internal variables $q and $a. With this approach there is no need to escape the substitution separator characters:
perl -pi -E 'BEGIN {$q = shift; $a = shift} s/$/$q/ if $. == $a' "$q" "$a" test.txt
Do not bother trying to sanitize the string. Just put it in a file, and use sed's r command to read it in:
printf '%s\n' "$q" > tmpfile
sed -i -e "${a}r tmpfile" test.txt
Ah, but that creates an extra newline that you don't want. You can remove it with:
sed -e "${a}r tmpfile" test.txt | awk -v n="$a" 'NR == n { printf "%s", $0; next } 1' > output
Another approach is to use the patch utility if present in your system.
patch test.txt <<-EOF
${a}c
$(sed "${a}q;d" test.txt)$q
.
EOF
${a}c will be replaced with the line number followed by c which means the operation is a change in line ${a}.
The second line is the replacement of the change. This is the concatenated value of the original text and the added text.
The lone . ends the replacement text.
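Since awk suggestions were also welcome, here is one more sketch that avoids escaping altogether by handing the values to awk through the environment. The function name append_to_line and the variable names APPEND_LINE and APPEND_TEXT are invented for this example.

```shell
# Hypothetical helper: append TEXT to line LINE_NUMBER of standard input.
# usage: append_to_line LINE_NUMBER TEXT < input > output
append_to_line() {
    # The values travel through the environment, so awk never parses
    # them as code, regexes, or escape sequences.
    APPEND_LINE=$1 APPEND_TEXT=$2 awk '
        NR == ENVIRON["APPEND_LINE"] + 0 { $0 = $0 ENVIRON["APPEND_TEXT"] }
        { print }
    '
}

q='a#b/&\1 "quoted" $HOME'
printf 'one\ntwo\nthree\n' | append_to_line 2 "$q"
```

To update test.txt itself, write the output to a temporary file and move it back over the original.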

UNIX: Grep a specific word and all the text following it

I have a variable in Unix, that stores multiple lines of alpha-numeric characters. I want to grep to a specific word and get all the text following it.
For example, $Variable contains:
Hello, User
Your files are:
File1 : Exists
File2 : None
Let us say I want to find File2, which is on the last line, read whatever text follows the colon (Yes, None, or anything else), and save it to another variable.
Use sed instead
sed -n '/the word you are looking for/,$p' <file name>
or since you said it was in a variable something more like:
echo "$variable" | sed -n '/the word you are looking for/,$p'
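sed -n suppresses the default output, so nothing is printed unless you ask for it.
The address range runs from the first line matching "the word you are looking for" to $, the last line of input, and the p command prints every line in that range :)
If you need to stop before the end of the input, replace $ with a pattern that matches the last line you want.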
If you just want to save the results to another variable:
new_variable=$(echo "$variable" | sed -n '/the word you are looking for/,$p')
Also note that if the string you are looking for contains a /, you must escape it with \, so it would look like
new_variable=$(echo "$variable" | sed -n '/the word you are\/ looking for/,$p')
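As a small illustration of both forms of the range, here is a sketch on the question's sample text. The last line, Goodbye, is not in the question and is added only so that the two results differ.

```shell
# Sample text; the final "Goodbye" line is added here for illustration.
text='Hello, User
Your files are:
File1 : Exists
File2 : None
Goodbye'

# From the first line matching File1 through the end of input.
to_end=$(printf '%s\n' "$text" | sed -n '/File1/,$p')

# From the first line matching File1 through the next line matching File2.
to_stop=$(printf '%s\n' "$text" | sed -n '/File1/,/File2/p')

printf '%s\n---\n%s\n' "$to_end" "$to_stop"
```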
So you have a variable defined as:
$ var="abc\ndef\nghi\njkl\nmn"
Then, to print the line containing "ghi" and everything after it:
$ echo -e "$var" | sed -n '/ghi/,$p'
grep is to Globally search for a Regular Expression and Print the matching string. That is not what you want to do, you want to take a Stream of input and EDit it to output part of it. Guess what tool does THAT in UNIX.
$ echo "$var"
Hello, User
Your files are:
File1 : Exists
File2 : None
$ var2=$(echo "$var" | sed -n 's/^File2 : //p')
$ echo "$var2"
None
Given:
variable="Hello, User
Your files are:
File1 : Exists
File2 : None"
You can get the information for File2 into another variable file2 using:
file2=$(echo "$variable" | sed -n '/File2/ s/File2 *: *//p')
The double quotes preserve newlines in the variable. The -n suppresses the default printing. The pattern matches the line containing File2 followed by any number of spaces, a colon and any number of additional spaces; it is replaced by nothing, and the remainder of the line is printed by sed and that is captured in the variable file2. If there can be spaces in front of File2 in the data, you can arrange to match and remove them too.
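If the label has to match exactly (so that File2 does not also pick up a line such as File20), an awk version along these lines might help. It is a sketch only: value_after_label is a made-up name, and it assumes the label itself contains no colon or backslash.

```shell
# Hypothetical helper: print the text after "LABEL :" for an exact label.
# usage: value_after_label LABEL TEXT
value_after_label() {
    printf '%s\n' "$2" | awk -v label="$1" '
        {
            colon = index($0, ":")          # position of the first colon
            if (colon == 0) next            # no colon on this line
            key = substr($0, 1, colon - 1)
            gsub(/^ +| +$/, "", key)        # trim spaces around the label
            if (key != label) next
            value = substr($0, colon + 1)
            sub(/^ +/, "", value)           # trim spaces after the colon
            print value
        }
    '
}

variable='Hello, User
Your files are:
File1 : Exists
File2 : None'
file2=$(value_after_label File2 "$variable")
echo "$file2"
```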

Insert line in the middle of file with standard unix tools

I can grab a specific line from a file using sed. Is there an easy way to take this line or paragraph and insert it at a specific line in another file?
sed -n 1,10p >> foo appends the result to foo, which places it at the bottom. Is there a standard unix tool to insert onto a specific line?
Perhaps you are looking for sed's r command?
sed '123r file.txt' main.txt
inserts the contents of file.txt after line 123 of main.txt, printing everything to standard output.
(If your sed has the -i option, you can make it modify main.txt directly; otherwise, it will not modify its input files.)
If you want to replace the nth line in file foo you can do it with
cp foo foo.tmp
head -n $((n-1)) foo.tmp > foo
echo "newline" >> foo
tail -n +$((n+1)) foo.tmp >> foo
So you take the first n-1 lines with head -n NR, append your new line and then append the rest starting from line n+1 with tail -n +NR.
This might work for you (GNU sed):
sed '123s|.*|sed '\''1,10!d'\'' insert.txt|e' main.txt
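The head/tail idea also covers inserting a range of lines from another file without replacing anything. A rough sketch follows; the function name insert_range_after and the temporary files are assumptions for the example.

```shell
# Hypothetical helper: print DEST with lines FIRST..LAST of SRC inserted
# after line LINE of DEST.
# usage: insert_range_after SRC FIRST LAST DEST LINE > result
insert_range_after() {
    src=$1 first=$2 last=$3 dest=$4 line=$5
    head -n "$line" "$dest"                 # lines 1..LINE of DEST
    sed -n "${first},${last}p" "$src"       # the chosen lines of SRC
    tail -n "+$((line + 1))" "$dest"        # the rest of DEST
}

work=$(mktemp -d)
printf 'a\nb\nc\n' > "$work/main.txt"
printf '1\n2\n3\n4\n' > "$work/insert.txt"
insert_range_after "$work/insert.txt" 2 3 "$work/main.txt" 1
```

The result goes to standard output, so redirect it to a new file rather than to the file being read.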

Add a prefix string to beginning of each line

I have a file as below:
line1
line2
line3
And I want to get:
prefixline1
prefixline2
prefixline3
I could write a Ruby script, but it is better if I do not need to.
prefix will contain /. It is a path, /opt/workdir/ for example.
# If you want to edit the file in-place
sed -i -e 's/^/prefix/' file
# If you want to create a new file
sed -e 's/^/prefix/' file > file.new
If prefix contains /, you can use any other character not in prefix, or
escape the /, so the sed command becomes
's#^#/opt/workdir#'
# or
's/^/\/opt\/workdir/'
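When the prefix comes from a variable and you cannot know which characters it holds, one option is to escape it before giving it to sed. The following is a sketch under the assumption that the prefix is a single line; prefix_lines_literal is a hypothetical name.

```shell
# Hypothetical helper: prefix every line with an arbitrary one-line string.
# usage: prefix_lines_literal PREFIX [FILE...]
prefix_lines_literal() {
    # Backslash-escape the characters that are special in the replacement
    # part of s/^/.../ : the backslash, the / delimiter, and &.
    escaped=$(printf '%s' "$1" | sed 's|[\\/&]|\\&|g')
    shift
    sed "s/^/$escaped/" "$@"
}

printf 'line1\nline2\n' | prefix_lines_literal '/opt/workdir/'
```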
awk '$0="prefix"$0' file > new_file
In awk the default action is '{print $0}' (i.e. print the whole line), so the above is equivalent to:
awk '{print "prefix"$0}' file > new_file
With Perl (in place replacement):
perl -pi -e 's/^/prefix/' file
You can use Vim in Ex mode:
ex -sc '%s/^/prefix/|x' file
% select all lines
s replace
x save and close
If your prefix is a bit complicated, just put it in a variable:
prefix=path/to/file/
Then, you pass that variable and let awk deal with it:
awk -v prefix="$prefix" '{print prefix $0}' input_file.txt
Here is a highly readable one-liner using the ts command from moreutils. Note that the final tr removes every space, so it only suits lines that contain no spaces themselves:
$ cat file | ts prefix | tr -d ' '
And how it's derived step by step:
# Step 0. create the file
$ cat file
line1
line2
line3
# Step 1. add prefix to the beginning of each line
$ cat file | ts prefix
prefix line1
prefix line2
prefix line3
# Step 2. remove spaces in the middle
$ cat file | ts prefix | tr -d ' '
prefixline1
prefixline2
prefixline3
If you have Perl:
perl -pe 's/^/PREFIX/' input.file
Using & (the whole part of the input that was matched by the pattern):
cat in.txt | sed -e "s/.*/prefix&/" > out.txt
OR using back references:
cat in.txt | sed -e "s/\(.*\)/prefix\1/" > out.txt
Using the shell:
#!/bin/bash
prefix="something"
file="file"
# IFS= keeps leading and trailing whitespace of each line
while IFS= read -r line
do
    printf '%s%s\n' "$prefix" "$line"
done < "$file" > newfile
mv newfile "$file"
While I don't think pierr had this concern, I needed a solution that would not delay output from the live "tail" of a file, since I wanted to monitor several alert logs simultaneously, prefixing each line with the name of its respective log.
Unfortunately, sed, cut, etc. introduced too much buffering and kept me from seeing the most current lines. Steven Penny's suggestion to use the -s option of nl was intriguing, and testing proved that it did not introduce the unwanted buffering that concerned me.
There were a couple of problems with using nl, though, related to the desire to strip out the unwanted line numbers (even if you don't care about the aesthetics of it, there may be cases where using the extra columns would be undesirable). First, using "cut" to strip out the numbers re-introduces the buffering problem, so it wrecks the solution. Second, using "-w1" doesn't help, since this does NOT restrict the line number to a single column - it just gets wider as more digits are needed.
It isn't pretty if you want to capture this elsewhere, but since that's exactly what I didn't need to do (everything was being written to log files already, I just wanted to watch several at once in real time), the best way to lose the line numbers and have only my prefix was to start the -s string with a carriage return (CR or ^M or Ctrl-M). So for example:
#!/bin/ksh
# Monitor the widget, framas, and dweezil
# log files until the operator hits <enter>
# to end monitoring.
PGRP=$$
for LOGFILE in widget framas dweezil
do
(
tail -f $LOGFILE 2>&1 |
nl -s"^M${LOGFILE}> "
) &
sleep 1
done
read KILLEM
kill -- -${PGRP}
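If the carriage-return trick is not acceptable, a plain shell loop is another candidate for prefixing a live stream, since it handles one line at a time. This is a sketch that should not add buffering of its own, though I have not measured it against nl; prefix_stream is a made-up name.

```shell
# Hypothetical helper: prefix each input line as soon as it arrives.
# usage: tail -f some.log | prefix_stream "some.log> "
prefix_stream() {
    # read returns after every complete line and printf writes it out
    # immediately, one line per iteration.
    while IFS= read -r line; do
        printf '%s%s\n' "$1" "$line"
    done
}

printf 'disk full\nlink down\n' | prefix_stream 'widget> '
```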
Using ed:
ed infile <<'EOE'
,s/^/prefix/
wq
EOE
This substitutes, for each line (,), the beginning of the line (^) with prefix. wq saves and exits.
If the replacement string contains a slash, we can use a different delimiter for s instead:
ed infile <<'EOE'
,s#^#/opt/workdir/#
wq
EOE
I've quoted the here-doc delimiter EOE ("end of ed") to prevent parameter expansion. In this example, it would work unquoted as well, but it's good practice to prevent surprises if you ever have a $ in your ed script.
Here's a wrapped up example using the sed approach from this answer:
$ cat /path/to/some/file | prefix_lines "WOW: "
WOW: some text
WOW: another line
WOW: more text
prefix_lines
function show_help()
{
IT=$(cat <<EOF
Usage: PREFIX {FILE}
e.g.
cat /path/to/file | prefix_lines "WOW: "
WOW: some text
WOW: another line
WOW: more text
EOF
)
echo "$IT"
exit
}
# Require a prefix
if [ -z "$1" ]
then
show_help
fi
# Check if input is from stdin or a file
FILE=$2
if [ -z "$2" ]
then
# If no stdin exists
if [ -t 0 ]; then
show_help
fi
FILE=/dev/stdin
fi
# Now prefix the output
PREFIX=$1
sed -e "s/^/$PREFIX/" "$FILE"
You can also achieve this using the backreference technique
sed -i.bak 's/\(.*\)/prefix\1/' foo.txt
You can also use with awk like this
awk '{print "prefix"$0}' foo.txt > tmp && mv tmp foo.txt
Using Pythonize (pz):
pz '"prefix"+s' <filename
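Simple solution using a for loop on the command line with bash. Note that $(cat ...) splits on any whitespace, so this only works when the lines contain no spaces or tabs: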
for i in $(cat yourfile.txt); do echo "prefix$i"; done
Save the output to a file:
for i in $(cat yourfile.txt); do echo "prefix$i"; done > yourfilewithprefixes.txt
You can do it using AWK
echo example| awk '{print "prefix"$0}'
or
awk '{print "prefix"$0}' file.txt > output.txt
For suffix: awk '{print $0"suffix"}'
For prefix and suffix: awk '{print "prefix"$0"suffix"}'
For people on BSD/OSX systems there's a utility called lam, short for laminate. lam -s prefix file will do what you want. I use it in pipelines, e.g.:
find . -type f -exec lam -s "{}: " "{}" \; | fzf
...which will find all files, exec lam on each of them, giving each file a prefix of its own filename. (And pump the output to fzf for searching.)
If you need to prepend a text at the beginning of each line that has a certain string, try following. In the following example, I am adding # at the beginning of each line that has the word "rock" in it.
sed -i -e 's/^.*rock.*/#&/' file_name
SETLOCAL ENABLEDELAYEDEXPANSION
set YourPrefix=blabla
set YourPath=C:\path
for /f "tokens=*" %%a in (!YourPath!\longfile.csv) do (echo !YourPrefix!%%a) >> !YourPath!\Archive\output.csv
