Delete the first five characters on any line of a text file in Linux with sed - linux

I need a one-liner to remove the first five characters on any line of a text file. How can I do that with sed?

Use cut:
cut -c6-
This prints each line of the input starting at column 6 (the first column is 1).

sed 's/^.....//'
means: substitute ("s") the beginning of the line ("^") followed by five characters (one "." each) with nothing.
There are more compact or flexible ways to write this using sed or cut.

sed 's/^.\{0,5\}//' file.dat

awk '{print substr($0,6)}' file
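
As a quick check of how these variants differ, the sketch below runs them on a small made-up input (the three sample lines are assumptions, not from the question). The main difference should show up on lines shorter than five characters: the fixed five-dot sed pattern leaves them untouched, while cut and awk reduce them to empty lines.

```shell
# Made-up sample input; the middle line is shorter than five characters.
input=$(printf '%s\n' 'abcdefgh' 'xyz' '1234567')

# cut keeps everything from column 6 onward.
cut_out=$(printf '%s\n' "$input" | cut -c6-)

# sed removes exactly five leading characters, so 'xyz' does not match.
sed_out=$(printf '%s\n' "$input" | sed 's/^.....//')

# awk prints the substring starting at character 6.
awk_out=$(printf '%s\n' "$input" | awk '{print substr($0,6)}')

printf 'cut:\n%s\nsed:\n%s\nawk:\n%s\n' "$cut_out" "$sed_out" "$awk_out"
```

If short lines should also be emptied by sed, a bounded repetition such as \{0,5\} in place of the five dots is one way to get there.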

Related

Using Sed or Awk to divide a file into two based on whether a line contains a numeric value

I have used sed and awk for a little while now, but I am having trouble with the problem below. I am hoping an experienced sed/awk guru can help.
I have a file where some lines have numbers and some lines do not, like:
afjjdjfj.uihuihi
trfg.rtyhd
0rtgfd.tjbghhh
hbvfd4.rtgbvdgf
00fhfg.fdrgf
rtygfd.ijhniuh
etc.
I would like to have exactly two files out of this one, where every line is represented in one of the two files (none are deleted).
One should contain all lines with any digit 0-9 in them, so given the above file the result would be:
0rtgfd.tjbghhh
hbvfd4.rtgbvdgf
00fhfg.fdrgf
and the other should contain the remaining lines, which have no digits 0-9 in them, so given the above file it would be:
afjjdjfj.uihuihi
trfg.rtyhd
rtygfd.ijhniuh
I've tried different strategies in both sed and awk and nothing is giving me exactly what I need.
What would be the best sed or awk one liner to solve this problem?
Thank you for your time,
Tom
Easily with Awk:
awk '/[0-9]/{print > "file1"; next} {print > "file2"}' inputfile
With single GNU sed command:
sed -ne '/[0-9]/w with_digits.txt' -e '//!w no_digits.txt' input
Results:
> cat no_digits.txt
afjjdjfj.uihuihi
trfg.rtyhd
rtygfd.ijhniuh
> cat with_digits.txt
0rtgfd.tjbghhh
hbvfd4.rtgbvdgf
00fhfg.fdrgf
w filename: Write the pattern space to filename.
If you don't mind running twice over the input, you can use just grep:
grep '[0-9]' input > with_digits
grep -v '[0-9]' input > without_digits
With Perl (needs the File::Slurp module):
perl -MFile::Slurp -ne '/\d/ ? append_file("digits.txt",$_) : append_file("no_digits.txt",$_)' input.txt
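
To try the single-pass awk approach end to end, here is a minimal sketch. It assumes a scratch directory from mktemp, and the variable names hit and miss are made up for illustration. The output file names are passed in as awk variables, which sidesteps quoting them inside the awk program.

```shell
# Scratch directory and sample input (lines taken from the question).
tmp=$(mktemp -d)
printf '%s\n' 'afjjdjfj.uihuihi' '0rtgfd.tjbghhh' 'hbvfd4.rtgbvdgf' 'trfg.rtyhd' > "$tmp/input"

# Lines containing a digit go to one file, all others to the second.
awk -v hit="$tmp/with_digits" -v miss="$tmp/no_digits" \
    '/[0-9]/ { print > hit; next } { print > miss }' "$tmp/input"

# Capture both results before cleaning up the scratch directory.
with_digits=$(cat "$tmp/with_digits")
no_digits=$(cat "$tmp/no_digits")
rm -rf "$tmp"

printf 'with digits:\n%s\nwithout digits:\n%s\n' "$with_digits" "$no_digits"
```

Every input line lands in exactly one of the two files, which is the "none are deleted" requirement from the question.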

Replace first six commas for each line in a text file

I want to replace the first six , for each line in a text file using sed or something similar in linux.
There are more than six , on each line, but only the first six should be replaced by |.
Sed doesn't really support the notion of "the first n occurrences", only "the n-th occurrence". GNU sed adds an extension for "replace all matches from the n-th on" (a number combined with g), which is not what you want here. To replace the first six commas, call the s command six times:
sed 's/,/|/;s/,/|/;s/,/|/;s/,/|/;s/,/|/;s/,/|/' infile
If, however, you know that there are no | in the file and you have GNU sed, you can do this:
sed 's/,/|/g;s/|/,/7g' infile
This replaces all commas with pipes, then turns the pipes from the 7th on back to commas.
If you do have pipes beforehand, you can turn them into something that you know isn't in the string first:
sed 's/|/~~/g;s/,/|/g;s/|/,/7g;s/~~/|/g' infile
This makes all | into ~~ first, then all , into |, then the | from the 7th on back into ,, and finally the ~~ back into |.
Testing on this input file:
,,,,,,X,,,,,,
,,,|,,,|,,,|,,,|
the first and third command result in
||||||X,,,,,,
||||||||,,,|,,,|
The second one would fail on the second line because there are already pipe characters.
This might work for you (GNU sed):
sed 'y/,/\n/;s/\n/,/7g;y/\n/|/' file
Translate all ,'s to \n's, then replace from the seventh \n to the end of line by ,'s, then replace the remaining \n's by |'s.
Note the form sed 's/old/new/<number>':
Here <number> selects which match is replaced: only the <number>-th occurrence on each line changes, so this alone does not replace the first <number> occurrences.
You can replace <number> with g to apply the pattern to all occurrences.
You can try this sed,
sed -r ':loop; s/^([^,]*),/\1|/g; /^([^|]*\|){6}/t; b loop' file
(OR)
sed ':loop; s/^\([^,]*\),/\1|/g; /^\([^|]*|\)\{6\}/t; b loop' file
Test:
$ cat file
a,b,c,d,e,f,g,h,i,j,k
$ sed -r ':loop; s/^([^,]*),/\1|/g; /^([^|]*\|){6}/t; b loop' file
a|b|c|d|e|f|g,h,i,j,k
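Note: This works only if the line contains no pipe (|) characters beforehand. It also assumes at least six commas per line; with fewer, neither the substitution nor the six-pipe test ever succeeds and the loop never terminates.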
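
As an alternative sketch that does not depend on GNU sed extensions, awk's sub() can be called six times in a loop. This is not one of the answers above, just an illustration of the same "replace only the first six" idea. It should cope with lines that already contain pipes and with lines that have fewer than six commas (the 'a,b' line below is a made-up test case).

```shell
# Test lines from the answer above, plus one made-up line with fewer than six commas.
input=$(printf '%s\n' ',,,,,,X,,,,,,' ',,,|,,,|,,,|,,,|' 'a,b')

# sub() replaces only the first remaining comma, so six calls replace at most six.
result=$(printf '%s\n' "$input" | awk '{ for (i = 0; i < 6; i++) sub(/,/, "|") } 1')

printf '%s\n' "$result"
```

Once a line runs out of commas the remaining sub() calls simply do nothing, so there is no loop-termination concern.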

How to remove blank space between some words using sed?

I want to replace the characters between specific words on each line of a multi-line file. For example:
first second third | first line
first second third | second line
first second third | third line
first second third | forth line
....
I want to replace the characters between third and first/second/third/forth etc. using sed or vi in Linux.
If this question is already answered, can you please provide me the link?
Thanks!
You can use the following:
sed 's/ |.[^a-z]*//g' text.txt
or if you want to have a space after 'third':
sed 's/ |.[^a-z]*/ /g' text.txt
Remember the -i flag to make the changes permanent:
sed -i 's/\ /whatever/g' ej.txt
-i: edit in place, so changes are made directly in the file
s: the substitute command
'\ ': matches a blank space
g: replace all matches on each line
Try this
sed 's/second third[^a-zA-Z]*/second third/g' file
It replaces everything between "third" and the first letter that follows it. If it works, add -i to modify the original file.
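
A small sketch of the first suggestion (the variant that keeps a space after "third"), run on two of the sample lines from the question. It only shows what the pattern does on this particular input.

```shell
# Two sample lines from the question.
input=$(printf '%s\n' 'first second third | first line' 'first second third | second line')

# Replace " |" plus the one character after it with a single space.
result=$(printf '%s\n' "$input" | sed 's/ |.[^a-z]*/ /g')

printf '%s\n' "$result"
```

Because the . after the pipe matches any character, input such as 'third |first' (no space after the pipe) would lose its 'f'. That behaviour comes from the original pattern.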

Removing string between two symbol in line

I am trying to remove the string between two symbols on each line of a CSV file. Here is my sample file:
1.1.1.1,A-B:,awef.C.D.E
1.1.1.2,A-B:,few.C.D.E
1.1.1.3,A-B:,dfs.C.D
1.1.1.4,A-B:,few.C.D
1.1.1.5,A-B:,fdsferger.C.D.E
1.1.1.6,A-B:,wef.C.D
1.1.1.7,A-B:,jty.C.D.E
The output would be like this :
1.1.1.1,A-B:,C.D.E
1.1.1.2,A-B:,C.D.E
1.1.1.3,A-B:,C.D
1.1.1.4,A-B:,C.D
1.1.1.5,A-B:,C.D.E
1.1.1.6,A-B:,C.D
1.1.1.7,A-B:,C.D.E
Any way I can achieve it?
The following awk command can do this:
awk 'BEGIN{FS=OFS=","}{sub(/[^.]*\./,"",$3);print}'
It divides each line into the three comma-separated fields, then removes the initial part of the third field, up to and including the first . character (the dot is escaped so that it matches only a literal dot).
Then it simply outputs them again.
See the following transcript for a demonstration:
pax> echo '1.1.1.1,A-B:,awef.C.D.E
1.1.1.2,A-B:,few.C.D.E
1.1.1.3,A-B:,dfs.C.D
1.1.1.4,A-B:,few.C.D
1.1.1.5,A-B:,fdsferger.C.D.E
1.1.1.6,A-B:,wef.C.D
1.1.1.7,A-B:,jty.C.D.E' | awk 'BEGIN{FS=OFS=","}{sub(/[^.]*\./,"",$3);print}'
1.1.1.1,A-B:,C.D.E
1.1.1.2,A-B:,C.D.E
1.1.1.3,A-B:,C.D
1.1.1.4,A-B:,C.D
1.1.1.5,A-B:,C.D.E
1.1.1.6,A-B:,C.D
1.1.1.7,A-B:,C.D.E
Here is an awk that should do:
awk '{sub(/:,[^.]*\./,":,")}1' file
1.1.1.1,A-B:,C.D.E
1.1.1.2,A-B:,C.D.E
1.1.1.3,A-B:,C.D
1.1.1.4,A-B:,C.D
1.1.1.5,A-B:,C.D.E
1.1.1.6,A-B:,C.D
1.1.1.7,A-B:,C.D.E
You can use sed also
sed -r 's/(.*:,)([a-z]*\.)(.*)/\1\3/' file
(or)
sed -r 's/:,[^.]+\./:,/' file
This might work for you (GNU sed):
sed 's/^\(.*,\)[^.]*\./\1/' file
Use greedy matching to capture all the columns but the last, then delete up to and including the first . in the last column.
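
One edge case worth checking with any of these is a last field that contains no dot at all. The sketch below uses the field-based awk approach with the dot escaped. The third input line is made up for this purpose, and under that assumption it should pass through unchanged.

```shell
# Two lines from the question plus one made-up line whose third field has no dot.
input=$(printf '%s\n' '1.1.1.1,A-B:,awef.C.D.E' '1.1.1.3,A-B:,dfs.C.D' '1.1.1.8,A-B:,nodot')

# Strip the third field up to and including its first literal dot.
result=$(printf '%s\n' "$input" |
    awk 'BEGIN { FS = OFS = "," } { sub(/^[^.]*\./, "", $3); print }')

printf '%s\n' "$result"
```

With an unescaped . at the end of the pattern, the regex would match any character there, and a dot-free third field would be emptied instead of left alone.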

Awk or shell script for executing following program

mansa, amit, janani ,[rakesh]
aruna,mahesh,,prathiksha
This is my input.
I need a shell script or a awk command that gives me output in following manner
mansa
amit
janani
rakesh
aruna
mahesh
prathiksha
The script should remove all commas and brackets.
I tried this
awk -F "\[\][,]+" '{for(i=1;i<=NF;i++){print $i}}'
but it's printing one extra line after each record.
Easier with grep:
$ grep -o '[a-z]\+' file
mansa
amit
janani
rakesh
aruna
mahesh
prathiksha
Another option might be tr:
tr -cs '[:alpha:]' '[\n*]' < file
Although it would create empty lines if there is leading whitespace, which could then be filtered out:
tr -cs '[:alpha:]' '[\n*]' < file | awk NF
Assuming you only want to remove the square brackets and split the items on the comma delimiters, you could use the following:
perl -pe 's/,+/,/g ; s/[\[\]]//g ; s/\s*,\s*/\n/g' foo.txt
The reason I recommend this approach is in the event that your named values have numbers in them or other non-alpha characters you may want to preserve.
The perl expression above contains three substitutions. The first reduces runs of commas to a single comma (to avoid empty values between commas). The second removes the square brackets. The third splits the values by replacing each comma, along with any whitespace on either side, with a newline.
Output would be as follows:
mansa
amit
janani
rakesh
aruna
mahesh
prathiksha
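
For a self-contained check of the tr variant, the sketch below feeds the two input lines from the question through tr and then drops any empty lines with awk NF. Like the grep answer, it assumes the names consist only of alphabetic characters.

```shell
# The two input lines from the question.
input=$(printf '%s\n' 'mansa, amit, janani ,[rakesh]' 'aruna,mahesh,,prathiksha')

# Turn every run of non-letters into one newline, then drop empty lines.
names=$(printf '%s\n' "$input" | tr -cs '[:alpha:]' '[\n*]' | awk NF)

printf '%s\n' "$names"
```

The -c option complements the set (everything that is not a letter) and -s squeezes each run of replacements into one newline, so the doubled comma does not produce an empty entry.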
