I have an output like this:
a/foo bar /
b/c/foo sth /xyz
cc/bar ghj /axz/byz
What I want is just this line:
a/foo bar /
To be more clear, I want only the lines ending with a specific string: I want to grep lines that have a / character in their last column.
You can use $ like this:
$ grep '/$' file
a/foo bar /
As $ stands for end of line, /$ matches those lines whose last character is a /.
grep '/$'
The slash is not a special character for grep, and $ anchors the match at the end of a line.
You can also match only the lines whose last column is a lone slash (as long as it is not the only column in the line).
I assumed that the last column of a line is a string preceded by one or more whitespace characters and followed by nothing else. This assumption breaks down when the line has only one column, because a single column needs no whitespace in front of it to mark it as the last one.
By enabling Perl regular expressions (-P):
grep -P '\s+/$'
\s matches any whitespace character (space, tab, newline)
+ matches the preceding element one or more times
$ matches the end of the line
OR refer to Character Classes and Bracket Expressions
grep '[[:space:]]\+/$'
OR
grep '[[:blank:]]\+/$'
‘[:blank:]’ Blank characters: space and tab.
‘[:space:]’ Space characters: in the ‘C’ locale, this is tab, newline,
vertical tab, form feed, carriage return, and space. It is a synonym for '\s'.
As @fedorqui pointed out, the backslash after ]] is needed because a bare + in a
basic regular expression is a literal plus sign; \+ is the one-or-more operator. Thanks for the explanations.
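To see how the two patterns differ, here is a small sketch using the sample lines from the question plus one hypothetical single-column line (dir/) that is not in the original data. It uses grep -E so that + needs no backslash; the variable names are made up for illustration.

```shell
# Sample input: the three lines from the question plus a hypothetical
# line whose only column ends in a slash.
sample=$(printf '%s\n' 'a/foo bar /' 'b/c/foo sth /xyz' 'cc/bar ghj /axz/byz' 'dir/')

# Any line whose last character is a slash.
ends_with_slash=$(printf '%s\n' "$sample" | grep '/$')

# Only lines whose last column is a lone slash preceded by blanks.
lone_slash=$(printf '%s\n' "$sample" | grep -E '[[:blank:]]+/$')

printf 'ends with /:\n%s\n' "$ends_with_slash"
printf 'lone / column:\n%s\n' "$lone_slash"
```

The first pattern also reports dir/, while the second reports only a/foo bar /, which is the single-column limitation described above.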
Sorry if the Perl answer is wrong; I have never used or learned Perl regular expressions, but I hope it helps you find a slash in the last column. You may want to read these for more information on matching whitespace followed by a slash at the end of a line:
grep with regexp: whitespace doesn't match unless I add an assertion
Regular expressions in Perl
How to change delimiter from current comma (,) to semicolon (;) inside .txt file using linux command?
Here is my ME_1384_DataWarehouse_*.txt file:
Data Warehouse,ME_1384,Budget for HW/SVC,13/05/2022,10,9999,13/05/2022,27,08,27,08
Data Warehouse,ME_1384,Budget for HW/SVC,09/05/2022,10,9999,09/05/2022,45,58,45,58
Data Warehouse,ME_1384,Budget for HW/SVC,25/05/2022,10,9999,25/05/2022,7,54,7,54
Data Warehouse,ME_1384,Budget for HW/SVC,25/05/2022,10,9999,25/05/2022,7,54,7,54
It is very important that the values in the last two columns are numbers with 2 decimal places, so in the first row, for example, each of the last 2 values is "27,08".
That could be the main reason why the delimiter couldn't be changed properly.
I tried with:
sed 's/,/;/g' ME_1384_DataWarehouse_*.txt
and every comma was changed, including the ones inside the values of the last 2 columns.
Can anyone help me out with this issue?
With sed you can replace the nth occurrence of a certain lookup string. Example:
$ sed 's/,/;/4' file
will replace the 4th comma with a semicolon.
So, if you know you have 11 fields (10 commas), you can do
$ sed 's/,/;/g;s/;/,/10;s/;/,/8' file
Example:
$ seq 1 11 | paste -sd, | sed 's/,/;/g;s/;/,/10;s/;/,/8'
1;2;3;4;5;6;7;8,9;10,11
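As a quick check against the data from the question, the same command can be run on one of the sample rows. This is only a sketch: it assumes every row has exactly 11 comma-separated pieces, and the variable names are invented for the example.

```shell
# One row copied from the question (11 pieces, 10 commas).
row='Data Warehouse,ME_1384,Budget for HW/SVC,13/05/2022,10,9999,13/05/2022,27,08,27,08'

# Turn every comma into a semicolon, then turn the 10th and the 8th
# semicolon back into commas (the two decimal separators).
converted=$(printf '%s\n' "$row" | sed 's/,/;/g;s/;/,/10;s/;/,/8')

printf '%s\n' "$converted"
```

This prints Data Warehouse;ME_1384;Budget for HW/SVC;13/05/2022;10;9999;13/05/2022;27,08;27,08. The 10th semicolon has to be restored before the 8th only for readability; since the higher position is handled first, the count for the lower one is unaffected either way.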
Your question is somewhat unclear, but if you are trying to say "don't change the last comma, or the third-to-last one", a solution to that might be
perl -pi~ -e 's/,(?![^,]+(?:,[^,]+,[^,]+)?$)/;/g' ME_1384_DataWarehouse_*.txt
Perl in isolation does not perform any loop over the input lines, but the -p option says to loop over input one line at a time, like sed, and print every line (there is also -n to simulate the behavior of sed -n); the -i~ says to modify the file, but save the original with a tilde added to its file name as a backup; and the regex uses a negative lookahead (?!...) to protect the two fields you want to exempt from the replacement. Lookaheads are a modern regex feature which isn't supported by older tools like sed.
Once you are satisfied with the solution, you can remove the ~ after -i to disable the generation of backups.
You can do this with awk:
awk -F, 'BEGIN {OFS=";"} {a=$NF;NF-=1; printf "%s,%s\n",$0,a} ' input_file
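This should work with most awk versions (do not count on Solaris standard awk).
The idea is to store the last field of the row in a variable, decrease the number of fields, and then print the rebuilt line (joined with the new delimiter), a comma, and the stored last field. Note that this keeps only the final comma; the comma inside the second-to-last value (the 8th comma in the sample rows) is still turned into a semicolon.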
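If both decimal commas need to survive, one possible variation is to rebuild the line field by field and choose the separator per position. This is a sketch rather than the answer's own method, and it assumes the two decimal values are always the last four comma-separated pieces of the row.

```shell
# One row copied from the question.
row='Data Warehouse,ME_1384,Budget for HW/SVC,13/05/2022,10,9999,13/05/2022,27,08,27,08'

converted=$(printf '%s\n' "$row" | awk -F, '{
    out = $1
    for (i = 2; i <= NF; i++) {
        # The separator in front of piece i is comma number i-1.
        # Keep a comma in front of the last piece and in front of the
        # third-from-last piece (the two decimal commas); use a
        # semicolon everywhere else.
        sep = (i == NF || i == NF - 2) ? "," : ";"
        out = out sep $i
    }
    print out
}')

printf '%s\n' "$converted"
```

Because the positions are counted from NF, this does not depend on the row having exactly 11 pieces, only on the decimals sitting at the end.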
Please help me with a unix command to replace anything between two delimiter positions.
For example: I have multiple files with the header data below, and I want to replace the data between the 9th and 10th * delimiters:
ISA*00* *00* *ZZ*80881 *ZZ*TNC0022 *190115*1237*^*00501*000320089*0*P*|~
My output should look like this:
ISA*00* *00* *ZZ*80881 *ZZ*TNC0022 *190327*1237*^*00501*000320089*0*P*|~
Try this:
perl -pe 's/^((?:[^*]*\*){9})([^*]+)(.*)/${1}190327$3/'
The regexp matches 9 occurrences ({9}) of any run of non-star characters ([^*]*) followed by a star (\*), and stores all of that in the first capture group. The second capture group is at least one non-star character ([^*]+), and the third is the rest of the line.
A matching line gets replaced by the first part ${1}, your new value 190327 and the third part $3.
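The same replacement can also be expressed in terms of fields instead of capture groups. The following awk sketch treats * as the separator and overwrites field 10, which is the text between the 9th and 10th star. The variable names header, updated and val are invented for the example, and the header line is copied as it appears in the question.

```shell
# Header line as shown in the question.
header='ISA*00* *00* *ZZ*80881 *ZZ*TNC0022 *190115*1237*^*00501*000320089*0*P*|~'

# FS is written as the bracket expression [*] so the star is taken
# literally; OFS puts the stars back when awk rebuilds the line.
updated=$(printf '%s\n' "$header" |
    awk -v val=190327 'BEGIN { FS = "[*]"; OFS = "*" } NF >= 10 { $10 = val } 1')

printf '%s\n' "$updated"
```

Lines with fewer than 10 fields pass through unchanged, which mirrors how the perl regex leaves non-matching lines alone.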
I have a .txt file that looks like this (about 400 rows):
lettuceFMnode_1240 J_C7R5_99354_KNKSR3_Oligomycin 81.52
lettuceFMnode_3755 H_C1R3_99940_KNKSF2_Tubulysin 70
lettuceFMnode_17813 G_C4R5_80184_KNKS113774F_Tetronasin 79.57
lettuceFMnode_69469 J_C11R7_99276_KNKSF2_Nystatin 87.27
I want to edit the names in the entire 2nd column so that only the last part will stay (meaning delete anything before that, so in fact leaving what comes after the last _).
I looked into different solutions using a combination of cut and sed, but couldn't understand how the code should be built.
Would appreciate any tips and help!
Thank you!
Here's one way:
perl -pe 's/^\S+\s+\K\S+_//'
For every line of input (-p) we execute some code (-e ...).
The code performs a substitution (s/PATTERN/REPLACEMENT/).
The pattern matches as follows:
^ beginning of string
\S+ 1 or more non-whitespace characters (the first column)
\s+ 1 or more whitespace characters (the space after the first column)
\K do not treat the text matched so far as part of the final match
\S+ 1 or more non-whitespace characters (the second column)
_ an underscore
Because + is greedy (it matches as many characters as possible), \S+_ will match everything up to the last _ in the second column.
Because we used \K, only the rest of the pattern (i.e. the part of the match that lies in the second column) gets replaced.
The replacement string is empty, so the match is effectively removed.
With sed:
sed 's/ [^ ]*_/ /' file
Replace the first space followed by non-space characters ([^ ]*) followed by _ with one space.
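Another option, sketched here with awk, is to edit the second field directly, which also works when the columns are separated by tabs or several spaces. Two rows from the question are used as input; note that awk re-joins the columns with single spaces once a field has been modified.

```shell
# Two rows copied from the question.
sample='lettuceFMnode_1240 J_C7R5_99354_KNKSR3_Oligomycin 81.52
lettuceFMnode_3755 H_C1R3_99940_KNKSF2_Tubulysin 70'

# In field 2, delete everything up to and including the last underscore
# (.* is greedy, so it runs to the final _), then print the line.
trimmed=$(printf '%s\n' "$sample" | awk '{ sub(/.*_/, "", $2) } 1')

printf '%s\n' "$trimmed"
```

The first column keeps its underscore because only $2 is passed to sub().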
If a dash is in the string, "grep -w" is not unique. How can I solve this?
Example:
File1:
football01
football01test
# grep -iw ^football01
football01
File2:
football01
football01-test
# grep -iw ^football01
football01
football01-test
This is the expected and documented behaviour:
-w, --word-regexp
Select only those lines containing matches that form whole words. The test is that the
matching substring must either be at the beginning of the line, or preceded by a non-word
constituent character. Similarly, it must be either at the end of the line or followed
by a non-word constituent character. Word-constituent characters are letters, digits,
and the underscore.
If you add a dash, it terminates your first word as a dash is a "non-word constituent character". If you write the words together, then a word-regexp grep will treat it as one word and not match it.
What exactly is it that you want to do?
If you only want to know if your line is football01 and nothing else, you can do it as
grep -i "^football01$"
If you want to achieve something else, could you please explain what it is.
The -w switch is for word regex. In file1, football01test is a word and in file2 football01 and test are two words separated by a hyphen.
man grep says this for -w
Select only those lines containing matches that form whole
words. The test is that the matching substring must either be at the
beginning of the line, or preceded by a non-word constituent
character. Similarly, it must be either at the end of the
line or followed by a non-word constituent character.
Word-constituent characters are letters, digits, and the underscore.
Since football01 doesn't match football01test as a whole word, you aren't getting that info from grep.
If you were to run grep -i ^football01 file1.txt, you would get both lines.
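If the goal is for a dash not to count as a word boundary, two possible workarounds are sketched below, assuming File2 holds the two lines football01 and football01-test. The first uses -x to require the whole line to match; the second accepts only whitespace or end of line after the name. Which one fits depends on what the real lines look like.

```shell
# Assumed contents of File2 from the question.
file2='football01
football01-test'

# Whole-line match: the line must be exactly football01 (ignoring case).
exact=$(printf '%s\n' "$file2" | grep -ix 'football01')

# Custom boundary: football01 at line start, followed by whitespace
# or by the end of the line, so a following dash no longer qualifies.
boundary=$(printf '%s\n' "$file2" | grep -iE '^football01([[:space:]]|$)')

printf 'exact: %s\nboundary: %s\n' "$exact" "$boundary"
```

Both commands report only football01 for this input.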
In vim I have a line of text like this:
abcdef
Now I want to add an underscore or something else between every letter, so this would be the result:
a_b_c_d_e_f
The only way I know of doing this would be to record a macro like this:
qqa_<esc>lq4@q
Is there a better, easier way to do this?
:%s/\(\p\)\p\@=/\1_/g
The : starts a command.
The % applies the command to every line of the document.
The \(\p\) will match and capture a printable symbol. You could replace \p with \w if you only wanted to match word characters (letters, digits, underscore), for example.
The \p\@= does a lookahead check to make sure that the matched (first) \p is followed by another \p. This second one, i.e., \p\@= does not form part of the match. This is important.
In the replacement part, \1 fills in the matched (first) \p value, and the _ is a literal.
The last flag, g, is the standard "replace every match on the line" flag.
If you want to add _ only between letters you can do it like this:
:%s/\a\zs\ze\a/_/g
Replace \a with some other pattern if you want more than ASCII letters.
To understand how this is supposed to work: :help \a, :help \zs, :help \ze.
Here's a quick and a little more interactive way of doing this, all in normal mode.
With the cursor at the beginning of the line, press:
i_<Esc>x to insert and delete the separator character. (We do this for the side effect.)
gp to put the separator back.
., hold it down until the job is done.
Unfortunately we can't use a count with . here, because it would just paste the separator 'count' times on the spot.
Use positive lookahead and substitute:
:%s/\(.\(.\)\@=\)/\1_/g
This matches any character that is followed by another character (. does not match a line break), so nothing is added after the last character of the line.
:%s/../&:/g
This will add ":" after every two characters, on every line of the file.
Each of the two periods matches any single character, so .. matches two characters at a time.
The "&" in the replacement stands for the whole matched text (here, the two characters).
The character to insert goes right after the "&".
"/g" applies the change to every match on the line.
I haven't figured out how to exclude the end of the line though, with the result being that the characters inserted get tagged onto the end...so that something like:
"c400ad4db63b"
Becomes "c4:00:ad:4d:b6:3b:"
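One way to avoid the trailing separator, sketched here with sed rather than inside Vim, is to do the same substitution and then strip a colon left at the end of the line. Inside Vim, a pattern such as :%s/..\ze./&:/g should have a similar effect, since \ze ends the match before a required following character, but treat that as an untested suggestion.

```shell
# Add : after every pair of characters, then remove the colon that
# ends up at the very end of the line.
mac=$(printf 'c400ad4db63b\n' | sed 's/../&:/g; s/:$//')

printf '%s\n' "$mac"
```

This prints c4:00:ad:4d:b6:3b. The second substitution would also strip a colon that was already at the end of the input, so it assumes the original lines do not end in one.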