Grep group of lines - linux

I'd like to traverse a text file and pull out groups of lines at a time.
In the example below, I'd like to grep all lines below AAA but stop at bbb (i.e. all of the 'xxx' lines).
Thanks
Example:
-------AAA-------
xxx
xxx
xxx
xxx
xxx
-------bbb--------
yyy
yyy
yyy
yyy
------AAA---------
xxx
xxx
xxx
xxx
------bbb--------
yyy

If you don't mind the AAA and bbb lines being included in the output, this should suffice for your example:
$ awk '/AAA/,/bbb/' file
If you don't want the AAA and bbb lines:
$ awk '/bbb/{f=0}/AAA/{f=1;next}f{print}' file
Alternatively, if you have Ruby (1.9+):
$ ruby -0777 -ne 'puts $_.scan(/-+AAA-+(.*?)-+bbb-+/m) ' file
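To see the difference between the two awk commands, here is a small self-contained sketch. The sample data and the variable names (sample, with_markers, without_markers) are made up for illustration; only the two awk programs come from the answer above.

```shell
# Hypothetical sample data modelled on the question.
sample=$(mktemp)
printf '%s\n' '---AAA---' 'x1' 'x2' '---bbb---' 'y1' \
              '---AAA---' 'x3' '---bbb---' 'y2' > "$sample"

# Range pattern: prints every block from an AAA line through the next
# bbb line, marker lines included.
with_markers=$(awk '/AAA/,/bbb/' "$sample")

# Flag variant: f is switched on after an AAA line and off at a bbb line,
# so only the lines strictly between the markers are printed.
without_markers=$(awk '/bbb/{f=0} /AAA/{f=1; next} f' "$sample")

printf '%s\n' "$without_markers"
# prints:
# x1
# x2
# x3
```

The y lines are dropped in both cases because they fall outside any AAA..bbb block.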

Related

shell duplicate spaces in file

Is it possible to remove multiple spaces from a text file and save the changes in the same file using awk or grep?
Input example:
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy
Simply assign $1 to itself. This forces awk to rebuild the line with OFS (a single space by default) between the fields, which collapses each run of spaces into one.
awk '{$1=$1} 1' Input_file
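The awk command above only prints the result; it does not change Input_file. A minimal sketch of saving the changes back to the same file, assuming a temporary file is acceptable (the sample content and the names input and tmp are hypothetical):

```shell
# Hypothetical sample file with repeated spaces between fields.
input=$(mktemp)
printf '%s\n' 'aaa   bbb    ccc' 'ddd      yyyy' > "$input"

# Write the squeezed output to a temporary file and move it over the
# original only if awk succeeded.
tmp=$(mktemp)
awk '{$1=$1} 1' "$input" > "$tmp" && mv "$tmp" "$input"

cat "$input"
# prints:
# aaa bbb ccc
# ddd yyyy
```

Redirecting straight back to the input (awk ... file > file) would truncate the file before awk reads it, which is why the intermediate file is used here.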
EDIT: Since the OP asked how to keep only the leading spaces, try the following.
awk '
match($0,/^ +/){
spaces=substr($0,RSTART,RLENGTH)
}
{
$1=$1
$1=spaces $1
spaces=""
}
1
' Input_file
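As a quick check of the leading-space variant, the following sketch feeds one made-up line through the same logic (the variable names line and result are hypothetical, and the awk program is the one above, only compacted):

```shell
# One hypothetical input line: three leading spaces, uneven gaps inside.
line='   aaa    bbb   ccc'

result=$(printf '%s\n' "$line" | awk '
match($0, /^ +/) { spaces = substr($0, RSTART, RLENGTH) }  # remember the indent
{ $1 = $1; $1 = spaces $1; spaces = "" }                   # squeeze, then restore it
1')

printf '[%s]\n' "$result"
# prints: [   aaa bbb ccc]
```

The brackets in the final printf are only there to make the preserved indent visible.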
Using sed:
sed -i -E 's#[[:space:]]+# #g' input_file
To also remove the space left at the start of a line:
sed -i -E 's#[[:space:]]+# #g; s#^ ##' input_file
Demo:
$ cat test.txt
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy
$ sed -i -E 's#[[:space:]]+# #g' test.txt
$ cat test.txt
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy
$

linux command to delete the last column of csv

How can I write a Linux command to delete the last column of a tab-delimited CSV?
Example input
aaa bbb ccc ddd
111 222 333 444
Expected output
aaa bbb ccc
111 222 333
It is easy to remove the first field instead of the last. So we reverse each line, remove the first field, and then reverse it again.
Here is an example for a comma-separated file:
rev file1 | cut -d "," -f 2- | rev
Replace "file1" and "," with your file name and delimiter accordingly.
You can use cut for this. You specify a delimiter with option -d and then give the field numbers (option -f) you want to have in the output. Each line of the input gets treated individually:
cut -d$'\t' -f 1-6 < my.csv > new.csv
This follows the wording of your question. Your example looks more like you want to strip a column in the middle:
cut -d$'\t' -f 1-3,5-7 < my.csv > new.csv
The $'\t' is a bash notation for the string containing the single tab character.
You can use the command below, which deletes the last column of a tab-delimited CSV regardless of the number of fields:
sed -r 's/(.*)\s+[^[:space:]]+$/\1/'
For example:
echo "aaa bbb ccc ddd 111 222 333 444" | sed -r 's/(.*)\s+[^[:space:]]+$/\1/'
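Since the question is about tab-delimited data, another option is to key the substitution on the tab character itself, so that spaces inside a field are left alone. This is a sketch under that assumption; the sample rows and the names tab, sample and trimmed are hypothetical.

```shell
# A literal tab character, obtained portably via printf.
tab=$(printf '\t')

# Hypothetical two-row, four-column tab-delimited sample.
sample=$(mktemp)
printf 'aaa\tbbb\tccc\tddd\n111\t222\t333\t444\n' > "$sample"

# Delete the last tab on each line and everything after it.
trimmed=$(sed "s/${tab}[^${tab}]*\$//" "$sample")

printf '%s\n' "$trimmed"
# prints (tab-separated):
# aaa	bbb	ccc
# 111	222	333
```

A line with no tab at all is passed through unchanged, because the pattern requires at least one tab.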

grep -v except pattern

I want to use grep -v on a file, but with an exception for one pattern.
This is my file content (test.txt):
a
aaa
bbb
ccc
I want this result:
aaa
bbb
ccc
But cat test.txt | grep -v "a" --exclude="aaa" does not work correctly and returns this:
bbb
ccc
You need to use the word boundary \b, which matches between a word character and a non-word character.
$ grep -v '\ba\b' file
aaa
bbb
ccc
OR
$ grep -v '^a$' file
aaa
bbb
ccc
^ Asserts that we are at the start of a line and $ asserts that we are at the end of a line.
$ grep -w -v "a" test.txt
aaa
bbb
ccc
From the man page
-w, --word-regexp
Select only those lines containing matches that form whole
words.
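The word-based and line-anchored approaches give the same result on the question's file, but they differ once a line contains more than one word. A small sketch with a made-up extra line ('a b') shows this; the file and variable names are hypothetical.

```shell
# Hypothetical sample: the extra line 'a b' contains "a" as a whole word.
sample=$(mktemp)
printf '%s\n' 'a' 'aaa' 'a b' 'bbb' > "$sample"

# -w: drop every line that contains "a" as a whole word.
word_filtered=$(grep -v -w 'a' "$sample")

# ^a$: drop only lines that consist of exactly "a".
line_filtered=$(grep -v '^a$' "$sample")

printf 'with -w:\n%s\n' "$word_filtered"
# with -w:
# aaa
# bbb
printf 'with ^a$:\n%s\n' "$line_filtered"
# with ^a$:
# aaa
# a b
# bbb
```

Which one is right depends on whether a line such as 'a b' should count as a match.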

How to replace the character I want in a line

1 aaa bbb aaa
2 aaa ccccccccc aaa
3 aaa xx aaa
How can I replace the second aaa with yyy on each line, like this?
1 aaa bbb yyy
2 aaa ccccccccc yyy
3 aaa xx yyy
Issuing the following command will solve your problem.
:%s/\(aaa.\{-}\)aaa/\1yyy/g
Another way would be with \zs and \ze, which mark the beginning and end of a match in a pattern. So you could do:
:%s/aaa.*\zsaaa\ze/yyy
In other words, find "aaa" followed by anything and then another "aaa", and replace that with "yyy".
If you have three "aaa"s on a line, this won't work, though, and you should use \{-} instead of *. (See :h non-greedy)
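Outside Vim, the same replacement can be sketched with sed, whose s command accepts a number as a flag to target the Nth match on a line. This is a different tool from the Vim commands above, shown only as an alternative; the sample file and variable names are hypothetical.

```shell
# The three lines from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' '1 aaa bbb aaa' '2 aaa ccccccccc aaa' '3 aaa xx aaa' > "$sample"

# The trailing 2 tells sed to replace only the second match on each line.
replaced=$(sed 's/aaa/yyy/2' "$sample")

printf '%s\n' "$replaced"
# prints:
# 1 aaa bbb yyy
# 2 aaa ccccccccc yyy
# 3 aaa xx yyy
```

With three occurrences on a line, only the middle one changes, since the flag counts matches from the left.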

Combine file names and content

I have several files like this:
First file is named XXX
1
2
3
Second file is named YYY
4
5
6
I would like to write content and the file names to a separate file that would look like this:
1 XXX
2 XXX
3 XXX
4 YYY
5 YYY
6 YYY
Can someone suggest a way to do this?
awk '{print $0,FILENAME}' file1 file2
Or with Ruby (1.9+):
$ ruby -ne 'puts "#{$_.chomp} #{ARGF.filename}"' file1 file2
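As a worked example of the awk approach, the sketch below recreates the two files from the question in a temporary directory (the directory and the variable names workdir and combined are hypothetical) and captures the combined output.

```shell
# Recreate the question's files XXX and YYY in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 1 2 3 > "$workdir/XXX"
printf '%s\n' 4 5 6 > "$workdir/YYY"

# Run awk from inside the directory so FILENAME is the bare name, not a path.
combined=$(cd "$workdir" && awk '{print $0, FILENAME}' XXX YYY)

printf '%s\n' "$combined"
# prints:
# 1 XXX
# 2 XXX
# 3 XXX
# 4 YYY
# 5 YYY
# 6 YYY
```

FILENAME holds whatever was passed on the command line, so calling awk with a path such as dir/XXX would put that whole path in the output.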
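Without further explanation of what you actually need, this should work:
for file in *
do
    [ "$file" = outfile ] && continue    # do not read the output file itself
    while IFS= read -r line
    do
        printf '%s %s\n' "$line" "$file"    # content first, then the file name
    done < "$file" >> outfile
done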
