How to filter data out of tabulated stdout stream in Bash?

How to filter data out of tabulated stdout stream in Bash? - linux

Here's what output looks like, basically:
? RESTRequestParamObj.cpp
? plugins/dupfields2/_DupFields.cpp
? plugins/dupfields2/_DupFields.h
I need to get the filenames from second column and pass them to rm. There's AWK script that goes like awk '{print $2}' but I was wondering if there's another solution.

If you have spaces between the ? and the filename then:
cut -c9-
If they're tabs then:
cut -f2

Placed your output in file
$> cat ./text
? RESTRequestParamObj.cpp
? plugins/dupfields2/_DupFields.cpp
? plugins/dupfields2/_DupFields.h
Edit it with sed
$> cat ./text | sed -r -e 's/(\?[\ \t]*)(.*)/\2/g'
RESTRequestParamObj.cpp
plugins/dupfields2/_DupFields.cpp
plugins/dupfields2/_DupFields.h
Sed in here is matching 2 parts of line -
? with tabs or spaces
Other characters until the end f the line
And then it changes whole line only with second part.

This might work for you:
echo "? RESTRequestParamObj.cpp" | sed -e 's/^\S\+/rm /' | sh
or using GNU sed
echo "? RESTRequestParamObj.cpp"| sed -r 's/^\S+/rm /e'

bash only solution, assuming your output comes from stdin:
while read line; do echo ${line##* }; done

use cut/perl instead
cut -f2 -t'\t'|xargs rm -rf
<your output>|perl -ne '#cols = split /\t/; print $cols[1]'|xargs rm -rf

Related

How do you change column names to lowercase with linux and store the file as it is?

I am trying to change the column names to lowercase in a csv file. I found the code to do that online but I dont know how to replace the old column names(uppercase) with new column names(lowercase) in the original file. I did something like this:
$cat head -n1 xxx.csv | tr "[A-Z]" "[a-z]"
But it simply just prints out the column names in lowercase, which is not enough for me.
I tried to add sed -i but it did not do any good. Thanks!!

Using awk (readability winner) :
concise way:
awk 'NR==1{print tolower($0);next}1' file.csv
or using ternary operator:
awk '{print (NR==1) ? tolower($0): $0}' file.csv
or using if/else statements:
awk '{if (NR==1) {print tolower($0)} else {print $0}}' file.csv
To change the file for real:
awk 'NR==1{print tolower($0);next}1' file.csv | tee /tmp/temp
mv /tmp/temp file.csv
For your information, sed using the in place edit switch -i do the same: it use a temporary file under the hood.
You can check this by using :
strace -f -s 800 sed -i'' '...' file

Using perl:
perl -i -pe '$_=lc() if $.==1' file.csv
It replace the file on the fly with -i switch

You can use sed to tell it to replace the first line with all lower-case and then print the rest as-is:
sed '1s/.*/\L&/' ./xxx.csv
Redirect the output or use -i to do an in-place edit.
Proof of Concept
$ echo -e "COL1,COL2,COL3\nFoO,bAr,baZ" | sed '1s/.*/\L&/'
col1,col2,col3
FoO,bAr,baZ

How to apply my sed command to some lines of all my files?

I've 95 files that looks like :
2019-10-29-18-00/dev/xx;512.00;0.4;/var/x/xx/xxx
2019-10-29-18-00/dev/xx;512.00;0.68;/xx
2019-10-29-18-00/dev/xx;512.00;1.84;/xx/xx/xx
2019-10-29-18-00/dev/xx;512.00;80.08;/opt/xx/x
2019-10-29-18-00/dev/xx;20480.00;83.44;/var/x/x
2019-10-29-18-00/dev/xx;3584.00;840.43;/var/xx/x
2019-10-30-00-00/dev/xx;2048.00;411.59;/
2019-10-30-00-00/dev/xx;7168.00;6168.09;/usr
2019-10-30-00-00/dev/xx;3072.00;1036.1;/var
2019-10-30-00-00/dev/xx;5120.00;348.72;/tmp
2019-10-30-00-00/dev/xx;20480.00;2033.19;/home
2019-10-30-12-00;/dev/xx;5120.00;348.72;/tmp
2019-10-30-12-00;/dev/hd1;20480.00;2037.62;/home
2019-10-30-12-00;/dev/xx;512.00;0.43;/xx
2019-10-30-12-00;/dev/xx;3584.00;794.39;/xx
2019-10-30-12-00;/dev/xx;512.00;0.4;/var/xx/xx/xx
2019-10-30-12-00;/dev/xx;512.00;0.68;/xx
2019-10-30-12-00;/dev/xx;512.00;1.84;/var/xx/xx
2019-10-30-12-00;/dev/xx;512.00;80.08;/opt/xx/x
2019-10-30-12-00;/dev/xx;20480.00;83.44;/var/xx/xx
2019-10-30-12-00;/dev/x;3584.00;840.43;/var/xx/xx
For some lines I've 2019-10-29-18-00/dev and for some other lines, I've 2019-10-30-12-00;/dev/
I want to add the ; before the /dev/ where it is missing, so for that I use this sed command :
sed 's/\/dev/\;\/dev/'
But How I can apply this command for each lines where the ; is missing ? I try this :
for i in $(cat /home/xxx/xxx/xxx/*.txt | grep -e "00/dev/")
do
sed 's/\/dev/\;\/dev/' $i > $i
done
But it doesn't work... Can you help me ?

Could you please try following with GNU awkif you are ok with it.
awk -i inplace '/00\/dev\//{gsub(/00\/dev\//,"/00;/dev/")} 1' *.txt
sed solution: Tested with GNU sed for few files and it worked fine.
sed -i.bak '/00\/dev/s/00\/dev/00\;\/dev/g' *.txt

This might work for you (GNU sed & parallel):
parallel -q sed -i 's#;*/dev#;/dev#' ::: *.txt
or if you prefer:
sed -i 's#;*/dev#;/dev#' *.txt

Ignore lines with ;/dev.
sed '/;\/dev/{p;d}; s^/dev^;/dev^'
The /;\/dev/ check if the line has ;/dev. If it has ;/dev do: p - print the current line and d - start from the beginning.
You can use any character with s command in sed. Also, there is no need in escaping \;, just ;.
How I can apply this command for each lines where the ; is missing ? I try this
Don't edit the same file redirecting to the same file $i > $i. Think about it. How can you re-write and read from the same file at the same time? You can't, the resulting file will be in most cases empty, as the > $i will "execute" first making the file empty, then sed $i will start running and it will read an empty file. Use a temporary file sed ... "$i" > temp.txt; mv temp.txt "$i" or use gnu extension -i sed option to edit in place.
What you want to do really is:
grep -l '00/dev/' /home/xxx/xxx/xxx/*.txt |
xargs -n1 sed -i '/;\/dev/{p;d}; s^/dev^;/dev^'
grep -l prints list of files that match the pattern, then xargs for each single one -n1 of the files executes sed which -i edits files in place.

grep for filtering can be eliminated in your case, we can accomplish the task with a single sed command:
for f in $(cat /home/xxx/xxx/xxx/*.txt)
do
[[ -f "$f" ]] && sed -Ei '/00\/dev/ s/([^;])(\/dev)/\1;\2/' "$f"
done

The easiest way would be to adjust your regex so that it's looking a bit wider than '/dev/', e.g.
sed -i -E 's|([0-9])/dev|\1;/dev|'
(note that I'm taking advantage of sed's flexible approach to delimiters on substitute. Also, -E changes the group syntax)
Alternatively, sed lets you filter which lines it handles:
sed -i '/[0-9]\/dev/ s/\/dev/;/dev/'
This uses the same substitution you already have but only applied on lines that match the filter regex

Bash script to remove last three charater in a file name

For ex the file is this:
NBDG6_CDRCCN_4004_-TTNBDG6_CCN_51-140108-1433-802580.00.Blk32768Blk.CCN:00
I want to rename this file to:
NBDG6_CDRCCN_4004_-TTNBDG6_CCN_51-140108-1433-802580.00.Blk32768Blk.CCN

Using ${parameter%word} (Remove matching suffix pattern):
$ echo "$fn"
NBDG6_CDRCCN_4004_-TTNBDG6_CCN_51-140108-1433-802580.00.Blk32768Blk.CCN:00
$ echo "${fn%:*}"
NBDG6_CDRCCN_4004_-TTNBDG6_CCN_51-140108-1433-802580.00.Blk32768Blk.CCN

Using cut
$ echo $fn
NBDG6_CDRCCN_4004_-TTNBDG6_CCN_51-140108-1433-802580.00.Blk32768Blk.CCN:00
$ echo $fn |cut -d: -f1
NBDG6_CDRCCN_4004_-TTNBDG6_CCN_51-140108-1433-802580.00.Blk32768Blk.CCN
Using awk
echo $fn |awk -F : '{print $1}'
more ways...

According to the link here:
This should work:
awk '{old=$0;gsub(/...$/,"",$0);system("mv \""old"\" "$0)}'
provided the file name is given as input.
For eg:
ls -1 NBDG6_CDRCCN_4004_-TTNBDG6_CCN_51-140108-1433-802580.00.Blk32768Blk.CCN:00|nawk '{old=$0;gsub(/...$/,"",$0);system("mv \""old"\" "$0)}'

Rename file using bash string manipulations:
# Filename needs to be in a variable
file=NBDG6_CDRCCN_4004_-TTNBDG6_CCN_51-140108-1433-802580.00.Blk32768Blk.CCN:00
# Rename file
mv "$file" "${file%???}"
This removes the last three characters from filename.

Using just bash:
fn='NBDG6_CDRCCN_4004_-TTNBDG6_CCN_51-140108-1433-802580.00.Blk32768Blk.CCN:00'
mv "$fn" "${fn::-3}"

if you have Ruby
echo NBDG6_CD* | ruby -e 'f=gets.chomp;File.rename(f, f[0..-4])'

sed extract text between two patterns where second pattern may be either of one

I am trying to extract text between pattern1 (fixed) and pattern2 (this can be p2-1/p2-2).
can you please tell me how to achieve this in a single command?
A file starts with start and ends with either end or close
File1:
======
junktest
data
start
stackoverflow
sed
close
File2:
======
data2
start
stackoverflow
end
I can extract text from File1 with
sed -n "/start/,/close/p"
And from File2 with
sed -n "/start/,/end/p"
I need a single sed command to achieve both..
something like:
sed -n "/start/, /close or end /p"

Both GNU sed and BSD sed:
sed -nE '/start/,/close|end/p' file

This awk looks better
awk '/start/,/end|close/' file

sed -n -E "/Word1/,/Word2-1/p" | sed -n -E "/Word1/,/Word2-2/p"

Easy with awk:
$ awk '/start/{p=1}p{print}/end|close/{p=0}' file

Text formating - sed, awk, shell

I need some assistance trying to build up a variable using a list of exclusions in a file.
So I have a exclude file I am using for rsync that looks like this:
*.log
*.out
*.csv
logs
shared
tracing
jdk*
8.6_Code
rpsupport
dbarchive
inarchive
comms
PR116PICL
**/lost+found*/
dlxwhsr*
regression
tmp
working
investigation
Investigation
dcsserver_weblogic_
dcswebrdtEAR_weblogic_
I need to build up a string to be used as a variable to feed into egrep -v, so that I can use the same exclusion list for rsync as I do when egrep -v from a find -ls.
So I have created this so far to remove all "*" and "/" - and then when it sees certain special characters it escapes them:
cat exclude-list.supt | while read line
do
echo $line | sed 's/\*//g' | sed 's/\///g' | 's/\([.-+_]\)/\\\1/g'
What I need the ouput too look like is this and then export that as a variable:
SEXCLUDE_supt="\.log|\.out|\.csv|logs|shared|PR116PICL|tracing|lost\+found|jdk|8\.6\_Code|rpsupport|dbarchive|inarchive|comms|dlxwhsr|regression|tmp|working|investigation|Investigation|dcsserver\_weblogic\_|dcswebrdtEAR\_weblogic\_"
Can anyone help?

A few issues with the following:
cat exclude-list.supt | while read line
do
echo $line | sed 's/\*//g' | sed 's/\///g' | 's/\([.-+_]\)/\\\1/g'
Sed reads files line by line so cat | while read line;do echo $line | sed is completely redundant also sed can do multiple substitutions by either passing them as a comma separated list or using the -e option so piping to sed three times is two too many. A problem with '[.-+_]' is the - is between . and + so it's interpreted as a range .-+ when using - inside a character class put it at the end beginning or end to lose this meaning like [._+-].
A much better way:
$ sed -e 's/[*/]//g' -e 's/\([._+-]\)/\\\1/g' file
\.log
\.out
\.csv
logs
shared
tracing
jdk
8\.6\_Code
rpsupport
dbarchive
inarchive
comms
PR116PICL
lost\+found
dlxwhsr
regression
tmp
working
investigation
Investigation
dcsserver\_weblogic\_
dcswebrdtEAR\_weblogic\_
Now we can pipe through tr '\n' '|' to replace the newlines with pipes for the alternation ready for egrep:
$ sed -e 's/[*/]//g' -e 's/\([._+-]\)/\\\1/g' file | tr "\n" "|"
\.log|\.out|\.csv|logs|shared|tracing|jdk|8\.6\_Code|rpsupport|dbarchive|...
$ EXCLUDE=$(sed -e 's/[*/]//g' -e 's/\([._+-]\)/\\\1/g' file | tr "\n" "|")
$ echo $EXCLUDE
\.log|\.out|\.csv|logs|shared|tracing|jdk|8\.6\_Code|rpsupport|dbarchive|...
Note: If your file ends with a newline character you will want to remove the final trailing |, try sed 's/\(.*\)|/\1/'.

This might work for you (GNU sed):
SEXCLUDE_supt=$(sed '1h;1!H;$!d;g;s/[*\/]//g;s/\([.-+_]\)/\\\1/g;s/\n/|/g' file)

This should work but I guess there are better solutions. First store everything in a bash array:
SEXCLUDE_supt=$( sed -e 's/\*//g' -e 's/\///g' -e 's/\([.-+_]\)/\\\1/g' exclude-list.supt)
and then process it again to substitute white space:
SEXCLUDE_supt=$(echo $SEXCLUDE_supt |sed 's/\s/|/g')

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How to filter data out of tabulated stdout stream in Bash? - linux

If you have spaces between the ? and the filename then: cut -c9- If they're tabs then: cut -f2

This might work for you: echo "? RESTRequestParamObj.cpp" | sed -e 's/^\S\+/rm /' | sh or using GNU sed echo "? RESTRequestParamObj.cpp"| sed -r 's/^\S+/rm /e'

bash only solution, assuming your output comes from stdin: while read line; do echo ${line##* }; done

use cut/perl instead cut -f2 -t'\t'|xargs rm -rf <your output>|perl -ne '#cols = split /\t/; print $cols[1]'|xargs rm -rf

Related

How do you change column names to lowercase with linux and store the file as it is?

How to apply my sed command to some lines of all my files?

Bash script to remove last three charater in a file name

sed extract text between two patterns where second pattern may be either of one

Text formating - sed, awk, shell

Categories

Resources