Perl replace produces empty file from script, not from bash - linux

I'm getting pretty frustrated with this problem and can't see what I'm doing wrong. Google Chrome keeps showing a notice that it wasn't shut down properly, and I want to get rid of it. I also have some older replacements that set the full screen size. Typed directly into bash, all of the lines produce the expected result; run from a script file, however, they produce an empty settings file...
These are the lines in the script:
cat ~/.config/google-chrome/Default/Preferences | perl -pe "s/\"work_area_bottom.*/\"work_area_bottom\": $(xrandr | grep \* | cut -d' ' -f4 | cut -d'x' -f2),/" > ~/.config/google-chrome/Default/Preferences
cat ~/.config/google-chrome/Default/Preferences | perl -pe "s/\"bottom.*/\"bottom\": $(xrandr | grep \* | cut -d' ' -f4 | cut -d'x' -f2),/" > ~/.config/google-chrome/Default/Preferences
cat ~/.config/google-chrome/Default/Preferences | perl -pe "s/\"work_area_right.*/\"work_area_right\": $(xrandr | grep \* | cut -d' ' -f4 | cut -d'x' -f1),/" > ~/.config/google-chrome/Default/Preferences
cat ~/.config/google-chrome/Default/Preferences | perl -pe "s/\"right.*/\"right\": $(xrandr | grep \* | cut -d' ' -f4 | cut -d'x' -f1),/" > ~/.config/google-chrome/Default/Preferences
cat ~/.config/google-chrome/Default/Preferences | perl -pe "s/\"exit_type.*/\"exit_type\": \"Normal\",/" > ~/.config/google-chrome/Default/Preferences
cat ~/.config/google-chrome/Default/Preferences | perl -pe "s/\"exited_cleanly.*/\"exited_cleanly\": true,/" > ~/.config/google-chrome/Default/Preferences
I've been googling this a lot, but I can't find the right search terms to get a helpful result.
The problem is solved by using the perl -p -i -e options, like so:
perl -p -i -e "s/\"exit_type.*/\"exit_type\": \"Normal\",/" ~/.config/google-chrome/Default/Preferences
That line alone is enough to get rid of the Google Chrome message about an incorrect shutdown.

Your problem is almost certainly:
> ~/.config/google-chrome/Default/Preferences
That > means "truncate the file", and worse, the shell does it before cat even starts reading. So you truncate the file before reading it, and a zero-length file feeds into a zero-length file.
I would suggest doing this entirely in perl rather than a halfway house: perl supports the -i option for an in-place edit.
Or just write your script in perl to start with. (If you give sample input and output, knocking up an example that does what you want will be quite straightforward.)
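For example, all six substitutions from the question could be folded into a single in-place perl run, roughly like this (a sketch assembled from the original commands, not a tested drop-in; the xrandr extraction is copied from the question):
WIDTH=$(xrandr | grep '\*' | cut -d' ' -f4 | cut -d'x' -f1)   # horizontal resolution, extracted as in the question
HEIGHT=$(xrandr | grep '\*' | cut -d' ' -f4 | cut -d'x' -f2)  # vertical resolution
perl -p -i -e "
    s/\"work_area_bottom.*/\"work_area_bottom\": $HEIGHT,/;
    s/\"bottom.*/\"bottom\": $HEIGHT,/;
    s/\"work_area_right.*/\"work_area_right\": $WIDTH,/;
    s/\"right.*/\"right\": $WIDTH,/;
    s/\"exit_type.*/\"exit_type\": \"Normal\",/;
    s/\"exited_cleanly.*/\"exited_cleanly\": true,/;
" ~/.config/google-chrome/Default/Preferences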

If you need to search and replace some text, I suggest using:
ack -l --print0 '2011...' ~/.config/google-chrome/Default/Preferences | xargs -0 -n 1 sed -i -e 's/2011../2015.../g'

Related

sed works differently for a tail -f vs tail?

I have a log file whose lines are separated by \n, with multi-line entries (the ones containing SQL) separated by \r\n. To pull the SQL statements out of the file I need to convert the \r\n sequences to spaces. Not knowing sed, I googled and found a solution that works very well, but it fails when I switch to tail -f.
E.g. these work:
tail -1000 /mylogfile.log | sed -e ':a;N;$!ba;s/\r\n/ /g' | grep "Executing command"
cat /mylogfile.log | sed -e ':a;N;$!ba;s/\r\n/ /g' | grep "Executing command"
but this returns no data at all
tail -f /mylogfile.log | sed -e ':a;N;$!ba;s/\r\n/ /g' | grep "Executing command"
EDIT: For the person who added "This question already has an answer here:": no, that does not answer the question at all. First, the other question wasn't even resolved for the person who asked it. Second, it only talks about grep, while the problem here is with sed. I can have ten greps in the pipeline and it still works fine if I switch from sed to perl, e.g.
tail -f /mylog.log | perl -ne 's/\r\n/ /g; print;' | grep "Executing command" | grep -vi ANALYZE | grep -vi DESCRIBE | grep -vi "SHOW PARTITION"
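A likely explanation: the sed program :a;N;$!ba reads the entire input into the pattern space and only substitutes and prints once it reaches the last line, and with tail -f there is no last line, so sed never outputs anything. A line-by-line filter like the perl above keeps working; a minimal sketch, with perl's output autoflushed in case pipe buffering ever delays things (that part is optional):
tail -f /mylogfile.log | perl -pe 'BEGIN { $| = 1 } s/\r\n/ /' | grep "Executing command"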

different shell behaviour: bash omits newline, zsh keeps it

I have a script that searches source files for "TODO" notes inside the comments. I then use a pipeline of grep, git blame, uniq and sort to get the list ordered by the person who wrote the TODO comment.
The following works fine in bash and zsh:
#!/bin/bash
for FILE in $(grep -r -i "todo" apps/business | awk '{print $1}' | sed 's/://' | sed 's/\#//')
do
git blame $FILE | grep -i "todo"
done | sort -k2 | uniq
Now I want to count all the entries. Instead of calling the (time-consuming) grep/git blame again, I want to save everything into $MATCHES so I can count it without evaluating it again.
MATCHES=$(for FILE in $(grep -r -i "todo" apps/business | awk '{print $1}' | sed 's/://' | sed 's/\#//')
do
git blame $FILE | grep -i "todo"
done | sort -k2 | uniq)
echo $MATCHES
That's where I experience different behaviour in bash/zsh:
zsh: Returns the same as the first script (as expected)
bash: Ignores the newlines of git blame, puts everything on one line. wc -l counts 1 line.
What am I missing? Why does bash behave differently here, and how do I get it to keep the newlines?
zsh doesn't perform word-splitting on the unquoted parameter expansion $MATCHES by default. Use echo "$MATCHES" | wc -l and it will work in bash as well.
Note this is the wrong way to iterate over the output of a command; use a while loop and the read command instead.
grep -ri "todo" apps/business | awk '{print $1}' | sed -e 's/://' -e 's/\#//' |
while IFS= read -r FILE; do
git blame "$FILE" | grep -i todo
done | sort -k2 | uniq
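A minimal illustration of the word-splitting difference in bash (the values are made up):
MATCHES=$(printf 'one\ntwo\nthree\n')
echo $MATCHES | wc -l     # 1 -- unquoted, word splitting collapses the newlines into spaces
echo "$MATCHES" | wc -l   # 3 -- quoted, the embedded newlines survive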

Output of wc -l without file-extension

I've got the following line:
wc -l ./*.txt | sort -rn
I want to cut off the file extension. With this code I get the output:
number filename.txt
for all the .txt files in the current directory. But I want the output without the file extension, like this:
number filename
I tried piping to cut with various parameters, but all I managed was to cut off the whole filename, with this command:
wc -l ./*.txt | sort -rn | cut -f 1 -d '.'
Assuming you don't have newlines in your filenames, you can use sed to strip the trailing .txt:
wc -l ./*.txt | sort -rn | sed 's/\.txt$//'
Unfortunately, cut has no syntax for selecting fields counted from the end. One (somewhat clunky) trick is to use rev to reverse the line, apply cut to it, and then rev it back:
wc -l ./*.txt | sort -rn | rev | cut -d'.' -f2- | rev
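For example, on a line whose filename contains extra dots (a made-up filename):
echo '8 ./notes.2023.txt' | rev | cut -d'.' -f2- | rev    # prints: 8 ./notes.2023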
Using sed in a more generic way, to cut off whatever extension the files have:
$ wc -l *.txt | sort -rn | sed 's/\.[^\.]*$//'
14 total
8 woc
3 456_base
3 123_base
0 empty_base
A better approach is to use the proper file/MIME type instead of the name (what is "the extension" of something like tar.gz, or other multi-part extensions?):
#!/bin/bash
# classify each argument by its detected file type rather than by its extension
for file; do
    case $(file -b "$file") in
        *ASCII*) echo "this is ascii" ;;
        *PDF*)   echo "this is pdf" ;;
        *)       echo "other cases" ;;
    esac
done
This is a proof of concept, not tested; feel free to adapt/improve/modify it.
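If it were saved as, say, filetype.sh (a name made up here), it could be run like this:
bash filetype.sh report.pdf notes.txt archive.tar.gz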

Bash script to return domains instead of URLs

I have this bash script that I wrote to analyse the HTML of any given web page. What it's actually supposed to do is return the domains on that page. Currently it is returning the number of URLs on that web page.
#!/bin/sh
echo "Enter a url eg www.bbc.com:"
read url
content=$(wget "$url" -q -O -)
echo "Enter file name to store URL output"
read file
echo $content > $file
echo "Enter file name to store filtered links:"
read links
found=$(cat $file | grep -o -E 'href="([^"#]+)"' | cut -d '"' -f2 | sort | uniq | awk '/http/' > $links)
output=$(egrep -o '^http://[^/]+/' $links | sort | uniq -c > out)
cat out
How can I get it to return the domains instead of the URLs? From my programming knowledge I know it needs to parse from the right, but I am a newbie at bash scripting. Can someone please help me? This is as far as I have gotten.
I know there's a better way to do this in awk, but you can do it with sed by appending this after your awk '/http/':
| sed -e 's;https\?://;;' | sed -e 's;/.*$;;'
Then you want to move your sort and uniq to the end of that.
So that the whole line will look like:
found=$(cat $file | grep -o -E 'href="([^"#]+)"' | cut -d '"' -f2 | awk '/http/' | sed -e 's;https\?://;;' | sed -e 's;/.*$;;' | sort | uniq -c > out)
You can get rid of this line:
output=$(egrep -o '^http://[^/]+/' $links | sort | uniq -c > out)
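As a quick sanity check of what those two appended sed expressions do (the URL here is made up):
echo 'https://www.bbc.com/news/world-12345' | sed -e 's;https\?://;;' | sed -e 's;/.*$;;'    # prints: www.bbc.com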
EDIT 2:
Please note that you might want to adapt the search patterns in the sed expressions to your needs. This solution only considers the http(s):// protocols and www. server names...
EDIT:
If you want counts and domains:
lynx -dump -listonly http://zelleke.com | \
sed -n '4,$ s#^.*https\?://\([^/]*\).*$#\1#p' | \
sort | \
uniq -c | \
sed 's/www.//'
gives
2 wordpress.org
10 zelleke.com
Original Answer:
You might want to use lynx to extract the links from a URL:
lynx -dump -listonly http://zelleke.com
gives
# blank line at the top of the output
References
1. http://www.zelleke.com/feed/
2. http://www.zelleke.com/comments/feed/
3. http://www.zelleke.com/
4. http://www.zelleke.com/#content
5. http://www.zelleke.com/#secondary
6. http://www.zelleke.com/
7. http://www.zelleke.com/wp-login.php
8. http://www.zelleke.com/feed/
9. http://www.zelleke.com/comments/feed/
10. http://wordpress.org/
11. http://www.zelleke.com/
12. http://wordpress.org/
Based on this output you can achieve the desired result with:
lynx -dump -listonly http://zelleke.com | \
sed -n '4,$ s#^.*http://\([^/]*\).*$#\1#p' | \
sort -u | \
sed 's/www.//'
gives
wordpress.org
zelleke.com
You can remove the path from a URL with sed:
sed 's#http://##; s#/.*##'
I also want to point out that these two lines are wrong:
found=$(cat $file | grep -o -E 'href="([^"#]+)"' | cut -d '"' -f2 | sort | uniq | awk '/http/' > $links)
output=$(egrep -o '^http://[^/]+/' $links | sort | uniq -c > out)
You must use either a redirection ( > out ) or a command substitution $(), but not both at the same time, because in that case the variables will end up empty.
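In other words, pick one of the two; a schematic sketch based on the question's own commands (not a complete fix):
found=$(grep -o -E 'href="([^"#]+)"' "$file")     # capture the output in the variable
grep -o -E 'href="([^"#]+)"' "$file" > "$links"   # or redirect it to a file, not both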
This part
content=$(wget "$url" -q -O -)
echo $content > $file
would also be better written this way:
wget "$url" -q -O - > $file
You may also be interested in this:
https://www.rfc-editor.org/rfc/rfc3986#appendix-B
It explains how to parse a URI with a regex.
So you can parse a URI from the left this way and extract the "authority" part, which contains the domain and subdomain names.
sed -r 's_^([^:/?#]+:)?(//([^/?#]*))?.*_\3_g';
grep -Eo '[^\.]+\.[^\.]+$'   # piped after the first line, this gives what you need
This is also interesting:
http://www.scribd.com/doc/78502575/124/Extracting-the-Host-from-a-URL
Assuming that a URL always begins this way
https?://(www\.)?
is really hazardous.
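To see the two steps from above on a sample URL (made up for illustration):
echo 'https://blog.example.co/some/path?x=1' | sed -r 's_^([^:/?#]+:)?(//([^/?#]*))?.*_\3_g' | grep -Eo '[^\.]+\.[^\.]+$'    # prints: example.co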

How to extract version from a single command line in linux?

I have a product with a command called db2level, whose output is given below.
I need to extract 8.1.1.64 out of it. So far I have come up with:
db2level | grep "DB2 v" | awk '{print$5}'
which gave me the output v8.1.1.64",
Please help me to fetch just 8.1.1.64. Thanks.
grep is enough to do that:
db2level| grep -oP '(?<="DB2 v)[\d.]+(?=", )'
Just with awk:
db2level | awk -F '"' '$2 ~ /^DB2 v/ {print substr($2,6)}'
db2level | grep "DB2 v" | awk '{print$5}' | sed 's/[^0-9\.]//g'
This removes everything but digits and dots.
sed is your friend for general extraction tasks:
db2level | sed -n -e 's/.*tokens are "DB2 v\([0-9.]*\)".*/\1/p'
The sed command prints no lines by default (because of -n), except those where a replacement with the given regexp succeeded. The .* at the beginning and the end of the pattern ensure that the whole line is matched and replaced.
Try grep with the -o option:
db2level | grep -E -o "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"
Another sed solution
db2level | sed -n -e '/v[0-9]/{s/.*DB2 v//;s/".*//;p}'
This one doesn't rely on the number being in a particular format, just on it being in a particular place in the output.
db2level | grep -o "v[0-9.]*" | tr -d v
Try something like: db2level | grep "DB2 v" | cut -d'"' -f2 | cut -d'v' -f2
cut splits the input into pieces separated by the delimiter given with -d and outputs the field(s) selected with -f.
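For illustration only, assuming the relevant db2level line looks roughly like the quoted-token line the sed answer above matches (this sample is not real db2level output):
echo 'Informational tokens are "DB2 v8.1.1.64", "s040812", "U498350"' | cut -d'"' -f2 | cut -d'v' -f2    # prints: 8.1.1.64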
