How to edit this file using grep or using cat or using vim or using another tool? - vim

My elder brother is studying statistics and is writing his thesis in LaTeX. Almost all of the content is already written. He used five digits after the decimal point (e.g. 5.55534) for every value in his calculations. At the last minute his instructor told him to change these to three digits after the decimal point (e.g. 5.555), which has put my brother in a difficult spot. Finding and correcting the values by hand is not easy, so he asked me to help.
I believe there is an easy solution, but I don't know it. A portion of the thesis looks like this:
&se($\hat\beta_1$)&0.35581&0.35573&0.35573\\
&mse($\hat\beta_1$)&.12945&.12947&.12947\\
\addlinespace
&$\hat\beta_2$&0.03329&0.03331&0.03331 \\
&se($\hat\beta_2$)&0.01593&0.01592&0.01591\\
&mse($\hat\beta_2$)&.000265&.000264&.000264 \\
\midrule
{n=100} & $\hat\beta_1$&-.52006&-.52001&-.51946\\
&se($\hat\beta_1$)&.22819&.22814&.22795\\
&mse($\hat\beta_1$)&.05247&.05244&.05234\\
\addlinespace
&$\hat\beta_2$&0.03134&0.03134&0.03133 \\
&se($\hat\beta_2$)&0.00979&0.00979&0.00979\\
&mse($\hat\beta_2$)&.000098&.000098&.000098
I want:
&se($\hat\beta_1$)&0.355&0.355&0.355\\
&mse($\hat\beta_1$)&.129&.129&.129\\
...
Note: Don't be put off by the syntax (it is LaTeX).
If anybody has a solution or suggestion, please share it. Thank you.

In sed:
$ sed 's/\(\.[0-9]\{3\}\)[0-9]*/\1/g' file
&se($\hat\beta_1$)&0.355&0.355&0.355\\
&mse($\hat\beta_1$)&.129&.129&.129\\
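That is, for every run of digits that starts with a period and contains at least three digits, keep the period and the first three digits and drop the rest. Note that this truncates the values; it does not round them.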
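For readers who would rather script the change, the same truncation can be sketched in Python. This is a minimal sketch and not part of the original answer: the function name truncate_decimals and its places parameter are made up for illustration, and like the sed command it truncates rather than rounds.

```python
import re


def truncate_decimals(line: str, places: int = 3) -> str:
    """Keep only the first `places` digits after each decimal point.

    Hypothetical helper for illustration; digits are cut off, not rounded.
    """
    # Group 1 is the period plus the digits to keep; the extra digits
    # that follow are matched outside the group and therefore dropped.
    pattern = re.compile(r"(\.[0-9]{%d})[0-9]+" % places)
    return pattern.sub(r"\1", line)


sample = r"&se($\hat\beta_1$)&0.35581&0.35573&0.35573\\"
print(truncate_decimals(sample))
```

One thing worth checking by hand afterwards: very small values such as .000265 collapse to .000 under plain truncation.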

Here is the command in vim:
:%s/\.\d\{3}\zs\d\+//g
Explanation:
: entering command-mode
% is the range of all lines of the file
s substitution command
\.\d\{3}\zs\d\+ pattern you would like to change
\. literal point (.)
\d\{3} match 3 consecutive digits
\zs start substitution from here
\d\+ one or more digits
g Replace all occurrences in the line
As for grep and cat, they have nothing to do with replacing text. grep only searches file contents and cat only prints them.
What you are looking for is substitution, and plenty of commands on Linux can do that, mainly sed, perl, awk and ex.
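Python's re module has no \zs, but a fixed-width lookbehind plays a similar role: the period and the first three digits must be present, yet they are not part of the match, so only the surplus digits are deleted. The sketch below is an illustration under that assumption, and the function name drop_extra_digits is hypothetical.

```python
import re

# (?<=...) is a lookbehind: the text before the match must be a period
# followed by three digits, but that text is not included in the match.
EXTRA_DIGITS = re.compile(r"(?<=\.[0-9]{3})[0-9]+")


def drop_extra_digits(line: str) -> str:
    """Delete every digit after the third decimal place (truncation)."""
    return EXTRA_DIGITS.sub("", line)


print(drop_extra_digits("&mse&.12945&.12947&.12947"))
```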

Related

vim Search Replace should use replaced text in following searches

I have a comma-separated data file that has a lot of NAs (it was generated by R). I opened the file in vim and tried to replace all the NA values with empty strings.
Here is a sample slimmed down version of a record in the file:
1,1,NA,NA,NA,NATIONAL,NA,1,NANA,1,AMERICANA,1
Once I am done with the search-replace, the intended output should be:
1,1,,,,NATIONAL,,1,NANA,1,AMERICANA,1
In other words, all the NAs should be replaced except the words NATIONAL, NANA and AMERICANA.
I used the following command in vim to do this:
1, $ s/\,NA\,/\,\,/g
But, it doesn't seem to work. Here is the output that I get:
1,1,,NA,,NATIONAL,,1,NANA,1,AMERICANA,1
As you can see, there is one ,NA, that is left out of the replacement process.
Does anyone have a good way to fix it? Thanks.
A trivial solution is to run the same command again and it will take care of the remaining ,NA,. However, it is not a feasible solution because my actual data file has 100s of columns and 500K+ rows each with a variable number of NAs.
The comma doesn't have a special meaning, so you don't have to escape it:
:1,$s/,NA,/,,/g
Which doesn't solve your problem.
You can use % as a shorthand for 1,$:
:%s/,NA,/,,/g
Which doesn't solve your problem either.
The best way to match all those NA words to the exclusion of other words containing NA would be to use word boundaries:
:%s/,\<NA\>,/,,/g
Which still doesn't solve your problem.
But with word boundaries in place, the commas that you used to restrict the match to NA, and that are causing the error, become unnecessary:
:%s/\<NA\>//g
See :help :range and :help \<.
Use % instead of 1,$ (% means "the buffer" aka the whole file).
You don't need \,. , works fine.
Vim finds discrete, non-overlapping matches, so in ,NA,NA,NA, it only finds the first ,NA, and the third ,NA,, because the middle one doesn't have its own separate surrounding commas. We can leave certain characters of the regex out of the match with \zs (start) and \ze (end). The pattern still requires the surrounding characters to be present, but the match itself doesn't include them, so every NA in ,NA,NA,NA, is matched.
TL;DR: :%s/,\zsNA\ze,//g
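The non-overlapping behaviour is not specific to Vim. As a rough illustration, Python's re.sub shows the same symptom on the sample record, and lookarounds (Python's nearest counterpart to \zs and \ze) or word boundaries avoid it. The variable names are made up for this sketch.

```python
import re

record = "1,1,NA,NA,NA,NATIONAL,NA,1,NANA,1,AMERICANA,1"

# The commas are part of each match, so in a run of adjacent NA fields
# one match consumes the comma that the next field would need.
consuming = re.sub(r",NA,", ",,", record)

# Lookarounds require the commas but leave them out of the match.
with_lookarounds = re.sub(r"(?<=,)NA(?=,)", "", record)

# Word boundaries match NA only when it is a whole word.
with_boundaries = re.sub(r"\bNA\b", "", record)

print(consuming)
print(with_lookarounds)
print(with_boundaries)
```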

How to make a Palindrome with a sed command?

I'm trying to write a command that finds all palindromes in a dictionary file.
This is what I have at the moment, which is wrong:
sed -rn '/^([a-z])-([a-z])\2\1$/p' /usr/share/dict/words
Can somebody explain the code as well.
Found the right answer.
sed -n '/^\([a-z]\)\([a-z]\)\2\1$/p' /usr/share/dict/words
I have no idea why I used the -.
I also don't have an explanation for the \ after each group.
You can use the grep command as explained here
grep -w '^\(.\)\(.\).\2\1'
Explanation: the grep command first matches any three characters using \(.\)\(.\). and then checks whether the second character and the first character occur again, in that order.
This grep command finds only five-letter palindromes.
An extended version is also proposed on that page; it works correctly for the first line but then crashes... There is surely something in it worth keeping and perhaps adapting.
Guglielmo Bondioni proposed a single RE that finds all palindromes up to 19 characters long using 9 subexpressions and 9 back-references:
grep -E -e '^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$' file
You can extend this further as much as you want :)
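To see the length limit of that expression in practice, here is a small Python sketch that compares it with a plain string reversal. It assumes the pattern is anchored with $ at the end so that the whole word must match; the function names are hypothetical.

```python
import re

# Nine optional captured characters, an optional middle character, then
# the nine captures in reverse order: at most 9 + 1 + 9 = 19 characters.
BOUNDED_PALINDROME = re.compile(
    r"^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$"
)


def matches_bounded_regex(word: str) -> bool:
    """True if the fixed-size back-reference pattern accepts the word."""
    return BOUNDED_PALINDROME.match(word) is not None


def is_palindrome(word: str) -> bool:
    """True for a palindrome of any length."""
    return word == word[::-1]


for word in ["racecar", "abba", "abc", "a" * 21]:
    print(len(word), matches_bounded_regex(word), is_palindrome(word))
```

The two checks agree on short words, but the 21-character word is a palindrome that the pattern cannot accept, which is why the expression has to be extended for longer inputs.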
Perl to the rescue:
perl -lne 'print if $_ eq reverse' /usr/share/dict/words
Hate to say it, but while regex may be able to cook your breakfast, I don't think it can find a palindrome. According to the all-knowing Wikipedia:
In the automata theory, a set of all palindromes in a given alphabet is a typical example of a language that is context-free, but not regular. This means that it is impossible for a computer with a finite amount of memory to reliably test for palindromes. (For practical purposes with modern computers, this limitation would apply only to incredibly long letter-sequences.)
In addition, the set of palindromes may not be reliably tested by a deterministic pushdown automaton which also means that they are not LR(k)-parsable or LL(k)-parsable. When reading a palindrome from left-to-right, it is, in essence, impossible to locate the "middle" until the entire word has been read completely.
So a regular expression won't be able to solve the problem based on the problem's nature, but a computer program (or sed examples like #NeronLeVelu or #potong) will work.
explanation of your code
sed -rn '/^([a-z])-([a-z])\2\1$/p' /usr/share/dict/words
It selects and prints lines that match the following:
A lowercase letter at the start of the line, followed by -, followed by another lowercase letter (which may be the same as the first), followed by a repeat of the second group, then a repeat of the first group, and then nothing else (end of line). In short: Letter1-Letter2Letter2Letter1.
sample:
a-bba
a is first letter
b second letter
b is \2
a is \1
But this is a strange pattern for any real use, unless the input comes from a very specific dictionary (one limited to such combinations, for example).
This might work for you (GNU sed):
sed -r 'h;s/[^[:alpha:]]//g;H;x;s/\n/&&/;ta;:a;s/\n(.*)\n(.)/\n\2\1\n/;ta;G;/\n(.*)\n\n\1$/IP;d' file
This copies the original string(s) to the hold space (HS), then removes everything but alpha characters from the string(s) and appends this to the HS. The second copy is then reversed and the current string(s) and the reversed copy compared. If the two strings are equal then the original string(s) is printed out otherwise the line is deleted.
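The same idea (strip everything but letters, reverse the result, compare case-insensitively, print the original line) can be sketched in Python. This is an approximation of the approach described above rather than a translation of the sed script, and the function name is hypothetical.

```python
def is_palindrome_ignoring_punctuation(line: str) -> bool:
    """Compare only the letters of the line, ignoring case.

    Under this definition a line without any letters is trivially
    a palindrome, because the empty list equals its own reverse.
    """
    letters = [ch.lower() for ch in line if ch.isalpha()]
    return letters == letters[::-1]


for line in ["A man, a plan, a canal: Panama", "Racecar", "sed"]:
    if is_palindrome_ignoring_punctuation(line):
        print(line)  # print the original line, not the stripped copy
```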

vim substitute multiple characters in a line

Command :%s:a:b will modify line aaa to line baa. The question is how to get result bbb using only one command (not using :%s:a:b 3 times, what I am doing now :-) ).
You need to add the g flag at the end, like this:
:%s:a:b:g
When working with regular expressions this flag commonly means a "global" replacement, i.e. replace all occurrences.
The same technique usually works in other tools too that use regular expressions, for example sed, perl, etc.
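Not every tool spells this as a g flag, though. As a small illustration that is not part of the original answer, Python's re.sub replaces every match by default and takes a count argument to limit the number of replacements, so the default is the reverse of Vim's.

```python
import re

line = "aaa"

# count=1 replaces only the first match, like :s without the g flag.
first_only = re.sub("a", "b", line, count=1)

# With no count, every match is replaced, like :s with the g flag.
every_match = re.sub("a", "b", line)

print(first_only)
print(every_match)
```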
UPDATE
I am surprised that such a simple answer still keeps receiving upvotes... So for you vim fans out there I recommend this great site where I still keep learning interesting new stuff: http://vimcasts.org/
Also remember the 'e' flag, which suppresses the error message when the pattern is not found:
:%s:a:b:e
Have a look at this answer Multiple search and replace in one line

Detect repeated characters using grep

I'm trying to write a grep (or egrep) command that will find and print any lines in "words.txt" which contain the same lower-case letter three times in a row. The three occurrences of the letter may appear consecutively (as in "mooo") or separated by one or more spaces (as in "x x x") but not separated by any other characters.
words.txt contains:
The monster said "grrr"!
He lived in an igloo only in the winter.
He looked like an aardvark.
Here's what I think the command should look like:
grep -E '\b[^ ]*[[:alpha:]]{3}[^ ]*\b' 'words.txt'
I know this is wrong, but I don't know enough of the syntax to figure it out. Could someone please help me do this using grep?
Does this work for you?
grep '\([[:lower:]]\) *\1 *\1'
It takes a lowercase character [[:lower:]] and remembers it with \( ... \). It then tries to match any number of spaces (a space followed by *, zero included), the remembered character \1, any number of spaces again, and the remembered character once more. And that's it.
You can try running it with --color=auto to see what parts of the input it matched.
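The same back-reference idea can be tried outside grep. The following Python sketch applies an equivalent pattern to the three sample lines from the question; the function name has_triple_letter is made up for illustration, and [a-z] stands in for [[:lower:]].

```python
import re

# One lowercase letter, then the same letter twice more, with only
# spaces (possibly none) allowed in between.
TRIPLE_LETTER = re.compile(r"([a-z]) *\1 *\1")


def has_triple_letter(line: str) -> bool:
    """True if some lowercase letter occurs three times in a row."""
    return TRIPLE_LETTER.search(line) is not None


lines = [
    'The monster said "grrr"!',
    "He lived in an igloo only in the winter.",
    "He looked like an aardvark.",
]
for line in lines:
    if has_triple_letter(line):
        print(line)
```

The second line matches because of the "oo o" in "igloo only", where the three letters are separated only by a space.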
Try this. Note that this will not match "mooo", as the word boundary (\b) occurs before the "m".
grep -E '\b([[:alpha:]]) *\1 *\1 *\b' words.txt
[:alpha:] is an expression of a character class. To use as a regex charset, it needs the extra brackets. You may have already known this, as it looks like you started to do it, but left the open bracket unclosed.

Vim replacing linefeeds

I resisted Vim, but have now given in. It cuts through large files like a hot knife through butter.
Situation: I have a large text file, and I want to put a pipe character at the beginning and end of each line.
Problem: These Vim commands and other variations didn't work:
:%s/$/|\$|
:%s/\r/|\r|
:%s/$/|\r|
I suspect it's something simple to fix this, but searching Google and Stack didn't help.
You nearly had it:
:%s/^\|$/|/g
^\|$ means beginning or end of line. In a Vim regex, the | "or" pipe gets escaped. That is followed by /|/g -- replace with | globally.
Personally, I'd prefer the expressiveness of 'surround each line with pipe chars':
:%s/.*/|&|
This, in my brain, is more intuitive than 'replace a non-existing, imaginary character at the start or end of a line'.
It also depends on fewer magic chars (^, $, quoted \|), which are all clumsy to type and error-prone. (I mean, do you remember to quote the regex |? Do you know the difference between $ and \_$? Did you know some, if not all, of these depend on the 'magic' configuration of vim?)
The above suffers from none of those.
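The "surround each line" idea carries over to other tools as well. Here is a minimal Python sketch of it, assuming the text is held in a string; the function name wrap_lines_in_pipes is hypothetical. It replaces the first match of .* on each line, much as the Vim command does without the g flag.

```python
import re


def wrap_lines_in_pipes(text: str) -> str:
    """Put a pipe character at the start and end of every line."""
    wrapped = []
    for line in text.split("\n"):
        # .* matches the whole line (possibly empty); count=1 keeps the
        # empty match at the end of the line from being replaced too.
        wrapped.append(re.sub(r".*", lambda m: f"|{m.group(0)}|", line, count=1))
    return "\n".join(wrapped)


print(wrap_lines_in_pipes("first line\n\nthird line"))
```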

Resources