Red Hat Linux; the file to sort is "aaa":
4;AAA;456
3;BBB;567
2;AAA;123
1;BBB;234
5;AAA;000
Command to sort only by the second field:
sort -t ";" -k2,2 aaa
The output is:
2;AAA;123
4;AAA;456
5;AAA;000
1;BBB;234
3;BBB;567
In my opinion the output should be:
4;AAA;456
2;AAA;123
5;AAA;000
3;BBB;567
1;BBB;234
Is this an error in sort?
There could be other reasons, but I'd guess that your "opinion" comes from expecting that, for records with equal keys, whichever one was encountered first in the file should come first in the output.
That is known as a "stable sort".
By default the sort command is not stable: when two lines compare equal on every key you specified, it falls back to comparing the entire lines as a last resort. That is why the AAA lines came out ordered by their first field. Hence the results you saw.
It can do it if you want it to though:
$ sort --stable --field-separator=";" --key="2,2" aaa
4;AAA;456
2;AAA;123
5;AAA;000
3;BBB;567
1;BBB;234
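To see the difference side by side, here is a minimal sketch that recreates the sample file in a temporary directory and runs both variants. The variable names (tmp, default_out, stable_out) are only illustrative, and LC_ALL=C is set as an assumption to keep the comparison independent of the locale.

```shell
# Recreate the sample file from the question in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' '4;AAA;456' '3;BBB;567' '2;AAA;123' '1;BBB;234' '5;AAA;000' > "$tmp/aaa"

# Default: lines that tie on field 2 are ordered by comparing the whole line.
default_out=$(LC_ALL=C sort -t ';' -k2,2 "$tmp/aaa")

# -s (short for --stable): lines that tie on field 2 keep their input order.
stable_out=$(LC_ALL=C sort -s -t ';' -k2,2 "$tmp/aaa")

printf 'default:\n%s\n\nstable:\n%s\n' "$default_out" "$stable_out"
rm -rf "$tmp"
```

The first listing should match the output in the question, the second the output the asker expected.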
I upgraded the OS version on a Linux server and was checking whether any settings had changed.
When I typed sysctl -a | grep "net.ipv4.ip_forward"
the following line had been added:
net.ipv4.ip_forward_use_pmtu = 0
I know that this is because the parameter exists in /proc/sys.
But if the output of sysctl before the upgrade did not show this line, then it was not in /proc/sys before either, right?
I know that 0 means this setting is not applied, so basically it does not do anything.
But why was this line added?
The question is:
Is there any possible reason that could add this line?
Thank you in advance.
Even the premise of the question, "added in the result of sysctl in linux server", is wrong here.
sysctl, the way you invoked it, lists all the entries.
The grep you used to filter those entries selects every line that contains the matching text. If you ran grep foo against the list:
foo
foobar
both items would be matched. That's exactly what you see but the only difference is instead of "foo" you have "net.ipv4.ip_forward".
Using --color shows that clearly, because grep then highlights the matched part of each line.
Pay attention to the use of fgrep instead of grep: people tend to forget that grep interprets its pattern as a regular expression, in which the dot . matches any character, and that can also lead to unexpected matches.
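A small sketch of the three matching behaviours, run against simulated sysctl output rather than a live system. The values and the third line (net.ipv4xip_forward) are made up purely to show the effect of the regex dot; grep -F is the fixed-string form that fgrep stands for.

```shell
# Simulated `sysctl -a` output; values and the third key are illustrative only.
sample='net.ipv4.ip_forward = 1
net.ipv4.ip_forward_use_pmtu = 0
net.ipv4xip_forward = 9'

# Regex match: "." matches any character, and longer key names match as well.
regex_hits=$(printf '%s\n' "$sample" | grep -c 'net.ipv4.ip_forward')

# Fixed-string match: dots are literal, but longer key names still match.
fixed_hits=$(printf '%s\n' "$sample" | grep -F -c 'net.ipv4.ip_forward')

# Anchored match: only the exact key name followed by "=".
exact=$(printf '%s\n' "$sample" | grep -E '^net\.ipv4\.ip_forward[[:space:]]*=')

printf 'regex: %s lines, fixed: %s lines, exact: %s\n' "$regex_hits" "$fixed_hits" "$exact"
```

If only one key is of interest, passing its name directly, as in sysctl net.ipv4.ip_forward, avoids the filtering step altogether.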
I noticed something a bit odd while fooling around with sed. If you try to remove multiple line intervals (by number) from a file, but any interval specified later in the list is fully contained within an interval earlier in the list, then an additional single line is removed after the specified (larger) interval.
seq 10 > foo.txt
sed '2,7d;3,6d' foo.txt
1
9
10
This behaviour was behind an annoying bug for me, since in my script I generated the interval endpoints on the fly, and in some cases the intervals produced were redundant. I can clean this up, but I can't think of a good reason why sed would behave this way on purpose.
Since this question was highlighted as needing an answer in the Stack Overflow Weekly Newsletter email for 2015-02-24, I'm converting the comments above (which provide the answer) into a formal answer. Unattributed comments here were made by me in essentially equivalent form.
Thank you for a concise, complete question. The result is interesting. I can reproduce it with your script. Intriguingly, sed '3,6d;2,7d' foo.txt (with the delete operations in the reverse order) produces the expected answer with 8 included in the output. That makes it look like it might be a reportable bug in (GNU) sed, especially as BSD sed (on Mac OS X 10.10.2 Yosemite) works correctly with the operations in either order. I tested using 'sed (GNU sed) 4.2.2' from an Ubuntu 14.04 derivative.
More data points for you/them. Both of these include 8 in the output:
sed -e '/2/,/7/d' -e '/3/,/6/d' foo.txt
sed -e '2,7d' -e '/3/,/6/d' foo.txt
By contrast, this does not:
sed -e '/2/,/7/d' -e '3,6d' foo.txt
The latter surprised me (even accepting the basic bug).
Beats me. I thought given some of sed's arcane constructs that you might be missing the batman symbol or something from the middle of your command but sed -e '2,7d' -e '3,6d' foo.txt behaves the same way and swapping the order produces the expected results (GNU sed 4.2.2 on Cygwin). /bin/sed on Solaris always produces the expected result and interestingly so does GNU sed 3.02. (Comment by Ed Morton.)
More data: it only seems to happen with sed 4.2.2 if the 2nd range is a subset of the first: sed '2,5d;2,5d' shows the bug, sed '2,5d;1,5d' and sed '2,5d;2,6d' do not. (Comment by glenn jackman.)
The GNU sed home page says "Please send bug reports to bug-sed at gnu.org" (except it has an @ in place of ' at '). You've got a good reproduction; be explicit about the output you expect vs the output you get (they'll get the point, but it's best to make sure they can't misunderstand). Point out that the reverse ordering of the commands works as expected, and give the various other commands as examples of working or not working. (You could even give this Q&A URL as a cross-reference, but make sure that the bug report is self-contained so that it can be understood even if no-one follows the URL.)
You can also point to BSD sed (and the Solaris version, and the older GNU 3.02 sed) as behaving as expected. With the old version GNU sed working, it means this is arguably a regression. […After a little experimentation…] The breakage occurred in the 4.1 release; the 4.0.9 release is OK. (I also checked 4.1.5 and 4.2.1; both are broken.) That will help the maintainers if they want to find the trouble by looking at what changed.
The OP noted:
Thanks everyone for comments and additional tests. I'll submit a bug report to GNU sed and post their response. (Comment by santayana.)
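Until the affected sed versions are out of the picture, one possible way to do the clean-up mentioned in the question is to merge overlapping or adjacent intervals before building the sed script, so that no interval is ever contained in another. This is only a sketch: merge_ranges is a hypothetical helper name, and it assumes each interval arrives as one "start,end" pair of line numbers per line.

```shell
# Hypothetical helper: reads "start,end" pairs (one per line) on stdin and
# prints merged, non-overlapping pairs in ascending order.
merge_ranges() {
  sort -t, -k1,1n -k2,2n | awk -F, '
    NR == 1         { s = $1 + 0; e = $2 + 0; next }
    $1 + 0 <= e + 1 { if ($2 + 0 > e) e = $2 + 0; next }  # overlaps or touches
                    { print s "," e; s = $1 + 0; e = $2 + 0 }
    END             { if (NR > 0) print s "," e }
  '
}

# Turn the merged pairs into a single sed script such as "2,7d".
script=$(printf '%s\n' 2,7 3,6 | merge_ranges | sed 's/$/d/' | paste -sd ';' -)

seq 10 | sed "$script"
```

With the redundant 3,6 interval folded into 2,7, sed only ever sees one delete command for that region, so the output keeps lines 1, 8, 9 and 10.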
My file contains something like the below:
X-TM-AS-Product-Ver: IMSVA-8.2.0.1391-8.0.0.1202-22662.005
X-TM-AS-Result: No--0.364-7.0-31-10
X-imss-scan-details: No--0.364-7.0-31-10
X-TMASE-Version: IMSVA-8.2.0.1391-8.0.1202-22662.005
X-TMASE-Result: 10--0.363600-5.000000
X-TMASE-MatchedRID: 40jyuBT4FtykMGOaBzW2QbxygpRxo469FspPdEyOR1qJNv6smPBGj5g3
9Rgsjteo4vM1YF6AJbZcLc3sLtjOty5V0GTrwsKpl6V6bOpOzUAdzA5USlz33EYWGTXfmDJJ3Qf
wsVk0UbuGrPnef/I+eo9h73qb6JgVCR2fClyPE+EPh2lMKov3fdtvzshqXylpWZGeMhmJ7ScqBW
z6M5VHW/fngY5M/1HkzhvqqZL61o+ZdBoyruxjzQ==
This is my real text! I need to extract this line!
The existing code, written in the past by someone else, executes the below line:
cat $my_file | egrep -v "^(X-TM-AS)" \
| egrep -v "X-imss-scan-details"
supposedly to remove all those key value lines which start with "X-".
The above piece of code worked fine until today, because keys starting with X-TMASE had never been among the keys in the past. They started to appear in the files today, and as a result the code now fails to extract the useful data.
Among the newly added keys, it seems to me that X-TMASE-MatchedRID is the one creating the headache for us, as it has a value which spans multiple lines:
X-TMASE-MatchedRID: 40jyuBT4FtykMGOaBzW2QbxygpRxo469FspPdEyOR1qJNv6smPBGj5g3
9Rgsjteo4vM1YF6AJbZcLc3sLtjOty5V0GTrwsKpl6V6bOpOzUAdzA5USlz33EYWGTXfmDJJ3Qf
wsVk0UbuGrPnef/I+eo9h73qb6JgVCR2fClyPE+EPh2lMKov3fdtvzshqXylpWZGeMhmJ7ScqBW
z6M5VHW/fngY5M/1HkzhvqqZL61o+ZdBoyruxjzQ==
Initially I tried the below:
cat $my_file | egrep -v "^(X-TM-AS)" \
| egrep -v "X-imss-scan-details" \
| egrep -v "^(X-TMASE-)"
But it didn't work. It didn't completely eliminate the value for X-TMASE-MatchedRID:
9Rgsjteo4vM1YF6AJbZcLc3sLtjOty5V0GTrwsKpl6V6bOpOzUAdzA5USlz33EYWGTXfmDJJ3Qf
wsVk0UbuGrPnef/I+eo9h73qb6JgVCR2fClyPE+EPh2lMKov3fdtvzshqXylpWZGeMhmJ7ScqBW
z6M5VHW/fngY5M/1HkzhvqqZL61o+ZdBoyruxjzQ==
This is my real text! I need to extract this line!
I wanted the output to be:
This is my real text! I need to extract this line!
That is, I don't want any metadata to be seen in the output.
Any idea how that can be achieved using egrep or any equivalent command?
If you just want to remove the first paragraph (everything up to and including the first empty line), another command is a better fit, for example sed:
sed '1,/^$/ d' "$my_file"
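A short sketch of that command on a cut-down copy of the sample. It assumes the header block is separated from the real text by an empty line, which the excerpt in the question does not show explicitly; tmp and body are illustrative names.

```shell
# Cut-down sample: header lines, an assumed empty separator line, then the text.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
X-TM-AS-Result: No--0.364-7.0-31-10
X-TMASE-MatchedRID: 40jyuBT4FtykMGOaBzW2QbxygpRxo469FspPdEyOR1qJNv6smPBGj5g3
9Rgsjteo4vM1YF6AJbZcLc3sLtjOty5V0GTrwsKpl6V6bOpOzUAdzA5USlz33EYWGTXfmDJJ3Qf
z6M5VHW/fngY5M/1HkzhvqqZL61o+ZdBoyruxjzQ==

This is my real text! I need to extract this line!
EOF

# Delete from line 1 through the first empty line; everything after it is kept.
body=$(sed '1,/^$/d' "$tmp")
printf '%s\n' "$body"
rm -f "$tmp"
```

Because the range is defined by position rather than by key name, the continuation lines of X-TMASE-MatchedRID are removed along with every other header line.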
I have a file called this.txt that has this content:
a
b
c
d
Which I generate using: ls /home > this.txt
Then I create a file called that.txt that has this content:
a
c
d
f
Which I generate using: ssh -p 1111 root@176.178.1.8 'ls /home' > that.txt
When I compare both using diff this.txt that.txt I get normal results.
Then I get the file that2.txt using an expect script to avoid typing the password for the ssh connection, with this content
a
c
d
f
Using cat I compare both files visually and they look the same, but when I use diff this.txt that2.txt I get results that make no sense (it says that nothing from this.txt is in that2.txt).
Also, if I use diff that.txt that2.txt I get the same nonsensical result.
Maybe it is because I'm using two different interpreters (expect and bash) and the files are encoded differently? Any ideas?
PS: hopefully I explained myself. I'm not a native English speaker and this is my first question.
I’d assume you have files with either blanks at the ends of lines or different end-of-line markers, possibly both. Please compare the outputs of od -c that.txt and od -c that2.txt. Also, it may be worth checking the file sizes.
Oh, and I should add that you do not need to put your password into an expect script. ssh can work with public key pairs, a much safer alternative, and not really hard to set up. Check man ssh-keygen for a start.
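As an illustration of what to look for, the sketch below builds two files that look identical with cat but differ in their end-of-line markers, then strips the carriage returns and compares again. That that2.txt contains CRLF line endings is an assumption here (output captured through a terminal session, as expect does, often has them); the od -c output on the real file is what would confirm it.

```shell
tmp=$(mktemp -d)
printf 'a\nc\nd\nf\n' > "$tmp/that.txt"            # plain LF line endings
printf 'a\r\nc\r\nd\r\nf\r\n' > "$tmp/that2.txt"   # CRLF endings (assumed cause)

# od -c makes the otherwise invisible \r characters visible.
od -c "$tmp/that2.txt"

# diff treats "a" and "a\r" as different lines.
if diff "$tmp/that.txt" "$tmp/that2.txt" > /dev/null; then same_before=yes; else same_before=no; fi

# Remove the carriage returns and compare again.
tr -d '\r' < "$tmp/that2.txt" > "$tmp/that2.clean.txt"
if diff "$tmp/that.txt" "$tmp/that2.clean.txt" > /dev/null; then same_after=yes; else same_after=no; fi

printf 'identical before: %s, identical after: %s\n' "$same_before" "$same_after"
rm -rf "$tmp"
```

If od -c shows trailing spaces instead of \r, the same approach applies with a different clean-up step.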
I wrote a script for a Linux bash shell.
One line takes a list of filenames and sorts them. The list looks like this:
char32.png char33.png [...] char127.png
It goes from 32 to 127.
The default sorting of ls of this list is like this
char100.png char101.png [...] char32.png char33.png [...] char99.png
Luckily, there is sort, which has the handy -V switch which sorts the list correctly (as in the first example).
Now, I have to port this script to OSX and sort in OSX is lacking the -V switch.
Do you have a clever idea of how to sort this list correctly?
Do they all start with a fixed string (char in your example)? If so:
sort -k1.5 -n
-k1.5 means the sort key starts at the 5th character of the first field (there's only one field in your example), which will be the first digit. -n means to sort numerically. This works on Linux too.
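A quick check with a handful of the file names from the question, fed in deliberately out of order. The variable name sorted is illustrative, and LC_ALL=C is an assumption made only to keep the result independent of the locale.

```shell
# Sort on the numeric part that starts at character 5 of each name.
sorted=$(printf '%s\n' char100.png char32.png char127.png char99.png char33.png \
  | LC_ALL=C sort -k1.5 -n)

printf '%s\n' "$sorted"
```

This prints the names in the order 32, 33, 99, 100, 127. The approach relies on every name sharing the same four-character prefix; with prefixes of varying length the key offset would no longer point at the first digit.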