Extract last and second last strings of a file in shell variables - linux

Although it looks similar to my previous post, the purpose here is different.
udit#udit-Dabba ~/ah $ cat decrypt.txt
60 00 00 00 00 17 3a 20 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 02 *00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e* 00
00 00 03 29
I want to extract the last string of the file (here it is 29) into a shell variable.
I tried this ...
size=`wc -w encrypt.txt`
awk -v size=$size 'BEGIN {RS=" ";ORS=" ";} {if (NR>size-1 && NR < size+1) print $0}' decrypt.txt
Output :
Output :
29
But when I changed the file slightly ..
udit#udit-Dabba ~/ah $ cat decrypt.txt
60 00 00 00 00 17 3a 20 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 02 *00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e* 00
65 6b 61 68 61 6e 67 61 00 00 03 29
Output :
03
Why is there a discrepancy between the results?
I am new to awk and shell features, so I am not sure whether this is the right way to do it or not.
I think there should be some variation of grep, sed, awk or some other Linux command that can solve my problem, but I am not aware of it.
Please guide me on this.
Thanks in advance.
Purpose :
Make two variables in a shell script which should store last and second last strings of an input file.
Limitation :
Every input file contains a blank line at the end of the file.
(As in the file above, after the contents there is one more blank line, just as if the ENTER key had been hit; this cannot be changed because the file is generated by a C program at run time.)

grep -v "^$" file | tr " " "\n" | tail -n 2
Maybe the grep part isn't perfect and should perhaps be changed.
Edit
tr -s " " "\n" < file | tail -n 2
is a better solution - see Gordon Davisson's comment.
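If it helps, here is one hedged way (my own glue code, assuming bash and the decrypt.txt file from the question) to get both values into shell variables with that pipeline:
vals=$(grep -v '^$' decrypt.txt | tr -s ' ' '\n' | tail -n 2)   # last two words, one per line
secondlast=$(echo "$vals" | head -n 1)   # e.g. 03
last=$(echo "$vals" | tail -n 1)         # e.g. 29
echo "second last: $secondlast  last: $last"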

To get the last field:
awk '{ if (NF > 0) { last = $NF } } END { print last }' "$@"
The second last field is trickier for the case where there is just one field on the last line (so you need the last field from the line before).
awk '{ if (NF > 0)
       {
           if (NF == 1) { lastbut1 = last; last = $1; }
           else         { lastbut1 = $(NF-1); last = $NF; }
       }
     }
     END { print lastbut1 " " last; }' "$@"
This produces a blank and the last value if the file contains but one value. It produces just a blank if there are no values at all.
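A hedged usage sketch (my own wrapping, not part of the answer): if the awk program between the single quotes is saved as, say, last2.awk, both values can be read into shell variables in bash:
read -r lastbut1 last <<< "$(awk -f last2.awk decrypt.txt)"   # e.g. lastbut1=03, last=29
echo "second last: $lastbut1  last: $last"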

If you consider the record separator to be space or newline, then you just need to keep the last 2 records.
awk -v 'RS=[ \n]+' '{prev2 = prev1; prev1 = $0} END {print prev2, prev1}' filename

FIRST="$(head -n 1 file)"
LAST="$(tail -n 1 file)"
LASTBUTONE="$(tail -n 2 file | head -n 1)"
naturally, you can cut off the last field in a variety of ways:
echo "$ONEOFTHOSE" | gawk '{print $(NF)}'
echo "$ONEOFTHOSE" | sed -e 's/^.*[[:space:]]//'
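Putting those pieces together for the two variables from the question (a sketch with my own variable names; it assumes blank lines are skipped and that the last two values sit on the same line, as in the sample file):
LASTLINE="$(grep -v '^$' decrypt.txt | tail -n 1)"              # last non-blank line
LASTWORD="$(echo "$LASTLINE" | gawk '{print $NF}')"             # e.g. 29
LASTBUTONEWORD="$(echo "$LASTLINE" | gawk '{print $(NF-1)}')"   # e.g. 03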

Here's a tr/sed solution:
answers=$(tr -d '\n' <input_file | sed -r 's/.*(\S\S)\s*(\S\S)\s*$/\1 \2/')
echo "Last = ${answers#???} Penultimate = ${answers%???}"
Sed only:
answers=$(sed -r '1{h;d};H;${x;y/\n/ /;s/.*(\S\S)\s*(\S\S)\s*$/\1 \2/p};d' input_file)
echo "Last = ${answers#???} Penultimate = ${answers%???}"

If you have the 'rev' utility installed, the one below would be handy, presuming that space is the delimiter.
rev <file> | cut -d ' ' -f1,2 | rev
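To match the question's goal, one could apply it only to the last non-blank line and split the result (a bash sketch; the variable names are mine):
read -r secondlast last <<< "$(grep -v '^$' decrypt.txt | tail -n 1 | rev | cut -d ' ' -f1,2 | rev)"
echo "$secondlast $last"   # 03 29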

Related

Vim: calling xxd with system command in substitution results in conversion error

Background is that I have a log file that contains hex dumps that I want to convert with xxd to get that nice ASCII column that shows possible strings in the binary data.
The log file format looks like this:
My interesting hex dump:
00 53 00 6f 00 6d 00 65 00 20 00 74 00 65 00 78
00 74 00 20 00 65 00 78 00 61 00 6d 00 70 00 6c
00 65 00 20 00 75 00 73 00 69 00 6e 00 67 00 20
00 55 00 54 00 46 00 2d 00 31 00 36 00 20 00 69
00 6e 00 20 00 6f 00 72 00 64 00 65 00 72 00 20
00 74 00 6f 00 20 00 67 00 65 00 74 00 20 00 30
00 78 00 30 00 30 00 20 00 62 00 79 00 74 00 65
00 73 00 2e
Visually selecting the hex dump and doing xxd -r -p followed by xxd -g1 on the result does exactly what I'm aiming for.
However, since the number of dumps I want to convert are quite a few I would rather automate the process.
So I'm using the following substitute command to do the conversion:
:%s/\(\x\{2\} \?\)\{16\}\_.*/\=system('xxd -g1',system('xxd -r -p',submatch(0)))
The expression matches the entire hex dump in the log file. The match is sent to xxd -r -p as stdin and its output is used as stdin for xxd -g1.
Well, that's the idea at least.
The thing is that the above almost works. It produces the following result:
My interesting hex dump:
00000000: 01 53 01 6f 01 6d 01 65 01 20 01 74 01 65 01 78 .S.o.m.e. .t.e.x
00000010: 01 74 01 20 01 65 01 78 01 61 01 6d 01 70 01 6c .t. .e.x.a.m.p.l
00000020: 01 65 01 20 01 75 01 73 01 69 01 6e 01 67 01 20 .e. .u.s.i.n.g.
00000030: 01 55 01 54 01 46 01 2d 01 31 01 36 01 20 01 69 .U.T.F.-.1.6. .i
00000040: 01 6e 01 20 01 6f 01 72 01 64 01 65 01 72 01 20 .n. .o.r.d.e.r.
00000050: 01 74 01 6f 01 20 01 67 01 65 01 74 01 20 01 30 .t.o. .g.e.t. .0
00000060: 01 78 01 30 01 30 01 20 01 62 01 79 01 74 01 65 .x.0.0. .b.y.t.e
00000070: 01 73 01 2e .s..
All 00 bytes have mysteriously transformed into 01.
It should have produced the following:
My interesting hex dump:
00000000: 00 53 00 6f 00 6d 00 65 00 20 00 74 00 65 00 78 .S.o.m.e. .t.e.x
00000010: 00 74 00 20 00 65 00 78 00 61 00 6d 00 70 00 6c .t. .e.x.a.m.p.l
00000020: 00 65 00 20 00 75 00 73 00 69 00 6e 00 67 00 20 .e. .u.s.i.n.g.
00000030: 00 55 00 54 00 46 00 2d 00 31 00 36 00 20 00 69 .U.T.F.-.1.6. .i
00000040: 00 6e 00 20 00 6f 00 72 00 64 00 65 00 72 00 20 .n. .o.r.d.e.r.
00000050: 00 74 00 6f 00 20 00 67 00 65 00 74 00 20 00 30 .t.o. .g.e.t. .0
00000060: 00 78 00 30 00 30 00 20 00 62 00 79 00 74 00 65 .x.0.0. .b.y.t.e
00000070: 00 73 00 2e .s..
What am I not getting here?
Of course I can use macros and other ways of doing this, but I want to understand why my substitution command doesn't do what I expect.
Edit:
For anyone who wants to achieve the same thing, I provide the substitution expression that works on an entire file. The expression above was only for testing purposes, using the log file example also from above.
The one below performs a correct conversion, modified based on the information Kent provided in his answer.
:%s/\(\(\x\{2\} \)\{16\}\_.\)\+/\=system('xxd -p -r | xxd -g1',submatch(0))
Very likely, the problem is the string conversion in system(). The input will be converted into a string by Vim, and so will the output of your first xxd command.
You can try extracting those hex parts into a file, then:
xxd -r -p theFile|vim -
Then, calling system('xxd -g1', alltext), you are going to get something other than 00 as well.
This doesn't work the same way as a pipe (xxd ...|xxd ...). But unfortunately, the system() function doesn't accept pipes.
If you want to fix your :s command, you need to call systemlist() on your first xxd call to get the data in binary format, then pass it to the 2nd xxd:
:%s/\(\x\{2\} \?\)\{16\}\_.*/\=system('xxd -g1',systemlist('xxd -r -p',submatch(0)))
The command above will generate the 00s, since there is no string conversion.
However, when working with a data format other than plain strings, perhaps we can use filters instead of calling system(); it would be a lot easier. For your example:
2,$!xxd -r -p|xxd -g1
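For instance (my own illustration of the filter approach, not from the original answer), you can visually select just the hex-dump lines and filter that range through the same pipeline:
:'<,'>!xxd -r -p | xxd -g1
This replaces only the selected lines with the regrouped dump, the same way 2,$! does for a fixed range.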

Show NUL character in Sublime Text 3

I'm attempting to copy/paste ASCII characters from a Hex editor into a Sublime Text 3 Plain Text document, although NUL characters do not show/display and the string is truncated:
Hexadecimal:
48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 00 66 6F
6F 62 61 72 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00
ASCII:
Hello, World!�foobar�������������������������
Sublime Text: Truncates copied string and does not show NUL characters
TextMate: Shows NUL via "Show Invisibles"
I've tried the suggestion mentioned here by adding "draw_white_space": "all" to my preferences — still no luck! Is this possible with Sublime Text 3?
You're not alone in having this problem - others have posted bug reports about this behaviour: https://github.com/SublimeTextIssues/Core/issues/393
However, it's not consistent:
Behaviour seems dependent on the file and where the NUL chars exist.
Similar issue here, with the console: https://github.com/SublimeTextIssues/Core/issues/1939

Extract portion of string by checking multiple conditions - Shell script

This is the first time I am trying any shell script; the task I need to perform is below.
The output of this command is shown below -> cat /sys/kernel/debug/spmi/spmi-0/data
00800 00 03 03 00 01 01 00 C0 10 00 00 00 00 20 00 00
00810 00 03 03 03 00 03 03 00 00 00 00 00 00 00 00 00
00820 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00830 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00840 0F 07 01 00 0F 07 04 00 0F 07 07 80 0F 07 04 00
00850 0F 07 04 00 0F 03 08 00 00 00 01 80 00 00 00 00
00860 00 00 00 80 00 00 04 80 00 00 04 00 00 00 00 00
00870 0F 00 00 00 02 04 00 00 00 00 00 00 00 00 00 00
00880 FE 00 40 00 00 00 00 00 05 00 20 00 01 00 00 00
I need to check the value in the first row, 14th column, and extract it. That value can be either 00 or 20. Based on that value I have to change the dir name, which I think I can take care of.
Can anybody help me out with this? I have googled, e.g. https://unix.stackexchange.com/questions/37313/how-do-i-grep-for-multiple-patterns, but could not make it work.
It is very easy!
$ head /sys/kernel/debug/spmi/spmi-0/data -n 1| cut -d " " -f 14
Explanation:
head /sys/kernel/debug/spmi/spmi-0/data -n 1 outputs the first line of the given file.
cut -d " " -f 14 selects the 14th field/column (fields are delimited by " ", a space).
Edit
Usage
value=`head /sys/kernel/debug/spmi/spmi-0/data -n 1| cut -d " " -f 14`
echo $value
awk - solution
Alternate solution provided by twalberg
awk 'NR==1{print $14}' /sys/kernel/debug/spmi/spmi-0/data
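Building on that, a hedged sketch of the follow-up step mentioned in the question; the directory names below are hypothetical placeholders:
value=$(head -n 1 /sys/kernel/debug/spmi/spmi-0/data | cut -d " " -f 14)
if [ "$value" = "20" ]; then
    dir="dir_for_20"   # hypothetical name
else
    dir="dir_for_00"   # hypothetical name
fi
echo "selected directory: $dir"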
Create a pattern for what you need to search; in your case it is 00 or 20, as I understand it.
pattern="00|20"
cat /sys/kernel/debug/spmi/spmi-0/data | grep -E "${pattern}"
Then you can use cut -d " " -f2 to get the specific column.
This will give the output based on your pattern; I hope I understood your problem correctly.

Cannot ungzip : Minimum header length is 10 bytes

I have a problem with my website, which is completely gzipped:
http://goout.cz/cs/fotoreporty/
The page can be easily shown in Chrome, but in Safari it never loads (and I suppose in other browsers as well). When I try:
curl -v http://goout.cz/cs/fotoreporty/ | gzip -d
I am getting the expected results. But validation at
http://validator.w3.org/check?uri=http%3A%2F%2Fgoout.cz%2Fcs%2Ffotoreporty%2F#fatal-errors
yields:
The error was: Can't gunzip content: Header Error: Minimum header size is 10 bytes
What is wrong with the gzip format? How can I solve it? Thanks.
EDIT:
The gzip header seems okay to me:
$ curl http://goout.cz/cs/ | head -1 | hexdump | head -1
0000000 1f 8b 08 00 00 00 00 00 00 00 ed 5d cd 73 db 38
$ curl http://goout.cz/cs/fotoreporty/ | head -1 | hexdump | head -1
0000000 1f 8b 08 00 00 00 00 00 00 00 ed 7d cf 73 e3 46
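For reference, those first 10 bytes are the fixed-size gzip header the validator refers to: the 1f 8b magic, compression method 08 (deflate), a flags byte, a 4-byte mtime, an extra-flags byte and an OS byte. A hedged way to look at exactly the first 10 bytes the server sends (assuming plain curl with no automatic decompression, as in the question):
curl -s http://goout.cz/cs/fotoreporty/ | head -c 10 | xxd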

Confusion while extracting required part of a file using awk

I have a script making use of awk, sed, grep and other shell features.
I am stuck at one place, so I need your help ...
This is the input file for my problem:
udit#udit-Dabba ~/ah $ cat decrypt.txt
60 00 00 00 00 17 3a 20 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 02 *00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e* 00
00 00 03 29
My purpose is to extract 00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e from the above-mentioned file,
also marked between *'s above.
Although it may be obvious, the *'s are only shown to clarify the situation here; they are not actually present in the file.
The last five units of the file as shown above are ..
00 00 00 03 29
These 00 are simple pad bytes and 03 specifies their pad length.
And now here is the part of the script that extracts the required part:
size=`wc -w decrypt.txt`
padlen=3   # calculated by some other mechanism
awk -v size=$size -v padlen=$padlen 'BEGIN {RS=" ";ORS=" ";} {if (NR > 40 && NR <= size-padlen-2) print $0}' decrypt.txt | sed '1,1s/ //'
output :
00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69
My problem:
The last unit 6e is missing.
I also tried it in the terminal ...
size=68, padlen=3, so the loop should go from NR=40 to NR<=63
udit#udit-Dabba ~/ah $ awk 'BEGIN {RS=" ";ORS=" ";} {if (NR > 40 && NR <= 65)
print $0}' decrypt.txt | sed '1,1s/ //'
00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e 00
00
It works fine if the loop goes up to 65, so it should also work up to 63.
udit#udit-Dabba ~/ah $ awk 'BEGIN {RS=" ";ORS=" ";} {if (NR > 40 && NR <= 64)
print $0}' decrypt.txt | sed '1,1s/ //'
00 00 e0 f9 6a 61 61 6e
65 6b 61 68 61 6e 67 61 79 65 77 6f 64 69 6e
But what is this? When I decrease 65 to 64, two 00 units are lost. Why is this happening?
I also tried this one, but could not find a reason for this weird output.
udit#udit-Dabba ~/ah $ awk 'BEGIN {RS="[ \n]";ORS=" ";} {if (NR > 40
&& NR <=65)print $0}' decrypt.txt | sed '1,1s/ //'
0002 00 00 e0 f9 6a 61 61 6e 65 6b 61 68 61 6e 67 61 79 65 77 6f 64
Please help me out ...
Maybe I have explained the problem in more detail than required, but I really need help with it.
I am new to all these shell and awk things, so there may be a silly mistake that I could not find.
Please help me with this.
Thanks in advance.
EDIT :
60 00 00 00 00 17 3a 20 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 02
These are the fixed 40 units of the IPv6 header; they will always remain the same.
The portion between the *'s is of variable length; that is why I need to work this way, otherwise it would have been a simple task.
_padlen=3 _length=23
awk '{
    for (i = NF - l - p - 1; i <= NF - p - 2; i++)
        printf "%s", ($i (i < NF - p - 2 ? OFS : ORS))
}' l="$_length" p="$_padlen" RS= ORS='\n' decrypt.txt
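Since the data portion is of variable length, _length does not have to be hard-coded; a hedged sketch (my addition, not part of the answer above) derives it from the total word count minus the 40 header units, the pad bytes and the 2 trailing units:
_padlen=3
_length=$(( $(wc -w < decrypt.txt) - 40 - _padlen - 2 ))   # 68 - 40 - 3 - 2 = 23
Redirecting into wc avoids the file name that wc -w decrypt.txt would otherwise append to the count.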
I made some small changes in the code and was able to get up to 6e:
size=68; padlen=3; awk -v size=$size -v padlen=$padlen 'BEGIN {RS=" ";ORS=" ";} {if (NR > 40 && NR <= size-padlen-1) print $0}' decrypt.txt | sed '1,1s/ //'
I set size to 68 because wc prints both the count and the file name, and you have to remove the file name when you pass it to the awk script.
Note: I haven't fully understood your requirement.
If I understand the problem as being: discard the first 40 values and the last n values (where n is the padding + 2 i.e. in this case 3 + 2 = 5), this might work:
header=40;padding=5;
tr -d '\n' <decrypt.txt |
sed -r 's/\s+/ /g;s/^(\S+\s+){'"$header"'}//;s/(\S+\s*){'"$padding"'}$//'
The trick is to unroll the data and then pick the bits you want.
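A hedged usage sketch (the variable name payload is my own): capture that output in the script so it can be used later:
header=40; padding=5
payload=$(tr -d '\n' <decrypt.txt |
  sed -r 's/\s+/ /g;s/^(\S+\s+){'"$header"'}//;s/(\S+\s*){'"$padding"'}$//')
echo "$payload"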

Resources