Can someone please explain to me how the zlib library works in Node.js?
I'm fairly new to Node.js, and I'm not yet sure how to use buffers and streams.
My simple scenario is a string variable, and I want to either zip or unzip (deflate or inflate, gzip or gunzip, etc.) the string to another string.
I.e. (how I would expect it to work):
var zlib = require('zlib');
var str = "this is a test string to be zipped";
var zip = zlib.Deflate(str); // zip = [object Object]
var packed = zip.toString([encoding?]); // packed = "packedstringdata"
var unzipped = zlib.Inflate(packed); // unzipped = [object Object]
var newstr = unzipped.toString([again - encoding?]); // newstr = "this is a test string to be zipped";
Thanks for the help :)
For anybody stumbling on this in 2016 (and also wondering how to serialize compressed data to a string rather than a file or a buffer) - it looks like zlib (since node 0.11) now provides synchronous versions of its functions that do not require callbacks:
var zlib = require('zlib');
var input = "Hello world";
var deflated = zlib.deflateSync(input).toString('base64');
var inflated = zlib.inflateSync(new Buffer(deflated, 'base64')).toString();
console.log(inflated);
The new Buffer() constructor has since been deprecated, so in current Node versions the syntax is simply:
var inflated = zlib.inflateSync(Buffer.from(deflated, 'base64')).toString()
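To tie this back to the original string-to-string question, here is a minimal sketch that wraps the synchronous calls in two helpers. The names packString and unpackString are made up for this illustration and are not part of zlib; base64 is just one reasonable choice of text encoding for the compressed bytes.

```javascript
const zlib = require('zlib');

// Hypothetical helper: compress a string and return it as base64 text.
function packString(text) {
  // deflateSync accepts a string and returns a Buffer of compressed bytes.
  return zlib.deflateSync(text).toString('base64');
}

// Hypothetical helper: reverse of packString.
function unpackString(packed) {
  // Decode base64 back into bytes, inflate, then decode the bytes as UTF-8.
  return zlib.inflateSync(Buffer.from(packed, 'base64')).toString('utf8');
}

const original = 'this is a test string to be zipped';
const packed = packString(original);
const restored = unpackString(packed);
console.log('round trip ok:', restored === original);
```

The packed value is an ordinary string, so it can be stored in a database column or a JSON document.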
Update: Didn't realize there was a new built-in 'zlib' module in node 0.5. My answer below is for the 3rd party node-zlib module. Will update answer for the built-in version momentarily.
Update 2: Looks like there may be an issue with the built-in 'zlib'. The sample code in the docs doesn't work for me. The resulting file isn't gunzip'able (fails with "unexpected end of file" for me). Also, the API of that module isn't particularly well-suited for what you're trying to do. It's more for working with streams rather than buffers, whereas the node-zlib module has a simpler API that's easier to use for Buffers.
An example of deflating and inflating, using 3rd party node-zlib module:
// Load zlib and create a buffer to compress
var zlib = require('zlib');
var input = new Buffer('lorem ipsum dolor sit amet', 'utf8')
// What's 'input'?
//input
//<Buffer 6c 6f 72 65 6d 20 69 70 73 75 6d 20 64 6f 6c 6f 72 20 73 69 74 20 61 6d 65 74>
// Compress it
zlib.deflate(input)
//<SlowBuffer 78 9c cb c9 2f 4a cd 55 c8 2c 28 2e cd 55 48 c9 cf c9 2f 52 28 ce 2c 51 48 cc 4d 2d 01 00 87 15 09 e5>
// Compress it and convert to utf8 string, just for the heck of it
zlib.deflate(input).toString('utf8')
//'x???/J?U?,(.?UH???/R(?,QH?M-\u0001\u0000?\u0015\t?'
// Compress, then uncompress (get back what we started with)
zlib.inflate(zlib.deflate(input))
//<SlowBuffer 6c 6f 72 65 6d 20 69 70 73 75 6d 20 64 6f 6c 6f 72 20 73 69 74 20 61 6d 65 74>
// Again, and convert back to our initial string
zlib.inflate(zlib.deflate(input)).toString('utf8')
//'lorem ipsum dolor sit amet'
broofa's answer is great, and that's exactly how I'd like things to work. For me node insisted on callbacks. This ended up looking like:
var zlib = require('zlib');
var input = Buffer.from('lorem ipsum dolor sit amet', 'utf8');
zlib.deflate(input, function(err, deflated) {
  if (err) throw err;
  console.log("in the deflate callback:", deflated);
  zlib.inflate(deflated, function(err, inflated) {
    if (err) throw err;
    console.log("in the inflate callback:", inflated);
    console.log("to string:", inflated.toString("utf8"));
  });
});
which gives:
in the deflate callback: <Buffer 78 9c cb c9 2f 4a cd 55 c8 2c 28 2e cd 55 48 c9 cf c9 2f 52 28 ce 2c 51 48 cc 4d 2d 01 00 87 15 09 e5>
in the inflate callback: <Buffer 6c 6f 72 65 6d 20 69 70 73 75 6d 20 64 6f 6c 6f 72 20 73 69 74 20 61 6d 65 74>
to string: lorem ipsum dolor sit amet
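If the nested callbacks get unwieldy, the same asynchronous functions can be wrapped with util.promisify and awaited. This is only a sketch: roundTrip is a hypothetical name, and it assumes a Node version that supports async functions.

```javascript
const zlib = require('zlib');
const { promisify } = require('util');

// Promise-returning wrappers around the callback-style zlib functions.
const deflate = promisify(zlib.deflate);
const inflate = promisify(zlib.inflate);

// Hypothetical helper: compress and then decompress a string.
async function roundTrip(text) {
  const compressed = await deflate(Buffer.from(text, 'utf8'));
  const restored = await inflate(compressed);
  return restored.toString('utf8');
}

roundTrip('lorem ipsum dolor sit amet')
  .then((text) => console.log('to string:', text))
  .catch((err) => console.error('zlib failed:', err.message));
```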
Here is a non-callback version of the code:
var zlib = require('zlib');
var input = Buffer.from('John Dauphine', 'utf8');
var deflated = zlib.deflateSync(input);
// Compressed bytes are not valid UTF-8 text, so print them as base64
console.log("Deflated:", deflated.toString("base64"));
var inflated = zlib.inflateSync(deflated);
console.log("Inflated:", inflated.toString("utf-8"));
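One caveat when turning the deflated buffer into text: compressed data is arbitrary binary, so decoding it as UTF-8 is lossy and the resulting string cannot be inflated again. The sketch below compares a UTF-8 round trip with a base64 round trip; the variable names are only illustrative.

```javascript
const zlib = require('zlib');

const deflated = zlib.deflateSync(Buffer.from('John Dauphine', 'utf8'));

// Decoding compressed bytes as UTF-8 replaces invalid sequences with U+FFFD,
// so the original bytes cannot be recovered from that string.
const viaUtf8 = Buffer.from(deflated.toString('utf8'), 'utf8');
console.log('utf8 round trip intact:', viaUtf8.equals(deflated));

// base64 maps every byte to printable characters and back without loss.
const viaBase64 = Buffer.from(deflated.toString('base64'), 'base64');
console.log('base64 round trip intact:', viaBase64.equals(deflated));
console.log('Inflated:', zlib.inflateSync(viaBase64).toString('utf8'));
```

The 'hex' encoding works the same way as 'base64' here, at the cost of a longer string.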
I have a memory leak in a .NET web service application. Following a suggestion, I was able to analyze the dump file. I suspect it is a native memory leak, but I'm unable to figure out the root cause. I have followed the steps mentioned in the link.
Here is what I have so far:
address summary
--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Heap 328 4a256000 ( 1.159 GB) 69.38% 57.93%
<unknown> 1253 1b64b000 ( 438.293 MB) 25.63% 21.40%
Free 246 151fa000 ( 337.977 MB) 16.50%
Native Heap
0:000> !heap -s
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
-----------------------------------------------------------------------------
001b0000 00000002 1036480 1024552 1036480 411 745 68 5 0 LFH
00010000 00008000 64 4 64 2 1 1 0 0
001b0000 is using more than 1GB
Allocation info
0:000> !heap -stat -h 001b0000
heap # 001b0000
group-by: TOTSIZE max-display: 20
size #blocks total ( %) (percent of total busy bytes)
4e24 3a24 - 11bf2510 (28.82)
1001f e89 - e8ac297 (23.61)
Filtering 4e24
0:000> !heap -flt s 4e24
_HEAP # 1b0000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
01fa4810 09c6 0000 [00] 01fa4818 04e24 - (busy)
01fa9640 09c6 09c6 [00] 01fa9648 04e24 - (busy)
01fae470 09c6 09c6 [00] 01fae478 04e24 - (busy)
There are a ton of busy blocks of this size. Dumping the memory of one of them:
0:000> dc 01fb32a0 L 2000
01fb32a0 b87718ff 0c12a52e 6f4d3c00 656c7564 ..w......<Module
01fb32b0 6977003e 4d2e796c 4d6b636f 6c75646f >.wily.MockModul
01fb32c0 65440065 6c756166 79442074 696d616e e.Default Dynami
01fb32d0 6f4d2063 656c7564 6c697700 67412e79 c Module.wily.Ag
01fb32e0 00746e65 2e6d6f63 796c6977 6573692e ent.com.wily.ise
01fb32f0 7261676e 65722e64 74736967 49007972 ngard.registry.I
01fb3300 69676552 79727473 76726553 00656369 RegistryService.
01fb3310 6f63736d 62696c72 73795300 006d6574 mscorlib.System.
01fb3320 656a624f 50007463 79786f72 67655249 Object.ProxyIReg
01fb3330 72747369 72655379 65636976 6f4d4e00 istryService.NMo
01fb3340 49006b63 6f766e49 69746163 61486e6f ck.IInvocationHa
01fb3350 656c646e 695f0072 636f766e 6f697461 ndler._invocatio
01fb3360 6e61486e 72656c64 73795300 2e6d6574 nHandler.System.
01fb3370 6c666552 69746365 4d006e6f 6f687465 Reflection.Metho
01fb3380 666e4964 6d5f006f 6f687465 666e4964 dInfo._methodInf
01fb3390 70614d6f 766e4900 00656b6f 2e6d6f63 oMap.Invoke.com.
01fb33a0 796c6977 6573692e 7261676e 74752e64 wily.isengard.ut
01fb33b0 742e6c69 00656572 65726944 726f7463 il.tree.Director
01fb33c0 74615079 646e4168 72746e45 65520079 yPathAndEntry.Re
01fb33d0 74736967 6e457972 00797274 72657571 gistryEntry.quer
01fb33e0 746e4579 73656972 6d6f6300 6c69772e yEntries.com.wil
01fb33f0 73692e79 61676e65 6f2e6472 696f676e y.isengard.ongoi
01fb3400 7571676e 00797265 65755141 6f4e7972 ngquery.AQueryNo
01fb3410 69666974 69746163 72006e6f 73696765 tification.regis
01fb3420 4f726574 696f676e 7551676e 00797265 terOngoingQuery.
01fb3430 2e6d6f63 796c6977 6573692e 7261676e com.wily.isengar
01fb3440 6f702e64 666f7473 65636966 736f5000 d.postoffice.Pos
01fb3450 66664f74 53656369 69636570 72656966 tOfficeSpecifier
01fb3460 72694400 6f746365 61507972 61006874 .DirectoryPath.a
01fb3470 6e456464 00797274 45746567 7972746e ddEntry.getEntry
01fb3480 6c656400 45657465 7972746e 74656700 .deleteEntry.get
01fb3490 44627553 63657269 69726f74 2e007365 SubDirectories..
01fb34a0 726f7463 74632e00 0000726f 00000000 ctor..ctor......
01fb34b0 00000000 00000000 00000000 00000000 ...............
I'm not sure if I'm on the right path with this root cause analysis.
That <Module> at the beginning is a sign of a dynamically generated assembly.
Load the SOS extension using .loadby sos clr (for a dump taken on the current machine) or .cordll -ve -u -l if you are debugging someone else's dump (this doesn't work well in old WinDbg 6.x, but works well in WinDbg from the Windows Development Kit 8 and above).
Execute !eeheap and check the Module Thunk heaps section. It should contain thousands of records:
--------------------------------------
Module Thunk heaps:
Module 736b1000: Size: 0x0 (0) bytes.
Module 004f2ed4: Size: 0x0 (0) bytes.
...
<thousands of similar lines>
...
Total size: Size: 0x0 (0) bytes.
In my case it was assemblies generated for serialization by MS XmlSerializer class that took all the memory:
00000000`264b7640 00 00 3e 00 ce 01 00 00 00 00 00 00 00 3c 4d 6f ..>..........<Mo
00000000`264b7650 64 75 6c 65 3e 00 4d 69 63 72 6f 73 6f 66 74 2e dule>.Microsoft.
00000000`264b7660 47 65 6e 65 72 61 74 65 64 43 6f 64 65 00 52 65 GeneratedCode.Re
00000000`264b7670 66 45 6d 69 74 5f 49 6e 4d 65 6d 6f 72 79 4d 61 fEmit_InMemoryMa
00000000`264b7680 6e 69 66 65 73 74 4d 6f 64 75 6c 65 00 6d 73 63 nifestModule.msc
00000000`264b7690 6f 72 6c 69 62 00 53 79 73 74 65 6d 2e 53 65 63 orlib.System.Sec
I could avoid this leak by using only a single instance of XmlSerializer for each type.
In your case it seems that something else (wily.MockModule) generates the assemblies, so a different solution might be required.
I am working with Change Streams, introduced in MongoDB version 3.6. Change Streams have a feature where I can specify that streaming should start from a particular change in history. For resuming a change stream, the documentation of the native driver for Node.js says (documentation here):
Specifies the logical starting point for the new change stream. This should be the _id field from a previously returned change stream document.
When I print it in console, this is what I am getting
{ _id:
{ _data:
Binary {
_bsontype: 'Binary',
sub_type: 0,
position: 49,
buffer: <Buffer 82 5a 61 a5 4f 00 00 00 01 46 64 5f 69 64 00 64 5a 61 a5 4f 08 c2 95 31 d0 48 a8 2e 00 5a 10 04 7c c9 60 de de 18 48 94 87 3f 37 63 08 da bb 78 04> } },
...
}
My problem is that I do not know how to store an _id of this format in a database or a file. Is it possible to convert this binary object to a string so I can use it later to resume my change stream from that particular _id? Example code would be greatly appreciated.
Convert BSON Binary to buffer and back
const Binary = require('mongodb').Binary;
const fs = require('fs');
Save _data from _id:
var wstream = fs.createWriteStream('/tmp/test');
wstream.write(lastChange._id._data.read(0));
wstream.end();
Then rebuild resumeToken:
fs.readFile('/tmp/test', void 0, function(err, data) {
const resumeToken = { _data: new Binary(data) };
});
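Since the question asks for a string rather than a file, the same idea can be sketched with a base64 string in between. This example does not load the mongodb package: tokenBytes is a stand-in for the Buffer obtained from lastChange._id._data (here the first bytes of the buffer printed in the question), and tokenToString / stringToTokenBytes are hypothetical helper names.

```javascript
// Stand-in for the bytes read from lastChange._id._data; in real code
// they would come from the Binary object, as in the answer above.
const tokenBytes = Buffer.from('825a61a54f000000014664', 'hex');

// Hypothetical helper: turn the raw token bytes into storable text.
function tokenToString(bytes) {
  return bytes.toString('base64');
}

// Hypothetical helper: turn the stored text back into the raw bytes.
function stringToTokenBytes(text) {
  return Buffer.from(text, 'base64');
}

const stored = tokenToString(tokenBytes);
const restoredBytes = stringToTokenBytes(stored);
console.log(stored, restoredBytes.equals(tokenBytes));

// With the driver loaded, the token would then be rebuilt as in the answer:
// const resumeToken = { _data: new Binary(restoredBytes) };
```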
Using NodeJS v5.6 I created a file called read-stream.js:
const
  fs = require('fs'),
  stream = fs.createReadStream(process.argv[2]);
stream.on('data', function(chunk) {
  process.stdout.write(chunk);
});
stream.on('error', function(err) {
  process.stderr.write("ERROR: " + err.message + "\n");
});
and a data file in plain text called target.txt:
hello world
this is the second line
If I do node read-stream.js target.txt the contents of target.txt are printed normally on my console and all is well.
However if I switch process.stdout.write(chunk); with console.log(chunk); then the result I get is this:
<Buffer 68 65 6c 6c 6f 20 77 6f 72 6c 64 0a 74 68 69 73 20 69 73 20 74 68 65 20 73 65 63 6f 6e 64 20 6c 69 6e 65 0a>
I recently found out that by doing console.log(chunk.toString()); the contents of my file are once again printed normally.
As per this question, console.log is supposed to use process.stdout.write with the addition of a \n character. But what exactly is happening with encoding/decodings here?
Thanks in advance.
process.stdout is a stream, and its write() function only accepts strings and buffers. chunk is a Buffer object, so process.stdout.write writes its bytes directly to your console, where the terminal displays them as text. console.log builds a string representation of the Buffer object before outputting it, hence the <Buffer at the beginning to indicate the object's type, followed by the bytes of the buffer in hexadecimal.
On a side note, process.stdout being a stream, you can pipe to it directly instead of reading every chunk:
stream.pipe(process.stdout);
I believe I found out what's happening:
The implementation of console.log in NodeJS is this:
Console.prototype.log = function() {
this._stdout.write(util.format.apply(this, arguments) + '\n');
};
However, the util.format function in lib/util.js of NodeJS calls the inspect method on any input object, which in turn "returns a string representation of object, which is useful for debugging."
Thus what's happening here is that due to util.format "object casting", anytime that we pass an object to console.log, that particular object is first turned into a string representation and then is passed to process.stdout.write as a string and finally gets written to the terminal.
So, when we directly use process.stdout.write with buffer objects, util.format is completely skipped and each byte is directly written to terminal as process.stdout.write is designed to handle them directly.
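The difference can be reproduced without a file. The following sketch uses util.inspect, the function the explanation above refers to, on a small Buffer and compares it with the decoded text; the sample content is just the first line of target.txt.

```javascript
const util = require('util');

const chunk = Buffer.from('hello world\n', 'utf8');

// What console.log(chunk) prints: the inspected form of the object.
const inspected = util.inspect(chunk);
console.log(inspected);

// What the terminal shows for process.stdout.write(chunk): the bytes as text.
const decoded = chunk.toString('utf8');
process.stdout.write(decoded);
```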
I have been given a text file containing hex data which I know forms a jpeg image. Below is an example of the format:
FF D8 FF E0 00 10 4A 46 49 46 00 01 02 00 00 64 00 64 00 00 FF E1 00 B8 45 78 69 00 00 4D
This is only a snippet but you get the idea.
Does anyone know how I could convert this back into the original jpeg?
To convert from a hex string to a byte you use the Convert.ToByte with a base 16 parameter.
To convert a byte array to a Bitmap you put it in a Stream and use the Bitmap(Stream) constructor:
using System;
using System.Drawing;
using System.IO;
using System.Text.RegularExpressions;
//..
// The input separates the byte values with spaces, so strip all whitespace first
string hexstring = Regex.Replace(File.ReadAllText(yourInputFile), @"\s+", "");
byte[] bytes = new byte[hexstring.Length / 2];
for (int i = 0; i < bytes.Length; i++)
    bytes[i] = Convert.ToByte(hexstring.Substring(i * 2, 2), 16);
using (MemoryStream ms = new MemoryStream(bytes))
{
    Bitmap bmp = new Bitmap(ms);
    // now you can either do this:
    bmp.Save(yourOutputfile, System.Drawing.Imaging.ImageFormat.Jpeg);
    bmp.Dispose(); // dispose of the Bitmap when you are done with it!
    // or, instead of saving and disposing, that:
    // pictureBox1.Image = bmp; // Don't dispose as long as the PictureBox needs it!
}
I guess there are more LINQ-ish ways, but as long as it works..
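For comparison with the Node.js snippets elsewhere on this page, the hex-parsing step can also be sketched in JavaScript. This is not part of the C# answer: hexTextToBuffer and convertFile are hypothetical names, and the sketch only restores the raw bytes; it does not decode or validate the image.

```javascript
const fs = require('fs');

// Hypothetical helper: parse whitespace-separated hex pairs into bytes.
function hexTextToBuffer(hexText) {
  const compact = hexText.replace(/\s+/g, '');
  if (compact.length % 2 !== 0 || /[^0-9a-fA-F]/.test(compact)) {
    throw new Error('input is not a whole number of hex byte pairs');
  }
  return Buffer.from(compact, 'hex');
}

// Hypothetical helper: read the hex text file and write the binary file.
function convertFile(inputPath, outputPath) {
  const bytes = hexTextToBuffer(fs.readFileSync(inputPath, 'utf8'));
  fs.writeFileSync(outputPath, bytes);
}

// The first bytes of the sample from the question.
const sampleBytes = hexTextToBuffer('FF D8 FF E0 00 10 4A 46 49 46');
console.log(sampleBytes);
```

Writing the bytes straight to a .jpg file is enough when the hex dump is a complete JPEG, because no re-encoding is needed.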
I have the following function in node.js inside an http.request():
res.on('data', function (chunk) {
  var sr="response: "+chunk;
  console.log(chunk);
});
I get this in console
<Buffer 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 2e 30 22 20 65 6e 63 6f
64 69 6e 67 3d 22 75 74 66 2d 38 22 20 3f 3e 3c 72 65 73 75 6c 74 3e 3c 73 75 63
...>
But when I use this:
res.on('data', function (chunk) {
  var sr="response: "+chunk;
  console.log(sr);
});
I get a proper xml response like this:
response: .....xml response.....
I don't understand why I need to append a string to output the proper response. And what does the output of the first snippet mean?
chunk is a Buffer, which is Node's way of storing binary data.
Because it's binary data, you need to convert it to a string before you can properly show it (otherwise, console.log will show its object representation). One method is to append it to another string (your second example does that), another method is to call toString() on it:
console.log(chunk.toString());
However, I think this can fail when chunk contains incomplete characters: a UTF-8 character can consist of multiple bytes, and there is no guarantee that chunk isn't cut off right in the middle of such a byte sequence.
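The split-character problem can be demonstrated, and avoided, with the core string_decoder module, which buffers incomplete multi-byte sequences between writes. A minimal sketch, with the chunk boundary placed artificially in the middle of a euro sign:

```javascript
const { StringDecoder } = require('string_decoder');

const euro = Buffer.from('€', 'utf8'); // three bytes: e2 82 ac
const firstChunk = euro.subarray(0, 1);
const secondChunk = euro.subarray(1);

// Naive per-chunk decoding turns the partial sequences into replacement characters.
const naive = firstChunk.toString('utf8') + secondChunk.toString('utf8');

// StringDecoder holds back the incomplete bytes until the rest arrives.
const decoder = new StringDecoder('utf8');
const safe = decoder.write(firstChunk) + decoder.write(secondChunk) + decoder.end();

console.log('naive ok:', naive === '€', 'decoder ok:', safe === '€');
```

On a readable stream such as res, calling res.setEncoding('utf8') before attaching the 'data' handler applies the same decoding, so the handler receives strings instead of Buffers.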
chunk is just a Buffer holding the data in binary form. You can also specify utf8 as the character encoding so that the stream emits strings instead; you do this when creating the read stream:
var myReadStream = fs.createReadStream( __dirname + '/readme.txt', 'utf8');
myReadStream.on('data', function(chunk){
  console.log('new chunk received');
  console.log(chunk);
});