Which delimiter can I safely use to separate zlib-deflated strings in Node? - node.js

I need to send content from a client to a remote server using node.js.
The content can be anything (a user can upload any file).
Each piece of content is compressed by zlib.deflate before sending it to the remote.
I would prefer to avoid multiple round trips and send the entire content at once.
To separate the pieces of content, I need a character that cannot appear in the compressed string, so I can split it safely on the remote.

There is no such character or sequence of characters. zlib compressed data can contain any sequence of bytes.
You could encode the zlib compressed data to avoid one byte value, expanding compressed data slightly. Then you could use that one byte value as a delimiter.
Example code:
// Example of encoding binary data to a sequence of bytes with no zero values.
// The result is expanded slightly. On average, assuming random input, the
// expansion is less than 0.1%. The maximum expansion is less than 14.3%, which
// is reached only if the input is a sequence of bytes all with value 255.
#include <stdio.h>
// Encode binary data read from in, to a sequence of byte values in 1..255
// written to out. There will be no zero byte values in the output. The
// encoding is decoding a flat (equiprobable) Huffman code of 255 symbols.
void no_zeros_encode(FILE *in, FILE *out) {
    unsigned buf = 0;
    int bits = 0, ch;
    do {
        if (bits < 8) {
            ch = getc(in);
            if (ch != EOF) {
                buf += (unsigned)ch << bits;
                bits += 8;
            }
            else if (bits == 0)
                break;
        }
        if ((buf & 127) == 127) {
            putc(255, out);
            buf >>= 7;
            bits -= 7;
        }
        else {
            unsigned val = buf & 255;
            buf >>= 8;
            bits -= 8;
            if (val < 127)
                val++;
            putc(val, out);
        }
    } while (ch != EOF);
}
// Decode a sequence of byte values made by no_zeros_encode() read from in, to
// the original binary data written to out. The decoding is encoding a flat
// Huffman code of 255 symbols. no_zeros_encode() will not generate any zero
// byte values in its output (that's the whole point), but if there are any
// zeros in the input to no_zeros_decode(), they are ignored.
void no_zeros_decode(FILE *in, FILE *out) {
    unsigned buf = 0;
    int bits = 0, ch;
    while ((ch = getc(in)) != EOF)
        if (ch != 0) { // could flag any zeros as an error
            if (ch == 255) {
                buf += 127 << bits;
                bits += 7;
            }
            else {
                if (ch <= 127)
                    ch--;
                buf += (unsigned)ch << bits;
                bits += 8;
            }
            if (bits >= 8) {
                putc(buf, out);
                buf >>= 8;
                bits -= 8;
            }
        }
}
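For readers who want to try the idea without a C toolchain, here is a rough Python port of the two routines above, followed by the delimiter use the question asks about. It is only a sketch: the helper names `pack_chunks` and `unpack_chunks` are made up for illustration, and Python's `zlib.compress` stands in for Node's `zlib.deflate`.

```python
import zlib


def no_zeros_encode(data: bytes) -> bytes:
    """Port of the C encoder above: the output never contains a zero byte."""
    out = bytearray()
    buf = 0      # bit accumulator, least significant bit first
    bits = 0     # number of valid bits in buf
    pos = 0
    eof = False
    while not eof:
        if bits < 8:
            if pos < len(data):
                buf += data[pos] << bits
                pos += 1
                bits += 8
            else:
                eof = True
                if bits == 0:
                    break
        if (buf & 127) == 127:
            out.append(255)   # 255 stands for seven 1 bits
            buf >>= 7
            bits -= 7
        else:
            val = buf & 255
            buf >>= 8
            bits -= 8
            if val < 127:
                val += 1      # shift 0..126 up to 1..127, so 0 is never emitted
            out.append(val)
    return bytes(out)


def no_zeros_decode(data: bytes) -> bytes:
    """Port of the C decoder above: zero bytes in the input are ignored."""
    out = bytearray()
    buf = 0
    bits = 0
    for ch in data:
        if ch == 0:
            continue
        if ch == 255:
            buf += 127 << bits
            bits += 7
        else:
            if ch <= 127:
                ch -= 1
            buf += ch << bits
            bits += 8
        if bits >= 8:
            out.append(buf & 255)
            buf >>= 8
            bits -= 8
    return bytes(out)


def pack_chunks(chunks):
    """Hypothetical helper: deflate each chunk, encode it, join with a zero byte."""
    return b"\x00".join(no_zeros_encode(zlib.compress(c)) for c in chunks)


def unpack_chunks(blob):
    """Hypothetical helper: split on the zero byte, decode, and inflate."""
    return [zlib.decompress(no_zeros_decode(p)) for p in blob.split(b"\x00")]
```

With these, `unpack_chunks(pack_chunks([b"first file", b"second file"]))` should give back the original list, because the zero byte can only ever appear as a separator.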

Related

How to convert string to binary representation in game maker?

I found a script that converts binary to a string, but how can I input a string and get its binary representation? For example, if I put in "P", I want it to output 01010000 as a string.
I have this but it is not what I am trying to do - it converts a string containing a binary number into a real value of that number:
///string_to_binary(string)
var str = argument0;
var output = "";
for(var i = 0; i < string_length(str); i++){
    if(string_char_at(str, i + 1) == "0"){
        output += "0";
    }
    else{
        output += "1";
    }
}
return real(output);
Tip: search for GML or another language term; these questions have been answered many times. Also, please check your tag: it is the IDE tag, not the language tag.
I'm not familiar with GML myself, but a quick search turned up this:
An at least semi-official method for exactly this: http://www.gmlscripts.com/script/bytes_to_bin
/// bytes_to_bin(str)
//
// Returns a string of binary digits, 1 bit each.
//
// str raw bytes, 8 bits each, string
//
/// GMLscripts.com/license
{
    var str, bin, p, byte;
    str = argument0;
    bin = "";
    p = string_length(str);
    repeat (p) {
        byte = ord(string_char_at(str,p));
        repeat (8) {
            if (byte & 1) bin = "1" + bin else bin = "0" + bin;
            byte = byte >> 1;
        }
        p -= 1;
    }
    return bin;
}
GML forum (has several examples) https://www.reddit.com/r/gamemaker/comments/4opzhu/how_could_i_convert_a_string_to_binary/
///string_to_binary(string)
var str = argument0;
var output = "";
for(var i = 0; i < string_length(str); i++){
    if(string_char_at(str, i + 1) == "0"){
        output += "0";
    }
    else{
        output += "1";
    }
}
return real(output);
And other language examples:
C++ Fastest way to Convert String to Binary?
#include <string>
#include <bitset>
#include <iostream>
using namespace std;
int main(){
    string myString = "Hello World";
    for (std::size_t i = 0; i < myString.size(); ++i)
    {
        cout << bitset<8>(myString.c_str()[i]) << endl;
    }
}
Java: Convert A String (like testing123) To Binary In Java
String s = "foo";
byte[] bytes = s.getBytes();
StringBuilder binary = new StringBuilder();
for (byte b : bytes)
{
    int val = b;
    for (int i = 0; i < 8; i++)
    {
        binary.append((val & 128) == 0 ? 0 : 1);
        val <<= 1;
    }
    binary.append(' ');
}
System.out.println("'" + s + "' to binary: " + binary);
JS: How to convert text to binary code in JavaScript?
function convert() {
    var output = document.getElementById("ti2");
    var input = document.getElementById("ti1").value;
    output.value = "";
    for (var i = 0; i < input.length; i++) {
        output.value += input[i].charCodeAt(0).toString(2) + " ";
    }
}
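The same conversion can be sketched in Python as well. The function names below are illustrative only, and the sketch assumes the text is encoded as UTF-8 and that each byte should be zero-padded to 8 digits (the JavaScript snippet above does not pad, so "P" would come out as 1010000 there).

```python
def text_to_binary(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated 8-digit binary groups."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


def binary_to_text(bits: str) -> str:
    """Inverse of text_to_binary: parse space-separated binary groups back to text."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


print(text_to_binary("P"))   # 01010000
```

Padding every byte to a fixed width is what makes the output reversible, since the reader no longer has to guess where one character ends and the next begins.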
I was looking around for a simple GML script to convert a decimal to binary and return the bits in an array. I didn't find anything for my need and to my liking so I rolled my own. Short and sweet.
The first param is the decimal number (string or decimal) and the second param is the bit length.
// dec_to_bin(num, len);
// argument0, decimal string
// argument1, integer
var num = real(argument0);
var len = argument1;
var bin = array_create(len, 0);
for (var i = len - 1; i >= 0; --i) {
    bin[i] = floor(num % 2);
    num -= num / 2;
}
return bin;
Usage:
dec_to_bin("48", 10);
Output:
{ { 0,0,0,0,1,1,0,0,0,0 }, }
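For comparison, a minimal Python sketch of the same idea is shown here. It is not a translation of the GML runtime behaviour, only of the algorithm: it uses integer division instead of halving a real number, and it assumes a non-negative input.

```python
def dec_to_bin(num, length):
    """Return the lowest `length` bits of num as a list, most significant bit first."""
    num = int(num)            # accepts "48" as well as 48
    bits = [0] * length
    for i in range(length - 1, -1, -1):
        bits[i] = num % 2     # lowest remaining bit goes in the rightmost free slot
        num //= 2
    return bits


print(dec_to_bin("48", 10))   # [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
```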
I think the binary you mean is the kind computers use. If so, just use ordinary binary and add a kind of identifier.
Binary is actually simpler than most people think.
Each digit is worth twice the previous one (2⁰, 2¹, 2², ...), so we get:
1, 2, 4, 8, 16, 32, 64, 128, 256, 512...
Flip it and get:
...512, 256, 128, 64, 32, 16, 8, 4, 2, 1
Each digit is "activated" by a 1; add up all the activated numbers and that is the value.
OK, so binary is just another number system, not a code of some kind. Then how are letters and other characters calculated?
They aren't.
We just represent them by their position in the alphabet, so:
a=1
b=2
c=3
...
This means that "b" in binary would be "10", but "2" is also "10". That is where the computer's binary comes in.
It just adds an identifier before the actual number, so:
letter_10 = b
number_10 = 2
signal_10 = "
Wait, but binary can't contain letters, so more 0s and 1s are used instead:
011_10 = b
0011_10 = 2
001_10 = "
Computers also can't know where a number starts and ends, so you always have to use the same number of digits, which is 8. Now we get:
011_00010 = b
0011_0010 = 2
001_00010 = "
Then remove the "_", because again, computers only use 0s and 1s. Done!
So what I mean is: just use the code you had and add 00110000 to the value, or, if you want to translate these numbers to letters as I did, add 01100000.
If you have the letter and want the binary, first convert the letter to its number. Just note that letters don't start at 1: capital letters start right after 64 and lowercase letters right after 96.
ord("p")=112
112-96=16
16 in binary is 10000
10000 + 01100000 = 01110000
"p" in binary is 01110000
ord("P")=80
80-64=16
16 in binary is 10000
10000 + 01000000 = 01010000
"P" in binary is 01010000
That's just an explanation of what the code should do. I'm actually looking for a simple way to convert to binary myself, because I can't understand much of the code you showed.
(011)
1000 1111 10000 101 1001 1000 101 1100 10000 101 100
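The worked arithmetic for "p" and "P" above can be checked with a few lines of Python. This is only an illustration of the prefix-plus-position view described in this answer: `letter_parts` is a made-up name, and the sketch assumes plain ASCII letters.

```python
def letter_parts(ch: str):
    """Split an ASCII letter's code into (prefix, position, 8-bit string)."""
    code = ord(ch)
    prefix = code & 0b11100000     # 64 (010.....) for A-Z, 96 (011.....) for a-z
    position = code & 0b00011111   # 1 for a/A, 2 for b/B, ...
    return prefix, position, format(prefix + position, "08b")


print(letter_parts("p"))   # (96, 16, '01110000')
print(letter_parts("P"))   # (64, 16, '01010000')
```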

Generate Checksum for String

I would like to generate a checksum for strings/data.
1. The same data should produce the same checksum.
2. Two different data strings must not produce the same checksum. A random collision rate of 0.1% is acceptable.
3. No encryption/decryption of data.
4. The checksum need not be long, and it may contain letters and other characters.
5. It must be very fast and efficient. Generating a checksum for 100 MB of text data should take less than 5 minutes, and generating 1000 checksums for segments of less than 1 KB each should take less than 10 seconds.
Any algorithm, implementation reference, or suggestion is much appreciated.
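You can write a custom hash function (C++):
#include <string>
long long hash(const std::string &s) {
    long long k = 7;
    for (std::size_t i = 0; i < s.length(); i++) {
        k *= 23;
        k += static_cast<unsigned char>(s[i]); // avoid negative values for bytes >= 128
        k *= 13;
        k %= 1000000009; // keeps k small, so the multiplications cannot overflow
    }
    return k;
}
This should give you a good hash value (collision-free for most samples).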
A very common, fast checksum is the CRC-32, a 32-bit polynomial cyclic redundancy check. Here are three implementations in C, which vary in speed vs. complexity, of the CRC-32: (This is from http://www.hackersdelight.org/hdcodetxt/crc.c.txt)
#include <stdio.h>
#include <stdlib.h>
// ---------------------------- reverse --------------------------------
// Reverses (reflects) bits in a 32-bit word.
unsigned reverse(unsigned x) {
    x = ((x & 0x55555555) << 1) | ((x >> 1) & 0x55555555);
    x = ((x & 0x33333333) << 2) | ((x >> 2) & 0x33333333);
    x = ((x & 0x0F0F0F0F) << 4) | ((x >> 4) & 0x0F0F0F0F);
    x = (x << 24) | ((x & 0xFF00) << 8) |
        ((x >> 8) & 0xFF00) | (x >> 24);
    return x;
}
// ----------------------------- crc32a --------------------------------
/* This is the basic CRC algorithm with no optimizations. It follows the
logic circuit as closely as possible. */
unsigned int crc32a(unsigned char *message) {
    int i, j;
    unsigned int byte, crc;
    i = 0;
    crc = 0xFFFFFFFF;
    while (message[i] != 0) {
        byte = message[i];            // Get next byte.
        byte = reverse(byte);         // 32-bit reversal.
        for (j = 0; j <= 7; j++) {    // Do eight times.
            if ((int)(crc ^ byte) < 0)
                crc = (crc << 1) ^ 0x04C11DB7;
            else crc = crc << 1;
            byte = byte << 1;         // Ready next msg bit.
        }
        i = i + 1;
    }
    return reverse(~crc);
}
// ----------------------------- crc32b --------------------------------
/* This is the basic CRC-32 calculation with some optimization but no
table lookup. The byte reversal is avoided by shifting the crc reg
right instead of left and by using a reversed 32-bit word to represent
the polynomial.
When compiled to Cyclops with GCC, this function executes in 8 + 72n
instructions, where n is the number of bytes in the input message. It
should be doable in 4 + 61n instructions.
If the inner loop is strung out (approx. 5*8 = 40 instructions),
it would take about 6 + 46n instructions. */
unsigned int crc32b(unsigned char *message) {
    int i, j;
    unsigned int byte, crc, mask;
    i = 0;
    crc = 0xFFFFFFFF;
    while (message[i] != 0) {
        byte = message[i];            // Get next byte.
        crc = crc ^ byte;
        for (j = 7; j >= 0; j--) {    // Do eight times.
            mask = -(crc & 1);
            crc = (crc >> 1) ^ (0xEDB88320 & mask);
        }
        i = i + 1;
    }
    return ~crc;
}
// ----------------------------- crc32c --------------------------------
/* This is derived from crc32b but does table lookup. First the table
itself is calculated, if it has not yet been set up.
Not counting the table setup (which would probably be a separate
function), when compiled to Cyclops with GCC, this function executes in
7 + 13n instructions, where n is the number of bytes in the input
message. It should be doable in 4 + 9n instructions. In any case, two
of the 13 or 9 instructions are load byte.
This is Figure 14-7 in the text. */
unsigned int crc32c(unsigned char *message) {
    int i, j;
    unsigned int byte, crc, mask;
    static unsigned int table[256];
    /* Set up the table, if necessary. */
    if (table[1] == 0) {
        for (byte = 0; byte <= 255; byte++) {
            crc = byte;
            for (j = 7; j >= 0; j--) {    // Do eight times.
                mask = -(crc & 1);
                crc = (crc >> 1) ^ (0xEDB88320 & mask);
            }
            table[byte] = crc;
        }
    }
    /* Through with table setup, now calculate the CRC. */
    i = 0;
    crc = 0xFFFFFFFF;
    while ((byte = message[i]) != 0) {
        crc = (crc >> 8) ^ table[(crc ^ byte) & 0xFF];
        i = i + 1;
    }
    return ~crc;
}
If you simply google "CRC32", you will get more info than you could possibly absorb.
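As a quick cross-check, the bitwise crc32b routine above can be ported to Python and compared against the standard library's `zlib.crc32`. This is a sketch rather than a tuned implementation. Unlike the C versions, which stop at the first zero byte because they treat the message as a NUL-terminated string, it processes every byte it is given. The `checksum_hex` helper is a hypothetical name, added only to show a short letters-and-digits form of the result.

```python
import zlib


def crc32b(message: bytes) -> int:
    """Bitwise CRC-32 (reflected polynomial 0xEDB88320), no lookup table."""
    crc = 0xFFFFFFFF
    for byte in message:
        crc ^= byte
        for _ in range(8):
            mask = -(crc & 1) & 0xFFFFFFFF   # all ones if the low bit is set, else 0
            crc = (crc >> 1) ^ (0xEDB88320 & mask)
    return ~crc & 0xFFFFFFFF


def checksum_hex(message: bytes) -> str:
    """Hypothetical helper: the CRC-32 as an 8-character hex string."""
    return format(crc32b(message), "08x")


print(checksum_hex(b"123456789"))   # cbf43926
```

In practice `zlib.crc32` is the faster choice. The pure-Python loop is only there to show that it computes the same value as the C code above.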

How do I find the right way to decode a USSD command's result in C#?

I'm working with my GSM modem (Huawei E171) to send USSD commands.
To do this, I first send these commands:
AT+CMGF=1
AT+CSCS=? ----> result is "IRA"; this is my modem's default
After that I sent these commands and got these results, and everything works fine.
//*141*1# ----->to check my balance
+CUSD:
0,"457A591C96EB40B41A8D0692A6C36C17688A2E9FCB667AD87D4EEB4130103D
0C8281E4753D0B1926E7CB2018881E06C140F2BADE5583819A4250D24D2FC
BDD653A485AD787DD65504C068381A8EF76D80D2287E53A55AD5653D554
31956D04",15
//*100# ----> this command gives me some options to charge my mobile
+CUSD:
1,"06280627062C06470020062706CC06310627064606330644000A0030002E062E0
63106CC062F00200634062706310698000A0031002E067E062706330627063106A
F0627062F000A0032002E0622067E000A0033002E06450644062A000A003
4002E06330627064506270646000A0035002E067E0627063106330
6CC06270646000A002300200028006E0065007800740029000A",72
I found some code to decode these results.
To decode the balance check result I used:
string result141="457A591C96EB40B41A8D0692A6C36C17688A.......";
byte[] packedBytes = ConvertHexToBytes(result141);
byte[] unpackedBytes = UnpackBytes(packedBytes);
// sometimes this one works, sometimes the one above; I never understood why
string o = Encoding.Default.GetString(unpackedBytes);
My functions are:
public static byte[] ConvertHexToBytes(string hexString)
{
    if (hexString.Length % 2 != 0)
        return null;
    int len = hexString.Length / 2;
    byte[] array = new byte[len];
    for (int i = 0; i < array.Length; i++)
    {
        string tmp = hexString.Substring(i * 2, 2);
        array[i] =
            byte.Parse(tmp, System.Globalization.NumberStyles.HexNumber);
    }
    return array;
}
public static byte[] UnpackBytes(byte[] packedBytes)
{
    byte[] shiftedBytes = new byte[(packedBytes.Length * 8) / 7];
    int shiftOffset = 0;
    int shiftIndex = 0;
    // Shift the packed bytes to the left according
    // to the offset (position of the byte)
    foreach (byte b in packedBytes)
    {
        if (shiftOffset == 7)
        {
            shiftedBytes[shiftIndex] = 0;
            shiftOffset = 0;
            shiftIndex++;
        }
        shiftedBytes[shiftIndex] = (byte)((b << shiftOffset) & 127);
        shiftOffset++;
        shiftIndex++;
    }
    int moveOffset = 0;
    int moveIndex = 0;
    int unpackIndex = 1;
    byte[] unpackedBytes = new byte[shiftedBytes.Length];
    //
    if (shiftedBytes.Length > 0)
    {
        unpackedBytes[unpackIndex - 1] =
            shiftedBytes[unpackIndex - 1];
    }
    // Move the bits to the appropriate byte (unpack the bits)
    foreach (byte b in packedBytes)
    {
        if (unpackIndex != shiftedBytes.Length)
        {
            if (moveOffset == 7)
            {
                moveOffset = 0;
                unpackIndex++;
                unpackedBytes[unpackIndex - 1] =
                    shiftedBytes[unpackIndex - 1];
            }
            if (unpackIndex != shiftedBytes.Length)
            {
                // Extract the bits to be moved
                int extractedBitsByte = (packedBytes[moveIndex] &
                    _decodeMask[moveOffset]);
                // Shift the extracted bits to the proper offset
                extractedBitsByte =
                    (extractedBitsByte >> (7 - moveOffset));
                // Move the bits to the appropriate byte
                // (unpack the bits)
                int movedBitsByte =
                    (extractedBitsByte | shiftedBytes[unpackIndex]);
                unpackedBytes[unpackIndex] = (byte)movedBitsByte;
                moveOffset++;
                unpackIndex++;
                moveIndex++;
            }
        }
    }
    // Remove the padding if exists
    if (unpackedBytes[unpackedBytes.Length - 1] == 0)
    {
        byte[] finalResultBytes = new byte[unpackedBytes.Length - 1];
        Array.Copy(unpackedBytes, 0,
            finalResultBytes, 0, finalResultBytes.Length);
        return finalResultBytes;
    }
    return unpackedBytes;
}
But to decode the second result I used:
string strHex= "06280627062C06470020062706CC06310......";
strHex = strHex.Replace(" ", "");
int nNumberChars = strHex.Length / 2;
byte[] aBytes = new byte[nNumberChars];
using (var sr = new StringReader(strHex))
{
    for (int i = 0; i < nNumberChars; i++)
        aBytes[i] = Convert.ToByte(
            new String(new char[2] {
                (char)sr.Read(), (char)sr.Read() }), 16);
}
string decodedmessage= Encoding.BigEndianUnicode.
    GetString(aBytes, 0, aBytes.Length);
Both of them work correctly, but why do I need different decoding methods for these results?
How can I tell which of these two decodings I should use?
+CUSD unsolicited responses to USSD commands are formatted as follows:
+CUSD: <m>[,<str_urc>[,<dcs>]]
where "m" is the type of action required, "str_urc" is the response string, and "dcs" is the encoding of the response string.
This quote is from a Siemens Cinterion MC55i manual but applies generally to other modem manufacturers:
If dcs indicates that GSM 03.38 default alphabet is used TA converts GSM alphabet into current TE character
set according to rules of GSM 07.05 Annex A. Otherwise in case of invalid or omitted dcs conversion of
str_urc is not possible.
USSDs can be sent in 7-bit encoded format or UCS2; hence, in your two example responses you can see a DCS of either 15 or 72.
GSM 03.38 Cell Broadcast Data Coding Scheme in integer format (default 15). In case of an invalid or omitted
dcs from the network side (MT) will not be given out.
So if you get a DCS of 15, the response is 7-bit encoded, and if it is 72, it is UCS2. From this you can easily select either your first decoding routine or your second.
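The selection rule can be sketched in a few lines of Python. This is a simplified illustration, not a full DCS parser: it handles only the two values seen in the question (15 and 72), the function names are made up, and it maps the unpacked septets straight to ASCII, which agrees with the GSM 03.38 default alphabet for ordinary letters and digits but not for every symbol.

```python
def unpack_gsm7(packed: bytes) -> bytes:
    """Unpack GSM 7-bit packed data: septets are stored least significant bit first."""
    septets = bytearray()
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits in acc
    for byte in packed:
        acc |= byte << nbits
        nbits += 8
        while nbits >= 7:
            septets.append(acc & 0x7F)
            acc >>= 7
            nbits -= 7
    return bytes(septets)


def decode_cusd(hex_text: str, dcs: int) -> str:
    """Pick the decoder from the dcs field of a +CUSD response."""
    raw = bytes.fromhex(hex_text)
    if dcs == 15:    # GSM 7-bit default alphabet, packed
        return unpack_gsm7(raw).decode("ascii", errors="replace")
    if dcs == 72:    # UCS2, big-endian 16-bit code units
        return raw.decode("utf-16-be")
    raise ValueError(f"unhandled dcs value: {dcs}")


print(decode_cusd("E8329BFD4697D9EC37", 15))   # hellohello
```

When the packed length is a multiple of 7 bytes, the final septet may be padding rather than a real character, so a caller may need to trim it.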

How can I safely and simply read a line of text from a file or stdin?

Given that fgets only sometimes includes a linebreak, and fscanf is inherently unsafe, I would like a simple alternative to read text line-by-line from a file. Is this page a good place to find such a function?
Yes. The following function should satisfy this requirement without creating any damaging security flaws.
#include <stdio.h>
/* reads from [stream] into [buffer] until terminated by
 * \r, \n or EOF, or [lastnullindex] is reached. Returns
 * the number of characters read excluding the terminating
 * character. [lastnullindex] refers to the uppermost index
 * of the [buffer] array. If an error occurs or non-text
 * characters (below space ' ' or above tilde '~') are
 * detected, the buffer will be emptied and 0 returned.
 */
int readline(FILE *stream, char *buffer, int lastnullindex) {
    if (!stream) return 0;
    if (!buffer) return 0;
    if (lastnullindex < 0) return 0;
    int inch = EOF;
    int chi = 0;
    while (chi < lastnullindex) {
        inch = fgetc(stream);
        if (inch == EOF || inch == '\n' || inch == '\r') {
            buffer[chi] = '\0';
            break;
        } else if (inch >= ' ' && inch <= '~') {
            buffer[chi] = (char)inch;
            chi++;
        } else {
            buffer[0] = '\0';
            return 0;
        }
    }
    if (chi < 0 || chi > lastnullindex) {
        buffer[0] = '\0';
        return 0;
    } else {
        buffer[chi] = '\0';
        return chi;
    }
}

Pipe Read Processing

I have to get input from a user, put it into a pipe (in the parent process), and then process the string in the child. All uppercase letters need to become lowercase and all lowercase letters uppercase. My issue is with the output of the pipe: my code only changes the case of the first character in the string, and I am not sure why. The child appears to read all the characters from the pipe. I was hoping someone could tell me why this won't process each character.
while (read(pfd[0], &buf, strlen(cmd)) > 0){
    if(buf >= 'a' && buf <= 'z'){
        buf = toupper(buf);
    }
    else{
        buf = tolower(buf);
    }
}
write(STDOUT_FILENO, &buf, strlen(cmd));
You are making two common mistakes.
(1) read does not buffer for you, so you are not guaranteed to get len bytes (i.e. strlen(cmd) in your case). read returns however many bytes are available, up to the length you specify, and it can and often will return fewer. So you want to change your read loop to reflect that.
(2) buf is presumably a char array. You are always changing the first byte and only the first byte. You need to iterate over all the bytes you just read.
Putting it all together, something like:
ssize_t bytesread;
while ((bytesread = read(pfd[0], buf, strlen(cmd))) > 0)
{
    for (ssize_t i = 0; i < bytesread; ++i)
    {
        if (buf[i] >= 'a' && buf[i] <= 'z')
            buf[i] = toupper((unsigned char)buf[i]);
        else
            buf[i] = tolower((unsigned char)buf[i]);
    }
    write(STDOUT_FILENO, buf, bytesread);
}
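The same two points (read may return fewer bytes than requested, and every byte read must be processed) can be seen in a small Python sketch. It is not the asker's program: it writes to the pipe and reads it back in one process, and the deliberately tiny `chunk_size` is an assumption chosen to force several short reads.

```python
import os


def swap_case_from_fd(fd: int, chunk_size: int = 4) -> bytes:
    """Read fd until EOF, swapping the case of every ASCII letter in each chunk."""
    out = bytearray()
    while True:
        chunk = os.read(fd, chunk_size)   # may return fewer than chunk_size bytes
        if not chunk:                     # an empty result means EOF
            break
        out += chunk.swapcase()           # process all bytes read, not just the first
    return bytes(out)


read_end, write_end = os.pipe()
os.write(write_end, b"Hello, World")
os.close(write_end)                       # closing the write end lets the reader see EOF
result = swap_case_from_fd(read_end)
os.close(read_end)
print(result)                             # b'hELLO, wORLD'
```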
