Converting a string and an integer to hex in Smalltalk

TL;DR: how do I convert integers to hex, and how do I convert a one-character string to a hex value (i.e. 'F' -> 0xF)?
I'm looking to convert a character to a hex value, do some math, then convert back into a character.
So I have something like this:
addBits: aValue move: action
    "aValue is always either 5 or 10 (0xA)"
    "action is either 'a' for add or 's' for subtract"
    | sum conversion |
    "self stringToMakeHex is the string; it's always either an 'F', 'A', '0',
    or '5'. I need to turn it into either 0xF, 0xA, 0x0, or 0x5."
    conversion := self stringToMakeHex asInteger.
    (action = 's')
        ifTrue: [sum := conversion - aValue]
        ifFalse: [sum := conversion + aValue].
    self stringToMakeHex: sum asString
I know I shouldn't be doing asInteger, as it converts 'F' into a zero somehow, so I'm wondering if there's a nice way to get 0xF (i.e. 15) from it. My other problem is that aValue comes in as an integer (5 and 10 in base 10), so I'll need a way to get the hex values 0x5 and 0xA.
All this data is retrieved via TCP/IP from a different program, so the format I receive is out of my control. It doesn't help that I need to send back a string in order for the communication to be handled across the connection.

In Pharo (and Squeak, if I'm not mistaken) you can use the class-side #readFrom:base: method of Integer:
Integer readFrom: self stringToMakeHex base: 16
This gives you an integer of value 15 in the case of 'F'.
If I were you I'd encapsulate the reading and printing in a new class, e.g. HexString. You could implement on: on the class side and +, -, and printOn: on the instance side.
Update:
To get a String from an integer in a specific base, use #printStringBase: (in Pharo) or #printOn:base: (which is probably more portable), like so:
12 printStringBase: 16
This evaluates to 'C'.

Try this:
(Compiler evaluate: '16r', self stringToMakeHex)
To convert back:
(sum printStringRadix: 16)

Related

Converting an integer to a bytes object in Python 3 results in a "Q" in PyCharm Debugger?

I am writing a function to implement an LFSR in Python 3. I've figured out the hard part, which is stepping through the state using the feedback value to get the key, and getting all of the output bytes by ANDing the key with the input data. To make the bit shifts and OR and XOR operations easier, I did all of my operations on integers and saved the result as an integer; however, the function requires a return type of bytes. From some Googling, an accepted way to convert an integer to a bytes object seems to be:
result_bytes = result_bytes.to_bytes((result_bytes.bit_length() + 7) // 8, 'big')
Running hex_result_bytes = hex(result_bytes) (mind you, in this context result_bytes is still an integer) with result_bytes equal to 3187993425, I get a str result of "0xbe04eb51", so clearly I should expect a bytes object that looks like b'\xbe\x04\xeb\x51'.
After running result_bytes = result_bytes.to_bytes((result_bytes.bit_length() + 7) // 8, 'big'), I should expect b'\xbe\x04\xeb\x51'. Instead, the result I get is b'\xbe\x04\xebQ' in the PyCharm debugger. Is there something obvious that I am missing here? I don't even know how I got Q because Q is clearly not something that you can get as a hexadecimal byte.
I'm not sure why the PyCharm debugger does this, but it seems to be converting part of the binary string to ASCII. I ran an expression to check whether b'\xbe\x04\xebQ' == b'\xbe\x04\xeb\x51', and it evaluates to True. Hex 51 corresponds to Q in ASCII. (This is not actually PyCharm-specific: Python's repr for bytes objects displays any byte in the printable ASCII range as its character rather than as a \x escape.) Another binary string that I am working with right now should be b'\x48\xDC\x40\xD1\x4C', but part of it is also shown as ASCII, namely b'H\xdc@\xd1L', where hex 48 corresponds to H in ASCII, hex 40 to @, and hex 4C to L.

Convert large decimal number to hexadecimal notation

When creating a String object in Swift you can use a String Format Specifier to convert an integer to hexadecimal notation.
print(String(format:"%x", 1234))
// output: 4d2
// expected output: 4d2
But when numbers become bigger, the output is not as expected.
print(String(format:"%x", 12345678901234))
// output: 73ce2ff2
// expected output: b3a73ce2ff2
It seems that the output of String(format:"%x", n) is truncated at 8 hex digits. Since I don't think in hexadecimal natively, this makes debugging hard. I have seen answers for other programming languages explaining that you need to break up the large integer into parts, but that seems wrong to me.
What am I doing wrong here?
What is the right way to convert decimal numbers to hexadecimal numbers in Swift?
You need to use %lx or %llx
print(String(format:"%lx", 12345678901234))
b3a73ce2ff2
Table 2 on the page you linked specifies them:
l - Length modifier specifying that a following d, o, u, x, or X conversion specifier applies to a long or unsigned long argument.
%x alone is for unsigned 32-bit integers, which only go up to 4,294,967,295.

Output UUID in Go as a short string

Is there a built in way, or reasonably standard package that allows you to convert a standard UUID into a short string that would enable shorter URL's?
I.e. taking advantage of using a larger range of characters such as [A-Za-z0-9] to output a shorter string.
I know we can use base64 to encode the bytes, as follows, but I'm after something that creates a string that looks like a "word", i.e. no + and /:
id = base64.StdEncoding.EncodeToString(myUuid.Bytes())
A universally unique identifier (UUID) is a 128-bit value, which is 16 bytes. For human-readable display, many systems use a canonical format using hexadecimal text with inserted hyphen characters, for example:
123e4567-e89b-12d3-a456-426655440000
This has length 16*2 + 4 = 36. You may choose to omit the hyphens, which gives you:
fmt.Printf("%x\n", uuid)
fmt.Println(hex.EncodeToString(uuid))
// Output: 32 chars
123e4567e89b12d3a456426655440000
123e4567e89b12d3a456426655440000
You may choose to use base32 encoding (which encodes 5 bits with 1 symbol in contrast to hex encoding which encodes 4 bits with 1 symbol):
fmt.Println(base32.StdEncoding.EncodeToString(uuid))
// Output: 26 chars
CI7EKZ7ITMJNHJCWIJTFKRAAAA======
Trim the trailing = signs when transmitting, so this will always be 26 chars. Note that you have to append "======" prior to decoding the string with base32.StdEncoding.DecodeString().
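A round-trip sketch of that trim-and-repad step (variable names are illustrative; uuid is the 16-byte value from above, and the strings package is needed):
enc := base32.StdEncoding.EncodeToString(uuid)            // 32 chars, the last 6 are '='
short := strings.TrimRight(enc, "=")                      // the 26 chars you transmit
padded := short + strings.Repeat("=", (8-len(short)%8)%8) // restores the "======"
dec, err := base32.StdEncoding.DecodeString(padded)       // dec is the original 16 bytes again
Alternatively, base32.StdEncoding.WithPadding(base32.NoPadding) (Go 1.9+) encodes to and decodes from the unpadded 26-char form directly, with no manual repadding.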
If this is still too long for you, you may use base64 encoding (which encodes 6 bits with 1 symbol):
fmt.Println(base64.RawURLEncoding.EncodeToString(uuid))
// Output: 22 chars
Ej5FZ-ibEtOkVkJmVUQAAA
Note that base64.RawURLEncoding produces a base64 string (without padding) which is safe for URL inclusion, because the 2 extra chars in the symbol table (beyond [0-9a-zA-Z]) are - and _, both which are safe to be included in URLs.
Unfortunately for you, the base64 string may contain 2 extra chars beyond [0-9a-zA-Z]. So read on.
Interpreted, escaped string
If these 2 extra characters bother you, you may choose to turn your base64 string into an interpreted, escaped string, similar to the interpreted string literals in Go. For example, if you want to insert a backslash into an interpreted string literal, you have to double it, because backslash is a special character that starts a sequence, e.g.:
fmt.Println("One backslash: \\") // Output: One backslash: \
We may choose to do something similar to this. We have to designate a special character: let it be 9.
Reasoning: base64.RawURLEncoding uses the charset A..Za..z0..9-_, so 9 is the alphanumeric character with the highest code (61 decimal = 111101b). See the advantage below.
So whenever the base64 string contains a 9, replace it with 99. And whenever the base64 string contains the extra characters, use a sequence instead of them:
9 => 99
- => 90
_ => 91
This is a simple replacement table which can be captured by a value of strings.Replacer:
var escaper = strings.NewReplacer("9", "99", "-", "90", "_", "91")
And using it:
fmt.Println(escaper.Replace(base64.RawURLEncoding.EncodeToString(uuid)))
// Output:
Ej5FZ90ibEtOkVkJmVUQAAA
This will slightly increase the length, as sometimes a 2-char sequence is used instead of 1 char, but the gain is that only [0-9a-zA-Z] chars will be used, as you wanted. The average increase is less than 1 character, giving 23 chars on average. A fair trade.
Logic: for simplicity, assume all possible UUID byte values have equal probability (a UUID is not completely random, so this is not exactly the case, but let's set that aside as this is just an estimation). The last base64 symbol can never be a replaceable char: 16 bytes is 128 bits, and the first 21 symbols carry 126 of them, so the 22nd symbol encodes only 2 data bits plus 4 zero bits and is always one of A, Q, g or w (this is also why we chose the special char to be 9 rather than something like A). That leaves 21 chars which may turn into a replaceable sequence. The chance of any one being replaceable is 3/64 ≈ 0.047, so on average 21*3/64 ≈ 0.98 chars turn into 2-char sequences, which equals the expected number of extra characters.
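That estimate is easy to check empirically. A quick simulation (a sketch, not part of the original answer; it uses random bytes as a stand-in for UUIDs):
package main

import (
    "encoding/base64"
    "fmt"
    "math/rand"
    "strings"
)

func main() {
    escaper := strings.NewReplacer("9", "99", "-", "90", "_", "91")
    const n = 100000
    total := 0
    id := make([]byte, 16)
    for i := 0; i < n; i++ {
        rand.Read(id) // random stand-in for a UUID
        total += len(escaper.Replace(base64.RawURLEncoding.EncodeToString(id)))
    }
    fmt.Println(float64(total) / n) // prints roughly 22.98
}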
To decode, use an inverse decoding table captured by the following strings.Replacer:
var unescaper = strings.NewReplacer("99", "9", "90", "-", "91", "_")
Example code to decode an escaped base64 string:
fmt.Println("Verify decoding:")
s := escaper.Replace(base64.RawURLEncoding.EncodeToString(uuid))
dec, err := base64.RawURLEncoding.DecodeString(unescaper.Replace(s))
fmt.Printf("%x, %v\n", dec, err)
Output:
123e4567e89b12d3a456426655440000, <nil>
Try all the examples on the Go Playground.
As suggested here, if you want just a fairly random string to use as a slug, it may be better not to bother with a UUID at all.
You can simply use Go's native math/rand library to make random strings of the desired length:
import (
    "encoding/hex"
    "math/rand"
)
b := make([]byte, 4) // 4 random bytes = 8 hex characters
rand.Read(b)
s := hex.EncodeToString(b)
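One caveat (an aside, not part of the original answer): math/rand is deterministic and not cryptographically secure, so if the slugs must be unguessable, crypto/rand is a drop-in replacement here, since its package-level Read has the same shape:
import (
    "crypto/rand"
    "encoding/hex"
)
b := make([]byte, 4)
rand.Read(b) // cryptographically secure random bytes; the error return is worth checking
s := hex.EncodeToString(b)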
Another option is math/big. While base64 has a constant output of 22
characters, math/big can get down to 2 characters, depending on the input:
package main

import (
    "encoding/base64"
    "fmt"
    "math/big"
)

type uuid [16]byte

func (id uuid) encode() string {
    return new(big.Int).SetBytes(id[:]).Text(62)
}

func main() {
    var id uuid
    for n := len(id); n > 0; n-- {
        id[n-1] = 0xFF
        s := base64.RawURLEncoding.EncodeToString(id[:])
        t := id.encode()
        fmt.Printf("%v %v\n", s, t)
    }
}
Result:
AAAAAAAAAAAAAAAAAAAA_w 47
AAAAAAAAAAAAAAAAAAD__w h31
AAAAAAAAAAAAAAAAAP___w 18owf
AAAAAAAAAAAAAAAA_____w 4GFfc3
AAAAAAAAAAAAAAD______w jmaiJOv
AAAAAAAAAAAAAP_______w 1hVwxnaA7
AAAAAAAAAAAA_________w 5k1wlNFHb1
AAAAAAAAAAD__________w lYGhA16ahyf
AAAAAAAAAP___________w 1sKyAAIxssts3
AAAAAAAA_____________w 62IeP5BU9vzBSv
AAAAAAD______________w oXcFcXavRgn2p67
AAAAAP_______________w 1F2si9ujpxVB7VDj1
AAAA_________________w 6Rs8OXba9u5PiJYiAf
AAD__________________w skIcqom5Vag3PnOYJI3
AP___________________w 1SZwviYzes2mjOamuMJWv
_____________________w 7N42dgm5tFLK9N8MT7fHC7
https://golang.org/pkg/math/big
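For completeness, the inverse conversion might look like this (a sketch, not from the original answer; big.Int.FillBytes requires Go 1.15+, and big.Int's Text and SetString share the same base-62 digit set, so the pair round-trips):
func decode(s string) (uuid, error) {
    var id uuid
    n, ok := new(big.Int).SetString(s, 62)
    if !ok {
        return id, fmt.Errorf("not a base-62 string: %q", s)
    }
    // Left-pads the value with zero bytes to the full 16
    // (assumes s encodes a value that fits; FillBytes panics otherwise).
    n.FillBytes(id[:])
    return id, nil
}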

transform string/char to uint8

Why does the expression:
test = cast(strtrim('3'), 'uint8')
produce 51?
This is also true for:
test = cast(strtrim('3'), 'int8')
Thanks.
Because 51 is the ASCII code for the character '3'.
If you want to transform the string to numeric 3, you should use
uint8(str2double('3'))
Note that str2double will ignore trailing spaces, so that strtrim isn't necessary.
EDIT
When a string is used in a numeric operation, Matlab automatically converts it to its ASCII value. For example:
>> '1'+1
ans =
50
Because 51 is the ASCII value of the character '3'.
This is because Matlab sees '3' as an ASCII character. By casting to a signed or unsigned integer (8 bits in this case) you are asking Matlab to convert the ASCII '3' to a decimal number, which in this case is 51. If you want to look at more conversions, here is a basic document.

How do int-to-string casts work in Go?

I only started Go today, so this may be obvious but I couldn't find anything on it.
What does var x uint64 = 0x12345678; y := string(x) give y?
I know var x uint8 = 65; y := string(x) would give y the byte 65, character A, and common sense would suggest (since types larger than uint8 are allowed to be converted to strings) that larger values would simply be packed in native byte order (i.e. little-endian) and assigned to the variable.
This does not seem to be the case:
hex.EncodeToString([]byte(y)) ==> "efbfbd"
My first thought was that this is an address with the last byte left off because of some weird null-terminator thing, but if I allocate two variables x and y with two different values and print them out, I get the same result:
var x, x2 uint64 = 0x10000000, 0x20000000
y, y2 := string(x), string(x2)
fmt.Println(hex.EncodeToString([]byte(y))) // "efbfbd"
fmt.Println(hex.EncodeToString([]byte(y2))) // "efbfbd"
Maddeningly, I can't find the implementation of the string type anywhere, although I probably haven't looked hard enough.
This is covered in the Spec: Conversions: Conversions to and from a string type:
Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer. Values outside the range of valid Unicode code points are converted to "\uFFFD".
So effectively when you convert a numeric value to string, it can only yield a string holding one rune (character). And since Go stores strings as UTF-8-encoded byte sequences in memory, that is what you will see if you convert your string to []byte:
Converting a value of a string type to a slice of bytes type yields a slice whose successive elements are the bytes of the string.
When you try to convert the 0x12345678, 0x10000000 and 0x20000000 values to string, since they are outside the range of valid Unicode code points, per the spec they are converted to "\uFFFD", which in UTF-8 encoding is []byte{239, 191, 189}; when encoded to a hex string:
fmt.Println(hex.EncodeToString([]byte("\uFFFD"))) // Output: efbfbd
Or simply:
fmt.Printf("%x", "\uFFFD") // Output: efbfbd
Read the blog post Strings, bytes, runes and characters in Go for more details about string internals.
And btw since Go 1.5 the Go runtime is implemented (mostly) in Go, so these conversions are now implemented in Go and can be found in the runtime package: runtime/string.go, look for the intstring() function.
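As an aside (not from the original answer): if what you actually wanted was the hex digits of the number, or its raw bytes in a chosen byte order, Go spells those out explicitly via strconv and encoding/binary instead of a string conversion (needs strconv, encoding/binary, encoding/hex and fmt imported):
x := uint64(0x12345678)
fmt.Println(strconv.FormatUint(x, 16)) // "12345678" - the hex digits as text
b := make([]byte, 8)
binary.LittleEndian.PutUint64(b, x) // the little-endian packing the question expected
fmt.Println(hex.EncodeToString(b))  // "7856341200000000"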
