How to convert string into fixed number of bytes?

I want to create an 8 bytes sized variable that will include my string.
byte = 8_bytes_variable
str = 'hello'
# Put str inside byte while byte still remains of size 8 bytes.

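You can format the string first by adding spaces to the beginning of it. Here I assumed that each character takes 1 byte. (Chinese characters take more.) Note that the format spec pads to 8 characters, not 8 bytes, so the result is exactly 8 bytes only when every character encodes to a single byte in UTF-8.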
str = 'hello'
if len(str.encode('utf-8')) > 8:
    print("This is not possible!")
else:
    str2 = '{0: >8}'.format(str)  # adds the needed spaces to the beginning of str
    byte = str2.encode('utf-8')
In order to get the original string later, you can use lstrip():
str2 = byte.decode()
str = str2.lstrip()
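If the string may contain multi-byte characters, one option is to pad after encoding rather than before. The following is a minimal sketch under that assumption; the helper names to_fixed_bytes and from_fixed_bytes are hypothetical and not part of the answer above.

```python
def to_fixed_bytes(text, size=8, pad=b' '):
    """Encode text as UTF-8 and left-pad the bytes to exactly `size` bytes."""
    data = text.encode('utf-8')
    if len(data) > size:
        raise ValueError("encoded text does not fit in {} bytes".format(size))
    # bytes.rjust pads on the left, counting bytes rather than characters.
    return data.rjust(size, pad)


def from_fixed_bytes(data, pad=b' '):
    """Strip the left padding and decode back to a string."""
    return data.lstrip(pad).decode('utf-8')


packed = to_fixed_bytes('hello')
print(packed, len(packed))
print(from_fixed_bytes(packed))
```

As with lstrip() above, any leading spaces that were part of the original string are lost when the padding is removed.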

Related

Why does using "+=" to append to a List[str] result in an unexpected newline character, while "c = c + a" results in c being empty?

I'm working on this problem on LeetCode:
https://leetcode.com/problems/read-n-characters-given-read4/
The question reads:
Given a file and assume that you can only read the file using a given
method read4, implement a method to read n characters.
Method read4:
The API read4 reads 4 consecutive characters from the file, then
writes those characters into the buffer array buf4.
The return value is the number of actual characters read.
Note that read4() has its own file pointer, much like FILE *fp in C.
Definition of read4:
Parameter: char[] buf4
Returns: int
Note: buf4[] is destination not source, the results from read4 will be
copied to buf4[]
...
Method read:
By using the read4 method, implement the method read that reads n
characters from the file and store it in the buffer array buf.
Consider that you cannot manipulate the file directly.
The return value is the number of actual characters read.
Definition of read:
Parameters: char[] buf, int n
Returns: int
Note: buf[] is destination not source, you will need to write the
results to buf[]
I put together the following simple solution:
"""
The read4 API is already defined for you.
#param buf4, a list of characters
#return an integer
def read4(buf4):
# Below is an example of how the read4 API can be called.
file = File("abcdefghijk") # File is "abcdefghijk", initially file pointer (fp) points to 'a'
buf4 = [' '] * 4 # Create buffer with enough space to store characters
read4(buf4) # read4 returns 4. Now buf = ['a','b','c','d'], fp points to 'e'
read4(buf4) # read4 returns 4. Now buf = ['e','f','g','h'], fp points to 'i'
read4(buf4) # read4 returns 3. Now buf = ['i','j','k',...], fp points to end of file
"""
class Solution:
    def read(self, buf, n):
        """
        :type buf: Destination buffer (List[str])
        :type n: Number of characters to read (int)
        :rtype: The number of actual characters read (int)
        """
        buf4 = ['']*4
        c = 1
        while n > 0 and c > 0:
            c = read4(buf4)
            if c:
                if n >= c:
                    buf += buf4
                elif n < c:
                    buf += buf4[:n]
                n -= c
        return len(buf)
When I use "+=" to add the contents of buf4 to buf I get a newline character in my output, as in the following example:
"
abc"
If I instead write buf = buf + buf4, I get just the newline character, like so:
"
"
Does anyone know what might be going on here? I know I could solve this problem by using for loops instead. I'm just curious to know what's going on.
I found this article that explains that "+=" and "c = c + b" use different special methods:
Why does += behave unexpectedly on lists?
However I don't think this explains the unexpected newline character. Does anyone know where this newline character is coming from?
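The difference between the two spellings can be seen in isolation. The sketch below uses two hypothetical helper functions (not the LeetCode API) to show that += extends the list object the caller passed in, while buf = buf + chunk only rebinds the local name. It likely explains why the second variant adds nothing to the caller's buffer; how the judge pre-fills or prints buf is not shown in the question, so it does not by itself account for the newline.

```python
def extend_in_place(buf, chunk):
    # For lists, += calls list.__iadd__, which extends the same list object
    # that the caller passed in.
    buf += chunk
    return len(buf)


def rebind_local(buf, chunk):
    # buf + chunk builds a new list. Only the local name buf is rebound,
    # so the caller's list is left unchanged.
    buf = buf + chunk
    return len(buf)


caller_buf = []
extend_in_place(caller_buf, ['a', 'b', 'c'])
print(caller_buf)

other_buf = []
rebind_local(other_buf, ['a', 'b', 'c'])
print(other_buf)
```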

Convert bitarray to string without using ChrW function

I need to convert bitarray to unicode string (default encoding in vb.net for string type).
For example, let's say the 000000000100000100000000010000100000000001000011 is the bit string representation for 3 x 16bit unicode characters, respectively A = 65; B = 66 and C = 67 (codePoints resulted in conversion of those bits to Integer).
Now, those bits are stored in a BitArray. Is there a way to convert the BitArray to a string without using the built-in ChrW function?
I need this because I already have the bits ordered as the encoding would produce them, so I am trying to avoid a double conversion to gain some performance.
Use Byte() and System.Text.Encoding...
'BIN: 00000000 01000001 00000000 01000010 00000000 01000011
'HEX: 0 0 4 1 0 0 4 2 0 0 4 3
Dim s1 As String = "ABC"
Console.WriteLine("Original string:" & s1)
Dim b1 As Byte() = System.Text.Encoding.BigEndianUnicode.GetBytes(s1)
Console.WriteLine("Big-endian UTF-16 (Hex):" & BitConverter.ToString(b1))
b1(1) = b1(1) Or CByte(&H4) 'change byte 1 from 01000001 to 01000101
Dim s2 As String = System.Text.Encoding.BigEndianUnicode.GetString(b1)
Console.WriteLine("Modified string:" & s2)
Console.ReadKey()
I don't know if this'll actually be faster than using ChrW(), and it's probably not the prettiest solution either ;), but here's how I did it:
To do what you ask I did these steps:
1. Reverse the bit array (since .NET interprets them from left to right, rather than right to left).
2. Create a byte array of the bits. The size of the array should be the amount of bits divided by 8 (8 bits = 1 byte) rounded up to the nearest integer (see Math.Ceiling()).
3. Use Encoding.Unicode to decode the byte array into a string.
4. Convert the string into an array of chars, reverse it, and convert the new char array back into a string.
I've put this together in a function:
Public Function BitArrayToString(ByVal Bits As BitArray) As String
    'Reverse the bits (I didn't have access to a compiler that supports Linq, please don't hate).
    Dim ReversedValues As Boolean() = New Boolean(Bits.Count - 1) {}
    For x As Integer = 0 To Bits.Count - 1
        ReversedValues((ReversedValues.Length - 1) - x) = Bits(x)
    Next
    'Put the reversed bits into a new bit array.
    Dim ReversedBits As New BitArray(ReversedValues)
    'Declare a byte array to 1/8th of the bit array's size, rounded up to the nearest integer.
    Dim Bytes As Byte() = New Byte(Math.Ceiling(ReversedBits.Length / 8) - 1) {}
    ReversedBits.CopyTo(Bytes, 0)
    'Decode the byte array into a string.
    Dim Result As String = System.Text.Encoding.Unicode.GetString(Bytes)
    'Get the string as a char array and reverse it.
    Dim Chars As Char() = Result.ToCharArray()
    Array.Reverse(Chars)
    'Return the resulting string from our reversed chars.
    Return New String(Chars)
End Function
Online test: https://ideone.com/SUTWlJ
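For readers who want to check the expected result independently of .NET, here is a rough Python sketch of the same idea: pack the bits into bytes and let a big-endian UTF-16 decoder build the string. The function name bits_to_text is hypothetical, and the input is assumed to be a string of '0' and '1' characters whose length is a multiple of 16.

```python
def bits_to_text(bits):
    """Decode a string of '0'/'1' characters as big-endian UTF-16 text."""
    if len(bits) % 16 != 0:
        raise ValueError("expected a multiple of 16 bits")
    if not bits:
        return ""
    # Interpret the whole bit string as one big integer, then split it
    # into bytes with the most significant byte first.
    raw = int(bits, 2).to_bytes(len(bits) // 8, 'big')
    return raw.decode('utf-16-be')


bits = "000000000100000100000000010000100000000001000011"
print(bits_to_text(bits))
```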

How to define a function with a parameter one string and return a new string which has the middle char repeated as much as the length of the str

Define a function called repeat_middle which receives as parameter one string (with at least one character), and it should return a new string which will have the middle character/s in the string repeated as many times as the length of the input (original) string.
Notice that if the original string has an odd number of characters there is only one middle character. If, on the other hand, the original string has an even number of characters then there will be two middle characters, and both have to be repeated (see the example).
Additionally, if there is only one middle character, then the string should be surrounded by one exclamation sign at each end. If, on the other hand, the original string has two middle characters then the returned string should have two exclamation signs at each end.
As an example, the following code fragment:
print (repeat_middle("abMNcd"))
should produce the output:
!!MNMNMNMNMNMN!!
Try the following:
def repeat_middle(string):
    l = len(string)
    mid = l // 2
    if l % 2 == 0:
        # Even length: two middle characters, two exclamation signs on each side.
        return "!!{}!!".format(string[mid - 1 : mid + 1] * l)
    else:
        # Odd length: one middle character, one exclamation sign on each side.
        return "!{}!".format(string[mid] * l)
odd = "ham"
even = "spam"
print("Original odd length string: {}".format(odd))
print("Returned string: {}".format(repeat_middle(odd)))
print("")
print("Original even length string: {}".format(even))
print("Returned string: {}".format(repeat_middle(even)))
Where the sample output is:
Original odd length string: ham
Returned string: !aaa!

Original even length string: spam
Returned string: !!papapapa!!
You will find that print(repeat_middle("abMNcd")) does indeed output !!MNMNMNMNMNMN!!.

Python3 adding an extra byte to the byte string

import binascii
import struct

file_1 = (r'res\test.png')
with open(file_1, 'rb') as file_1_:
    file_1_read = file_1_.read()
    file_1_hex = binascii.hexlify(file_1_read)
    print ('Hexlifying test.png..')
pack = ("test.packet")
file_1_size_bytes = len(file_1_read)
print (("test.png is"),(file_1_size_bytes),("bytes."))
struct.pack( 'i', file_1_size_bytes)
file_1_size_bytes_hex = binascii.hexlify(struct.pack( '>i', file_1_size_bytes))
print (("Hexlifyed length - ("),(file_1_size_bytes_hex),(")."))
with open(pack, 'ab') as header_1_:
    header_1_.write(binascii.unhexlify(file_1_size_bytes_hex))
    print (("("),(binascii.unhexlify(file_1_size_bytes_hex)),(")"))
with open(pack, 'ab') as header_head_1:
    header_head_1.write(binascii.unhexlify("0000020000000D007200650073002F00000074006500730074002E0070006E006700000000"))
    print ("Header part 1 added.")
So this writes "0000020000000D007200650073002F00000074006500730074002E0070006E006700000000(00)" to the pack file, unhexlified.
There's an extra "00" byte at the end. This is messing up everything I'm trying to do, because the packet's length is referred back to when loading it, and I have about 13 extra "00" bytes at the end of each string I write to the file. So in turn my file is 13 bytes longer than it should be. Not to mention that the header's byte length isn't being read properly because the padding is off by 1 byte.
You seem to be saying that binascii.unhexlify does not really condense the input string. I have trouble believing that. Here is a minimal complete runnable example and the output I get with 3.4.2 on Win 7.
import binascii
import io
b = binascii.unhexlify(
"000000030000000100000000000000040041004E0049004D00000000000000")
print(b) # bytes
bf = io.BytesIO()
bf.write(b)
print(bf.getvalue())
>>>
b'\x00\x00\x00\x03\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x04\x00A\x00N\x00I\x00M\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x03\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x04\x00A\x00N\x00I\x00M\x00\x00\x00\x00\x00\x00\x00'
Unhexlify has converted each pair of hex characters to the byte expected.
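The same check can be applied to the length field from the question. This is a small sketch with an arbitrary example size of 1234 (not a value from the question): struct.pack already returns the bytes to write, and a hexlify/unhexlify round trip gives back exactly those bytes, so the round trip should not be what adds padding.

```python
import binascii
import struct

size = 1234  # arbitrary example value, not taken from the question

# '>i' packs a signed 32-bit integer in big-endian order: always 4 bytes.
packed = struct.pack('>i', size)

# hexlify doubles the length (two hex digits per byte); unhexlify halves it again.
hex_text = binascii.hexlify(packed)
round_trip = binascii.unhexlify(hex_text)

print(packed, len(packed))
print(hex_text, len(hex_text))
print(round_trip == packed)
```

Since the file in the question is opened in 'ab' (append) mode, it may also be worth checking whether bytes left over from an earlier run are already in the file.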

how to convert a double array to character string

Hello, I have entered some text and converted it to binary values. These binary values are stored in an array of data type double. Now I want to get the char array back from that array of binary values.
text2='hello how are u';
text3=double(text2);
nValues = numel(text3);
B=8;
bit_stream = zeros(1,nValues*B);
% eight bit for binary representation of each character.
for iBit = 1:B %# Loop over the bits
bit_stream(iBit:B:end) = bitget(text3,B-iBit+1); %# Get the bit values
end
bitstream=bit_stream;
How do I perform the reverse conversion?
text2_recovered = char( 2.^(7:-1:0) * reshape(bit_stream, 8, []) );
Explanation:
Arrange bits in groups of 8 (reshape(...,8,[]));
Convert each group to a byte value (2.^(7:-1:0)*...) ;
Convert those bytes to characters (char).
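The three steps above can also be written out as explicit loops. The sketch below is in Python rather than MATLAB, purely as an illustration; the function names are hypothetical, and it assumes the same layout as the question (8 bits per character, most significant bit first).

```python
def chars_to_bits(text, bits_per_char=8):
    """Flatten text into a list of bits, most significant bit first."""
    return [(ord(ch) >> shift) & 1
            for ch in text
            for shift in range(bits_per_char - 1, -1, -1)]


def bits_to_chars(bit_stream, bits_per_char=8):
    """Group bits, weight them by powers of two, and map each value to a char."""
    chars = []
    for start in range(0, len(bit_stream), bits_per_char):
        value = 0
        for bit in bit_stream[start:start + bits_per_char]:
            # int() also accepts float bit values such as 0.0 and 1.0.
            value = value * 2 + int(bit)
        chars.append(chr(value))
    return ''.join(chars)


bits = chars_to_bits('hello how are u')
print(bits_to_chars(bits))
```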
