PostScript: how to convert an integer to a string?

In PostScript, the cvs operator is said to convert a number to a string. How should I use it?
I tried:
100 100 moveto
3.14159 cvs show
or
100 100 moveto
3.14159 cvs string show
but it didn't work.
Any help?

Try 3.14159 20 string cvs show.
string needs a size and leaves the created string on the stack. cvs needs a value and a string to store the converted value.
If you're doing lots of string conversions, it may be more efficient to create one string and reuse it in each conversion:
/s 20 string def
3.14159 s cvs show

tldr;
A common idiom is to use a literal string as a template.
1.42857
(          ) cvs show
more...
You can even do formatted output by presenting cvs with various substrings of a larger string.
%0123456.......
(2/7 =         ) dup 6 7 getinterval
2.85714 exch cvs pop show
But the Ghostscript Style Guide forbids this. And it's pretty much the only published Postscript Style Guide we have. (A discussion about this in comp.lang.postscript.) So a common recommendation is to allocate a fresh string when you need it and let the garbage collector earn its keep.
4.28571 7 string cvs show
Freshly allocating a string can be very important if you're wrapping this action in a procedure.
/toString { (          ) cvs } def
% vs
/toString { 10 string cvs } def
If you allocate a fresh string, then the enclosing procedure can be treated as a pure function of its inputs. If you use an embedded literal string as the buffer, then this resulting string is state-dependent and will be invalidated if the generating procedure is run again.
too much, don't do this...
As a last resort, the truly lazy hacker will hijack =string, the built-in 128-byte buffer used by = and == to output numbers (using, of course, our friend cvs). This is interpreter-specific and not portable according to the standard.
5.71428 =string cvs show
And if you like that one, you can combine it with ='s other trick: immediately evaluated names.
{ 7.14285 //=string cvs show } % embed =string in this procedure
This shaves that extra microsecond off, and makes it much harder to interactively inspect the code. Calling == on this procedure will not reveal the fact that you are using =string; it looks just like any other string.
Using =string in this manner inherits all the state-dependency problems described in the last section, ramped up a notch because there's only one =string buffer. It also adds a portability issue, since =string is nonstandard. Although it is available in historical Adobe implementations and in Ghostscript, it is a legacy hack and should be used only in situations where a legacy hack is appropriate.
something else, no one (here) asked for...
One more trick for the bag, from a post by Helge Blischke in comp.lang.postscript. This is a simple way to get a zero-padded integer.
/bindec % <integer> bindec <string_of_length_6>
{
1000000 add 7 string cvs 1 6 getinterval
}bind def

Related

MATLAB Coder - Static size string in sprintf

How can I prevent MATLAB Coder from generating variable-size code for a simple number insertion into a string?
for i=1:4
name=sprintf('Data%d.bin',int8(i));
stuff(name);
end
In the generated C code, it uses a lot of functions like emxutil to determine the size of the string generated by sprintf.
I just want to say that i is only one digit. How can I do that?
The following also does not work:
name=['Data',char(i),'.bin'];
Using the following also gives a code-generation error saying that the LHS is fixed-size but the RHS is variable-size:
coder.varsize('name',[1,14],[0,0])
I just tested the following again. It works well and also can be used for more digits, and it does not use var size stuff.
name=['Data',int2str(i),'.bin'];
Also, these can be used if we are sure that i is one digit:
['Data' char(48+i) '.bin']
['Data' char('0'+i) '.bin']

need guidance with basic function creation in MATLAB

I have to write a MATLAB function with the following description:
function counts = letterStatistics(filename, allowedChar, N)
This function is supposed to open a text file specified by filename and read its entire contents. The contents will be parsed such that any character that isn’t in allowedChar is removed. Finally, it will return a count of all N-symbol combinations in the parsed text. This function should be stored in a file named “letterStatistics.m”. Based on my professor's lecture notes, I made a list of the commands to use and how the function should be organized:
1. Begin the function by setting the default value of N to 1 in case:
a. The user specifies a 0 or negative value of N.
b. The user doesn’t pass the argument N into the function, i.e., counts = letterStatistics(filename, allowedChar)
2. Using the fopen function, open the file filename for reading in text mode.
3. Using the function fscanf, read in all the contents of the opened file into a string variable.
4. I know there exists a MATLAB function to turn all letters in a string to lower case. Since my analysis will disregard case, I have to use this function on the string of text.
5. Parse this string variable as follows (use logical indexing or regular expressions – do not use for loops):
a. We want to remove all newline characters without this occurring:
e.g.
In my younger and more vulnerable years my father gave me some advice that I've been turning over in my mind ever since.
In my younger and more vulnerableyears my father gave me some advicethat I’ve been turning over in my mindever since.
Replace all newline characters (special character \n) with a single space: ' '.
b. We will treat hyphenated words as two separate words, hence do the same for hyphens '-'.
c. Remove any character that is not in allowedChar. Hint: use regexprep with an empty string '' as an argument for replace.
d. Any sequence of two or more blank spaces should be replaced by a single blank space.
6. Use the provided permsRep function to create a matrix of all possible N-symbol combinations of the symbols in allowedChar.
7. Using the strfind function, count all the N-symbol combinations in the parsed text into an array counts. Do not loop through each character in your parsed text as you would in a C program.
8. Close the opened file using fclose.
Here is my question: as you can see, I have made this list of what the function is, what it should do, and which commands to use (fclose, etc.). The trouble is that I know closing the file involves fclose, but beyond that I'm not sure how to carry out #8. The same goes for creating the whole function. I have a vague idea of which commands to use, but I'm unable to produce the actual code. How should I begin? Any guidance or hints would be seriously appreciated, because I have programmer's block and am unable to start!
I think that you are new to MATLAB, so the documentation may be complicated. The root of the problem is, I guess, a basic understanding of file I/O (input/output). When you open a file using fopen, MATLAB returns an integer identifier for that file, generally called a file ID. When you call fclose, MATLAB needs to know which file you want to close, so you have to call fclose with the correct file ID.
fid = fopen('test.txt','w'); % Open the file for writing; fopen returns the file ID
fprintf(fid,'This is a test.\n');
fclose(fid);
fid = 0; % Optional, this will make it clear that the file is not open,
% but it is not necessary since matlab will send a not open message anyway
Regarding the function creation the syntax is something like this:
function out = myFcn(x,y)
z = x*y;
fprintf('z=%.0f\n',z); % Print value of z in the command window
out = z>0;
This is a function that checks whether the product of two numbers is positive, that is, whether both are nonzero and have the same sign. It returns true if so and false otherwise. This may not be the most useful test, but it works as an example, I guess.
Please comment if this is not what you want to know.

How can I find the length of a _bstr_t object using windbg on a user-mode memory dump file?

I have a dump file that I am trying to extract a very long string from. I find the thread, then find the variable and dump part of it using the following steps:
1. ~1s
2. dv /v, which returns:
00000000`07a4f6e8 basicString = class _bstr_t
3. dt -n basicString
Command 3 truncates the string in the debugging console to just a fraction of its actual contents.
What I would like to do is find the actual length of the _bstr_t variable so that I can dump its contents out to a file with a command like the following:
.writemem c:\debugging\output\string.txt 07a4f6e8 L<StringByteLength>
So my question is how can I determine what I should put in for StringByteLength?
Your .writemem line is pretty close to what you need already.
First, you'll need the correct address of the string in memory. 07a4f6e8 is the address of the _bstr_t, so writing memory at that address won't do any good.
_bstr_t is a pretty complicated type, but ultimately it holds a BSTR member called m_wstr.
We can store its address in a register like so:
r? @$t0 = @@c++(basicString.m_Data->m_wstr)
As Igor Tandetnik's comment says, the length of a BSTR can be found in the 4 bytes preceding it.
Let's put that into a register as well:
r? @$t1 = *(DWORD*)(((BYTE*)@$t0)-4)
And now, you can writemem using those registers.
.writemem c:\debugging\output\string.txt @$t0 L?@$t1
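To make the "4 bytes preceding it" point concrete, here is a small Python sketch of the BSTR memory layout. It is only an illustration: make_bstr_block is a hypothetical helper that builds the bytes by hand, not anything WinDbg or the COM runtime provides.

```python
import struct

def make_bstr_block(text: str) -> bytes:
    """Hypothetical helper: bytes of a BSTR allocation (prefix, UTF-16LE data, terminator)."""
    payload = text.encode("utf-16-le")
    # The prefix stores the data length in bytes, not counting the 2-byte terminator
    return struct.pack("<I", len(payload)) + payload + b"\x00\x00"

block = make_bstr_block("hello")
bstr_offset = 4  # a BSTR pointer refers to the first character, just past the prefix

# Read the DWORD that sits in the 4 bytes before the string data
(byte_length,) = struct.unpack_from("<I", block, bstr_offset - 4)
data = block[bstr_offset:bstr_offset + byte_length]
print(byte_length, data.decode("utf-16-le"))  # 10 hello
```

Because the prefix counts bytes rather than characters, it can be passed directly as the length of a byte range.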

How to convert between bytes and strings in Python 3?

This is a Python 101 type question, but it had me baffled for a while when I tried to use a package that seemed to convert my string input into bytes.
As you will see below I found the answer for myself, but I felt it was worth recording here because of the time it took me to unearth what was going on. It seems to be generic to Python 3, so I have not referred to the original package I was playing with; it does not seem to be an error (just that the particular package had a .tostring() method that was clearly not producing what I understood as a string...)
My test program goes like this:
import mangler # spoof package
stringThing = """
<Doc>
<Greeting>Hello World</Greeting>
<Greeting>你好</Greeting>
</Doc>
"""
# print out the input
print('This is the string input:')
print(stringThing)
# now make the string into bytes
bytesThing = mangler.tostring(stringThing) # pseudo-code again
# now print it out
print('\nThis is the bytes output:')
print(bytesThing)
The output from this code gives this:
This is the string input:
<Doc>
<Greeting>Hello World</Greeting>
<Greeting>你好</Greeting>
</Doc>
This is the bytes output:
b'\n<Doc>\n <Greeting>Hello World</Greeting>\n <Greeting>\xe4\xbd\xa0\xe5\xa5\xbd</Greeting>\n</Doc>\n'
So, there is a need to be able to convert between bytes and strings, to avoid ending up with non-ascii characters being turned into gobbledegook.
The 'mangler' in the above code sample was doing the equivalent of this:
bytesThing = stringThing.encode(encoding='UTF-8')
There are other ways to write this (notably using bytes(stringThing, encoding='UTF-8')), but the above syntax makes it obvious what is going on, and also what to do to recover the string:
newStringThing = bytesThing.decode(encoding='UTF-8')
When we do this, the original string is recovered.
Note, using str(bytesThing) just transcribes all the gobbledegook without converting it back into Unicode, unless you specifically request UTF-8, viz., str(bytesThing, encoding='UTF-8'). No error is reported if the encoding is not specified.
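A minimal sketch of the round trip and of the str() behaviour described above. The variable names here are made up for the illustration.

```python
text = "<Greeting>你好</Greeting>"

# str -> bytes: each non-ASCII character becomes several UTF-8 bytes
data = text.encode("utf-8")

# bytes -> str: decoding with the same encoding recovers the original
assert data.decode("utf-8") == text

# Without an encoding, str() returns the printable representation of the bytes
as_repr = str(data)
# With an encoding, str() decodes, just like data.decode("utf-8")
as_text = str(data, encoding="utf-8")

print(as_repr)
print(as_text)
```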
In Python 3, the bytes() constructor accepts the same encoding argument as encode().
str1 = b'hello world'
str2 = bytes("hello world", encoding="UTF-8")
print(str1 == str2) # Returns True
I didn't read anything about this in the docs, but perhaps I wasn't looking in the right place. This way you can explicitly turn strings into byte streams, which is more readable than using encode and decode, and without having to prefix the quotes with b.
This is a Python 101 type question,
It's a simple question but one where the answer is not so simple.
In python3, a "bytes" object represents a sequence of bytes, a "string" object represents a sequence of unicode code points.
To convert from "bytes" to "string" and from "string" back to "bytes", you use the bytes.decode and str.encode methods. These methods take two parameters: an encoding and an error-handling policy.
Sadly, there are an awful lot of cases where sequences of bytes are used to represent text, but it is not necessarily well defined what encoding is being used. Take, for example, filenames on Unix-like systems: as far as the kernel is concerned, they are a sequence of bytes with a handful of special values. On most modern distros most filenames will be UTF-8, but there is no guarantee that all filenames will be.
If you want to write robust software, then you need to think carefully about those parameters. You need to think carefully about what encoding the bytes are supposed to be in and how you will handle the case where they turn out not to be a valid sequence of bytes for the encoding you thought they should be in. By default, Python uses UTF-8 and raises an error on any byte sequence that is not valid UTF-8.
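As a rough illustration of the error-handling parameter, the sketch below decodes a byte string that is deliberately not valid UTF-8. The sample bytes are an assumption chosen for the demonstration.

```python
raw = b"caf\xe9"  # Latin-1 bytes for "café"; not valid UTF-8

# The default policy is errors="strict": invalid input raises an exception
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "replace" substitutes U+FFFD for the bad byte; "ignore" silently drops it
replaced = raw.decode("utf-8", errors="replace")
ignored = raw.decode("utf-8", errors="ignore")

# "surrogateescape" keeps undecodable bytes so they can be written back unchanged
escaped = raw.decode("utf-8", errors="surrogateescape")
round_trip = escaped.encode("utf-8", errors="surrogateescape")
```

Which policy is right depends on the data: "ignore" and "replace" lose information, while "surrogateescape" is useful when the bytes must survive a round trip, as with filenames.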
print(bytesThing)
Python uses "repr" as a fallback conversion to string. repr attempts to produce python code that will recreate the object. In the case of a bytes object this means among other things escaping bytes outside the printable ascii range.
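A short sketch of what "code that will recreate the object" means in practice. Using ast.literal_eval here is just one way to demonstrate it.

```python
import ast

data = "你好".encode("utf-8")
shown = repr(data)

# print(data) displays this same text, because str() of bytes falls back to repr()
assert str(data) == shown

# The repr is a valid Python literal, so evaluating it rebuilds the object
rebuilt = ast.literal_eval(shown)
assert rebuilt == data
```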
TRY THIS:
StringVariable=ByteVariable.decode('UTF-8','ignore')
TO TEST TYPE:
print(type(StringVariable))
Here, 'StringVariable' holds a string and 'ByteVariable' holds bytes. These names are placeholders, not the variables from the question.

How to store binary data in a Lua string

I needed to create a custom file format with embedded meta information. Instead of whipping up my own format, I decided to just use Lua.
texture
{
format=GL_LUMINANCE_ALPHA;
type=GL_UNSIGNED_BYTE;
width=256;
height=128;
pixels=[[
<binary-data-here>]];
}
texture is a function that takes a table as its sole argument. It then looks up the various parameters by name in the table and forwards the call on to a C++ routine. Nothing out of the ordinary I hope.
Occasionally the files fail to parse with the following error:
my_file.lua:8: unexpected symbol near ']'
What's going on here?
Is there a better way to store binary data in Lua?
Update
It turns out that storing binary data in a Lua string is non-trivial. But it is possible when taking care with 3 sequences.
Long-format-string-literals cannot have an embedded closing-long-bracket (]], ]=], etc).
This one is pretty obvious.
Long-format-string-literals cannot end with something like ]== which would match the chosen closing-long-bracket.
This one is more subtle. Luckily the script will fail to compile if done wrong.
The data cannot embed \n or \r.
Lua's built-in line-end processing messes these up. This problem is much more subtle. The script will compile fine but it will yield the wrong data: CR (0x0D) => LF (0x0A), LF CR (0x0A 0x0D) => LF (0x0A), etc.
To get around these limitations I split the binary data up on \r, \n, then pick a long-bracket that works, finally emit Lua that concats the various parts back together. I used a script that does this for me.
input: XXXX\nXX]]XX\r\nXX]]XX]=
texture
{
--other fields omitted
pixels= '' ..
[[XXXX]] ..
'\n' ..
[=[XX]]XX]=] ..
'\r\n' ..
[==[XX]]XX]=]==];
}
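The splitting approach described above could be automated roughly as follows. This is a sketch in Python under stated assumptions, not the script the author used: long_bracket and to_lua_expr are hypothetical names, and the code does not handle text-mode issues such as byte 26 on Windows or version-specific lexer quirks.

```python
import re

def long_bracket(chunk: bytes) -> bytes:
    """Wrap chunk in the lowest-level Lua long bracket that cannot close early."""
    level = 0
    while True:
        closer = b"]" + b"=" * level + b"]"
        # Appending the closer and checking where it first matches covers both an
        # embedded closer and a chunk tail such as "]=" that would merge with it.
        if (chunk + closer).find(closer) == len(chunk):
            return b"[" + b"=" * level + b"[" + chunk + closer
        level += 1

def to_lua_expr(data: bytes) -> bytes:
    """Return a Lua expression that concatenates back to data."""
    parts = []
    # The capturing group makes re.split keep the line-end runs as separate items
    for piece in re.split(rb"([\r\n]+)", data):
        if not piece:
            continue
        if piece[0] in b"\r\n":
            # Spell line ends as escapes inside a quoted string
            escaped = piece.replace(b"\r", b"\\r").replace(b"\n", b"\\n")
            parts.append(b"'" + escaped + b"'")
        else:
            parts.append(long_bracket(piece))
    return b" ..\n".join(parts) if parts else b"''"

print(to_lua_expr(b"XXXX\nXX]]XX\r\nXX]]XX]=").decode("ascii"))
```

For the sample input this picks the same bracket levels as the hand-written example: level 0, then level 1, then level 2.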
Lua is able to encode most characters in long bracket format including nulls. However, Lua opens the script file in text mode and this causes some problems. On my Windows system the following characters have problems:
Char code(s)    Problem
--------------  -------------------------------
13 (CR)         Is translated to 10 (LF)
13 10 (CR LF)   Is translated to 10 (LF)
26 (EOF)        Causes "unfinished long string near '<eof>'"
If you are not using Windows, then these may not cause problems, but there may be different text-mode based problems.
I was only able to produce the error you received by encoding multiple close brackets:
a=[[
]]] --> a.lua:2: unexpected symbol near ']'
But, this was easily fixed with the following:
a=[==[
]]==]
The binary data needs to be encoded into printable characters. The simplest method for decoding purposes would be to use C-like escape sequences for all bytes. For example, hex bytes 13 41 42 1E would be encoded as '\19\65\66\30'. Of course, then the encoded data is three to four times larger than the source binary.
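One way to generate such a literal, sketched in Python. The function name lua_escaped_literal is made up for this example.

```python
def lua_escaped_literal(data: bytes) -> str:
    """Return a single-quoted Lua string literal with every byte as a decimal escape."""
    # Every byte is escaped, so each escape is followed by a backslash or the
    # closing quote and never by a stray digit that could extend it.
    return "'" + "".join(f"\\{b}" for b in data) + "'"

print(lua_escaped_literal(bytes([0x13, 0x41, 0x42, 0x1E])))  # '\19\65\66\30'
```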
Alternatively, you could use something like Base64, but that would have to be decoded at runtime instead of relying on the Lua interpreter. Personally, I'd probably go the Base64 route. There are Lua examples of Base64 encoding and decoding.
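The encoding side of the Base64 route might look like the sketch below, written in Python. The field name pixels_b64 is an assumption, and the Lua side would still need its own decoder at load time.

```python
import base64

pixels = bytes(range(256))  # stand-in for real texture data

# Base64 output only uses A-Z, a-z, 0-9, '+', '/' and '=', all safe in a quoted Lua string
encoded = base64.b64encode(pixels).decode("ascii")
lua_field = f'pixels_b64="{encoded}";'

# Decoding restores the original bytes exactly
assert base64.b64decode(encoded) == pixels
```

Base64 grows the data by about a third, which is smaller than escaping every byte.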
Another alternative would be to have two files. Use a well-defined image format file (e.g. TGA) that is pointed to by a separate Lua script with the additional metadata. If you don't want two files to move around, then they could be combined in an archive.
