How can I access the bit representation of a file using Scheme?


If I had a file called raw_text.txt, is there a way I could iterate through each bit?
I found the following, but I am confused about how to use it:
http://www.gnu.org/software/mit-scheme/documentation/mit-scheme-ref/File-Manipulation.html
— procedure: file-attributes/mode-string attributes
The mode string of the file, a newly allocated string showing the file's mode bits. Under unix, this string is in unix format. Under Windows, this string shows the standard “DOS” attributes in their usual format.
EDIT: I am using mit-scheme

It's implementation-specific. On the Racket side of things, there are a few libraries:
http://planet.racket-lang.org/display.ss?package=bitsyntax.plt&owner=tonyg
http://planet.racket-lang.org/display.ss?package=bit-io.plt&owner=soegaard
You can probably use something like the binary-parse library as well: http://okmij.org/ftp/Scheme/binary-io.html, as long as your implementation of Scheme can support it.
Under MIT Scheme, you can use the bit-string functions.

I haven't actually tried to do anything with this, but I think you're looking for this section of the mit-scheme docs: Input/Output. Specifically the file ports and input procedures sections.
I didn't see anything specifically about reading the binary bits, but if it's character bytes you want, it looks like there are procedures for that. Maybe you want to do something like this?
(call-with-input-file "raw_text.txt" <procedure>)
or
(call-with-binary-input-file "raw_text.txt" <procedure>)
Where <procedure> will take the file port and use the input procedures to read things from that file.
Just out of curiosity, what are you trying to do?
EDIT: It appears that someone did a write up on this here.
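Whatever the implementation, the approach described above comes down to two steps: read the file as bytes through a binary port, then extract each bit with shifts and masks. Here is a minimal sketch of that idea, written in Python rather than Scheme purely for illustration. The helper name `iter_bits` and the demo file contents are assumptions, not anything from MIT Scheme.

```python
import os
import tempfile


def iter_bits(path):
    """Yield each bit of the file at `path` as 0 or 1, most significant bit first."""
    with open(path, "rb") as f:               # binary mode: no character decoding
        while chunk := f.read(4096):          # read the file in blocks of bytes
            for byte in chunk:                # each item is an int in 0..255
                for shift in range(7, -1, -1):
                    yield (byte >> shift) & 1


# Demo on a throwaway file containing the single byte 0x41 ("A").
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "raw_text.txt")
    with open(demo_path, "wb") as f:
        f.write(b"A")
    bits = list(iter_bits(demo_path))

print(bits)
```

In Scheme the same structure would apply: a loop that reads one byte at a time from the binary port, with an inner loop over the eight bit positions.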

Related

How to convert model.tflite to model.cc and model.h on Windows 10

I have created a TensorFlow Lite .tflite model which I plan to use on a microcontroller. However, this file must be converted to a C source file, i.e., a TensorFlow Lite for Microcontrollers model. The TensorFlow documentation provides a simple way to convert it to a C array with the unix command xxd. I am using Windows 10 and do not have access to the unix command, and no alternative Windows methods are documented. After searching superuser, I saw that xxd for Windows now exists. I downloaded the command and ran it on my .tflite model. The results were different from the hello world example.
First, the hello world example model.h file has a comment that says it was "Automatically created from a TensorFlow Lite flatbuffer using the command: xxd -i model.tflite > model.cc". When I ran the command, model.h was not "automatically created".
Second, comparing the model.cc file from the hello world example, with the model.cc file that I generated, they are quite different and I'm not sure how to interpret this (I'm not referring to the differences in the actual array). Again, in the example model.cc file, it states that it was "automatically created" using the xxd command. Line 28 in the example is alignas(8) const unsigned char g_model[] = { and line 237 is const int g_model_len = 2488;. In comparison, the equivalent lines in the file I generated are unsigned char _________g_model[] = { and unsigned int _________g_model_len = 4009981;
Since I am not a C expert, I am not sure how to interpret the differences in the files, or whether I have generated the model.cc file incorrectly. I would greatly appreciate any insight or guidance on how to properly generate both the model.h and model.cc files from the original model.tflite file.

After doing some experiments, I think this is why you are getting differences:
xxd replaces any non-letter/non-digit character of the path to the input file by an underscore ('_'). Apparently you called xxd with a path for the input file that has 9 such leading characters, perhaps something like "../../../g.model". The syntax of C allows only letters (a to z, A to Z), digits (0 to 9) and underscore as characters of objects' names, and the names need to start with a non-digit. This is the only "manipulation" xxd does to the name of an input file.
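The renaming rule described above can be sketched in a few lines of Python. This is a simplified model of the behaviour, not xxd's actual source, and `xxd_identifier` is a hypothetical helper name.

```python
import re


def xxd_identifier(path: str) -> str:
    """Mimic the renaming described above: every character of the input
    path that is not an ASCII letter or digit becomes an underscore."""
    return re.sub(r"[^A-Za-z0-9]", "_", path)


# The guessed path from the answer reproduces the nine leading underscores.
array_name = xxd_identifier("../../../g.model")
print(array_name)              # name of the generated array
print(array_name + "_len")     # name of the generated length variable
```

Running xxd from the directory that contains the file, with a plain file name as the argument, avoids the leading underscores.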
Since xxd knows nothing about TensorFlow, it could not have generated the copyright notice. Taking this as an indication, every other difference must have been inserted by other means by the TensorFlow authors, despite the statement "Automatically created from a TensorFlow Lite flatbuffer ...". This could have been done manually or by a script; unfortunately, I did not find any hint in some quick research on their repository. Apparently the statement refers only to the data values.
So you need to edit your result:
Add any comment you see fit.
Add alignas(8) to the array, if your compiler supports it (it is standard since C++11).
Add the keyword const to the array and the length variable. This tells the compiler to prohibit any write access, and it will probably place the data in read-only memory.
Rename the array and length variables to g_model and g_model_len, respectively. Most probably TensorFlow expects these names.
Copy "model.cc" into "model.h", and then apply further edits, as the example demonstrates.
Don't be bothered by different values. They differ because the contents of the model files differ. The length variable is especially simple to check: it must have exactly the same value as the size of the input file.
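If you would rather not edit the xxd output by hand, the steps above can be scripted. The following Python sketch is one possible way to do it under stated assumptions: it is not the TensorFlow authors' tooling, `tflite_to_cc` is a hypothetical helper, and the sample bytes are made up for the demo. It emits the array with alignas(8) and const, using the names g_model and g_model_len.

```python
def tflite_to_cc(data: bytes, array_name: str = "g_model") -> str:
    """Render `data` as C++ source in the style of `xxd -i`, with the
    manual edits listed above (alignas, const, fixed names) already applied."""
    lines = [f"alignas(8) const unsigned char {array_name}[] = {{"]
    for i in range(0, len(data), 12):                 # 12 bytes per row, like xxd
        row = ", ".join(f"0x{b:02x}" for b in data[i:i + 12])
        trailing = "," if i + 12 < len(data) else ""  # no comma after the last row
        lines.append(f"  {row}{trailing}")
    lines.append("};")
    lines.append(f"const int {array_name}_len = {len(data)};")
    return "\n".join(lines) + "\n"


# Demo with a few made-up bytes; for a real model, read the file instead:
#   data = open("model.tflite", "rb").read()
sample = bytes([0x1C, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4C, 0x33])
cc_text = tflite_to_cc(sample)
print(cc_text)
```

The length line gives a quick sanity check: its value must equal the size of the .tflite file in bytes.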
EDIT:
On line 28 which is this text alignas(8) const unsigned char as shown in the example converted model. When I attempt to convert a model (whether it's my custom model or the "hello_world.tflite" example model) the text that would be on line 28 is unsigned char (any other text on that line is not in question). How is line 28 edited & explained?
Concerning the "how": I firmly believe that the authors of TensorFlow literally used an editor (an IDE or a stand-alone program like Notepad++ or Geany) and edited the line, or used some script to automate this.
The reason for alignas(8) is most probably that TensorFlow expects the data with an alignment of 8 bytes, for example because it casts the byte array to a structure that contains values of 8 bytes width.
The insertion of const will also commonly locate the model in read-only memory, which is preferable on most microcontrollers. If it were left out, the model's data would not only be writable but would also be located in precious RAM.
On line 237, the text specifically is const int. When I attempt to convert a model (whether it's my custom model or the "hello_world.tflite" example model) the text that would be on line 237 is unsigned int (any other text on that line is not in question). Why are these two lines different in these specific places? It makes me believe that xxd on Windows is not functioning the same?
Again, I firmly believe this was edited manually or by a script. TensorFlow might expect this variable to be of data type int, but any xxd I tried (Windows and Linux) generates unsigned int. I don't think that your specific version of xxd functions differently on Windows.
For const the same thoughts apply as above.
Finally, when I attempt to convert the example model "hello_world.tflite" file using the xxd for windows utility, my resulting array doesn't match the example "hello_world.cc" file. I would expect the array values to be identical if the xxd worked. The last question is how to generate the "model.h" and "model.cc" files on Windows.
Did you note that the model you link is in another branch of the repository?
If I use the branch on GitHub as in your link to "hello_world.cc", I find in "../train/README.md" this archive hello_world_2020_12_28.zip. I unpacked it and ran xxd on the included "model.tflite". The result's data match the included "model.cc" in the archive. But it does not match the data of "hello_world.cc" in the same branch that you linked. The difference is already there.
My conclusion is that the example result was not generated from the example model. This happens, since developers sometimes don't pay enough attention to what they commit. Yes, it's unfortunate, as it confuses and frustrates beginners like you.
But, as I wrote, don't let this give you headaches. Try the simple example, and use the documentation as instructions for the process. Treat the differences in specific data as a quirk. You will encounter such things time after time when working with other people's projects. It is quite normal.

I am playing with processes, but I don't know how to add "client.dll" to a hex value

In Cheat Engine you can write "client.dll"+00D3AC5C, and in ReClass <client.dll>+00D3AC5C.
How can I do the same in Python? I am using ReadWriteMemory, but I will soon switch to something more complex. Can you please tell me how to do it with RWM or with something else?

According to the source code of that library, there's seemingly no way to get the base address of a process.
However you can get the base address by bypassing the library and doing it yourself via this method. Then, once you have the hex value of the base address, you can then simply add an offset to it, then use RWM's read() or get_pointer().
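As a minimal sketch of the arithmetic only: "client.dll"+00D3AC5C simply means the module's base address plus the offset. The code below assumes the base address has already been obtained by some other means. The value of `module_base` is a made-up placeholder, and `resolve_address` is a hypothetical helper, not part of ReadWriteMemory.

```python
CLIENT_DLL_OFFSET = 0x00D3AC5C  # the offset from the question


def resolve_address(module_base: int, offset: int) -> int:
    """Return the absolute address for a "module"+offset expression."""
    return module_base + offset


# Placeholder: in a real program this comes from enumerating the target
# process's modules; it changes every time the process is started.
module_base = 0x5A000000

address = resolve_address(module_base, CLIENT_DLL_OFFSET)
print(hex(address))
```

Addresses are plain integers in Python, so no conversion to or from a hex string is needed; hex() is only for display.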

Converting an ASTNode into code

How does one convert an ASTNode (or at least a CompilationUnit) into a valid piece of source code?
The documentation says that one shouldn't use toString, but doesn't mention any alternatives:
Returns a string representation of this node suitable for debugging purposes only.
CompilationUnits have rewrite, but that one does not work for ASTs created by hand.
Formatting options would be nice to have, but I'd basically be satisfied with anything that turns arbitrary ASTNodes into semantically equivalent source code.

In JDT the normal way for AST manipulation is to start with a basic CompilationUnit and then use a rewriter to add content. Then ASTRewriteAnalyzer / ASTRewriteFormatter should take care of creating formatted source code. Creating a CU just containing a stub type declaration shouldn't be hard, so that's one option.
If that doesn't suit your needs, you may want to experiment with directly calling the internal org.eclipse.jdt.internal.core.dom.rewrite.ASTRewriteFlattener.asString(ASTNode, RewriteEventStore). If you are not editing existing files, you can probably ignore the events collected in the RewriteEventStore and just use the returned String.

Delphi: Upgrade from 6 to XE2 - TStringList

We have to upgrade to XE2 (from Delphi6).
I have collected a lot of information about this, but one point isn't clear to me.
We are using String, which in Delphi 6 is what XE2 calls AnsiString.
As far as I know, we must switch to (P)Ansi[String/Char] throughout our libraries to avoid the side effects of Unicode conversions, and so that our projects compile.
That is fine, but we are also using TStringList, and I haven't found any TAnsiStringList class that we could simply switch to... ;-)
What do you know about this? Can this cause problems too? Or does this class have an option to preserve the strings?
(OK, it looks like three questions, but it is really only one.)
The program / OS language is Hungarian, and the charset is WIN-1250, which has some unusual characters, like Ő and Ű...
Thanks for any information, links, etc.

1) First of all, why should you use an AnsiStringList rather than converting your whole project to the Unicode-aware TStringList? There would have to be specific, detailed reasons for that before viable alternatives can be suggested.
Unicode is a superset of windows-1250, windows-1251 and the like.
Normally all your locale-specific strings would simply be converted to Unicode losslessly. It is the opposite conversion, Unicode to AnsiString, that may lose data,
whether explicit or implicit (like the AnsiChar reduction in "if char-var in char-set").
You may have type-unsafe APIs, as in DLLs, where the compiler cannot check whether you pass PChar or PAnsiChar, but you should not pass objects like TStrings into DLLs anyway; there are BPLs for that.
So you probably just do not need TAnsiStringList.
2) you can take TJclAnsiStringList from Jedi Code Library
3) You can use XE2 stock TList<AnsiString> type

Groovy says my Unicode string is too long

As part of my probably wrong and cumbersome solution to print out a form I have taken a MS-Word document, saved as XML and I'm trying to store that XML as a groovy string so that I can ${fillOutTheFormProgrammatically}
However, with MS-Word documents being as large as they are, the String is 113100 Unicode characters and Groovy says it's limited to 65536. Is there some way to change this, or am I stuck with splitting up the string?
Groovy - need to make a printable form
That's what I'm trying to do.
Update: to be clear, it is too long for a Groovy String literal. I think a regular string might be fine. I am going to change strategy and put some markers in the file that I can easily find, like %!%variable_name%!%, and then do .replace(... I feel a new question coming on here...

Are you embedding this string directly in your Groovy code? The JVM itself has a limit on the length of string constants; see the VM Spec if you are interested in the details.
An ugly workaround might be to split the string into smaller parts and concatenate them at runtime. A better solution would be to save the text in an external file and read the contents from your code. You could also package this file along with your code and access it from the classpath using Class#getResourceAsStream.
