How to implement Uniscribe for Linux - linux

I'm trying to implement Uniscribe for Linux to display complex text, such as Arabic. It seems a huge task to do.
What I need to do is to implement the APIs below:
Check if the string is Complex.
Get string width.
Get next segment.
Get next break.
...
I want to use/refer to open source code to do it.
I have read HarfBuzz source code for weeks, but didn't get the APIs for them. Is it feasible to use ONLY HarfBuzz to implement them?
It seems I should use Pango, but I can't do it due to its license. Is there any substitution? MIT license is OK.
Is ICU helpful to me?

Check out https://github.com/HOST-Oman/libraqm as smallest thing that might meet your needs.

Related

Xcode String Searching Algorithm

How does Xcode's autocompletion algorithm work?
I am always amazed by how fast Xcode picks up the code blocks I want to write. I have looked at some different string matching algorithms but none seems to be working as the one Apple uses in Xcode. I would find it quite interesting what type of algorithm they are using.
Thanks in advance.
The image above shows Xcode "predicting" that UTV should be UITableView
It's not just a simple string searching algorithm. It uses nearest code to the scope, last code you picked with the same shortcut, precompiled codes, codes in frameworks, your own defined codes, and many other stuff sorted by some intelligent definition. Since it's private by apple, we may not know how exactly they achieve this. But this year, they open sourced their LSP repository to bring support for these kind of stuff to other editors, even the VIM in terminal! You can investigate on that if you are interested.
Also there are some projects out there like TabNine witch is the all-language autocompleter. It uses machine learning to provide responsive, reliable, and relevant suggestions trained with over 2million github repositories. You can check that out too if you are interested.
Who knows what exactly programming and tech lead companies are currently using while we are looking for algorithms? Maybe a lot of machine learnings is included and only machines knows the exact algorithms.

How to Decompile Bytenode "jsc" files?

I've just seen this library ByteNode it's the same as ByteCode of java but this is for NodeJS.
This library compiles your JavaScript code into V8 bytecode, which protect your source code, I'm wondering is there anyway to Decompile byteNode therefore it's not secure enough. I'm wondering because I would like to protect my source code using this library?
TL;DR It'll raise the bar to someone copying the code and trying to pass it off as their own. It won't prevent a dedicated person from doing so. But the primary way to protect your work isn't technical, it's legal.
This library compiles your JavaScript code into V8 bytecode, which protect your source code...
Well, we don't know it's V8 bytecode, but it's "compiled" in some sense. All we know is that it creates a "code cache" via the built-in vm.Script.prototype.createCachedData API, which is officially just a cache used to speed up recompiling the code a second time, third time, etc. In theory, you're supposed to also provide the original source code as a string to the vm.Script constructor. But if you go digging into Node.js's vm.Script and V8 far enough it seems to be the actual code in some compiled form (whether actual V8 bytecode or not), and the code string you give it when running is ignored. (The ByteNode library provides a dummy string when running the code from the code cache, so clearly the actual code isn't [always?] needed.)
I'm wondering is there anyway to Decompile byteNode therefore it's not secure enough.
Naturally, otherwise it would be useless because Node.js wouldn't be able to run it. I didn't find a tool to do it that already exists, but since V8 is open source, it would presumably be possible to find the necessary information to write a decompiler for it that outputs valid JavaScript source code which someone could then try to understand.
Experimenting with it, local variable names appear to be lost, although function names don't. Comments appear to get lost (this may not be as obvious as it seems, given that Function.prototype.toString is required to either return the original source text or a synthetic version [details]).
So if you run the code through a minifier (particularly one that renames functions), then run it through ByteNode (or just do it with vm.Script yourself, ByteNode is a fairly thin wrapper), it will be feasible for someone to decompile it into something resembling source code, but that source code will be very hard to understand. This is very similar to shipping Java class files, which can be decompiled (there's even a standard tool to do it in the JDK, javap), except that the format Java class files are well-documented and don't change from one dot release to the next (though they can change from one major release to another; new releases always support the older format, though), whereas the format of this data is not documented (though it's an open source project) and is subject to change from one dot release to the next.
Certain changes, such as changing the copyright message, are probably fairly easy to make to said source code. More meaningful changes will be harder.
Note that the code cache appears to have a checksum or other similar integrity mechanism, since directly editing the .jsc file to swap one letter for another in a literal string makes the code cache fail to load. So someone tampering with it (for instance, to change a copyright notice) would either need to go the decompilation/recompilation route, or dive into the V8 source to find out how to correct the integrity check.
Fundamentally, the way to protect your work is to ensure that you've put all the relevant notices in the relevant places such that the fact copying it is a violation of copyright is clear, then pursue your legal recourse should you find out about someone passing it off as their own.
is there any way
You could get a hundred answers here saying "I don't know a way", but that still won't guarantee that there isn't one.
not secure enough
Secure enough for what? What's your deployment scenario? What kind of scenario/attack are you trying to defend against?
FWIW, I don't know of an existing tool that "decompiles" V8 bytecode (i.e. produces JavaScript source code with the same behavior). That said, considering that the bytecode is a fairly straightforward translation of the source code, I'm sure it wouldn't be very hard to write such a tool, if someone had a reason to spend some time on it. After all, V8's JS-to-bytecode compiler is open source, so one would only have to look at those sources and implement the reverse direction. So I would assume that shipping as bytecode provides about as much "protection" as shipping as uglified JavaScript, i.e. none that I would trust.
Before you make any decisions, please also keep in mind that bytecode is considered an internal implementation detail of V8; in particular it is not versioned and can change at any time, so it has to be created by exactly the same V8 version that consumes it. If you want to update your Node.js you'll have to recreate all the bytecode, and there is no checking or warning in place that will point out when you forgot to do that.
Node.js source already contains code for decompiling binary bytecode.
You can get a text string from your V8 bytecode and then you would need to analyze it.
But text string would be very long and miss some important information such as a constant pool. So you need to modify the Node.js source.
Please check https://github.com/3DGISKing/pkg10.17.0
I have attached exported xml file.
If you study V8, it would be possible to analyze it and get source code from it.
It keeping it short and sweet, You can try Ghidra node.js package which is based on Ghidra reverse engineering framework which was open-sourced by NSA in the year 2019. Ghidra is capable of disassembling and decompiling the v8 bytecode. The inner working of disassembling is quite complex, this answer is short but sufficient.

NSIS - Compile with opcode re arranged to prevent access to source code

I am trying to reduce and make as difficult as possible the ability to access my source code after being compiled by NSIS. I have read that the only way to reduce the chance of unzipping is to modify the order of the opcodes in the Source\fileform.h from the source code and then Compile the new version.
This is a bit over my head. I was wondering if anyone has done this before and willing to post one they have done. (Or create one for me?)
Main reason for this is I have info that I encrypt using blow-fish within NSIS and do not want the chance oFf someone finding out what the encryption keys are. (Used for licencing the software) I understand noting is fool proof, but just want it as difficult as possible.
I know its asking a lot, but could really this.
Thanks!
I don't believe there are any publicly available modified builds like that. And if there were and it got popular, the decompilers would just add support for it.
I have a complete step-by-step guide to building NSIS here.
If you know C/C++, Delphi or C# you could build your own private NSIS plug-in that handles the encryption details.
No matter what you do, somebody who knows how to use a debugger can easily set a breakpoint on the blow-fish plug-in and view your key. The only way around that is a custom plug-in or an external application that handles the cryptography internally...

What language is easiest to develop command line/simple GUI for Linux?

I need to develop a large set of tools to be run from the server command line (i.e. not client-server architecture). The systems does not have to be high-performance; I just want something that is easy to develop with.
Which technologies are out there I can use to build simple GUI to be run from the command line? I need only menus where I can select a line/check-box/enter free text in a dialog.
Edit: forgot to add, access to Mysql (i.e. drivers available) is essential.
Shell, with dialog, the old stand-by - http://www.linuxjournal.com/article/2807
EDIT- If it's MySQL-related, take a look at PERL-Tk and DBI.
python + ncurses would be a good combo here.
i like using perl's re.pl from the Devel::REPL library for quickie cli interfaces. read on a bit for my rationale before downvoting!
in this type of app it sounds like you will be doing query-type operations. these naturally lend themselves to a "repl" style interraction. re.pl gives you all of the goodies, namely command editing and history. all you need to write are the functions that users will call. the nice thing is that users who know perl will realize they can use any installed module to extend the functionality of your system on their own. i my case, i used re.pl to create a mysqlclient-like tool to access and display data that was being compressed in a way that the standard mysqlclient couldn't deal with.
i cite perl because it's DBI is the best database abstraction and it is what i have used....but the rationale can be extended to other tools. python's repl or any other would provide the same benefit.
You could use Mono for Linux and write your program in C# .NET, then make it work for Linux, since Mono allows so.
As far as graphic command line interfaces go, one of the best frameworks is ncurses. It abstracts away most of the ugliness associated with graphic command line applications.
I have to say, use Python, because I like it.
But text-based interfaces are pretty much not worth it, because they seem like a good idea until you look at the details:
There isn't really a standard keyboard navigation model for text-UIs; they all use their own scheme
How is unicode supported? (Hint: this is nontrivial)
What about different keyboard layouts? What key does someone press if their keyboard doesn't have, say, a "home", "end", or "Escape" ?
ncurses does not provide a widget set, only low-level operations. The answers to the above questions aren't easy.
It really shows that nobody has put much thought into keyboard-and-text-driven terminal-based UIs recently, or these would all have been solved.
Web interfaces have them solved, in fact, you can use a text-mode web browser if you like.
Modern devices like i(phone|pad)s and even cheap mobile phones have a web browser which is good enough.
It is easy to write a web application which uses a very simple style (few images, little Javascrfipt) and have it work without much effort on a variety of devices.
So I would say go with dmckee's comment "go with what you know".
By building your own terminal-based interface, you are going to box yourself into a corner in the long term.

Nested struct viewer for Linux Kernel

I am in the process of tackling the Linux Kernel learning curve and trying to get my head round the information stored in nested struct specifically to resolve a ALSA driver issue.
Hence, I am spending a lot of my time in the source code tracing through structures that have pointers to other structures that in turn have pointers to yet other structures...by which time my head has become so full that I start to loose track of the big picture!
Can anybody point me at either a tool or a website (along the lines of the highly usful Linux Cross Reference http://lxr.linux.no/) that will allow me to, ideally graphically, expand down through the nested struct of the source code?
At the moment we are developing for an Embedded PowerPC in Eclipse CDT version 4.0 but wouldn't be opposed to switching tool chains.
Regards
KermitG
This may sound old fashion but I've found that tracing through data structures with a pencil and paper helps you reverse engineer the code better than tools that automagically do this. So, my recommendation is that you draw them yourself so that you don't have to keep it all in your head. Once you've done this your learning curve becomes a lot less steep.
Just a copy/paste of my comment, so that this question has at least 1 answer.
Or alternatively you could use something like Doxygen to generate the diagrams for you. It's worth noting a lot of the DocBook books get their structures directly from annotated code.
I am currently using Kdevelop4 (svn version) to walk through the Linux kernel. The navigation capabilities are great, but it takes a big while to parse it (just give it the directories you need, omitting all drivers you are not interested in for example) and is still a little bit crashy.
Once the stability improves and the parser can cache previously parsed data, I think this will become the most convenient way to walk through the kernel.

Resources