How to perform XOR operation in J2ME - java-me

Is there any way to perform an XOR operation in J2ME?

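For boolean operands, != has the same truth table as XOR. Java also provides the ^ operator, which performs XOR on both integer and boolean operands.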

Related

Building XOR bruteforce python

I solved a simple brute-force XOR challenge using CyberChef.
Of course, I wanted more, so I built a script to do it.
The script should try every XOR key against a specific string.
I don't understand how to manipulate binary data in Python, and that is my issue so far.
ciph = "q{vpln'bH_varHuebcrqxetrHOXEj"
for decim in range(256):
    print(decim, ": ", ''.join([chr(ord(char) ^ decim) for char in ciph]))
Of course 'decim' is not a bit so it's not working properly.
I have read about bytearray, but I'm not sure how to use it or whether it's relevant here.
Any ideas?
Well, my bad: it works fine.
The flag is there, but not at the position CyberChef indicated, because CyberChef performed its XOR brute force with hexadecimal keys.
Here I'm brute-forcing with decimal keys, so of course the flag is not at the same position.
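A minimal sketch of the same brute force working on bytes instead of characters, assuming a single-byte key. The helper name xor_bruteforce and the "keep only printable candidates" filter are illustrative choices, not part of the original script. Printing each key in both decimal and hexadecimal makes it easier to compare against a tool that labels keys in hex.

```python
import string

# Ciphertext taken from the question above.
ciph = "q{vpln'bH_varHuebcrqxetrHOXEj"

# Printable characters, minus control whitespace (tab, newline, ...).
PRINTABLE = set(string.printable) - set("\t\n\r\x0b\x0c")


def xor_bruteforce(text):
    """Yield (key, candidate) for each single-byte key whose output is printable."""
    data = text.encode("latin-1")  # one byte per character
    for key in range(256):
        candidate = bytes(b ^ key for b in data).decode("latin-1")
        if all(ch in PRINTABLE for ch in candidate):
            yield key, candidate


for key, candidate in xor_bruteforce(ciph):
    # Show the key in decimal and in hex so both notations line up.
    print(f"{key:3d} (0x{key:02x}): {candidate}")
```

With this ciphertext, the candidate for key 23 (0x17) is the one that begins with flag{.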

Obfuscation of checksum guards

As part of my project, I have to insert small pieces of code called checksum guards into a C program. These guards compute a checksum over a region of code using a function (add, xor, etc.) that operates on the instruction opcodes. If somebody tampers with the instructions in that region (adds, modifies, or deletes them), the checksum value changes and the intrusion is detected.
Here is the research paper that describes this technique:
https://www.cerias.purdue.edu/assets/pdf/bibtex_archive/2001-49.pdf
Here is the guard template:
guard:
    add ebp, -checksum
    mov eax, client_addr
for:
    cmp eax, client_end
    jg end
    mov ebx, dword[eax]
    add ebp, ebx
    add eax, 4
    jmp for
end:
Now, I have two questions.
Would putting the guards in the assembly be better than putting them in the source program?
Assuming I put them in the assembly (at an appropriate place), what kind of obfuscation should I use to keep the guard template from being easily recognizable? (When I have more than one guard, the attacker should not be able to easily find all the guard templates and disable them together, as that would leave the code with no protection.)
Thank you in advance.
From the point of view of an attacker (without sources), the first question doesn't matter: he's tampering with the final binary machine code, and whether it was produced from .c or .s makes zero difference. So I would worry mainly about how to generate the correct binary with the appropriate checksums. I'm not aware of any way to get a proper checksum inside the C source. But I can imagine an external tool running over the assembler files created by the C compiler as a post-processing step, before the .s files are assembled into .o files. But keep in mind that some calls and addresses are just relative offsets, and the OS loader patches the binary loaded into memory according to the linker's table so that they point to the real memory addresses. Thus the data bytes will change (the opcodes will stay fixed).
Your guard template doesn't take that into account; it checksums whole instructions, data bytes included. (Some advanced guards have opcode definitions and checksum/encrypt/decrypt only the opcodes themselves, without the operand bytes.)
Otherwise it's neat that the result is a damaged ebp value, ruining any surrounding C code (*) that works with stack variables. But it's still an artificial test: you can simply comment out both add ebp,-checksum and add ebp,ebx, making the guard harmless.
(*) Notice that you have to put the guard in the middle of some ordinary C code to get real runtime problems from an invalid ebp value. If you put it at the end of a subroutine that ends with pop ebp, everything will work fine.
So, to the second question:
You definitely want more malicious ways of guarding the correct value than only damaging ebp. Usually the hardest way (to remove) is to make the checksum value part of some calculation, eventually skewing the results just slightly, so that serious use of the software becomes impossible but the user takes a while to notice.
You can also add your checksumming to some genuine code loop, so that simply skipping the whole loop also skips valid code (but I can only imagine this being added by hand to the assembly generated from C, so you would have to redo it after every new compilation of that particular C source).
Then the particular guard template can be obfuscated by any imaginable mutation (different registers, a modified order of instructions, instruction variants); try searching for viruses with mutation encoding to get some ideas.
I didn't read that paper, but from the figures I would say the main point is to make the guarded areas overlap, so that patching out one of them affects another, which sounds to me like the extra sugar that makes it somewhat functional (although this still looks like a normal challenge for 8-bit game crackers ;), not even "hard" level). But that also means you would need either a very cunning external tool to calculate that cyclic tree of dependencies and insert the guard templates in the correct order, or to do it all manually again.
Of course, when doing it manually, you have to redo it after each new C compilation, so it's worth the effort only for something very precious and expensive, or rock-solid stable, where you will not produce another revision for the next 10 years or so... :D
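To make the arithmetic of the guard template concrete, here is a minimal sketch of an additive checksum over 32-bit words, written in Python rather than assembly. The names additive_checksum and guard are hypothetical, and the byte string is an arbitrary stand-in for a protected code region. Little-endian words are assumed, as on x86. The sketch covers only the arithmetic, not the relocation problem described above.

```python
import struct

MASK32 = 0xFFFFFFFF  # emulate 32-bit register wrap-around


def additive_checksum(code: bytes) -> int:
    """Sum the region as little-endian 32-bit words, modulo 2**32."""
    if len(code) % 4:
        raise ValueError("region length must be a multiple of 4 bytes")
    total = 0
    for (word,) in struct.iter_unpack("<I", code):
        total = (total + word) & MASK32
    return total


def guard(code: bytes, expected: int) -> int:
    """Return the value the template adds to ebp: 0 if intact, non-zero if tampered.

    Mirrors 'add ebp, -checksum' followed by adding every word of the region.
    """
    return (additive_checksum(code) - expected) & MASK32


# Arbitrary 16-byte stand-in for a protected region (an assumption for the demo).
region = bytes.fromhex("5589e583ec10c745fc00000000c9c390")
expected = additive_checksum(region)  # computed once, after the final build

tampered = bytearray(region)
tampered[5] ^= 0xFF  # simulate an attacker patching one byte

print(guard(region, expected))           # intact region leaves ebp unchanged
print(guard(bytes(tampered), expected))  # patched region skews ebp
```

An external post-processing tool of the kind suggested above would compute expected the same way over the final bytes and patch it into the guard.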

Decoding 68k instructions

I'm writing an interpreted 68k emulator as a personal/educational project. Right now I'm trying to develop a simple, general decoding mechanism.
As I understand it, the first two bytes of each instruction are enough to uniquely identify the operation (with two rare exceptions) and the number of words left to be read, if any.
Here is what I would like to accomplish in my decoding phase:
1. read two bytes
2. determine which instruction it is
3. extract the operands
4. pass the opcode and the operands on to the execute phase
I can't just pass the first two bytes into a lookup table like I could with the first few bits in a RISC arch, because operands are "in the way". How can I accomplish part 2 in a general way?
Broadly, my question is: How do I remove the variability of operands from the decoding process?
More background:
Here is a partial table from section 8.2 of the Programmer's Reference Manual:
Table 8.2. Operation Code Map
Bits 15-12 Operation
0000 Bit Manipulation/MOVEP/Immediate
0001 Move Byte
...
1110 Shift/Rotate/Bit Field
1111 Coprocessor Interface...
This made great sense to me, but then I look at the bit patterns for each instruction and notice that there isn't a single instruction where bits 15-12 are 0001, 0010, or 0011. There must be some big piece of the picture that I'm missing.
This Decoding Z80 Opcodes site explains decoding explicitly, which is something I haven't found in the 68k programmer's reference manual or by googling.
I've decided to simply create a look-up table with every possible pattern for each instruction. It was my first idea, but I discarded it as "wasteful, inelegant". Now, I'm accepting it as "really fast".
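A minimal sketch of that lookup-table approach: each instruction is described by a (mask, match) pair over the first opcode word, and a 65,536-entry table is filled once at start-up so that decoding becomes a single index operation. The names PATTERNS, build_table and decode are illustrative. Only three instructions are listed (NOP, RTS, MOVEQ); a real table would need one pattern per instruction form, ordered from most to least specific.

```python
# (mask, match, name): an opcode word w belongs to the instruction
# when (w & mask) == match. The mask blanks out the operand bits.
PATTERNS = [
    (0xFFFF, 0x4E71, "NOP"),
    (0xFFFF, 0x4E75, "RTS"),
    (0xF100, 0x7000, "MOVEQ"),  # 0111 rrr0 dddddddd
]


def build_table(patterns):
    """Expand the patterns into a table indexed by the full 16-bit opcode word."""
    table = [None] * 0x10000
    for word in range(0x10000):
        for mask, match, name in patterns:
            if word & mask == match:
                table[word] = name
                break  # first (most specific) pattern wins
    return table


TABLE = build_table(PATTERNS)


def decode(word):
    """Return (name, operands) for one opcode word; name is None if unknown."""
    name = TABLE[word]
    if name == "MOVEQ":
        reg = (word >> 9) & 0x7          # destination data register
        data = word & 0xFF               # 8-bit immediate
        if data >= 0x80:                 # sign-extend
            data -= 0x100
        return name, (data, reg)
    return name, ()


print(decode(0x4E71))  # NOP
print(decode(0x7605))  # MOVEQ #5,D3
```

In an emulator the table entries would more likely be handler functions than names, so the execute phase could call TABLE[word] directly.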

How do I go about Power and Square root functions in Assembly(IA32)?

How do I go about implementing power and square root functions in assembly language (with or without the stack) on Linux?
Edit 1: I'm programming for Intel x86.
In x86 assembly, there is no instruction for a power operation, but you can build your own Power() routine by expressing the power in terms of logarithms, since x^y = 2^(y * log2 x) for x > 0.
The following two instructions calculate logarithms:
FYL2X ; Replace ST(1) with (ST(1) * log2 ST(0)) and pop the register stack.
FYL2XP1 ; Replace ST(1) with (ST(1) * log2(ST(0) + 1.0)) and pop the register stack.
There are several ways to compute the square root:
(1) You can use the FPU instruction
FSQRT ; Computes square root of ST(0) and stores the result in ST(0).
(2) Alternatively, you can use the following SSE/SSE2 instructions:
SQRTPD xmm1, xmm2/m128 ; Compute Square Roots of Packed Double-Precision Floating-Point Values
SQRTPS xmm1, xmm2/m128 ; Compute Square Roots of Packed Single-Precision Floating-Point Values
SQRTSS xmm1, xmm2/m32 ; Compute Square Root of Scalar Single-Precision Floating-Point Value
SQRTSD xmm1, xmm2/m64 ; Compute Square Root of Scalar Double-Precision Floating-Point Value
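A minimal sketch of the logarithm approach to Power() described above, in Python rather than assembly so the arithmetic can be checked directly. The name power_via_log2 is hypothetical. Splitting the exponent into integer and fractional parts is an assumption about how an x87 routine would typically finish the job: F2XM1 only accepts arguments between -1 and 1, and FSCALE applies the integer power of two. Neither instruction is mentioned in the answer itself.

```python
import math


def power_via_log2(x: float, y: float) -> float:
    """Compute x**y for x > 0 as 2**(y * log2(x))."""
    if x <= 0:
        raise ValueError("this sketch only handles x > 0")
    t = y * math.log2(x)        # the quantity FYL2X produces
    int_part = math.floor(t)    # integer exponent (the FSCALE step)
    frac = t - int_part         # 0 <= frac < 1 (in range for F2XM1)
    return math.ldexp(2.0 ** frac, int_part)  # 2**frac * 2**int_part


print(power_via_log2(2.0, 10.0))
print(power_via_log2(9.0, 0.5))
```

The results are subject to floating-point rounding, so they should be compared with a tolerance rather than for exact equality.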
Write a simple, few-line C program that performs the task you are interested in. Compile it to an object file and disassemble that object. Look at how the compiler prepares to call the math function and how it calls it, then take the disassembled code segments as the starting point for your assembly and go from there.
Now, if you are talking about an embedded system with no operating system, the problem is not the operating system but the C/math library. Those libraries, in these functions or others, may rely on operating system calls, which won't be valid. Ideally, though, it is the exact same mechanism: prepare for the function call by setting up the right registers, make the call to the function, and use the results. With embedded systems, your problem comes when you try to link your code with the library and/or when you try to execute it.
If you are asking how to re-create this functionality from discrete instructions without using a pre-made library, that is a completely different topic, especially if you are using a processor without those instructions. You can learn a little by looking at the library's source code for those functions and/or the disassembly of the functions in question, but it is likely not obvious. Look for the book "Hacker's Delight" or a similar one, which is packed full of things like performing math functions that are not natively supported by the language or processor.

Reduce assembly number of instructions

I want to reduce (manually) the number of instructions in a Linux assembly file. This will basically be done by searching for predefined reductions in an abstract syntax tree.
For example:
pushl <reg1>
popl <reg1>
Will be deleted because it makes no sense.
Or:
pushl <something1>
popl <something2>
Will become:
movl <something1>, <something2>
I'm looking for other optimizations that involve a fixed number of instructions. I don't want to search dynamic ranges of instructions.
Could you suggest other similar patterns that can be replaced with fewer instructions?
Later Edit: Found out, thanks to Richard Pennington, that what I want is peephole optimization.
So I rephrase the question as: suggestions for peephole optimization on Linux assembly code.
Compilers already do such optimizations. Besides, deciding to apply them is not that straightforward, because:
push reg1
pop reg1
still leaves the value of reg1 at memory location [sp-nn] (where nn is the size of reg1 in bytes). So although sp has moved past it, the code that follows can assume [sp-nn] contains the value of reg1.
The same applies to the other optimization as well:
push some1
pop some2
That sequence is usually emitted only when there is no equivalent movl some1, some2 instruction (for example, x86 has no memory-to-memory mov).
If you're trying to optimize code generated by a high-level compiler, compilers usually take most of those cases into account. If you're trying to optimize hand-written assembly code, then the assembly programmer should write better code in the first place.
I would suggest optimizing the compiler rather than the assembly code; it would give you a better framework for dealing with the intent of the code, register usage, etc.
To get more information about what you are trying to do, you might want to look for "peephole optimization".
pushl <something1>
popl <something2>
replaced with
mov <something1>, <something2>
actually increased the size of my program. Weird!
Could you provide some other possible peephole optimizations?
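A minimal sketch of a two-instruction peephole pass covering only the two patterns from the question, over a hypothetical representation where each instruction is a tuple of mnemonic and AT&T-style operands. The names is_register and peephole are illustrative. The pass assumes, as the question does, that nothing later reads the stale value left below the stack pointer. It rewrites push/pop to movl only when at least one operand is a register, because x86 has no memory-to-memory mov.

```python
def is_register(operand: str) -> bool:
    """Treat AT&T operands that start with '%' as plain registers (an assumption)."""
    return operand.startswith("%")


def peephole(instrs):
    """Apply fixed-size rewrites to a list of (mnemonic, *operands) tuples."""
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if cur[0] == "pushl" and nxt is not None and nxt[0] == "popl":
            src, dst = cur[1], nxt[1]
            if src == dst and is_register(src):
                i += 2                           # push r; pop r  ->  nothing
                continue
            if src != dst and (is_register(src) or is_register(dst)):
                out.append(("movl", src, dst))   # push a; pop b  ->  movl a, b
                i += 2
                continue
        out.append(cur)                          # no pattern matched: keep as-is
        i += 1
    return out


program = [
    ("pushl", "%eax"), ("popl", "%eax"),
    ("pushl", "8(%ebp)"), ("popl", "%ebx"),
    ("pushl", "(%esi)"), ("popl", "(%edi)"),     # memory to memory: left alone
    ("ret",),
]
for ins in peephole(program):
    print(ins)
```

Whether a rewrite shrinks the program depends on instruction encodings: a register push/pop pair is two one-byte instructions, while the equivalent movl takes at least two bytes. That is consistent with the size increase reported above.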

Resources