SetEnvironmentVariable within a 32-bit process on 64-bit Windows

I recently ran into an interesting issue. After calling SetEnvironmentVariable, I can use Process Explorer to see the newly created environment variable. However, when the process is 32-bit and the OS is 64-bit, Process Explorer (at least v10 through the latest v11.33) cannot find the new variable. If the program is native 64-bit, everything works fine, as it does for a 32-bit process on a 32-bit OS.
The SetEnvironmentVariable call itself succeeds: the return value is TRUE, and GetEnvironmentVariable returns the correct value. Also, if you create a child process, Process Explorer shows that the variable was correctly set in the new process.
I'm not sure if this is a limitation of SysWOW64 or a bug in Process Explorer. Does anyone know?
And is there any way to see the 32-bit environment variables correctly (for example, by forcing Process Explorer to run in 32-bit mode, or with some other tool)?
Sample source to reproduce:
#include <stdio.h>
#include <windows.h>

int main(int argc, char *argv[])
{
    printf("setting variable... %s\n",
           SetEnvironmentVariable("a_new_var", "1.0") ? "OK" : "FAILED");
    printf("press any key to continue...\n");
    getchar();
    // system(argv[0]); // uncomment to inspect the child process
    return 0;
}

I'm not sure exactly how WOW64 works, but I'm pretty (99%) sure two PEBs (Process Environment Blocks) are created: a 32-bit one and a 64-bit one. The process parameter structures (RTL_USER_PROCESS_PARAMETERS) are probably duplicated as well. So when you call SetEnvironmentVariable, it only modifies the 32-bit environment block. Process Explorer runs as a native 64-bit program, which means it only knows about the 64-bit PEB and the 64-bit environment block (which hasn't changed).
Update (2010-07-10):
Just some new info on this old topic: you can find the 32-bit PEB by calling NtQueryInformationProcess with the ProcessWow64Information information class. It gives you a PVOID with the address of the 32-bit PEB (or NULL if the target is not a WOW64 process).
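A minimal sketch of that call (querying the current process for simplicity; for another process you would open a handle with PROCESS_QUERY_INFORMATION):

#include <windows.h>
#include <winternl.h>
#include <stdio.h>

typedef NTSTATUS (NTAPI *PFN_NtQueryInformationProcess)(
    HANDLE, PROCESSINFOCLASS, PVOID, ULONG, PULONG);

int main(void)
{
    /* NtQueryInformationProcess has no import-library guarantee;
       resolve it from ntdll at runtime. */
    PFN_NtQueryInformationProcess pNtQIP =
        (PFN_NtQueryInformationProcess)GetProcAddress(
            GetModuleHandleA("ntdll.dll"), "NtQueryInformationProcess");
    PVOID peb32 = NULL;

    if (!pNtQIP)
        return 1;

    /* ProcessWow64Information fills a PVOID with the address of the
       32-bit PEB, or NULL if the process is not running under WOW64. */
    if (pNtQIP(GetCurrentProcess(), ProcessWow64Information,
               &peb32, sizeof(peb32), NULL) == 0 && peb32 != NULL)
        printf("32-bit PEB at %p\n", peb32);
    else
        printf("not a WOW64 process (or the query failed)\n");
    return 0;
}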

Related

Stack smashing detected while applying stack & register on the remote identical process

Suppose I have an application to be executed on node 1. This application, however, cannot execute some function on node 1 because the node lacks the capability. So, to make the execution work end to end, I plan to steal the process's stack, heap, and registers using ptrace and send them over to a fully capable node 2. On node 2, I run the same executable (same architecture, e.g. x86) up to the exact point the first process has executed, apply the previously captured stack, heap, and register values onto this process, let it execute there, then transfer the results back to node 1 and continue executing the application from there.
I have also disabled ASLR (address space layout randomization), so there is a one-to-one mapping between the address spaces of the processes on the two nodes.
When I apply this logic, the program dies with "stack smashing detected".
Is there anything I am missing here, or is the idea itself not feasible?
NOTE: I am also skipping the part about copying the kernel stack, as the processes on both sides are executed up to exactly the same instruction. Please also note that this is a very simple program that I tried first, since I don't want the complexity of the heap involved yet.
#include <unistd.h>
#include <stdio.h>
#include <signal.h>

void add_one(int *p)
{
    *p += 2;
}

int main(int argc, char **argv)
{
    int i = 0;
    add_one(&i);
    return 0;
}
The listing above is the program I experimented with. I disassembled it and found the address of the function add_one; that is the point at which I steal the stack and registers and send them over to apply to the identical process on node 2.
Any help on how to do such migrations, and on what I am missing, would really help me move forward.
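For context, the register-capture half of such a scheme typically looks something like the following minimal x86-64 sketch (dump_regs is a hypothetical helper, not the poster's actual code):

#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>

/* Attach to a running process, wait for it to stop, and read out
   its register file; this is the state you would ship to node 2. */
int dump_regs(pid_t pid)
{
    struct user_regs_struct regs;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    waitpid(pid, NULL, 0);                   /* wait for the attach stop */
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return -1;
    printf("rip=%llx rsp=%llx rbp=%llx\n",
           (unsigned long long)regs.rip,
           (unsigned long long)regs.rsp,
           (unsigned long long)regs.rbp);
    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return 0;
}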
If you want to do this, you at least need to disable stack canaries, because those will definitely mismatch when you carry the execution over to another machine, even if you copy the entire address space.
-fno-stack-protector will do it.
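For example, assuming gcc and a source file named migrate_test.c (hypothetical name):

gcc -O0 -fno-stack-protector -o migrate_test migrate_test.c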

USB device stuck polling on second call

I have a scientific camera, a USB3 Vision device, that I wrote a program to capture images with. There are no problems when I use it on my desktop; it works fine every time. If I take the exact same application to another computer (two other computers, actually), it acquires one image but hangs the second time you try to capture one.
The desktop I'm developing on is Fedora 25, and I am having this problem on clean installs of both Fedora 24 and 25. Also, on both computers the permissions of the USB device have been set to 666 using a udev rule, so it's not a permissions error.
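For reference, the rule looks something like this (the vendor/product IDs here are hypothetical; the real rule matches the camera's actual IDs):

SUBSYSTEM=="usb", ATTRS{idVendor}=="1ab2", ATTRS{idProduct}=="0001", MODE="0666"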
If I attach gdb to the hung process, it shows the process waiting on a blocking call to poll(); I've tried waiting a really long time, but it never completes. This is where I'm at a bit of a loss: I'm not really sure what to poke at to troubleshoot why that call never finishes on computers that aren't mine. I'm also not sure how to determine what's different between the USB buses on the different computers; that would probably be really useful too.
Example code to recreate this issue is irrelevant IMO, but here it is because it may be asked for if not included:
#include <arv.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    ArvCamera *camera;
    ArvBuffer *buffer;

    camera = arv_camera_new(argc > 1 ? argv[1] : NULL);
    buffer = arv_camera_acquisition(camera, 0);

    if (ARV_IS_BUFFER(buffer))
        printf("Image successfully acquired\n");
    else
        printf("Failed to acquire a single image\n");

    g_clear_object(&camera);
    g_clear_object(&buffer);

    return EXIT_SUCCESS;
}
This is just one of the test programs that come with the Aravis library, which works with GenICam-type cameras.

Numerical regression in x64 with the VS2013 compiler with latest Haswell chips?

We recently started seeing unit tests fail on our build machine (certain numerical calculations fell out of tolerance). Upon investigation we found that some of our developers could not reproduce the test failure. To cut a long story short, we eventually tracked the problem down to what appeared to be a rounding error, but that error was only occurring with x64 builds on the latest Haswell chips (to which our build server was recently upgraded). We narrowed it down and pulled out a single calculation from one of our tests:
#include "stdafx.h"
#include <cmath>
int _tmain(int argc, _TCHAR* argv[])
{
double rate = 0.0021627412080263146;
double T = 4.0246575342465754;
double res = exp(-rate * T);
printf("%0.20e\n", res);
return 0;
}
When we compile this for x64 in VS2013 (with the default compiler switches, including /fp:precise), it gives different results on the older Sandy Bridge chip and the newer Haswell chip. The difference is in the 15th significant digit, which I think is larger than the machine epsilon for double on both machines.
If we compile the same code in VS2010 or VS2012 (or, incidentally, VS2013 x86) we get the exact same answer on both chips.
In the past several years, we've gone through many versions of Visual Studio and many different Intel chips for testing, and no-one can recall us ever having to adjust our regression test expectations based on different rounding errors between chips.
This obviously led to a game of whack-a-mole between developers with the older and newer hardware as to what should be the expectation for the tests...
Is there a compiler option in VS2013 that we need to be using to somehow mitigate the discrepancy?
Update:
Results on Sandy Bridge developer PC:
VS2010-compiled-x64: 9.91333479983898980000e-001
VS2012-compiled-x64: 9.91333479983898980000e-001
VS2013-compiled-x64: 9.91333479983898980000e-001
Results on Haswell build server:
VS2010-compiled-x64: 9.91333479983898980000e-001
VS2012-compiled-x64: 9.91333479983898980000e-001
VS2013-compiled-x64: 9.91333479983899090000e-001
Update:
I used procexp to capture the list of DLLs loaded into the test program.
Sandy Bridge developer PC:
apisetschema.dll
ConsoleApplication8.exe
kernel32.dll
KernelBase.dll
locale.nls
msvcr120.dll
ntdll.dll
Haswell build server:
ConsoleApplication8.exe
kernel32.dll
KernelBase.dll
locale.nls
msvcr120.dll
ntdll.dll
The results you documented are affected by the value of the MXCSR register; the two bits that select the rounding mode matter here. To get the "happy" number you like, you need to force the processor to round down, like this:
#include "stdafx.h"
#include <cmath>
#include <float.h>
int _tmain(int argc, _TCHAR* argv[]) {
unsigned prev;
_controlfp_s(&prev, _RC_DOWN, _MCW_RC);
double rate = 0.0021627412080263146;
double T = 4.0246575342465754;
double res = exp(-rate * T);
printf("%0.20f\n", res);
return 0;
}
Output: 0.99133347998389898000
Change _RC_DOWN to _RC_NEAR to put MXCSR back in the normal rounding mode, the way the operating system programs it before it starts your program; that produces 0.99133347998389909000. In other words, your Haswell machines are in fact producing the expected value.
Exactly how this happened can be very hard to diagnose; the control register is the worst possible global variable you can think of. The usual cause is an injected DLL that reprograms the FPU. A debugger can show the loaded DLLs; compare the lists between the two machines to find a candidate.
This is due to a bug in the VS2013 x64 CRT code that improperly detects AVX and FMA3 support.
It is fixed in an updated 2013 runtime; alternatively, use a newer MSVC version, or just disable the feature at runtime by calling _set_FMA3_enable(0);.
See:
https://support.microsoft.com/en-us/help/3174417/fix-programs-that-are-built-in-visual-c-2013-crash-with-illegal-instruction-exception
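A minimal sketch of that workaround applied to the repro above (assuming the VS2013 x64 CRT, where _set_FMA3_enable is declared in math.h):

#include "stdafx.h"
#include <math.h>

int _tmain(int argc, _TCHAR* argv[])
{
    // Disable the FMA3-based transcendental implementations before the
    // first call to exp(); this affects the whole process from here on.
    _set_FMA3_enable(0);

    double rate = 0.0021627412080263146;
    double T = 4.0246575342465754;
    printf("%0.20e\n", exp(-rate * T));
    return 0;
}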

gdb/ddd Program received signal SIGILL

I wrote a very simple program on Linux in C++ that downloads images from a website over HTTP (basically an HTTP client request), using the cURL library: http://curl.haxx.se/libcurl/c/allfuncs.html
#define CURL_STATICLIB
#include <stdio.h>
#include <stdlib.h>
#include <curl/curl.h>
#include <curl/easy.h>

size_t write_data(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    size_t written = fwrite(ptr, size, nmemb, stream);
    return written;
}

int main(void)
{
    CURL *curl;
    FILE *fp;
    CURLcode res;
    char *url = "http://www.example.com/test_img.png";
    char outfilename[FILENAME_MAX] = "/home/c++_proj/output/web_req_img.png";

    curl = curl_easy_init();
    if (curl) {
        fp = fopen(outfilename, "wb");
        curl_easy_setopt(curl, CURLOPT_URL, url);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
        res = curl_easy_perform(curl);
        /* always cleanup */
        curl_easy_cleanup(curl);
        fclose(fp);
    }
    return 0;
}
I verified the code and it works fine; I can see that the image is downloaded, and I can view it (with no errors or warnings). Since I plan on expanding my code, I tried to install ddd and use the debugger, but the debugger doesn't work: my program exits with some sort of signal error when I try to run it under ddd.
This is the error:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1"
Program received signal SIGILL, Illegal instruction.
0xb6a5c4c0 in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
At first I thought I hadn't properly installed ddd, so I went back to gdb, but I get the exact same error when I run the program. (And I believe I am using the latest versions of gdb and ddd.)
Then I tried ddd on another simple program that doesn't involve the cURL library, and it worked fine!
Does anyone know why this is the case, and what the solution is? Do I somehow need to point ddd at the cURL libraries while it is running? I don't recall ever having to do that with other libraries. Maybe it is something about cURL that ddd doesn't like? But the program runs fine by itself without the debugger. I would appreciate some help.
I am guessing it may be part of some instruction-set detection code. Just let the program continue and see if it handles the signal by itself (since it runs fine outside of gdb, it probably does). Alternatively, you can tell gdb not to bother you with SIGILL at all before you run the program: handle SIGILL pass nostop noprint.
It's only a problem if the program dies, which was not clear from your question.
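For example (./your_program being a placeholder for the binary above):

$ gdb ./your_program
(gdb) handle SIGILL pass nostop noprint
(gdb) run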
Program received signal SIGILL, Illegal instruction.
0xb6a5c4c0 in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
Does anyone know why this is the case, and what is the solution?
Jester gave you the solution. Here's the reason why it happens.
libcrypto.so is OpenSSL's crypto library. OpenSSL performs CPU feature probes by executing an instruction to see if it's available. If a SIGILL is generated, the feature is not available and an appropriate fallback function is used instead.
The reason you see this on ARM and not on IA-32 is that on Intel's IA-32 the cpuid instruction is unprivileged. Any program can execute cpuid to detect CPU features, so there's no need for SIGILL-based feature probing.
In contrast to IA-32, ARM's equivalent of cpuid is a privileged instruction. The probe needs Exception Level 1 (EL1), but your program runs at EL0. To sidestep the need for privileges on ARM, programs set up a jmp_buf and install a SIGILL handler. They then try the instruction in question, and the SIGILL handler indicates whether the instruction or feature is available.
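A minimal sketch of that pattern (probe_insn is a hypothetical helper; OpenSSL's real probing code is structured differently):

#include <setjmp.h>
#include <signal.h>

static sigjmp_buf probe_env;

static void sigill_handler(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);    /* jump back out of the faulting probe */
}

/* Returns 1 if try_insn executed cleanly, 0 if it raised SIGILL. */
static int probe_insn(void (*try_insn)(void))
{
    struct sigaction sa, old;
    int ok = 0;

    sa.sa_handler = sigill_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGILL, &sa, &old);

    if (sigsetjmp(probe_env, 1) == 0) {
        try_insn();              /* may fault with SIGILL */
        ok = 1;
    }

    sigaction(SIGILL, &old, NULL);
    return ok;
}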
OpenSSL recently changed to SIGILL-free feature detection on some Apple platforms because Apple's environment does not handle the probes well. See PR 3108, "SIGILL-free processor capabilities detection on MacOS X". Other libraries are doing the same. Also see "How to determine ARMv8 features at runtime?" on Stack Overflow.
OpenSSL also documents the SIGILL behavior in its FAQ. See item 17, "When debugging I observe SIGILL during OpenSSL initialization: why?", for more details. Also see "SSL_library_init cause SIGILL when running under gdb" on Stack Overflow.
For Android developers you can disable SIGILL in Android Studio:
https://developer.oculus.com/documentation/native/android/mobile-studio-debug/#troubleshooting

Disable randomization of memory addresses

I'm trying to debug a binary that uses a lot of pointers. Sometimes, to see output quickly and figure out errors, I print the addresses of objects and their corresponding values; however, the object addresses are randomized, and this defeats the purpose of that quick check.
Is there a way to disable this temporarily/permanently so that I get the same addresses every time I run the program?
Oops. OS is Linux fsttcs1 2.6.32-28-generic #55-Ubuntu SMP Mon Jan 10 23:42:43 UTC 2011 x86_64 GNU/Linux
On Ubuntu, it can be disabled (as root) with:
echo 0 > /proc/sys/kernel/randomize_va_space
On Windows, this post might be of some help...
http://blog.didierstevens.com/2007/11/20/quickpost-another-funny-vista-trick-with-aslr/
To temporarily disable ASLR for a particular program you can always issue the following (no need for sudo)
setarch `uname -m` -R ./yourProgram
You can also do this programmatically from C source before a UNIX exec.
If you take a look at the sources for setarch (here's one source):
http://code.metager.de/source/xref/linux/utils/util-linux/sys-utils/setarch.c
you can see it boils down to a system call (syscall) or a function call (depending on what your system defines). From setarch.c:
#ifndef HAVE_PERSONALITY
# include <syscall.h>
# define personality(pers) ((long)syscall(SYS_personality, pers))
#endif
On my CentOS 6 64-bit system, it looks like it uses a function (which probably calls the self-same syscall above). Take a look at this snippet from the include file in /usr/include/sys/personality.h (as referenced as <sys/personality.h> in the setarch source code):
/* Set different ABIs (personalities). */
extern int personality (unsigned long int __persona) __THROW;
What it boils down to, is that you can, from C code, call and set the personality to use ADDR_NO_RANDOMIZE and then exec (just like setarch does).
#include <sys/personality.h>
#include <unistd.h>

#ifndef HAVE_PERSONALITY
# include <syscall.h>
# define personality(pers) ((long)syscall(SYS_personality, pers))
#endif

...

void mycode(char **argv)
{
    // If requested, turn off the address randomization feature
    // right before exec'ing.
    if (MyGlobalVar_Turn_Address_Randomization_Off) {
        personality(ADDR_NO_RANDOMIZE);
    }
    execvp(argv[0], argv); // ... just like setarch does
}
It's pretty obvious you can't turn address randomization off in the process you are already in (grin: unless maybe via dynamic loading), so this only affects forks and execs done later. I believe the address-randomization flag is inherited by child sub-processes.
Anyway, that's how you can programmatically turn off address randomization from C source code. This may be your only option if you don't want to force a user to intervene manually and start up with setarch or one of the other solutions listed earlier.
Before you complain about the security implications of turning this off, note that some shared-memory libraries/tools (such as PickingTools shared memory and some IBM databases) need to be able to turn off randomization of memory addresses.
