How to generify data for "mov" instruction? - visual-c++

I need to understand just 1 single instruction and accordingly I need to generify the things.
I need to pass structures (Objects of User Defined Data Types) at runtime using following assembly code.
Where Following is User Defined Data Type namely WESContext :
typedef struct CWESContext
{
BSTR UserName;
BSTR MachineIP;
BSTR Certificate;
BSTR BrowserClienthandle;//Its the handle of the BrowserClient or Java Application Level Object
BSTR SessionID;
BSTR TaskID;// name of the original task
long LocaleID;//The location of the ultimate Caller
long FeatureID;//The feature ID mapping to some feature available in WESFW
long SessionTypeID;//Itmay be; Browser CLient Session, OPC Client Session, Authenticated OPC Clients session(as they have more rights), WESFWSystemClient.
SYSTEMTIME TimeStamp;//the time the original task was executed
DWORD Priority; //task priority of the original task
struct WESProductCategory
{
BSTR ProductCategoryName;
int serialNo;
struct WESDimensions
{
int weight;
struct WESVolume
{
int length;
int heigth;
int width;
} oVolume;
BSTR tempHeight;
BSTR otherUnknownDimensions;
} oDimensions;
} oWESProductCategory;
} CWESContext;
I have created the block enough of size WESContext and filled it with sample data.
int sizeOfWESContext = sizeof(CWESContext);
void *pWESContext = malloc(sizeOfWESContext);
void *pGenericPtr = pWESContext;
memset(pWESContext,0,sizeOfWESContext);
BSTR *bstrUserName = (BSTR*)pGenericPtr;
*bstrUserName = SysAllocString(CT2OLE(CA2T(results.at(0).c_str())));
bstrUserName++;
pGenericPtr = bstrUserName;
BSTR *bstrMachineIp = (BSTR*)pGenericPtr;
*bstrMachineIp = SysAllocString(CT2OLE(CA2T(results.at(1).c_str())));
bstrMachineIp++;
pGenericPtr = bstrMachineIp;
BSTR *bstrCertificate = (BSTR*)pGenericPtr;
*bstrCertificate = SysAllocString(CT2OLE(CA2T(results.at(2).c_str())));
bstrCertificate++;
pGenericPtr = bstrCertificate;
.....................
so on so forth...............
If I call it by passing this as object:
Calling Normaly :
MyCallableMethodUDT(((CWESContext)pWESContext));
Now following assembly i just pulled from Dissasembly view of Visual Studio while debugging.
mov esi,dword ptr [pWESContext]
sub esp,58h
mov ecx,16h
mov edi,esp
rep movs dword ptr es:[edi],dword ptr [esi]
I just need to understand 3rd line..
AS I increase members inside my User Defined Structure (i.e here WESContext) it increases but I am unable to conclude how it increases....? I need to generify this instruction so that whatever the Object is and whatever the size and whatever kind of data it contains....it should get pass by calling it with writing assembly instruction as written above.
Regards,
Usman

ecx is used as the count for the number of dwords to be copied by the rep movs instructions in line 5. It's copying data from the starting address pointed to by esi to the location starting at edi.
The value in ecx would be the size of the data that is being copied.

Related

VC++ SSE code generation - is this a compiler bug?

A very particular code sequence in VC++ generated the following instruction (for Win32):
unpcklpd xmm0,xmmword ptr [ebp-40h]
2 questions arise:
(1) As far as I understand the intel manual, unpcklpd accepts as 2nd argument a 128-aligned memory address. If the address is relative to a stack frame alignment cannot be forced. Is this really a compiler bug?
(2) Exceptions are thrown from at the execution of this instruction only when run from the debugger, and even then not always. Even attaching to the process and executing this code does not throw. How can this be??
The particular exception thrown is access violation at 0xFFFFFFFF, but AFAIK that's just a code for misalignment.
[Edit:]
Here's some source that demonstrates the bad code generation - but typically doesn't cause a crash. (that's mostly what I'm wondering about)
[Edit 2:]
The code sample now reproduces the actual crash. This one also crashes outside the debugger - I suspect the difference occurs because the debugger launches the program at different typical base addresses.
// mock.cpp
#include <stdio.h>
struct mockVect2d
{
double x, y;
mockVect2d() {}
mockVect2d(double a, double b) : x(a), y(b) {}
mockVect2d operator + (const mockVect2d& u) {
return mockVect2d(x + u.x, y + u.y);
}
};
struct MockPoly
{
MockPoly() {}
mockVect2d* m_Vrts;
double m_Area;
int m_Convex;
bool m_ParClear;
void ClearPar() { m_Area = -1.; m_Convex = 0; m_ParClear = true; }
MockPoly(int len) { m_Vrts = new mockVect2d[len]; }
mockVect2d& Vrt(int i) {
if (!m_ParClear) ClearPar();
return m_Vrts[i];
}
const mockVect2d& GetCenter() { return m_Vrts[0]; }
};
struct MockItem
{
MockItem() : Contour(1) {}
MockPoly Contour;
};
struct Mock
{
Mock() {}
MockItem m_item;
virtual int GetCount() { return 2; }
virtual mockVect2d GetCenter() { return mockVect2d(1.0, 2.0); }
virtual MockItem GetItem(int i) { return m_item; }
};
void testInner(int a)
{
int c = 8;
printf("%d", c);
Mock* pMock = new Mock;
int Flag = true;
int nlr = pMock->GetCount();
if (nlr == 0)
return;
int flr = 1;
if (flr == nlr)
return;
if (Flag)
{
if (flr < nlr && flr>0) {
int c = 8;
printf("%d", c);
MockPoly pol(2);
mockVect2d ctr = pMock->GetItem(0).Contour.GetCenter();
// The mess happens here:
// ; 74 : pol.Vrt(1) = ctr + mockVect2d(0., 1.0);
//
// call ? Vrt#MockPoly##QAEAAUmockVect2d##H#Z; MockPoly::Vrt
// movdqa xmm0, XMMWORD PTR $T4[ebp]
// unpcklpd xmm0, QWORD PTR tv190[ebp] **** crash!
// movdqu XMMWORD PTR[eax], xmm0
pol.Vrt(0) = ctr + mockVect2d(1.0, 0.);
pol.Vrt(1) = ctr + mockVect2d(0., 1.0);
}
}
}
void main()
{
testInner(2);
return;
}
If you prefer, download a ready vcxproj with all the switches set from here. This includes the complete ASM too.
Update: this is now a confirmed VC++ compiler bug, hopefully to be resolved in VS2015 RTM.
Edit: The connect report, like many others, is now garbage. However the compiler bug seems to be resolved in VS2017 - not in 2015 update 3.
Since no one else has stepped up, I'm going to take a shot.
1) If the address is relative to a stack frame alignment cannot be forced. Is this really a compiler bug?
I'm not sure it is true that you cannot force alignment for stack variables. Consider this code:
struct foo
{
char a;
int b;
unsigned long long c;
};
int wmain(int argc, wchar_t* argv[])
{
foo moo;
moo.a = 1;
moo.b = 2;
moo.c = 3;
}
Looking at the startup code for main, we see:
00E31AB0 push ebp
00E31AB1 mov ebp,esp
00E31AB3 sub esp,0DCh
00E31AB9 push ebx
00E31ABA push esi
00E31ABB push edi
00E31ABC lea edi,[ebp-0DCh]
00E31AC2 mov ecx,37h
00E31AC7 mov eax,0CCCCCCCCh
00E31ACC rep stos dword ptr es:[edi]
00E31ACE mov eax,dword ptr [___security_cookie (0E440CCh)]
00E31AD3 xor eax,ebp
00E31AD5 mov dword ptr [ebp-4],eax
Adding __declspec(align(16)) to moo gives
01291AB0 push ebx
01291AB1 mov ebx,esp
01291AB3 sub esp,8
01291AB6 and esp,0FFFFFFF0h <------------------------
01291AB9 add esp,4
01291ABC push ebp
01291ABD mov ebp,dword ptr [ebx+4]
01291AC0 mov dword ptr [esp+4],ebp
01291AC4 mov ebp,esp
01291AC6 sub esp,0E8h
01291ACC push esi
01291ACD push edi
01291ACE lea edi,[ebp-0E8h]
01291AD4 mov ecx,3Ah
01291AD9 mov eax,0CCCCCCCCh
01291ADE rep stos dword ptr es:[edi]
01291AE0 mov eax,dword ptr [___security_cookie (12A40CCh)]
01291AE5 xor eax,ebp
01291AE7 mov dword ptr [ebp-4],eax
Apparently the compiler (VS2010 compiled debug for Win32), recognizing that we will need specific alignments for the code, takes steps to ensure it can provide that.
2) Exceptions are thrown from at the execution of this instruction only when run from the debugger, and even then not always. Even attaching to the process and executing this code does not throw. How can this be??
So, a couple of thoughts:
"and even then not always" - Not standing over your shoulder when you run this, I can't say for certain. However it seems plausible that just by random chance, stacks could get created with the alignment you need. By default, x86 uses 4byte stack alignment. If you need 16 byte alignment, you've got a 1 in 4 shot.
As for the rest (from https://msdn.microsoft.com/en-us/library/aa290049%28v=vs.71%29.aspx#ia64alignment_topic4):
On the x86 architecture, the operating system does not make the alignment fault visible to the application. ...you will also suffer performance degradation on the alignment fault, but it will be significantly less severe than on the Itanium, because the hardware will make the multiple accesses of memory to retrieve the unaligned data.
TLDR: Using __declspec(align(16)) should give you the alignment you want, even for stack variables. For unaligned accesses, the OS will catch the exception and handle it for you (at a cost of performance).
Edit1: Responding to the first 2 comments below:
Based on MS's docs, you are correct about the alignment of stack parameters, but they propose a solution as well:
You cannot specify alignment for function parameters. When data that
has an alignment attribute is passed by value on the stack, its
alignment is controlled by the calling convention. If data alignment
is important in the called function, copy the parameter into correctly
aligned memory before use.
Neither your sample on Microsoft connect nor the code about produce the same code for me (I'm only on vs2010), so I can't test this. But given this code from your sample:
struct mockVect2d
{
double x, y;
mockVect2d(double a, double b) : x(a), y(b) {}
It would seem that aligning either mockVect2d or the 2 doubles might help.

How to restore stack frame in gcc?

I want to build my own checkpoint library. I'm able to save the stack frame to a file calling checkpoint_here(stack pointer) and that can be restored later via calling recover(stack pointer) function.
Here is my problem: I'm able to jump from function recover(sp) to main() but the stack frame gets changed(stack pointer,frame pointer). So I want to jump to main from recover(sp) just after is checkpoint_here(sp) called retaining the stack frame of main(). I've tried setjmp/longjmp but can't make them working.Thanks in anticipation.
//jmp_buf env;
void *get_pc () { return __builtin_return_address(1); }
void checkpoint_here(register int *sp){
//printf("%p\n",get_pc());
void *pc;
pc=get_pc();//getting the program counter of caller
//printf("pc inside chk:%p\n",pc);
size_t i;
long size;
//if(!setjmp(env)){
void *l=__builtin_frame_address(1);//frame pointer of caller
int fd=open("ckpt1.bin", O_WRONLY|O_CREAT,S_IWUSR|S_IRUSR|S_IRGRP);
int mfd=open("map.bin", O_WRONLY|O_CREAT,S_IWUSR|S_IRUSR|S_IRGRP);
size=(long)l-(long)sp;
//printf("s->%ld\n",size);
write(mfd,&size,sizeof(long)); //writing the size of the data to be written to file.
write(mfd,&pc,sizeof(long)); //writing program counter of the caller.
write(fd,(char *)sp,(long)l-(long)sp); //writing local variables on the stack frame of caller.
close(fd);
close(mfd);
//}
}
void recover(register int *sp){
//int dummy;
long size;
void *pc;
//printf("old %p\n",sp);
/*void *newsp=(void *)&dummy;
printf("new %p old %p\n",newsp,sp);
if(newsp>=(void *)sp)
recover(sp);*/
int fd=open("ckpt1.bin", O_RDONLY,0644);
int mfd=open("map.bin", O_RDONLY,0644);
read(mfd,&size,sizeof(long)); //reading size of data written
read(mfd,&pc,sizeof(long)); //reading program counter
read(fd,(char *)sp,size); //reading local variables
close(mfd);
close(fd);
//printf("got->%ld\n",size);
//longjmp(env,1);
void (*foo)(void) =pc;
foo(); //trying to jump to main just after checkpoint_here() is called.
//asm volatile("jmp %0" : : "r" (pc));
}
int main(int argc,char **argv)
{
register int *sp asm ("rsp");
if(argc==2){
if(strcmp(argv[1],"recover")==0){
recover(sp); //restoring local variables
exit(0);
}
}
int a, b, c;
float s, area;
char x='a';
printf("Enter the sides of triangle\n");
//printf("\na->%p b->%p c->%p s->%p area->%p\n",&a,&b,&c,&s,&area);
scanf("%d %d %d",&a,&b,&c);
s = (a+b+c)/2.0;
//printf("%p\n",get_pc());
checkpoint_here(sp); //saving stack
//printf("here\n");
//printf("nsp->%p\n",sp);
area = (s*(s-a)*(s-b)*(s-c));
printf("%d %d %d %f %f %d\n",a,b,c,s,area,x);
printf("Area of triangle = %f\n", area);
printf("%f\n",s);
return 0;
}
You cannot do that in general.
You might try non-portable extended asm instructions (to restore %rsp and %rbp on x86-64). You could use longjmp (see setjmp(3) and longjmp(3)) -since longjmp is restoring the stack pointer- assuming you understand the implementation details.
The stack has, thanks to ASLR, a "random", non reproducible, location. In other words, if you start twice the same program, the stack pointer of main would be different. And in C some stack frames contain a pointer into other stack frames. See also this answer.
Read more about application checkpointing (see this) and study the source code (or use) BLCR.
You could perhaps restrict the C code to be used (e.g. if you generate the C code) and you might perhaps extend GCC using MELT for your needs. This is a significant amount of work.
BTW, MELT is (internally also) generating C++ code, with restricted stack frames which could be easily checkpointable. You could take that as an inspiration source.
Read also about x86 calling conventions and garbage collection (since a precise GC has to scan local pointers, which is similar to your needs).

How to implement GetThreadContext in Linux/Unix?

GetThreadContext is a Windows API.
BOOL WINAPI GetThreadContext(
_In_ HANDLE hThread,
_Inout_ LPCONTEXT lpContext
);
I wonder that how to implement it in linux.
How to retrieves the register information of the specified thread in Linux?
Like this:
pthread_create(thread_id, ...);
...
func(thread_id, reg_info)
{
//get the reg_info by thread_id.
??
}
A Linux-specific way of getting thread info is to use get_thread_area(). From the get_thread_area() man page:
get_thread_area() returns an entry in the current thread's Thread Local Storage (TLS) array. The index of the entry corresponds to the value of u_info->entry_number, passed in by the user. If the value is in bounds, get_thread_area() copies the corresponding TLS entry into the area pointed to by u_info.
But, if you want to read the register value you need to take help of inline assembly. Fox example, to retrieve the value of esp you can use the following inline assembly:
unsigned sp;
__asm __volatile("movl %%esp, %0" : "=r" (sp));
return sp;
In this way, you can extract ebp, eip etc. Hope this will help!

Check validity of virtual memory address

I am iterating through the pages between VMALLOC_START and VMALLOC_END and I want to
check if the address that I get every time is valid.
How can I manage this?
I iterate through the pages like this:
unsigned long *p;
for(p = (unsigned long *) VMALLOC_START; p <= (unsigned long *) (VMALLOC_END - PAGE_SIZE); p += PAGE_SIZE)
{
//How to check if p is OK to access it?
}
Thanks!
The easiest way is to try to red it, and catch the exception.
Catching the exception is done by defining an entry in the __ex_table secion, using inline assembly.
The exception table entry contains a pointer to a memory access instruction, and a pointer to a recovery address. If an segfault happens on this instruction, EIP will be set to the recovery address.
Something like this (I didn't test this, I may be missing something):
void *ptr=whatever;
int ok=1;
asm(
"1: mov (%1),%1\n" // Try to access
"jmp 3f\n" // Success - skip error handling
"2: mov $0,%0\n" // Error - set ok=0
"3:\n" // Jump here on success
"\n.section __ex_table,\"a\""
".long 1b,2b\n" // Use .quad for 64bit.
".prev\n"
:"=r"(ok) : "r"(ptr)
);

Can I Read Windows Registry From Windows Service in VC++?

Can I Read Windows Registry From Windows Service in VC++ ?
I have written code to read from Registry in MFC Application.It works fine but the same code does not work in Windows Service project.
My code looks like this:
TCHAR szPasswordDecrypted[32] = _T("");
TCHAR* szEncryptPwd = NULL ;
HKEY hKey = NULL;
TCHAR achKey[MAX_KEY_LENGTH]; // buffer for subkey name
DWORD cbName; // size of name string
TCHAR achClass[MAX_PATH] = TEXT(""); // buffer for class name
DWORD cchClassName = MAX_PATH; // size of class string
DWORD cSubKeys=0; // number of subkeys
DWORD cbMaxSubKey; // longest subkey size
DWORD cchMaxClass; // longest class string
DWORD cValues; // number of values for key
DWORD cchMaxValue; // longest value name
DWORD cbMaxValueData; // longest value data
DWORD cbSecurityDescriptor; // size of security descriptor
FILETIME ftLastWriteTime; // last write time
DWORD i, retCode;
long lg = RegOpenKeyEx( HKEY_CURRENT_USER,
TEXT("SOFTWARE\\NetworkDriveSolution"),
0,
KEY_READ,
&hKey
);
// Get the class name and the value count.
retCode = RegQueryInfoKey(
hKey, // key handle
achClass, // buffer for class name
&cchClassName, // size of class string
NULL, // reserved
&cSubKeys, // number of subkeys
&cbMaxSubKey, // longest subkey size
&cchMaxClass, // longest class string
&cValues, // number of values for this key
&cchMaxValue, // longest value name
&cbMaxValueData, // longest value data
&cbSecurityDescriptor, // security descriptor
&ftLastWriteTime); // last write time
HKEY_CURRENT_USER refers to the user the process is running as. Your service will likely run as LocalSystem, not as you. Best bet is to store information under HKEY_LOCAL_MACHINE.
You must use HKEY_LOCAL_MACHINE. You must have time calculation in this service.
As well you must use Registry APIs. Mainly RegNotifyChangeKeyValue(..) to monitor change of event.

Resources