In ETW, how to enable ProcessRundown events for Microsoft-Windows-Kernel-Process?

The provider's manifest indicates that it can send Microsoft-Windows-Kernel-Process::ProcessRundown::Info events, which I'd really like to have: they give a summary of the processes that existed at the time the trace started.
For reference, in the "usual" process provider enabled by EVENT_TRACE_FLAG_PROCESS, the rundown is sent automatically via MSNT_SystemTrace::Process::DCStart events. However, the data fields in that provider do not allow finding the process's image path: the ImageFileName field is an ANSI filename without a path, and the CommandLine field is also unreliable, because it can contain a relative path (in the worst case, no path at all). For this reason, I need the Microsoft-Windows-Kernel-Process provider.

After quite a lot of trying, I found a very simple way: after the provider is enabled with EnableTraceEx2(EVENT_CONTROL_CODE_ENABLE_PROVIDER), an additional EnableTraceEx2(EVENT_CONTROL_CODE_CAPTURE_STATE) call makes it send the events.
I enable the provider this way:
namespace Microsoft_Windows_Kernel_Process
{
    struct __declspec(uuid("{22FB2CD6-0E7B-422B-A0C7-2FAD1FD0E716}")) GUID_STRUCT;
    static const auto GUID = __uuidof(GUID_STRUCT);
    enum class Keyword : u64
    {
        WINEVENT_KEYWORD_PROCESS = 0x10,
        WINEVENT_KEYWORD_THREAD = 0x20,
        WINEVENT_KEYWORD_IMAGE = 0x40,
        WINEVENT_KEYWORD_CPU_PRIORITY = 0x80,
        WINEVENT_KEYWORD_OTHER_PRIORITY = 0x100,
        WINEVENT_KEYWORD_PROCESS_FREEZE = 0x200,
        Microsoft_Windows_Kernel_Process_Analytic = 0x8000000000000000,
    };
}
///////////////////////////////////
const u64 matchAnyKeyword =
    (u64)Microsoft_Windows_Kernel_Process::Keyword::WINEVENT_KEYWORD_PROCESS;
const ULONG status = EnableTraceEx2(
    m_SessionHandle,
    &Microsoft_Windows_Kernel_Process::GUID,
    EVENT_CONTROL_CODE_ENABLE_PROVIDER,
    TRACE_LEVEL_VERBOSE,
    matchAnyKeyword, // Filter events to specific keyword
    0,               // No 'MatchAllKeyword' mask
    INFINITE,        // Synchronous operation
    nullptr          // The trace parameters used to enable the provider
);
ENSURE_OR_CRASH(ERROR_SUCCESS == status);
And request the rundown like this:
const ULONG status = EnableTraceEx2(
    m_SessionHandle,
    &Microsoft_Windows_Kernel_Process::GUID,
    EVENT_CONTROL_CODE_CAPTURE_STATE, // Request 'ProcessRundown' events
    TRACE_LEVEL_NONE,                 // Probably ignored for 'EVENT_CONTROL_CODE_CAPTURE_STATE'
    0,                                // Probably ignored for 'EVENT_CONTROL_CODE_CAPTURE_STATE'
    0,                                // Probably ignored for 'EVENT_CONTROL_CODE_CAPTURE_STATE'
    INFINITE,                         // Synchronous operation
    nullptr                           // Probably ignored for 'EVENT_CONTROL_CODE_CAPTURE_STATE'
);
ENSURE_OR_CRASH(ERROR_SUCCESS == status);


What's the exact Linux equivalent of Windows' WaitOnAddress()?

My C++ program uses shared memory obtained with the shmget() system call. Its aim is to fetch a bid price from the Internet through a server written in Rust, so that each time the value changes, I perform a financial transaction.
Server pseudocode
Shared_struct.price = new_price
Client pseudocode
Infinite_loop:
Wait until the memory address pointed to by Shared_struct.price changes.
Launch_transaction(Shared_struct.price*1.13)
Goto Infinite_loop
Since launching a transaction involves paying transaction fees, I want to create a transaction only once per buy price change.
Using a semaphore or a futex, I can do the reverse, meaning wait for a variable to reach a specific value, but how do I wait until a variable is no longer equal to its current value?
On Windows, by contrast, I can do something like this on the address of the shared segment:
ULONG g_TargetValue; // global, accessible to all processes
ULONG CapturedValue;
ULONG UndesiredValue;
UndesiredValue = 0;
CapturedValue = g_TargetValue;
while (CapturedValue == UndesiredValue) {
    WaitOnAddress(&g_TargetValue, &UndesiredValue, sizeof(ULONG), INFINITE);
    CapturedValue = g_TargetValue;
}
Is there a way to do this on Linux? Or a straight equivalent?
You can use a futex. (I assume "var" lives in the shared memory segment.)
/* futex() stands for a thin wrapper around syscall(SYS_futex, ...); glibc does not export one. */
/* Client */
while (1) {
    int prv = var;
    int ret = futex(&var, FUTEX_WAIT, prv, NULL, NULL, 0);
    /* Spurious wake-up: woken, but the value has not changed */
    if (!ret && var == prv) continue;
    doTransaction();
}
/* Server */
int prv = NOT_CACHED;
while (1) {
    var = updateVar();
    if (var != prv || prv == NOT_CACHED)
        futex(&var, FUTEX_WAKE, 1, NULL, NULL, 0);
    prv = var;
}
It requires the server side to call futex as well to notify client(s).
Note that the same holds true for WaitOnAddress.
According to MSDN:
Any thread within the same process that changes the value at the address on which threads are waiting should call WakeByAddressSingle to wake a single waiting thread or WakeByAddressAll to wake all waiting threads.
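For reference, glibc does not export a futex() function, so calls like the ones above need a small wrapper over syscall(2). Here is a minimal Linux-only sketch, assuming the watched value is a 32-bit word in the shared segment. The names futex_call, wait_for_change and publish are made up for illustration and are not part of any library.

```cpp
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <atomic>
#include <cassert>
#include <climits>
#include <cstdint>

// Assumption: std::atomic<uint32_t> has the same layout as a plain uint32_t,
// which holds on mainstream Linux toolchains.
static_assert(sizeof(std::atomic<std::uint32_t>) == sizeof(std::uint32_t),
              "atomic word must be exactly 32 bits wide");

// Hypothetical helper: thin wrapper over the raw futex system call.
// FUTEX_WAIT / FUTEX_WAKE without FUTEX_PRIVATE_FLAG also work across processes.
static long futex_call(std::atomic<std::uint32_t>* addr, int op, std::uint32_t val) {
    return syscall(SYS_futex, reinterpret_cast<std::uint32_t*>(addr), op, val,
                   nullptr, nullptr, 0);
}

// Blocks until *addr no longer equals old_value, then returns the new value.
std::uint32_t wait_for_change(std::atomic<std::uint32_t>* addr, std::uint32_t old_value) {
    std::uint32_t current;
    while ((current = addr->load(std::memory_order_acquire)) == old_value) {
        // Returns at once (EAGAIN) if the word already differs from old_value;
        // spurious or interrupted wake-ups are absorbed by the loop.
        futex_call(addr, FUTEX_WAIT, old_value);
    }
    return current;
}

// Stores a new value and wakes every waiter; the writer must always do this.
void publish(std::atomic<std::uint32_t>* addr, std::uint32_t new_value) {
    addr->store(new_value, std::memory_order_release);
    futex_call(addr, FUTEX_WAKE, INT_MAX);
}
```

The client would remember the last price it acted on and pass it as old_value. Note that a price that changes and then changes back before the client wakes up goes unnoticed with this scheme.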
(Added)
A higher-level synchronization method for this problem is to use a condition variable.
It is also implemented on top of futex.
See link
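A rough sketch of the condition variable route, assuming the mutex, the condition variable and the price all live in the shared segment and are initialised once by the process that creates it. The SharedPrice struct and the shared_price_* functions are hypothetical names; only the pthread calls are real.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical layout of the shared segment.
struct SharedPrice {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    int             price;
};

// Called once by the creator of the segment. PTHREAD_PROCESS_SHARED lets
// other processes that map the same memory use the mutex and the condvar.
void shared_price_init(SharedPrice* s, int initial) {
    pthread_mutexattr_t mutex_attr;
    pthread_mutexattr_init(&mutex_attr);
    pthread_mutexattr_setpshared(&mutex_attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&s->lock, &mutex_attr);
    pthread_mutexattr_destroy(&mutex_attr);

    pthread_condattr_t cond_attr;
    pthread_condattr_init(&cond_attr);
    pthread_condattr_setpshared(&cond_attr, PTHREAD_PROCESS_SHARED);
    pthread_cond_init(&s->changed, &cond_attr);
    pthread_condattr_destroy(&cond_attr);

    s->price = initial;
}

// Server side: store the price and wake waiters only if it really changed.
void shared_price_publish(SharedPrice* s, int new_price) {
    pthread_mutex_lock(&s->lock);
    if (s->price != new_price) {
        s->price = new_price;
        pthread_cond_broadcast(&s->changed);
    }
    pthread_mutex_unlock(&s->lock);
}

// Client side: block until the price differs from last_seen, return the new one.
int shared_price_wait_change(SharedPrice* s, int last_seen) {
    pthread_mutex_lock(&s->lock);
    while (s->price == last_seen) {          // loop also absorbs spurious wake-ups
        pthread_cond_wait(&s->changed, &s->lock);
    }
    int current = s->price;
    pthread_mutex_unlock(&s->lock);
    return current;
}
```

Since the server in the question is written in Rust, it would have to operate on the same pthread objects through FFI, which may make the raw futex approach simpler in practice.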

How can I check if I can write to a file without elevated token?

I know that this question has been asked by many other people, but my case has one important difference. I can't just open a file handle with the write option, because I want to check whether I can write to this file without an elevated token.
For example, the user can run my installer as administrator, so I can write to almost all folders, but after installation my program won't work.
I thought that I could just get the token of my process, disable all privileges, and apply it to a new thread. I did that, but it doesn't work.
I don't want to include the code that I wrote because there is a lot of insignificant stuff in it. Instead, I'll just describe the order of the functions that I use.
GetCurrentProcess
OpenProcessToken
DuplicateTokenEx - somebody told me that it removes the elevation
AdjustTokenPrivileges - I call it with DisableAllPrivileges = true
CreateThread - Suspend = true
SetThreadToken
CreateFile
I can still write to all folders. What am I doing wrong?
To create a non-elevated token from your existing token, you can use the CreateRestrictedToken function with the LUA_TOKEN flag. This is in fact what UAC does when it creates a restricted version of an existing access token on interactive logon. You can then attempt the access with this token. Note also that a new thread is not needed: the current thread can temporarily impersonate with this LUA token and then revert.
So the code can look like:
inline ULONG BOOL_TO_ERROR(BOOL f)
{
    return f ? NOERROR : GetLastError();
}
ULONG CheckFileWriteAccess(PCWSTR FileName, ULONG& dwFileError)
{
    HANDLE hToken, hLuaToken;
    ULONG dwError = BOOL_TO_ERROR(OpenProcessToken(GetCurrentProcess(), TOKEN_DUPLICATE, &hToken));
    if (dwError == NOERROR)
    {
        dwError = BOOL_TO_ERROR(CreateRestrictedToken(hToken, LUA_TOKEN, 0, 0, 0, 0, 0, 0, &hLuaToken));
        CloseHandle(hToken);
        if (dwError == NOERROR)
        {
            dwError = BOOL_TO_ERROR(DuplicateToken(hLuaToken, ::SecurityImpersonation, &hToken));
            CloseHandle(hLuaToken);
            if (dwError == NOERROR)
            {
                dwError = BOOL_TO_ERROR(SetThreadToken(0, hToken));
                CloseHandle(hToken);
                if (dwError == NOERROR)
                {
                    HANDLE hFile = CreateFileW(FileName, FILE_GENERIC_WRITE, FILE_SHARE_VALID_FLAGS, 0,
                        OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, 0);
                    if (hFile != INVALID_HANDLE_VALUE)
                    {
                        CloseHandle(hFile);
                        dwFileError = NOERROR;
                    }
                    else
                    {
                        dwFileError = GetLastError();
                        NTSTATUS status = RtlGetLastNtStatus();
                        if (RtlNtStatusToDosError(status) == dwFileError)
                        {
                            dwFileError = HRESULT_FROM_NT(status);
                        }
                    }
                    SetThreadToken(0, 0);
                }
            }
        }
    }
    return dwError;
}
Note also this snippet:
dwFileError = GetLastError();
NTSTATUS status = RtlGetLastNtStatus();
if (RtlNtStatusToDosError(status) == dwFileError)
{
    dwFileError = HRESULT_FROM_NT(status);
}
The Win32 error returned by the CreateFileW API is frequently ambiguous, because many NTSTATUS errors with different meanings are mapped to a single Win32 error. So it is always better to check RtlGetLastNtStatus() instead of GetLastError(). Even better, of course, is to use NtOpenFile (it is documented, user mode, supported, and will not be altered or removed), which directly returns the actual NTSTATUS.
Note also that you can potentially get a STATUS_SHARING_VIOLATION error too. It is more reliable to open the file with READ_CONTROL access only (this never gives a sharing violation), query its security descriptor, and then use AccessCheck with the LUA token, but this requires more complex code.

IOCP multithreaded server and reference counted class

I am working on an IOCP server (overlapped I/O, 4 threads, CreateIoCompletionPort, GetQueuedCompletionStatus, WSASend, etc.). My goal is to send a single reference-counted buffer to all connected sockets. (I followed Len Holgate's suggestion from this post: WSAsend to all connected socket in multithreaded iocp server.) After the buffer has been sent to all connected clients, it should be deleted.
This is the class with the buffer to be sent:
class refbuf
{
private:
    int m_nLength;
    int m_wsk;
    char *m_pnData; // buffer to send
    mutable int mRefCount;
public:
    ...
    void grab() const
    {
        ++mRefCount;
    }
    void release() const
    {
        if (mRefCount > 0) // a stray ';' here used to make the decrement unconditional
            --mRefCount;
        if (mRefCount == 0) { delete (refbuf *)this; }
    }
    ...
    char* bufadr() { return m_pnData; }
};
Sending the buffer to all sockets:
refbuf *refb = new refbuf(4);
...
EnterCriticalSection(&g_CriticalSection);
pTmp1 = g_pCtxtList; // start of linked list with sockets
while( pTmp1 )
{
    pTmp2 = pTmp1->pCtxtBack;
    ovl = TakeOvl();                  // ovl - struct containing WSAOVERLAPPED
    ovl->wsabuf.buf = refb->bufadr(); // address of m_pnData from refbuf
    ovl->rcb = refb;                  // when GQCS gets the notification, rcb is used to decrease mRefCount
    ovl->wsabuf.len = 4;
    refb->grab();                     // mRefCount++
    WSASend(pTmp1->Socket, &(ovl->wsabuf), 1, &dwSendNumBytes, 0, &(ovl->Overlapped), NULL);
    pTmp1 = pTmp2;
}
LeaveCriticalSection(&g_CriticalSection);
And in one of the 4 worker threads:
GetQueuedCompletionStatus(hIOCP, &dwIoSize, (PDWORD_PTR)&lpPerSocketContext, (LPOVERLAPPED *)&lpOverlapped, INFINITE);
...
lpIOContext = (PPER_IO_CONTEXT)lpOverlapped;
lpIOContext->rcb->release(); // mRefCount--; if mRefCount reaches 0, delete the object
I checked this with 5 connected clients and it seems to work. When GQCS has received all notifications, mRefCount reaches 0 and delete is executed.
My questions: is this approach appropriate? What if there are, for example, 100 or more clients? Does it avoid the situation where one thread deletes the object while another is still using it? How do I implement an atomic reference count in this scenario? Thanks in advance.
Obvious issues; in order of importance...
Your refbuf class doesn't use thread safe ref count manipulation. Use InterlockedIncrement() etc.
I assume that TakeOvl() obtains a new OVERLAPPED and WSABUF structure per operation.
Your naming could be better, why grab() rather than AddRef(), what does TakeOvl() take from? Those Tmp variables are something and the least important something is that they're 'temporary' so name them after a more important something. Go Read Code Complete.
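To make the first point concrete, here is a minimal sketch of a thread-safe reference count. It swaps in std::atomic as a portable stand-in for InterlockedIncrement()/InterlockedDecrement(). RefBuf and its members are illustrative names, not the class from the question, and the sketch assumes the creator starts out owning one reference.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Illustrative stand-in for the question's refbuf; not the original class.
class RefBuf {
public:
    // The creator starts out owning one reference.
    static RefBuf* Create(const char* data, std::size_t length) {
        return new RefBuf(data, length);
    }

    void AddRef() { refs_.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when this call dropped the last reference and freed the buffer.
    bool Release() {
        if (refs_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete this;
            return true;
        }
        return false;
    }

    char* data() { return data_; }
    std::size_t size() const { return length_; }

private:
    RefBuf(const char* data, std::size_t length)
        : data_(new char[length]), length_(length), refs_(1) {
        std::memcpy(data_, data, length);
    }
    ~RefBuf() { delete[] data_; }   // private: only Release() may destroy the object
    RefBuf(const RefBuf&) = delete;
    RefBuf& operator=(const RefBuf&) = delete;

    char* data_;
    std::size_t length_;
    std::atomic<int> refs_;
};
```

With this shape, the broadcasting thread would call AddRef() before each WSASend, each completion handler would call Release(), and the broadcaster would call Release() once after the loop. Holding that initial reference for the duration of the loop is what stops an early completion from deleting the buffer while sends are still being issued. If WSASend fails immediately with anything other than WSA_IO_PENDING, no completion is queued, so that reference has to be released on the spot.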

RPC communication between Linux and Solaris

I have an RPC server running on Solaris. I have an RPC client that runs fine on Solaris.
When I compile and run the same code on Ubuntu, I get "Error decoding arguments" in the server.
Solaris uses SunRPC (ONC RPC). I am not sure how to find the RPC version.
Is there any difference between the RPC available on Linux and Solaris?
Would there be any mismatch between the XDR generated on Solaris and Linux?
How should I track down the issue?
Note: Code cannot be posted
@twalberg, @cppcoder Have you resolved the problem? I have the same problem, but I can post my code if it will be helpful. Part of the code is:
/* now allocate a LoopListRequestStruct and fill it with request data */
llrs = malloc(sizeof(LoopListRequestStruct));
fill_llrs(llrs);
/* Now, make the client request to the bossServer */
client_call_status = clnt_call(request_client, ModifyDhctState,
                               (xdrproc_t)xdr_LoopListRequestStruct,
                               (caddr_t)llrs,
                               (xdrproc_t)xdr_void,
                               0,
                               dummy_timeval
                               );
void fill_llrs(LoopListRequestStruct* llrs)
{
    Descriptor_Loop* dl = 0;
    DhctState_d *dhct_state_ptr = 0;
    PackageAuthorization_d *pkg_auth_ptr = 0;
    llrs->TRANS_NUM = 999999; /* strictly arbitrary, use whatever you want */
                              /* the bossServer simply passes this back in */
                              /* in the response you use it to match */
                              /* request/response if you want or you can */
                              /* choose to ignore it if you want */
    /* now set the response program number, this is the program number of */
    /* transient program that was set up using the svc_reg_utils.[ch] */
    /* it is that program that the response will be sent to */
    llrs->responseProgramNum = response_program_number;
    /* now allocate some memory for the data structures that will actually */
    /* carry the request data */
    llrs->ARG_PTR = malloc(sizeof(LoopListRequestArgs));
    dl = llrs->ARG_PTR->loopList.Loop_List_val;
    /* we are using a single descriptor loop at a time, this should always */
    /* be the case */
    llrs->ARG_PTR->loopList.Loop_List_len = 1;
    llrs->ARG_PTR->loopList.Loop_List_val = malloc(sizeof(Descriptor_Loop));
    /* now allocate memory and set the size for the ModifyDhctConfiguration */
    /* this transaction always has 3 descriptors, the DhctMacAddr_d, the */
    /* DhctState_d, and the PackageAuthorization_d */
    dl = llrs->ARG_PTR->loopList.Loop_List_val;
    dl->Descriptor_Loop_len = 2;
    dl->Descriptor_Loop_val =
        malloc((2 * sizeof(Resource_descriptor_union)));
    /* now, populate each descriptor */
    /* the order doesn't really matter I'm just doing it in the order I */
    /* always have done */
    /* first the mac address descriptor */
    dl->Descriptor_Loop_val->type =
        dhct_mac_addr_type;
    strcpy(
        dl->Descriptor_Loop_val[0].Resource_descriptor_union_u.dhctMacAddr.dhctMacAddr,
        dhct_mac_addr
    );
    /* second the dhct state descriptor */
    dl->Descriptor_Loop_val[1].type =
        dhct_state_type;
    dhct_state_ptr =
        &(dl->Descriptor_Loop_val[1].Resource_descriptor_union_u.dhctState);
    if(dis_enable)
        dhct_state_ptr->disEnableFlag = DIS_Enabled;
    else
        dhct_state_ptr->disEnableFlag = DIS_Disabled;
    if(dms_enable)
        dhct_state_ptr->dmsEnableFlag = DMS_Enabled;
    else
        dhct_state_ptr->dmsEnableFlag = DMS_Disabled;
    if(analog_enable)
        dhct_state_ptr->analogEnableFlag = AEF_Enabled;
    else
        dhct_state_ptr->analogEnableFlag = AEF_Disabled;
    if(ippv_enable)
        dhct_state_ptr->ippvEnableFlag = IEF_Enabled;
    else
        dhct_state_ptr->ippvEnableFlag = IEF_Disabled;
    dhct_state_ptr->creditLimit = credit_limit;
    dhct_state_ptr->maxIppvEvents = max_ippv_events;
    /* we don't currently use the powerkey pin, instead we use an */
    /* application layer pin for purchases and blocking so always turn */
    /* pinEnable off */
    dhct_state_ptr->pinEnable = PE_Disabled;
    dhct_state_ptr->pin = 0;
    if(fast_refresh_enable)
        dhct_state_ptr->fastRefreshFlag = FRF_Enabled;
    else
        dhct_state_ptr->fastRefreshFlag = FRF_Disabled;
    dhct_state_ptr->locationX = location_x;
    dhct_state_ptr->locationY = location_y;
}
I hit exactly this error during integration with the same software. The Linux version really does create a bad request. The reason for this behaviour is the serialization of a null C string. The glibc edition of Sun RPC can't encode one: xdr_string returns zero. But the sample you are dealing with sets 'pin' to 0. Just replace 'pin' with "", or create a wrapper over xdr_string(), and the samples will work.
My patch to the PowerKey samples looks like this:
< if (!xdr_string(xdrs, objp, PIN_SZ))
< return (FALSE);
< return (TRUE);
---
> char *t = "";
> return xdr_string(xdrs, *objp? objp : &t , PIN_SZ);
but it can be made simpler, of course. In general, you should fix the usage of the generated code; in my case it was the 'pin' variable in the sample sources provided by the software authors, which must be initialized before the xdr_string() call.
Note that XDR will handle endianness, but if you use app-specific opaque fields, decoding will break if you don't handle endianness yourself. Make sure integers are sent as XDR integers.
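To illustrate why integers stuffed into opaque fields are fragile: on the wire, an XDR 32-bit integer is always 4 bytes with the most significant byte first, whatever the host byte order. The sketch below is a hand-rolled illustration of that layout only. encode_xdr_u32 and decode_xdr_u32 are made-up names, and real code should keep using the rpcgen-generated xdr_* routines.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Encodes a 32-bit unsigned integer the way XDR lays it out on the wire:
// 4 bytes, most significant byte first, independent of the host CPU.
std::array<std::uint8_t, 4> encode_xdr_u32(std::uint32_t value) {
    return {static_cast<std::uint8_t>(value >> 24),
            static_cast<std::uint8_t>(value >> 16),
            static_cast<std::uint8_t>(value >> 8),
            static_cast<std::uint8_t>(value)};
}

// Rebuilds the host-order value from the 4 wire bytes.
std::uint32_t decode_xdr_u32(const std::array<std::uint8_t, 4>& bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
           static_cast<std::uint32_t>(bytes[3]);
}
```

An integer copied raw into an opaque field skips this conversion, so a little-endian Linux client and a big-endian Solaris SPARC server would read different values from the same bytes.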

WebSocket Data Parsing with Node.js

Websocket on Client:
socket.send('helloworld');
Websocket on Node.js:
socket.ondata = function(d, start, end){
    // I suppose that start and end indicate the location of the
    // actual data 'hello world' after the headers
    var data = d.toString('utf8', start, end);
    // Then I'm just printing it
    console.log(data);
};
but I'm getting this: �����]���1���/�� on the terminal :O
I have tried to understand this doc: https://www.rfc-editor.org/rfc/rfc6455#section-5.2 but it's hard to understand because I don't know what I should work with. I mean, I can't see the data even with toString?
I have tried to follow and test this question's answer, How can I send and receive WebSocket messages on the server side?, but I can't get it to work. With this answer I was getting an array like this [false, true, false, false, true, true, false] etc... and I don't really know what to do with it.. :\
So I'm a bit confused: what the hell should I do after I get the data from the client side to get the real message?
I'm using the original client side and node.js API without any library.
Which node.js library are you using? Judging by the fact that you are hooking socket.ondata that looks like the HTTP server API. WebSockets is not HTTP. It has an HTTP compatible handshake so that the WebSocket and HTTP service can live on the same port, but that's where the similarity ends. After the handshake, WebSockets is a framed full-duplex, long-lived message transport more similar to regular TCP sockets than to HTTP.
If you want to implement your own WebSocket server in node.js you are going to want to use the socket library directly (or build on/borrow existing WebSocket server code).
Here is a Node.js based WebSocket server that bridges from WebSocket to TCP sockets: https://github.com/kanaka/websockify/blob/master/other/websockify.js Note that it is for the previous Hixie version of the protocol (I haven't had opportunity or motivation to update it yet). The modern HyBI version of the protocol is very different but you might be able to glean some useful information from that implementation.
You can in fact start with Node's HTTP API. That is exactly what I did when writing the WebSocket-Node module https://github.com/Worlize/WebSocket-Node
If you don't want to use an existing WebSocket Library (though you really should just use an existing library) then you need to be able to parse the binary data format defined by the RFC. It's very clear about the format and exactly how to interpret the data. From each frame you have to read in all the flags, interpret the frame size, possibly read the masking key, and unmask the contents as you read them from the wire.
That is one reason you're not seeing anything recognizable... in WebSockets, all client-to-server communication is obfuscated by applying a random mask to the contents using XOR, as a security precaution against possibly poisoning the cache of older proxy servers that don't know about WebSockets.
There are two things -
Which node.js version are you using? I have never seen a data event with start and end points. The emitted event is just data, with a buffer/string as an argument.
More importantly, if you are hooking on to the HTTP socket, you should take care of the HTTP packet. It contains headers, a body and a trailer. There might be garbage in there.
Here is a solution from this post:
https://medium.com/hackernoon/implementing-a-websocket-server-with-node-js-d9b78ec5ffa8
function parseMessage(buffer) {
    const firstByte = buffer.readUInt8(0);
    //const isFinalFrame = Boolean((firstByte >>> 7) & 0x1);
    //const [reserved1, reserved2, reserved3] = [ Boolean((firstByte >>> 6) & 0x1),
    //    Boolean((firstByte >>> 5) & 0x1), Boolean((firstByte >>> 4) & 0x1) ];
    const opCode = firstByte & 0xF;
    // We can return null to signify that this is a connection termination frame
    if (opCode === 0x8)
        return null;
    // We only care about text frames from this point onward
    if (opCode !== 0x1)
        return;
    const secondByte = buffer.readUInt8(1);
    const isMasked = Boolean((secondByte >>> 7) & 0x1);
    // Keep track of our current position as we advance through the buffer
    let currentOffset = 2;
    let payloadLength = secondByte & 0x7F;
    if (payloadLength > 125) {
        if (payloadLength === 126) {
            payloadLength = buffer.readUInt16BE(currentOffset);
            currentOffset += 2;
        } else {
            // 127
            // If this has a value, the frame size is ridiculously huge!
            //const leftPart = buffer.readUInt32BE(currentOffset);
            //const rightPart = buffer.readUInt32BE(currentOffset += 4);
            // Honestly, if the frame length requires 64 bits, you're probably doing it wrong.
            // In Node.js you'll require the BigInt type, or a special library to handle this.
            throw new Error('Large payloads not currently implemented');
        }
    }
    const data = Buffer.alloc(payloadLength);
    // Only unmask the data if the masking bit was set to 1
    if (isMasked) {
        let maskingKey = buffer.readUInt32BE(currentOffset);
        currentOffset += 4;
        // Loop through the source buffer one byte at a time, keeping track of which
        // byte in the masking key to use in the next XOR calculation
        for (let i = 0, j = 0; i < payloadLength; ++i, j = i % 4) {
            // Extract the correct byte mask from the masking key
            const shift = j == 3 ? 0 : (3 - j) << 3;
            const mask = (shift == 0 ? maskingKey : (maskingKey >>> shift)) & 0xFF;
            // Read a byte from the source buffer
            const source = buffer.readUInt8(currentOffset++);
            // XOR the source byte and write the result to the data
            data.writeUInt8(mask ^ source, i);
        }
    } else {
        // Not masked - we can just read the data as-is
        buffer.copy(data, 0, currentOffset++);
    }
    return data;
}
