In an embedded Python scenario we are using PyArg_ParseTupleAndKeywords to receive data from Python (version >= 3.x) and use it in a C++ application.
At the moment we have a setup similar to this:
PyObject* whatever(PyObject *self, PyObject *args, PyObject *keywds) {
....
static char* kwlist[] = { "foo", "bar", NULL };
...
if(!PyArg_ParseTupleAndKeywords(args, keywds, ..., kwlist, ...))
{
...bail out
However, if we pass more parameters than the two expected (e.g. issuing a Python call like whatever(foo="a", bar="b", baz="c")), the whole call fails (it does not really crash, it returns an error, but that's beyond the scope here).
We would like to avoid such scenarios; it would be great if we could parse only the parameters in the kwlist and ignore everything else. What's the best way to do it?
One solution we were thinking about was to convert kwlist into a dict, then manipulate it with PyDict_Merge or the like.
In the end we solved it like below:
(I'm answering my own question since nobody answered, and I think it might be valuable to somebody else in the future).
PyObject* whatever(PyObject *self, PyObject *args, PyObject *incoming_keywds)
{
    static char* kwlist[] = { "foo", "bar", NULL };
    PyObject* keywds = PyDict_New();
    if ( keywds == NULL )
    {
        return NULL;
    }
    /**
     * The following routine builds a subset of the incoming dictionary 'incoming_keywds'
     * containing only the keys allowed in the list 'kwlist'.
     * 'incoming_keywds' is NULL when the caller passes no keyword arguments.
     */
    for ( int i = 0 ; incoming_keywds != NULL && kwlist[i] != NULL ; i++ )
    {
        char* key = kwlist[i];
        PyObject *single_key = Py_BuildValue("s", key);
        if ( single_key == NULL )
        {
            /* error */
            break;
        }
        // PyDict_Contains returns 1 if the key is present, 0 if not and -1 on error
        if ( PyDict_Contains(incoming_keywds, single_key) == 1 )
        {
            // not checking for NULL as GetItem return value since
            // we already checked above if the dict contains key 'single_key'
            if ( PyDict_SetItem(keywds, single_key, PyDict_GetItem(incoming_keywds, single_key)) < 0 )
            {
                /* error */
            }
        }
        Py_DECREF(single_key);
    }
/** end */
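The filtering step above can also be written in plain Python, which may help when checking the intended behaviour before porting it to C. This is only a sketch: `filter_keywords` and `ALLOWED` are hypothetical names chosen for illustration, not part of the Python C API.

```python
# Hypothetical helper mirroring the C loop above: keep only whitelisted keys.
ALLOWED = ("foo", "bar")  # plays the role of kwlist


def filter_keywords(incoming, allowed=ALLOWED):
    """Return a new dict holding only the entries of `incoming` named in `allowed`."""
    if incoming is None:  # in C, the keyword dict is NULL when no keywords are passed
        return {}
    return {key: incoming[key] for key in allowed if key in incoming}


def whatever(**incoming_keywds):
    # Unknown keywords such as 'baz' are silently dropped before "parsing".
    keywds = filter_keywords(incoming_keywds)
    return keywds.get("foo"), keywds.get("bar")
```

Calling `whatever(foo="a", bar="b", baz="c")` in this sketch ignores `baz` instead of raising an error, which is the behaviour the C code is aiming for.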
For example, consider the following C# code:
interface IBase { void f(int i); }
interface IDerived : IBase { /* inherits f from IBase */ }
...
void SomeFunction()
{
IDerived o = ...;
o.f(5);
}
I know how to get a MethodDefinition object corresponding to SomeFunction.
I can then loop through MethodDefinition.Body.Instructions:
var methodDef = GetMethodDefinitionOfSomeFunction();
foreach (var instruction in methodDef.Body.Instructions)
{
switch (instruction.Operand)
{
case MethodReference mr:
...
break;
}
yield return memberRef;
}
And this way I can find out that the method SomeFunction calls the function IBase.f.
Now I would like to know the declared type of the object on which the function f is called, i.e. the declared type of o.
Inspecting mr.DeclaringType does not help, because it returns IBase.
This is what I have so far:
TypeReference typeRef = null;
if (instruction.OpCode == OpCodes.Callvirt)
{
// Identify the type of the object on which the call is being made.
var objInstruction = instruction;
if (instruction.Previous.OpCode == OpCodes.Tail)
{
objInstruction = instruction.Previous;
}
for (int i = mr.Parameters.Count; i >= 0; --i)
{
objInstruction = objInstruction.Previous;
}
if (objInstruction.OpCode == OpCodes.Ldloc_0 ||
objInstruction.OpCode == OpCodes.Ldloc_1 ||
objInstruction.OpCode == OpCodes.Ldloc_2 ||
objInstruction.OpCode == OpCodes.Ldloc_3)
{
var localIndex = objInstruction.OpCode.Op2 - OpCodes.Ldloc_0.Op2;
typeRef = locals[localIndex].VariableType;
}
else
{
switch (objInstruction.Operand)
{
case FieldDefinition fd:
// The object on the stack is the field's value, so its declared type is the field type.
typeRef = fd.FieldType;
break;
case VariableDefinition vd:
typeRef = vd.VariableType;
break;
}
}
}
where locals is methodDef.Body.Variables.
But this is, of course, not enough, because the arguments to a function can themselves be calls to other functions, as in f(g("hello")). It looks like the code above, where I inspect previous instructions, would have to repeat what the virtual machine does when it actually executes the code. I would not execute anything, of course, but I would need to recognize function calls and replace them and their arguments with their respective return values (even if only as placeholders). It looks like a major pain.
Is there a simpler way? Maybe there is something built-in already?
I am not aware of an easy way to achieve this.
The "easiest" way I can think of is to walk the stack and find where the reference used as the target of the call is pushed.
Basically, starting from the call instruction go back one instruction at a time taking into account how each one affects the stack; this way you can find the exact instruction that pushes the reference used as the target of the call (a long time ago I wrote something like that; you can use the code at https://github.com/lytico/db4o/blob/master/db4o.net/Db4oTool/Db4oTool/Core/StackAnalyzer.cs as inspiration).
You'll also need to consider scenarios in which the pushed reference is produced through a method/property; for example, SomeFunction().f(5). In this case you may need to evaluate that method to find out the actual type returned.
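The backward walk described above can be sketched on a toy instruction model. This is not Mono.Cecil code: `Instr`, `find_target_push` and the pop/push counts are hypothetical stand-ins, and the sketch assumes straight-line code with no branches between the push and the call.

```python
from typing import NamedTuple, Sequence


class Instr(NamedTuple):
    """Toy instruction: a name plus how many stack slots it pops and pushes."""
    name: str
    pops: int
    pushes: int


def find_target_push(code: Sequence[Instr], call_index: int, arg_count: int) -> int:
    """Return the index of the instruction that pushed the call's target object.

    Assumes straight-line code (no branches) between that push and the call.
    """
    # Number of stack slots sitting above the target right before the call.
    depth = arg_count
    for index in range(call_index - 1, -1, -1):
        instr = code[index]
        if depth < instr.pushes:
            return index  # this instruction produced the slot we are tracking
        # Undo the instruction's effect: its pushes disappear, its pops come back.
        depth = depth - instr.pushes + instr.pops
    raise ValueError("target push not found before the start of the method")


# o.f(g("hello")) in a simplified IL-like form.
code = [
    Instr("ldloc.0", 0, 1),     # push o
    Instr("ldstr", 0, 1),       # push "hello"
    Instr("call g", 1, 1),      # pop the string, push g's result
    Instr("callvirt f", 2, 0),  # pop o and the argument
]
```

Here `find_target_push(code, 3, 1)` skips over the nested call to g and lands on the `ldloc.0` instruction; a real implementation would take the pop/push counts from each opcode and would also have to deal with branches.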
Keep in mind that you'll need to handle a lot of different cases; for example, imagine the code below:
class Utils
{
public static T Instantiate<T>() where T : new() => new T();
}
class SomeType
{
public void F(int i) {}
}
class Usage
{
static void Main()
{
var o = Utils.Instantiate<SomeType>();
o.F(1);
}
}
While walking the stack you'll find that o is the target of the method call; you'll then evaluate the Instantiate<T>() method, find that it returns new T(), and, knowing that T is SomeType in this case, conclude that SomeType is the type you're looking for.
Vagaus's answer helped me come up with a working implementation.
I published it on GitHub: https://github.com/MarkKharitonov/MonoCecilExtensions
It includes many unit tests, but I am sure I missed some cases.
Today, while doing some Rust wasm vs JS speed benchmarking with wasm-bindgen, I ran into a problem.
I had made a simple struct as you can see here:
I used this struct in a simple function called gimmeDirections, as shown here:
After compiling this into browser JavaScript, I looked into the generated .d.ts file and noticed that the gimmeDirections function returned a number, even though the JSDoc in the generated JS states that it returns the class XY, which was defined earlier in the compiled code.
Here is the class:
export class XY {
static __wrap(ptr) {
const obj = Object.create(XY.prototype);
obj.ptr = ptr;
return obj;
}
free() {
const ptr = this.ptr;
this.ptr = 0;
wasm.__wbg_xy_free(ptr);
}
/**
* @returns {number}
*/
get x() {
var ret = wasm.__wbg_get_xy_x(this.ptr);
return ret;
}
/**
* @param {number} arg0
*/
set x(arg0) {
wasm.__wbg_set_xy_x(this.ptr, arg0);
}
/**
* @returns {number}
*/
get y() {
var ret = wasm.__wbg_get_xy_y(this.ptr);
return ret;
}
/**
* @param {number} arg0
*/
set y(arg0) {
wasm.__wbg_set_xy_y(this.ptr, arg0);
}
}
After being very confused, because the TypeScript declaration said it would return a number but the JS said it would return a class, I decided to run it... and got a number back.
By contrast, my JavaScript function, which runs identical code for the benchmark, gave me an object, not a number.
Here is my JS code:
import * as funcs from './wasm/wildz.js';
// compiled wasm js file
function directionsJS(x, y) {
let xX = x;
let yY = y;
if (Math.abs(xX) === Math.abs(yY)) {
xX /= Math.SQRT2;
yY /= Math.SQRT2;
}
return {
x: xX,
y: yY
};
}
(async() => {
const game = await funcs.default();
console.time('Rust Result');
console.log(game.gimmeDirections(10, 10));
console.timeEnd('Rust Result');
console.time('JS Result');
console.log(directionsJS(10, 10));
console.timeEnd('JS Result');
})();
I'm still very confused about why it's returning a number when I'm clearly returning an object. Help is much needed and appreciated.
Much of this and more is explained in Exporting a struct to JS in the wasm-bindgen guide, but I'll summarize.
Rust structs are "returned" by allocating space for them dynamically and returning a pointer to it. What you're seeing, in regards to the function returning number, is the "raw" ffi function that binds the JS runtime and wasm module. It just returns that pointer value.
The generated XY JavaScript class is a wrapper around that pointer value and provides functions for interacting with it. The generated gimmeDirections function is a wrapper around that wasm module call that creates an instance of that class.
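The relationship between the raw export and the generated wrapper can be illustrated with a toy model. Everything here is hypothetical: a dict stands in for wasm linear memory, and none of the names are wasm-bindgen APIs.

```python
# Toy model only: a dict stands in for wasm linear memory.
_heap = {}
_next_ptr = 8


def raw_gimme_directions(x, y):
    """Plays the role of the raw wasm export: stores the struct, returns a pointer."""
    global _next_ptr
    ptr = _next_ptr
    _next_ptr += 16
    _heap[ptr] = {"x": x, "y": y}
    return ptr  # a plain number, like the value observed in the question


class XY:
    """Plays the role of the generated wrapper class around the pointer."""

    def __init__(self, ptr):
        self.ptr = ptr

    @property
    def x(self):
        return _heap[self.ptr]["x"]

    @property
    def y(self):
        return _heap[self.ptr]["y"]

    def free(self):
        # Release the backing storage and invalidate the pointer.
        _heap.pop(self.ptr, None)
        self.ptr = 0


def gimme_directions(x, y):
    """Plays the role of the generated wrapper function: wraps the raw pointer."""
    return XY(raw_gimme_directions(x, y))
```

Calling the raw function yields only a number, while going through the wrapper function yields an object whose getters read the fields behind that pointer.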
I'd like to match on all the ways a particular argument to a function can be null. Right now I'm using
hasArgument(
3,
anyOf(
cxxNullPtrLiteralExpr()
,integerLiteral() // Technically this would alert on a constant pointer; but that's madness
)
)
However this doesn't match the following code:
void* nullObj = nullptr;
function(nullptr, false, false, nullObj);
Is it possible/easy to track this and match it? Right now I have a very simple matcher, but I guess this type of analysis requires considerably more logic?
High-level answer
You can't just "match" an expression whose value is NULL. AST matching can only inspect the syntax of the argument, so if the argument is not a literal, you don't know if it might be NULL.
Instead, you need to use a flow-sensitive checker that queries the Clang SA constraint engine. The constraint engine tracks values as they flow through the program.
The core of such a checker looks like this:
bool NullArgChecker::evalCall(const CallExpr *CE, CheckerContext &C) const {
ProgramStateRef state = C.getState();
auto SVal = C.getSVal(CE->getArg(0)).getAs<DefinedOrUnknownSVal>();
if (SVal) {
ConditionTruthVal Nullness = state->isNull(*SVal);
if (Nullness.isConstrainedTrue()) {
Given a call expression CE, we get its first argument, then query the CheckerContext for the symbolic value SVal associated with the first argument. We then ask if that value is known to be NULL.
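To see why value tracking catches the case from the question while a purely syntactic match does not, here is a toy model. It is not Clang's API: the statement tuples, `syntactic_match` and `known_null_args` are hypothetical, and only a single straight-line path is considered.

```python
# Toy model: statements are (kind, ...) tuples; this is not Clang's API.
NULL_LITERALS = {"nullptr", "NULL", "0"}


def syntactic_match(arg):
    """What an AST matcher can see: is the argument itself a null literal?"""
    return arg in NULL_LITERALS


def known_null_args(statements, arg_index):
    """Track which variables are known to be null along one straight-line path."""
    known_null = set()
    hits = []
    for stmt in statements:
        if stmt[0] == "assign":
            _, name, value = stmt
            if value in NULL_LITERALS or value in known_null:
                known_null.add(name)
            else:
                known_null.discard(name)  # value is no longer known to be null
        elif stmt[0] == "call":
            _, callee, args = stmt
            arg = args[arg_index]
            if arg in NULL_LITERALS or arg in known_null:
                hits.append(callee)
    return hits


# The snippet from the question, in toy form.
program = [
    ("assign", "nullObj", "nullptr"),
    ("call", "function", ["nullptr", "false", "false", "nullObj"]),
]
```

In this model `syntactic_match("nullObj")` is false because the argument is just a name, whereas `known_null_args(program, 3)` reports the call because the assignment was tracked. The real analyzer does this per path and with a full constraint engine.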
Complete example
Here is a complete example checker that reports a warning every time it sees a value known to be NULL being passed as the first argument of any function.
NullArgChecker.cpp:
// NullArgChecker.cpp
// https://stackoverflow.com/questions/57665383/how-can-i-match-a-pointer-to-a-null-object
#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
using namespace clang;
using namespace ento;
namespace {
class NullArgChecker : public Checker< eval::Call > {
mutable std::unique_ptr<BuiltinBug> BT_nullarg;
public:
NullArgChecker() {}
bool evalCall(const CallExpr *CE, CheckerContext &C) const;
};
} // end anonymous namespace
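bool NullArgChecker::evalCall(const CallExpr *CE, CheckerContext &C) const {
// A call without arguments has no first argument to inspect.
if (CE->getNumArgs() == 0) {
return false;
}
ProgramStateRef state = C.getState();
auto SVal = C.getSVal(CE->getArg(0)).getAs<DefinedOrUnknownSVal>();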
if (SVal) {
// This is the core of this example checker: we query the constraint
// engine to see if the symbolic value associated with the first
// argument is known to be NULL along the current path.
ConditionTruthVal Nullness = state->isNull(*SVal);
if (Nullness.isConstrainedTrue()) {
// Create a warning for this condition.
ExplodedNode *N = C.generateErrorNode();
if (N) {
if (!BT_nullarg) {
BT_nullarg.reset(new BuiltinBug(
this, "Null Argument", "The first argument is NULL."));
}
C.emitReport(llvm::make_unique<BugReport>(
*BT_nullarg, BT_nullarg->getDescription(), N));
}
}
}
return false;
}
void ento::registerNullArgChecker(CheckerManager &mgr) {
mgr.registerChecker<NullArgChecker>();
}
bool ento::shouldRegisterNullArgChecker(const LangOptions &LO) {
return true;
}
Changes to other files to hook this in:
--- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -148,6 +148,10 @@ def NonnullGlobalConstantsChecker: Checker<"NonnilStringConstants">,
let ParentPackage = CoreAlpha in {
+def NullArgChecker : Checker<"NullArg">,
+ HelpText<"Check for passing a NULL argument">,
+ Documentation<NotDocumented>;
+
def BoolAssignmentChecker : Checker<"BoolAssignment">,
HelpText<"Warn about assigning non-{0,1} values to Boolean variables">,
Documentation<HasAlphaDocumentation>;
--- a/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
+++ b/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
@@ -62,6 +62,7 @@ add_clang_library(clangStaticAnalyzerCheckers
NonNullParamChecker.cpp
NonnullGlobalConstantsChecker.cpp
NullabilityChecker.cpp
+ NullArgChecker.cpp
NumberObjectConversionChecker.cpp
ObjCAtSyncChecker.cpp
ObjCAutoreleaseWriteChecker.cpp
Example input to test it on:
// nullargpp.cpp
// Testing NullArg checker with C++.
#include <stddef.h> // NULL
void somefunc(int*);
void nullarg1()
{
somefunc(NULL); // reported
somefunc(0); // reported
somefunc(nullptr); // reported
}
void nullarg2()
{
int *p = 0;
somefunc(p); // reported
}
void nullarg3(int *p)
{
if (p) {
somefunc(p); // not reported
}
else {
somefunc(p); // reported
}
}
void not_nullarg(int *p)
{
somefunc(p); // not reported
}
Example run:
$ g++ -std=c++11 -E -o nullargpp.ii nullargpp.cpp
$ ~/bld/llvm-project/build/bin/clang -cc1 -analyze -analyzer-checker=alpha.core.NullArg nullargpp.ii
nullargpp.cpp:10:3: warning: The first argument is NULL
somefunc(
^~~~~~~~~
nullargpp.cpp:11:3: warning: The first argument is NULL
somefunc(0);
^~~~~~~~~~~
nullargpp.cpp:12:3: warning: The first argument is NULL
somefunc(nullptr);
^~~~~~~~~~~~~~~~~
nullargpp.cpp:18:3: warning: The first argument is NULL
somefunc(p);
^~~~~~~~~~~
nullargpp.cpp:27:5: warning: The first argument is NULL
somefunc(p);
^~~~~~~~~~~
5 warnings generated.
For maximum specificity, the above changes were made to llvm-project commit 05efe0fdc4 (March 2019), running on Linux, but should work with any Clang v9.
I'm trying to raise an exception that is defined inside a class in an extension module created with the Python C API.
Here is what I want to achieve with the Python C API, but written in Python:
class Graph:
    class TooManyVerticesError( Exception ): pass
    def addVertex( self ):
        if self.__order == 16:
            raise Graph.TooManyVerticesError( "too many vertices" )
        for v in range( self.__order ):
            self.__matrix[v] += [False]
        self.__order += 1
        self.__matrix += [ [False] * self.__order ]
Here is what I have written so far with the Python C API:
#define ERROR return NULL;
PyObject *TooManyVerticesError;
typedef struct {
PyObject_HEAD
size_t __order; /* the maximum number of elements in q_elements */
std::vector<std::vector<int>> AdjacencyList;
} ListaSasiedztwa;
static
PyObject* addVertex(ListaSasiedztwa* self, PyObject* args)
{
int u, v;
// Process arguments
PyArg_ParseTuple(args, "ii",
&u,
&v);
if (self->__order == 16)
{
PyErr_SetString(TooManyVerticesError, "too many vertices");
ERROR
}
std::vector<int> vertexList;
self->AdjacencyList.push_back(vertexList);
return NULL;
}
PyMODINIT_FUNC PyInit_simple_graphs(void)
{
if (PyType_Ready(&ListaSasiedztwaType) < 0) return NULL;
PyObject* m = PyModule_Create(&cSimpleGraphsModule);
if (m == NULL) return NULL;
TooManyVerticesError = PyErr_NewException("simple_graphs.TooManyVerticesError", NULL, NULL);
PyDict_SetItemString(ListaSasiedztwaType.tp_dict, "TooManyVerticesError", TooManyVerticesError);
Py_INCREF(&ListaSasiedztwaType);
PyModule_AddObject(m, "ListaSasiedztwa",
(PyObject *)&ListaSasiedztwaType);
return m;
}
Everything builds correctly and I can create an object of type ListaSasiedztwa. But when the exception should be raised, I get an exception of type SystemError with the message:
class 'simple_graphs.ListaSasiedztwa' returned a result with an error set.
Should I add this exception as a structure member somehow?
I don't know whether ListaSasiedztwaType is relevant to this problem, so I haven't included it yet.
I think you're actually hitting the return NULL; at the end of the function, which you reach in the "non-error" case. This NULL indicates to Python that there should be an exception set, but since there isn't one you get the general SystemError from Python.
I suspect you're trying to return the Python None object - use the macro Py_RETURN_NONE instead.
It turned out that addVertex was called from another method that did not pass on its return value, which had to be NULL.
How can I type-check String object/Number object arguments in a Duktape C function and extract the value from the String object/Number object? There are generic APIs like duk_is_object(), but I need the exact object type to extract the value.
Example:
ECMAScript code:
var str1 = new String("duktape");
var version = new Number(2.2);
dukFunPrintArgs(str1, version);
Duktape C function:
dukFunPrintArgs(ctx)
{
// code to know whether the args is of type String Object / Number Object
}
Where did you find the information how to register a C function in duktape? That place certainly also has details on how to access the parameters passed to it. Already on the homepage of duktape.org you can find a Getting Started example:
3 Add C function bindings
To call a C function from Ecmascript code, first declare your C functions:
/* Being an embeddable engine, Duktape doesn't provide I/O
* bindings by default. Here's a simple one argument print()
* function.
*/
static duk_ret_t native_print(duk_context *ctx) {
printf("%s\n", duk_to_string(ctx, 0));
return 0; /* no return value (= undefined) */
}
/* Adder: add argument values. */
static duk_ret_t native_adder(duk_context *ctx) {
int i;
int n = duk_get_top(ctx); /* #args */
double res = 0.0;
for (i = 0; i < n; i++) {
res += duk_to_number(ctx, i);
}
duk_push_number(ctx, res);
return 1; /* one return value */
}
Register your functions e.g. into the global object:
duk_push_c_function(ctx, native_print, 1 /*nargs*/);
duk_put_global_string(ctx, "print");
duk_push_c_function(ctx, native_adder, DUK_VARARGS);
duk_put_global_string(ctx, "adder");
You can then call your function from Ecmascript code:
duk_eval_string_noresult(ctx, "print('2+3=' + adder(2, 3));");
One of the core concepts in Duktape is the stack. The value stack is where parameters are stored. Read more on the Getting Started page.