I'm using MSVC to compile some C code which uses standard-library functions, such as getenv(), sprintf and others, with /W3 set for warnings. I'm told by MSVC that:
'getenv': This function or variable may be unsafe. Consider using _dupenv_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS
Questions:
Why would this be unsafe, theoretically - as opposed to its use on other platforms?
Is it unsafe on Windows in practice?
Assuming I'm not writing security-oriented code - should I disable this warning or actually start aliasing a bunch of standard library functions?
getenv() is potentially unsafe in that subsequent calls to that same function may invalidate earlier returned pointers. As a result, usage such as
char *a = getenv("A");
char *b = getenv("B");
/* do stuff with both a and b */
may break, because there's no guarantee a is still usable at that point.
getenv_s() - available since C11 as part of the optional Annex K - avoids this by immediately copying the value into a caller-supplied buffer, where the caller has full control over the buffer's lifetime. _dupenv_s() avoids this by making the caller responsible for managing the lifetime of the allocated buffer.
However, the signature for getenv_s is somewhat controversial, and the function may even be removed from the C standard at some point... see this report.
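To make the "copy it immediately" idea concrete, here is a minimal sketch of a wrapper that copies the value into a caller-supplied buffer with an explicit bound. The name copy_env and its return codes are made up for illustration and are not part of any library. On MSVC it forwards to Microsoft's getenv_s; elsewhere it falls back to getenv plus a bounded copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a library function): copies the value of the
 * environment variable `name` into `out`, which holds `out_size` bytes.
 * Returns 0 on success, -1 if the variable is not set, and -2 if the
 * value plus its terminating NUL does not fit (or the lookup failed). */
int copy_env(const char *name, char *out, size_t out_size)
{
    assert(name != NULL && out != NULL && out_size > 0);
#if defined(_MSC_VER)
    size_t required = 0;
    /* Microsoft's getenv_s stores the needed size (including the NUL)
     * in `required`; a value of 0 means the variable was not found. */
    int err = getenv_s(&required, out, out_size, name);
    if (err != 0)
        return -2;
    return required == 0 ? -1 : 0;
#else
    const char *value = getenv(name);
    if (value == NULL)
        return -1;
    size_t len = strlen(value);
    if (len >= out_size)
        return -2;
    /* Copy right away, so a later getenv() call cannot invalidate it. */
    memcpy(out, value, len + 1);
    return 0;
#endif
}
```

With something like this, the a/b example above would call copy_env("A", a_buf, sizeof a_buf) and copy_env("B", b_buf, sizeof b_buf) on two separate arrays, and both copies stay valid for as long as the arrays do.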
Like much of the classic C Standard Library, getenv does not bound the length of the string buffer, and this is where security bugs such as buffer overruns often originate.
If you look at getenv_s you'll see it provides an explicit bound on the length of the returned string. It's recommended for all coding by the Security Development Lifecycle best practice, which is why Visual C++ emits deprecation warnings for the less secure versions.
See MSDN and this blog post
There was an effort by Microsoft to get the C/C++ ISO Standard Library to include the Secure CRT here, some of which was approved for C11 Annex K as noted here. That also means that getenv_s should be part of the C++17 Standard Library by reference. That said, Annex K is officially considered optional for conformance. The _s bounds-checking versions of these functions are also still a subject of some debate in the C/C++ community.
Rust's memmap crate has unsafe methods.
I can understand how the returned address space is unsafe to pass to constructors which might validate its contents and then continue to use it.
I'm writing a binary-diff tool which only treats the returned address space as containing bytes (of any value) and does no validation on the contents of the address space.
Can I avoid propagating the unsafe in this case?
It's a shame that memmap does not document its safety contract. Indeed there is an open issue from 2017 on this very point.
Following this discussion in the RustSec/advisory-db repo about the crate's apparent abandonment, I found this safety documentation in the forked mapr crate, which states:
All file-backed memory map constructors are marked unsafe because of the potential for Undefined Behavior (UB) using the map if the underlying file is subsequently modified, in or out of process. Applications must consider the risk and take appropriate precautions when using file-backed maps. Solutions such as file permissions, locks or process-private (e.g. unlinked) files exist but are platform specific and limited.
This safety concern should obviously be relevant in memmap too.
One of the main principles of using unsafe correctly is that you take responsibility for ensuring soundness in all affected code - including the safe code that surrounds it.
If you were to expose this function in a library crate, you should mark it as unsafe; users of your library need to understand where UB arises and make their own decisions about safety.
In an application crate, you are in control of all of the code and it's up to you to never dereference the bytes as anything else. If you do that, it is completely acceptable to limit the unsafe code to where you directly interact with the memory map, wrapped in safe functions. In fact, wrapping everything in unsafe is counterproductive because it will be unclear where the danger lies.
Usually one should be wary of transmuting (or casting) pointers to a higher alignment. Yet the interface to the above functions requires *const __m128i and *mut __m128i pointers, respectively. Both are SIMD-aligned, which means I'd need to keep my arrays SIMD-aligned, too. On the other hand, the intrinsics are explicitly designed to load/store unaligned data.
Is this safe? Shouldn't we change the interface? Or at least document this fact?
I think this is a cross-language duplicate of Is `reinterpret_cast`ing between hardware vector pointer and the corresponding type an undefined behavior?.
As I explained over there, Intel defined the C/C++ intrinsics API such that loadu / storeu can safely dereference an under-aligned pointer, and that it's safe to create such pointers, even though it's UB in ISO C++ even to create under-aligned pointers. (Thus implementations that provide the intrinsics API must define the behaviour).
The Rust version should work identically. Implementations that provide it must make it safe to create under-aligned __m128i* pointers, as long as you don't dereference them "manually".
The other API-design option would be to have another version of the type that doesn't imply 16-byte alignment, like a __m128i_u or something. GNU C does this with their native vector syntax, but that's way off topic for Rust.
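As a rough C-side illustration of the point about loadu / storeu, the sketch below casts byte pointers with no particular alignment to __m128i pointers and hands them only to the unaligned intrinsics, never dereferencing them directly. The helper name copy16_unaligned and the HAVE_SSE2_SKETCH macro are invented for this example, and the memcpy branch is only there so the file still builds on targets without SSE2.

```c
#include <assert.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2_SKETCH 1
#endif

/* Hypothetical helper: copies 16 bytes; neither pointer needs to be
 * 16-byte aligned. */
void copy16_unaligned(unsigned char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
#ifdef HAVE_SSE2_SKETCH
    /* The casts create __m128i pointers that may be under-aligned. They
     * are passed straight to loadu/storeu and never dereferenced here. */
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)dst, v);
#else
    /* Fallback for targets without SSE2 (an assumption of this sketch). */
    memcpy(dst, src, 16);
#endif
}
```

Calling it as copy16_unaligned(dst + 3, src + 1) on plain unsigned char arrays exercises exactly the under-aligned case discussed above.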
How can I retrieve the type of architecture (linux versus Windows) in my fortran code? Is there some sort of intrinsic function or subroutine that gives this information? Then I would like to use a switch like this every time I have a system call:
if (trim(adjustl(Arch))=='Linux') then
  resul = system('ls > output.txt')
elseif (trim(adjustl(Arch))=='Windows') then
  resul = system('dir > output.txt')
else
  write(*,*) 'architecture not supported'
  stop
endif
thanks
A.
The Fortran 2003 standard introduced the GET_ENVIRONMENT_VARIABLE intrinsic subroutine. A simple form of call would be
call GET_ENVIRONMENT_VARIABLE (NAME, VALUE)
which will return the value of the variable called NAME in VALUE. The routine has other optional arguments; your favourite reference documentation will explain them all. This rather assumes that you can find an environment variable to tell you what the executing platform is.
If your compiler doesn't yet implement this standard approach it is extremely likely to have a non-standard approach; a routine called getenv used to be available on more than one of the Fortran compilers I've used in the recent past.
The 2008 standard introduced a standard function COMPILER_OPTIONS which will return a string containing the compilation options used for the program, provided the compiler supports this sort of thing. It seems to be less widely implemented so far than GET_ENVIRONMENT_VARIABLE; as ever, consult your compiler documentation for details and availability. If it is available it may also be useful to you.
You may also be interested in the 2008-introduced subroutine EXECUTE_COMMAND_LINE which is the standard replacement for the widely-implemented but non-standard system routine that you use in your snippet. This is already available in a number of current Fortran compilers.
There is no intrinsic function in Fortran for this. A common workaround is to use conditional compilation (through makefile or compiler supported macros) such as here. If you really insist on this kind of solution, you might consider making an external function, e.g., in C. However, since your code is built for a fixed platform (Windows/Linux, not both), the first solution is preferable.
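For the external-C-function route, a minimal sketch could look like the following. The names platform_name and platform_is are hypothetical, and the check relies on the predefined macros _WIN32 and __linux__, which mainstream compilers set on those platforms. Note that the decision is still made at compile time, so this is really conditional compilation moved into C. The Fortran side would have to call it through C interoperability, which is not shown here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reports the platform this file was compiled for,
 * based on compiler-predefined macros. */
const char *platform_name(void)
{
#if defined(_WIN32)
    return "Windows";
#elif defined(__linux__)
    return "Linux";
#else
    return "unknown";
#endif
}

/* Hypothetical helper: returns 1 if `name` matches the compiled-for
 * platform, 0 otherwise. `name` must be a NUL-terminated string. */
int platform_is(const char *name)
{
    assert(name != NULL);
    return strcmp(platform_name(), name) == 0;
}
```

The branch in the question would then test platform_is("Linux") or platform_is("Windows") instead of comparing against an Arch string.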
There is a well-known fact that C++ templates are turing-complete, CSS is turing-complete (!) and that the C# overload resolution is NP-hard (even without generics).
But is C# 4.0 (with co/contravariance, generics etc) compile-time turing complete?
Unlike templates in C++, generics in C# (and other .NET languages) are instantiated at runtime. The compiler does some checking to verify how the types are used, but the actual substitution happens at runtime. The same goes for co- and contravariance, if I'm not mistaken, and even for the preprocessor directives. Lots of CLR magic.
(At the implementation level, the primary difference is that C#
generic type substitutions are performed at runtime and generic type
information is thereby preserved for instantiated objects)
See MSDN
http://msdn.microsoft.com/en-us/library/c6cyy67b(v=vs.110).aspx
Update:
The CLR does perform type checking, using information stored in the metadata of the compiled assemblies (vis-à-vis JIT compilation). This is one of its many services (others include memory management and exception handling); ShuggyCoUk's answer on this question explains it in detail. From that I would infer that the compiler has an understanding of state, both as progression and as internal machine state. (Turing completeness, in part, means being able to read data (symbols) with some reference to previous data (symbols), and to evaluate conditionally. I hesitate to state the exact definition of TC because I am not sure I have fully grasped it myself, so feel free to fill in the blanks and correct me where applicable.) So, with a bit of trepidation, I would say yes, it can be.
I recently took over a small MFC C++ application, which is obviously in a working state. To get started I'm running PC-Lint over the code, and lint is complaining that CStringT's are being passed to Format. Opinion on the internet seems to be divided. Some say that CString is designed to handle this use case without error, but others (and an MSDN article) say that it should always be cast when passed to a variable argument function. Can Stack Overflow come to any consensus on the issue?
CString has been carefully designed to be passed as part of a variable argument list, so it is safe to use it that way. And you can be fairly sure that Microsoft will take care not to break this particular behavior. So I'd say you are safe to continue using it that way, if you want to.
That said, personally I'd prefer the cast. It is not common behavior that string classes behave that way (e.g. std::string does not) and for mental consistency it may be better to just do it the "safe" way.
P.S.: See this thread for implementation details and further notes on how to cast.