How to check the version of OpenMP on windows - cygwin

I wonder how to check the version of OpenMP on Windows when using Cygwin64. Thanks and regards!

The OpenMP version is tied to the compiler.
You need to check the GCC version in your Cygwin installation first.
https://www.openmp.org/resources/openmp-compilers-tools/

The OpenMP specification says:
In implementations that support a preprocessor, the _OPENMP macro name is defined to have the decimal value yyyymm where yyyy and mm are the year and month designations of the version of the OpenMP API that the implementation supports.
For Fortran implementations that do not support C-style preprocessing, the integer parameter openmp_version (provided by both use omp_lib and include 'omp_lib.h' interfaces) is set to the same yyyymm value.
The following table lists the correspondence between the number and the version (dates were looked up here and then cross-referenced with existing header files).
_OPENMP | OpenMP version
---------+----------------
 200011 | 2.0 (Fortran)
 200203 | 2.0 (C/C++)
 200505 | 2.5
 200805 | 3.0
 201107 | 3.1
 201307 | 4.0
 201511 | 4.5
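If you just want to see the value directly under Cygwin, you can compile and run a minimal check like the following (a sketch; the file name check_omp.c is arbitrary, build it with gcc -fopenmp check_omp.c -o check_omp):
/* check_omp.c - print the _OPENMP date macro reported by the compiler. */
#include <stdio.h>

int main(void)
{
#ifdef _OPENMP
    /* yyyymm value, e.g. 201511 means OpenMP 4.5 (see the table above) */
    printf("_OPENMP = %d\n", _OPENMP);
#else
    printf("OpenMP is not enabled (did you pass -fopenmp?)\n");
#endif
    return 0;
}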

Related

Are there compatibility issues with clang-cl and arch:avx2?

I'm using Windows 10, Visual Studio 2019, Platform: x64 and have the following test script in a single-file Visual Studio Solution:
#include <iostream>
#include <cstdlib>   // EXIT_SUCCESS / EXIT_FAILURE
#include <intrin.h>
using namespace std;

int main() {
    unsigned __int64 mask = 0x0fffffffffffffff; // 1152921504606846975
    unsigned long index;
    _BitScanReverse64(&index, mask);
    if (index != 59) {
        cout << "Fails!" << endl;
        return EXIT_FAILURE;
    }
    else {
        cout << "Success!" << endl;
        return EXIT_SUCCESS;
    }
}
In my solution properties I've set 'Enable Enhanced Instruction Set' to 'Advanced Vector Extensions 2 (/arch:AVX2)'.
When compiling with msvc (setting 'Platform Toolset' to 'Visual Studio 2019 (v142)') the code returns EXIT_SUCCESS, but when compiling with clang-cl (setting 'Platform Toolset' to 'LLVM (clang-cl)') I get EXIT_FAILURE. When debugging the clang-cl run, the value of index is 4, when it should be 59. This suggests to me that clang-cl is reading the bits in the opposite direction of MSVC.
This isn't the case when I set 'Enable Enhanced Instruction Set' to 'Not Set'. In this scenario, both MSVC and clang-cl return EXIT_SUCCESS.
All of the DLLs loaded and shown in the Debug Output window come from C:\Windows\System32###.dll in all cases.
Does anyone understand this behavior? I would appreciate any insight here.
EDIT: I failed to mention earlier: I compiled this on an Intel Core i7-3930K CPU @ 3.20GHz.
Getting 4 instead of 59 sounds like clang implemented _BitScanReverse64 as 63 - lzcnt. Actual bsr is slow on AMD, so yes, there are reasons why a compiler would want to compile a BSR intrinsic to a different instruction.
But then you ran the executable on a computer that doesn't actually support BMI so lzcnt decoded as rep bsr = bsr, giving the leading-zero count instead of the bit-index of the highest set bit.
AFAIK, all CPUs that have AVX2 also have BMI. If your CPU doesn't have that, you shouldn't expect your executables built with /arch:AVX2 to run correctly on your CPU. And in this case the failure mode wasn't an illegal instruction, it was lzcnt running as bsr.
MSVC doesn't generally optimize intrinsics, apparently including this case, so it just uses bsr directly.
Update: i7-3930K is SandyBridge-E. It doesn't have AVX2, so that explains your results.
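To make the arithmetic concrete, here is a plain-C illustration (software emulation only, not clang's actual code generation) of how the 63 - lzcnt rewrite produces 4 for the mask in the question when lzcnt silently runs as bsr:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t mask = 0x0fffffffffffffff;

    int bsr = 63;
    while (!(mask >> bsr))
        bsr--;                           /* index of highest set bit: 59 */
    int lzcnt = 63 - bsr;                /* leading-zero count: 4 */

    printf("bsr                     = %d\n", bsr);        /* 59: MSVC's un-optimized intrinsic */
    printf("63 - lzcnt              = %d\n", 63 - lzcnt);  /* 59: clang's rewrite, with real LZCNT */
    printf("63 - (lzcnt run as bsr) = %d\n", 63 - bsr);    /* 4: what you saw on SandyBridge-E */
    return 0;
}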
clang-cl doesn't error when you tell it to build an AVX2 executable on a non-AVX2 computer. The use-case for that would be compiling on one machine to create an executable to run on different machines.
It also doesn't add CPUID-checking code to your executable for you. If you want that, write it yourself. This is C++, it doesn't hold your hand.
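If you do want such a guard, a rough sketch using the __cpuid/__cpuidex intrinsics might look like the following; the feature-bit positions (CPUID leaf 7, subleaf 0: EBX bit 5 = AVX2, bit 3 = BMI1, bit 8 = BMI2) are my own addition, so verify them against Intel's documentation, and a production check would also confirm OS support for YMM state via XGETBV:
/* Hypothetical runtime guard, not part of the original question. */
#include <intrin.h>
#include <stdio.h>

static int has_avx2_and_bmi(void)
{
    int regs[4];                       /* EAX, EBX, ECX, EDX */

    __cpuid(regs, 0);                  /* highest supported standard CPUID leaf */
    if (regs[0] < 7)
        return 0;

    __cpuidex(regs, 7, 0);             /* structured extended feature flags */
    int avx2 = (regs[1] >> 5) & 1;
    int bmi1 = (regs[1] >> 3) & 1;
    int bmi2 = (regs[1] >> 8) & 1;
    return avx2 && bmi1 && bmi2;
}

int main(void)
{
    if (!has_avx2_and_bmi()) {
        printf("CPU lacks AVX2/BMI1/BMI2 - refusing to run /arch:AVX2 code.\n");
        return 1;
    }
    printf("AVX2 and BMI are available.\n");
    return 0;
}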
target CPU options
MSVC-style /arch options are much more limited than normal GCC/clang style. There aren't any for different levels of SSE like SSE4.1; it jumps straight to AVX.
Also, /arch:AVX2 apparently implies BMI1/2, even though those are different instruction-sets with different CPUID feature bits. In kernel code for example you might want integer BMI instructions but not SIMD instructions that touch XMM/YMM registers.
clang -O3 -mavx2 would not also enable -mbmi. You normally would want that, but if you failed to also enable BMI then clang would have been stuck using bsr. (Which is actually better for Intel CPUs than 63-lzcnt). I think MSVC's /arch:AVX2 is something like -march=haswell, if it also enables FMA instructions.
And nothing in MSVC has any support for making binaries optimized to run on the computer you build them on. That makes sense, it's designed for a closed-source binary-distribution model of software development.
But GCC and clang have -march=native to enable all the instruction sets your computer supports. And also importantly, set tuning options appropriate for your computer. e.g. don't worry about making code that would be slow on an AMD CPU, or on older Intel, just make asm that's good for your CPU.
TL;DR: CPU selection options in clang-cl are very coarse, lumping non-SIMD extensions in with some level of AVX. That's why /arch:AVX2 enabled the integer BMI extensions, while clang -mavx2 would not have.

How to tell which version of OpenGL my graphics card supports on Linux

I am trying to figure out which version of OpenGL my graphics card and driver currently support.
This answer suggests running glxinfo | grep OpenGL, which is fine, but here is (some) of that output:
OpenGL vendor string: NVIDIA Corporation
OpenGL core profile version string: 4.5.0 NVIDIA 387.22
OpenGL version string: 4.6.0 NVIDIA 387.22
So it is hard to tell, is it 4.5 or 4.6?
Also the official documentation from nVidia does not mention the answer either!
OpenGL version string: 4.6.0 NVIDIA 387.22
That is the highest legacy version the implementation will support. There are several possibilities here:
This number will be <= 3.1 for OpenGL implementations not supporting modern OpenGL profiles (introduced in 3.2).
It might be the highest supported compatibility profile version if the implementation does support the compatibility profile and actually returns a compatibility profile context when asked for a legacy context.
It might also be <= 3.1 even on implementations that do support GL >= 3.2 in a compatibility profile, but chose to not expose it when asked for a legacy context.
The nvidia proprietary driver falls in category 2.
For the core profile, there is simply no way to ask the implementation what it can support, as described in
this answer:
OpenGL core profile version string: 4.5.0 NVIDIA 387.22
That glxinfo output does not mean that your driver can't do 4.6 core. (It actually can.) It just means that glxinfo isn't aware of GL 4.6 yet and only checks for versions up to 4.5.
The source code for glxinfo will reveal the following logic:
if (coreProfile) {
   /* Try to create a core profile, starting with the newest version of
    * GL that we're aware of. If we don't specify the version
    */
   int i;
   for (i = 0; gl_versions[i].major > 0; i++) {
      /* don't bother below GL 3.0 */
      if (gl_versions[i].major == 3 &&
          gl_versions[i].minor == 0)
         return 0;
      ctx = create_context_flags(dpy, config,
                                 gl_versions[i].major,
                                 gl_versions[i].minor,
                                 0x0,
                                 GLX_CONTEXT_CORE_PROFILE_BIT_ARB,
                                 direct);
      if (ctx)
         return ctx;
   }
   /* couldn't get core profile context */
   return 0;
}
so it just iterates through an array gl_versions and checks if a context with that version can be created.
And OpenGL 4.6 was added to that array in this commit on October 11, 2017:
diff --git a/src/xdemos/glinfo_common.h b/src/xdemos/glinfo_common.h
index 0024f85..4d07f66 100644
--- a/src/xdemos/glinfo_common.h
+++ b/src/xdemos/glinfo_common.h
@@ -86,6 +86,7 @@ struct options
 /** list of known OpenGL versions */
 static const struct { int major, minor; } gl_versions[] = {
+   {4, 6},
    {4, 5},
    {4, 4},
    {4, 3},
So if you use a glxinfo that was built from source older than that commit (which means basically every distro version right now), it simply will not show 4.6, even if your driver can do it.
So it is hard to tell, is it 4.5 or 4.6?
It is 4.6 for both compatibility and core profile. But I only know that because I know that driver.
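For completeness: once a context has been created (through GLX or a toolkit) and made current, you can at least confirm which version you actually got. A minimal sketch, assuming a current context and headers that expose the GL 3.0 enums (the fallback #defines are the standard enum values):
#include <stdio.h>
#include <GL/gl.h>

#ifndef GL_MAJOR_VERSION               /* older gl.h may lack the GL 3.0 enums */
#define GL_MAJOR_VERSION 0x821B
#define GL_MINOR_VERSION 0x821C
#endif

void print_context_version(void)
{
    GLint major = 0, minor = 0;
    glGetIntegerv(GL_MAJOR_VERSION, &major);   /* only valid on GL 3.0+ contexts */
    glGetIntegerv(GL_MINOR_VERSION, &minor);
    printf("Context version:   %d.%d\n", major, minor);
    printf("GL_VERSION string: %s\n", (const char *)glGetString(GL_VERSION));
}
This only reports the version of the context you asked for and received, which is exactly why glxinfo has to probe downwards from the top of its version list.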

How can I convert a Bluetooth 16 bit service UUID into a 128 bit UUID?

All assigned services only state the 16 bit UUID. How can I determine the 128 bit counterpart if I have to specify the service in that format?
From Service Discovery Protocol Overview I know that 128 bit UUIDs are based on a so called "BASE UUID" which is also stated there:
00000000-0000-1000-8000-00805F9B34FB
But how do I create a 128 bit UUID from the 16 bit counterpart? Probably some of the 0 digits have to be replaced, but which and how?
This can be found in the Bluetooth 4.0 Core spec Vol. 3 - Core System. See the list of adopted specs.
Part B, covering the Service Discovery Protocol (SDP), explains in Chapter 2.5.1 "Searching for Services / UUID" how to calculate the UUID.
The full 128-bit value of a 16-bit or 32-bit UUID may be computed by a simple arithmetic operation.
128_bit_value = 16_bit_value * 2^96 + Bluetooth_Base_UUID
128_bit_value = 32_bit_value * 2^96 + Bluetooth_Base_UUID
A 16-bit UUID may be converted to 32-bit UUID format by zero-extending the 16-bit value to 32-bits. An equivalent method is to add the 16-bit UUID value to a zero-valued 32-bit UUID.
Note that, in another section, there's a handy mnemonic:
Or, to put it more simply, the 16-bit Attribute UUID replaces the x's in the following:
0000xxxx-0000-1000-8000-00805F9B34FB
In addition, the 32-bit Attribute UUID replaces the x's in the following:
xxxxxxxx-0000-1000-8000-00805F9B34FB
The same equations apply to attribute UUIDs. See Part F, covering the Attribute Protocol (ATT), under Chapter 3.2.1 "Protocol Requirements / Basic Concepts". 32-bit attribute UUIDs were first specified in the Bluetooth Core 4.1 spec.
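In code, the conversion is just a byte copy of the Base UUID with the 16-bit value dropped into the "xxxx" position. A minimal C sketch (the Battery Service UUID 0x180F is used purely as an example):
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Expand a 16-bit Bluetooth UUID to 128 bits: uuid16 * 2^96 + Base UUID,
 * i.e. write it into bytes 2-3 of 00000000-0000-1000-8000-00805F9B34FB. */
void uuid16_to_uuid128(uint16_t uuid16, uint8_t out[16])
{
    static const uint8_t base[16] = {   /* Base UUID, most significant byte first */
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00,
        0x80, 0x00, 0x00, 0x80, 0x5F, 0x9B, 0x34, 0xFB
    };
    memcpy(out, base, sizeof base);
    out[2] = (uint8_t)(uuid16 >> 8);    /* the "xxxx" in 0000xxxx-... */
    out[3] = (uint8_t)(uuid16 & 0xFF);
}

int main(void)
{
    uint8_t u[16];
    uuid16_to_uuid128(0x180F, u);
    printf("%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-%02x%02x%02x%02x%02x%02x\n",
           u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
           u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
    /* prints: 0000180f-0000-1000-8000-00805f9b34fb */
    return 0;
}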

Can't manage to get the Star-Schema DBMS benchmark data generator to run properly

One of the commonly (?) used DBMS benchmarks is called SSB, the Star-Schema Benchmark. To run it, you need to generate your schema, i.e. your tables with the data in them. Well, there's a generator program you can find in all sorts of places (on github):
https://github.com/rxin/ssb-dbgen
https://code.google.com/p/gpudb/source/checkout (then under tests/ssb/dbgen or something)
https://github.com/electrum/ssb-dbgen/
and possibly elsewhere. I'm not sure those all have exactly the same code, but I seem to be experiencing the same problem with all of them. I'm using a 64-bit Linux system (Kubuntu 14.04, if that helps) and am trying to build and run the dbgen program from that package.
When building, I get type/size-related warnings:
me@myhost:~/src/ssb-dbgen$ make
... etc. etc. ...
gcc -O -DDBNAME=\"dss\" -DLINUX -DDB2 -DSSBM -c -o varsub.o varsub.c
rnd.c: In function ‘row_stop’:
rnd.c:60:6: warning: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘long int’ [-Wformat=]
   i, Seed[i].usage);
   ^
driver.c: In function ‘partial’:
driver.c:606:4: warning: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘long int’ [-Wformat=]
... etc. etc. ...
Then, I make sure all the right files are in place, try to generate my tables, and only get two of them! I try to explicitly generate the LINEORDER table, and get a strange failure:
eyal@vivaldi:~/src/ssb-dbgen$ ls
bcd2.c build.c driver.c HISTORY makefile_win print.c rnd.c speed_seed.o varsub.c
bcd2.h build.o driver.o history.html mkf.macos print.o rnd.h ssb-dbgen-master varsub.o
bcd2.o CHANGES dss.ddl load_stub.c permute.c qgen rnd.o text.c
bm_utils.c config.h dss.h load_stub.o permute.h qgen.c rxin-ssb-dbgen-master.zip text.o
bm_utils.o dbgen dss.ri Makefile permute.o qgen.o shared.h tpcd.h
BUGS dists.dss dsstypes.h makefile.suite PORTING.NOTES README speed_seed.c TPCH_README
me@myhost:~/src/ssb-dbgen$ ./dbgen -vfF -s 1
SSBM (Star Schema Benchmark) Population Generator (Version 1.0.0)
Copyright Transaction Processing Performance Council 1994 - 2000
Generating data for suppliers table [pid: 32303]done.
Generating data for customers table [pid: 32303]done.
Generating data for (null) [pid: 32303]done.
Generating data for (null) [pid: 32303]done.
Generating data for (null) [pid: 32303]done.
Generating data for (null) [pid: 32303]done.
me@myhost:~/src/ssb-dbgen$ ls *.tbl
customer.tbl supplier.tbl
me@myhost:~/src/ssb-dbgen$ ./dbgen -vfF -s 1 -T l
SSBM (Star Schema Benchmark) Population Generator (Version 1.0.0)
Copyright Transaction Processing Performance Council 1994 - 2000
Generating data for lineorder table [pid: 32305]*** buffer overflow detected ***: ./dbgen terminated
======= Backtrace: =========
... etc. etc. ...
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fcea1b79ec5]
./dbgen[0x401219]
======= Memory map: ========
... etc. etc. ...
Now, if I switch to a 32-bit Linux system, I don't get any of these warnings (although there are two warnings about a pointer-to-non-pointer conversion), but running the generation again produces only two tables. Other individual tables can be produced, but I would think they don't correspond to one another at all...
Has anyone encountered a similar problem? Am I doing something wrong? Am I using the wrong sources somehow?
(This is almost a dupe of
SSB dbgen Linux - Segmentation Fault
... but I can't "take over" somebody else's question when they may have encountered other problems than mine. Also, that one has no answers...)
If anyone still encounters this problem, I found a solution here: https://github.com/electrum/ssb-dbgen/pull/1
Specifically, you have to modify the two files shared.h and config.h.
Regards.
Edit: change:
#ifdef SSBM
#define MAXAGG_LEN 10 /* max component length for a agg str */
to:
#ifdef SSBM
#define MAXAGG_LEN 20 /* max component length for a agg str */
I found a workaround, but you need a Windows system.
Download and unzip this package:
https://github.com/LucidDB/thirdparty/blob/master/ssb.tar.bz2
In the bin directory there is dbgen.exe. Run it from the Windows console, e.g.:
...\bin\dbgen.exe -s 1 -T a
After that, just copy the created files to your Linux system. Not the best way, but effective :)
So, eventually, I ended up surveying all versions of ssb-dbgen on GitHub and creating a unified repository:
https://github.com/eyalroz/ssb-dbgen/
This repository:
incorporates fixes for all bugs fixed in any of those versions, and a few others; in particular, the format mismatch due to different int sizes on Linux and Windows for 64-bit machines is resolved (the sketch after this list illustrates that class of bug);
switches the build to CMake, rather than needing to manually edit Makefiles; specifically, building on Windows and macOS is supported, and building on more exotic systems is theoretically supported;
has CI build testing of commits, to make sure that at least the building doesn't break.
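As an illustration of that int-size format mismatch (a sketch of the bug class only, not dbgen's actual source): on 64-bit Linux the counters being printed are longs, so passing them to a %d conversion is undefined behaviour, and the fix is a matching length modifier or an explicit cast.
#include <stdio.h>

int main(void)
{
    long usage = 123456789L;        /* stands in for a seed-usage counter */

    /* printf("%d\n", usage);         the warned-about pattern: %d expects int,
                                      but the argument is a long              */
    printf("%ld\n", usage);         /* correct: matching length modifier */
    printf("%d\n", (int)usage);     /* or an explicit (narrowing) cast   */
    return 0;
}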

How does linux capability.h use 32-bit mask for 34 elements?

The file /usr/include/linux/capability.h #defines 34 possible capabilities.
It goes like:
#define CAP_CHOWN 0
#define CAP_DAC_OVERRIDE 1
.....
#define CAP_MAC_ADMIN 33
#define CAP_LAST_CAP CAP_MAC_ADMIN
Each process has capabilities defined thus:
typedef struct __user_cap_data_struct {
    __u32 effective;
    __u32 permitted;
    __u32 inheritable;
} *cap_user_data_t;
I'm confused: a process can have 32 bits of effective capabilities, yet the total number of capabilities defined in capability.h is 34. How is it possible to encode 34 positions in a 32-bit mask?
Because you haven't read all of the manual.
The capget manual starts by convincing you not to use it:
These two functions are the raw kernel interface for getting and setting thread capabilities. Not only are these system calls specific to Linux, but the kernel API is likely to change and use of these functions (in particular the format of the cap_user_*_t types) is subject to extension with each kernel revision, but old programs will keep working.
The portable interfaces are cap_set_proc(3) and cap_get_proc(3); if possible you should use those interfaces in applications. If you wish to use the Linux extensions in applications, you should use the easier-to-use interfaces capsetp(3) and capgetp(3).
Current details
Now that you have been warned, some current kernel details. The structures are defined as follows.
#define _LINUX_CAPABILITY_VERSION_1 0x19980330
#define _LINUX_CAPABILITY_U32S_1 1
#define _LINUX_CAPABILITY_VERSION_2 0x20071026
#define _LINUX_CAPABILITY_U32S_2 2
[...]
effective, permitted, inheritable are bitmasks of the capabilities defined in capability(7). Note the CAP_* values are bit indexes and need to be bit-shifted before ORing into the bit fields.
[...]
Kernels prior to 2.6.25 prefer 32-bit capabilities with version _LINUX_CAPABILITY_VERSION_1, and kernels 2.6.25+ prefer 64-bit capabilities with version _LINUX_CAPABILITY_VERSION_2. Note, 64-bit capabilities use datap[0] and datap[1], whereas 32-bit capabilities only use datap[0].
where datap is defined earlier as a pointer to a __user_cap_data_struct. So you just represent the 64-bit value with two __u32s, in an array of two __user_cap_data_structs.
This alone tells me never to use this API, so I didn't read the rest of the manual.
They aren't bit-masks, they're just constants. E.g. CAP_MAC_ADMIN is 33, which as a mask would set more than one bit: in binary, 33 is 100001.
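To make the bit-index point concrete, here is a minimal sketch of how an index above 31, such as CAP_MAC_ADMIN (33), lands in the second __u32 of a version-2 capability array. It only demonstrates the indexing; real applications should use libcap's cap_get_proc(3)/cap_set_proc(3), as the manual quoted above advises.
#include <stdio.h>
#include <linux/capability.h>

int main(void)
{
    /* Two __user_cap_data_structs, as used with _LINUX_CAPABILITY_VERSION_2 */
    struct __user_cap_data_struct datap[_LINUX_CAPABILITY_U32S_2] = {{0}};

    int cap = CAP_MAC_ADMIN;                    /* bit index 33, not a mask */
    datap[cap / 32].effective |= 1u << (cap % 32);

    printf("CAP_MAC_ADMIN -> datap[%d], bit %d, mask 0x%08x\n",
           cap / 32, cap % 32, datap[cap / 32].effective);
    /* prints: CAP_MAC_ADMIN -> datap[1], bit 1, mask 0x00000002 */
    return 0;
}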
