While writing kernel modules/drivers, structures are often initialized to point to specific functions. As a beginner, could someone explain the importance of this?
I saw struct file_operations while writing a character device driver.
I also found that even though the functions are assigned, they are not always implemented in the same file. Could anyone help with that too? For example, in the kernel source kernel/dma.c, even though
static const struct file_operations proc_dma_operations = {
    .open    = proc_dma_open,
    .read    = seq_read,
    .llseek  = seq_lseek,
    .release = single_release,
};
are assigned, only proc_dma_open is implemented in that file.
The functions seq_read, seq_lseek and single_release are declared in the kernel source file linux-3.1.6/include/linux/seq_file.h and defined in the kernel source file linux-3.1.6/fs/seq_file.c. They are probably common to many file operations.
If you ever played with object-oriented languages like C++, think of file_operations as a base class, and your functions as being implementations of its virtual methods.
Pointers to functions are a very powerful tool in the C language that allows for runtime redirection of function calls. Most if not all operating systems have a similar mechanism; for example, the infamous INT 21h functions 25h/35h in old MS-DOS allowed TSR programs to exist.
In C, you can assign the pointer to a function to a variable and then call that function through that variable. The function can be changed either at init time based on some parameters or at runtime based on some behavior.
Here is an example:
int fn(int a)
{
    /* ... */
    return a;
}

/* ... */
int (*dynamic_fn)(int);
/* ... */
dynamic_fn = &fn;   /* or equivalently: dynamic_fn = fn; */
/* ... */
int i = dynamic_fn(0);
When the pointer "lives" in a structure that can be passed to system calls, this is a very powerful feature that allows hooks into system functions.
In object oriented languages, the same kind of behavior can be achieved by using reflection to instantiate classes dynamically.
I would like to make a basic library with some very basic features of OpenMP. For example, just to be able to write a code like below. I wonder how I can use LLVM and pthreads to accomplish this.
I guess there are two steps:
Preprocessing to figure out the parallel task (parallel vs parallel-for)
Convert appropriate code blocks to a void* function needed for pthreads
Automation to create, run and join threads
Example code:
#Our_Own parallel
{
    printf("hello");
}
#Our_Own parallel for
for (int i = 0; i < 1000; i++)
{
    printf("%d\n", i);
}
There is no need to use pragmas to implement parallel loops (or tasks) in C++. You can see this in the implementation of Threading Building Blocks (TBB), or Kokkos, both of which provide parallelism without using pragmas. As a consequence, there is no need to make any compiler modifications to do this!
The key observation here is that C++ provides lambdas which allow you to abstract a chunk of code into an anonymous function and to bind the appropriate variables from the context so that it can later be invoked in some other thread.
Even if you do want to map to pragmas, for instance to provide your own "improved" version of OpenMP, you can do that without using anything more than C macros, via the _Pragma operator, which can be placed inside a macro, something like this:
#include <stdio.h>
#include <omp.h>

#define STRINGIFY1(...) #__VA_ARGS__
#define STRINGIFY(...)  STRINGIFY1(__VA_ARGS__)
#define MY_PARALLEL     _Pragma("omp parallel")
#define my_threadID()   omp_get_thread_num()

int main(int, char **)
{
    MY_PARALLEL
    {
        printf("Hello from thread %d\n", my_threadID());
    }
    return 0;
}
However, we're rather in the dark about what you are really trying to achieve, and in what context:
Since OpenMP implementations almost all sit on top of pthreads, why do you need something different?
Which language is this for? (C, C++, other?)
What is the reason to avoid using existing implementations (such as TBB, RAJA, Kokkos, C++ Parallel Algorithms)?
Remember, "The best code is the code I do not have to write".
(P.s. if you want to see the type of thing you are taking on, look at the Little OpenMP runtime which implements some (not all) of the CPU OpenMP requirements, and the associated book.)
DISCLAIMER: This is a very hard task, and there are many hidden sub-tasks involved (how to access variables and set up their sharing attributes, implicit barriers, synchronization, etc.). I do not claim that you can solve all of them using the idea described below.
If you use a (Python) script to preprocess your code, here is a very minimal example of how to start in C (C++ may be a bit easier because of lambda functions).
Use macro definitions (preferably put them in a header file):
// Macro definitions (in own.h); the generated code needs <pthread.h>
#define OWN_PARALLEL_BLOCK_START(f) \
void *f(void *data)                 \
{

#define OWN_PARALLEL_BLOCK_END \
    return NULL;               \
}

#define OWN_PARALLEL_MAIN(f)                              \
{                                                         \
    int THREAD_NUM = 4;                                   \
    pthread_t thread_id[THREAD_NUM];                      \
    for (int i = 0; i < THREAD_NUM; i++) {                \
        pthread_create(&thread_id[i], NULL, f, NULL);     \
    }                                                     \
    for (int i = 0; i < THREAD_NUM; i++) {                \
        pthread_join(thread_id[i], NULL);                 \
    }                                                     \
}
Your (python) script should convert this:
int main(){
#pragma own parallel
{
printf("Hello world\n");
}
}
to the following:
OWN_PARALLEL_BLOCK_START(ThreadBlock_1)
{
printf("Hello world\n");
}
OWN_PARALLEL_BLOCK_END
int main(){
OWN_PARALLEL_MAIN(ThreadBlock_1)
}
Check it on Compiler Explorer
I would like to make a basic library with some very basic features of OpenMP.
No, you really wouldn't. At minimum, because nothing of the sort you are considering can accurately be described as "basic" or as (only) a "library".
There are multiple good alternatives already available, not least OpenMP itself. Use one of those. Or if you insist on rolling your own then do so with the understanding that you are taking on a major project.
For example, just to be able to write a code like below. I wonder how I can use LLVM and pthreads to accomplish this. I guess there are two steps:
Preprocessing to figure out the parallel task (parallel vs parallel-for)
Yes, but not with the C preprocessor. It is not nearly powerful enough. You need a source-to-source translator that has a sufficient semantic understanding of the target language (C? C++?) to recognize your annotations and the source constructs to which they apply, and to apply the needed transformations and tooling.
I think the LLVM project contains pieces that would help with this (so you probably don't need to write a C parser from scratch, for example).
Convert appropriate code blocks to a void* function needed for pthreads
Yes, that would be among the transformations required.
Automation to create, run and join threads
Yes, that would also be among the transformations required.
But those are not all. For the parallel for case, for example, you also need to account for splitting the loop iterations among multiple threads. You probably also need a solution for recognizing and managing access to shared variables.
Often in embedded settings we need to declare static structs (drivers, etc.) so that their memory is known and assigned at compile time.
Is there any way to achieve something similar in Rust?
For example, I want to have a uart driver struct
struct DriverUart{
...
}
and an associated impl block.
Now, I want to avoid having a function named new(), and instead allocate this memory a priori (or have a new function that I can call statically, outside any code block).
In C I would simply put an instantiation of this struct in some header file and it would be statically allocated and globally available.
I haven't found anything similar in Rust.
If it is not possible, then why? And what is the best way to achieve something similar?
Thanks!
Now, I want to avoid having a function named new(), and instead allocate this memory a priori (or have a new function that I can call statically, outside any code block). In C I would simply put an instantiation of this struct in some header file and it would be statically allocated and globally available. I haven't found anything similar in Rust. If it is not possible, then why? And what is the best way to achieve something similar?
https://doc.rust-lang.org/std/keyword.static.html
You can do the same in Rust, without the header, as long as all the elements are const:
struct DriverUart {
    whatever: u32,
}

static THING: DriverUart = DriverUart { whatever: 5 };
If you need to evaluate non-const expressions, that obviously will not work, and you'll need something like lazy_static or once_cell to instantiate lazily initialized statics.
And of course, since Rust is a safe language and statics are shared state, mutable statics are wildly unsafe unless mitigated through thread-safe interior-mutability containers (e.g. an atomic, or a Mutex, though those are currently non-const, and it's unclear if they can ever be otherwise): a static is always considered to be shared between threads.
I know there is a function pthread_equal in the pthreads library on Linux for comparing two thread IDs. But why can't one directly compare two thread IDs using the '==' operator?
From the pthread_equal man page on Linux:
The pthread_equal() function is necessary because thread IDs should
be considered opaque: there is no portable way for applications to
directly compare two pthread_t values.
It might be a struct. It might be a pointer. It might be a pointer to a struct held somewhere. == might, or might not, return true for all cases it should return true and vice versa.
So you are provided with an accessor that is guaranteed to return the correct result, no matter the implementation.
I checked the definition of pthread_t with the gcc toolchain I am using; there it is defined as "typedef unsigned long int pthread_t;". But the definition of pthread_t is implementation-dependent: while porting the same code to different platforms (hardware), comparing thread IDs directly would be a problem, because different platforms may implement pthread_t in different ways (e.g. as a structure, or as a type other than unsigned long int). So to make our code portable across platforms, we treat pthread_t as an opaque type.
The new memory model of C++11 allows multi-processor systems to work reliably with respect to the reordering of instructions.
As Meyers and Alexandrescu pointed out, the "simple" Double-Checked Locking Pattern implementation is not safe in C++03:
Singleton* Singleton::instance() {
    if (pInstance == 0) {       // 1st test
        Lock lock;
        if (pInstance == 0) {   // 2nd test
            pInstance = new Singleton;
        }
    }
    return pInstance;
}
They showed in their article that no matter what you do as a programmer, in C++03 the compiler has too much freedom: it is allowed to reorder the instructions in such a way that you cannot be sure you end up with only one instance of Singleton.
My questions are now:
Do the restrictions/definitions of the new C++11 memory model constrain the sequence of instructions enough that the above code would always work with a C++11 compiler?
What does a safe C++11 implementation of this singleton pattern look like when using the new library facilities (instead of the mock Lock here)?
If pInstance is a regular pointer, the code has a potential data race: operations on pointers (or on any built-in type, for that matter) are not guaranteed to be atomic (EDIT: or well ordered).
If pInstance is an std::atomic<Singleton*> and Lock internally uses an std::mutex to achieve synchronization (for example, if Lock is actually std::lock_guard<std::mutex>), the code should be data race free.
Note that you need both explicit locking and an atomic pInstance to achieve proper synchronization.
Since static local variable initialization is now guaranteed to be thread-safe, the Meyers singleton should be thread-safe:
Singleton* Singleton::instance() {
    static Singleton _instance;
    return &_instance;
}
Now you need to address the main problem: there is a Singleton in your code.
EDIT, based on my comment below: this implementation has a major drawback compared to the others. What happens if the compiler doesn't support this feature? The compiler will emit thread-unsafe code without even issuing a warning. The other solutions, with explicit locks, will not even compile if the compiler doesn't support the new interfaces. This might be a good reason not to rely on this feature, even for things other than singletons.
C++11 doesn't change the meaning of that implementation of double-checked locking. If you want to make double-checked locking work you need to erect suitable memory barriers/fences.
Are there languages that support process common memory in one address space and thread specific memory in another address space using language features rather than through a mechanism like function calls?
process int x;
thread int y;
ThreadStatic attribute in C#
The Visual C++ compiler allows the latter through the nonstandard __declspec(thread) extension; however, it is severely limited, since it isn't supported in dynamically loaded DLLs.
The first is mostly supported through an extern declaration - unless dynamically linked libraries come into play (which is probably the scenario you are looking for).
I am not aware of any environment that makes this as simple as you describe.
C++0x adds the thread_local storage specifier, so at namespace (or global) scope your example would be
int x; // normal process-wide global variable
thread_local int y; // per-thread global variable
You can also use thread_local with static when declaring class members or local variables in a function:
class Foo {
    static thread_local int x;
};

void f() {
    static thread_local int x;
}
Unfortunately, this doesn't appear to be one of the C++0x features supported by Visual Studio 2010 or planned GCC releases.