difference between update_rq_clock and update_rq_clock_task - linux

I understand that update_rq_clock() updates the run queue clock periodically on the system tick. But this function calls update_rq_clock_task(). What is the purpose of that function?

Within update_rq_clock, the difference between the current CPU clock timestamp and the run queue clock is calculated (the rq->clock variable holds the last clock value read from the CPU). That difference is added to rq->clock and, through update_rq_clock_task, to rq->clock_task (which is rq->clock minus the time spent on interrupts and minus stolen time).
There are a couple of code paths within that function which you can activate with kernel build options (CONFIG_IRQ_TIME_ACCOUNTING, CONFIG_PARAVIRT_TIME_ACCOUNTING), but basically it breaks down to:
...
rq->clock_task += delta;
...
update_rq_clock_pelt(rq, delta);
...
So, together the two functions update the run queue clock (rq->clock) and the run queue clock without interrupt and stolen time (rq->clock_task, provided you activated that accounting through the kernel options), i.e. the time that the tasks actually got to run.
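For orientation, here is a condensed sketch of how the two functions relate (simplified from kernel/sched/core.c; locking, feature checks and the actual irq/steal bookkeeping are omitted):

void update_rq_clock(struct rq *rq)
{
    s64 delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;

    if (delta < 0)
        return;
    rq->clock += delta;                /* raw run queue clock */
    update_rq_clock_task(rq, delta);   /* task clock, minus irq/steal time */
}

static void update_rq_clock_task(struct rq *rq, s64 delta)
{
    /* With CONFIG_IRQ_TIME_ACCOUNTING / CONFIG_PARAVIRT_TIME_ACCOUNTING,
       the time spent in hard/soft irqs and the time stolen by a
       hypervisor are subtracted from delta before these lines. */
    rq->clock_task += delta;
    update_rq_clock_pelt(rq, delta);   /* advance the PELT clock as well */
}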

Related

Measuring Semaphore wait times with Micrometer

We have a throttling implementation that essentially boils down to:
Semaphore s = new Semaphore(1);
...
void callMethod() {
    s.acquire();
    timer.recordCallable(() -> expensiveMethod()); // call expensive method
    s.release();
}
I would like to gather metrics about the impact the semaphore has on the overall response time of the method. For example, I would like to know the number of threads that were waiting on acquire, the time spent waiting, etc. What I am looking for, I guess, is a gauge that also captures timing information?
How do I measure the Semaphore stats?
There are multiple things you can do depending on your needs and situation.
LongTaskTimer is a timer that measures tasks that are currently in progress. The in-progress part is key here: after a task has finished, you will not see its effect on the timer. That's why it is meant for long-running tasks; I'm not sure it fits your use case.
The other thing you can do is have a Timer and a Gauge: the Timer measures how long it took to acquire the Semaphore, while with the Gauge you increment/decrement the number of threads that are currently waiting on it.
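Independent of the metrics library, the pattern is: increment a waiter count before blocking, time the acquire, decrement afterwards. A minimal C sketch of that pattern with POSIX semaphores (all names are hypothetical; with Micrometer, the Timer and Gauge would take the place of the plain counter and printf, and the semaphore needs sem_init(&throttle, 0, 1) at startup):

#include <semaphore.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

static sem_t throttle;          /* the semaphore being measured          */
static atomic_int waiting;      /* "gauge": threads blocked right now    */

/* "timer": how long one acquire blocked, in nanoseconds */
static long long timed_acquire(void)
{
    struct timespec t0, t1;

    atomic_fetch_add(&waiting, 1);   /* gauge++ before blocking          */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    sem_wait(&throttle);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    atomic_fetch_sub(&waiting, 1);   /* gauge-- once the permit is owned */

    return (t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (t1.tv_nsec - t0.tv_nsec);
}

void call_method(void)
{
    long long wait_ns = timed_acquire();

    printf("waited %lld ns, %d thread(s) still waiting\n",
           wait_ns, atomic_load(&waiting));
    /* expensive_method(); */
    sem_post(&throttle);
}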

ostream::flush context

I am using both the ostream::write and ostream::flush operations in a multithreaded application, in the following sequence:
// <<-- start time measurement
{
    ostream::write();
    ostream::flush();
}
// <<-- end time measurement
The issue is that when measuring the time for the above sequence, I get a very short time (~10 msec), yet the time between thread entrances becomes very large (~400 msec), only because of adding the ostream::flush and ostream::write calls.
Only once in a while does the time difference become larger, and I am not sure whether it is because of some context switch.
I am testing on a Linux machine with a dual-core CPU.
This confuses me. I had assumed that both of these functions are blocking, or is the writing actually done only at flush time?
EDIT:
Only one thread does the writing to the file.

Maximum number of tasks supported in AUTOSAR

What is the maximum number of tasks supported in AUTOSAR compliant systems?
In Linux, I can check the maximum process IDs supported to get the maximum number of tasks supported.
However, I couldn't find any source that states the maximum number of tasks supported by AUTOSAR.
Thank you very much for your help!
Well, we are still in an embedded automotive world and not on a PC.
There is usually a tradeoff between the number of tasks you have and what it takes to schedule them and what RAM/ROM and runtime resources your configuration uses.
As already said, if you just need a simple timed loop with some interrupts in between, one task may be ok.
It might also be enough to have e.g. 3 tasks running at 5 ms, 10 ms and 20 ms cycles. But in simple cases like this, you could also schedule everything with a single 5 ms task:
TASK(TASK_5ms)
{
    static uint8 cnt = 0;

    cnt++;
    /* The XXX and YYY Mainfunctions shall only be called every 10ms.
       Balance the load so that not three functions run in one 5ms slot
       and one in the next, but two functions in every 5ms slot. */
    if (cnt & 1)
    {
        XXX_Mainfunction_10ms();
    }
    else
    {
        YYY_Mainfunction_10ms();
    }
    ZZZ_Mainfunction_5ms();
}
So, if you need something to be run every 5, 10 or 20ms, you put these runnables into the corresponding tasks.
The old OSEK also had the notion of BASIC vs. EXTENDED tasks, where only extended tasks were able to react to OsEvents. Such tasks might not run cyclically, but only on configured OsEvents. They contain an OS wait point, where the task is more or less stopped and only woken up by the OS on the arrival of an event. There are also OsAlarms, which can either directly trigger the activation of an OsTask, or do so indirectly via an event. So you could, for example, wait at the same wait point for both a cyclic event from an OsAlarm and an OsEvent set by something else, e.g. by another task or from an ISR.
TASK(TASK_EXT)
{
    EventMaskType evt;

    for (;;)
    {
        WaitEvent(EVT_XXX_START | EVT_YYY_START | EVT_YYY_FINISHED);
        GetEvent(TASK_EXT, &evt);

        /* Start XXX if triggered, but only after YYY has reported to be finished */
        if ((evt & (EVT_XXX_START | EVT_YYY_FINISHED)) == (EVT_XXX_START | EVT_YYY_FINISHED))
        {
            ClearEvent(EVT_XXX_START);
            XXX_Start();
        }

        /* Start YYY if triggered; it will report later to start XXX */
        if (evt & EVT_YYY_START)
        {
            ClearEvent(EVT_YYY_START);
            YYY_Start();
        }
    }
}
This direct handling of scheduling is now mostly done/generated within the RTE based on the events you have configured for your SWCs and the Event to Task Mapping etc.
Tasks are scheduled mainly by their priority, which is why they can be preempted at any time by a higher-priority task. The exception is if you configure your OS and tasks to be not preemptive but cooperative; then it might be necessary to also use Schedule() points in your code to give up the CPU.
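For illustration, such a cooperative task might look like this (a sketch; the task and function names are hypothetical):

TASK(TASK_COOP)
{
    DoFirstChunk();
    Schedule();        /* explicit point where higher-priority ready tasks may run */
    DoSecondChunk();
    TerminateTask();
}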
On bigger systems, and also on multi-core systems with a multi-core OS, there will be a higher number of tasks, because tasks are bound to a core. The tasks on different cores run independently, except maybe for inter-core synchronization, which can also have a negative performance impact (spinlocks can stall the whole system).
E.g. there could be some cyclic tasks for normal BaseSW components and one specifically for communication components (CAN stack and Comm services).
We usually separate the communication part, since it needs a certain cycle time, like 5..10 ms, which the Comm stack uses for message transmission scheduling and also for reception timeout monitoring.
Then there might be a task to handle the memory stack (Ea/Fls, Eep/Fee, NvM).
There might also be some event-based tasks that trigger certain HW control and processing chains for measured data, since these might be put on different cores and can be scheduled by each other's start or finished events.
On the other hand, for all your cyclic tasks you should make sure that the functions run within such a task do not take longer than the task cycle; otherwise you get an OS shutdown due to multiple activation of the same task, because the task is started again before it has actually finished. And you might have constraints that require some tasks to finish within your application's expected measurement cycle.
In safety-relevant systems (ASIL-A .. ASIL-D) you'll also have at least one task per safety level to get freedom from interference. In AUTOSAR, you already specify that on the OsApplication the tasks are assigned to, which also allows you to configure memory protection (e.g. write access to memory partitions by QM, ASIL-A and ASIL-B applications and tasks). That is another part the OS has to handle at runtime: reconfiguring the MPU according to the OsApplication's MemoryAccess settings.
But again, the more tasks you create, the higher the usage of RAM, ROM and runtime.
RAM - runtime scheduling structures and different task stacks
ROM - the actual task and event configurations
Runtime - the context switches of the tasks and also the scheduling itself
It seems to vary. I found that ETAS RTA offers 1024 tasks*, whereas Vector's MICROSAR OS has 65535.
For task handling, OSEK/ASR provides the following functions:
StatusType ActivateTask (TaskType TaskID)
StatusType TerminateTask (void)
StatusType Schedule (void)
StatusType GetTaskID (TaskRefType TaskID)
StatusType GetTaskState (TaskType TaskID, TaskStateRefType State)
*Link might change in future, but it is easy to search ETAS page directly for manuals etc.: https://www.etas.com/en/products/download_center.php
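As a small illustration of this API, here is a sketch in which one basic task activates another and terminates itself (TASK_TRIGGER, TASK_WORKER and DoWork are hypothetical names):

TASK(TASK_TRIGGER)
{
    ActivateTask(TASK_WORKER);   /* make TASK_WORKER ready to run      */
    TerminateTask();             /* a basic task must terminate itself */
}

TASK(TASK_WORKER)
{
    DoWork();                    /* hypothetical worker function       */
    TerminateTask();
}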
Formally you can have an infinite number of OsTasks: according to the spec, the configuration of the Os can contain 0..* OsTask objects.
Apart from that, the OS software uses the data type TaskType for task index variables. Therefore, if TaskType is a uint16, you cannot have more than 65535 tasks.
Besides that, if you have a lot of tasks, you might want to re-think your design.

Is there an equivalent to the windows GetSystemTimes() function in Linux?

In Windows there is a function called GetSystemTimes() that returns the system idle time, the amount of time spent executing kernel code, and the amount of time spent executing user mode code.
Is there an equivalent function (or functions) in Linux?
The original answer gave a solution for getting the user and system time of the current running process. However, you want the information for the entire system. As far as I know, the only way to get this information is to parse the contents of /proc/stat, in particular the first line, labeled cpu:
cpu 85806677 11713309 6660413 3490353007 6236822 300919 807875 0
This is followed by per cpu summaries if you are running an SMP system. The line itself has the following information (in order):
time in user mode
time in user mode with low priority
time in system mode
time idle
time waiting for I/O to complete
time servicing interrupts
time servicing software interrupts
time stolen by the hypervisor while running in a virtualized environment (steal)
The times are reported in units of USER_HZ.
There may be other columns after this depending on the version of your kernel.
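A minimal sketch of reading those fields in C (parses only the aggregate first line; any extra columns appended by newer kernels are simply ignored):

#include <stdio.h>

int main(void)
{
    unsigned long long user, nice, system, idle, iowait, irq, softirq, steal;
    FILE *f = fopen("/proc/stat", "r");

    if (f == NULL)
        return 1;
    if (fscanf(f, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
               &user, &nice, &system, &idle, &iowait,
               &irq, &softirq, &steal) == 8)
    {
        printf("user=%llu nice=%llu system=%llu idle=%llu (USER_HZ ticks)\n",
               user, nice, system, idle);
    }
    fclose(f);
    return 0;
}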
Original answer:
You want times(2):
times() stores the current process times in the struct tms that buf points to. The struct tms is as defined in <sys/times.h>:
struct tms {
    clock_t tms_utime;  /* user time */
    clock_t tms_stime;  /* system time */
    clock_t tms_cutime; /* user time of children */
    clock_t tms_cstime; /* system time of children */
};
Idle time can be inferred by tracking elapsed wall clock time and subtracting the non-idle times reported by the call.
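A minimal usage sketch for times(2), covering the current process only (ticks converted to seconds via sysconf(_SC_CLK_TCK)):

#include <stdio.h>
#include <sys/times.h>
#include <unistd.h>

int main(void)
{
    struct tms t;
    long hz = sysconf(_SC_CLK_TCK);   /* clock ticks per second */

    if (times(&t) == (clock_t)-1)
        return 1;
    printf("user: %.2f s, system: %.2f s\n",
           (double)t.tms_utime / hz,
           (double)t.tms_stime / hz);
    return 0;
}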

delay generated from the for loop

// say delay_ms = 1
void Delay(const unsigned int delay_ms)
{
    unsigned int x, y;

    for (x = 0; x < delay_ms; x++)
    {
        for (y = 0; y < 120; y++)
            ;   /* empty busy-wait loop */
    }
}
I am trying to use the C code above for my 8051 microcontroller, and I would like to know what delay time it generates. I am using a 12 MHz oscillator.
This is a truly lousy way to generate a time delay.
If you look at the assembly listing generated by the compiler, you can look up, in the data sheet for the processor variant you are using, the clock cycles required for each instruction. Add these up and you will get the minimum delay time that this code will produce.
If you have interrupts enabled on your processor, then the delay time will be extended by the execution time of any interrupt handlers triggered during the delay. These add an essentially random amount of time to each call of the delay function, depending upon the frequency and processing requirements of each interrupt.
The 8051 is built with hardware timer/counters that are designed to produce a signal after a user-programmable delay. These are not affected by interrupt processing (it is true that the servicing of their trigger events may be delayed by another interrupt source) and so give a far more reliable duration for the delay.
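To make the hardware-timer alternative concrete, here is a sketch for a classic 8051 at 12 MHz, where one machine cycle takes 1 µs, so 1 ms corresponds to 1000 timer counts (i.e. a reload value of 65536 - 1000 = 0xFC18). It assumes a Keil-style toolchain providing <reg51.h>:

#include <reg51.h>   /* 8051 SFR definitions (compiler-specific header) */

void Delay_ms(unsigned int delay_ms)
{
    unsigned int i;

    TMOD = (TMOD & 0xF0) | 0x01;   /* Timer 0, mode 1 (16-bit timer) */
    for (i = 0; i < delay_ms; i++)
    {
        TH0 = 0xFC;                /* preload for 1000 counts = 1 ms */
        TL0 = 0x18;
        TF0 = 0;                   /* clear the overflow flag        */
        TR0 = 1;                   /* start Timer 0                  */
        while (!TF0)               /* wait for overflow              */
            ;
        TR0 = 0;                   /* stop Timer 0                   */
    }
}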
