Here is a part of my code.
for (i = 0; i < 29; i++)
{
    *reg0  = (unsigned int)((src[i][0] << 24) | (src[i][1] << 16) | (src[i][2] << 8) | src[i][3]);
    printk("");
    *reg4  = (unsigned int)((src[i][4] << 24) | (src[i][5] << 16) | (src[i][6] << 8) | src[i][7]);
    printk("");
    *reg8  = (unsigned int)((src[i][8] << 24) | (src[i][9] << 16) | (src[i][10] << 8) | src[i][11]);
    printk("");
    *reg12 = (unsigned int)((src[i][12] << 24) | (src[i][13] << 16) | (src[i][14] << 8) | 0);
    printk("");
}
It is a for loop in a module built under PetaLinux. It works well after insmod and running the user application, but when I comment out the printk("") calls the module no longer works and appears to hang, as if lost in recursion. I need to remove the printk calls because they add a lot of execution time.
Any help would be appreciated.
Regards.
It's well-documented how to use ftrace to find the function graph starting from a certain function, e.g.
# echo nop > current_tracer
# echo 100 > max_graph_depth
# echo ksys_dup3 > set_graph_function
# echo function_graph > current_tracer
# cat trace
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 7)               |  ksys_dup3() {
 7)   0.533 us    |    expand_files();
 7)               |    do_dup2() {
 7)               |      filp_close() {
 7)   0.405 us    |        dnotify_flush();
 7)   0.459 us    |        locks_remove_posix();
 7)               |        fput() {
 7)               |          fput_many() {
 7)               |            task_work_add() {
 7)   0.533 us    |              kick_process();
 7)   1.558 us    |            }
 7)   2.475 us    |          }
 7)   3.382 us    |        }
 7)   6.122 us    |      }
 7)   7.104 us    |    }
 7) + 10.763 us   |  }
But this only returns the function graph starting at ksys_dup3. It omits the full function graph that leads to ksys_dup3:
 7)               |  el0_svc_handler() {
 7)               |    el0_svc_common() {
 7)               |      __arm64_sys_dup3() {
 7)               |        ksys_dup3() {
 7)   0.416 us    |          expand_files();
 7)               |          do_dup2() {
 7)               |            filp_close() {
 7)   0.405 us    |              dnotify_flush();
 7)   0.406 us    |              locks_remove_posix();
 7)               |              fput() {
 7)   0.416 us    |                fput_many();
 7)   1.269 us    |              }
 7)   3.819 us    |            }
 7)   4.746 us    |          }
 7)   6.475 us    |        }
 7)   7.381 us    |      }
 7)   8.362 us    |    }
 7)   9.205 us    |  }
Is there a way to use ftrace to filter a full function graph?
I'd say all of ftrace is well documented (here). There is no way to do what you want though, because the filtering ability of ftrace is implemented through triggers that can start/stop tracing exactly when an event happens. Setting set_graph_function will trigger a trace start when the function is entered and a trace stop when it's exited. Since when you enter el0_svc_handler you cannot know beforehand if ksys_dup3 is going to be called, there is no way to write a trigger to start the trace on such a condition (that would require being able to "predict the future").
You can, however, do fine-grained filtering with the set_ftrace_pid parameter: write a program that only does what you need and run a full trace on that program alone, or filter on the parent el0_svc_handler instead.
In case you don't actually know which parent functions are being called before the one you want, you can do a single targeted run with the func_stack_trace option enabled to get an idea about what the entire call chain is, and then use that output to set the appropriate filter for a "normal" run.
For example, let's say I want to trace do_dup2, but as you say I want to start from one of the parent functions. I'll first write a dummy test program like the following one:
#include <stdio.h>
#include <unistd.h>

int main(void) {
    printf("My PID is %d, press ENTER to go...\n", getpid());
    getchar();
    dup2(0, 1);
    return 0;
}
I'll compile and start the above program in one shell:
$ ./test
My PID is 1234, press ENTER to go...
Then in another root shell, configure tracing as follows:
cd /sys/kernel/tracing
echo function > current_tracer
echo 1 > options/func_stack_trace
echo do_dup2 > set_ftrace_filter
echo PID_OF_THE_PROGRAM_HERE > set_ftrace_pid
Note: echo do_dup2 > set_ftrace_filter is very important otherwise you will trace and dump the stack for every single kernel function, which would be a huge performance hit and can make the system unresponsive. For the same reason doing echo PID_OF_THE_PROGRAM_HERE > set_ftrace_pid is also important if you don't want to trace every single dup2 syscall done by the system.
Now I can do one trace:
echo 1 > tracing_on
# ... press ENTER in the other shell ...
echo 0 > tracing_on
cat trace
And the result will be something like this (my machine is x86 so the syscall entry functions have different names):
# tracer: function
#
# entries-in-buffer/entries-written: 2/2 #P:20
#
#                                _-----=> irqs-off
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| /     delay
#           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
#              | |         |   ||||      |         |
           a.out-6396    [000] .... 1455.370160: do_dup2 <-__x64_sys_dup2
           a.out-6396    [000] .... 1455.370174: <stack trace>
=> 0xffffffffc322006a
=> do_dup2
=> __x64_sys_dup2
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
I can now see the entire call chain that led to do_dup2 (printed in reverse order above) and then do another normal trace from one of the parent functions (remember to disable options/func_stack_trace first).
I'm trying to run cargo fix on a project that uses sqlx and am getting the following error:
error: proc macro panicked
--> src/twitter/domain/user.rs:54:5
|
54 | / sqlx::query!(
55 | | r#"
56 | | INSERT INTO users
57 | | (id, created_at,
... |
84 | | user["public_metrics"]["tweet_count"].as_i64(),
85 | | )
| |_____^
|
= help: message: Lazy instance has previously been poisoned
= note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)
...for every instance of the sqlx macro in my code. The weird thing is that it used to work just fine, but for some reason it doesn't anymore.
The only mention of the error on Google that I found is here, but I don't think it's relevant.
What might be wrong?
I have a simple SConstruct file to build the google test library with MinGW:
env = Environment(platform='posix') # necessary to use gcc and not MS
env.Append(CPPPATH=['googletest/'])
env.Append(CCFLAGS=[('-isystem', 'googletest/include/'), '-pthread'])
obj = env.Object(source='googletest/src/gtest-all.cc')
# linking skipped due to error search
# env.Append(LINKFLAGS=['-rv'])
# bin = env.StaticLibrary(target='libgtest', source=[obj])
The script resides in the main googletest\ folder. When running it - with or without linking - the output is this:
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
g++ -o googletest\src\gtest-all.o -c -isystem googletest/include/ -pthread -Igoogletest googletest\src\gtest-all.cc
scons: *** [googletest\src\gtest-all.o] The system cannot find the file specified
+-.
+-googletest
| +-googletest\src
| +-googletest\src\gtest-all.cc
| +-googletest\src\gtest-all.o
| | +-googletest\src\gtest-all.cc
| | +-googletest\src\gtest-death-test.cc
| | +-googletest\src\gtest-filepath.cc
| | +-googletest\src\gtest-port.cc
| | +-googletest\src\gtest-printers.cc
| | +-googletest\src\gtest-test-part.cc
| | +-googletest\src\gtest-typed-test.cc
| | +-googletest\src\gtest.cc
| | +-googletest\src\gtest-internal-inl.h
| +-googletest\src\gtest-death-test.cc
| +-googletest\src\gtest-filepath.cc
| +-googletest\src\gtest-internal-inl.h
| +-googletest\src\gtest-port.cc
| +-googletest\src\gtest-printers.cc
| +-googletest\src\gtest-test-part.cc
| +-googletest\src\gtest-typed-test.cc
| +-googletest\src\gtest.cc
| +-googletest\src\libgtest-all.a
| +-googletest\src\gtest-all.o
| +-googletest\src\gtest-all.cc
| +-googletest\src\gtest-death-test.cc
| +-googletest\src\gtest-filepath.cc
| +-googletest\src\gtest-port.cc
| +-googletest\src\gtest-printers.cc
| +-googletest\src\gtest-test-part.cc
| +-googletest\src\gtest-typed-test.cc
| +-googletest\src\gtest.cc
| +-googletest\src\gtest-internal-inl.h
+-SConstruct
scons: building terminated because of errors.
I also tried to build the library in one line: env.StaticLibrary(source='googletest/src/gtest-all.cc') - the result is the same.
Just executing the actual g++ call by hand gives me the object file I want.
What confuses me is that SCons should see the object file as an artifact it creates itself. I am wondering why it tries to use the file before it is finished. So what am I missing here? How can I make SCons wait until compiling is done?
BTW: I have some experience using SCons and have tweaked a script once in a while, but I do not have profound knowledge of it.
Versions used: SCons 3.0.1, Python 3.6.3, MinGW 7.3.0
Does this work?
env = Environment(tools=['mingw','gnulink','ar']) # You should specify the tools
env.Append(CPPPATH=['googletest/'])
env.Append(CCFLAGS=[('-isystem', 'googletest/include/'), '-pthread'])
obj = env.Object(source='googletest/src/gtest-all.cc')
# linking skipped due to error search
# env.Append(LINKFLAGS=['-rv'])
# bin = env.StaticLibrary(target='libgtest', source=[obj])
I am currently trying to learn the syntax of Rust by solving little tasks. I compare the execution time as sanity-checks if I am using the language the right way.
One task is:
Create an array of 10000000 random integers in the range 0 - 1000000000
Sort it and measure the time
Print the time for sorting it
I got the following results:
| # | Language | Speed | LOCs |
| --- | -------------------- | ------ | ---- |
| 1 | C++ (with -O3) | 1.36s | 1 |
| 2 | Python (with PyPy) | 3.14s | 1 |
| 3 | Ruby | 5.04s | 1 |
| 4 | Go | 6.17s | 1 |
| 5 | C++ | 7.95s | 1 |
| 6 | Python (with Cython) | 11.51s | 1 |
| 7 | PHP | 36.28s | 1 |
Now I wrote the following Rust code:
rust.rs
extern crate rand;
extern crate time;

use rand::Rng;
use time::PreciseTime;

fn main() {
    let n = 10000000;
    let mut array = Vec::new();
    let mut rng = rand::thread_rng();
    for _ in 0..n {
        //array[i] = rng.gen::<i32>();
        array.push(rng.gen::<i32>());
    }

    // Sort
    let start = PreciseTime::now();
    array.sort();
    let end = PreciseTime::now();
    println!("{} seconds for sorting {} integers.", start.to(end), n);
}
with the following Cargo.toml:
[package]
name = "hello_world" # the name of the package
version = "0.0.1" # the current version, obeying semver
authors = [ "you@example.com" ]
[[bin]]
name = "rust"
path = "rust.rs"
[dependencies]
rand = "*" # Or a specific version
time = "*"
I compiled it with cargo run rust.rs and ran the binary. It outputs
PT18.207168155S seconds for sorting 10000000 integers.
Note that this is much slower than Python. I guess I am doing something wrong. (The complete code of rust and of the other languages is here if you are interested.)
Why does it take so long to sort with Rust? How can I make it faster?
I tried your code on my computer; running it with cargo run gives:
PT11.634640178S seconds for sorting 10000000 integers.
And with cargo run --release (turning on optimizations) gives:
PT1.004434739S seconds for sorting 10000000 integers.
DCPU-16 (the CPU in Notch's new game) doesn't seem to have any signed IF/MUL/DIV instructions.
Is there still a way to do signed arithmetic/control-flow that isn't super incredibly painful?
The new DCPU spec, published by Notch the other day, does have signed arithmetic instructions:
Unsigned | Signed
==================
MUL | MLI
DIV | DVI
SHR | ASR
IFG | IFA
IFL | IFU