Debugging a linux kernel module using serial port


I'm trying to set up simple kernel debugging. I have a 3.2.6 kernel on a VM (ESXi), for which I declared a serial port (I manage to connect to the serial port).
Debugger:
Debuggee:
I followed this tutorial on how to set up gdb, and I seem to be getting replies when I start debugging (so the serial port seems to be fine); however, gdb receives junk and doesn't manage to debug anything.
I encountered a thread on LKML from 2010 that described a similar problem on a 2.6.35 kernel; however, a patch was committed for it (which I assume went into mainline), and I'm trying to debug 3.2.6.
This is what I'm getting from gdb :
and here's the output from the VM:
Can anyone give me a hint on how to solve it?
(If any mod could edit my message and transform all images to real images and links to real links that would be awesome too, the system didn't allow that and I have to edit my post 4-5 times before posting :/ )

Related

How do I turn off the console on an embedded system built with Yocto?

I am running Linux kernel 4.14.149 built by Yocto Zeus, and I am running 2019.07 U-boot. At the recommendation of our security team, I am trying to get rid of the Linux console. I am not worried about debugging (once I get this to work anyways); we have other ways of getting the system logs out of the machine, and this will not be done on software development boards. That mechanism is already in place and is tested working. We have an i.MX6 as our core (this is an embedded system), and we have dedicated UART5 to our console on dev boards.
I have tried a few different methods to do this. The first was to disable the framebuffer console kernel config (CONFIG_FRAMEBUFFER_CONSOLE). The primary issue with this approach is that it disabled the splash screen. We have a splash screen that is put up in U-boot (and it is displayed again by Linux), but Linux appears to reset the framebuffer or something when it is booting, resulting in the display flickering and being blank for a bit before our applications start, which was unacceptable (and is the reason we put the splash screen up in both U-boot and Linux).
I also tried just setting "console=" on our command line. This is close to what we want to achieve in that the console doesn't come out the UART anymore, but we see it start to appear on the display on top of the splash screen. I haven't found any way to fix that (I can upload a screenshot if desired).
Just eliminating the console parameter entirely didn't appear to work; the console still came out the UART. This is to be expected based on the serial console documentation, which says the kernel just uses the first available device.
I have tried commenting out the console initialization in main.c in the Linux source, which exploded rather quickly.
I tried setting it to be a netconsole (see Where do you send the kernel console on an embedded system?), but the splash screen still got overwritten, the same as when setting it to nothing.
The last thing I tried was setting it to a bogus device ("console=ttymxc9" on the Linux command line). While this appears to work (there is no data on the display or the UART), the system appears to stall (crash?) partway through bootup, and I am unable to get the logs (it stalls before our application service runs). I say stall because we have Linux configured for a heartbeat and we still get proper LED heartbeat behavior. However, none of the systemd services I added to our build appear to run (I added one to save the journalctl log after boot to a file on an external SD card for debugging purposes until I get this working).
At this point, I have run out of ideas on how to get rid of the console while keeping the splash screen intact. What is the proper way to disable the Linux console?

For kernel versions 5.11 and newer:
In the submenu "Character devices" under "Device Drivers" in make menuconfig, there is an option called "Null TTY driver" (CONFIG_NULL_TTY). Enable it and add console=ttynull to the kernel boot command line so that all console output is simply discarded.
You can also disable CONFIG_VT and CONFIG_UNIX98_PTYS, since you don't need to interact with your program via console at all.
For older kernels (like my 4.14):
You can add this support with the diffs at: https://lore.kernel.org/lkml/20190403131213.GA4246@kroah.com/T/ and then follow the instructions above.
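To confirm which console the running kernel was actually told to use, the command line can be inspected on the target. The following is a minimal sketch, not part of the original answer: the function name console_args and the sample command line are made up for illustration, and on a real system the input would come from /proc/cmdline.

```shell
#!/bin/sh
# console_args: print the value of every console= argument found on a
# kernel command line. The function name is hypothetical, for illustration.
console_args() {
    # $1 is intentionally unquoted so the shell splits it into arguments.
    for arg in $1; do
        case "$arg" in
            console=*) printf '%s\n' "${arg#console=}" ;;
        esac
    done
}

# Sample command line (an assumption, not taken from the board above).
sample_cmdline="root=/dev/mmcblk0p2 rootwait console=ttynull quiet"
console_args "$sample_cmdline"

# On the target itself: console_args "$(cat /proc/cmdline)"
```

For the sample line this prints ttynull. If the output on the target still names the UART, the bootloader is probably passing a different command line than the one you edited.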

More recent versions of Yocto use systemd and a unit called getty.target to start the serial port console. Disable it by running the following command (once):
systemctl mask getty.target

This answer may not fully fit your question, however, it could serve as a research source for other users, just like me. I use the commands below to temporarily turn the console (ttyS0) on and off.
systemctl stop serial-getty@ttyS0.service
and
systemctl start serial-getty@ttyS0.service

How to not stop RedHawk processing even if there is no request from RedHawk-IDE

I use RedHawk v2.1.0 to implement an AM demodulation stage with three components.
Platform --> Xilinx Zynq 7035 (ARM Cortex-A9 x2)
Operating System (OS) --> embedded Linux.
When I connect the RedHawk-IDE on an external PC over Ethernet and display the waveform between the components, an abnormal sound occurs.
If I then disconnect the LAN cable, the AM demodulation processing of RedHawk inside the ARM stops.
RedHawk inside the ARM appears to be waiting for requests from RedHawk-IDE on the external PC.
From this, it seems that abnormal noise will occur when requests from RedHawk-IDE on the external PC are delayed.
How can I keep RedHawk's AM demodulation processing inside the ARM running without stopping while connecting the RedHawk-IDE of the external PC and monitoring the waveform?
Environment is below.
CPU: Xilinx Zynq, ARM Cortex-A9, 2 cores, 600 MHz
OS: Embedded Linux, kernel 3.14 with real-time patch
Frame length: 5.333 ms (48 kHz sampling, 256 samples)

I have seen similar, if not identical issues, when running on an ARM board. Tracking down the exact issue may be difficult and in my experience hasn't been redhawk specific and has really been an issue with omniORB or its configuration. I believe one of the fixes for me was recompiling omniORB rather than using the omniORB package provided by my OS. (Which didn't make any sense to me at the time as I used the same flags & build process as the package maintainer)
First, I would confirm this issue is specific to ARM. If it's easy enough, set up the same components, waveforms, etc. on a second x86_64 host and validate that the problem does not occur there.
Second I would try a "quick fix" of setting the omniORB timeouts on the arm host using the /etc/omniORB.cfg file and setting:
clientCallTimeOutPeriod = 2000
clientConnectTimeOutPeriod = 2000
This will set a 2-second timeout (the values are in milliseconds) on CORBA interactions for both the connect portion and the call completion portion. In the past this has served as a quick fix for me, but it does not address the underlying issue. If this "fixes" it for you, then you've at least narrowed part of the issue down, and you could enable omniORB debugging using the traceLevel configuration option to find which call is timing out. See this sample configuration file for all options.
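One way to try these settings without editing /etc/omniORB.cfg in place is to build a scratch copy and point omniORB at it through the OMNIORB_CONFIG environment variable. This is only a sketch: the scratch file path and the traceLevel value of 10 are assumptions, and if the base file already sets any of these keys you should review the duplicates by hand.

```shell
#!/bin/sh
# Build a trial omniORB configuration in a scratch file.
base=/etc/omniORB.cfg
cfg="${TMPDIR:-/tmp}/omniORB-debug.cfg"   # assumed scratch location

# Start from the existing settings (naming service references and so on)
# when the base file is present; otherwise start with an empty file.
if [ -f "$base" ]; then
    cp "$base" "$cfg"
else
    : > "$cfg"
fi

cat >> "$cfg" <<'EOF'
# Timeouts are given in milliseconds.
clientCallTimeOutPeriod = 2000
clientConnectTimeOutPeriod = 2000
# Higher values are more verbose (10 is an assumed starting point).
traceLevel = 10
EOF

# omniORB reads the file named by OMNIORB_CONFIG when it is set.
export OMNIORB_CONFIG="$cfg"
```

Processes started from this shell afterwards inherit OMNIORB_CONFIG; processes that are already running keep the configuration they started with.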
If you want to dive into the underlying issues you'd need to see what the IDE and framework are doing when things lock up. With the IDE this is easy; simply find the PID of the java process and run kill -3 <pid> and a full stack trace will be printed in the terminal that is running the IDE. This can give you hints as to what calls are locked up. For the framework you'll need to use GDB and connect to the process in question and tell GDB to print the stack trace. You'd have to do some investigation ahead of time to determine which process is locking up.
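The two stack dumps described above could be wrapped as follows. This is a rough sketch: the function names are hypothetical, the PIDs in the usage comments are placeholders, and it assumes gdb is installed on the ARM target and that you have permission to attach to the process.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are made up for illustration.

# Ask a JVM (here: the IDE) to print a thread dump. SIGQUIT is signal 3,
# so this is equivalent to `kill -3 <pid>`. The dump appears on the JVM's
# own stdout, i.e. the terminal that launched the IDE.
dump_java_threads() {
    kill -s QUIT "$1"
}

# Attach gdb to a running native process, print a backtrace of every
# thread, then detach and exit.
dump_native_stacks() {
    gdb -p "$1" -batch -ex "thread apply all bt"
}

# Usage (PIDs are placeholders):
#   dump_java_threads 12345
#   dump_native_stacks 23456
```

Backtraces are far more readable when the framework and omniORB were built with debug symbols; without them gdb mostly shows raw addresses.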
If it ends up being an issue with the Java CORBA implementation on x86_64 talking with the C++ CORBA implementation on ARM you could also try launching / configuring / interacting with the ARM board via the REDHAWK python API from your x86_64 host. This may have better compatibility since they both use the same omniORB CORBA implementation.

Debugging (possibly) OpenCV related crash on Jetson TK1

What I am looking for: I need help debugging consistent system crashes on my Jetson TK1.
System: I am using a Jetson TK1 board from NVIDIA. Updated to 21.3.4 Grinch Kernel. All drivers installed, libopencv4tegra installed alongside ROS (using hacked deb packages to not overwrite openCV). Everything used to work perfectly in this exact setup.
When the crashes happen: I am running a VSLAM program, which uses a camera connected to the USB port. The program makes heavy use of OpenCV. The program used to run for over a month without problems in the current setup. Now, I am getting consistent system crashes which result in a total system freeze. When I am connected over ssh, I lose the connection. When I connect a monitor to see what happens on the system while it crashes, I can see everything freeze. The USB port also seems to turn off, since not even a USB mouse and keyboard work anymore post-crash. The Jetson stays on, though.
Crash Logs: I have tried looking into the /var/log/ logs, but none of them show any messages for when the crash happens.
I have run memtester before. It didn't return any bad memory. While running and crashing, the memory onboard is used at about 60-75% (as shown by "top"). CPU usage is around 60%.
The weird thing is that this exact setup has been running just like this for over a month now.
I need to know: are there any other logs I could find information about the crash in? How could I find out if this is related to a hardware failure or whether there's a software issue?
Thanks
-Marc

How can I save or longly see linux kernel BUG message from console?

I'm trying to develop a device driver for the Linux kernel, but I have a problem with debugging bug messages.
I'm working on a desktop (x64) running Linux (Ubuntu 14.04; I tried Ubuntu Server 14.04 too).
I'm using a tty console (Ctrl+Alt+F1) for testing, because the tty console always prints printk messages when I set the log level to 7.
My problem is that, first, I have a bug in the device driver I'm developing and, second, I can't find the actual cause of the bug because I can only see the last few bug messages, not all of them.
I tried using ssh for debugging (the test PC is the ssh server, and the ssh client pulls dmesg or ftrace printk messages over ssh), but the ssh server dies before the kernel bug message appears, so I can only see the bug message on my monitor via the tty console.
I also tried a smaller console font, but that was only a temporary solution.
So my question is: are there any debugging techniques that fit this problem?
For example, stopping the kernel from printing further messages after the first bug message, or redirecting the tty console to another PC using hardware or something similar.
Please help me with a solution.
Thanks,

You can redirect various system log streams (including those appearing in dmesg) to any terminal or file by modifying the rsyslog.conf.
Check if you already have a line similar to
kern.* /some/file
/some/file should then contain the messages sent to dmesg. If no such line exists, create one. If that doesn't work for some reason, replace kern.* with *.* and try again.
You could also push the contents of dmesg to a file with a command like dmesg > /var/log/dmesg, which could be run regularly by cron.
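If the periodic dump is scripted, it may be worth keeping each snapshot instead of overwriting a single file. The helper below is a minimal sketch under that assumption; the function name, the directory in the usage comment, and the file naming scheme are all made up for illustration.

```shell
#!/bin/sh
# snapshot_log DIR CMD [ARGS...]
# Run CMD and store its output in DIR under a timestamped name, so earlier
# snapshots are not overwritten. Prints the path of the file it wrote.
# The function name and file naming scheme are assumptions for illustration.
snapshot_log() {
    out_dir=$1
    shift
    mkdir -p "$out_dir" || return 1
    out_file="$out_dir/dmesg-$(date +%Y%m%d-%H%M%S).log"
    "$@" > "$out_file" && printf '%s\n' "$out_file"
}

# Usage on the test machine (usually needs root to write under /var/log):
#   snapshot_log /var/log/dmesg-snapshots dmesg
```

Keep in mind that a snapshot only contains what had been logged before it ran, so messages printed in the final moments before a hang can still be missing.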

Serial port comms only working in one direction

I am working with a SOM mounted on a carrier board running Ubuntu 14.04 with the generic 3.13 kernel.
While testing out the peripherals, I hit a problem with serial communication.
Basically, I can transmit data from the custom platform to an external Linux machine, but I cannot properly receive data sent from the external Linux machine to the custom platform.
Through my research I have messed with all sorts of BIOS settings, baud rates, hardware flow control, parity, etc. Nothing has worked. Most info I have found online just says "Make sure your baud rates and other settings match", and they do. It is not my first time working with Linux serial ports. But it is my first time encountering a problem like this.
Does anyone have any suggestions, recommendations, or has anyone ever seen an issue like this before?
More info: We are running a quad-core Intel Atom micro with a custom serial breakout interface. The serial port is at /dev/ttyS0.
EDIT (clarification):
If I set up a session in Picocom or Minicom, I can send characters from our custom platform (running Ubuntu 14.04) to another Linux PC (also running Ubuntu 14.04). However, if I try to send characters from the Linux PC to our custom board, I sometimes get nothing, and other times get unrecognized characters (they show up as bubbles with question marks in them).
I can also simply echo a string to /dev/ttyS0 on the custom platform and receive it on the Linux PC. I just can't get it to work the other way around.
