Linux kernel compilation error: undefined reference to `__udivdi3' & `__umoddi3' - linux

Here is the error I've got:
http://pastebin.com/VadUW6fy
drivers/built-in.o: In function `gem_rxmac_reset':
clkdev.c:(.text+0x212238): undefined reference to `__bad_udelay'
drivers/built-in.o: In function `divide.part.4':
clkdev.c:(.text.unlikely+0x7214): undefined reference to `__udivdi3'
clkdev.c:(.text.unlikely+0x7244): undefined reference to `__umoddi3'
I googled and found this patch: https://lkml.org/lkml/2008/4/7/82
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -174,6 +174,10 @@ static inline void timespec_add_ns(struct timespec *a, u64 ns)
{
ns += a->tv_nsec;
while(unlikely(ns >= NSEC_PER_SEC)) {
+ /* The following asm() prevents the compiler from
+ * optimising this loop into a modulo operation. */
+ asm("" : "+r"(ns));
+
ns -= NSEC_PER_SEC;
a->tv_sec++;
}
but it failed to apply (maybe because the file has changed in newer versions):
patching file linux/time.h
Hunk #1 FAILED at 174.
1 out of 1 hunk FAILED -- saving rejects to file linux/time.h.rej
Surprisingly, the file time.h.rej is not present!

I should have read more closely. The patch is for timespec_add_ns(), but the functions failing for you are gem_rxmac_reset() and divide.part.4, so it is probably unrelated to the patch you found. More likely, the compiler emitted calls to __udivdi3 and __umoddi3 (the libgcc helpers for 64-bit unsigned division and modulo), and the kernel, which does not link against libgcc, has no implementation of them on your target platform.
Do you have a Sun GEM or Apple GMAC NIC? If not, you can probably just disable that driver and get rid of the first error message.
For the second, you might need to implement a similar asm trick in the clkdev.c file -- when I skimmed my copy for a repeated subtraction operation I didn't spot one -- but maybe you can simply steal a newer clkdev.c or clkdev.h to fix this problem? (It's a long shot, there's only one entry in git log drivers/clk/clkdev.c.)
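The asm trick from the patch can be tried outside the kernel. The following is a minimal userspace sketch, not kernel code: struct ts_sketch and ts_add_ns are hypothetical names invented for illustration, and it assumes GCC or Clang. Whether clkdev.c has a loop that would benefit from the same barrier remains a guess.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical userspace stand-in for the kernel's struct timespec. */
struct ts_sketch {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/* Add ns nanoseconds to *a, normalising by repeated subtraction. */
static void ts_add_ns(struct ts_sketch *a, uint64_t ns)
{
    ns += (uint64_t)a->tv_nsec;
    while (ns >= NSEC_PER_SEC) {
        /* Empty asm that claims to modify ns. The compiler can no longer
         * prove that the loop is a plain division/modulo, so it should not
         * rewrite it into calls to __udivdi3/__umoddi3 (GCC/Clang syntax). */
        __asm__("" : "+r"(ns));
        ns -= NSEC_PER_SEC;
        a->tv_sec++;
    }
    a->tv_nsec = (int64_t)ns;
}
```

Calling ts_add_ns() on a value of 5 s and 900000000 ns with 2200000000 ns added carries three whole seconds into tv_sec. The barrier costs nothing at run time; it only hides the loop's structure from the optimiser.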

Related

Configure kern.log to give more info about a segfault

Currently I can find in kern.log entries like this:
[6516247.445846] ex3.x[30901]: segfault at 0 ip 0000000000400564 sp 00007fff96ecb170 error 6 in ex3.x[400000+1000]
[6516254.095173] ex3.x[30907]: segfault at 0 ip 0000000000400564 sp 00007fff0001dcf0 error 6 in ex3.x[400000+1000]
[6516662.523395] ex3.x[31524]: segfault at 7fff80000000 ip 00007f2e11e4aa79 sp 00007fff807061a0 error 4 in libc-2.13.so[7f2e11dcf000+180000]
(You see, apps causing segfault are named ex3.x, means exercise 3 executable).
Is there a way to ask kern.log to log the complete path? Something like:
[6...] /home/user/cclass/ex3.x[3...]: segfault at 0 ip 0564 sp 07f70 error 6 in ex3.x[4...]
So I can easily figure out from who (user/student) this ex3.x is?
Thanks!
Beco
That log message comes from the kernel in a fixed format: as per show_signal_msg, it includes only the executable's name without the path, truncated to 15 characters (TASK_COMM_LEN is 16, including the terminating NUL). See the other relevant lines for segmentation faults on non-x86 architectures.
As mentioned by Makyen, without significant changes to the kernel and a recompile, the message given to klogd which is passed to syslog won't have the information you are requesting.
I am not aware of any log transformation or injection functionality in syslog or klogd which would allow you to take the name of the file and run either locate or file on the filesystem in order to find the full path.
The best way to get the information you are looking for is to use crash interception software like apport or abrt or corekeeper. These tools store the process metadata from the /proc filesystem including the process's commandline which would include the directory run from, assuming the binary was run with a full path, and wasn't already in path.
The other more generic way would be to enable core dumps, and then to set /proc/sys/kernel/core_pattern to include %E, in order to have the core file name including the path of the binary.
The short answer is: No, it is not possible without making code changes and recompiling the kernel. The normal solution to this problem is to instruct your students to name their executable <student user name>_ex3.x so that you can easily have this information.
However, it is possible to get the information you desire from other methods. Appleman1234 has provided some alternatives in his answer to this question.
How do we know the answer is "Not possible to get the full path in the kern.log segfault messages without recompiling the kernel":
We look in the kernel source code to find out how the message is produced and if there are any configuration options.
The files in question are part of the kernel source. You can download the entire kernel source as an rpm package (or other type of package) for whatever version of linux/debian you are running from a variety of places.
Specifically, the output that you are seeing is produced from whichever of the following files is for your architecture:
linux/arch/sparc/mm/fault_32.c
linux/arch/sparc/mm/fault_64.c
linux/arch/um/kernel/trap.c
linux/arch/x86/mm/fault.c
An example of the relevant function from one of the files(linux/arch/x86/mm/fault.c):
/*
* Print out info about fatal segfaults, if the show_unhandled_signals
* sysctl is set:
*/
static inline void
show_signal_msg(struct pt_regs *regs, unsigned long error_code,
unsigned long address, struct task_struct *tsk)
{
if (!unhandled_signal(tsk, SIGSEGV))
return;
if (!printk_ratelimit())
return;
printk("%s%s[%d]: segfault at %lx ip %p sp %p error %lx",
task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
tsk->comm, task_pid_nr(tsk), address,
(void *)regs->ip, (void *)regs->sp, error_code);
print_vma_addr(KERN_CONT " in ", regs->ip);
printk(KERN_CONT "\n");
}
From that we see that the process name in the message comes from tsk->comm (where tsk is a struct task_struct *), and that the file name printed after " in " is derived from regs->ip (where regs is a struct pt_regs *).
Then from linux/include/linux/sched.h
struct task_struct {
...
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- initialized normally by setup_new_exec */
The comment makes it clear that the path for the executable is not stored in the structure.
For regs->ip where struct pt_regs *regs, it is defined in whichever of the following are appropriate for your architecture:
arch/arc/include/asm/ptrace.h
arch/arm/include/asm/ptrace.h
arch/arm64/include/asm/ptrace.h
arch/cris/include/arch-v10/arch/ptrace.h
arch/cris/include/arch-v32/arch/ptrace.h
arch/metag/include/asm/ptrace.h
arch/mips/include/asm/ptrace.h
arch/openrisc/include/asm/ptrace.h
arch/um/include/asm/ptrace-generic.h
arch/x86/include/asm/ptrace.h
arch/xtensa/include/asm/ptrace.h
From there we see that struct pt_regs is defining registers for the architecture. ip is just: unsigned long ip;
Thus, we have to look at what print_vma_addr() does. It is defined in mm/memory.c
/*
* Print the name of a VMA.
*/
void print_vma_addr(char *prefix, unsigned long ip)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
/*
* Do not print if we are in atomic
* contexts (in exception stacks, etc.):
*/
if (preempt_count())
return;
down_read(&mm->mmap_sem);
vma = find_vma(mm, ip);
if (vma && vma->vm_file) {
struct file *f = vma->vm_file;
char *buf = (char *)__get_free_page(GFP_KERNEL);
if (buf) {
char *p;
p = d_path(&f->f_path, buf, PAGE_SIZE);
if (IS_ERR(p))
p = "?";
printk("%s%s[%lx+%lx]", prefix, kbasename(p),
vma->vm_start,
vma->vm_end - vma->vm_start);
free_page((unsigned long)buf);
}
}
up_read(&mm->mmap_sem);
}
Which shows us that a path was available. We would need to check that it was the path, but looking a bit further in the code gives a hint that it might not matter. We need to see what kbasename() did with the path that is passed to it. kbasename() is defined in include/linux/string.h as:
/**
* kbasename - return the last part of a pathname.
*
* @path: path to extract the filename from.
*/
static inline const char *kbasename(const char *path)
{
const char *tail = strrchr(path, '/');
return tail ? tail + 1 : path;
}
Which, even if the full path is available prior to it, chops off everything except for the last part of a pathname, leaving the filename.
Thus, no runtime configuration option will make the kernel print the full pathname of the file in the segmentation fault messages you are seeing.
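The two truncations described above are easy to reproduce in userspace. This is a sketch only: kbasename_sketch mirrors the logic of the kernel helper quoted above, while comm_from_path and COMM_LEN_SKETCH are hypothetical names, with 16 assumed to match TASK_COMM_LEN.

```c
#include <stddef.h>
#include <string.h>

#define COMM_LEN_SKETCH 16  /* assumed to match the kernel's TASK_COMM_LEN */

/* Same logic as the kernel's kbasename(): keep what follows the last '/'. */
static const char *kbasename_sketch(const char *path)
{
    const char *tail = strrchr(path, '/');
    return tail ? tail + 1 : path;
}

/* Hypothetical helper: build a comm-style name from a path by dropping the
 * directory part and keeping at most COMM_LEN_SKETCH - 1 characters. */
static void comm_from_path(const char *path, char comm[COMM_LEN_SKETCH])
{
    const char *base = kbasename_sketch(path);
    size_t n = strlen(base);

    if (n > COMM_LEN_SKETCH - 1)
        n = COMM_LEN_SKETCH - 1;
    memcpy(comm, base, n);
    comm[n] = '\0';
}
```

Passing "/home/user/cclass/ex3.x" to kbasename_sketch() leaves only "ex3.x", which is why the student's directory never reaches the log line.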
NOTE: I've changed all of the links to kernel source to be to archives, rather than the original locations. Those links will get close to the code as it was at the time I wrote this, 2014-09. As should be no surprise, the code does evolve over time, so the code which is current when you're reading this may or may not be similar or perform in the way which is described here.

Inline C Varnish (VCL_deliver)

I am using Varnish 4.0.
My backend adds an HTTP header "x-count" to some responses.
I would like to log the value of "x-count" to a file, followed by a line break.
I assumed I should do it in vcl_deliver.
Here is what I have so far:
sub vcl_deliver {
if (resp.http.x-count-this:) {
set resp.http.X-infodbg = "xx";
C{
FILE *fp;
fp = fopen("/tmp/test.txt", "w+");
fputs(VRT_GetHdr(sp, HDR_OBJ, "\013x-count-this:"), fp);
fputs("\n", fp);
fclose(fp);
}C
}
}
Of course it doesn't work, and there are a couple of errors:
./vcl.gK2lu7uM.c: In function ‘VGC_function_vcl_deliver’:
./vcl.gK2lu7uM.c:1049:22: error: ‘sp’ undeclared (first use in this function)
./vcl.gK2lu7uM.c:1049:22: note: each undeclared identifier is reported only once for each function it appears in
./vcl.gK2lu7uM.c:1049:5: error: passing argument 2 of ‘VRT_GetHdr’ makes pointer from integer without a cast [-Werror]
./vcl.gK2lu7uM.c:330:7: note: expected ‘const struct gethdr_s *’ but argument is of type ‘int’
./vcl.gK2lu7uM.c:1049:5: error: too many arguments to function ‘VRT_GetHdr’
./vcl.gK2lu7uM.c:330:7: note: declared here
I have to say that I simply copy/pasted "sp" from some examples, but I have no idea where it comes from (I suppose the inline C was in a different context, so it was declared there but not in vcl_deliver).
So the probably undocumented differences between Varnish 4 and 3 in the above examples are :
VRT_GetHdr is now VRT_GetHdr(context, struct gethdr_s)
sp doesn't exist, but there is a "ctx" variable
Found this, there :
http://jan.bogutzki.de/Artikel/395/set-ttl-in-varnish-4.html
char *stuffid;
const struct gethdr_s hdr = { HDR_BERESP, "\015x-count-this:" };
stuffid = VRT_GetHdr(ctx, &hdr);
And now a different story: Varnish is crashing as soon as the backend sends back "count-this", but that is a different problem :p (my crappy C code probably)
I don't have Varnish 4.0 handy to test this out, but I was able to get your example working with Varnish 3.0. When I tried the VCL as is, I wasn't getting the exact error you are though. The first change:
if (resp.http.x-count-this:) {
needs to be:
if (resp.http.x-count-this) {
The colon should be left off of the header name when referred to this way. Next:
fputs(VRT_GetHdr(sp, HDR_OBJ, "\013x-count-this:"), fp);
needs to be:
fputs(VRT_GetHdr(sp, HDR_OBJ, "\015x-count-this:"), fp);
The length prefix in that string is written as a C octal escape: "\013" is 11 in decimal, whereas the header name "x-count-this:" is 13 characters long, and 13 decimal is 15 in octal. Making those changes got this to work for me. That being said, you may want to look into using open and fcntl instead of fopen, since without file locking I'm not sure what the effect of multiple requests contending for that file would be.
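The octal arithmetic is easy to get wrong by hand, so here is a small plain-C check, independent of Varnish. It assumes the convention used above, namely that the first byte holds the length of the header name including its trailing colon; hdr_spec_ok and make_hdr_spec are hypothetical helpers, not part of the Varnish API.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the first byte of spec equals the length of the name that
 * follows it (colon included), 0 otherwise. */
static int hdr_spec_ok(const char *spec)
{
    size_t declared = (unsigned char)spec[0];
    return declared != 0 && declared == strlen(spec + 1);
}

/* Build a length-prefixed spec from a name such as "x-count-this:".
 * Returns 0 on success, -1 if the name is empty, too long for one
 * length byte, or does not fit in out. */
static int make_hdr_spec(const char *name, char *out, size_t size)
{
    size_t n = strlen(name);

    if (n == 0 || n > 255 || size < n + 2)
        return -1;
    out[0] = (char)n;              /* length byte; 13 is '\015' in octal */
    memcpy(out + 1, name, n + 1);  /* name plus terminating NUL */
    return 0;
}
```

hdr_spec_ok("\015x-count-this:") accepts the corrected string and rejects the "\013" variant, and building the prefix with make_hdr_spec() avoids counting characters manually.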

Where are ioctl parameters (such as 0x1268 / BLKSSZGET) actually specified?

I am looking for a definitive specification describing the expected arguments and behavior of ioctl 0x1268 (BLKSSZGET).
This number is declared in many places (none of which contain a definitive reference source), such as linux/fs.h, but I can find no specification for it.
Surely, somebody at some point in the past decided that 0x1268 would get the physical sector size of a device and documented that somewhere. Where does this information come from and where can I find it?
Edit: I am not asking what BLKSSZGET does in general, nor am I asking what header it is defined in. I am looking for a definitive, standardized source that states what argument types it should take and what its behavior should be for any driver that implements it.
Specifically, I am asking because there appears to be a bug in blkdiscard in util-linux 2.23 (and 2.24) where the sector size is queried into a uint64_t, but the high 32 bits are left untouched since BLKSSZGET appears to expect a 32-bit integer. This leads to an incorrect sector size, incorrect alignment calculations, and failures in blkdiscard when it should succeed. So before I submit a patch, I need to determine, with absolute certainty, whether the problem is that blkdiscard should be using a 32-bit integer, or that the driver implementation in my kernel should be using a 64-bit integer.
Edit 2: Since we're on the topic, the proposed patch presuming blkdiscard is incorrect is:
--- sys-utils/blkdiscard.c-2.23 2013-11-01 18:28:19.270004947 -0400
+++ sys-utils/blkdiscard.c 2013-11-01 18:29:07.334002382 -0400
@@ -71,7 +71,8 @@
{
char *path;
int c, fd, verbose = 0, secure = 0;
- uint64_t end, blksize, secsize, range[2];
+ uint64_t end, blksize, range[2];
+ uint32_t secsize;
struct stat sb;
static const struct option longopts[] = {
@@ -146,8 +147,8 @@
err(EXIT_FAILURE, _("%s: BLKSSZGET ioctl failed"), path);
/* align range to the sector size */
- range[0] = (range[0] + secsize - 1) & ~(secsize - 1);
- range[1] &= ~(secsize - 1);
+ range[0] = (range[0] + (uint64_t)secsize - 1) & ~((uint64_t)secsize - 1);
+ range[1] &= ~((uint64_t)secsize - 1);
/* is the range end behind the end of the device ?*/
end = range[0] + range[1];
Applied to e.g. https://www.kernel.org/pub/linux/utils/util-linux/v2.23/.
The answer to "where is this specified?" does seem to be the kernel source.
I asked the question on the kernel mailing list here: https://lkml.org/lkml/2013/11/1/620
In response, Theodore Ts'o wrote (note: he mistakenly identified sys-utils/blkdiscard.c in his list but it's inconsequential):
BLKSSZGET returns an int. If you look at the sources of util-linux
v2.23, you'll see it passes an int to BLKSSZGET in
sys-utils/blkdiscard.c
lib/blkdev.c
E2fsprogs also expects BLKSSZGET to return an int, and if you look at
the kernel sources, it very clearly returns an int.
The one place it doesn't is in sys-utils/blkdiscard.c, where as you
have noted, it is passing in a uint64 to BLKSSZGET. This looks like
it's a bug in sys-util/blkdiscard.c.
He then went on to submit a patch¹ to blkdiscard at util-linux:
--- a/sys-utils/blkdiscard.c
+++ b/sys-utils/blkdiscard.c
@@ -70,8 +70,8 @@ static void __attribute__((__noreturn__)) usage(FILE *out)
int main(int argc, char **argv)
{
char *path;
- int c, fd, verbose = 0, secure = 0;
- uint64_t end, blksize, secsize, range[2];
+ int c, fd, verbose = 0, secure = 0, secsize;
+ uint64_t end, blksize, range[2];
struct stat sb;
static const struct option longopts[] = {
I had been hesitant to mention the blkdiscard tool in both my mailing list post and the original version of this SO question specifically for this reason: I know what's in my kernel's source, it's already easy enough to modify blkdiscard to agree with the source, and this ended up distracting from the real question of "where is this documented?".
So, as for the specifics, somebody more official than me has also stated that the BLKSSZGET ioctl takes an int, but the general question regarding documentation remained. I then followed up with https://lkml.org/lkml/2013/11/3/125 and received another reply from Theodore Ts'o (wiki for credibility) answering the question. He wrote:
> There was a bigger question hidden behind the context there that I'm
> still wondering about: Are these ioctl interfaces specified and
> documented somewhere? From what I've seen, and from your response, the
> implication is that the kernel source *is* the specification, and not
> document exists that the kernel is expected to comply with; is this
> the case?
The kernel source is the specification. Some of these ioctl are
documented as part of the linux man pages, for which the project home
page is here:
https://www.kernel.org/doc/man-pages/
However, these document existing practice; if there is a discrepancy
between what is in the kernel has implemented and the Linux man pages,
it is the Linux man pages which are buggy and which will be changed.
That is man pages are descriptive, not perscriptive.
I also asked about the use of "int" in general for public kernel APIs, his response is there although that is off-topic here.
Answer: So, there you have it, the final answer is: The ioctl interfaces are specified by the kernel source itself; there is no document that the kernel adheres to. There is documentation to describe the kernel's implementations of various ioctls, but if there is a mismatch, it is an error in the documentation, not in the kernel.
¹ With all the above in mind, I want to point out that an important difference in the patch Theodore Ts'o submitted, compared to mine, is the use of "int" rather than "uint32_t" -- BLKSSZGET, as per kernel source, does indeed expect an argument that is whatever size "int" is on the platform, not a forced 32-bit value.
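To see why the variable's type matters, the effect can be simulated without a block device. In this sketch fake_blksszget is a hypothetical stand-in for the kernel side of the ioctl (it writes exactly sizeof(int) bytes, as the kernel source is said to do above), and the other function names are made up for illustration. It assumes a platform where int is narrower than 64 bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel side of BLKSSZGET: it stores an int
 * at the caller's address and touches nothing beyond sizeof(int) bytes. */
static void fake_blksszget(void *arg, int sector_size)
{
    memcpy(arg, &sector_size, sizeof sector_size);
}

/* Buggy pattern: the result lands in a uint64_t whose remaining bytes keep
 * whatever was there before (0xAA here simulates stack garbage). */
static uint64_t query_into_u64(int sector_size)
{
    uint64_t secsize;

    memset(&secsize, 0xAA, sizeof secsize);
    fake_blksszget(&secsize, sector_size);
    return secsize;
}

/* Fixed pattern: receive into an int, widen afterwards. */
static uint64_t query_into_int(int sector_size)
{
    int secsize = 0;

    fake_blksszget(&secsize, sector_size);
    return (uint64_t)secsize;
}

/* Round off up to a multiple of secsize (a power of two), as blkdiscard does. */
static uint64_t align_up(uint64_t off, uint64_t secsize)
{
    return (off + secsize - 1) & ~(secsize - 1);
}
```

With a 512-byte sector, query_into_int() yields 512 and align_up(1000, 512) gives 1024, while query_into_u64() returns a value polluted by the untouched bytes, which is the kind of bogus sector size that broke the alignment arithmetic.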

Fixing 'no symbolic type information' from dtrace in Linux?

Just documenting this: (self-answer to follow)
I'm aware that Sun's dtrace is not packaged for Ubuntu due to licensing issues; so I downloaded it and built it from source on Ubuntu - but I'm having an issue pretty much like the one in Simple dtraces not working · Issue #17 · dtrace4linux/linux · GitHub; namely loading of the driver seems fine:
dtrace-20130712$ sudo make load
tools/load.pl
23:20:31 Syncing...
23:20:31 Loading: build-2.6.38-16-generic/driver/dtracedrv.ko
23:20:34 Preparing symbols...
23:20:34 Probes available: 364377
23:20:44 Time: 13s
... however, if I try to run a simple script, it fails:
$ sudo ./build/dtrace -n 'BEGIN { printf("Hello, world"); exit(0); }'
dtrace: invalid probe specifier BEGIN { printf("Hello, world"); exit(0); }: "/path/to/src/dtrace-20130712/etc/sched.d", line 60: no symbolic type information is available for kernel`dtrace_cpu_id: Invalid argument
As per the issue link above:
(ctf requires a private and working libdwarf lib - most older releases have broken versions).
... I then built libdwarf from source, and then dtrace based on it (not trivial, requires manually finding the right placement of symlinks); and I still get the same failure.
Is it possible to fix this?
Well, after a trip to gdb, I figured that the problem occurs in dtrace's function dt_module_getctf (called via dtrace_symbol_type and, I think, dt_module_lookup_by_name). In it, I noticed that most calls propagate the attribute/variable dm_name = "linux"; but when the failure occurs, I'd get dm_name = "kernel"!
Note that original line 60 from sched.d is:
cpu_id = `dtrace_cpu_id; /* C->cpu_id; */
Then I found thr3ads.net - dtrace discuss - accessing symbols without type info [Nov 2006]; where this error message is mentioned:
dtrace: invalid probe specifier fbt::calcloadavg:entry {
printf("CMS_USER: %d, CMS_SYSTEM: %d, cpu_waitrq: %d\n",
`cpu0.cpu_acct[0], `cpu0.cpu_acct[1], `cpu0.cpu_waitrq);}: in action
list: no symbolic type information is available for unix`cpu0: No type
information available for symbol
So:
on that system, the request `cpu0.cpu_acct[0] got resolved to unix`cpu0;
and on my system, the request `dtrace_cpu_id got resolved to kernel`dtrace_cpu_id.
And since "The backtick operator is used to read the
value of kernel variables, which will be specific to the running kernel." (howto measure CPU load - DTrace General Discussion - ArchiveOrange), I thought maybe explicitly "casting" this "backtick variable" to linux would help.
And indeed it does - only a small section of sched.d needs to be changed to this:
translator cpuinfo_t < dtrace_cpu_t *C > {
cpu_id = linux`dtrace_cpu_id; /* C->cpu_id; */
cpu_pset = -1;
cpu_chip = linux`dtrace_cpu_id; /* C->cpu_id; */
cpu_lgrp = 0; /* XXX */
/* cpu_info = *((_processor_info_t *)`dtrace_zero); /* ` */ /* XXX */
};
inline cpuinfo_t *curcpu = xlate <cpuinfo_t *> (&linux`dtrace_curcpu);
... and suddenly - it starts working!:
dtrace-20130712$ sudo ./build/dtrace -n 'BEGIN { printf("Hello, world"); exit(0); }'
dtrace: description 'BEGIN ' matched 1 probe
CPU ID FUNCTION:NAME
1 1 :BEGIN Hello, world
PS:
Protip 1: NEVER do dtrace -n '::: { printf("Hello"); }' - this means "do a printf on each and every kernel event", and it will completely freeze the kernel; not even CTRL-Alt-Del will work!
Protip 2: If you want to use DTRACE_DEBUG as in Debugging DTrace, use sudo -E:
dtrace-20130712$ DTRACE_DEBUG=1 sudo -E ./build/dtrace -n 'BEGIN { printf("Hello, world"); exit(0); }'
libdtrace DEBUG: reading kernel .ctf: /path/to/src/dtrace-20130712/build-2.6.38-16-generic/linux-2.6.38-16-generic.ctf
libdtrace DEBUG: opened 32-bit /proc/kallsyms (syms=75761)
...

c2664 in Visual Studio 2012 when using make_pair

I dug up an old project and wanted to compile it, but received several errors, a few of them being C2664:
error C2664: 'std::make_pair' : cannot convert parameter 1 from 'CUser *' to 'CUser *&&'
error C2664: 'std::make_pair' : cannot convert parameter 1 from 'unsigned long' to ' unsigned long &&'
The relevant code parts are:
//typedef for the userdata map
typedef std::map<unsigned long, std::pair<CUser*,userstatus*>> UserDataMapType;
//...
Inc::incret CUserManager::AddUser(unsigned long ID, CUser* pUser, userstatus* pUserStatus)
{
//...
std::pair<UserDataMapType::iterator, bool> ret = m_mapUserData.insert(std::make_pair<unsigned long, std::pair<CUser*, userstatus*>>(ID, std::make_pair<CUser*, userstatus*>(pUser, pUserStatus)));
//...
}
I tried to make the function parameters const, but that did not help.
It did compile just fine in VS2010.
Please help me find what causes this and how to solve it.
make_pair() has been changed in VS2012 to support a new C++11 feature called move semantics and I suspect that explicitly specifying the types for make_pair() is getting in the way.
Remember that make_pair() does not need any template parameters to be explicitly specified. It deduces them from the type of each argument.
Try removing the explicit template arguments from both calls to make_pair() like so...
std::pair<UserDataMapType::iterator, bool> ret = m_mapUserData.insert(std::make_pair(ID, std::make_pair(pUser, pUserStatus)));
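In C++11, make_pair() takes its arguments as T1&& and T2&&. When the types are deduced, these act as forwarding references that also accept lvalues, but once you spell out T1 = CUser*, the parameter becomes CUser*&&, a plain rvalue reference that cannot bind to the lvalue pUser. That is exactly what C2664 reports. Explicit template arguments were accepted before VS2012 because the older make_pair() did not have this restriction. You'll want to read up on move semantics, since you now have a compiler that supports it.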
