syslogd
Denys Vlasenko
vda.linux at googlemail.com
Mon Jan 7 07:56:02 PST 2008
On Sunday 06 January 2008 12:42, Alex Landau wrote:
> As a side effect of my problems with syslogd described below, I found a bug (return value from shmat()
> checked for error incorrectly), which is fixed by the attached patch.
>
> I'm running on NOMMU and using busybox 1.7 (and checked the relevant changes in syslogd.c till trunk head - none seems to be related).
> The problem itself: Once in a while (about once a week), syslogd crashes. I traced the point of the crash,
> and it is in log_to_shmem() in the "memcpy(G.shbuf->data + old_tail, msg, k);" line (line no. 268 in trunk head).
So it is either a corrupted/wrong G.shbuf->data or a wrong old_tail.
old_tail is taken from G.shbuf->tail:
	old_tail = G.shbuf->tail;
and G.shbuf->tail is modified in only two places:
	G.shbuf->tail = 0; -- the obviously correct one
and
	G.shbuf->tail = new_tail;
Just to be paranoid, add a debug printout directly
after the above line:
	if (new_tail < 0 || new_tail >= G.shbuf->size)
		bb_error_msg_and_die("ERROR! new_tail:%d ...more debug data here...", new_tail, ...);
The G.shbuf->data case is even simpler: it should never change.
So you can save it in another static variable and check
at the beginning of log_to_shmem() whether it has changed.
Between them, these two checks should catch the bug in action.
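
To make the two checks concrete, here is a rough standalone sketch of them as one helper. The name shbuf_sanity() and its parameters are made up for illustration and are not in syslogd.c; the int32_t types for size and tail are an assumption as well. In syslogd itself you would die with a message on a nonzero result instead of returning a code.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of syslogd.c.
 * data:       current value of G.shbuf->data
 * saved_data: copy of that pointer taken right after a successful attach
 * size:       G.shbuf->size (type assumed to be int32_t)
 * new_tail:   value about to be stored into G.shbuf->tail
 * Returns 0 if everything looks sane, 1 if the data pointer moved,
 * 2 if new_tail is outside [0, size). */
int shbuf_sanity(const char *data, const char *saved_data,
		int32_t size, int32_t new_tail)
{
	if (data != saved_data) {
		fprintf(stderr, "ERROR! data:%p saved:%p\n",
			(const void *)data, (const void *)saved_data);
		return 1;
	}
	if (new_tail < 0 || new_tail >= size) {
		fprintf(stderr, "ERROR! new_tail:%d size:%d\n",
			(int)new_tail, (int)size);
		return 2;
	}
	return 0;
}
```

Calling it at the top of log_to_shmem() and again just before the tail is updated would show which of the two values goes bad first.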
> The pointer (G.shbuf->data + old_tail) points to a bizarre place (very far from the shm area shown in /proc/pid/maps), and naturally trying to write there crashes syslog. I thought the problem happens due to the circular buffer overflowing, but running "logger" several times in order to fill the whole buffer didn't crash syslog, so this does not seem to be related.
>
> I didn't succeed in finding the cause of the crash (memory corruption? something overwriting G.shbuf?), and would appreciate any help.
>
> Oh, and how did I find the "shmat() return value" bug? If, after syslogd crashes, I (or actually runsv) run it again, it crashes at the memset in ipcsyslog_init() trying to write to 0xFFFFFFFF. Maybe shmat() failed due to the crash of the previous instance of syslogd, for which the kernel probably hasn't released all resources (NOMMU, remember), maybe not... But shmget() succeeded and that's strange.
Fix is applied to svn, thanks!
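
For the archives, the general shape of this kind of check: shmat() reports failure by returning (void *) -1, not NULL, which on a 32-bit target is exactly the 0xFFFFFFFF address seen in the memset crash. The snippet below is only a sketch of the usual idiom, not the attached patch, and attach_or_null() is a made-up name.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical wrapper, not from syslogd.c: attach to the segment
 * identified by shmid and map the failure value to NULL so that
 * callers can use an ordinary NULL test. */
void *attach_or_null(int shmid)
{
	void *p = shmat(shmid, NULL, 0);

	/* shmat() signals failure with (void *) -1; testing
	 * "p == NULL" or "!p" would miss it. */
	if (p == (void *) -1)
		return NULL;
	return p;
}
```

With an invalid id such as -1 the call fails and the wrapper returns NULL, instead of handing back a pointer that crashes on first use.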
--
vda