[RFC] tail.c size reduction

Tito farmatito at tiscali.it
Mon Sep 3 13:49:00 PDT 2007


On Monday 03 September 2007 21:13:16 Denys Vlasenko wrote:
> On Sunday 02 September 2007 20:20, Tito wrote:
> > Hi to all,
> > this is a drop in replacement for tail.c.
> > It is a result of a few sleepless nights of hacking
> > that started by looking at  and trying 
> > to understand mjn3's code.
> > At the end I rewrote tail following a different
> > italian creative and instinctive way. 
> > bloat-o-meter is not that bad:
> > 
> >  * scripts/bloat-o-meter busybox_old busybox_unstripped
> >  * function                                             old     new   delta
> >  * xmalloc_fgetc                                          -      40     +40
> >  * .rodata                                           122102  122134     +32
> >  * header_fmt                                            13       -     -13
> >  * eat_num                                               37       -     -37
> >  * tail_read                                            126       -    -126
> >  * tail_main                                           1109     784    -325
> >  * ------------------------------------------------------------------------------
> >  * (add/remove: 1/3 grow/shrink: 1/1 up/down: 72/-501)          Total: -429 bytes 
> > 
> > The functionality should be the same as the original, at least
> > that is what my little testing says, but as the size reduction
> > is that big I fear that i've overlooked something very stupid somewhere.
> > That's why i'm posting it to the list for auditing and inspection by
> > better coders than me that will see my obvious errors.
> 
>         do {
>                 /* With no FILE or When FILE is -, read standard input. */
>                 if (argc == 0 || LONE_DASH(*argv)) {
>                         opt |= ~0x1;  /* unset FOLLOW */
>                         llist_add_to_end(&flist, stdin);
>                         llist_add_to_end(&nlist, (char *)bb_msg_standard_input);
>                         continue;
>                 }
>                 file = fopen_or_warn(*argv, "r");
>                 if (!file) {
>                         status = EXIT_FAILURE;
>                         continue;
>                 }
>                 /* Create lists of FILE* pointers and file names to reuse for -f FOLLOW */
>                 llist_add_to_end(&flist, file);
>                 llist_add_to_end(&nlist, *argv);
>         } while (*++argv);
> 
> If you run tail with no arguments, you go into first if and then "contunue".
> Then you do ++argv ... and happily continue into environment vector which
> is typically immediately follows argv[] in memory.

Fixed.

./busybox tail file1 file2 -
iteration
iteration
iteration
==> file1 <==
1

==> file2 <==
23456789a

==> standard input <==

root at localhost:~/Desktop/busybox# ./busybox tail -
iteration
==> standard input <==

root at localhost:~/Desktop/busybox# ./busybox tail
iteration
==> standard input <==

This for me is smaller than the cat solution.

	do {
		/* With no FILE or When FILE is -, read standard input. */
		if (!*argv || LONE_DASH(*argv)) {
			opt |= ~0x1;  /* unset FOLLOW */
			llist_add_to_end(&flist, stdin);
			llist_add_to_end(&nlist, (char *)bb_msg_standard_input);
			if (!*argv) break;
			continue;
		}
		file = fopen_or_warn(*argv, "r");
		if (!file) {
			status = EXIT_FAILURE;
			continue;
		}
		/* Create lists of FILE* pointers and file names to reuse for -f FOLLOW */
		llist_add_to_end(&flist, file);
		llist_add_to_end(&nlist, *argv);

	} while (*++argv);

> cat. deals with it like this:
> 
>         static const char *const argv_dash[] = { "-", NULL };
> 
>         if (!*argv)
>                 argv = (char**) &argv_dash;
>         do {
>                 if (LONE_DASH(*argv)) ...
> 
> 
> 
> Size: I think this
> 
>         do {
>                 /* With no FILE or When FILE is -, read standard input. */
>                 if (argc == 0 || LONE_DASH(*argv)) {
> 
> Will be smaller in this form:
> 
>         do {
>                 char *arg = *argv;
>                 /* With no FILE or When FILE is -, read standard input. */
>                 if (!arg || LONE_DASH(arg)) {
> 
> arg is better than *argv since gcc can have doubts that *argv doesn't
> change across function calls. Help it.
> 
> Why "!arg" instead of "argc == 0":
> You need arg for LONE_DASH(), thus gcc pulls it into register, thus
> !arg is as easy to evaluate as argc == 0, but does not require gcc
> to allocate another register for argc.
> 
> But ultimate test in bloatcheck, of course.
> 
> 
>         while (flist) {
>                 count = 0;
>                 /* if we are reading stdin print the header before reading as read can block */
>                 if ((FILE*)flist->data == stdin)
>                         tail_xprint_header(fmt + first_file, nlist->data);
> 
> Like arg above, temporary FILE *curfile = (FILE*)flist->data;
> may be smaller, and will be more readable for sure.
> 
>                 while ((line = (((COUNT_BYTES) ? xmalloc_fgetc : xmalloc_fgets)((FILE*)flist->data)))) {
> 
> malloc for each byte? :(

To have something you can free as a line of xmalloc_fgets.

> 
>                         /* Show if there was an error while reading */
>                         if (ferror((FILE*)flist->data)) {
>                                 bb_perror_msg("%s", nlist->data);
>                                 status = EXIT_FAILURE;
>                         }
>                         /* Create a list of the lines or chars */
>                         llist_add_to_end(&slist, line);
>                         /* If we exceed our -n or -c value remove one element from the head of the list */
>                         if (++count > n)
>                                 free(llist_pop(&slist));
> 
> what will happen if count wraps around?

Don't understand what the meaning of "wraps around is" ? 

> 
> "malloc for each byte" and count problems can be solved
> together: always store lines, not chars, and deal with
> COUNT_BYTES case by printing only a part of first line of output,
> so that total output size is correct.

How can you know that bytes < bytes_of_the_line, so to print only a part of the line ???
Don't you need two counters? one for the lines and one for the chars you're printing.

> 
>                 }
>                 /* Print the header if needed */
>                 if ((FILE*)flist->data != stdin
>                         && ((VERBOSE && argc > 0) || (VERBOSE && slist &&  argc < 0))
>                 )
>                         tail_xprint_header(fmt + first_file, nlist->data);
> 
> ((VERBOSE && argc > 0) || (VERBOSE && slist &&  argc < 0)) == VERBOSE && ((argc > 0) || (slist && argc < 0))
> 
>                 /* Zero first file flag */
>                 first_file = 0;
>                 /* Print what we have read */
>                 while ((line = llist_pop(&slist))) {
>                         tail_xprint_header("%s", line);
>                         free(line);
>                 }
> 
>                 /* Was the file truncated ? */
>                 if (!slist && lseek(fileno((FILE*)flist->data), 0, SEEK_END)
>                         < lseek(fileno((FILE*)flist->data), 0, SEEK_CUR)
>                 )
>                         bb_error_msg("file truncated");
> 
> Hard to read, and racy (which llseek is done first?).

Have to fix this any way....i discovered 
right now that in some circumstances  it doesn't work as expected.

> 
>                 if (!FOLLOW) {
>                         /* Close the current file */
>                         fclose_if_not_stdin(llist_pop(&flist));
>                         /* Move the head of the list to the next file name */
>                         llist_pop(&nlist);
>                 } else {
>                         /* FOLLOW */
>                         /* Sleep for approximately S seconds (default 1) between iterations */
>                         USE_FEATURE_FANCY_TAIL(sleep(xatou(str_s));)
>                         /* Move FILE* pointer from the top to the end fo the list */
>                         llist_add_to_end(&flist, llist_pop(&flist));
>                         /* Move filename from the top to the end fo the list */
>                         llist_add_to_end(&nlist, llist_pop(&nlist));
>                         /* Decrement argc as this is our indicator of the first complete iteration of the list */
>                         argc--;
>                 }
>         }
> 
> 
> --
> vda
> 

Thanks for your review, i learned something.... :-)
I'll try to fix it and improve it, and if the size savings
are still big enough i'll repost it.

Ciao,
Tito



More information about the busybox mailing list