[RFC] tail.c size reduction
Tito
farmatito at tiscali.it
Mon Sep 3 13:49:00 PDT 2007
On Monday 03 September 2007 21:13:16 Denys Vlasenko wrote:
> On Sunday 02 September 2007 20:20, Tito wrote:
> > Hi to all,
> > this is a drop in replacement for tail.c.
> > It is a result of a few sleepless nights of hacking
> > that started by looking at and trying
> > to understand mjn3's code.
> > At the end I rewrote tail following a different
> > italian creative and instinctive way.
> > bloat-o-meter is not that bad:
> >
> > * scripts/bloat-o-meter busybox_old busybox_unstripped
> > * function old new delta
> > * xmalloc_fgetc - 40 +40
> > * .rodata 122102 122134 +32
> > * header_fmt 13 - -13
> > * eat_num 37 - -37
> > * tail_read 126 - -126
> > * tail_main 1109 784 -325
> > * ------------------------------------------------------------------------------
> > * (add/remove: 1/3 grow/shrink: 1/1 up/down: 72/-501) Total: -429 bytes
> >
> > The functionality should be the same as the original, at least
> > that is what my little testing says, but as the size reduction
> > is that big I fear that i've overlooked something very stupid somewhere.
> > That's why i'm posting it to the list for auditing and inspection by
> > better coders than me that will see my obvious errors.
>
> do {
> /* With no FILE or When FILE is -, read standard input. */
> if (argc == 0 || LONE_DASH(*argv)) {
> opt |= ~0x1; /* unset FOLLOW */
> llist_add_to_end(&flist, stdin);
> llist_add_to_end(&nlist, (char *)bb_msg_standard_input);
> continue;
> }
> file = fopen_or_warn(*argv, "r");
> if (!file) {
> status = EXIT_FAILURE;
> continue;
> }
> /* Create lists of FILE* pointers and file names to reuse for -f FOLLOW */
> llist_add_to_end(&flist, file);
> llist_add_to_end(&nlist, *argv);
> } while (*++argv);
>
> If you run tail with no arguments, you go into first if and then "contunue".
> Then you do ++argv ... and happily continue into environment vector which
> is typically immediately follows argv[] in memory.
Fixed.
./busybox tail file1 file2 -
iteration
iteration
iteration
==> file1 <==
1
==> file2 <==
23456789a
==> standard input <==
root at localhost:~/Desktop/busybox# ./busybox tail -
iteration
==> standard input <==
root at localhost:~/Desktop/busybox# ./busybox tail
iteration
==> standard input <==
This for me is smaller than the cat solution.
do {
/* With no FILE or When FILE is -, read standard input. */
if (!*argv || LONE_DASH(*argv)) {
opt |= ~0x1; /* unset FOLLOW */
llist_add_to_end(&flist, stdin);
llist_add_to_end(&nlist, (char *)bb_msg_standard_input);
if (!*argv) break;
continue;
}
file = fopen_or_warn(*argv, "r");
if (!file) {
status = EXIT_FAILURE;
continue;
}
/* Create lists of FILE* pointers and file names to reuse for -f FOLLOW */
llist_add_to_end(&flist, file);
llist_add_to_end(&nlist, *argv);
} while (*++argv);
> cat. deals with it like this:
>
> static const char *const argv_dash[] = { "-", NULL };
>
> if (!*argv)
> argv = (char**) &argv_dash;
> do {
> if (LONE_DASH(*argv)) ...
>
>
>
> Size: I think this
>
> do {
> /* With no FILE or When FILE is -, read standard input. */
> if (argc == 0 || LONE_DASH(*argv)) {
>
> Will be smaller in this form:
>
> do {
> char *arg = *argv;
> /* With no FILE or When FILE is -, read standard input. */
> if (!arg || LONE_DASH(arg)) {
>
> arg is better than *argv since gcc can have doubts that *argv doesn't
> change across function calls. Help it.
>
> Why "!arg" instead of "argc == 0":
> You need arg for LONE_DASH(), thus gcc pulls it into register, thus
> !arg is as easy to evaluate as argc == 0, but does not require gcc
> to allocate another register for argc.
>
> But ultimate test in bloatcheck, of course.
>
>
> while (flist) {
> count = 0;
> /* if we are reading stdin print the header before reading as read can block */
> if ((FILE*)flist->data == stdin)
> tail_xprint_header(fmt + first_file, nlist->data);
>
> Like arg above, temporary FILE *curfile = (FILE*)flist->data;
> may be smaller, and will be more readable for sure.
>
> while ((line = (((COUNT_BYTES) ? xmalloc_fgetc : xmalloc_fgets)((FILE*)flist->data)))) {
>
> malloc for each byte? :(
To have something you can free as a line of xmalloc_fgets.
>
> /* Show if there was an error while reading */
> if (ferror((FILE*)flist->data)) {
> bb_perror_msg("%s", nlist->data);
> status = EXIT_FAILURE;
> }
> /* Create a list of the lines or chars */
> llist_add_to_end(&slist, line);
> /* If we exceed our -n or -c value remove one element from the head of the list */
> if (++count > n)
> free(llist_pop(&slist));
>
> what will happen if count wraps around?
Don't understand what the meaning of "wraps around is" ?
>
> "malloc for each byte" and count problems can be solved
> together: always store lines, not chars, and deal with
> COUNT_BYTES case by printing only a part of first line of output,
> so that total output size is correct.
How can you know that bytes < bytes_of_the_line, so to print only a part of the line ???
Don't you need two counters? one for the lines and one for the chars you're printing.
>
> }
> /* Print the header if needed */
> if ((FILE*)flist->data != stdin
> && ((VERBOSE && argc > 0) || (VERBOSE && slist && argc < 0))
> )
> tail_xprint_header(fmt + first_file, nlist->data);
>
> ((VERBOSE && argc > 0) || (VERBOSE && slist && argc < 0)) == VERBOSE && ((argc > 0) || (slist && argc < 0))
>
> /* Zero first file flag */
> first_file = 0;
> /* Print what we have read */
> while ((line = llist_pop(&slist))) {
> tail_xprint_header("%s", line);
> free(line);
> }
>
> /* Was the file truncated ? */
> if (!slist && lseek(fileno((FILE*)flist->data), 0, SEEK_END)
> < lseek(fileno((FILE*)flist->data), 0, SEEK_CUR)
> )
> bb_error_msg("file truncated");
>
> Hard to read, and racy (which llseek is done first?).
Have to fix this any way....i discovered
right now that in some circumstances it doesn't work as expected.
>
> if (!FOLLOW) {
> /* Close the current file */
> fclose_if_not_stdin(llist_pop(&flist));
> /* Move the head of the list to the next file name */
> llist_pop(&nlist);
> } else {
> /* FOLLOW */
> /* Sleep for approximately S seconds (default 1) between iterations */
> USE_FEATURE_FANCY_TAIL(sleep(xatou(str_s));)
> /* Move FILE* pointer from the top to the end fo the list */
> llist_add_to_end(&flist, llist_pop(&flist));
> /* Move filename from the top to the end fo the list */
> llist_add_to_end(&nlist, llist_pop(&nlist));
> /* Decrement argc as this is our indicator of the first complete iteration of the list */
> argc--;
> }
> }
>
>
> --
> vda
>
Thanks for your review, i learned something.... :-)
I'll try to fix it and improve it, and if the size savings
are still big enough i'll repost it.
Ciao,
Tito
More information about the busybox
mailing list