GC reads uninitialized values in SIOD?


I have found readings to uninitialized variables when using SIOD, and I believe it is due to having uninitialized variables in functions. I woud like to know if this is a bug, a feature or an unexpected behaviour.

Any comment will be very much welcome.

If there is a function that does not initialize its LISP variables (to NIL for instance) and then garbage is collected, the type of the uninitialized variable will be read returning an undefined (random) value.
int bar() {
 LISP foo1; // Not initialized
 // stuff not related to foo1
   // garbage collection is called
 // stuff 
From garbage collection routines:

In order to mark which cells are to be garbage collected, we sweep the stack (which may have uninitialized LISP objects).
void mark_locations_array(LISP *x,long n)
{int j;
 LISP p;
   {p = x[j];
    if (looks_pointerp(p))
The looks_pointerp decides if the current lisp object "p" has to be garbaged collected or not:
Notice that, as "p" may have not been initialized, checking NTYPEP(p, tc_free_cell) will be undefined:
long looks_pointerp(LISP p)
{long j;
 LISP h;
   if ((h = heaps[j]) &&
       (p >= h) &&
       (p < (h + heap_size)) &&
       (((((char *)p) - ((char *)h)) % sizeof(struct obj)) == 0) &&
This means that an uninitailized cell will randomly be marked for garbage collection. So gc_mark(p) with p uninitialized will be called. In this function, comments are mine
void gc_mark(LISP ptr) // We assume ptr uninitialized
{struct user_type_hooks *p;
 if NULLP(ptr) return; // Random return
 if ((*ptr).gc_mark) return; // Random return
 (*ptr).gc_mark = 1; // Ok
 switch ((*ptr).type) // Random case: Here any case could happen. 
                            //  The default case might call any function? Is this an issue?
   {case tc_flonum:
    case tc_cons:
      ptr = CDR(ptr);
      goto gc_mark_loop;
    case tc_symbol:
      ptr = VCELL(ptr);
      goto gc_mark_loop;
    case tc_closure:
      ptr = (*ptr).storage_as.closure.env;
      goto gc_mark_loop;
    case tc_subr_0:
    case tc_subr_1:
    case tc_subr_2:
    case tc_subr_2n:
    case tc_subr_3:
    case tc_subr_4:
    case tc_subr_5:
    case tc_lsubr:
    case tc_fsubr:
    case tc_msubr:
      p = get_user_type_hooks(TYPE(ptr));
      if (p->gc_mark)
    ptr = (*p->gc_mark)(ptr);}}
As I see it there is an unexpected behaviour in SIOD when LISP objects are not initialized. A workaround is to initialize to NIL all LISP objects.


publius wrote Jan 23, 2014 at 8:05 PM

If you could give a scheme-level source code example that shows a problem with this then it would provide some good insight into what is going on. Is that possible?

But just leaving the discussion at the level of C source code observations, I'm going to first clarify some of your terminology to help other readers.

The gc_mark does not mark something for garbage collection, it marks something to be preserved so that it will not be garbage collected in the sweep phase of the mark and sweep garbage collector.

SIOD has two garbage collectors built into it, you can choose one at runtime, and that can cause some confusion. One is stop and copy, the default is mark and sweep.

The looks_pointerp function the key part of the conservative garbage collection algorithm. You can google that phrase to good effect.

Now back to the subject of uninitialized LISP pointers on the stack. Indeed this might be random data.
You would also have a problem with long and double and struct and other values on the stack.
There might also be saved stack pointers, function call frame pointers, saved registers and all manner of junk on the stack. In fact a double might just look like a pointer into one of the heaps and could return true for looks_pointerp. The negative impact of this would be that an object in the heap which was supposed to be garbage collected by the sweep phase ends up being preserved even though no C code has a legitimate pointer to it.

Therefore, even if you carefully set LISP x = NULL; inside your C code you could still be subject to the bug you observe.

The key to anything dangerous (as contrasted with simply being conservative about the mark phase)
would be the behavior of get_user_type_hooks. That function cannot return any random pointers to functions, it can only return something that was previously set by set_gc_hooks.
And for that matter, the object in the heap must have been previously a legit object of the
particular type, and not previously marked or swept. The heap itself, unlike the stack,
is carefully managed. It doesn't even have binary objects in it (except for inline double floats), the pointers to strings or FILE * or other data are points back into regular C runtime library managed space.

Note that looks_pointerp also checks for the alignment of the pointer, so that it rejects pointers
that are not aligned correctly in the heap to the start of an object.

If I was going to code this today I might even be more careful, and make sure that everything
was an array index, not a pointer. So a LISP object might be broken up into two integers, a heap number and an index into the heap. The additional work this would require the CPU to do is probably
nothing to worry about in the CPU architectures of today. One might even store the CAR in one array
and the CDR in another array. Shades of the original FORTRAN implementation of LISP.

zeehio wrote Feb 3, 2014 at 12:04 AM

First of all, thank you very much for your attention, your corrections to my comments and your clarifications. They are very much appreciated.

I don't know if it is possible from an scheme code to get some uninitialized variable in the stack. I am not that familiar with it, but if I find how to do it, I will post it here.

Thanks to your comments I understand the "conservative" aspect of the garbage collector, and I agree with you that some of the objects may be preserved although they could be garbage collected. In any case this is not a serious issue.

The part of the code that I was most worried about is:
  p = get_user_type_hooks(TYPE(ptr));
  if (p->gc_mark)
       ptr = (*p->gc_mark)(ptr);}}

If ptr is uninitialized in the gc_mark function we can fall to the default case. In this case, ptr may be wrongly matched of being of a custom type, and then p will have a (controlled) gc_mark associated function.

I believe that we agree that (*p->gc_mark) will have the gc_mark function for a custom user type, and that function will have been properly set. However, its argument (ptr) will be a pointer to random data in the stack.

Here I want to explain a case that may lead to arbitrary code execution. Let's say that I want to add a SIOD interface to my application. In order to do that I do the following things:
  1. I define a custom new scheme type (to store objects with custom information related to my application).
  2. My custom scheme type contains a pointer to a function, so different LISP objects of my custom type may point to different functions. This is an apparently safe requisite of my application.
  3. I set the gc_hooks for my custom scheme type. My gc_mark function calls the function pointer stored in my custom object to know if it can or cannot be marked for garbage_collection.
If I am unaware of the issue I was mentioning in my previous post, I might assume that my gc_mark function will always have as an argument LISP objects of my custom scheme type.

Calling my gc_mark function with an arbitrary pointer as an argument will make the my program use a garbage value as the address of a function. This may call an arbitrary function leading to arbitrary code execution.

Maybe this chance is too remote to be considered, I don't think I could even write a proof of concept (unfortunately I am not that good...), or at least it would take a while for me to do it... but still I would advice in the documentation that the gc_mark function should be prepared to receive a random data pointer.