17. Compiled LanguagesAlthough assembly language provides a more convenient interface to the instruction set architecture of our general-purpose computer than the binary instructions it executes, it is a tedious language for most programming tasks. Assembly language is deliberately low level: its elements correspond directly to actual machine instructions, guaranteeing the programmer access to the complete feature set of the machine. As an engineering abstraction, however, assembly language has notable deficiencies; these stem from the fact that assembly language is designed around the capabilities of the hardware, rather than the needs of the programmer. In this Chapter, we indulge in a brief glimpse of higher-level programming languages.
Programming Language Features
17.1. Programming Language FeaturesThere are many different higher-level languages available for modern computer systems, with differing features and characteristics. Generally these languages are platform independent, meaning that they can be made to run on most any computer with sufficient resources -- in contrast to the ISA-specific assembly language. Beyond this commonality, general-purpose programming languages may advertise a number of other attractions, such as:
- safety features that guarantee your program won't mistake floating-point numbers for executable instructions, overrun buffers, and other common sources of unpredictable behavior; often involves a strong type system.
- garbage-collected memory to relieve the programmer from tedious allocation/deallocation bookkeeping;
- first-class procedures that can be passed as arguments, returned as values, and stored in data;
- objects and inheritance, popular abstraction mechanisms in contemporary software;
- functional programming support including constructs like comprehensions, combinators, and monads to support programming disciplines that avoid imperatives and side-effects.
17.1.1. Dangling ReferencesAlthough the stack is an efficient and effective storage allocation mechanism for many programming constructs, there are a few patterns that run afoul of the requirements of stack discippline.
int *pdeclares a pointer to an int value: we expect the runtime value of
pto be the address of a word in main memory containing a 32-bit integer. Note that this declaration is outside of any procedure definition; consequently, it is a statically-allocated global variable. No initial value is specified; the value of
pis to be set by a call to the procedure
h, defined subsequently. The definition of
h(x)includes the declaration of a local variable,
yis local to the procedure
h, it will be allocated dynamically from the stack on entry to
h, and deallocated when
pto the address of
y-- the using the C syntax
p = &y;. The final two lines include a call to
h, which allocates
yand initializes it to 30, followed by an attempt to print the value pointed to by the pointer
p. What value do we expect to be passed to
pis set to the address of a variable
ywhose lifetime is temporary. The attempt to dereference
pin the call to
hhas returned and
yis deallocated: it contains the address in some portion of the unallocated portion of the stack region. Such problems stem from the fact that the program keeps a pointer
pto a value
ybeyond the lifetime of the latter: stack discipline dictates that
ybe deallocated when the call to
hreturns, rendering the address stored in
pinvalid. More ambitous languages solve this problem by detecting cases where storage must be allocated from a more general storage mechanism (a heap), which avoids the constraints of stack discipline.
Procedure Values and Closures
17.1.2. Procedure Values and ClosuresA closely related class of problems involves treating procedures as "first-class" values -- the ability to pass them around like integer or character data, as arguments to (and values returned from) procedures, stored in data structures, etc.
make_addxis intended to return a new procedure, given an integer
x, that simply adds
xto its argument. It does so by defining a local procedure
addx, and returning that procedure as the value of the call to
make_addx(x). The final two lines include a call to
make_addx(3), expected to produce a new procedure
add3that simply adds 3 to its supplied integer argument; the call to
add3(4), in a more perfect world, should return the value 7. Again, however, this expectation exceeds the limited model of procedure values supported by C. A problem is that the definition of the returned
add3function involves the argument
x, which no longer exists after
make_addx(3)returns; thus we have another dangling reference problem. Variants of this implementation challange involve passing procedures as arguments rather than returned values; while avoiding dangling references, access to nonlocal variables in these cases requires mechanism beyond the simple stack-based model assumed by C (and used in this course). As a consequence, C disallows nested procedure definitions like that in the above example. More ambitious languages solve this problem by representing procedures as heap-allocated data structures called closures, which bundle values of relevent non-local variables into the runtime representation of procedure value.