Compiled Languages
17. Compiled Languages
Although assembly language provides a more convenient interface to the instruction set architecture of our general-purpose computer than the binary instructions it executes, it is a tedious language for most programming tasks. Assembly language is deliberately low level: its elements correspond directly to actual machine instructions, guaranteeing the programmer access to the complete feature set of the machine. As an engineering abstraction, however, assembly language has notable deficiencies; these stem from the fact that assembly language is designed around the capabilities of the hardware, rather than the needs of the programmer. In this Chapter, we indulge in a brief glimpse of higher-level programming languages.Programming Language Features
17.1. Programming Language Features
There are many different higher-level languages available for modern computer systems, with differing features and characteristics. Generally these languages are platform independent, meaning that they can be made to run on most any computer with sufficient resources -- in contrast to the ISA-specific assembly language. Beyond this commonality, general-purpose programming languages may advertise a number of other attractions, such as:- safety features that guarantee your program won't mistake floating-point numbers for executable instructions, overrun buffers, and other common sources of unpredictable behavior; often involves a strong type system.
- garbage-collected memory to relieve the programmer from tedious allocation/deallocation bookkeeping;
- first-class procedures that can be passed as arguments, returned as values, and stored in data;
- objects and inheritance, popular abstraction mechanisms in contemporary software;
- functional programming support including constructs like comprehensions, combinators, and monads to support programming disciplines that avoid imperatives and side-effects.
Dangling References
17.1.1. Dangling References
Although the stack is an efficient and effective storage allocation mechanism for many programming constructs, there are a few patterns that run afoul of the requirements of stack discippline.int *p
declares a pointer
to an int value: we expect the runtime value of p
to be the address of a word in main memory containing
a 32-bit integer.
Note that this declaration is outside of any procedure definition;
consequently, it is a statically-allocated global variable.
No initial value is specified; the value of p
is to be set by a call to the procedure h
,
defined subsequently.
The definition of h(x)
includes the declaration of
a local variable, y
. Since y
is local to the procedure h
, it will be allocated
dynamically from the stack on entry to h
, and
deallocated when h
returns.
h
sets p
to the
address of y
-- the using the C
syntax p = &y;
.
The final two
lines include a call to h
, which allocates
y
and initializes it to 30, followed by an
attempt to print the value pointed to by the
pointer p
.
What value do we expect to be passed to print
?
p
is set to
the address of a variable y
whose lifetime is temporary.
The attempt to dereference p
in the call to print
is executed after h
has returned and y
is deallocated: it contains the address in some portion of the unallocated
portion of the stack region.
Such problems stem from the fact that the program keeps a pointer p
to a value y
beyond the lifetime of the latter: stack discipline
dictates that y
be deallocated when the call to h
returns, rendering the address stored in p
invalid. More ambitous
languages solve this problem by detecting cases where storage must be allocated
from a more general storage mechanism (a heap), which avoids the
constraints of stack discipline.
Procedure Values and Closures
17.1.2. Procedure Values and Closures
A closely related class of problems involves treating procedures as "first-class" values -- the ability to pass them around like integer or character data, as arguments to (and values returned from) procedures, stored in data structures, etc.make_addx
is intended to return a new procedure, given
an integer x
, that simply adds x
to its
argument. It does so by defining a local procedure addx
,
and returning that procedure as the value of the call to
make_addx(x)
.
The final two lines include a call to make_addx(3)
,
expected to produce a new procedure add3
that simply
adds 3 to its supplied integer argument; the call to add3(4)
,
in a more perfect world, should return the value 7.
Again, however, this expectation exceeds the limited model of procedure
values supported by C. A problem is that the definition of the returned
add3
function involves the argument x
, which
no longer exists after make_addx(3)
returns; thus we
have another dangling reference problem. Variants of this implementation
challange involve passing procedures as arguments rather than returned
values; while avoiding dangling references, access to nonlocal variables
in these cases requires mechanism beyond the simple stack-based model
assumed by C (and used in this course).
As a consequence, C disallows nested procedure definitions like that in
the above example.
More ambitious languages solve this problem by representing procedures
as heap-allocated data structures called closures, which bundle
values of relevent non-local variables into the runtime representation
of procedure value.