16. Stacks and Procedures
Although the prior two chapters have detailed the hardware/software interface provided by the Beta ISA and its low-level programming tools, these specifications fall far short of a complete description of the programming model presented by the Beta. Much of the latter is provided by a set of programming conventions for organizing software and data on the machine. While prior examples have provided glimpses of several basic conventions (e.g. for representation of arrays and structures), there are many others. Certain conventions are associated with a particular language or software discipline, such as object-oriented programming or transaction processing; others, like the procedure linkage conventions of this chapter, are shared by most languages and disciplines.
16.1. Procedures as an Abstraction
Procedures -- parameterized code modules that can be defined once and instantiated many times with different parameter values -- are a common mechanism for structuring code. The ability to define a procedure is a powerful abstraction technique: a well-designed procedure reduces the space of possible computations to a tiny, parameterized subset devoted to some related set of problems identifiable by a few parameters. Like other engineering abstractions, a procedure offers to perform an understandable service without requiring its user to understand the details involved: it isolates specification from implementation. Our simple factorial example from Section 15.3.3.3, for example, is a natural candidate for encapsulation as a procedure.
16.2. Procedure Linkage Approaches
A plausible implementation strategy is to store the code that computes $N!$ at some location in memory, tagged fact:. Code elsewhere that needs to compute $n!$ for some $n$ might load $n$ into R1 and transfer to fact:, leaving a return address in some agreed-upon register. Arbitrarily selecting R28 for the return address, the proposed convention involves:
- Associating the name of a procedure (e.g. fact) with an entry point address for that procedure;
- Calling the procedure by a BR(entry, R28), where entry is the entry point of the procedure and R28 is a register reserved (by convention) to hold the return address from procedure calls;
- Passing the argument parameter to the procedure in R1; and
- Returning the result in R0.
Note that this convention reserves a single register, R28, to hold the return address from the procedure. But suppose the called procedure itself contains a procedure call? Consider, for example, a recursive implementation of the fact procedure.
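The original listing is not reproduced here; the sketch below (tags such as saved_n and the exact layout are invented) follows the proposed convention and reproduces its essential behavior:

    fact:   CMPLEC(R1, 0, R0)   | base case? test n <= 0
            BT(R0, base)        | if so, return 1
            ST(R1, saved_n)     | stash n in a static location
            SUBC(R1, 1, R1)     | argument for recursive call: n-1
            BR(fact, R28)       | recursive call -- overwrites R28! (cf. "line 19")
            LD(saved_n, R1)     | recover n
            MUL(R0, R1, R0)     | result <- n * fact(n-1) (cf. "line 20")
    rtn:    JMP(R28)            | "return" -- but R28 now points at the LD above
    base:   CMOVE(1, R0)        | fact(0) = 1
            BR(rtn)
    saved_n: LONG(0)            | static storage for n (also clobbered by recursion)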
If you run this code, you will find that it gets into a strange endless loop: the JMP at the line tagged rtn: transfers control to the MUL on line 20, after which control is passed back to rtn:. The problem arises on the call to fact(5), which calls fact(4) in the course of its computation. Since the fact(5) call depends on R28 to maintain its return address, the resetting of R28 to a different address by the recursive call at line 19 causes loss of critical information: the return address of the original fact(5) call is overwritten by the return address for the recursive call. The net result: the procedure can never return to its original caller.
16.2.1. Nesting Depth and Storage Requirements
Our primitive convention fails because it reserves exactly one location to hold return addresses from procedure calls. This is inadequate if procedure calls are nested -- for example, if procedure proc1 contains a call to proc2.
Nesting depth of procedure calls may be greater than one, from which we conclude that we may need multiple locations to store return addresses of nested calls. Early runtime conventions supported this model via schemes that provided a constant number of return locations, a number that can be computed by lexical analysis of simple programs. Such schemes were short-lived, however: they broke down once recursive definitions of functions became popular programming constructs.
Simple nesting of procedure calls requires a bounded
amount of storage to hold return addresses -- bounded by the
nesting depth of the code. We can compute the nesting
depth of a block of code as follows:
- If there are no calls within the code, its nesting depth is zero.
- If the code contains calls, its nesting depth is $n+1$ where $n$ is the maximum nesting depth of a procedure called within the code block.
Recursive procedures break any such static bound: the nesting depth of a recursive call is data dependent rather than a constant of the code. In the case of our recursive fact procedure, $O(n)$ storage is required to compute $fact(n)$.
16.2.2. Activation Records
Generally there is a certain amount of storage required for each activation of a procedure: data that must persist for the time interval between the call and the corresponding return. This storage includes the return address, argument values, and perhaps other activation-related information. This collection of data is often gathered as a data structure called an activation record.

The failed convention of the previous section allocated a single activation record to be shared among all procedure activations -- an approach that fails in the case of nested procedures. A slightly less foolish approach, used before recursion became popular ca. 1960, involved static allocation of storage for a single activation record for each procedure. Such a scheme supports nested calls but not recursive procedures; it was used in some early FORTRAN implementations that did not support recursion. This approach has the advantage that the total amount of storage needed for activation records is bounded by the nesting depth of procedure definitions in the program, a bound that can easily be determined as the program is translated. Thus the storage can be allocated statically, prior to execution.

The recognition that the execution of a sequential program may involve multiple simultaneous activations of a single procedure leads to the conclusion that a workable discipline for procedure calls -- one that supports recursive procedures -- must be capable of allocating an unbounded number of activation records; the number may be data dependent, and in the general case may only be determined by running the program. This implies the need for a dynamic system for allocating activation records: they must be allocated whenever a procedure call is executed, and deallocated (returned to a storage pool) when the procedure returns, ending that activation.

Consider the execution of the recursive fact
procedure
defined above, given (say) an initial argument of 3. A recursive
call to fact(2)
occurs during the activation
of fact(3)
, at which point two activation records
are needed for the two simultaneous activations of fact
.
But the activation of fact(2)
invokes fact(1)
,
in turn invoking fact(0)
; at this point, there are four
simultaneous activations, each with their own activation record (including
argument value, return address, etc.).
The fact(0) call returns, releasing its activation record; then each of the other activation records is deallocated as its corresponding activation returns to its caller. These deallocations occur in the reverse order from that of their allocation.
In general, the evaluation of $fact(N)$ involves $N+1$ simultaneous activations of fact at its point of deepest recursion, each with its own activation record. At any time, however, only one of these activation records is a candidate for deallocation: the record that was most recently allocated.
It is important to note this last-in-first-out constraint
on the allocation and deallocation of activation records. It stems
from the proper nesting of procedure calls and returns, and is
related to the proper nesting (for example) of parentheses strings,
or the order of stacking/unstacking books in a single pile.
Allocation/deallocation schemes consistent with this constraint
are said to follow stack discipline.
16.2.3. Stacks
The fact that storage requirements for nested procedure calls and many other useful programming patterns conform to the constraints of stack discipline stimulated a revolution in programming languages and runtime disciplines in the early 1960s. This revolution was fueled by the observation that stack discipline enables a simple, fast, effective strategy for implementing a pool of dynamic storage. The discovery that a single pool of this sort, called a stack, could support all procedures in a complex program (as well as other structures like interrupts) caused stacks to be institutionalized in both ISAs and programming disciplines, as they remain today.
16.2.3.1. Stack Implementation
The stack discipline constraint on allocation/deallocation order admits a very simple implementation strategy:
- Allocate a contiguous region of memory as the total available stack space;
- Pack stacked data -- the allocated portion of the reserved stack space -- at one end of the region, leaving the rest available to be allocated;
- Use a single register (called the stack pointer, or SP) within the CPU to identify the boundary between allocated and available stack space;
- Allocate and deallocate space simply by incrementing and decrementing the stack pointer.
Two operations define the abstract interface to the stack:
- push(datum): add the value datum to the stack; and
- datum = pop(): remove the most recently added datum from the stack, and return its value.

The last-in-first-out discipline dictates which value is returned by each pop() operation. Immediately following the operation push(x), x becomes the top value on the stack. Thus the sequence push(x); pop() adds then removes x, leaving the stacked data unchanged.
In addition to the abstract interface provided by the push/pop
operations, it is common to directly access data near the top of the stack.
Such data is typically at an address that is a constant offset from the
value in the SP.
16.2.3.2. Processor Stack
As an abstract data structure, stacks have many applications. Often applications will create stacks for dedicated purposes -- examples include the undo/redo stack of an editor, and the history stack of a browser. However, the rules of stack discipline make a single stack eminently sharable among various different uses that conform to the same proper nesting constraint. As proper nesting is a property of most common program control flow patterns -- including entering/exiting blocks, procedure call/return, and interrupts -- modern processors and run-time software conventions commonly assume a single stack that is shared among these purposes. We refer to this distinguished stack as the processor stack, or simply the stack.

The evolution of the processor stack assumption is complex and interesting; indeed, the design of modern block-structured languages and consequent programming disciplines reflect the availability of a stack to at least the extent that support for a stack is demanded by these languages and disciplines. It has become popular for those committed to alternative programming structures -- involving, say, parallel processing and other control flow less amenable to stack discipline -- to complain about the extent to which stacks and consequent nesting are institutionalized by contemporary processor design. But the efficiency with which a stack handles a wide range of contemporary programming needs assures its continued support in future ISA generations.
16.2.4. The Beta Stack
The use of a stack is sufficiently prevalent that it is supported directly by the instruction set of most contemporary computers. Such machines typically dedicate a special CPU register as the stack pointer, and offer specific instructions (e.g., a call instruction for procedure calls) that manipulate this register to effect push and pop operations. In the minimalist Beta ISA, however, we choose to implement stacks, and our procedure call mechanism, strictly by software convention. We implement stacks and stack operations as follows:
- As described in the previous section, we statically allocate a contiguous region of the Beta's main memory for stack storage. We associate a particular tag (say, stack:) with the base (lowest address) of this memory block.
- We dedicate a particular general register, R29, for use as the stack pointer SP. SP points to the lowest unused word address in the stack region. On startup, we initialize SP to hold the base address stack, indicating an initially empty stack:

        start:  CMOVE(stack, SP)
                ...
        stack:  . = . + STACKSIZE

- The top word of the stack is always at an offset of -4 from the SP; thus it can be read by LD(SP, -4, Rx) or written by ST(Rx, -4, SP). Larger constant offsets from SP may be used to access data deeper in the stack.
- To push the word in register Rx onto the stack, we increment SP by four, making it point to the second unused word address, and then write the contents of Rx into the location that is now the top of the stack:

        ADDC(SP, 4, SP)
        ST(Rx, -4, SP)

- To pop the top word from the stack, leaving its value in register Rx, we read the top of the stack into Rx and then decrement SP by four.
- To allocate K 32-bit words of uninitialized space from the stack, we simply increment SP by 4*K.
- To deallocate K 32-bit words of space from the stack, discarding their values, we decrement SP by 4*K.
Each of the operations PUSH, POP, ALLOCATE, and DEALLOCATE is defined as a macro in beta.uasm.
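Following the rules above, these macros expand to short instruction sequences along these lines (a sketch consistent with the conventions just described; see beta.uasm for the authoritative definitions):

    .macro PUSH(RA)       ADDC(SP, 4, SP)  ST(RA, -4, SP)   | increment SP, then store
    .macro POP(RA)        LD(SP, -4, RA)   SUBC(SP, 4, SP)  | load top, then decrement SP
    .macro ALLOCATE(K)    ADDC(SP, 4*K, SP)                 | claim K words
    .macro DEALLOCATE(K)  SUBC(SP, 4*K, SP)                 | release K words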
16.2.4.1. SP Allocation Semantics
The careful reader may be surprised by the order of the
ADDC
and ST
instructions in our
definition of the PUSH
operation: it would
seem more straightforward to store the new datum in the
address in SP
and then increment SP
,
rather than doing the increment first and requiring address
arithmetic in the ST
.
There is a subtle reason for this peculiar order of operations.
In later chapters of this course, we will find that the exception
processing mechanism of Section 14.9 opens the
possibility of interrupting program execution between any
pair of consecutive instructions.
When this happens, code in an interrupt handler will
run, eventually returning control to the interrupted program.
This whole sequence of events is designed to be transparent to the interrupted program, allowing it to run as if no interrupt had taken place.
It is convenient, however, to allow the interrupt handling
process to "borrow" use of the stack temporarily: so long
as it maintains stack discipline, popping exactly whatever
data it pushes, it can in theory do so transparently. To
make this work reliably, we must guarantee that no
stacked data (data that the interrupted program cares about)
is disturbed by the "borrower" of the stack. We do so by drawing a precise distinction between stacked data, which must be preserved, and unallocated stack space. The precise rule is this:
- All data stored at addresses below the value in SP is stacked, and must be preserved; all locations at the address in SP and above are unallocated and may be overwritten during interrupt processing.
Given this rule, SP must be incremented prior to the store operation in a PUSH; otherwise, the datum (stored above the SP address) would be unprotected in case of an interrupt between the two instructions.
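To make the hazard concrete, the following sketch (illustrative only) contrasts the unsafe store-then-increment ordering with the convention's safe one:

    | UNSAFE: for one instruction the datum lies at/above SP,
    | where an interrupt handler borrowing the stack may overwrite it.
    ST(Rx, 0, SP)        | write datum at SP... still "unallocated"!
    ADDC(SP, 4, SP)      | only now does the datum become stacked

    | SAFE (the PUSH convention): claim the space first.
    ADDC(SP, 4, SP)      | allocate: the datum's slot is now below SP
    ST(Rx, -4, SP)       | store into protected, stacked space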
16.3. Stack Usage Patterns
Although our introduction of the stack as a storage allocation mechanism was motivated by our need to store activation records, its availability to be "borrowed" transparently lends itself to a number of other common uses. We explore several common patterns of stack usage before returning to their support for procedures.
16.3.1. Register Spills
One common use of the stack mitigates the problem of register allocation: all computing is done in registers, but there are too few registers to hold all variables of a complex program. When a register is needed for a new purpose, its current contents can be "spilled" to the stack and restored when the original use resumes, as sketched below.
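A minimal sketch of the pattern (the register choices are arbitrary):

    PUSH(R1)             | spill: save current contents of R1
    PUSH(R2)             | ...and R2
    ...                  | use R1 and R2 freely as scratch registers
    POP(R2)              | restore in reverse (last-in-first-out) order
    POP(R1)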
16.3.2. Local Variables
As noted in Section 14.6, our programming convention generally allocates locations in main memory to hold values of program variables, and their values are loaded temporarily into registers to perform computations. This plan sidesteps the limited number of registers available: we are unwilling to limit the number of program variables to fit into registers. Statically allocated variables of this sort are accessed via LD and ST operations as shown in the example to the right.
Such variables are allocated and initialized at the startup of
a program and their extent -- the lifetime of their
allocation -- is the entire running time of the program.
Consider, however, two variables, a and b, that are local to an inner block of code (delimited by the curly brackets). The scope of these variables -- the range of code over which they are accessible -- is limited to that block: they cannot be accessed outside of the curly brackets. Likewise, their extent is limited to the period of execution of that block of code: new instances of a and b are allocated and initialized whenever the block is entered, and deallocated at exit from the block.
On entry to the block, the compiled code uses the ALLOCATE macro to advance SP, effectively allocating two words of uninitialized space near the top of the stack. These two variables, now allocated from the stack, may be accessed using negative offsets from SP: -8 for a, -4 for b.
Within the block, the variables can be copied to and from registers
using LD
and ST
instructions specifying
these offsets and SP
; otherwise, they are used similarly
to statically allocated variables.
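A minimal sketch of how such a block might compile (the initial values and surrounding computation are invented for illustration):

    | enter block:  { int a = 1; int b = 2; ... }
    ALLOCATE(2)          | reserve two words: a at SP-8, b at SP-4
    CMOVE(1, R0)
    ST(R0, -8, SP)       | a = 1
    CMOVE(2, R0)
    ST(R0, -4, SP)       | b = 2
    ...                  | body: access a and b via offsets from SP
    DEALLOCATE(2)        | exit block: a and b cease to exist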
16.4. Procedure Linkage Issues
We now return to the problem of procedure linkage: the protocol and coding conventions that link code that invokes a procedure -- the caller -- with the code that implements the procedure (the callee). As with all engineering abstractions, the key issue is a clear specification of the respective responsibilities of the caller and the callee.

Our strategy is to define a general format for activation records, and arrange that an appropriate activation record be allocated from the stack when a procedure is called, and deallocated at the corresponding return. Activation records allocated from the stack are commonly called stack frames, a term we will use for the remainder of this discussion. Candidates for inclusion in the stack frame are details particular to the call:
- Argument values for that call;
- Return address for that call;
- Temporary storage needed by the procedure (e.g., saved values from registers at the time of the call that must be preserved for the caller); and
- Local variables declared within that procedure.
16.5. Basic Call/Return Protocol
Our convention will use several registers, which (like SP and the XP reserved for interrupt processing) are reserved for this purpose by our software run-time conventions. Our growing list of reserved registers is now:
- XP = R30: eXception Pointer, used to hold return addresses during exception (interrupt) processing;
- SP = R29: Stack Pointer, used to indicate the first unused address of the processor stack;
- LP = R28: Linkage Pointer, used to pass the return address during a procedure call;
- BP = R27: Base Pointer, used to hold a base address for the current activation record; and
- R0: used, by convention, to pass a return value back to the caller.
Registers R26 and below may be used for application-specific purposes (with the understanding that the contents of R0 will be discarded on a procedure call).
Our protocol for a procedure call begins with the caller,
which pushes values of arguments on the stack. The caller then
transfers control to the callee by an instruction like
BR(callee, LP)
where callee
designates
the entry point of the procedure -- the address of the
first instruction to be executed on a call.
Note the specification of LP
as the destination
register in the BR
instruction. Recall that this causes
the address of the instruction following the BR
-- the appropriate return address -- to be written
into LP
by the call.
On entry to the callee, the return address is thus in LP.
The first instructions of the callee push the contents
of BP
and LP
onto the stack, to be restored just prior
to the eventual return to the caller.
Following these PUSH operations, SP is copied into BP (via MOVE(SP, BP)), allowing BP to serve as a fixed base address within the newly created stack frame. Although the value in SP may vary during execution of the procedure's code, the address in BP remains constant (and, consequently, at a constant offset from arguments and other values in the frame). Note, in particular, that the most recently stacked argument appears at offset -12 from BP, and other arguments at increasingly negative constant offsets.
Space for local variables and other temporary storage may then be claimed from the stack, advancing SP at each allocation. Our convention allows a return value to be passed back to the caller in R0, but requires that all other registers be restored to the values at the time of the call; thus, any registers used by the callee must be carefully saved and restored prior to return.
The called procedure then performs its advertised computation, accessing arguments via negative offsets and locals via non-negative offsets from BP.
During the computation the stack may be used for various purposes,
including nested and recursive procedure calls. Of course, it is
critical that all such uses follow stack discipline, returning the
stack to its earlier state after each use.
To return, the callee first discards locals and other temporary storage by a MOVE(BP, SP), which leaves the stack containing only the saved BP, LP, and arguments. The saved values are popped back into BP and LP, and control is returned to the caller via a JMP(LP) instruction.
Following the return -- at the instruction after the BR(callee, LP) -- R0 contains the return value, all other registers have their values at the time of the call, and the arguments remain on the stack. In our convention, it is the responsibility of the caller to remove any stacked arguments following the call; typically, it does so by a single DEALLOCATE(N) where N is the number of arguments passed.
16.6. Resulting Stack Structure
Given our linkage conventions, at any instant the stack may be viewed as a concatenation of successive stack frames that represent currently active procedure calls. The current stack frame, whose base address is identified inBP
,
represents the procedure call currently occupying the CPU's attention;
stack frames at successively lower addresses represent unfinished
procedure invocations that are awaiting completion of nested calls
(including the current one).
Consider, for example, a snapshot of the stack taken during execution of some procedure f(...), invoked during the computation of g(...), which was, in turn, invoked by h(...). Although most detail is omitted from the diagram, the diagram depicts the chain of pointers that characterizes the stack as a sequence of frames. The address in BP points to the base address of the current frame. That frame contains, in its old BP slot, the base address of the immediately preceding frame on the stack. The latter frame contains the base address for the frame of its caller, and similarly through the entire history of procedures that have been called but have not yet returned.
These pointers, reflecting a dynamic history of BP
values, are often referred to as dynamic links since
they reflect the program's dynamics rather than its static
structure (such as textual nesting). The dynamic link in a
stack frame is simply the immediately previous value that was in
BP
; it always points to an adjacent stack frame at
a lower address.
16.6.1. Stack Frame Details
BP
points to the base address of the frame; per
our convention, the base address identifies the location immediately
following the location containing the saved BP
value --
the dynamic link to the caller's stack frame.
The address in BP
is the first address of the set of local variables
stored in the frame. From this picture, we conclude that the first local
variable has offset 0 from BP
, the second has offset 4, and so
on.
The stacked arguments are similarly accessible using negative offsets from BP: the first argument has offset -12 (owing to the stored BP and LP values).
As a consequence, the code within the procedure can access locals and arguments systematically (counting each from zero): to access
- the $k^{th}$ local variable, use LD(BP, k*4, Rx) or ST(Rx, k*4, BP);
- the $j^{th}$ argument, use LD(BP, -4*(j+3), Rx) or ST(Rx, -4*(j+3), BP).

These patterns apply throughout the code that represents the body of the called procedure; once BP is set up, its value remains constant.
16.6.1.1. Why BP?
One might question why it is necessary to allocate a register to the function performed by BP. In fact, the use of a base register is not necessary; it's possible, with additional bookkeeping, to access contents of the local stack frame via constant offsets from SP. The extra complexity of this route comes from the fact that we allow the value of SP to vary from instruction to instruction, due to various uses of the stack (to deal with register spill and other such purposes). As a consequence, offsets from SP (say, to the first argument) may vary from instruction to instruction; keeping track of proper offsets is tedious for a human programmer. Such arithmetic is straightforward for a compiler, however, so the no-BP option is realistic in practice (where nearly all code is generated by a compiler). We prefer the more humane BP scheme for this course in large part because it makes example code easier to understand.
16.6.1.2. Order of Stacked Arguments
It is important to note that the above pattern for accessing the $j^{th}$ argument would not work if the arguments were stacked in the opposite order. By our convention, the first argument always appears in the stack frame at an offset of -12 from its base address. If the arguments were stacked in the inverse order, the offset between the base address and the $j^{th}$ argument would depend on the number of arguments, complicating argument access considerably. A consequence of this ordering, however, is that the caller must push arguments onto the stack in inverse order from their left-to-right sequence in common notation. Thus a call f(1, 2, 3) to the 3-argument procedure f must be translated as shown to the right.
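A sketch of that translation, following the conventions above (the original figure is not reproduced; the argument-computing instructions are illustrative):

    CMOVE(3, R0)   PUSH(R0)   | push last argument first
    CMOVE(2, R0)   PUSH(R0)
    CMOVE(1, R0)   PUSH(R0)   | first argument ends up on top
    BR(f, LP)                 | call f
    DEALLOCATE(3)             | caller removes the three arguments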
Note that there is no real cost to this convention: it is as easy for
a compiler to generate the above code as it would be to push the
arguments in the more straightforward order.
This choice has other advantages as well. Since early arguments can
be accessed by the callee without knowing the total number of arguments,
it is possible to define procedures taking a variable number of arguments --
say, a procedure $sum(k, a_1, a_2, ..., a_k)$ that computes the sum of
$a_1 ... a_k$ where $k$ is specified as its first argument.
16.7. Caller/Callee Contract
We now have the elements of a standard "contractual" relationship between the callers and callees of procedures. The caller is responsible for stacking the argument values and for ensuring that LP has a proper return address on entry. The callee is responsible for computing the promised result, leaving it in R0, and returning to the caller with its state (stack and registers) preserved, save R0 which contains the return value (if any).
There are many common variations in the details of this protocol.
Some versions require that only a subset of the registers be preserved,
e.g. designating R0-R3
as "temporary" registers whose
contents may be trashed by a procedure call. Other schemes allow
the first several arguments to be passed in registers rather than
on the stack. Each of these issues has advantages and disadvantages.
16.7.1. Code Sequences
It is common to establish standard instruction sequences that fulfill various steps of a procedure call. The calling sequence stacks the arguments via consecutive PUSH operations; in practice, there may be interspersed instructions computing the value of each argument as it is stacked.

The entry sequence saves LP and BP, establishes the new BP value, and allocates any local variable storage. Again variations are possible; for example, the ALLOCATE may be eliminated if the callee requires no local variables.

Note that the return sequence includes no explicit DEALLOCATE to undo the ALLOCATE of the entry sequence. While a DEALLOCATE might be harmlessly included in the return sequence, it is unnecessary since the MOVE(BP,SP) does a "brute force" reset of SP to the proper value. The three sequences are summarized in the sketch below.
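A consolidated sketch of the standard sequences under our conventions (illustrative; n and nlocals stand for a particular procedure's argument and local counts):

    | calling sequence (in the caller):
            PUSH(argn)           | arguments, last first
            ...
            PUSH(arg1)
            BR(f, LP)            | call: return address into LP
            DEALLOCATE(n)        | caller removes stacked arguments
            ...                  | return value now in R0

    | entry sequence (at f):
    f:      PUSH(LP)             | save return address
            PUSH(BP)             | save caller's frame base
            MOVE(SP, BP)         | establish new frame base
            ALLOCATE(nlocals)    | space for locals (if any)

    | return sequence (any registers saved by the body
    | are restored before this point):
    rtn:    MOVE(BP, SP)         | discard locals and temporaries
            POP(BP)              | restore caller's frame base
            POP(LP)              | restore return address
            JMP(LP)              | return to caller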
16.7.2. Example: GCD Procedure
Consider a procedure gcd(a, b) which computes the greatest common divisor of two positive integers. An algorithm for gcd is shown, in C, to the left. Note that the algorithm itself contains two recursive calls to gcd. The reader might verify that it computes gcd(6,15), for example, via the steps gcd(6,15) → gcd(6,9) → gcd(6,3) → gcd(3,3) → 3.
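A reconstruction of that C listing, consistent with the trace above (the original figure is not reproduced):

    int gcd(int a, int b) {
        if (a == b) return a;
        if (a > b)  return gcd(a - b, b);
        return gcd(a, b - a);
    }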
The translation of gcd to Beta assembly language is shown to the right. Note that the first few instructions correspond to our boiler-plate entry sequence, and the instructions following the tag rtn: constitute a return sequence that is shared among the three returns in the C code.
Each of the two recursive calls exemplifies the calling sequence.
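A sketch of such a translation (illustrative: tags, register choices, and exact instruction order may differ from the course listing, but the entry, calling, and return sequences follow the conventions above):

    gcd:    PUSH(LP)             | entry sequence
            PUSH(BP)
            MOVE(SP, BP)
            PUSH(R1)             | save registers used by the body
            PUSH(R2)
            LD(BP, -12, R0)      | R0 <- a
            LD(BP, -16, R1)      | R1 <- b
            CMPEQ(R0, R1, R2)    | a == b?
            BT(R2, rtn)          | yes: return a (already in R0)
            CMPLT(R1, R0, R2)    | b < a?
            BF(R2, else)
            PUSH(R1)             | gcd(a-b, b): push b (last arg first)
            SUB(R0, R1, R0)
            PUSH(R0)             | push a-b
            BR(gcd, LP)          | recursive call
            DEALLOCATE(2)        | caller removes arguments
            BR(rtn)
    else:   SUB(R1, R0, R1)      | gcd(a, b-a): push b-a
            PUSH(R1)
            PUSH(R0)             | push a
            BR(gcd, LP)          | recursive call
            DEALLOCATE(2)
    rtn:    POP(R2)              | restore saved registers
            POP(R1)
            POP(BP)              | standard return sequence
            POP(LP)
            JMP(LP)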
Clicking this diagram will present a runnable simulation of this code,
together with a tiny main program that sets up the stack and calls gcd
with sample argument values. It is worth experimenting with this setup to
familiarize yourself with the Beta simulator:
try running, single-stepping, changing argument values, and setting breakpoints.
Uncommenting the breakpoint at line 34, for example, causes the simulator to
pause with arguments in R0
and R1
as each gcd
call is entered.
If you feel ambitious, you might look for ways to improve the code by making it
shorter and/or faster.
We might use gcd
to determine whether a pair of positive integers is
relatively prime or coprime by defining the simple procedure
shown to the left.
The procedure simply calls gcd on the supplied arguments and compares the result to one; it returns a Boolean (true/false) value, reflecting the fact that a and b are relatively prime if and only if their greatest common divisor is one.
The translation of coprime is shown to the right; clicking it yields a runnable example, including gcd and a test case.
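A sketch of such a translation (again illustrative; the course listing may differ in detail):

    coprime:
            PUSH(LP)             | entry sequence
            PUSH(BP)
            MOVE(SP, BP)
            LD(BP, -16, R0)      | R0 <- b
            PUSH(R0)             | push b (last arg first)
            LD(BP, -12, R0)      | R0 <- a
            PUSH(R0)             | push a
            BR(gcd, LP)          | R0 <- gcd(a, b)
            DEALLOCATE(2)        | remove the two arguments
            CMPEQC(R0, 1, R0)    | R0 <- (gcd(a,b) == 1)
            MOVE(BP, SP)         | return sequence (here a no-op)
            POP(BP)
            POP(LP)
            JMP(LP)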
Note that the Boolean (1 or 0) value produced by the CMPEQC instruction, comparing the value returned by gcd with 1, is itself returned as the result of the coprime call.
Again, we recommend experimenting with this example: verify that while
33 and 28 are relatively prime, 35 and 28 are not.
16.7.3. Exploring BSim
At this point, the reader is encouraged to run this sample code and gain familiarity with BSim. The runnable examples include bsim.uasm and required setup code.
After making any desired edits to the source code, click assemble
to translate and load the result into the simulated Beta memory. If
no errors are detected in the translation, you'll see a summary of CPU
state (registers) along with scrollable windows showing several regions
of memory. Buttons at the top left will single-step, reload, or run the
program to completion (or until it hits a breakpoint).
16.7.4. Stack Detective Skills
Skilled assembly language programmers develop masterful techniques for debugging that involve deducing details of a program's state and history from binary snapshots of its memory. One such technique, often termed stack crawling, deals with using the critical information in the stack to reconstruct the history of nested calls that led to its current state. For a simple but concrete example, suppose coprime(4,6) is called with the above coprime and gcd implementations.
We expect coprime(4,6) to invoke gcd(4,6), invoking in turn the recursive calls to gcd diagrammed to the left. The recursion bottoms out with a call to gcd(2,2), which returns 2; this value is passed up through the stacked calls and returned as the value of gcd(4,6). Since that value is not 1, coprime(4,6) returns false. Each of these nested calls generates a stack frame; at the point of deepest nesting (during the call to gcd(2,2)) we expect the layout of frames depicted above.
Let's corroborate this understanding by starting our coprime(4,6) example and, after running a while, stopping execution at the (commented) breakpoint shown at line 34 of the above gcd listing, i.e. when it is about to execute the CMPEQ instruction.
Note that you can duplicate this experiment by clicking on the
coprime
code, editing the source file to specify
arguments and enable the .breakpoint
, assembling,
and clicking run several times.
The simulation stops with BP=0x174, SP=0x17c, and a stack containing the values shown on the right. What can we determine about the history and state of the computation?
In puzzling over this programmer's analog of an archaeological dig,
it is useful to annotate the stack dump with
some deductions. We start by marking the locations pointed
to by the values in SP
and BP
,
identifying the top of the stack and the base of the current
stack frame.
Our first deductions come from locations surrounding BP: since we know the program is stopped within gcd, we know the format of the local stack frame. We annotate the stack dump to identify the values below BP=0x174 as the saved BP, the saved LP, and the arguments a and b. The value pointed to by BP is the saved R1, followed by the saved R2. We might note that 0x174 is the base of the stack frame corresponding to a call to gcd(2,2).
Next, we can follow the saved BP pointer in 0x170 to the base of the caller's frame at 0x15c.
There we find the frame for the call to gcd(4,2), and continue up the stack until we reach the initial call coprime(4,6).
The result might be the annotated stack dump shown to the left.
Note that its structure is consistent
with the layout we expected; of course, the annotated dump
contains much greater detail.
We can deduce, for example, that coprime(4,6)
was called from location 0x28
since the
corresponding return address points to the immediately
following instruction.
Observe that all stacked PC values in our example
have the Kernel bit (discussed in Section 14.9.2) set,
reflecting that the Beta starts running in Kernel mode.
Chapter Summary