SS Unit-1
Machine Architecture
Definition of Software:
Software is a set of programs designed to perform a well-defined function. A
program is a sequence of instructions written to solve a particular problem.
Types of Software:
1. System Software.
2. Application Software.
Examples of system software: text editor, compiler, loader and linker, debugger,
macro processor, operating system, etc.
Text Editor: Used to create and modify programs.
Compiler: Translates the user program into machine language.
Loader: A system program that prepares machine language programs for execution.
Debugger: Helps to detect errors in the program.
Translator: Used to translate assembly code into machine code. This translator is
called an assembler.
Fig: System Software Concept (layers, top to bottom: User; Application Program;
OS, comprising Memory Management, Process Management, Device Management, and
Information Management; Computer)
System software, e.g.: Compiler, Assembler, Operating system, etc.
Application software, e.g.: Payroll, Microsoft Word, Excel, etc.
Memory:
Upward compatible
Memory consists of 8-bit bytes; 3 consecutive bytes form a word (24 bits).
There are a total of 2^15 = 32768 bytes in the computer memory.
Registers:
5 registers, 24 bits in length
Data Format:
Integers are stored as 24-bit binary numbers.
2’s complement representation for negative values.
Characters are stored using 8-bit ASCII codes.
No floating-point hardware on the standard version of SIC.
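The data formats above can be illustrated with a minimal Python sketch. The helper names (to_sic_word, from_sic_word, chars_to_hex) are hypothetical and chosen only for this example; they simply apply 24-bit 2's complement and 8-bit ASCII as described.

```python
def to_sic_word(value: int) -> str:
    """Encode an integer as a 24-bit 2's complement word (6 hex digits)."""
    if not -(1 << 23) <= value < (1 << 23):
        raise ValueError("value does not fit in a 24-bit word")
    return format(value & 0xFFFFFF, "06X")


def from_sic_word(hex_word: str) -> int:
    """Decode 6 hex digits back into a signed integer."""
    raw = int(hex_word, 16)
    # If the top bit (bit 23) is set, the word represents a negative value.
    return raw - (1 << 24) if raw & 0x800000 else raw


def chars_to_hex(text: str) -> str:
    """Encode characters as 8-bit ASCII codes, two hex digits per character."""
    return text.encode("ascii").hex().upper()
```

For example, to_sic_word(-1) gives FFFFFF and chars_to_hex("EOF") gives 454F46.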
Instruction Format:
24-bit format.
The flag bit x is used to indicate indexed-addressing mode.
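As a rough sketch of the 24-bit format, the Python function below packs an instruction as an 8-bit opcode, the x flag, and a 15-bit address. That field layout is the conventional SIC layout and is an assumption here, since these notes state only the total width and the x flag. The function name encode_sic is hypothetical, and the opcode values used in the examples (14 for STL, 54 for STCH) come from the standard SIC instruction set rather than from these notes.

```python
def encode_sic(opcode: int, address: int, indexed: bool = False) -> str:
    """Pack opcode (8 bits), x flag (1 bit), address (15 bits) into 6 hex digits."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= address < (1 << 15):
        raise ValueError("address must fit in 15 bits")
    word = (opcode << 16) | (int(indexed) << 15) | address
    return format(word, "06X")
```

With opcode 14 and address 1033 (both hexadecimal) this yields 141033, the object code that appears for STL RETADR in the listing fragment later in these notes.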
Addressing Modes:
Instruction Set:
Memory:
Memory consists of 8-bit bytes, 3 consecutive bytes form a word (24 bits).
Maximum memory available on a SIC/XE system is 1 megabyte (2^20 bytes).
Registers:
Data Formats:
24-bit binary number for integers, 2's complement for negative values.
48-bit floating-point data type:
    Bits:    1    11          36
    Field:   s    exponent    fraction
The exponent e is between 0 and 2047.
The value represented is f * 2^(e-1024), where f is the fraction.
Zero is represented by setting all bits to 0.
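A small Python sketch of decoding this 48-bit format follows. It assumes the fraction is read as a binary fraction between 0 and 1 and that a sign bit of 1 means a negative number; the function name decode_sicxe_float is hypothetical.

```python
def decode_sicxe_float(bits: int) -> float:
    """Decode a 48-bit SIC/XE floating-point value given as an integer."""
    if not 0 <= bits < (1 << 48):
        raise ValueError("expected a 48-bit value")
    if bits == 0:
        return 0.0                      # all bits zero represents 0
    sign = bits >> 47                   # 1 bit
    exponent = (bits >> 36) & 0x7FF     # 11 bits, 0..2047
    fraction = (bits & ((1 << 36) - 1)) / (1 << 36)   # 36 bits read as 0.xxxx
    value = fraction * 2.0 ** (exponent - 1024)
    return -value if sign else value
```

For instance, an exponent of 1025 with only the top fraction bit set (f = 0.5) decodes to 0.5 * 2^1 = 1.0.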
Instruction Format:
Format 2 (2 bytes):
    Bits:    8    4    4
    Field:   op   r1   r2
Format 3 (3 bytes), here e = 0:
    Bits:    6    1  1  1  1  1  1  12
    Field:   op   n  i  x  b  p  e  disp
Format 4 (4 bytes), here e = 1:
    Bits:    6    1  1  1  1  1  1  20
    Field:   op   n  i  x  b  p  e  address
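The bit layouts of formats 3 and 4 can be sketched in Python as below. The function name encode_format34 is hypothetical. The opcode values in the usage note (14 for STL, 48 for JSUB, hexadecimal) are taken from the standard SIC/XE instruction set, not from these notes, and only the upper six bits of the opcode byte are stored.

```python
def encode_format34(opcode: int, n: int, i: int, x: int, b: int, p: int,
                    e: int, field: int) -> str:
    """Pack a format 3 (e=0) or format 4 (e=1) instruction into hex digits.

    'field' is the 12-bit disp (format 3) or the 20-bit address (format 4).
    A negative displacement must already be masked to 12 bits by the caller.
    """
    op6 = opcode >> 2                       # keep the upper 6 bits of the opcode
    flags = (n << 5) | (i << 4) | (x << 3) | (b << 2) | (p << 1) | e
    width, digits = (20, 8) if e else (12, 6)
    if not 0 <= field < (1 << width):
        raise ValueError("field does not fit in the instruction")
    word = (op6 << (6 + width)) | (flags << width) | field
    return format(word, "0{}X".format(digits))
```

As a check, STL with n=1, i=1, p=1 and disp 02D gives 17202D, and JSUB in format 4 with address 01036 gives 4B101036, the JSUB object code that the relocation discussion later in these notes starts from.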
Addressing Modes:
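Flag bits: n i x b p e
Simple: n=0, i=0 (SIC compatible) or n=1, i=1; the operand is the word at the target address (TA).
Immediate: n=0, i=1; the TA itself is used as the operand value.
Indirect: n=1, i=0; the word at the TA holds the address of the operand.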
Direct (x, b, and p all set to 0): the address field is used as the target address as
it is. n and i are both set to the same value, either 0 or 1. That value is normally 1;
if both are 0 in a format 3 instruction, the flags b, p, and e are treated as part of
the operand address, which makes the format compatible with the SIC format.
Relative (either b or p equal to 1 and the other one to 0): the displacement in the
instruction is added to the current value of the B register (if b = 1) or of the PC
register (if p = 1) to obtain the target address.
Immediate (i = 1, n = 0): The operand value is contained in the instruction itself
(i.e. it lies in the last 12 or 20 bits of the instruction).
Indirect (i = 0, n = 1): The target address points to a location that holds the
address of the operand value.
Indexed (x = 1): the value stored in register X is added to obtain the actual
address of the operand. Indexing cannot be combined with immediate or indirect
addressing.
The flag bits used in the above formats have the following meanings:
Bit e: e = 0 means format 3, e = 1 means format 4.
Bits x, b, p: used to calculate the target address using relative, direct, and indexed
addressing modes.
Bits i and n: specify how the target address is used.
b and p: if both are set to 0, the disp field of a format 3 instruction is taken to be
the target address. In a format 4 instruction b and p are normally set to 0 and the
20-bit address is the target address.
x: if x is set to 1, the X register value is added in the target address calculation.
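The flag rules above can be turned into a small target-address calculation. This Python sketch assumes the usual SIC/XE convention that a PC-relative displacement is a signed 12-bit value while a base-relative displacement is unsigned; the function names are hypothetical, and the register values are passed in as plain integers.

```python
def sign_extend_12(value: int) -> int:
    """Interpret a 12-bit field as a signed (2's complement) number."""
    return value - 0x1000 if value & 0x800 else value


def target_address(x: int, b: int, p: int, e: int, field: int,
                   pc: int = 0, base: int = 0, index: int = 0) -> int:
    """Compute the target address from the flag bits and the disp/address field."""
    if b and p:
        raise ValueError("b and p cannot both be 1")
    if e:                 # format 4: the 20-bit address is used directly
        ta = field
    elif p:               # PC-relative: signed displacement from the PC
        ta = pc + sign_extend_12(field)
    elif b:               # base-relative: unsigned displacement from B
        ta = base + field
    else:                 # direct: disp is the target address
        ta = field
    if x:                 # indexed: add the X register
        ta += index
    return ta & 0xFFFFF   # addresses are 20 bits on SIC/XE
```

For example, with p=1, disp 02D, and the PC at 0003, the target address is 0030; with p=1, disp FEC (that is, -20), and the PC at 001A, it is 0006.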
Instruction Set:
Format 1, 2, 3, or 4
Load and store registers (LDB, STB, etc.)
Floating-point arithmetic operations (ADDF, SUBF, MULF, DIVF)
Register-to-register arithmetic operations (ADDR, SUBR, MULR, DIVR)
A special supervisor call instruction (SVC) is provided
[Example programs not reproduced here: SIC and SIC/XE versions of the sample code,
including subroutine operation.]
ASSEMBLER
Decide the proper instruction format
Convert the data constants to internal machine representations
Write the object program and the assembly listing
So for the design of the assembler we need to concentrate on the machine architecture
of the SIC/XE machine. We need to identify the algorithms and the various data
structures to be used. In addition to the steps required for assembling, the
assembler also has to handle assembler directives; these do not generate object
code but direct the assembler to perform certain operations.
The following are the assembler directives:
START :
• Specify name and starting address for the program
END :
• Indicate the end of the source program, and (optionally) the first
executable instruction in the program.
BYTE :
• Generate character or hexadecimal constant, occupying as many
bytes as needed to represent the constant.
WORD :
• Generate one-word integer constant
RESB :
• Reserve the indicated number of bytes for a data area
RESW :
• Reserve the indicated number of words for a data area
The assembler can be designed as a:
Single-pass assembler
Multi-pass assembler
Single-pass Assembler:
In this case the whole process of scanning, parsing, and object code conversion is
done in a single pass. The only problem with this method is resolving forward
references.
This is shown with an example below:
10 1000 FIRST STL RETADR 141033
--
--
--
--
95 1033 RETADR RESW 1
In the above example, the STL instruction at line 10 stores the contents of the
linkage register into RETADR. But while this instruction is being processed the
value of the symbol RETADR is not yet known, because it is defined at line 95. In a
single-pass assembler, scanning, parsing, and object code conversion happen
together: the instruction is fetched, scanned for tokens, and parsed for syntactic
and semantic validity. If it is valid, it has to be converted to its equivalent object
code. For this, the object code for the opcode STL is generated and the value of the
symbol RETADR needs to be added, but that value is not yet available. For this
reason the design is usually done in two passes. A multi-pass assembler first
resolves the forward references and then converts the program into object code.
The process of a multi-pass assembler is as follows:
Pass-1
Assign addresses to all the statements
Save the addresses assigned to all labels to be used in Pass-2
Perform some processing of assembler directives such as RESW, RESB to find
the length of data areas for assigning the address values.
Define the symbols in the symbol table (generate the symbol table)
Pass-2
Assemble the instructions (translating operation codes and looking up
addresses).
Generate data values defined by BYTE, WORD etc.
Perform the processing of the assembler directives not done during pass-1.
Write the object program and assembler listing.
Assembler Design:
The most important things to concentrate on are the generation of the
symbol table and the resolution of forward references.
Symbol Table:
This is created during pass 1
All the labels of the instructions are symbols
Table has entry for symbol name, address value.
Forward reference:
A reference to a symbol that is defined later in the program is called a
forward reference.
There will not be any address value for such a symbol in the symbol
table at the point where it is first used in pass 1.
Example Program:
The example program considered here has a main module and two
subroutines
Purpose of example program
Reads records from input device (code F1)
Copies them to output device (code 05)
At the end of the file, writes EOF on the output device, then RSUB to
the operating system
Data transfer (RD, WD)
A buffer is used to store record
Buffering is necessary for different I/O rates
The end of each record is marked with a null character (hexadecimal 00)
The end of the file is indicated by a zero-length record
Subroutines (JSUB, RSUB)
RDREC, WRREC
Save link register first before nested jump
SIC PROGRAM
The first column shows the line number for that instruction, second column shows the
addresses allocated to each instruction. The third column indicates the labels given to
the statement, and is followed by the instruction consisting of opcode and operand. The
last column gives the equivalent object code.
All these steps except the second can be performed by sequential processing of the
source program, one line at a time.
Consider the instruction
10 1000 LDA ALPHA 00-----
This instruction contains a forward reference, i.e. the symbol ALPHA is used before it
is defined. If the program is processed (scanned, parsed, and converted to object
code) line by line, we will be unable to resolve the address of this
symbol. Because of this problem most assemblers are designed to process the
program in two passes.
In addition to translating to the object program, the assembler has to handle
assembler directives. These directives are not translated into machine instructions
but direct the assembler to perform some function.
Examples of directives are BYTE and WORD, which direct the assembler to generate
data constants, and RESB and RESW, which direct it to reserve memory locations
without generating data values.
The other directives are START, which indicates the beginning of the program, and
END, which indicates the end of the program.
The assembled program will be loaded into memory for execution. The simple object
program contains three types of records: Header record, Text record and end
record. The header record contains the starting address and length. Text record
contains the translated instructions and data of the program, together with an indication
of the addresses where these are to be loaded. The end record marks the end of the
object program and specifies the address where the execution is to begin.
Header record:
Col 1: H
Col 2-7: Program name
Col 8-13: Starting address of object program (hexadecimal)
Col 14-19: Length of object program in bytes (hexadecimal)
Text record:
Col. 1: T
Col 2-7: Starting address for object code in this record (hexadecimal)
Col 8-9: Length of object code in this record in bytes (hexadecimal)
Col 10-69: Object code, represented in hexadecimal (2 columns per byte of object
code)
End record:
Col. 1: E
Col 2-7: Address of first executable instruction in object program (hexadecimal)
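The three record layouts can be sketched as simple string-formatting functions in Python. The function names are hypothetical, and the program name COPY and length 107A in the usage note are illustrative values only.

```python
def header_record(name: str, start: int, length: int) -> str:
    """Col 1: H, Col 2-7: name, Col 8-13: start address, Col 14-19: length."""
    return "H" + name.ljust(6)[:6] + format(start, "06X") + format(length, "06X")


def text_record(start: int, object_codes: list) -> str:
    """Col 1: T, Col 2-7: start address, Col 8-9: byte count, Col 10-69: code."""
    body = "".join(object_codes)
    if len(body) > 60:                      # columns 10-69 hold at most 30 bytes
        raise ValueError("object code does not fit into one Text record")
    return "T" + format(start, "06X") + format(len(body) // 2, "02X") + body


def end_record(first_instruction: int) -> str:
    """Col 1: E, Col 2-7: address of the first executable instruction."""
    return "E" + format(first_instruction, "06X")
```

For example, header_record("COPY", 0x1000, 0x107A) produces HCOPY  00100000107A, with the name padded to six columns.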
The assembler can be designed either as a single pass assembler or as a two pass
assembler. The general description of both passes is as given below:
Pass 1 (define symbols)
Assign addresses to all statements in the program
Save the addresses assigned to all labels for use in Pass 2
Perform assembler directives, including those for address assignment, such as
BYTE and RESW
Pass 2 (assemble instructions and generate object program)
Assemble instructions (generate opcode and look up addresses)
Generate data values defined by BYTE, WORD
Perform processing of assembler directives not done during Pass 1
Write the object program and the assembly listing
OPTAB:
SYMTAB:
This table includes the name and value for each label in the source program,
together with flags to indicate the error conditions (e.g., if a symbol is defined
in two different places).
During Pass 1: labels are entered into the symbol table along with their
assigned address values as they are encountered. All symbol address values
should be resolved by the end of pass 1.
During Pass 2: symbols used as operands are looked up in the symbol table to
obtain the address value to be inserted in the assembled instructions.
SYMTAB is usually organized as a hash table for efficiency of insertion and
retrieval. Since entries are rarely deleted, efficiency of deletion is not an
important criterion for optimization.
Both pass 1 and pass 2 require reading the source program. Apart from this an
intermediate file is created by pass 1 that contains each source statement
together with its assigned address, error indicators, etc. This file is one of the
inputs to the pass 2.
The copy of each source statement in the intermediate file is an input to pass 2.
The file can also be used to retain the results of operations performed during
pass 1 (such as scanning the operation field for symbols and addressing flags),
so that these need not be performed again during pass 2. Similarly, pointers
into OPTAB and SYMTAB can be retained for each operation code and symbol used.
This avoids the need to repeat many of the table-searching operations.
LOCCTR:
Apart from the SYMTAB and OPTAB, this is another important variable
which helps in the assignment of the addresses.
LOCCTR is initialized to the beginning address mentioned in the START
statement of the program.
After each statement is processed, the length of the assembled instruction is
added to the LOCCTR to make it point to the next instruction. Whenever a
label is encountered in an instruction the LOCCTR value gives the address to
be associated with that label.
Explanation:
The algorithm scans the first statement, START, and saves the operand field (the
address) as the starting address of the program. It initializes LOCCTR to this
address and writes the line to the intermediate file. If no operand is
given, LOCCTR is initialized to zero. If a label is encountered, the
symbol is entered in the symbol table along with its associated address
value.
If the symbol already exists in the table, an error flag is set to indicate a
duplicate symbol.
It next looks up the mnemonic code in OPTAB.
If it is found, the length of the instruction is added to LOCCTR to make it
point to the next instruction.
If the opcode is the directive WORD, 3 is added to LOCCTR. If it is
RESW, 3 times the number of words reserved is added. If it is BYTE,
the length of the constant in bytes is added; if it is RESB, the number of bytes
reserved is added.
If it is the END directive, the end of the program has been reached, and the
program length is found by subtracting the starting address (saved from the
START statement) from the current LOCCTR. Each processed line is written to the
intermediate file.
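The Pass 1 steps described above can be sketched in Python as follows. This is a simplified illustration for the standard SIC machine, not the algorithm from the notes verbatim: it assumes every instruction is 3 bytes long, takes already-parsed (label, opcode, operand) tuples instead of reading source text, and uses a small hypothetical subset of OPTAB.

```python
OPTAB = {"LDA", "STA", "STL", "JSUB", "RSUB", "J"}   # hypothetical subset


def byte_length(operand: str) -> int:
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    kind, text = operand[0], operand[2:-1]
    return len(text) if kind == "C" else (len(text) + 1) // 2


def pass1(lines):
    """Return (symtab, program_length, intermediate, errors)."""
    symtab, errors, intermediate = {}, [], []
    start = locctr = 0
    for label, opcode, operand in lines:
        if opcode == "START":
            # Operand is the starting address in hex; default to zero.
            start = locctr = int(operand, 16) if operand else 0
            intermediate.append((locctr, label, opcode, operand))
            continue
        if opcode == "END":
            intermediate.append((locctr, label, opcode, operand))
            break
        if label:
            if label in symtab:
                errors.append("duplicate symbol " + label)
            else:
                symtab[label] = locctr      # LOCCTR is the label's address
        intermediate.append((locctr, label, opcode, operand))
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append("invalid operation code " + opcode)
    return symtab, locctr - start, intermediate, errors
```

Note that a forward reference such as STL RETADR causes no difficulty here: Pass 1 only assigns addresses and never needs the value of an operand symbol.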
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code will not fit into the current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
end {Pass 2}
Explanation:
Here the first input line is read from the intermediate file. If the opcode is
START, this line is written directly to the listing file. A Header record is
written to the object program, giving the starting address and the length of
the program (which was calculated during pass 1). Then the first Text record is
initialized. Comment lines are ignored. For each instruction, OPTAB is
searched to find the machine code for the opcode.
If there is a symbol in the operand field, the symbol table is searched to get
its address value, which is combined with the machine code of the opcode. If the
symbol is not found in the symbol table, zero is stored as the operand address
and an error flag is set to mark the symbol as undefined. If there is no symbol
in the operand field, zero is stored as the operand address and the object code
instruction is assembled.
If the opcode is BYTE or WORD, the constant value is converted to its
equivalent object code (for example, the character constant EOF is stored as its
hexadecimal equivalent 454F46). If the object code does not fit into the
current Text record, that record is written to the object program and a new Text
record is started for the remaining object code. Once the whole program is
assembled and the END directive is encountered, the last Text record and the
End record are written.
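The instruction-assembly part of Pass 2 can be sketched in Python as below. It is a simplified illustration under stated assumptions: it targets the standard SIC format (8-bit opcode, x flag, 15-bit address), takes (address, opcode, operand) tuples as a stand-in for the intermediate file, uses a small subset of the standard SIC opcode values, and leaves out the packing of object code into Text records.

```python
OPTAB = {"LDA": 0x00, "STL": 0x14, "JSUB": 0x48, "RSUB": 0x4C}   # subset only


def constant_to_hex(opcode: str, operand: str) -> str:
    """Convert a WORD or BYTE constant to object code."""
    if opcode == "WORD":
        return format(int(operand) & 0xFFFFFF, "06X")
    kind, text = operand[0], operand[2:-1]       # C'EOF' or X'F1'
    return text.encode("ascii").hex().upper() if kind == "C" else text.upper()


def pass2(intermediate, symtab):
    """Return (listing, errors); each listing row ends with its object code."""
    listing, errors = [], []
    for address, opcode, operand in intermediate:
        code = ""
        if opcode in OPTAB:
            indexed = operand.endswith(",X")
            symbol = operand[:-2] if indexed else operand
            target = 0                           # default when no symbol is given
            if symbol:
                if symbol in symtab:
                    target = symtab[symbol]
                else:
                    errors.append("undefined symbol " + symbol)
            word = (OPTAB[opcode] << 16) | (int(indexed) << 15) | target
            code = format(word, "06X")
        elif opcode in ("BYTE", "WORD"):
            code = constant_to_hex(opcode, operand)
        listing.append((address, opcode, operand, code))
    return listing, errors
```

Because SYMTAB is complete by the time Pass 2 runs, a symbol defined after its first use is looked up just like any other.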
The standard SIC machine supports only one instruction format. Of its registers,
register A and the index register X are the ones used for operands and indexing.
Therefore the addressing modes supported by this architecture are direct
and indexed.
The memory of a SIC/XE machine, in contrast, is 2^20 bytes (1 MB). It supports
four instruction formats:
1 byte instruction
2 byte instruction
3 byte instruction
4 byte instruction
Instructions can be:
– Instructions involving register to register
– Instructions with one operand in memory, the other in Accumulator
(Single operand instruction)
– Extended instruction format
Addressing Modes:
PC-relative or Base-relative addressing: op m
Indirect addressing: op @m
Immediate addressing: op #c
Extended format: +op m
Index addressing: op m,x
register-to-register instructions
larger memory -> multi-programming (program allocation)
Hence the displacement of the operand is relative to the current program counter value.
The following example shows how the address is calculated:
Base-Relative Addressing Mode: in this mode the displacement value is taken relative
to the base register. Therefore the target address is
TA = (base) + displacement value
This addressing mode is used when the displacement range of PC-relative addressing is
not sufficient. Here the operand is not relative to the instruction as in PC-relative
addressing mode.
The BASE directive tells the assembler what value the base register holds. From the
point where the assembler encounters this directive, it can use base-relative
addressing to calculate the displacement for an operand.
The NOBASE directive indicates that the base register can no longer be used to
calculate the target address of an operand. The assembler first tries PC-relative
addressing; when the displacement field is not large enough, it uses base-relative.
LDB #LENGTH (instruction)
BASE LENGTH (directive)
:
NOBASE
In the above example the BASE directive indicates that base-relative addressing
may be used to calculate the target address when PC-relative addressing cannot reach
the operand. The address of LENGTH is stored in the base register. If PC-relative is
used then the target address calculated is:
The LDB instruction loads the address of LENGTH, which is 0033, into the base
register. The BASE directive explicitly tells the assembler that the base register
holds the address of LENGTH.
BUFFER is at location 0036 (hexadecimal)
(B) = 0033 (hexadecimal)
disp = 0036 - 0033 = 0003 (hexadecimal)
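The choice between PC-relative and base-relative addressing can be sketched in Python as below. It assumes the usual SIC/XE ranges (-2048 to 2047 for a PC-relative displacement, 0 to 4095 for a base-relative one); the function name choose_displacement is hypothetical, and the PC value 1051 used in the usage note is an assumed value for illustration.

```python
def choose_displacement(target: int, pc: int, base=None):
    """Return (mode, 12-bit disp), trying PC-relative before base-relative."""
    disp = target - pc
    if -2048 <= disp <= 2047:
        return "pc", disp & 0xFFF        # negative values stored in 2's complement
    if base is not None:                 # base is None when NOBASE is in effect
        disp = target - base
        if 0 <= disp <= 4095:
            return "base", disp
    raise ValueError("displacement out of range; format 4 would be needed")
```

With BUFFER at 0036 and (B) = 0033, an instruction whose PC value is 1051 cannot reach BUFFER with PC-relative addressing, so choose_displacement(0x0036, pc=0x1051, base=0x0033) falls back to base-relative with disp 3, matching the calculation above.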
The instruction transfers control indirectly through RETADR, which holds the
address to jump to. If the address of RETADR is 0030, the displacement is then
0003 as calculated above.
Program Relocation:
The actual starting address of the program is not known until load time
An object program that contains the information necessary to perform this kind
of modification is called a relocatable program
No modification is needed: operand is using program-counter relative or base
relative addressing
The only parts of the program that require modification at load time are those
that specified direct (as opposed to relative) addresses
Absolute program, starting address 1000
e.g. 55 101B LDA THREE 00102D
Relocate the program to 2000
e.g. 55 101B LDA THREE 00202D
Each absolute address should be modified
The rest of the instructions need not be modified: those whose operand is
not a memory address (immediate addressing) and those using PC-relative or
base-relative addressing.
The above diagram shows the concept of relocation. Initially the program is
loaded at location 0000. The instruction JSUB is loaded at location 0006. The
address field of this instruction contains 01036, which is the address of the
instruction labeled RDREC.
The second figure shows the program loaded at the new location
5000. The address field of the JSUB instruction is modified to 06036.
Likewise, the third figure shows that if the program is relocated to location
7420, the JSUB instruction needs to be changed to 4B108456, which
corresponds to the new address of RDREC.
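The arithmetic behind these figures is just an addition on the 20-bit address field. A minimal Python sketch, using a hypothetical helper name relocate_format4 and taking the starting object code 4B101036 from the JSUB example above:

```python
def relocate_format4(instruction_hex: str, load_address: int) -> str:
    """Add the load address to the 20-bit address field of a format 4 instruction."""
    word = int(instruction_hex, 16)
    address = (word & 0xFFFFF) + load_address     # low 20 bits hold the address
    if address >= (1 << 20):
        raise ValueError("relocated address exceeds 20 bits")
    # Keep the opcode and flag bits, replace only the address field.
    return format((word & ~0xFFFFF) | address, "08X")
```

relocate_format4("4B101036", 0x5000) gives 4B106036, and with a load address of 7420 it gives 4B108456, as in the third figure.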
The only parts of the program that require modification at load time are those
that specify direct addresses. The rest of the instructions need not be modified.
The instructions that do not require modification are those whose operand is not a
memory address (immediate addressing) and those that use PC-relative or
base-relative addressing.
From the object program alone, it is not possible to distinguish an address
from a constant, so the assembler must keep some information to tell the
loader which fields to modify. An object program that contains Modification
records is called a relocatable program.
For an address label, its address is assigned relative to the start of the program
(START 0). The assembler produces a Modification record to store the starting
location and the length of the address field to be modified. This command for
the loader must also be a part of the object program. The Modification record
has the following format:
Modification record
Col. 1 M
Col. 2-7 Starting location of the address field to be modified, relative to the
beginning of the program (Hex)
Col. 8-9 Length of the address field to be modified, in half-bytes (Hex)
One Modification record is created for each address to be modified. The length is
stored in half-bytes (4 bits). The starting location is the location of the byte
containing the leftmost bits of the address field to be modified. If the field contains
an odd number of half-bytes, it is assumed to begin in the middle of the first byte at
the starting location.
In the above object code the red boxes indicate the addresses that need modification.
The object code lines at the end are the Modification records for
those instructions which need to change if relocation occurs. M00000705 is the
Modification record for the address field starting at location 0007, which is
5 half-bytes long. The remaining Modification records are read in the same way.
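How a loader might produce and apply such a record can be sketched in Python. The function names are hypothetical, and the program image is modeled as a bytearray indexed by program-relative location. When the length is an odd number of half-bytes, the sketch leaves the upper half of the first byte untouched, following the rule stated above.

```python
def modification_record(start: int, half_bytes: int) -> str:
    """Col 1: M, Col 2-7: starting location, Col 8-9: length in half-bytes."""
    return "M" + format(start, "06X") + format(half_bytes, "02X")


def apply_modification(program: bytearray, record: str, load_address: int) -> None:
    """Add load_address to the address field that the record describes."""
    start = int(record[1:7], 16)
    half_bytes = int(record[7:9], 16)
    n_bytes = (half_bytes + 1) // 2                   # bytes covering the field
    field = int.from_bytes(program[start:start + n_bytes], "big")
    mask = (1 << (4 * half_bytes)) - 1                # low-order half-bytes only
    updated = (field & ~mask) | ((field + load_address) & mask)
    program[start:start + n_bytes] = updated.to_bytes(n_bytes, "big")
```

For the JSUB instruction 4B101036 at location 0006, the record M00000705 covers the bytes 10 10 36 starting at location 0007; applying it with load address 5000 changes the instruction to 4B106036 while the flag half-byte stays as it was.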