FE Exam Preparation Book VOL1 LimitedDisclosureVer
1
Version
FE Exam
Preparation Book
Preparation Book for Fundamental Information Technology Engineer Examination
Table of Contents
Chapter 1
Computer Science Fundamentals 2
1.1 Basic Theory of Information 3
1.1.1 Radix Conversion 3
1.1.2 Numerical Representations 7
1.1.3 Non-Numerical Representations 10
1.1.4 Operations and Accuracy 11
Quiz 14
1.2 Information and Logic 15
1.2.1 Logical Operations 15
1.2.2 BNF 18
1.2.3 Reverse Polish Notation 21
Quiz 24
1.3 Data Structures 25
1.3.1 Arrays 25
1.3.2 Lists 27
1.3.3 Stacks 29
1.3.4 Queues (Waiting lists) 30
1.3.5 Trees 32
1.3.6 Hash 34
Quiz 37
1.4 Algorithms 38
1.4.1 Search Algorithms 38
1.4.2 Sorting Algorithms 41
1.4.3 String Search Algorithms 45
1.4.4 Graph Algorithms 48
Quiz 50
Questions and Answers 51
Chapter 2
Computer Systems 62
2.1 Hardware 63
2.1.1 Information Elements (Memory) 63
2.1.2 Processor Architecture 65
2.1.3 Memory Architecture 68
2.1.4 Magnetic Tape Units 70
2.1.5 Hard Disks 73
2.1.6 Terms Related to Performance/ RAID 77
2.1.7 Auxiliary Storage / Input and Output Units 79
2.1.8 Input and Output Interfaces 81
Quiz 83
2.2 Operating Systems 85
2.2.1 Configuration and Objectives of OS 85
2.2.2 Job Management 87
2.2.3 Task Management 89
2.2.4 Data Management and File Organization 90
2.2.5 Memory Management 95
Quiz 99
2.3 System Configuration Technology 100
2.3.1 Client Server Systems 100
2.3.2 System Configurations 102
2.3.3 Centralized Processing and Distributed Processing 104
2.3.4 Classification by Processing Mode 106
Quiz 108
2.4 Performance and Reliability of Systems 109
2.4.1 Performance Indexes 109
2.4.2 Reliability 111
2.4.3 Availability 113
Quiz 116
2.5 System Applications 118
2.5.1 Network Applications 118
2.5.2 Database Applications 121
2.5.3 Multimedia Systems 123
Quiz 125
Questions and Answers 126
Chapter 3
System Development 138
3.1 Methods of System Development 139
3.1.1 Programming Languages 139
3.1.2 Program Structures and Subroutines 141
3.1.3 Language Processors 143
3.1.4 Development Environments and Software Packages 144
3.1.5 Development Methods 147
3.1.6 Requirement Analysis Methods 149
3.1.7 Software Quality Management 151
Quiz 154
3.2 Tasks of System Development Processes 155
3.2.1 External Design 155
3.2.2 Internal Design 157
3.2.3 Software Design Methods 159
3.2.4 Module Partitioning Criteria 162
3.2.5 Programming 163
3.2.6 Types and Procedures of Tests 165
3.2.7 Test Techniques 167
Quiz 170
Questions and Answers 172
Chapter 4
Network Technology 181
4.1 Protocols and Transmission Control 182
4.1.1 Network Architectures 182
4.1.2 Transmission Control 184
Quiz 187
4.2 Transmission Technology 188
4.2.1 Error Control 188
4.2.2 Synchronization Control 190
4.2.3 Multiplexing and Communications 192
4.2.4 Switching 194
Quiz 195
4.3 Networks 196
4.3.1 LANs 196
4.3.2 The Internet 198
4.3.3 Various Communication Units 200
4.3.4 Telecommunications Services 202
Quiz 204
Questions and Answers 205
Chapter 5
Database Technology 212
5.1 Data Models 213
5.1.1 3-layer Schemata 213
5.1.2 Logical Data Models 215
5.1.3 E-R Model and E-R Diagrams 217
5.1.4 Normalization and Reference Constraints 218
5.1.5 Data Manipulation in Relational Database 221
Quiz 223
5.2 Database Languages 224
5.2.1 DDL and DML 224
5.2.2 SQL 226
Quiz 231
5.3 Control of Databases 232
5.3.1 Database Control Functions 232
5.3.2 Distributed Databases 234
Quiz 236
Questions and Answers 237
Chapter 6
Security and Standardization 244
6.1 Security 245
6.1.1 Security Protection 245
6.1.2 Computer Viruses 247
6.1.3 Computer Crime 249
Quiz 251
6.2 Standardization 252
6.2.1 Standardization Organizations and Standardization of Development and Environment 252
6.2.2 Standardization of Data 254
6.2.3 Standardization of Data Exchange and Software 256
Quiz 258
Questions and Answers 259
Chapter 7
Computerization and Management 262
7.1 Information Strategies 263
7.1.1 Management Control 263
7.1.2 Computerization Strategies 265
Quiz 267
7.2 Corporate Accounting 268
7.2.1 Financial Accounting 268
7.2.2 Management Accounting 270
Quiz 274
7.3 Management Engineering 275
7.3.1 IE 275
7.3.2 Schedule Control (OR) 278
7.3.3 Linear Programming 282
7.3.4 Inventory Control (OR) 284
7.3.5 Probability and Statistics 286
Quiz 290
7.4 Use of Information Systems 291
7.4.1 Engineering Systems 291
7.4.2 Business Systems 293
Quiz 296
Questions and Answers 297
Part 1
PREPARATION FOR
MORNING EXAM
The Morning Exam questions are formulated from the following seven
fields: Computer Science Fundamentals, Computer Systems, System
Development, Network Technology, Database Technology, Security
and Standardization, and Computerization and Management.
Here, detailed explanations of each field are provided at the
beginning of each chapter, followed by actual questions used in
past exams, with answers and comments included
at the end of each chapter.
Chapter Objectives
In order to become an information technology engineer, it
is necessary to understand the structures of information
processed by computers and the meaning of information
processing. All information is stored as binary numbers in
computers; therefore, in Section 1, we will learn the form
in which decimal numbers and characters we use in daily
life are stored in computers. In Section 2, we will study
logical operations as a specific example of information
processing. In Section 3, we will learn about data structures,
which must be devised carefully to make
data processing easier. Lastly, in Section 4, we will study
specific data processing methods.
The term “Radix1 conversion” means, for instance, converting a decimal number to a binary
number. Here, “10” in decimal numbers and “2” in binary numbers are called the radices.
Inside a computer, all data is expressed as binary numbers since the two conditions of
electricity, ON and OFF, correspond to the binary numbers. Each digit of a binary number is
either a “0” or a “1,” so all numbers are expressed by two symbols—0 and 1.
However, binary numbers, expressed as combinations of 0s and 1s, tend to be long and hard to
understand, so the concept of hexadecimal notation was introduced. In hexadecimal notation,
4 bits2 (corresponding to numbers 0 through 15 in decimal notation) are represented by one
digit (0 through F).
The table below shows the correspondence among the decimal, binary, and hexadecimal
notations.
1
Radix: It is the number that forms a unit of weight for each digit in a numeration system such as binary, octal, decimal,
and hexadecimal notations. The radix in each of these notations is 2, 8, 10, and 16, respectively.
Binary system: uses 0 and 1.
Octal system: uses 0 through 7.
Decimal system: uses 0 through 9.
Hexadecimal system: uses 0 through F.
2
Bit: It means the smallest unit of information inside a computer, expressed by a “0” or a “1.” Data inside a computer is
expressed in binary, so a bit represents one digit in binary notation. For the purpose of convenience, the hexadecimal and
octal notations are represented by partitioning binary numbers as follows:
Quaternary: 2 bits (0 through 3)
Octal: 3 bits (0 through 7)
Hexadecimal: 4 bits (0 through F)
$(1100100)_2 = 1 \times 2^6 + 1 \times 2^5 + 0 \times 2^4 + 0 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0$
$= 64 + 32 + 4$
$= (100)_{10}$ …… (b)
For those digits to the right of the radix point, the weights are $r^{-1}, r^{-2}, r^{-3}, \ldots$ in order. Thus, the
conversion is shown below. (In these examples, (c) is shown in hexadecimal, and (d) is in
binary.)
$(59)_{10} = 32 + 16 + 8 + 2 + 1 = 2^5 + 2^4 + 2^3 + 2^1 + 2^0$
$= 1 \times 2^5 + 1 \times 2^4 + 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0$
$= (111011)_2$
3
Weight: the value that indicates each scaling position in numerical expressions such as binary, octal, decimal, and
hexadecimal.
However, we can also divide the given number by 2 sequentially and repeat it until the quotient
becomes 0. This is a mechanical conversion method, so calculation errors can be reduced. 4
Step   Division    Quotient   Remainder
(1)    59 / 2      29         1
(2)    29 / 2      14         1
(3)    14 / 2      7          0
(4)    7 / 2       3          1
(5)    3 / 2       1          1
(6)    1 / 2       0          1   ← The process ends when the quotient is 0.
(7) List the remainders from the bottom. → $(59)_{10} = (111011)_2$
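The repeated-division procedure above can be sketched in a few lines of Python. This is only an illustrative sketch; the function name `to_binary` is an assumption introduced here and does not come from the text.

```python
def to_binary(n):
    """Convert a non-negative decimal integer to a binary digit string
    by dividing by 2 repeatedly and collecting the remainders."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:                      # the process ends when the quotient is 0
        n, remainder = divmod(n, 2)
        remainders.append(str(remainder))
    # The remainders are listed from the bottom (last one first).
    return "".join(reversed(remainders))


print(to_binary(59))   # 111011
print(to_binary(100))  # 1100100
```

The same loop works for other radices if the divisor 2 is replaced by the target radix, provided digits above 9 are mapped to letters.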
(0.1 0 0 1 1)2
However, we can also multiply the fractional part (the part to the right of the decimal (or radix)
point) by 2 sequentially and repeat it until the fractional part becomes 0. This is a mechanical
conversion method, so calculation errors can be reduced.
0.59375 × 2 = 1.1875 → (1) Write down only the fractional part.
0.1875 × 2 = 0.375 → (2) Write down only the fractional part.
0.375 × 2 = 0.75 → (3) Write down only the fractional part.
0.75 × 2 = 1.5 → (4) Write down only the fractional part.
0.5 × 2 = 1.0 ← The process ends when the fractional part becomes 0.
(5) List the integer-part values from the top. → $(0.59375)_{10} = (0.10011)_2$
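The repeated-multiplication procedure can be sketched as follows. This is a minimal sketch: the function name `fraction_to_binary` and the `max_digits` cut-off are assumptions introduced here, and `fractions.Fraction` is used so that decimal inputs such as 0.1 are handled exactly rather than as binary floating-point approximations.

```python
from fractions import Fraction


def fraction_to_binary(value, max_digits=16):
    """Convert a decimal fraction (0 <= value < 1) to binary.

    Returns the binary string and a flag telling whether the
    fractional part reached 0 within max_digits digits.
    """
    frac = Fraction(value)
    digits = []
    while frac != 0 and len(digits) < max_digits:
        frac *= 2
        integer_part = int(frac)      # 0 or 1
        digits.append(str(integer_part))
        frac -= integer_part          # keep only the fractional part
    return "0." + "".join(digits), frac == 0


print(fraction_to_binary("0.59375"))              # ('0.10011', True)
print(fraction_to_binary("0.1", max_digits=12))   # ('0.000110011001', False)
```

The second call illustrates the case where the fractional part never becomes 0, so the conversion has to be stopped at an appropriate place.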
4
(Note) There is no guarantee that repeatedly multiplying the fractional part by 2 eventually produces 0. We can verify this fact by
converting $(0.1)_{10}$ into a binary number; it becomes a repeating binary fraction. A finite binary
fraction can always be converted to a finite decimal fraction, but not vice versa. In such a case, we can stop the conversion at an appropriate place.
5
Repeating fraction: a number with a radix point where a sequence of digits is repeated indefinitely. For instance, 1 / 3 =
0.333…, and 1 / 7 = 0.142857142857…, wherein the patterns “3” and “142857” are repeated, respectively.
(2 D E 4) 16
(2 D E 4)16
0. 1011 1000
Decimal numbers used in our daily life need to be converted to a format that is convenient for
computer processing, so there are various formats available to represent numerical values.
Some of the formats that represent numerical values in a computer are shown below.
6
(FAQ) There are many questions mixing multiple radices (bases) such as “Which of the following is the correct result
(in decimal) of adding the hexadecimal and binary numbers?” If the final result is to be represented in decimal, it is better
that you convert the original numbers to decimal first and then calculate it. If the final result is to be represented in a radix
other than 10 (binary, octal, hexadecimal, etc.), it is better that you convert the original numbers to binary first and then
carry on the calculation.
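Zoned decimal format (Zone bits8: 0011; Sign bits: 1100 = positive or zero, 1101 = negative)
Digit      1            2            3            4 (with sign)
+1234      0011 0001    0011 0010    0011 0011    1100 0100
-1234      0011 0001    0011 0010    0011 0011    1101 0100
In each byte, the upper 4 bits are the zone bits (the sign bits in the last byte), and the lower 4 bits are the numeral bits.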
In the packed decimal format, each digit of the decimal number is represented with 4 bits, and
the last four bits indicate the sign. The leading space of the highest byte is padded with 0s. The
bit pattern of the sign bits is the same as that of the zoned decimal format. In the examples
shown below, 2 bytes and 4 bits are sufficient to represent the numbers, but in both cases 3
bytes are used by appending four leading 0s since computers reserve areas in byte9 units.
Digit      0     1     2     3     4     Sign
+1234      0000  0001  0010  0011  0100  1100
-1234      0000  0001  0010  0011  0100  1101
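As a rough illustration of the two formats, the sketch below builds the bit patterns for the zoned and packed decimal formats. The function names are assumptions introduced here; the zone bits 0011 and the sign bits 1100 / 1101 follow the examples in the text (other computers may use different zone bits).

```python
ZONE = "0011"            # zone bits used in the text's examples
SIGN_POSITIVE = "1100"   # positive or zero
SIGN_NEGATIVE = "1101"   # negative


def zoned_decimal(value):
    """One byte per digit; the last byte carries the sign bits instead of zone bits."""
    digits = str(abs(value))
    sign = SIGN_POSITIVE if value >= 0 else SIGN_NEGATIVE
    groups = []
    for position, digit in enumerate(digits):
        upper = sign if position == len(digits) - 1 else ZONE
        groups += [upper, format(int(digit), "04b")]
    return " ".join(groups)


def packed_decimal(value):
    """Four bits per digit, sign in the last four bits, padded to whole bytes."""
    sign = SIGN_POSITIVE if value >= 0 else SIGN_NEGATIVE
    nibbles = [format(int(d), "04b") for d in str(abs(value))] + [sign]
    if len(nibbles) % 2:             # areas are reserved in byte units
        nibbles.insert(0, "0000")
    return " ".join(nibbles)


print(zoned_decimal(-1234))   # 0011 0001 0011 0010 0011 0011 1101 0100
print(packed_decimal(1234))   # 0000 0001 0010 0011 0100 1100
```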
7
(Hints and Tips) If the sign (positive or negative) is not used in the zoned decimal format, the sign bits are identical to
the zone bits.
8
(Note) The bit patterns in the zone bits are different depending on the computer. The examples shown here have “0011,”
but some computers use “1111.” The numeral bits, however, are identical.
9
Byte: A byte is a unit of 8 bits. It is the unit for representing characters.
Let us represent the decimal number “-20” in two's complement. First, we can represent the
decimal number “+20” in binary as shown below.
(+20)10 = 0 0 0 1 0 1 0 0
Reverse each bit.
1 1 1 0 1 0 1 1 One's complement10
+) 1 Add 1
(-20)10 = 1 1 1 0 1 1 0 0 Two's complement
Hence, $(-20)_{10}$ is represented as $(11101100)_2$. The bit length varies from computer to computer.
In general, the numbers from $-2^{n-1}$ through $2^{n-1} - 1$, a total of $2^n$ numbers, can be represented by
using n bits. Note that, considering only the absolute values, one more negative number can be
represented in comparison with positive numbers.
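The "reverse each bit, then add 1" procedure and the representable range can be checked with a short sketch. The function names are assumptions introduced here, and bit patterns are handled as strings purely for readability.

```python
def negate_twos_complement(bit_string):
    """Return the two's complement of a fixed-width bit string."""
    width = len(bit_string)
    ones_complement = "".join("1" if b == "0" else "0" for b in bit_string)
    # Add 1; a carry out of the top bit is discarded.
    result = (int(ones_complement, 2) + 1) % (1 << width)
    return format(result, "0{}b".format(width))


def representable_range(n_bits):
    """Smallest and largest values representable in n-bit two's complement."""
    return -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1


print(negate_twos_complement("00010100"))  # 11101100, i.e. -20
print(representable_range(8))              # (-128, 127)
```

Applying `negate_twos_complement` twice returns the original pattern, which matches the idea that the same operation converts a negative number back into the corresponding positive number.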
10
Complement: The complement of a number is the value obtained by subtracting the given number from a certain fixed
number, which is a power of the radix or a power of the radix minus 1. For instance, in decimal, there are ten's
complements and nine's complements. In binary, there are two's complements and one's complements. In general, in the
r-ary system, there are r's complements and (r-1)'s complements. If x is an n-digit number in the r-ary system, r's
complement of x is $r^n - x$, and (r-1)'s complement of x is $(r^n - 1) - x$. For example, the three-digit number “123” in decimal
has the following complements: ten's complement is “1000 – 123 = 877,” and nine's complement is “999 – 123 = 876.”
The 4-bit number “0101” in binary has the following complements: two's complement is “10000 – 0101 = 1011,” and one's
complement is “1111 – 0101 = 1010.”
Note that one's complement in binary is just the reverse of each bit (0 becomes 1 and vice-versa). Two's complement is
one's complement plus 1.
11
Register: It is low-capacity, high-speed memory placed in the CPU for temporary storage of data.
12
(FAQ) There are many questions on converting a given binary number into the corresponding negative number and
converting a given negative number into the corresponding positive number.
Character Representations
Using n-bit binary numbers, there are $2^n$ types of codes available, and a one-to-one
correspondence with those codes allows us to represent $2^n$ types of characters (alphabet characters,
numeric characters, special characters, and various symbols).
Each digit of decimal number can be represented by using 4 bits. The following shows such an
example.
[Figure: each decimal digit is represented by 4 bits whose scaling positions are, from the left, the $2^3$'s bit, the $2^2$'s bit, the $2^1$'s bit, and the $2^0$'s bit.]
Computers are equipped with circuits to perform the four fundamental arithmetic operations
and shift operations. For operations such as multiplying a value by $2^n$, the operation speed improves by
using shift operations (or moving digits). All computer operations are executed in registers.
A register14 has only a limited number of significant digits, so an operation result may
contain a margin of error.
13
(FAQ) There have been many exam questions that require some knowledge of organizations which have established
functions and standards regarding JPEG and MPEG. Several keywords, such as JPEG, ISO, and ITU-T for still images,
MPEG, ISO, and IEC for motion pictures, should be checked prior to the exam.
14
Register: It is the low-capacity, high-speed memory placed in the CPU for temporary storage of data; this includes
general-purpose registers used by the CPU to carry out operations.
Shift Operations
A shift operation is the operation of shifting (moving) a bit string to the right or to the left.
Shift methods can be classified as shown below.
Arithmetic shift
An arithmetic shift is used when data is handled as numeric data with a positive or negative
sign; it is an operation of shifting a bit string, except for the sign bit, representing a fixed-point
number. The arithmetic left shift inserts a “0” in the rightmost place that has been made empty
by the shift. In general, shifting left by n bits multiplies the number by $2^n$. The arithmetic
right shift, on the other hand, inserts a value identical to the sign bit into the leftmost place that
has been made empty by the shift. In general, shifting right by n bits multiplies the number by $2^{-n}$
(that is, $1/2^n$). Examples of 1-bit arithmetic shifts are illustrated below. Shifting 1 bit to the left
doubles the value while shifting 1 bit to the right reduces the value to half.
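A minimal sketch of arithmetic shifts on fixed-width bit strings is given below. The function names are assumptions introduced here; the sketch keeps the sign bit fixed and ignores overflow out of the left end, as a simple illustration rather than a model of any particular CPU.

```python
def to_signed(bits):
    """Interpret a bit string as a two's complement integer."""
    value = int(bits, 2)
    return value - (1 << len(bits)) if bits[0] == "1" else value


def arithmetic_left_shift(bits, n):
    # The sign bit stays; the remaining bits move left and 0s fill the right.
    return bits[0] + bits[1 + n:] + "0" * n


def arithmetic_right_shift(bits, n):
    # The sign bit stays; vacated places on the left are filled with the sign bit.
    return bits[0] * (n + 1) + bits[1:len(bits) - n]


x = "00010100"                                   # +20
print(to_signed(arithmetic_left_shift(x, 1)))    # 40
print(to_signed(arithmetic_right_shift(x, 1)))   # 10
y = "11001100"                                   # -52
print(arithmetic_right_shift(y, 3))              # 11111001
```

For negative values, bits shifted out on the right are simply dropped, so -52 shifted right by 3 bits gives -7 rather than -6.5.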
Logical shift
Unlike an arithmetic shift, a logical shift does not handle the data as numeric data; rather, it
handles the data merely as bit strings. It shifts an entire bit string of data and inserts 0s in places
vacated by the shift. In logical shifts15, the relation seen in arithmetic shifts, a change by a factor of $2^n$ or $2^{-n}$,
does not hold in general. Examples of 1-bit logical shifts are illustrated below:
(Logical left shift)    01111010 → 11110100   (the leftmost bit 0 overflows; a 0 is inserted in the vacated rightmost digit)
(Logical right shift)   10011001 → 01001100   (the rightmost bit 1 overflows; a 0 is inserted in the vacated leftmost digit)
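The logical shifts illustrated above can be reproduced with a short sketch that treats the data purely as a bit string. The function names are assumptions introduced here.

```python
def logical_left_shift(bits, n):
    # Bits pushed out on the left overflow; 0s fill the vacated right digits.
    return bits[n:] + "0" * n


def logical_right_shift(bits, n):
    # Bits pushed out on the right overflow; 0s fill the vacated left digits.
    return "0" * n + bits[:len(bits) - n]


print(logical_left_shift("01111010", 1))   # 11110100
print(logical_right_shift("10011001", 1))  # 01001100
```

In the first call the leading bit changes from 0 to 1, so a pattern that would be positive as numeric data becomes negative after the shift.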
15
(Note) In a logical shift, the figure indicates that the sign bit of 0 may become 1 after the shift. If the data is numeric,
this means that a positive number changes to a negative number by the shift operation.
Errors
Since operations are executed by a computer register with a limited number of digits, numerical
values that cannot be contained in the register are ignored, resulting in differences between
the operation results and true values. Such a difference is called an error.
Rounding errors
Since computers cannot handle an infinite (non-terminating) fraction, bits smaller than a certain
bit are rounded off, rounded down, or rounded up to the value with the limited number of
significant digits. The difference between the true value and the result of such rounding is
called the rounding16 error (or round-off error).
When one number is subtracted from another number almost identical to it, or when two
numbers, one positive and the other negative, with almost identical absolute values are added
together, the number of significant digits could drop drastically. This is called a cancellation of
significant digits (or cancellation error).
356.3622
- 356.3579
0.0043
Since the higher digits become 0, the number of significant digits decreases
drastically.
When a very large number and a very small number are added together, or when one is
subtracted from the other, some information (or a part thereof) in the lower digits, which cannot
be contained in the mantissa, can be lost due to the alignment of the numbers. This is called a
loss of trailing digits. In order to keep the error by loss of trailing digits small, it is necessary
to do addition and subtraction in an order starting with numbers with small absolute values.
356.3622
- 0.000015 Digits in extremely small place values get omitted.
356.3622
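Both effects can be observed with Python's `decimal` module by limiting the number of significant digits. This is only a sketch under an assumption introduced here: a precision of 7 significant decimal digits stands in for the limited mantissa, and the helper name `add_in_order` is hypothetical.

```python
from decimal import Decimal, localcontext


def add_in_order(values, precision=7):
    """Add the values one by one, rounding every intermediate
    result to the given number of significant digits."""
    with localcontext() as ctx:
        ctx.prec = precision
        total = Decimal(0)
        for value in values:
            total = total + value
        return total


big = Decimal("356.3622")
small = [Decimal("0.000015")] * 20

# Cancellation: only two significant digits survive the subtraction.
difference = big - Decimal("356.3579")
print(difference)                       # 0.0043

# Loss of trailing digits: the small values vanish when added to the large one.
print(add_in_order([big] + small))      # 356.3622
# Adding the small absolute values first keeps their contribution.
print(add_in_order(small + [big]))      # 356.3625
```

The last two lines show why additions are better done starting with the numbers that have small absolute values.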
16
Rounding: It is a way to approximate a number by rounding off, rounding down, or rounding up so that it can be easily
handled by people. For instance, if 2.15 is rounded to the nearest integer, it is rounded to 2, with an error of 0.15.
Quiz
Q1 Express the decimal number 100 in the binary, octal, and hexadecimal notations.
Q2 Perform arithmetic right and logical right shifts by 3 bits on the 8-bit binary number 11001100.
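A1 Binary: $(1100100)_2$
Octal: $(144)_8$
Hexadecimal: $(64)_{16}$
For conversion to binary:
$100 = 64 + 32 + 4$
$= 2^6 + 2^5 + 2^2$
$= 1 \times 2^6 + 1 \times 2^5 + 0 \times 2^4 + 0 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0$
$= (1100100)_2$
Grouping the binary digits from the right in threes gives the octal result, and grouping them in fours gives the hexadecimal result: $(1\,100\,100)_2 = (144)_8$ and $(110\,0100)_2 = (64)_{16}$.
A2 Arithmetic right shift: 11111001; logical right shift: 00011001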
Loss of trailing digits: a phenomenon where some information (or a part thereof) in the
lower digits, which cannot be contained in the mantissa, can be lost due to the alignment
of the numbers when a very large number and a very small number are added together, or
when one is subtracted from the other
To make a computer perform a task, a program written according to rules is needed. Here, we
will learn about logical operations, BNF, and reverse Polish notation. Logical operations are
fundamental to the mechanism of operations. BNF is a notation for describing the syntax rules used in writing programs.
Reverse Polish notation is used to interpret mathematical formulas written in programs.
Basic logical operations include logical product (AND), logical sum (OR), logical negation
(NOT), and exclusive logical sum (EOR, XOR). Logical negation is sometimes referred to
simply as negation.
17
(Note) The inside of a computer is equipped with circuits corresponding to logical product, logical sum, and logical
negation. All operations are executed using combinations of these circuits.
Exclusive logical sum can be expanded, as shown below. Many questions can be easily
answered if you know the expanded form of exclusive logical sum, so be sure to know the
expanded formula.
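$A \oplus B = A \cdot \overline{B} + \overline{A} \cdot B$
$A$   $B$   $\overline{A}$   $\overline{B}$   $A \oplus B$   $A \cdot \overline{B}$   $\overline{A} \cdot B$   $A \cdot \overline{B} + \overline{A} \cdot B$
0     0     1     1     0     0     0     0
0     1     1     0     1     0     1     1
1     0     0     1     1     1     0     1
1     1     0     0     0     0     0     0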
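The expanded form can be verified by enumerating all four input combinations. In this sketch, 0 and 1 are used as truth values, `1 - a` stands for the negation of `a`, and the function name `xor_expanded` is an assumption introduced here.

```python
from itertools import product


def xor_expanded(a, b):
    """A AND (NOT B)  OR  (NOT A) AND B, with 0/1 as truth values."""
    return (a & (1 - b)) | ((1 - a) & b)


# Columns: A, B, built-in exclusive OR, expanded form
table = [(a, b, a ^ b, xor_expanded(a, b)) for a, b in product((0, 1), repeat=2)]
for row in table:
    print(*row)
```

Every row shows the same value in the last two columns, which is what the truth table above states.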
De Morgan's Theorem
A well-known set of formulas concerning logical operations is De Morgan's theorem. These
laws give the relations, as shown below. You can easily memorize them if you remember to
exchange logical products and logical sums when removing parentheses. Many questions can
be easily answered if you know De Morgan's theorem, so be sure to know these formulas.20
$\overline{A \cdot B} = \overline{A} + \overline{B}$
$\overline{A + B} = \overline{A} \cdot \overline{B}$
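Since there are only four combinations of inputs, De Morgan's theorem can be confirmed exhaustively. The sketch below uses Python's `not`, `and`, and `or` for logical negation, logical product, and logical sum; the variable names are assumptions introduced here.

```python
from itertools import product

inputs = list(product((False, True), repeat=2))

# Negation of a logical product equals the logical sum of the negations.
product_law = all((not (a and b)) == ((not a) or (not b)) for a, b in inputs)
# Negation of a logical sum equals the logical product of the negations.
sum_law = all((not (a or b)) == ((not a) and (not b)) for a, b in inputs)

print(product_law, sum_law)  # True True
```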
18
Negation of logical sum: $\overline{A + B} = \overline{A} \cdot \overline{B}$. Negation of logical product: $\overline{A \cdot B} = \overline{A} + \overline{B}$.
19
(Note) Some truth tables represent 1 with “T” (or true) and 0 with “F” (or false).
20
(FAQ) Many questions can be easily answered if you know De Morgan's theorem. There are also many questions that
can be easily answered if you know the expanded form of exclusive logical sum.
Adder
An adder is a circuit that performs addition of 1-bit binary numbers, consisting of AND, OR,
and NOT logic circuits.21 There are half adders, which do not take into account carry-overs
from lower bits, and full adders, which take into account carry-overs from lower bits.
When a signal of a “0” or a “1” is sent to the inputs A and B of a circuit, the addition result
appears as outputs C and S. Here, C indicates a carry-over, and S is the lower bit of the result of
addition. The binary addition result is shown below. As seen here, C is the logical product, and
S is the exclusive logical sum.22
A B C S
0 + 0 = 0 0
0 + 1 = 0 1
1 + 0 = 0 1
1 + 1 = 1 0
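The table can be reproduced with a small sketch in which the carry-over C is the logical product and S is the exclusive logical sum of the inputs. The function name `half_adder` is an assumption introduced here.

```python
def half_adder(a, b):
    """Return (C, S) for two 1-bit inputs."""
    carry = a & b   # logical product
    total = a ^ b   # exclusive logical sum
    return carry, total


for a in (0, 1):
    for b in (0, 1):
        c, s = half_adder(a, b)
        print(a, "+", b, "=", c, s)
```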
In the figure below, the circuit structure of a half adder is shown on the left. The figure on the
right is simplified notation for a half adder, which is generally used.
Simplified notation23
21
(FAQ) There are many questions on the use of adders. As a shortcut, most of these questions can be answered if you
know logical operations, but you can save time by knowing the operation results of adders.
22
(Hints and Tips) Be sure to understand the binary 1-bit operations correctly. Be very careful since it is easy to make
careless mistakes. The four additions of 1-bit binary numbers are shown below.
If A and B are both 1s, simple addition gives the sum of 2, but in binary, in which only 0s and 1s are used, a carry-over
takes place, resulting in the sum of “10.” If the adder circuit does not carry over, the output “0” is produced.
23
You must have the circuit symbols memorized well. Be careful not to mix the AND and OR circuits.
For a full adder, there are three input values, one of which is the carry-over from the lower bit.
Hence, a full adder adds three values X, Y, and Z. The addition results are as shown below.
Unlike half adders, there are no general relations such as logical product and exclusive logical
sum with a full adder.
X   Y   Z     C S
0 + 0 + 0  =  0 0
0 + 0 + 1  =  0 1
0 + 1 + 0  =  0 1
0 + 1 + 1  =  1 0
1 + 0 + 0  =  0 1
1 + 0 + 1  =  1 0
1 + 1 + 0  =  1 0
1 + 1 + 1  =  1 1
In the figure below, the circuit structure of a full adder is shown on the left. As shown in the
figure, a full adder consists of two half adders combined. The figure on the right is simplified
notation for the full adder.
Simplified notation
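The statement that a full adder consists of two half adders combined can be sketched as follows. Joining the two carry-overs with a logical sum (OR) is the usual textbook construction and is an assumption here, since the circuit figure itself is not reproduced; the function names are also introduced only for illustration.

```python
def half_adder(a, b):
    """Return (C, S) for two 1-bit inputs."""
    return a & b, a ^ b


def full_adder(x, y, z):
    """Add three 1-bit inputs (z is the carry-over from the lower bit)."""
    carry1, partial = half_adder(x, y)     # first half adder: X + Y
    carry2, total = half_adder(partial, z) # second half adder: (X + Y) + Z
    return carry1 | carry2, total          # at most one of the carries is 1


for x in (0, 1):
    for y in (0, 1):
        for z in (0, 1):
            c, s = full_adder(x, y, z)
            print(x, y, z, "=", c, s)
```

The printed rows can be compared with the full adder table: in every case the two output bits together read as the binary value of X + Y + Z.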
1.2.2 BNF
Points
- A means of strictly expressing the grammar of a programming language
- The terminal symbols cannot be further decomposed.
To define the grammar of a programming language (syntax definition), expressions free from
any ambiguity are required. To express such a grammar, BNF (Backus-Naur Form) is often
used.24
BNF defines the rules of character orders by using characters; it also defines repetition and
selection using appropriate character symbols. Since only characters are used in the definitions,
the expressions are simple and close to the final descriptive style of the sentences. Furthermore,
not only does BNF give unambiguous definitions, it is also considered to be easy to
understand.
24
(Note) BNF was first used to define ALGOL60, a programming language for technical calculations. BNF is a language
to define syntax formally, not to stipulate any meanings. Hence, it cannot define every rule of a language, so today many
extensions of BNF are used.
Sequence
25
<x>::=<a><b>
This gives a definition which means “the syntax element x is the character a followed by the character b.”
The symbol “::=” means “is defined to be.”
Repetition
<x>::=<a>…
This gives a definition which means “the syntax element x is a repetition of the character a.” That is,
the character a appears one or more times.
Selection
<x>::=<a | b>
This gives a definition which means “the syntax element x is either the character a or the
character b.” If one of the options is missing, the following expression is used:
<x>::=[<a>]
This gives a definition which means “the syntax element x is either the character a or the null
character (blank).” The symbols “[ ]” mean that the enclosed element can be omitted.
In the following definitions, the underlined “<x>” is a non-terminal symbol whereas a, b, and c
are terminal symbols.
<y>::=<a><x>
<x>::=<b><c>
25
(Note) < >: These angle brackets are used when characters are consecutively placed or when the bounds are unclear;
these do not have to be used.
26
(Note) Non-terminal symbols: These are used to make the syntax definition easy to understand.
Examples of BNF
For example, the syntax rules of “floating-point constant” are defined as follows:
Let us follow the syntax rules above to see what <floating-point constant> looks like
specifically.27
The definition of <floating-point constant> is separated by the line “|” which separates the
group (1)~(3) from the group (4)~(6), so it has two possible forms. Let us take the first group
(1)~(3) as our example to interpret what <floating-point constant> is. For clarity, we include in
our example the part surrounded by [ ], which can be omitted. Each of the elements (1)~(3) can
be further expanded as follows:
<sign><radix constant><exponent>
<E><sign><numeric string>
<numeral>::= <0|1|2|3|4|5|6|7|8|9>
<numeral>::= 0
27
(FAQ) An example of syntactical rules: often, questions follow the pattern of selecting the sentence that satisfies given
syntactical rules.
28
(Note) An expression with a character string combined with special symbols ($, *, etc.) is called a regular expression.
These designated characters are called meta-characters. Meta-characters have specific meanings. In UNIX, Windows, etc.,
if one searches for a file by entering “*.jpg,” then the system looks for all files with the extension “jpg.” Here, the symbol
“*” is a meta-character.
Further, considering the definitions of <numeric string> and <numeral>, “0” itself is also a
<numeric string>. Therefore, we can have the following:
This allows “012” to be a <numeric string> as well. Hence, a <numeric string> is any consecutive
string of numerals. For example, <radix constant> can look as follows:
Thus, the interpretation of <floating-point constant> can give us the following example:
<sign><radix constant><exponent>
+ 123.456 E+123
Of course, the <sign> can be a negative sign “–” or can be omitted altogether, so the following
strings can also be floating-point constants.
-123.456E-123
-123.456E123
As you can see, if all specific forms are to be expressed, that would result in an enormous
amount of information. BNF is thus used to give general definitions to avoid such a situation.
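To illustrate how a general definition stands in for an enormous number of specific strings, the sketch below checks strings against one assumed form only: an optional <sign>, digits with a radix point, and an exponent written as E, an optional <sign>, and a <numeric string>. This pattern is an assumption reconstructed from the examples in the text (such as -123.456E-123); it is not the full set of syntax rules, and the names used are hypothetical.

```python
import re

SIGN = r"[+-]"
NUMERIC_STRING = r"[0-9]+"          # one or more <numeral>s

# Assumed form: [<sign>] <numeric string> . <numeric string> E [<sign>] <numeric string>
FLOATING_POINT = re.compile(
    r"(?:{sign})?{num}\.{num}E(?:{sign})?{num}".format(sign=SIGN, num=NUMERIC_STRING)
)


def is_floating_point_constant(text):
    """True if the whole string matches the assumed pattern."""
    return FLOATING_POINT.fullmatch(text) is not None


for candidate in ["+123.456E+123", "-123.456E-123", "-123.456E123", "123.456E", "E+12"]:
    print(candidate, is_floating_point_constant(candidate))
```

A single pattern accepts every string of the assumed form, which mirrors the way one BNF rule replaces a listing of all specific cases.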
Reverse Polish Notation is a method of expressing mathematical formulas we use every day
in a form more easily processed by computers. The basic concept of this notation is that the
operators are written toward the end as opposed to the middle of a formula.
For example, X = A + B * C means “Calculate the product of B and C, add A, and then move
the result to X.” This is expressed by extracting the underlined parts as follows:
XABC*+=
(1) e = a − b ÷ (c + d)
“(c + d)” is converted to Reverse Polish Notation. → cd+
Let us call this string “P.”
(2) e = a − b ÷ P
“b ÷ P” is converted to Reverse Polish Notation. → bP÷
Let us call this string “Q.”
(3) e = a − Q
“a − Q” is converted to Reverse Polish Notation. → aQ−
Let us call this string “R.”
(4) e = R
“e = R” is converted to Reverse Polish Notation. → eR=
(5) Re-write P, Q, and R in Reverse Polish Notation (underlines indicate where replacement
has occurred):
eR= → eaQ−= → eabP÷−= → eabcd+÷−=
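The conversion can also be done mechanically. The sketch below does not follow the substitution steps above; it uses a different, stack-based technique (commonly known as the shunting-yard algorithm) that yields the same result for these examples. Single-character variables, the ASCII minus sign "-", and the precedence table are assumptions made for this illustration.

```python
# Larger numbers bind more tightly; "=" is evaluated last.
PRECEDENCE = {"=": 0, "+": 1, "-": 1, "*": 2, "÷": 2}


def to_rpn(formula):
    """Convert an infix formula with single-character variables to RPN."""
    output, stack = [], []
    for token in formula.replace(" ", ""):
        if token.isalnum():
            output.append(token)                 # variables go straight to the output
        elif token == "(":
            stack.append(token)
        elif token == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()                          # discard the "("
        else:
            # Pop operators that must be applied before this one.
            while (stack and stack[-1] != "("
                   and (PRECEDENCE[stack[-1]] > PRECEDENCE[token]
                        or (PRECEDENCE[stack[-1]] == PRECEDENCE[token] and token != "="))):
                output.append(stack.pop())
            stack.append(token)
    while stack:
        output.append(stack.pop())
    return "".join(output)


print(to_rpn("X = A + B * C"))        # XABC*+=
print(to_rpn("e = a - b ÷ (c + d)"))  # eabcd+÷-=
```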
29
(FAQ) Conversion into Reverse Polish Notation or into a mathematical formula is a very frequent theme on exams. It is
best if you learn how to answer these questions intuitively.
30
(Note) Intuitively, Reverse Polish Notation follows the order of operations in the formula when converting.
(i) Scan the Reverse Polish Notation from the beginning, looking for an operator.31
(ii) Execute the operation indicated by the first operator, using the two variables immediately
preceding the operator.
(iii) Let the result of the operation of (ii) be a new variable, and repeat the first two steps (i)
and (ii).
For example, consider the formula in Reverse Polish Notation “ eabcd + ÷− = .” This is
converted as follows:
Here, the underlined parts indicate the parts that can be converted.
(1) Scan the Reverse Polish Notation “eabcd+÷−=” from the beginning, searching for an
operator. The first operator is “+,” so the focus is on that operator and the two variables
preceding it, i.e., “cd+.”
cd+ → c + d Let this be P. → “eabP÷−=”
(2) Scan the expression “eabP÷−=” from the beginning, searching for an operator. The first
operator is “÷,” so the focus is on that operator and the two variables preceding it, i.e.,
“bP÷.”
bP÷ → b ÷ P Let this be Q. → “eaQ−=”
(3) Scan the expression “eaQ−=” from the beginning, searching for an operator. The first
operator is “−,” so the focus is on that operator and the two variables preceding it, i.e.,
“aQ−.”
aQ− → a − Q Let this be R. → “eR=”
(4) Rewrite P, Q, and R as mathematical formulas (the underlined parts have been replaced).
eR= → e = R → e = a − Q → e = a − b ÷ P → e = a − (b ÷ (c + d))
Removing unnecessary parentheses, we get the following result:
e = a − b ÷ (c + d)
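Steps (i) to (iii) map naturally onto a stack: each operator takes the two items immediately preceding it and replaces them with one new item. The sketch below assumes single-character variables and the ASCII minus sign "-"; it keeps every pair of parentheses instead of removing the unnecessary ones, and the function name is introduced here for illustration.

```python
OPERATORS = set("+-*÷=")


def rpn_to_infix(rpn):
    """Rebuild a fully parenthesized formula from Reverse Polish Notation."""
    stack = []
    for token in rpn.replace(" ", ""):
        if token in OPERATORS:
            right = stack.pop()          # the two items immediately
            left = stack.pop()           # preceding the operator
            stack.append("({}{}{})".format(left, token, right))
        else:
            stack.append(token)
    if len(stack) != 1:
        raise ValueError("not a well-formed Reverse Polish Notation string")
    return stack[0]


print(rpn_to_infix("eabcd+÷-="))  # (e=(a-(b÷(c+d))))
print(rpn_to_infix("ab+cd-*"))    # ((a+b)*(c-d))
```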
Polish Notation
In Polish Notation, “ a + b ” is expressed as “ + ab ”, for instance.32 Whereas the expression
for this in Reverse Polish Notation is “ ab + ,” Polish Notation places the operator in front of
the variables. The fundamental concept is the same as that of Reverse Polish Notation. If
“ e = a − b ÷ (c + d ) ” is converted to Polish Notation, we have the following:
e = a − b ÷ (c + d) → = e − a ÷ b + c d
31
(Hints and Tips) In Reverse Polish Notation, once you find an operator, there will always be two variables that precede
it immediately.
32
In Polish Notation, every operator is always followed by two variables. If there are not two variables, search for the next
variable.
Quiz
Q1 Given the values of logical variables x and y below, complete the table below by calculating the
logical product, logical sum, and exclusive logical sum.
A1
A2
Adder: A circuit that adds 1-bit binary numbers, consisting of AND, OR, and NOT
circuits.
Half adder: An adder that does not take into account carry-overs from lower bits. There are
two input values and two output values.
Full adder: An adder that does take into account carry-overs from lower bits. There are
three input values and two output values.
A3
ab+cd– ×
The interpretation is that we “add a and b,” “subtract d from c,” and then “multiply” these
results together.
1.3.1 Arrays
Points
- Arrays can be used in every data structure.
- Arrays are referred to by index.
An array is a data structure consisting of multiple data of the same type. For example, imagine
children lined up in a single row. This situation, in which objects with identical properties
(here, the objects are “children”) are repeated, is similar to an array. Each child is identified as
the “first child,” “second child,” etc. These numbers, “first, second, …” are called index
numbers. An array is used when multiple data of the same type are handled not individually
but in relation to one another. The data is given an array name, and each data field (element) is
identified by an index.
1-Dimensional Arrays
33
A 1-dimensional array is conceptually shown below.
Index 1 2 3 4 … 25 26
Array T a b c d … y z
Each array is given a name. In the example shown above, the name is “T.” To identify each
element, an index is used. An index number represents the position of an element in the array.34
For example, the fourth element “d” is designated by “T(4),” where the index number is in
parentheses. In some languages, square brackets [ ] are used. In general, the n-th element of the
array is denoted by “T(n).” By changing the value of n, we can indicate any element of the
array.
33
(FAQ) There are hardly any questions directly on arrays themselves. However, any question on a data structure or an
algorithm always uses an array. Hence, you must understand arrays properly. More specifically, be sure that you understand
how to use the index.
34
(Hints and Tips) The index begins with 0 in some programming languages. Questions on algorithms on the exam may
have indexes starting at 0 or 1, so care must be taken.
2-Dimensional Arrays
35
A 2-dimensional array is conceptually shown below.
In general, the elements of a 2-dimensional array are identified using two sets of index
numbers m and n. The notation is “a(m,n)” or “amn,” where m is used for the row and n for the
column. The array shown above is a 2-dimensional array with 3 rows and 4 columns,
sometimes called a “3 by 4” array.
2-dimensional array (3 rows, 4 columns):
a(1,1) a(1,2) a(1,3) a(1,4)
a(2,1) a(2,2) a(2,3) a(2,4)
a(3,1) a(3,2) a(3,3) a(3,4)

Column-directional storage:
a(1,1) a(2,1) a(3,1) a(1,2) a(2,2) a(3,2) a(1,3) a(2,3) a(3,3) a(1,4) a(2,4) a(3,4)

Row-directional storage:
a(1,1) a(1,2) a(1,3) a(1,4) a(2,1) a(2,2) a(2,3) a(2,4) a(3,1) a(3,2) a(3,3) a(3,4)
In the figure above, note the difference in how the indexes change. In column-directional storage, the x in “a(x,y)” changes first; in row-directional storage, the y changes first. Accessing the elements of an array consecutively is more efficient than jumping from place to place. Hence, for efficient processing, the indexes are controlled as follows:
• Column-directional: x in “a(x,y)” (x = 1 to m; y = 1 to n) changes first.
• Row-directional: y in “a(x,y)” (x = 1 to m; y = 1 to n) changes first.
With this arrangement, the elements of the 2-dimensional array are referenced in the same order in which they are laid out as a 1-dimensional array, which makes access more efficient.
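As a rough illustration of the two storage orders, the sketch below computes the 1-dimensional position of a(x,y) in an m-by-n array, using indexes that start at 1 as in the text. The function names are made up for this example and do not come from any particular language or library.

```python
def column_directional_position(x, y, m, n):
    """1-based position of a(x, y) when columns are stored one after another.

    n is accepted only to keep the two functions symmetric; it is not needed.
    """
    return (y - 1) * m + x


def row_directional_position(x, y, m, n):
    """1-based position of a(x, y) when rows are stored one after another."""
    return (x - 1) * n + y


m, n = 3, 4  # the 3-by-4 array from the text

# Visit the elements in column-directional storage order: x changes first.
column_order = sorted(
    ((x, y) for x in range(1, m + 1) for y in range(1, n + 1)),
    key=lambda xy: column_directional_position(xy[0], xy[1], m, n),
)
print(column_order[:4])  # [(1, 1), (2, 1), (3, 1), (1, 2)]
```

Looping with x in the inner loop visits consecutive positions under column-directional storage; looping with y in the inner loop does the same under row-directional storage.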
35
(Hints and Tips) A 1-dimensional array is used when data is simply stored. A 2-dimensional array is used when storing
objects like mathematical matrices.
36
Among programming languages, Fortran uses column-directional storage whereas COBOL, PL/I, and C use
row-directional storage.
1.3.2 Lists
Points
• Lists are characterized by being connected with pointers.
• Operations for a list are controlled by changing the values of pointers.
A list is a set of identical or similar data placed logically in one line (linear37); its structure is
similar to that of an array. The difference is that, whereas the elements of an array are placed
physically right next to one another, the elements of a list can be placed at independent
locations, with pointers establishing the connections between them. Because of this, arrays and
lists are sometimes distinguished by another pair of terms: an array is called a “linear list,” and
a list is called a “connected list” because its elements are connected by pointers.
In general, the term “list” refers to a “connected list.” In the explanations below, we refer to a
“connected list” simply as a “list.”
Structures of List
A list is a data structure in which the elements are connected by pointers. A pointer is
information indicating the storage location (address) of the next element. Each element is
connected by a pointer, so the elements need not be placed in order.
A list can have a variety of structures. The figure below is called a one-directional
(unidirectional) list.38
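[Figure: a unidirectional list. The root points to the first element, which is followed by two middle elements and the last element.]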
The pointer to the initial element is stored in the variable called the root. The last element (D)
of the list has no element following it, so its pointer includes the symbol (X) indicating that the
element is the last one in the list. In some programming languages, this symbol may be stored
automatically; in others, any symbol can be given. The important thing is to assign a value that
cannot exist as data.
37
(Hints and Tips) The term “linear” here refers to a set of data placed in consecutive locations. An array is linear since the
elements are placed in consecutive area. On the other hand, a (connected) list is a structure where the elements are linked
by pointers, so they may not be placed in consecutive locations.
38
Besides unidirectional lists, there are bidirectional lists and ring lists. A bidirectional list is one in which each element
has a pointer indicating the previous element as well as a pointer indicating the next element. A ring list is one in which the
last element has a pointer indicating the location of the first element.
39
Insert
To insert an element into a given list, all we have to do is change some pointers
appropriately. First, in the pointer part of the element to be inserted, store the address of the
element that is to follow it immediately. Next, change the pointer part of the element that is to
immediately precede the inserted element so that it holds the address of the inserted element.
Element to be inserted
Delete
To delete an element from a given list, just as in insertion, all we have to do is to change
pointers. Change the pointer part of the element immediately preceding the element to be
deleted so that the pointer can indicate the data immediately following the element to be
deleted. The data to be deleted remains as garbage until the list is re-structured, so it is
necessary to perform, in a timely manner, garbage collection 40 to delete unnecessary
elements.41
Garbage
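The pointer changes for insertion and deletion can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class `Node` and the helper names are hypothetical, and Python's `None` stands in for the end-of-list symbol (X).

```python
class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node  # pointer part; None marks the last element


def insert_after(prev, new_node):
    # Step 1: the new element points to the element that will follow it.
    new_node.next = prev.next
    # Step 2: the preceding element points to the new element.
    prev.next = new_node


def delete_after(prev):
    # The preceding element now points past the deleted element.
    removed = prev.next
    prev.next = removed.next
    return removed


def to_list(root):
    # Follow the pointers from the root and collect the data in order.
    items = []
    node = root
    while node is not None:
        items.append(node.data)
        node = node.next
    return items


# Build A -> B -> D, insert C after B, then delete B.
d = Node("D")
b = Node("B", d)
a = Node("A", b)
root = a

insert_after(b, Node("C"))
after_insert = to_list(root)  # ['A', 'B', 'C', 'D']
delete_after(a)
after_delete = to_list(root)  # ['A', 'C', 'D']
```

Note that the order of the two steps in `insert_after` matters: if the preceding element's pointer were overwritten first, the address of the following element would be lost.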
39
(FAQ) Many questions involve insertion into and deletion from a list. You must carefully consider which element it is
whose pointer should be stored.
40
Garbage collection: It is the procedure whereby small, fragmented unused memory and other areas not usable due to
memory leak are combined together in order to increase usable memory space. If garbage is not collected, usable memory
space continues to decrease and finally the system restart will be required.
41
Memory leak: It means the situation wherein the main memory secured dynamically by an application is not released
for some reason and remains in the main memory. To eliminate memory leak, garbage collection is necessary.
1.3.3 Stacks
Points
• Stacks are data structures of LIFO (Last-In First-Out).
• Stacks are used to manage the return addresses of subroutines.
A stack is a data structure in which data insertion and deletion both take place on the same end
of the list. Conceptually, it can be described as shown below.
The end where insertion (storage) and deletion (removal) of elements take place is called the
top, and the other end is called the bottom. Insertion is called push-down while deletion is
pop-up.
The pointer called stack pointer (SP) is used to keep track of where the top of the stack
currently is; we can store an element into or remove an element from the position indicated by
SP. The stack pointer sometimes points at the actual top element, and sometimes one place
beyond it, depending on implementation.
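A minimal sketch of a stack built on an array is shown below. It assumes the convention in which SP points one place beyond the top element; the class name `Stack` and its fields are illustrative only.

```python
class Stack:
    def __init__(self, capacity):
        self.cells = [None] * capacity
        self.sp = 0  # assumption: SP points one place beyond the top element

    def push(self, value):
        if self.sp == len(self.cells):
            raise OverflowError("stack is full")
        self.cells[self.sp] = value
        self.sp += 1

    def pop(self):
        if self.sp == 0:
            raise IndexError("stack is empty")
        self.sp -= 1
        return self.cells[self.sp]


stack = Stack(capacity=4)
for value in (1, 2, 3):
    stack.push(value)

popped = [stack.pop(), stack.pop()]  # 3 and 2: last in, first out
stack.push(4)
popped.append(stack.pop())           # 4
popped.append(stack.pop())           # 1
```

Tracing a sequence of push and pop operations like this one by hand is good practice for the stack questions mentioned in the FAQ.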
Use of Stack
When a main program calls a subprogram (subroutine) or a function, the return address in the
calling program is usually stored in a stack; when the subprogram is completed, the return
address is taken from the stack and control returns to the main program. Further, if a
subprogram calls other subprograms, the return address for each call is stored in the stack in
sequence.43
42
(FAQ) Many questions involve stacks. The pattern is that frequently there are questions asking what happens to the
contents of a given stack when push and pop are repeated.
43
Using a stack, a subprogram can be called from within another subprogram. Every time a subprogram is called
sequentially, the return address is stored into the stack. Since taking out follows the order opposite the order in which
storing took place, the subprograms are returned in the opposite order as well. A structure wherein a subprogram is called
from within another subprogram like this is referred to as nested structure.
Root
To add the element “35” to the list, we place it as the first element, i.e., before the element
“10.”
Root
We delete the first element “35,” which was inserted in the step above, from the list. Thus, by
performing both insertion and deletion at the head of the list, we can implement a stack.
Root
1.3.4 Queues (Waiting lists)
A queue is a data structure in which insertion takes place at one end while deletion
(taking-out) occurs at the other end. Conceptually, it is described as shown below.
[Figure: a queue. Elements are inserted at the tail and deleted from the head.]
The first data in a queue is called the head while the last data is called the tail. A queue is
sometimes referred to as a waiting list; this name came from the concept of processing
sequentially.
Operations of Queue
A queue is of the type referred to as FIFO (First-In First-Out), meaning that the element stored
first is taken out first. In the figure below, the data is stored in the order “1 → 2 → 3” and
is taken out in the same order, “1 → 2 → 3.”
Insertion Deletion
In a queue, new data is always stored (enqueued) after the last data, and the first (oldest) data is
always deleted (dequeued) first.44
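The FIFO behavior can be tried out with a short sketch. It uses `collections.deque` from the Python standard library as the queue; the variable names are illustrative.

```python
from collections import deque

queue = deque()
for value in (1, 2, 3):
    queue.append(value)  # enqueue: store after the last data (tail)

# dequeue: always take the first (oldest) data from the head
taken_out = [queue.popleft() for _ in range(3)]
print(taken_out)  # [1, 2, 3]
```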
Examples of Queues
In multiprogramming, programs waiting to be executed are placed into a queue as long as
their priorities are equal, and they wait there for the CPU to become available. In
online transaction45 processing, messages (electronic texts) are entered into a queue and
processed in the order of arrival.
Suppose there is a list as shown below. Here, the pointer to the last element is referred to as the
“tail” for convenience.
Root
Tail
Since we assume here that an element is always added at the end of the list, the figure above
indicates that the elements were added and stored in the order “1 → 2 → 3 → 4.”
44
(Note) Examples of queues are seen all around us in daily life. For example, a line of people waiting to purchase train
tickets from a ticket vending machine is a queue, as those who joined the line first purchase tickets first. Because of this
metaphor of people waiting in lines, sometimes a queue is called a waiting list.
45
Online transaction processing: It is the processing mode in which a process request is immediately executed and the
result is returned, such as seat-reservation systems of trains and airlines. For example, when ticketing is requested for a
train ticket, the ticket is immediately printed. A request for processing is called a transaction.
46
(Hints and Tips) A time-sharing system (TSS) appears on the surface as an online transaction process, but the method of
processing is completely different. A queue processes tasks in the order in which they arrived; a TSS splits the processing
time among the tasks. So even if a program (or a terminal) does not finish its processing, after a certain amount of time has
elapsed, another program (or a terminal) begins its processing. A TSS is accomplished by multiprogramming.
The figure below shows how “5” was inserted. The pointer value that indicates the tail has
been switched to point to “5.” Also, the pointer of the element “4,” which used to be the last
element, has been changed so that it can point to the element just inserted.
Root
Tail
Since the first element is “1,” the root pointer is changed to point to the element “2.” Read the
element “1” and see what the pointer says; that should be pointing to the position of the
element “2.”
Root
Tail
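The root and tail pointer updates described above can be sketched as a queue built on a list. The names `ListQueue`, `enqueue`, and `dequeue` are hypothetical, chosen only for this example.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None  # pointer to the next element


class ListQueue:
    def __init__(self):
        self.root = None  # pointer to the first element
        self.tail = None  # pointer to the last element

    def enqueue(self, data):
        node = Node(data)
        if self.tail is None:      # the queue is empty
            self.root = node
        else:
            self.tail.next = node  # the old last element points to the new one
        self.tail = node           # the tail now points to the new element

    def dequeue(self):
        if self.root is None:
            raise IndexError("queue is empty")
        node = self.root
        self.root = node.next      # the root moves to the second element
        if self.root is None:      # the queue became empty
            self.tail = None
        return node.data


q = ListQueue()
for value in (1, 2, 3, 4):
    q.enqueue(value)
q.enqueue(5)          # the tail now points to "5"
first = q.dequeue()   # 1; the root now points to "2"
```

Keeping a tail pointer means insertion does not have to walk the whole list to find the last element.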
1.3.5 Trees
Points
• Trees clearly indicate a hierarchical structure.
• Among the various types of trees, binary trees are to be thoroughly understood.
A tree is a data structure that expresses the hierarchical structure between elements. It is used
for the organizational chart of a company, system configuration, etc. It has a root at the top,
and nodes are joined by edges (branches). A node directly above another node is called a
parent, and a node directly below another is called a child.48 Each node is placed at a level
showing the degree of depth; the root is at level 0. A node without any children is called a leaf.
A part of a tree is called a subtree. Given a node, the subtree to the left of it is called the left
subtree; the one on the right is the right subtree.
[Figure: a tree. Nodes are joined by edges; each node at level 1 is a child of the root and a parent of nodes at level 2.]
47
Multiprogramming: It is a method where multiple programs appear to be running at the same time. A single processor can
actually execute only one program at any given instant. Hence, the computer uses time-sharing to switch, at short time intervals,
the program being executed so that it appears as though multiple programs are being executed concurrently.
48
(Hints and Tips) A pointer is used for a parent to indicate its child. Each parent thus has as many pointers as its children.
Value of the left child < Value of the parent element < Value of the right child
49
(FAQ) On the Common FE Exam, questions involve only binary trees. Keep straight in your mind the characteristics of
various binary trees, such as complete binary trees, binary search trees, and heaps.
50
(Hints and Tips) In a binary search tree, note that the element with the minimum value is the leftmost node (reached by
following left children from the root) while the element with the maximum value is the rightmost node. This is a
characteristic of a binary search tree.
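As a sketch of the relation “value of the left child < value of the parent element < value of the right child,” the code below builds a small binary search tree and finds its minimum and maximum by following left or right pointers. The names `TreeNode`, `insert`, `minimum`, and `maximum` are assumptions made for this illustration, and the sample values are arbitrary.

```python
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None   # pointer to the left child
        self.right = None  # pointer to the right child


def insert(node, value):
    """Insert value so that left < parent < right holds; return the subtree root."""
    if node is None:
        return TreeNode(value)
    if value < node.value:
        node.left = insert(node.left, value)
    elif value > node.value:
        node.right = insert(node.right, value)
    return node  # duplicates are ignored in this sketch


def minimum(node):
    # Keep following left children.
    while node.left is not None:
        node = node.left
    return node.value


def maximum(node):
    # Keep following right children.
    while node.right is not None:
        node = node.right
    return node.value


root = None
for v in (50, 30, 70, 20, 40, 60, 80):
    root = insert(root, v)
```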
Heaps
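A binary tree is called a heap if the nodes are filled in from the root level downward, and from
left to right on the same level, and the node values satisfy one of the following conditions: 51
• Value of the parent ≥ value of each child
• Value of the parent ≤ value of each child
A heap that meets the former condition is called a max-heap, and one that meets the latter
condition is called a min-heap.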
As a result, elements with large (or small) values are close to the root whereas elements with
small (or large) values are toward the leaves. It is a data structure suitable for retrieving a
maximum (or minimum) value since the root is the element with the largest (or smallest) value.
[Figure: a max-heap. The root holds the maximum value, and values are larger toward the root.]
1.3.6 Hash
Points
• Hash is the concept of using the key values directly as the index.
• Two methods to avoid collisions are the open-address method and the chain method.
Hash is the concept of using key values directly as the storage locations of data. For example,
suppose there is an array H of size 100. If the key values are two digits from 01 through 99
without duplication, these key values can be directly used as index numbers. This is called the
direct search method.
However, it is rare that key values can be directly used as index numbers. Thus, to convert key
values to index numbers, a hash function52 is used to calculate hash values, which are then
used as index numbers. The array that stores elements using such a method is called a hash
table.
Consider now the hash function that divides a given key value by the number of elements in
the array and adds 1 to the remainder.
51
(Hints and Tips) Note that the heap shown here has the maximum value at the root. Take out the root, restructure the
heap, and repeat this process; this way, you can take out the elements in the order of their values, from the largest to the
smallest.
52
Hash function: A function that calculates data addresses (index numbers, etc.) from key values
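[Figure: a key value is passed to the hash function, which yields the index (here, 50) of the cell in an n-element hash table where the key and its data are stored.]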
If there are n elements, then the remainder will be 0 through (n – 1), so adding 1 will give hash
values of 1 through n. These can then be used as index numbers to be stored in the array.53
However, keys can take a wide variety of values, while the hash function maps all of them to
index numbers (hash values) 1 through n, so different key values can produce the same index
number. When the same hash value is generated in this way, it is called a collision.54
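A small sketch of this hash function follows. The table size of 50 and the sample keys are assumptions chosen for illustration; `hash_value` is a made-up name.

```python
def hash_value(key, n):
    """Divide the key by the number of elements n and add 1 to the remainder."""
    return key % n + 1


n = 50  # hypothetical number of elements in the array

print(hash_value(149, n))  # 50
print(hash_value(99, n))   # 50, the same hash value as for 149: a collision
print(hash_value(100, n))  # 1
```

In the notation of the exam questions, `key % n` corresponds to mod(key, n).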
The figure below shows an example in which three pieces of data are stored in the position
with index number 1 in the hash table. This hash table has a pointer indicating the first data.
The position of the next data is found by looking up the pointer part when the first data is read.
Hash table (pointer part)
1  • → Data • → Data • → Data ×
2  ×
3  • → Data … ×
4  • → Data • → Data ×
       (Home55)  (Synonym56)
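The idea of chaining colliding data from one table position can be sketched as below. This is only an approximation of the figure: Python lists stand in for the pointer chains, and the keys, data, and function names are hypothetical.

```python
def hash_value(key, n):
    return key % n + 1


def chain_insert(table, key, data):
    index = hash_value(key, len(table) - 1)
    table[index].append((key, data))  # the home comes first, synonyms follow


def chain_search(table, key):
    index = hash_value(key, len(table) - 1)
    for stored_key, data in table[index]:  # follow the chain from the first data
        if stored_key == key:
            return data
    return None


n = 4
table = [[] for _ in range(n + 1)]  # cell 0 is unused so that indexes run 1..n
for key, data in ((4, "a"), (8, "b"), (12, "c"), (7, "d")):
    chain_insert(table, key, data)
```

Here the keys 4, 8, and 12 all hash to index 1, so position 1 holds a chain of three data items, while key 7 is stored alone at index 4.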
53
(FAQ) Many questions dealing with hash will ask you to calculate the storage position, and a "mod" function is often
used as the hash function in such cases. “mod (a,b)” is the remainder of “a” divided by “b.”
54
Collision: When a hash function is used to calculate storage addresses, different key values could result in the same
hash value. This is called a collision.
55
Home: Data that had been stored first when a collision occurred
56
Synonym: Data that came in later when a collision occurred
For example, suppose elements a, b, and c are stored in their respective positions (designated
by the index numbers) according to their calculated hash values. Next, element d has the hash
value 1, but position 1 is already taken by element a. Here, for example, if the
re-hashing method is determined in advance as “the original hash value + 1,” then the next
position is the one designated by index number 2. But that location is also taken, and the same
goes for index number 3. Position 4, examined next, is empty. As a result,
element d is stored in the position with index number 4. If there happens to be no vacancy all
the way to the end of the hash table, the search goes back to the first position of the table and
looks for the first vacancy in the same way.
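The re-hashing rule “original hash value + 1,” with wrap-around to the first position, might be sketched as follows. The table size and the keys are assumptions chosen so that the fourth key collides three times, as element d does in the text.

```python
def hash_value(key, n):
    return key % n + 1


def open_address_insert(table, key):
    n = len(table) - 1         # cell 0 is unused; indexes run 1..n
    index = hash_value(key, n)
    for _ in range(n):         # examine each position at most once
        if table[index] is None:
            table[index] = key
            return index
        index = index % n + 1  # re-hash: next position, wrapping from n to 1
    raise OverflowError("hash table is full")


n = 5
table = [None] * (n + 1)
# Keys 5, 6, 7 hash to 1, 2, 3; key 10 also hashes to 1 and must be re-hashed.
positions = [open_address_insert(table, key) for key in (5, 6, 7, 10)]
print(positions)  # [1, 2, 3, 4]
```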
Quiz
Q1 What do we call a data structure whose concept is shown in the following figure?
Q5 What do we call a binary tree with the following relation: “value of the left child < value of the
parent element < value of the right child.”
A1 List
A2 Stack
A4 Binary tree: A tree in which each node has no more than 2 children
Complete binary tree: A binary tree such that all the leaves are at the same depth or that the
difference of depth between any two leaves is 1 or less and the
leaves are laid out from the left
1.4 Algorithms
Introduction
A set of procedures to solve a problem is called an algorithm. A figure expressing the set of
procedures to obtain appropriate results is called a flowchart. Here, we have selected basic
algorithms to study.57
1.4.1 Search Algorithms
Search means finding an element in a table (1-dimensional array), and there are two types of
search methods: linear search (sequential search) and binary search. Linear search can be
performed regardless of how the elements are sorted, but binary search requires that the
elements be sorted in ascending or descending order.
Linear Search
This is the method of searching for the desired element in the table from the beginning of the
table in order. It can be done regardless of how the elements are sorted, but it takes longer than
binary search. If N is the number of elements, at least 1 (if the element to be sought is located
at the beginning of the table) and at most N (if the element to be sought is at the end of the
table or does not exist) comparisons are necessary.
In linear search, comparisons are made from index number 1 and continued as 1 is added to the
previous index number until the index number reaches N.
For example, suppose “25” in the table is searched for by linear search. It is compared with the
first value, the second, …, the fifth. These numbers indicating the positions of elements are the
index numbers.
Value sought: 25
Table   15  30  45  40  25  35  10   5
Index    1   2   3   4   5
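Linear search over this table can be sketched as below. The function name and the decision to return the number of comparisons alongside the index are choices made for this example.

```python
def linear_search(table, target):
    """Return (1-based index, comparisons), or (None, comparisons) if absent."""
    comparisons = 0
    for index, value in enumerate(table, start=1):
        comparisons += 1
        if value == target:
            return index, comparisons
    return None, comparisons


table = [15, 30, 45, 40, 25, 35, 10, 5]
print(linear_search(table, 25))  # (5, 5)
print(linear_search(table, 99))  # (None, 8)
```

The second call shows the worst case: when the element does not exist, all N elements are compared.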
57
(FAQ) To express algorithms, questions on Morning Exams use flowcharts, while questions on Afternoon Exams use
pseudo-language. Rules on the pseudo-language are not released, so it is a good idea to look through them in advance.
Binary Search
This is an effective method when the elements in the table are sorted in ascending58 or
descending order.59 Comparisons are made in sequence with the middle value of the table.
After the first comparison, the right or left half of the table is discarded, and the middle value
of the remaining part of the table is used for the next comparison. Since the range to be
searched is reduced to half each time, this type of search is faster on average than linear search.
Let us explain a specific algorithm, using the following array as an example. Suppose that we
are looking for the value “11.”
Index 1 2 3 4 5 6 7 8 9 10
Array T 0 1 3 5 7 9 11 13 15 17
First comparison
The range is the entire array. Let L be the lower bound and U be the upper bound of the range.
Let M be the middle value (median).
Index 1 2 3 4 5 6 7 8 9 10
Array T 0 1 3 5 7 9 11 13 15 17
L Search range U
The median position is M = (L + U) / 2 = (1 + 10) / 2 = 5.5. The quotient can be rounded up
or down; either is acceptable. Here, we round it down for our explanation, so M = 5.
T(M) = T(5) = 7
The value to be sought is “11,” so “11” cannot be found in the left half of the table, including
the median value because the elements are sorted in ascending order and the desired value is
larger than the median value.60
Second comparison
Since the first comparison made it clear that the desired value is not in the left half of the table
including the median, we change the search range. Here, the lower bound is changed to the
value immediately to the right of the median. The value of L is then changed as follows:
L=M+1=5+1=6
In the same way as the first comparison, we find the new median as follows:
M = (L + U) / 2 = (6 + 10) / 2 = 8 (median, shaded value)
T(M) = T(8) = 13
58
Ascending order: Order in which data is sorted from the smallest key value to the largest key value
59
Descending order: Order in which data is sorted from the largest key value to the smallest key value
60
(Hints and Tips) When deleting the left half of a table, the new lower bound is the median plus 1; when deleting the
right half, the new upper bound is the median minus 1.
Comparing this with “11,” we see that the desired value “11” cannot be in the right half of the
search range, including this new median, because the elements are sorted in ascending order and
the desired value is smaller than the value of the median.
Third comparison
Since the second comparison made it clear that the desired value is not in the right half of the
search range including the median, we change the search range. Here, the upper bound is
changed to the value immediately to the left of the median. The value of U is then changed as
follows:
U=M–1=8–1=7
In the same way as the second comparison, we find the (new) median as follows:
M = (L + U) / 2 = (6 + 7) / 2 = 6.5, rounded down to 6
T(M) = T(6) = 9
Since we are comparing this to “11,” the desired value “11” cannot be in the left half of the
search range including the median.
Fourth comparison
Since the third comparison made it clear that the desired value is not in the left half of the
search range including the median, we change the lower bound in the same way as the second
comparison.
L = M + 1 = 6 + 1 = 7
M = (L + U) / 2 = (7 + 7) / 2 = 7
T(M) = T(7) = 11
This equals the value sought, “11,” so the search ends successfully.61
61
(Hints and Tips) The element “11,” which is T(7) in array T, was found after 4 comparisons in binary search. Linear
search requires 7 comparisons to find the value.
Suppose, for example, that we search for “10.” At the fourth comparison, “T(M) = 11 > 10”
holds true, so we have to change the upper bound of the search range. The new search
range is as follows:
L = 7 (remains unchanged)
U = M – 1 = 7 – 1 = 6
Since L is the lower bound and U is the upper bound, we should have “L ≤ U,” but now we
have “L > U.” When this inequality holds, we determine that the desired element is not
present.
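The whole procedure, including the “L > U” test for a missing element, can be sketched as follows. Indexes L, U, and M start at 1 as in the explanation, the median is rounded down, and each visit to T(M) is counted as one comparison; the function name is illustrative.

```python
def binary_search(table, target):
    """Search a table sorted in ascending order.

    Returns (1-based index, comparisons), or (None, comparisons) if absent.
    """
    lower, upper = 1, len(table)
    comparisons = 0
    while lower <= upper:
        middle = (lower + upper) // 2  # median position, rounded down
        comparisons += 1
        value = table[middle - 1]      # T(M); Python lists start at index 0
        if value == target:
            return middle, comparisons
        if value < target:
            lower = middle + 1         # discard the left half
        else:
            upper = middle - 1         # discard the right half
    return None, comparisons           # L > U: the element is not present


T = [0, 1, 3, 5, 7, 9, 11, 13, 15, 17]
print(binary_search(T, 11))  # (7, 4)
print(binary_search(T, 10))  # (None, 4)
```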
However, consider searching for “0,” the index of which is “1.” Linear search can find it at the
first comparison while binary search takes 3 comparisons. Here, linear search is faster.
To address this issue, there is a concept called the mean number of comparisons. When the
number N of elements is very large, this value tells us how many comparisons are required on
average. We omit detailed explanations here, but the average and maximum numbers of
comparisons are obtained by the following formulas:62
Average number of comparisons = [log₂N]
Maximum number of comparisons = [log₂N] + 1
A word on the square brackets [ ] used in [log₂N]: in general, log₂N is not an integer,
but the number of comparisons must be an integer. Hence, [ ] denotes deleting, or truncating,
the fractional part. For example, [10.513] is 10.
1.4.2 Sorting Algorithms
Sort means rearranging elements and/or records of an array in a certain order according to a
key. Arranging elements from the smallest key value to the largest is called sorting in
ascending order, and ordering them from the largest key value to the smallest is called
sorting in descending order.
Sorting the contents of an area in a program, such as an array, is called internal sorting
whereas sorting data stored in an external device such as records in a file is called external
sorting (file sorting).63 Typical methods for internal sorting include bubble sort, selection sort,
insertion sort, quick sort, merge sort, shell sort, and heap sort.64
62
(FAQ) The average number of comparisons and the maximum number of comparisons in binary search are frequently
asked on exams, so it is a good idea to have these formulas memorized.
63
(Hints and Tips) Questions involving internal sorting on the Common FE Exams are almost always about array
manipulation. Be careful not to switch the index numbers when data is switched.
Bubble Sort
In bubble sort, each adjacent pair of elements is compared in sequence, and the two elements
are exchanged if they are out of order. When sorting in ascending order, the first run moves the
maximum value to the last position in the array. Next, going back to the beginning, adjacent
values are again compared and exchanged when necessary. On the second run, the element at
the end of the array is outside the sorting range. Continuing this process, the range gets smaller
each time, and the sorting ends when the first and second elements are compared.
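A minimal sketch of bubble sort in ascending order is given below. It works on a copy of the input and uses Python's 0-based indexes, so care is needed when comparing it with exam questions that start at 1.

```python
def bubble_sort(values):
    data = list(values)  # work on a copy
    # "end" is the last index of the current sorting range; it shrinks each run.
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            if data[i] > data[i + 1]:  # adjacent pair out of order
                data[i], data[i + 1] = data[i + 1], data[i]
    return data


print(bubble_sort([5, 4, 3, 2, 1]))  # [1, 2, 3, 4, 5]
```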
Selection Sort
Selection sort finds the maximum value (or the minimum value) from the array and exchanges
it with the element at the end of the array. Next, it finds the maximum (or minimum) value
from the array except for the last element and exchanges it with the second-to-the-last element
of the array. Repeating this procedure, selection sort ends when it compares the first and
second elements of the array.65
First run
5 4 3 2 1: Since 5 is the maximum value, it is exchanged with the last element “1.”
Second run
1 4 3 2| 5: Since 4 is the maximum value of the remaining range, it is exchanged with the
last element of that range, “2.”
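The runs above can be reproduced with the following sketch, which selects the maximum value in each run. Recording the array after every run in `states` is an addition made for illustration.

```python
def selection_sort(values):
    data = list(values)
    states = []  # contents of the array after each run
    for end in range(len(data) - 1, 0, -1):
        max_index = 0
        for i in range(1, end + 1):  # find the maximum within data[0..end]
            if data[i] > data[max_index]:
                max_index = i
        # exchange the maximum with the last element of the current range
        data[max_index], data[end] = data[end], data[max_index]
        states.append(list(data))
    return data, states


result, states = selection_sort([5, 4, 3, 2, 1])
print(states[0])  # [1, 4, 3, 2, 5]
print(states[1])  # [1, 2, 3, 4, 5]
```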
64
(FAQ) Questions of internal sorting appear on the Common FE Exams. Bubble sort and selection sort have appeared
very frequently, so be sure to understand their algorithms well.
65
(FAQ) Bubble sort and selection sort very frequently appear on the exams. The questions are given in a variety of ways,
such as on the contents of an array at an intermediate stage and filling in blanks of a flowchart. Be sure that you understand
the algorithms well.
Insertion Sort
Insertion sort treats the front part of the array as already sorted, compares the element to be
inserted with the elements of that sorted part, starting from the back, and inserts the element in
the appropriate location.66 Below, the elements to the left of “|” are already sorted. Here, since
there is only one element to the left of “|” on the first run, it is considered to have been sorted
already.
First run
5| 4 3 2 1: Since 4 is less than 5, it is inserted in the appropriate location (before 5).
Second run
4 5| 3 2 1: Since 3 is less than 4, it is inserted in the appropriate location (before 4).
Third run
3 4 5| 2 1: Since 2 is less than 3, it is inserted in the appropriate location (before 3).
Fourth run
2 3 4 5| 1: Since 1 is less than 2, it is inserted in the appropriate location (before 2).
1 2 3 4 5: Sorting complete
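Insertion sort can be sketched as follows. The comparison runs from the back of the sorted part, and larger elements are slid one place to the right to make room; the function name is illustrative.

```python
def insertion_sort(values):
    data = list(values)
    for i in range(1, len(data)):
        item = data[i]  # element to be inserted
        j = i - 1
        # Compare from the back of the sorted part data[0..i-1].
        while j >= 0 and data[j] > item:
            data[j + 1] = data[j]  # slide the larger element to the right
            j -= 1
        data[j + 1] = item
    return data


print(insertion_sort([5, 4, 3, 2, 1]))  # [1, 2, 3, 4, 5]
```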
Quick Sort
Quick sort selects a random value from the array and uses its key value as the pivot; the
elements are divided into two groups: the first group in which all elements are less than the
pivot and the second group in which all elements are greater than the pivot (equal values can
go either way). Then, the same procedure is repeated for each group. This is continued until
there is only one element in each group. As a result, the array is sorted.67
Below is an example of sorting in ascending order. The underlined values are the pivots. The
line “|” indicates a block boundary.
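The splitting idea can be sketched with recursion as shown below. Two simplifications are assumed here: the middle element is used as the pivot rather than a random one, so that the result is reproducible, and new lists are built instead of rearranging the array in place.

```python
def quick_sort(values):
    if len(values) <= 1:  # a group with at most one element is already sorted
        return list(values)
    pivot = values[len(values) // 2]  # assumption: middle element as the pivot
    smaller = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    larger = [v for v in values if v > pivot]
    # Repeat the same procedure for each group (recursive call).
    return quick_sort(smaller) + equal + quick_sort(larger)


print(quick_sort([3, 5, 1, 4, 2]))  # [1, 2, 3, 4, 5]
```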
66
(Hints and Tips) When finding a location for insertion in insertion sort, the element to be inserted is compared from the
back of an already sorted array. For example, on the third run here, the comparison will be “2 and 5,” “2 and 4,” and “2 and
3,” in this order.
67
(Note) Quick sort and merge sort differ from each other in the number of elements involved in splitting processes, but
they use the same method. In such cases, a method known as “recursive call” is used.
Merge Sort
In merge sort, two or more arrays, each of which is already sorted, are merged together to form
one sorted array. In merge sort, splitting is repeated until each group has only one element.
When each group has only one element, the elements are merged together in sequence.68
Below is an example of sorting in ascending order.
[Figure: merge sort example. First run → (splitting) → Second run → (splitting) → Third run → (splitting) → Fourth run → (merging) → Fifth run → (merging) → Sixth run → (merging) → Seventh run]
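Splitting and merging can be sketched with two small functions. This is a recursive illustration rather than a reproduction of the runs in the figure; `merge` and `merge_sort` are names chosen for this example.

```python
def merge(left, right):
    """Merge two arrays, each already sorted, into one sorted array."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # whatever remains is already in order
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    if len(values) <= 1:  # splitting stops at one element
        return list(values)
    middle = len(values) // 2
    return merge(merge_sort(values[:middle]), merge_sort(values[middle:]))


print(merge_sort([8, 3, 6, 1, 7, 2, 5, 4]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```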
Shell Sort
This is an improved form of insertion sort; the sorting is made faster by increasing the moving
distances of the elements.
First, the elements are sorted roughly by using insertion sort with gaps of a certain size. Then,
insertion sort is used again to complete the sorting operation.
Below is an example of sorting in ascending order. Initially the gap is set to size 2, i.e., sorting
every other element only. Then, the gap is made 1, and insertion sort is used.
Unsorted
2 4 5 3 1
First run
2 4 5 3 1: Every other element, starting from the first (2, 5, 1), is sorted.
1 4 2 3 5: First run complete
Second run
1 4 2 3 5: Every other remaining element (4, 3) is sorted.
1 3 2 4 5: Second run complete
Third run
1 2 3 4 5: Third run complete (sorting complete)
The reason that such a complicated method is used is that insertion sort does not necessarily
require moving elements; how much work it does depends on the data. For example, consider
the following situation.
Case A: 2 4 6 | 1…
Case B: 2 4 6 | 8…
68
Recursion: It is a process in which a function calls itself from within itself. In Pascal and C, “recursive call” is allowed,
but older versions of COBOL and Fortran do not allow it.
In Case A, in order to decide where to insert the “1,” comparisons are made in the order
6 → 4 → 2. Then, all the elements need to be moved over to make space to insert the “1.” In
contrast, in Case B, as soon as the value is compared to “6,” the insertion location is obtained,
without any sliding. Hence, the amount of processing in insertion sort depends on how the
elements are originally ordered. Shell sort reduces the work of sliding/moving elements by
roughly sorting first.
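Shell sort with gaps of 2 and then 1 can be sketched as an insertion sort that compares elements a gap apart. The default gap sequence `(2, 1)` matches the example above and is otherwise an assumption; practical implementations choose gaps according to the array size.

```python
def shell_sort(values, gaps=(2, 1)):
    data = list(values)
    for gap in gaps:
        # Insertion sort on elements that are "gap" positions apart.
        for i in range(gap, len(data)):
            item = data[i]
            j = i - gap
            while j >= 0 and data[j] > item:
                data[j + gap] = data[j]  # slide by one gap
                j -= gap
            data[j + gap] = item
    return data


print(shell_sort([2, 4, 5, 3, 1], gaps=(2,)))  # [1, 3, 2, 4, 5]
print(shell_sort([2, 4, 5, 3, 1]))             # [1, 2, 3, 4, 5]
```

The first call stops after the gap-2 stage and gives the same array as “Second run complete” in the example; the final gap of 1 is an ordinary insertion sort.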
Heap Sort
A heap is a binary tree in which every subtree has the property that a parent has a value larger
than its children. If the root element is picked, we can obtain the maximum value while the
remaining elements can be re-structured to form a heap. We can again pick the root, which
gives us the element with the second largest value. In other words, by repeating root extraction
and re-structuring the heap, sorting can be achieved. This sorting method using a heap is called
heap sort.69
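Repeated root extraction can be sketched with the `heapq` module from the Python standard library. Note that `heapq` provides a min-heap, so this sketch negates the values to imitate the max-heap described above; it therefore works for numbers only and is an illustration, not the method of any particular textbook.

```python
import heapq


def heap_sort_descending(values):
    # heapq is a min-heap; negating the values makes the largest come out first.
    heap = [-v for v in values]
    heapq.heapify(heap)  # build the heap
    result = []
    while heap:
        # Take out the root; heappop re-structures the remaining heap.
        result.append(-heapq.heappop(heap))
    return result


print(heap_sort_descending([3, 1, 4, 1, 5]))  # [5, 4, 3, 1, 1]
```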
1.4.3 String Search Algorithms
String search means the process of looking for a designated sequence of characters in a text
(character string). In most cases, strings are in arrays where each cell stores one character and
is referenced by index. Two arrays are then given: the text and the designated string (pattern).
The algorithm then searches for the pattern string in the string of the text.
In the example below, we want to check that string S, which is “XYZ,” is present in cells 6~8
and cells 10~12 in string R. It is obvious by visual inspection, but it is actually rather difficult
to create an algorithm to check this.
String S X Y Z Pattern
String R P Q A C Z X Y Z R X Y Z Text
Position 1 2 3 4 5 6 7 8 9 10 11 12
69
(FAQ) For quick sort, merge sort, insertion sort, heap sort, and shell sort, questions generally deal with the concept of
each, so you should understand how each sort processes the data.
Text    P Q A B C Z X Y Z R X Y Z
Pattern X Y Z
(1) The first character of the pattern is compared with the first character of the text.
Text    P Q A B C Z X Y Z R X Y Z
Pattern   X Y Z
(2) Because of the mismatch, the second character of the text is now compared with the first
character of the pattern.
Text    P Q A B C Z X Y Z R X Y Z
Pattern             X Y Z
(3) Repeat this process, and the first match occurs with the seventh character “X.”
Text    P Q A B C Z X Y Z R X Y Z
Pattern             X Y Z
(4) Having had a match, now the 8th character of the text is compared with the second
character of the pattern.
Text    P Q A B C Z X Y Z R X Y Z
Pattern             X Y Z
(5) Since the second pair also matched up, the third characters are compared.
Text    P Q A B C Z X Y Z R X Y Z
Pattern             X Y Z
Now we have determined that the string pattern S is present in the text string R.
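Steps (1) through (5) can be sketched in code as follows. The function name `brute_force_search` and the convention of returning a 1-based position (or `None` when there is no match) are assumptions made for this illustration. The variable `start` plays the role of the index into the text and `j` the index into the pattern.

```python
def brute_force_search(text, pattern):
    """Return the 1-based position of the first match, or None if absent."""
    n, m = len(text), len(pattern)
    for start in range(n - m + 1):       # candidate position in the text
        j = 0                            # index into the pattern
        while j < m and text[start + j] == pattern[j]:
            j += 1                       # characters match, compare the next pair
        if j == m:                       # every pattern character matched
            return start + 1
    return None


print(brute_force_search("PQABCZXYZRXYZ", "XYZ"))  # 7
print(brute_force_search("PQABCZXYZRXYZ", "XZY"))  # None
```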
70
(Hints and Tips) In string search, there needs to be an index for the string S and another index for the string R. When
answering a question, the crucial point is to grasp how to use the indexes.
Let us explain this specifically, using the same example as in brute-force search.
(1) If the rightmost character of the portion of the text currently being compared with the
string is “X,” the next possible place where the pattern can be matched is two characters
ahead, so the next two characters are skipped.
(2) If the rightmost character of the portion of the text currently being compared with the
string is “Y,” the next possible place where the pattern can be matched is one character
ahead, so the next character is skipped.
(3) If the rightmost character of the portion of the text currently being compared with the
string is “Z,” the next possible place where the pattern can be matched is three characters
ahead, so the next three characters are skipped.
(4) If the rightmost character of the text is not X, Y, or Z, then the situation is identical to that
of (3), so the next three characters are skipped.71
71
(Note) In the BM method, it is necessary in advance to calculate the number of characters to be skipped. The example
discussed here has a 3-character pattern, so the number is 2 for X, 1 for Y, and 3 for Z or any other characters. These need
to be calculated before the string search begins.
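The skip rules (1) through (4) can be sketched as below. The skip table described here corresponds to the simplified Boyer-Moore variant usually attributed to Horspool, and this sketch assumes that variant; the names `build_skip_table` and `bm_search` are illustrative. For simplicity the window is compared with a slice rather than character by character from the right.

```python
def build_skip_table(pattern):
    """Skip distance for each character, based on the rightmost text character."""
    m = len(pattern)
    # The last pattern character is excluded, so it (and any other
    # character not in the table) gets the full skip of m.
    return {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}


def bm_search(text, pattern):
    """Return the 1-based positions of all matches."""
    m = len(pattern)
    if m == 0:
        return []
    skip = build_skip_table(pattern)
    positions = []
    i = 0
    while i + m <= len(text):
        if text[i:i + m] == pattern:
            positions.append(i + 1)
        rightmost = text[i + m - 1]      # rightmost character of the current window
        i += skip.get(rightmost, m)
    return positions


print(build_skip_table("XYZ"))            # {'X': 2, 'Y': 1}
print(bm_search("PQABCZXYZRXYZ", "XYZ"))  # [7, 11]
```

With the pattern “XYZ,” the table gives 2 for X and 1 for Y, and every other character (including Z) falls back to 3, in agreement with the note above.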
A graph algorithm is an algorithm where the search is performed on a tree, one of the
question-oriented data structures.72 Depending on the order of search, a graph algorithm can
be breadth-first or depth-first. The depth-first order frequently appears on the exams, so be sure
that you understand how to pick out the nodes using this method.
A graph consists of nodes and edges.73 A node is a vertex of the graph, whereas an
edge is a line segment connecting one node to another. Below is an example of a graph.
Node
Edge
A tree can be considered a graph in which all nodes are connected but no path loops back to its starting node.
Breadth-First Order
The search begins at the root and proceeds level by level, starting from the level nearest the
root and moving from left to right within each level. Below, the
number at each node indicates the order in which the nodes are traversed.
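Breadth-first order is commonly implemented with a queue. The sketch below uses a small hypothetical tree (nodes A to F, not the tree in the figure) stored as a dictionary that maps each node to its children from left to right.

```python
from collections import deque


def breadth_first(tree, root):
    """Return the nodes of `tree` in breadth-first order starting at `root`."""
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()              # take the oldest waiting node
        order.append(node)
        queue.extend(tree.get(node, []))    # children wait behind the current level
    return order


# Hypothetical tree: A has children B and C; B has D and E; C has F.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
print(breadth_first(tree, "A"))  # ['A', 'B', 'C', 'D', 'E', 'F']
```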
72
Question-oriented data structure: A question-oriented data structure is a data structure often used to create a program.
Since the algorithms using data is well-established, such a structure enables the programmer to write a program with few
errors. Examples of question-oriented data structures include trees, stacks, queues, and lists.
73
(Hints and Tips) When you hear the term graph, you may think of a pie chart, a bar graph, etc., but in the world of
mathematics, it refers to a set of points and edges.
Depth-First Order
In depth-first search,74 we start at the root and descend through the left child first, going down to the leaves before moving on to the right.
Depending on the timing when the nodes are traversed, it can be classified as shown in the
following table.
Below, the number at each node indicates the order in which the node is traversed.
It is probably not very clear yet what the rules are for each of the search types, so let me add
some explanation. In depth-first order, the search follows the order as shown below:
In pre-order, the node values are taken out whenever you traverse the left side of the nodes.
Hence, the order is “+ – a b / * c d e.” In in-order, the node values are taken out whenever you
traverse under the nodes. Hence, the order is “a – b + c * d / e.” In post-order, the node values
are accessed whenever you traverse the right side of the nodes. Hence, the order is “a b – c d *
e / +.” 75
74
(FAQ) Depth-first order frequently appears on the exams. Understand well how the nodes are taken out in pre-order,
in-order, and post-order.
75
(Note) Note the result of traversing the tree to obtain symbols and variables. In pre-order, the result is “+ – a b / * c d e,”
which is in Polish Notation. In in-order, the result is “a – b + c * d / e,” which is in standard mathematical notation. In
post-order, the result is “a b – c d * e / +,” which is in Reverse Polish Notation.
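The three depth-first orders can be sketched with recursive functions. The tree shape used here is reconstructed from the three traversal results quoted above (root “+,” left subtree “a – b,” right subtree “c * d / e”) and should be read as an assumption about the figure; the class name `Node` is illustrative.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pre_order(node):
    """Node first, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    """Left subtree, then node, then right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def post_order(node):
    """Left subtree, then right subtree, then node."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


tree = Node("+",
            Node("-", Node("a"), Node("b")),
            Node("/", Node("*", Node("c"), Node("d")), Node("e")))

print(" ".join(pre_order(tree)))   # + - a b / * c d e
print(" ".join(in_order(tree)))    # a - b + c * d / e
print(" ".join(post_order(tree)))  # a b - c d * e / +
```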
Quiz
Q1 In binary search, when the number of sorted data values is quadrupled, how much does the
maximum number of comparisons increase by?
Q2 Explain the characteristics of each of the sorting methods: “shell sort,” “bubble sort,” “quick
sort,” and “heap sort.”
A1
It increases by 2 (two more comparisons).
Since the number of data values becomes 4 times as large, substitute “4n” for the “n” in the formula
“$\log_2 n + 1$” (maximum number of comparisons).
$\log_2 4n + 1 = (\log_2 4 + \log_2 n) + 1$
$= \log_2 2^2 + \log_2 n + 1$
$= 2 + \log_2 n + 1$
$= 2 + (\log_2 n + 1)$
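The result can also be checked by counting comparisons directly. The sketch below is an illustrative assumption, not part of the original answer: it simulates the worst case of a binary search on n sorted values by always continuing into the larger remaining half.

```python
def max_comparisons(n):
    """Worst-case number of comparisons of binary search on n sorted values."""
    low, high = 0, n - 1
    count = 0
    while low <= high:
        mid = (low + high) // 2
        count += 1
        low = mid + 1        # the right half is never smaller than the left half
    return count


for n in (1, 10, 100, 1000):
    print(n, max_comparisons(n), max_comparisons(4 * n))
```

For every n tried, the count for 4n is exactly 2 larger than the count for n.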
A2
Shell sort: Elements of the array are picked at certain intervals first and are sorted; then, more
elements are picked by reducing the intervals and are sorted.
Bubble sort: Adjacent elements are compared and exchanged if the order is not correct; this
process is repeated.
Quick sort: An intermediate (median) reference value is chosen, and the array is divided into
the elements larger than the reference value and those smaller than the value; within
each portion, the same process is repeated.
Heap sort: The unsorted portion is expressed as a subtree, from which the largest (or the
smallest) value is taken out and moved to the already sorted portion. This process is
repeated to reduce the unsorted portion.
Q1. There is a register which stores values in binary. After entering a positive integer x into this
register, the operation “to shift the register value 2 bits to the left and to add x to the value” will
be performed. How many times as large as x is the resulting register value? Here, assume
that overflow due to shifting will not occur.
a) 3 b) 4 c) 5 d) 6
Answer 1
Correct Answer: c
In general, if there is no overflow, shifting n bits to the left multiplies the value by $2^n$, while
shifting n bits to the right multiplies the value by $1/2^n$. Shifting 2 bits to the left multiplies the value by
$2^2$, so if we let y be the calculation result, y is related to x by the following equation:
$y = x \times 2^2 + x$
$= x \times (2^2 + 1)$
$= 5 \times x$ (y is 5 times x)
a) To make it 3 times as large, we would shift the register value 1 bit to the left and add x to it.
Shifting 1 bit to the left multiplies the value by $2^1$, so the result would be as follows:
$y = x \times 2^1 + x = 2x + x = 3x$
b) To make it 4 times as large, we would shift the register value 2 bits to the left, which
multiplies the value by $2^2$, and the result would be as follows:
$y = x \times 2^2 = 4x$
d) To make it 6 times as large, we would shift the register value 2 bits to the left and add this
result to the result obtained by shifting the original register value 1 bit to the left. Shifting 2
bits to the left multiplies the value by $2^2$, and shifting 1 bit to the left multiplies the value by
$2^1$, so the following would result:
$y = x \times 2^2 + x \times 2^1 = 4x + 2x = 6x$
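The same operations can be tried with shift operators. The function name `shift_left_and_add` is an illustrative assumption; Python integers do not overflow, which matches the condition stated in the question.

```python
def shift_left_and_add(x, bits=2):
    """Shift x left by `bits` bits and add the original x."""
    return (x << bits) + x


for x in (1, 3, 10):
    y = shift_left_and_add(x)
    print(x, y, y // x)          # the last column is always 5

# The other answer choices, for comparison:
x = 7
print((x << 1) + x == 3 * x)         # True: option a)
print((x << 2) == 4 * x)             # True: option b)
print((x << 2) + (x << 1) == 6 * x)  # True: option d)
```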
Q2. Which of the following is an appropriate description concerning the cancellation of significant
digits?
a) It means that the number of the significant digits is extremely reduced when a floating
point number is subtracted by another whose value is almost equal.
b) It refers to an error which occurs because the calculation result exceeds the maximum
numeric value that can be processed.
c) It refers to an error which occurs when rounding off (up or down) the numbers smaller than
the lowest digit when the total number of digits in numerical representation is limited.
d) It refers to the omission of the low-order digit of an operand when adding floating point
numbers.
Answer 2
Correct Answer: a
  123.4567
– 123.4556
----------
    0.0011
Here, the higher-order digits become 0, reducing the number of significant digits drastically.
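The loss of significant digits in this subtraction can be reproduced with decimal arithmetic. This is a small sketch; the helper name `significant_digits` is an assumption, and it simply counts the digits stored in the value.

```python
from decimal import Decimal


def significant_digits(value):
    """Number of digits stored in a Decimal value (leading zeros are not stored)."""
    return len(value.as_tuple().digits)


a = Decimal("123.4567")
b = Decimal("123.4556")
difference = a - b

print(difference)                                         # 0.0011
print(significant_digits(a), significant_digits(difference))  # 7 2
```

Both operands carry 7 significant digits, but only 2 remain in the difference.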
Q3. The truth table below shows the results of logical operation “x @ y.” Which of the following
expressions is equivalent to this operation?
x y x@y
True True False
True False False
False True True
False False False
Answer 3
Correct Answer: b
In logic operations, we assign “1” for “true” and “0” for “false.” It is easier to use familiar
notation, so we shall use the following symbols:
Then, the logical expressions in the answer group can be rewritten as follows:
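a) x OR (NOT y): $x + \bar{y}$
b) (NOT x) AND y: $\bar{x} \cdot y$
c) (NOT x) AND (NOT y): $\bar{x} \cdot \bar{y}$
d) (NOT x) OR (NOT y): $\bar{x} + \bar{y}$
Then we check to see which of the expressions in the answer group matches (has the identical
results with) the given logic operation:
x | y | $\bar{x}$ | $\bar{y}$ | a) $x + \bar{y}$ | b) $\bar{x} \cdot y$ | c) $\bar{x} \cdot \bar{y}$ | d) $\bar{x} + \bar{y}$ | x@y
1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0
1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0
0 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 1
0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0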
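The comparison in the table can be automated by evaluating each candidate expression for all four input combinations. The dictionary names below are illustrative assumptions.

```python
from itertools import product

# Truth table of "x @ y" as given in the question.
target = {
    (True, True): False,
    (True, False): False,
    (False, True): True,
    (False, False): False,
}

candidates = {
    "a": lambda x, y: x or (not y),
    "b": lambda x, y: (not x) and y,
    "c": lambda x, y: (not x) and (not y),
    "d": lambda x, y: (not x) or (not y),
}

matches = [
    name
    for name, expr in candidates.items()
    if all(bool(expr(x, y)) == target[(x, y)]
           for x, y in product([True, False], repeat=2))
]
print(matches)  # ['b']
```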
Q4. When the syntax for numerical values is defined as shown below, which of the following
expressions is treated as <numerical value>?
Answer 4
Correct Answer: b
This answer conforms to the third form (<numerical string> E <sign> <numerical string>) of
the definition of <numerical value>.
This type of definition is called BNF notation (Backus-Naur Form). BNF notation is used as a
way to formally denote the syntax of a programming language.
Overview of BNF notation is as follows:
α::=β → The left-hand side α is defined as the right-hand side β. In other words, α = β.
< α > → This denotes the variable α. < > can be omitted.
| → This means “or.” “α::=β | γ” means “α::=β” or “α::=γ.”
a) By the definition of <numerical value>, “–” (<sign>) must follow “E.” The underlined part
does not satisfy the definition. –12
c) By the definition of <numerical value>, “+” (<sign>) must follow “E.” The underlined part
does not satisfy the definition. +12E – 10
d) By the definition of <numerical value>, “+” (<sign>) must follow “E.” The underlined part
does not satisfy the definition. +12E10
Q5. A key is composed of 3 alphabetic characters. When the hash value h is decided with the
following expression, which of the following collides with the key “SEP”? Here, “a mod b”
represents the remainder when a is divided by b.
Alphabetic character   Position     Alphabetic character   Position
A                      1            N                      14
B                      2            O                      15
C                      3            P                      16
D                      4            Q                      17
E                      5            R                      18
F                      6            S                      19
G                      7            T                      20
H                      8            U                      21
I                      9            V                      22
J                      10           W                      23
K                      11           X                      24
L                      12           Y                      25
M                      13           Z                      26
Answer 5
Correct Answer: b
A hash value is the result of converting the key by a hash function, which is used for hashing.
The term “hashing” refers to a process of performing some sort of calculation on the key to
convert it to an address value in order to obtain the storage address of the record in a direct
organization file, for example. Here, the function used to obtain the address is called a hash
function. If hashing generates the same hash value for two or more different keys, it is called a
collision. Records that came in later when a collision occurred are called synonyms.
Calculating the hash value for “SEP” by means of the given hash function, we can obtain the
following:
h = (sum of positions for each alphabetic character used in the key) mod 27
= (19 + 5 + 16) mod 27
= (40) mod 27
= 13 (40 ÷ 27 = 1 remainder 13)
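The calculation above can be written as a short function. The name `hash_value` is an illustrative assumption; the function follows the expression used in the answer (sum of alphabetic positions, mod 27) and assumes the key consists of upper-case letters A to Z.

```python
def hash_value(key, modulus=27):
    """Sum of the alphabetic positions (A=1, ..., Z=26) of the key, mod `modulus`."""
    return sum(ord(ch) - ord("A") + 1 for ch in key) % modulus


print(hash_value("SEP"))  # 13
```

Any other key whose hash value is also 13 collides with “SEP.”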
Q 6. In the heap shown below, the value of a parent node is less than the values of child nodes.
When inserting a node into this heap, an element is added at the very end. If that element is
less than the parent node, the parent and child are exchanged with each other. If element 7 is
added to the heap at the position marked by the asterisk (*), what element will end up at
position A?
9
11 14
A
24 25 19 28
29 34 *
a) 7 b) 11 c) 24 d) 25
Answer 6
Correct Answer: b
Add the element to the given position and then repeat the procedure to exchange the child and
the parent when the child element has a value smaller than the parent value. “7” is the added
element here.
Exchange
Exchange
Exchange
Now, the heap is complete. Hence, the element that ends up at position A is “11.”
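The three exchanges can be traced with a minimal sketch. It assumes the heap is stored level by level in a list, and it takes position A to be the parent of the asterisk position (the node initially holding 25), which is consistent with the stated answer; the function name `insert_min_heap` is illustrative.

```python
def insert_min_heap(heap, value):
    """Append value and exchange it with its parent while it is smaller.

    Returns the number of exchanges performed.
    """
    heap.append(value)
    child = len(heap) - 1
    exchanges = 0
    while child > 0:
        parent = (child - 1) // 2
        if heap[child] >= heap[parent]:
            break
        heap[child], heap[parent] = heap[parent], heap[child]
        child = parent
        exchanges += 1
    return exchanges


heap = [9, 11, 14, 24, 25, 19, 28, 29, 34]   # levels read from left to right
print(insert_min_heap(heap, 7))  # 3
print(heap)     # [7, 9, 14, 24, 11, 19, 28, 29, 34, 25]
print(heap[4])  # 11 (assumed position A)
```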
Answer 7
Correct Answer: b
A stack is a data structure of the type known as Last-In First-Out, where data stored last will be
the first data to be taken out. The operation of inserting data into a stack is called a “push,” and
the operation of taking data out of a stack is called a “pop.”
a) FIFO (First-In First-Out) is the data structure of a queue, where data stored first will be
the first data to be taken out.
c) LILO (LInux LOader) is a boot loader (program to load the OS into memory) that allows
PCs to read Linux.
Translator’s note: It seems natural that LILO means “Last-In Last-Out” in this question.
d) LRU (Least Recently Used) means “least accessed in recent history” and is used as the
page-replacing algorithm in a virtual memory system. This is the method of paging-out
which discards the least recently accessed page.
Q8. The decision table below shows the conditions for creating reports from employee files. Which
of the following can be concluded from this decision table?
Under age 30 Y Y N N
Male Y N Y N
Married N Y Y N
Output Report 1 – X – –
Output Report 2 – – – X
Output Report 3 X – – –
Output Report 4 – – X –
a) Report 1 contains the contents of Report 4 except for data on men age 30 and over.
b) Report 2 contains all unmarried men.
c) Men in Report 3 are also included in Report 2.
d) Persons included in Report 4 are not included in any of the other reports.
Answer 8
Correct Answer: d
Let the negation of “married” be “unmarried” and the negation of “male” be “female.” Now,
read the answer group descriptions carefully. In the following explanation, the underlined parts
indicate negation (N).
a) The output conditions for Report 1 are “under 30, not male, married.”
→This is “under 30, female, married.”
The output conditions for Report 4 are “not under 30, male, married.”
→This is “at least 30, male, married.”
So, Report 1 contains females only. Report 4 contains males only, and removing those
“males, at least 30” from Report 4 causes it to be the empty set. Hence, this description is
wrong.
b) The output conditions for Report 2 are “not under 30, not male, not married.”
→This is “at least 30, female, unmarried.”
So, Report 2 contains females only, and it is not true that “all unmarried men” are included.
Hence, this description is wrong.
c) The output conditions for Report 3 are “under 30, male, not married.”
→This is “under 30, male, unmarried.”
The output conditions for Report 2 are “not under 30, not male, not married.”
→This is “at least 30, female, unmarried.”
So, Report 3 contains males only while Report 2 contains females only. There is no
intersection. Hence, this description is wrong.
d) By elimination this must be the correct answer, but let us check it. Organizing all of the
output criteria for all of the reports from a), b), and c) in the answer group, we get the
following:
Report 1: “under 30, female, married” (from “a”)
Report 2: “at least 30, female, unmarried” (from “b”)
Report 3: “under 30, male, unmarried” (from “c”)
Report 4: “at least 30, male, married” (from “a”)
The condition “at least 30” for Report 4 is also for Report 2, but the other conditions are
in negation of each other, so no person is included in both. Further, the condition “male”
is also for Report 3, but the other conditions are in negation of each other also. Similarly,
the condition “married” is also for Report 1, but again the other conditions are in
negation of each other. Therefore, no other reports contain those persons contained in
Report 4. This is the correct description.
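The decision table can also be checked mechanically. In the sketch below, each column of the table becomes one entry of a dictionary keyed by (under 30, male, married); the names `RULES` and `report_for` are illustrative assumptions.

```python
from itertools import product

# One entry per column of the decision table:
# (under age 30, male, married) -> report number
RULES = {
    (True, False, True): 1,
    (False, False, False): 2,
    (True, True, False): 3,
    (False, True, True): 4,
}


def report_for(under_30, male, married):
    """Return the report a person appears in, or None if in no report."""
    return RULES.get((under_30, male, married))


people = list(product([True, False], repeat=3))

# d) Each combination maps to at most one report, so nobody in Report 4
#    can appear in another report.
print([p for p in people if report_for(*p) == 4])   # [(False, True, True)]

# b) Unmarried men appear in Report 3 or in no report, never in Report 2.
unmarried_men = [p for p in people if p[1] and not p[2]]
print([report_for(*p) for p in unmarried_men])      # [3, None]
```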
Q9. The flowchart below illustrates the Euclidean algorithm for obtaining the greatest common
divisor of values “A” and “B,” by repeated subtraction. When “A” is 876 and “B” is 204, how
many comparisons are required for completion of this process?
[Flowchart: Start … Output A, B, L … End]
a) 4 b) 9 c) 10 d) 11
Answer 9
Correct Answer: d
The Euclidean algorithm is an algorithm to obtain the greatest common divisor of two integers
A and B. However, you need not know this algorithm; all you have to do is to track how the
data changes. First, by “A → L” and “B → S,” the values for which the greatest common divisor
is to be obtained are rewritten as L and S. The algorithm then determines which is greater and
subtracts the smaller from the larger. Then, in case of “L=S,” the algorithm stops.
Since initially A=876 and B=204, we subtract B (=S) from A (=L) as many times as possible.
Note that the values must be compared first before the subtraction takes place.
(1) Under the condition of L=876 and S=204, repeat subtraction until L<S. Since
876 ÷ 204 = 4 with remainder 60, the subtraction and replacement “L – S → L” can be
executed 4 times before “L < S” is satisfied. Hence, the comparison (L:S) occurs 4 times.
(2) Under the condition of L=60 and S=204, repeat subtraction until L>S. Since
204 ÷ 60 = 3 with remainder 24, the subtraction and replacement “S – L → S” can be
executed 3 times before “L > S” is satisfied. Hence, the comparison (L:S) occurs 3 times
here.
(3) Under the condition of L=60 and S=24, repeat subtraction until L<S. Since
60 ÷ 24 = 2 with remainder 12, the subtraction and replacement “L – S → L” can be
executed 2 times before “L < S” is satisfied. Hence, the comparison (L:S) occurs 2 times
here.
(4) Under the condition of L=12 and S=24, repeat subtraction until L=S. Since
24 ÷ 12 = 2 with remainder 0, the subtraction and replacement “S – L → S” is
executed once, after which “L = S” is satisfied. The comparison (L:S) occurs 2 times
here: once before the subtraction and once more to detect “L = S.”
(5) The number of times of comparison for “L:S” is calculated as follows:
We can now add the numbers from (1) through (4).
Total number of times of comparison = 4 + 3 + 2 + 2 = 11 (times)
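The count can be confirmed by tracing the algorithm in code. Since the flowchart itself is not shown here, the sketch assumes one three-way comparison of L and S per pass through the loop, as the explanation above describes; the function name `gcd_by_subtraction` is illustrative.

```python
def gcd_by_subtraction(a, b):
    """Return (greatest common divisor, number of L:S comparisons)."""
    l, s = a, b
    comparisons = 0
    while True:
        comparisons += 1          # compare L with S
        if l == s:
            return l, comparisons
        if l > s:
            l -= s                # L - S -> L
        else:
            s -= l                # S - L -> S


print(gcd_by_subtraction(876, 204))  # (12, 11)
```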
Q10. When the algorithms described by the two flowcharts below are performed on a positive integer
M, which of the following conditions needs to be inserted in the box below so that the same
value x can be obtained?
[Flowcharts (left and right): Start … Operation … End]
Answer 10
Correct Answer: a
The notation “n: M, -1, 1” at the loop limit means, as explained in the question, the following:
let the initial value of n be M, add “– 1” (subtract 1) each time the loop is executed, and stop
when the final value “1” is reached. This means that the value of n changes from M, M – 1, M –
2, …, 2, and 1.
Let us follow the flowchart on the left first. Starting with n = M and decreasing the value by 1
each time the loop is executed until the value gets to 1 (n = M, M – 1, M – 2, …, 2, 1), the
following operation is going on since “x × n → x” is executed within the loop. As “1 → x”
suggests, the initial value for x is 1. Let us track how the value of x changes as n changes:
n=1: x × n = 1 × 1 = 1 → x (x = 1)
n=2: x × n = 1 × 2 → x (x = 1 × 2)
n=3: x × n = 1 × 2 × 3 → x (x = 1 × 2 × 3)
n=4: x × n = 1 × 2 × 3 × 4 → x (x = 1 × 2 × 3 × 4)
and so on…
Let us consider how large n should be in order to make the result identical to the result of the
flowchart on the left. The flowchart on the left repeats “ x × n → x ” to execute the following
calculation:
The multiplication begins with M here; on the right, the multiplication begins with 1. Hence, as
shown below, if the multiplication continues until M, the results of the two flowcharts will be
identical:
Therefore, we are to repeat “ x × n → x ” until n = M. Following the flowchart, after the command
“ x × n → x ,” the program executes “ n + 1 → n ,” so after n=M is multiplied, we will have n =
(M+1). This means that the program should flow to the “end” branch when “n=M+1” is satisfied.
Among the options in the answer group, this condition is “n>M.”
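The equivalence of the two procedures can be checked with a short sketch. The flowcharts are not shown here, so the loop shapes below are assumptions based on the explanation: the first loop counts n down from M to 1, and the second starts at n = 1, multiplies, increments n, and ends when “n > M” holds.

```python
def descending(m):
    """Left flowchart (assumed shape): n runs M, M-1, ..., 1."""
    x = 1
    for n in range(m, 0, -1):
        x = x * n
    return x


def ascending(m):
    """Right flowchart (assumed shape): multiply, increment n, stop when n > M."""
    x = 1
    n = 1
    while True:
        x = x * n
        n = n + 1
        if n > m:          # the condition to be inserted in the box
            break
    return x


for m in (1, 2, 5, 10):
    print(m, descending(m), ascending(m))
```

Both functions return the same value for every positive integer M, namely the product 1 × 2 × … × M.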
Chapter Objectives
A computer system is composed of hardware and
software. There are many types of computer, but the
principles of their operation are fundamentally the same.
We will learn the mechanism of computers (hardware) in
Section 1 and software (operating system) for efficient
computer use in Section 2. We will further learn some
configurations of computer systems for achieving
improved reliability in Section 3 and ways to evaluate the
performance of computers in Section 4. Finally, in Section
5, we will learn various systems that use computers.
2.1 Hardware
2.2 Operating Systems
2.3 System Configuration Technology
2.4 Performance and Reliability of Systems
2.5 System Applications
2.1 Hardware
Introduction
1
Volatility and non-volatility: The property that the contents of memory are lost when the power is turned off is
called volatility. RAM is a type of volatile memory. On the other hand, the property that the contents of memory are not
lost when the power is turned off is called non-volatility. ROM is a type of non-volatile memory.
2
(Hints & Tips) Flash memory is classified as EEPROM.
There are two typical types of RAM: SRAM and DRAM. The characteristics of SRAM and
DRAM can be summarized as shown in the following table.
SRAM is composed of flip-flops,5 so it does not require any refresh operations and can
read and write information at high speed. However, for the same capacity its cost is higher
than that of DRAM, because the SRAM structure is more complicated. For this
reason, it is used mainly where speed, not cost, is important, such as in cache
memory. It is also used in battery-operated devices.
DRAM consists of condensers (capacitors) and transistors, and represents 1 or 0 by whether or not
there is electrical charge in each condenser. As time elapses, the electrical charge in the
condensers leaks away, resulting in memory loss; therefore, the contents need to be re-written
(refreshed) at certain time intervals (every few milliseconds). Since the structure is rather
simple, the manufacturing cost is low, and it is mainly used in the main memory of PCs.6
Types of DRAM equipped with high-speed data transfer functions include SDRAM, DDR
SDRAM, etc.
3
Graphics memory: It is memory used when images and characters are displayed on the display screen using a computer.
It is also referred to as video memory (VRAM).
4
Cache memory: It is high-speed memory placed between main memory and the CPU to speed up data reading from
main memory to the CPU.
5
Flip-flop (also known as a bistable circuit): It is an electrical circuit with two stable states, which maintains its state until an
input that changes the state is entered.
6
(FAQ) Frequently there are questions that compare SRAM and DRAM. You should have good knowledge of the level of
integration, usage, structure, etc.
The term architecture refers to “structures or organizations.” The processor architecture refers
to the configuration and operating principles of the computer.
Configuration of Computer
Below is a figure showing the basic configuration of a computer. This configuration is called
the “big five units” or “big five functions,”7 because there are five major components.
[Figure: basic configuration of a computer. The control unit and the operation unit form the processing unit; the figure distinguishes control flow from data flow.]
The control unit and the operation unit are together called the processing unit or the central
processing unit (CPU).
Address modification is a function that obtains the value of the address actually accessed
based on the address specified by the instruction. The method for address modification is called
an addressing method. The address actually accessed as the result of the address modification
is called the effective address. Addressing methods are described below in detail.
7
“The big five units”:
• Control unit: It is the unit that controls the entire computer. It extracts and reads instructions of the program stored in
the main memory and sends to various units the directions necessary to execute the instruction.
• Operation unit: It is the unit that performs the arithmetic operations, logic operations, and other operations. It
consists of adders, registers, complementers (units that convert values to their complements), etc.
• Memory: It is a generic term for units that store data, programs, etc. It can be classified into main memory and
auxiliary memory.
• Input unit: It is a generic term for units that enter programs and data into the computer.
• Output unit: It is a generic term for units that output the results of computer processing as characters and numbers that
we can recognize.
In this method, the content stored at the address designated by the address part of the instruction
becomes the data subject to operation.
In this method, the data stored in the address designated by the address part of the instruction
are not the data subject to operation; rather, the data stored at the address designated by that
content are the data subject to operation.
In this method, the effective address is the sum of the value of the address part of the instruction and
the value of the index register.8 For example, when processing an array, we can access the
content of another address simply by changing the content of the index register.9
y x+y
8
Register: It is low-capacity, high-speed memory where data is temporarily stored. It is located in the CPU. There are
various registers, including the following: general registers, for storing intermediate and final results of operations; status
registers for indicating the CPU state after an instruction is executed; index registers for address calculations; and base
registers.
9
(FAQ) There are questions on the concept of addressing methods. An example is “Which of the following is an
appropriate description of the direct addressing method?” Be sure to have these methods organized in your mind: the direct
addressing method, index addressing method, immediate value addressing method, etc.
In this method, the effective address is the sum of the address designated by the address part of
the instruction and the content of the base address register.10
B b b+y
In this method, the address part of the instruction stores the data subject to processing itself, not
an address.11
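The addressing methods described above can be compared with a toy model. Everything in this sketch is hypothetical: main memory is a dictionary from address to content, the register values are made up, and the mode names (“direct,” “indirect,” “index,” “base,” “immediate”) are the usual names assumed for the five descriptions.

```python
def effective_address(mode, address_part, memory, index_reg=0, base_reg=0):
    """Return the address actually accessed for the given addressing method."""
    if mode == "direct":
        return address_part                  # the address part is the address
    if mode == "indirect":
        return memory[address_part]          # the addressed cell holds the address
    if mode == "index":
        return address_part + index_reg      # address part + index register
    if mode == "base":
        return address_part + base_reg       # address part + base register
    raise ValueError(f"no effective address for mode {mode!r}")


def operand(mode, address_part, memory, **registers):
    """Return the data subject to operation."""
    if mode == "immediate":
        return address_part                  # the address part is the data itself
    return memory[effective_address(mode, address_part, memory, **registers)]


# Hypothetical memory contents.
memory = {100: 200, 103: 55, 200: 77, 1100: 88}

print(operand("immediate", 100, memory))            # 100
print(operand("direct", 100, memory))               # 200
print(operand("indirect", 100, memory))             # 77
print(operand("index", 100, memory, index_reg=3))   # 55
print(operand("base", 100, memory, base_reg=1000))  # 88
```

The same address part (100) yields different data under each method, which is the point usually tested in exam questions.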
These computers have only a set of simple, frequently used instructions integrated onto a single
VLSI (very large scale integration) chip in order to achieve high performance through
improved machine cycles (operation speed) and a reduction in instruction processing time. The
emphasis is placed on keeping the length of each instruction to a fixed length and limiting the
time required to execute each instruction to a fixed amount. By doing so, the technology of
pipeline control has been easily implemented. However, the number of instructions to be
executed becomes large unless efficient object programs are created, so it is essential that the
compiler have an optimization function.12 Most computers called workstations are of this type.
These computers have complex instructions integrated onto a single VLSI chip in order to
achieve high overall performance. Most general-purpose computers are CISCs.
10
(Note) The base addressing method can be used regardless of where in the main memory the program is stored, simply
by changing the value of the base register. Such a structure is called a re-locatable structure.
11
(Hints & Tips) Note that in the immediate value addressing method, the address of the main memory is not designated.
12
Optimization: It is a function of a compiler to eliminate redundancy of a program in order to reduce the execution time
of the object program and the size of the program. This is done in a variety of ways, such as calculating constants in
advance, simplifying formulas, and eliminating double loops.
Pipeline Control
We have mentioned RISC and CISC as technologies to improve computer processing speed. To
further improve the speed, the RISC system uses pipeline control. Pipeline control is a
technology to reduce the instruction execution time of the CPU. The idea is as follows: the
execution of an instruction is divided into 5 or 6 steps; if each step
is completed within a fixed amount of time and the steps stay independent of
one another, the overall processing speed can be improved by starting
each instruction one step behind the previous instruction. In reality, however, due to branching
instructions, there are times when the next instruction address is not yet determined,
and some steps are not completed within the fixed processing time. Things do not always
work in the ideal way, but pipeline control does process instructions concurrently,
providing one way of speeding up the computer.13
Time →
Instruction 1   I1 I2 I3 I4 I5
Instruction 2      I1 I2 I3 I4 I5
Instruction 3         I1 I2 I3 I4 I5
Instruction 4            I1 I2 I3 I4 I5
(Execution order: Instruction 1 → Instruction 4)
I1: Fetch an instruction
I2: Decode
I3: Calculate an address
I4: Fetch data
I5: Execute
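The gain shown in the figure can be estimated with a small calculation. The sketch assumes the ideal case described above (every step takes one time unit and no branch disturbs the flow); the function names are illustrative.

```python
def sequential_time(instructions, stages):
    """Time units when each instruction finishes before the next one starts."""
    return instructions * stages


def pipelined_time(instructions, stages):
    """Time units when each instruction starts one step behind the previous one."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


# The figure: 4 instructions, 5 steps each.
print(sequential_time(4, 5))  # 20
print(pipelined_time(4, 5))   # 8
```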
There are many requirements for memory, but requirements for high speed and large capacity14
are of particular importance. In general, however, high-speed memory is expensive and has
small capacity whereas low-speed memory is inexpensive and has large capacity. So, efforts are
being made to combine high-speed but small-capacity memory and low-speed but
large-capacity memory to develop high-speed and large-capacity memory.
13
(FAQ) Questions regarding pipeline control often appear on exams. Most of them are in the form of choosing an
appropriate description of pipeline control, so you only have to know that pipeline control executes instructions
concurrently.
14
(Note) Besides these, requirements for memory include reliability, ability of random access, non-volatility, re-writable
function, portability, low cost, etc.
Memory Hierarchy
Memory hierarchy is a hierarchical representation of the relationship between the access speed
and capacity of various types of memory.15
(Expensive, fast, small capacity)
Register
Cache memory16
Main memory
Disk cache
Auxiliary memory (hard disk, etc.)
Large-capacity memory (optical disk, etc.)
(Inexpensive, slow, large capacity)
The cache memory placed between a hard disk and the main memory is called the disk cache.
The figure below shows the relationship between the cache memory and the disk cache.
[Figure: the cache memory (primary cache and secondary cache17) lies between the CPU and main memory; the disk cache lies between main memory and the hard disk.]
15
(Note) If $t$ is the average memory access time, $t_m$ is the access time of the main memory, $t_c$ is the access time of the
cache memory, and $h$ is the hit ratio, then the following equation holds: $t = t_c h + t_m (1 - h)$.
16
Hit ratio: It is the probability that the data or instructions needed to execute a program are found in the cache memory.
17
Secondary cache: The primary cache is the cache memory which is built in the CPU; the secondary cache is the cache
memory placed between the primary cache and the main memory.
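The average access time formula in the note above can be tried with sample figures. The values below (10 ns for the cache, 100 ns for main memory, hit ratio 0.9) are hypothetical and chosen only for illustration; the function name is also an assumption.

```python
def average_access_time(t_cache, t_main, hit_ratio):
    """t = t_c * h + t_m * (1 - h)"""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit ratio must be between 0 and 1")
    return t_cache * hit_ratio + t_main * (1 - hit_ratio)


# Hypothetical values: cache 10 ns, main memory 100 ns, hit ratio 0.9.
t = average_access_time(10, 100, 0.9)
print(round(t, 3))  # 19.0
```

With these figures the average access time is about 19 ns, much closer to the cache speed than to the main memory speed.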
…
Central Processing Unit (CPU)18
Data and programs are stored over a sequence of addresses (horizontally), but the memory is
accessed in bank units (vertically). This allows concurrent access to a sequence of addresses.
Magnetic tape is a medium that records data onto a tape that has been magnetically coated.
This medium has a low unit price and a large capacity, so it is used for purposes
such as backing up entire hard disks.
Capacity Calculation
The record format of a magnetic tape is shown below. As we can see in this figure, in order to
record one block, we need to include an IBG (inter-block gap), which is an area to identify the
block and contains a special code. A magnetic tape reader reads data in block units, so this area
is crucial for identifying the end of each block.
[Figure: record format of a magnetic tape, showing the record length and the block length.]
18
(FAQ) Interleaving is a way to speed up memory. Questions on the concept of interleaving have appeared often, so be
sure you understand this.
Assuming that the specifications of a magnetic tape are given below, let us calculate precisely
the number of records that can be stored on this single magnetic tape.19
The blocking factor is 100. → 100 records can be stored in one block.
Record length → 80 bytes
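So the number of bytes L1 for each block, excluding the IBG, is as follows:
L1 = 80 (bytes) * 100 = 8,000 (bytes)
The IBG is 15 mm long and the length L3 of one block including its IBG is L3 = 140 mm, so the 8,000 bytes of data occupy 125 mm, which corresponds to a record density of 64 bytes/mm.20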
Let us now calculate precisely how many records can be stored on one magnetic tape.
(1) Calculating the number of blocks that can be stored on one tape
Since the length of the tape is 730 m (730 * 10^3 mm), the number B1 of blocks that can be
recorded on one tape is as follows:
B1 = (Length of the tape) / (Length of a block)
= 730 * 10^3 / 140
19
(FAQ) On each exam, there is at least one question dealing with the calculation of the capacity or performance of a
magnetic tape or a hard disk. If you keep these ideas organized in your mind, you can answer these questions because the
difference is only in the numerical values.
20
(Hints & Tips) Besides “bytes/mm,” the record density can be represented in “columns per mm” or “bpi.” A column is
the same as a byte. “bpi” stands for “bytes per inch,” which is the number of columns per inch. Converting from inches to
mm is necessary here, but you need not remember the formula since the relationship between inches and mm will be given
in the question.
= 5,214.285…≒5,214 (blocks).21
The fractional part 0.285 is less than one block, so we truncate it.
(2) Calculating the number of records that can be stored on one tape
Since one tape can record 5,214 blocks, and each block has 100 records, the number B2 of
records that can be stored on one tape is as follows:
B2 = (number of blocks that can be stored on one tape) * (blocking factor)
= 5,214 * 100
= 521,400 (records)
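The two steps above can be reproduced with integer arithmetic. This is a minimal sketch using the values from the text (a 730 m tape, 140 mm per block including the IBG, and a blocking factor of 100); the function and parameter names are illustrative.

```python
def tape_capacity(tape_length_mm: int, block_length_mm: int, blocking_factor: int):
    """Return (blocks per tape, records per tape), truncating the partial block."""
    blocks = tape_length_mm // block_length_mm  # the fractional block is discarded
    records = blocks * blocking_factor
    return blocks, records


blocks, records = tape_capacity(
    tape_length_mm=730 * 10**3,  # 730 m
    block_length_mm=140,         # L3: block plus IBG
    blocking_factor=100,
)
print(blocks, records)
```

Floor division is used because a fraction of a block cannot hold a full block of records.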
Performance Calculation
The running speed of the tape is constant when reading or writing data. In theory, the tape
begins to accelerate in the middle of IBG and starts to read and write when a constant speed is
achieved. When another IBG is found, the tape decelerates and stops in the middle of IBG.
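[Figure: running speed of the tape plotted against time over the sequence IBG, block, IBG, block, IBG. The tape starts and accelerates within an IBG, runs at a constant speed across the block, and then decelerates and stops within the next IBG.]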
Assuming that the specifications of a magnetic tape are given below, let us calculate the time it
takes the magnetic tape to read one block.22
Transfer time for one block can be obtained by dividing the length of a block by the data
transfer speed:
Data transfer time for one block = (block length) / (data transfer speed)
Data transfer time for one block = 8,000 (bytes) / 320 (Kbytes/sec)
= 0.025 (sec) = 25 (msec), taking 1 Kbyte = 1,000 bytes
21
(Hints & Tips) The fractional part of the number of blocks is discarded here, but it actually becomes a short block,
which is a block with fewer records than the other blocks.
22
(Hints & Tips) In performance calculations, data is transferred in blocks, so we do not need to consider the length of
IBG.
We add the start-up time to the data transfer time for one block.
Note that the start-up time is added but not the stop time. When being read, no data is
transferred until the beginning of a block is reached. Hence, the time until this is achieved is
part of the waiting time. After that, data is transferred, but when the data transfer is completed,
the stop operation and the program processing are performed concurrently. Thus there is no
need to add the stop time.23
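The reading time for one block can be sketched as the start-up time plus the data transfer time. The block length (8,000 bytes) and transfer speed (320 Kbytes/sec) come from the text; the start-up time of 5 msec and the convention 1 Kbyte = 1,000 bytes are assumptions made only for this illustration.

```python
def tape_block_read_time_ms(block_bytes: int, transfer_bytes_per_sec: int,
                            startup_ms: float) -> float:
    """Return start-up time plus transfer time for one block; stop time is not added."""
    transfer_ms = block_bytes / transfer_bytes_per_sec * 1000
    return startup_ms + transfer_ms


read_time = tape_block_read_time_ms(
    block_bytes=8_000,
    transfer_bytes_per_sec=320 * 1000,  # assumes 1 Kbyte = 1,000 bytes
    startup_ms=5.0,                     # hypothetical start-up time
)
print(f"{read_time:.1f} msec")
```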
A hard disk is a medium that achieves random and high-speed reading and writing of data,
consisting of 1 to 10 round disks coated with a magnetic substance on the front and the back
sides and rotated at a high speed. If there is only one disk, it is called a floppy disk.24
[Figure: structure of a hard disk, showing the disks in high-speed rotation, the tracks (circumference portion), the cylinders (cylindrical portion, each with as many tracks as there are recording sides), the read/write heads (the portion that reads and writes data), and the access arm. Some disks do not record data on the very top and very bottom sides (protective sides).]
23
(Hints & Tips) That stop time is not included is a standard assumption in exam questions.
24
(Note) Another medium that, like a floppy disk, can read and write and can easily be carried around is MO (magneto
optical disk), which is very popular because its capacity is about 600 times that of a floppy disk.
Capacity Calculation
Just as on a magnetic tape, data is recorded in blocks on a hard disk. However, if a block cannot
fit into a track, it cannot be recorded.
First, data is recorded on a track, and when that track is filled, the recording proceeds to the
next track which is a track on the corresponding circumference of the next surface. In other
words, data is recorded in cylinder units.
Assuming that the specifications of a hard disk are given below, let us calculate how
many cylinders are necessary to write 100,000 records.
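Rt = N * (blocking factor)
= 6 * 8 = 48 (records)
Here Rt is the number of records per track and N is the number of blocks per track. The divisor Rs = 912 used next is the number of records per cylinder, which equals Rt * 19 (48 records on each of 19 tracks).
S = (number of records) / Rs
= 100,000 / 912 = 109.649… ≒110 (cylinders) (rounded up).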
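The cylinder count can be reproduced as follows. The values 6 blocks per track, a blocking factor of 8, and 912 records per cylinder appear in the calculation above; the figure of 19 tracks per cylinder is inferred from 912 / 48 and should be read as an assumption, as should the function name.

```python
def cylinders_needed(num_records: int, blocks_per_track: int,
                     blocking_factor: int, tracks_per_cylinder: int) -> int:
    """Return the number of cylinders needed, rounding a partial cylinder up."""
    records_per_track = blocks_per_track * blocking_factor
    records_per_cylinder = records_per_track * tracks_per_cylinder
    # Ceiling division using integers only.
    return -(-num_records // records_per_cylinder)


cylinders = cylinders_needed(
    num_records=100_000,
    blocks_per_track=6,
    blocking_factor=8,
    tracks_per_cylinder=19,  # inferred from 912 / 48
)
print(cylinders)
```

Unlike the tape capacity calculation, the result is rounded up because a partly used cylinder still has to be allocated.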
In general, files are secured in cylinder units, so if there are unused tracks on a cylinder, the
Performance Calculation
The access time of a hard disk is calculated as follows:
Access time = seek time + latency time + data transfer time
Translator’s note: The waiting time (seek time + latency time) is often called the access time.
The seek time is the time during which the read/write head moves to the track where the data is
recorded. The latency time is the time until the desired data come under the read/write head.
The seek time and the latency time are determined by where the head is located, so we use the
average values. In actual exam questions, the seek time is always given, and the latency time
can be calculated from the duration of one rotation, which is the inverse of the
number of rotations per unit time. Since the minimum latency time is 0 and the maximum is 1
rotation time, the average latency time is the duration of a half rotation.
Assuming that the specifications of a hard disk are given below, let us calculate the
access time for reading data contained in one block (5,000 bytes).
The rotation speed of this hard disk is 2,500 rotations per minute, that is,
the disk makes 2,500 rotations every minute. Hence, the rotation time is as follows:
Rotation time = 60,000 (msec) / 2,500 (rotations) = 24 (msec)
Note carefully these units. The rotation speed is given in rotations per minute, but the rotation
time is in milliseconds. Hence, we need to convert minutes to milliseconds (1 minute = 60
seconds = 60,000 milliseconds).
Since one rotation allows the transfer of data contained on one track, 20,000 bytes are
transferred in 24 milliseconds. Hence, 20,000 / 24 (bytes/msec) is the data transfer speed. We
could compute this quotient, but since it does not divide evenly, we leave it as a fraction here.
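Putting the pieces together, the access time for one block can be sketched as seek time plus average latency plus transfer time. The rotation speed (2,500 rotations per minute), track capacity (20,000 bytes), and block size (5,000 bytes) are taken from the text; the average seek time of 20 msec is a hypothetical value, since the specification table is not reproduced here.

```python
def disk_access_time_ms(seek_ms: float, rpm: int,
                        track_bytes: int, block_bytes: int) -> float:
    """Return seek time + average latency (half a rotation) + transfer time, in msec."""
    rotation_ms = 60_000 / rpm           # one rotation, minutes converted to msec
    latency_ms = rotation_ms / 2         # average of 0 and one full rotation
    # One track is transferred per rotation, so a block takes its share of a rotation.
    transfer_ms = block_bytes * rotation_ms / track_bytes
    return seek_ms + latency_ms + transfer_ms


access_ms = disk_access_time_ms(
    seek_ms=20.0,        # hypothetical average seek time
    rpm=2_500,
    track_bytes=20_000,
    block_bytes=5_000,
)
print(f"{access_ms:.1f} msec")
```

Writing the transfer time as block_bytes * rotation_ms / track_bytes avoids evaluating the non-terminating quotient 20,000 / 24 first.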
25
(FAQ) On each exam, there is at least one question dealing with the calculation of the capacity or performance of a
magnetic tape or a hard disk. If you keep these ideas organized in your mind, you can answer these questions because the
difference is only in the numerical values.
26
Seek time/Latency time: The seek time is sometimes called the positioning time. The latency time is sometimes called
the search time.
The calculations above are based on the rotation speed of the hard disk. However, auxiliary
memory, such as a hard disk and a magnetic tape, exchanges data with the computer through
input/output channels. 27 It is therefore necessary to install input/output channels with
appropriate transfer speeds.28
Sector (arc)
For example, suppose that each track consists of 12 sectors, each of which consists of 1,200
bytes on a hard disk. To store files whose record length is 900 bytes, there is no sector
remainder, as shown below, if the blocking factor is 4.
[Figure: one block of 3,600 bytes (four 900-byte records) occupies exactly three sectors of 1,200 bytes each.]
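The effect of the blocking factor on wasted sector space can be checked with a short calculation. The sketch uses the values from the example (900-byte records, 1,200-byte sectors); the function name is illustrative, and the sketch assumes that each block starts on a sector boundary.

```python
def sector_usage(record_bytes: int, blocking_factor: int, sector_bytes: int):
    """Return (sectors used by one block, unused bytes in its last sector)."""
    block_bytes = record_bytes * blocking_factor
    sectors = -(-block_bytes // sector_bytes)  # ceiling division
    return sectors, sectors * sector_bytes - block_bytes


# Compare blocking factors 1 through 4 for 900-byte records and 1,200-byte sectors.
usage = {bf: sector_usage(900, bf, 1_200) for bf in range(1, 5)}
for bf, (sectors, unused) in usage.items():
    print(f"blocking factor {bf}: {sectors} sector(s), {unused} bytes unused")
```

Only a blocking factor of 4 fills its sectors exactly, which matches the example.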
27
Input/output channel: Data-transfer path for exchanging data between auxiliary memory and the computer
28
(Hints & Tips) The data transfer speed of a hard disk is determined by the rotational speed of the disk, so it is
meaningless to have a high-speed channel. Channels are to be selected according to the transfer speed of the disk.
There are some terms related to the performance of storage media, including access time,
waiting time, transfer time, seek time, and latency time. RAID, sometimes called a disk array,
is a way to control multiple disks placed in parallel as if they were one unit.
[Figure: timeline from the I/O request to delivery complete, showing the access time and, within it, the seek time and the latency time.]
Since the head moves to the desired track while the disk is rotating, the rotation time and
moving time partially overlap.29 However, on IT exams, this overlap is almost always ignored.
Hence, there is no problem defining waiting time as follows:
Waiting time = seek time + latency time
29
(Hints & Tips) A hard disk is rotating as the head approaches the track, so the seek time and the latency time overlap
partially.
[Figure: blocks in the main memory are distributed across Disk 1, Disk 2, Disk 3, and Disk 4.]
RAID0
This is the method of writing blocks of a fixed size on multiple disks. Access is not centralized
on one single unit, so the input/output time can be reduced.30
RAID1
By recording the same data on two disks, this configuration enhances the safety of the data.31
RAID2, RAID3, RAID4
These are configurations where, in addition to the disks on which data is recorded, there is a disk
designated as the error-checking disk to prevent failures. RAID2 can correct errors. RAID3
and RAID4 can detect errors but cannot correct them. In RAID3, data is partitioned in
bits or bytes whereas RAID4 partitions data in block units.32
RAID5
This is where each data block is assigned a parity value. Data and parity are written on separate
disks, so the data is recoverable after a failure on a single disk.
Here, the data is divided up into blocks of a certain length, and three blocks are considered to
form a unit. For example, Blocks 1 through 3 are a unit, and for each bit, the exclusive logical
sum of blocks 1 through 3 is written on a separate disk as parity value 1 through 3. Similarly,
the exclusive logical sum of blocks 4 through 6 is written on a separate disk as parity value 4
through 6.
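The parity scheme described above can be illustrated with the exclusive logical sum (XOR) of three blocks. This is a minimal sketch, not an implementation of a RAID controller; the block contents and function name are made up for the example, and all blocks are assumed to have the same length.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Return the bitwise exclusive OR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Hypothetical contents of blocks 1 through 3, each stored on a different disk.
block1, block2, block3 = b"ABCD", b"EFGH", b"IJKL"
parity = xor_blocks(block1, block2, block3)  # written to a separate disk

# Suppose the disk holding block 2 fails: XOR the surviving blocks with the parity.
recovered = xor_blocks(block1, block3, parity)
print(recovered == block2)
```

Recovery works because XOR-ing a value with itself cancels out, leaving only the missing block.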
There is also RAID6, in which the parity values are separated as in RAID5 and the data is
recoverable even when two disks fail.34
30
(Hints & Tips) The only feature about RAID0 is that I/O is dispersed, so this is not a measure to improve reliability.
31
(Note) RAID1 is called mirroring since the same data is recorded on separate hard disks.
32
(Note) There is also RAID0+1, a combination of RAID0 and RAID1, already in use.
33
(Note) The parity of RAID5 uses the exclusive logical sum of multiple blocks. Hence, even if one of the disks should
fail, the damaged data can be recovered by taking the exclusive logical sum of the other blocks.
34
(FAQ) Often on the exam, there are questions of the form: “Which of the following statements is appropriate concerning
RAID…?” Remember the difference between RAID0 and RAID1.
Any storage excluding the main memory is called auxiliary storage. Auxiliary storage can be
used to compensate for the insufficient capacity of the main memory. In general, auxiliary
storage has larger capacities in comparison with the main memory.
Input and output units include both input units, where data is entered into the computer, and
output units, where data is taken out of the computer. A unit equipped with both the input and
output functions is called an input/output unit.
Auxiliary Storage
Typical auxiliary storage includes the following media. In the past, magnetic tapes and
floppy disks were the mainstream media; recently, however, the main types
have been hard disks, magneto-optical disks, CDs, and DVDs.35
35
(Note) DVDs are optical disks just as CDs are, but with reduced laser-light wavelength, the DVDs have larger capacities.
The record density on DVD is also larger.
36
(Hints & Tips) A magneto optical disk uses light and magnetism for writing data but uses only light for reading data.
37
(Note) DAT is a unit that records audio onto a magnetic tape using digital signals. It was originally designed for music,
but it is now used as a backup system because of its low cost.
Input units
The most common input devices are keyboards and pointing devices. A keyboard is used to
enter numerals and characters while pointing devices38 are used to enter coordinate values.
Other input units are shown in the following table.
Output units
38
(Note) Pointing device: A pointing device is any unit designed to enter coordinate positions such as a mouse, a tablet,
etc. Other examples include trackballs, digitizers, touch screens, etc.
39
(Note) OLED: It is a display using the organic light emitting diode technology. It uses organic materials that emit light
when voltage is applied. Compared with LCDs, the viewing angle is larger, the contrast is better, and the response speed is
higher, in addition to being thinner and lighter.
40
(Note) Liquid crystal display (LCD): A liquid crystal display can be of various types such as TFT and STN (currently
the mainstream is DSTN). STN has a simple structure with low manufacturing costs, but its resolution and contrast are also
low. DSTN is an improved version of STN, where contrast is enhanced. TFT has contrast and resolution equivalent to
those of CRT but is expensive.
Input and output interfaces are interfaces for connecting peripheral devices such as printers and
hard disk units to the PC and for transferring data. Depending on their types, the transfer may
be either serial data transfer or parallel data transfer.41
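[Figure: data transfer between the computer and peripheral devices. In serial transfer, the bits 0 1 0 1 0 1 0 1 are sent one after another over a single line; in parallel transfer, multiple bits are sent at the same time over several lines.]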
41
(FAQ) Frequently the exams have questions concerning combinations of I/O interfaces and data transfer methods. It is
good to know common I/O interfaces and data transfer methods.
Daisy chain
This is the connection method used in SCSI and GPIB, where peripheral units are connected
along a line. The last unit in the line requires a termination resistor called a terminator.
USB has two modes: the full speed mode of 12Mbps and the low speed mode of 1.5Mbps. In
the full speed mode, relatively high speed units such as printers and scanners are connected. In
the low speed mode, relatively slow units such as keyboards and mice are connected. Currently,
USB 2.0 has increased its high speed mode up to 480Mbps, so most peripheral units can be
connected.
Other interfaces
In addition to the above, there are other interfaces such as IDE (connecting a hard disk), ATA
(IDE standardized by ANSI), and ATAPI (connecting ATA with units other than a hard disk,
such as CD-ROM drive and tape streamer). However, currently CD-R and CD-RW units are
commonly connected via SCSI and USB.
42
Plug-and-play (Plug and play): This refers to the function of automatically installing and setting the device driver
when the peripheral unit or extension card is connected to the computer. The OS checks all the units connected to the
computer when it is started up, installing the required device drivers. If the OS does not have the device driver of the unit
in its own library, it requests installation of the device driver and, if necessary, even re-starts the computer automatically.
43
Hot plug: This is the function that enables the plug-and-play function while the computer and peripheral unit power is
on.
44
IEEE 1394: The standard where the transfer speed is 100Mbps, 200Mbps, or 400Mbps. This is equipped with a hot
swap function (peripheral units can be connected or disconnected without having to turn the power off).
Quiz
Q1 Compare DRAM and SRAM:
Q2 Among the components that compose a computer, which ones compose the central
processing unit (CPU)?
[Figure: block diagram of a computer showing the control unit, the operation unit, the memory, and the auxiliary memory, connected by control flow and data flow.]
A1
A3 It is high-speed small-capacity memory placed between the CPU or the register and
the main memory. The main memory is slower than the CPU or the register, so the
CPU processes can be made more efficient by storing frequently accessed data and
programs of the main memory in the cache memory.
A4 This is auxiliary memory in which multiple hard disks are placed in parallel and are
controlled as if they were one disk unit so that the input/output speed can be improved
and/or reliability can be enhanced. Sometimes the term RAID refers to such a method
instead of the storage. It is an attempt to speed up the process by spreading records
over multiple disks and reading the distributed records simultaneously.
Computers cannot operate with hardware alone. They function only with the use of software
called an operating system (OS).
The definition of an operating system (OS) is not clear. The basic software is called an OS in a
broad sense while the control program is called an OS in a narrow sense.
Configuration of OS
An operating system is the basic software that comprehensively controls and manages the entire
operation of hardware and software of a computer system. A program referred to as the basic
software and its role are shown below. 45
Service programs and language processors are sometimes called processing programs, which run
on the control program. For this reason, the control program is called an OS in a narrow sense.
45
(Note) Operating systems for personal computers include Windows XP, Mac OS X, and OS/2. Those for workstations
include Windows Server 2003, UNIX, and Mac OS X Server. For general-purpose machines, there is also MVS developed
by IBM. In addition, there is a free OS program called Linux, which is compatible with UNIX.
Objectives of OS
An OS attempts to improve the productivity of the entire system by eliminating unnecessary
operations and waste of various resources surrounding the computer and by operating the
computer system efficiently. The objectives of an OS are organized in the following figure.
Response to various processing modes: Batch processing, online real-time processing, etc.
Support of computer control and operation: Continuous processing, recording the operation data, etc.
Hardware resources include the central processing unit (CPU), memory, I/O units (including
channels), etc. The OS controls these resources so that they can be used efficiently.
One computer can handle various processing modes such as batch processing, remote batch
processing, online processing, real-time processing, and interactive mode processing. In particular,
since online processing has become widespread, the scope of computer applications has been
dramatically enlarged.
Indexes for reliability and safety include RASIS. This is a term coined by taking the initial letters
of the words Reliability, Availability, Serviceability, Integrity, and Security.
Application software refers to a program which runs under the control of the OS. The OS provides
an environment in which application programs can be efficiently executed.
46
Multiprogramming: It is a mechanism in which programs are processed alternately on one CPU so that it can appear
as though multiple programs were operating at the same time.
47
Spooling: It is accomplished by using high-speed hard disks as a virtual I/O unit. For example, directly printing on a
low-speed printer slows down the processing speed. Instead, the output results can be recorded on a high-speed hard disk
first, and then a service program, dedicated only to output, can do the printing when the CPU is not busy.
48
Virtual memory: It is a technique to enlarge the apparent capacity of the main memory so that large-scale programs can
be loaded in the memory at a time. Often, auxiliary storage such as a hard disk is used as virtual memory.
49
Library management: It is the function that systematically accumulates primitive programs, object programs, load
modules, and other programs developed. This enables integrated management of software assets that are managed
individually (by individuals).
One of the functions of the control program, the “OS” in a narrow sense, is “job management.” In
job management, the priorities of jobs are determined, and the jobs are synchronized. In batch
processing, the OS analyzes the contents of JCL (job control language) to assign resources51 and
schedule jobs. In interactive mode processing, the OS analyzes instructions entered at the terminal,
assigns resources, and performs scheduling. In addition, job management has other functions such
as spooling and cataloged procedures.
50
(FAQ) Concerning operating systems, many exam questions involve knowledge of terms. Be sure to know terms such as
multiprogramming, virtual memory, and spooling function.
51
Resource: A resource is a device/unit of various kinds necessary for the computer to operate. It refers to any device
related to memory, input, output, control, and other functions; specifically, these include the CPU, main memory, and files.
Scheduler
In job management, jobs are continuously executed under a master scheduler and job scheduler.
The master scheduler plays the role of an interface with the operator via the console panel.52 The
job scheduler manages the reception, selection, start, and finish of the jobs.
Reader
This reads the contents of JCL, analyzes them, schedules jobs, and places them in a queue.
Initiator
This selects the programs with high execution priorities among those in the queue and assigns the
resources that those programs need.
Terminator
This releases resources that were used by programs just completed. If there is another program
following, the terminator starts up the initiator.
Spooling (Spool)
Spooling is the function of performing the I/O of jobs independently of the programs. Any output results to
low-speed units such as a printer are first stored in a spool file. Then, after the program is finished,
the output results are printed on the printer from the spool file by the service program of the OS.53
The reason this is done is that, when the I/O unit is slow, directly performing the I/O process
would reduce the processing speed of the computer.54
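The idea behind spooling can be sketched with a simple queue standing in for the spool file. This is a toy model under stated assumptions: the names spool_file, program_output, and output_writer are hypothetical, and a real OS would use a file on a hard disk and a separate service program rather than in-memory lists.

```python
from collections import deque

spool_file = deque()  # stands in for the spool file on a high-speed disk
printed = []          # stands in for the paper coming out of the slow printer


def program_output(line: str) -> None:
    """The program writes to the spool file and continues without waiting for the printer."""
    spool_file.append(line)


def output_writer() -> None:
    """The service program later sends spooled lines to the printer in order."""
    while spool_file:
        printed.append(spool_file.popleft())


for n in range(3):
    program_output(f"result line {n}")
pending_before = len(spool_file)  # output is held in the spool, nothing printed yet

output_writer()
print(printed)
```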
Cataloged Procedures
In job execution directions, typical processing (routine work) such as translation of languages is
done in the following way. A set of JCLs is registered together at a separate location, and this
registered set of JCLs is called for executing programs. By doing this, the computer prevents JCL
errors. This set of registered JCLs is called cataloged procedures.
52
Console panel: It is a unit where the operator interacts with the computer system via key control and monitors its
operation and where the system communicates failures, etc. It consists of a keyboard and a display.
53
(Note) For spool files usually a hard disk is used. The service program that performs the sending of spool files to a
printer is often called a writer (output writer).
54
(FAQ) Most exam questions on job management are about spooling. They are always in the form of selecting the correct
term, so be sure to have accurate understanding of spooling.
One of the functions of the control program, which is the “OS” in a narrow sense, is “task
management.” Task management is the function of controlling the execution of programs and
consists of various procedures such as synchronization control of programs, dynamic assignment
of resources for program execution, and management of execution priorities of the programs. It
also conducts various types of interruption control.
[Figure: state transitions of a task among the ready, running, and waiting states, from task generation to task elimination. The numbered transitions are described below.]
1. A task has been generated, so it is entered into the queue. → Move to the ready state.
2. For execution, move to the running state via the task dispatcher. → Move to the running state.
3. Time has expired; withdrawn to make way for a task with high priority. → Move to the ready state.
4. Withdraw for an I/O instruction. → Move to the waiting state.
5. Again waiting to be executed after completion of I/O. → Move to the ready state.
6. All processes are now completed. → The task is terminated.
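The six transitions listed above can be written as a small state table. This is only a sketch: the state names follow the text, but the event names (dispatch, time_expired, io_request, io_complete, finish) are labels chosen for this illustration.

```python
# (current state, event) -> next state, following transitions 2 through 6 above.
TRANSITIONS = {
    ("ready", "dispatch"): "running",        # 2. selected by the task dispatcher
    ("running", "time_expired"): "ready",    # 3. time expired or preempted
    ("running", "io_request"): "waiting",    # 4. I/O instruction issued
    ("waiting", "io_complete"): "ready",     # 5. I/O completed
    ("running", "finish"): "terminated",     # 6. all processing completed
}


def next_state(state: str, event: str) -> str:
    """Return the state reached from `state` on `event`, or raise if not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


state = "ready"  # 1. a generated task enters the queue in the ready state
history = [state]
for event in ["dispatch", "io_request", "io_complete", "dispatch", "finish"]:
    state = next_state(state, event)
    history.append(state)
print(history)
```

Note that the table has no entry from the waiting state directly to the running state: a task whose I/O has completed must return to the ready state and be dispatched again.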
Dispatching refers to the step of selecting a task with high priority from among those tasks in the
ready state and advancing it to the running state.56 Supervisor call refers to the step of invoking a
function of the OS; in state transfer of tasks, this refers to an I/O instruction. I/O interruption is a
55
Process: This term refers to a program being executed and is used interchangeably with the term “task.” “Process” is a
word used by some operating systems such as UNIX. In recent years the expression “process” is used frequently.
56
Dispatcher: It is the program of the OS that carries out dispatching; also known as the dispatching routine.
Interruption
Interruption refers to temporary suspension of a program currently being executed for any reason
and transferring control to the OS to execute some necessary processing program. There are
external interruptions caused by certain specific states of the hardware and internal
interruptions caused intentionally when the control program is called from within a program.
The hardware detects interruptions. When the CPU detects an interruption, the OS receives it,
saves the state of the program being executed so that the program can later be restored to its state prior to the interruption, examines the
cause of the interruption, and transfers control to the corresponding processing routine (program).
The area in which this state is stored is called the PSW (program status word). There is also a
possibility that while an interruption process is taking place, another interruption occurs. Priorities
are given to these interruptions depending on their types so that multiple interruptions can also be
controlled. The table below describes main types and examples of causes for interruptions.
Another function of the control program, which is the “OS” in a narrow sense, is “data
management.” Data management is the control program that manages data input and output. It
provides various file organization methods such as sequential organization, direct organization,
and indexed organization. It works as a bridge between logical files processed within a program
and physical files whose structures are different.
Data management allows programmers not to worry about the physical structure of the files.
57
(FAQ) State transition of tasks (processes) is almost certain to appear on every exam. Commit the entire figure of state
transition to your memory.
58
(Hints & Tips) Issuing of an I/O instruction is notified by supervisor call; I/O completion is notified by I/O interruption.
59
Internal interruptions are intentionally caused by programs, so they are sometimes referred to as traps.
Access Methods
Access methods include sequential access, direct access, and dynamic access, which is a
combination of the first two.
Sequential access
This is the method of handling records in a file in sequential order from the beginning. This can
be performed with almost all recording media. This is suitable for collective processing in
which all records in a file are subject to processing.
Direct access
This is the method where a necessary record is directly (randomly) accessed regardless of the
order in which the records are stored. This method is used when the file medium is a directly
accessible storage medium such as a hard disk. It is used in online real-time systems where
only a part of a large number of records stored in a file needs to be accessed for quick update.
Dynamic access
This is the method where direct access is used to find a specific record and then sequential
access follows. Similar to direct access, this is used when the file medium is a directly
accessible storage medium.
60
(Hints & Tips) Sequential access can be used with most media; however, direct access and dynamic access are limited to
directly accessible media such as a hard disk.
File Organization
File organization methods include sequential organization, direct organization, indexed
organization, and partitioned organization,61 etc.
Sequential organization
The records in the files are stored in consecutive positions following a certain order. Files of
this type can be created on almost all media such as magnetic tape, hard disk, and floppy disk.
In general, only sequential access is possible with these files.
Direct organization
A storage address on the medium is calculated based on the key value found in each record, and
the record is stored in that position.62 To access a record, we first calculate the storage address
using the same formula, and the record is read from that location. There is a method where the
key value of each record is directly used as the storage address for the record, but this is not
very practical, creating a lot of wasted memory if the key values are not consecutive. A more
general way is to use a certain type of conversion formula to calculate a storage address from
the key value of each record. This is called address conversion (randomization).
[Figure: the key of a record is passed through the address calculation, and the record is stored at the resulting address among the records on the medium.]
Address conversion sometimes produces the same address for different records. In such a case,
the record stored first is called the home record while the record assigned to the same address
later is called a synonym record.63
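Address conversion and the appearance of synonym records can be sketched as follows. The text does not specify a conversion formula, so the sketch assumes the division method (remainder of the key divided by the number of storage addresses); the keys, the number of addresses, and the function names are hypothetical.

```python
def address_of(key: int, num_addresses: int) -> int:
    """Convert a key value to a storage address (division method, an assumed formula)."""
    return key % num_addresses


def store_records(keys, num_addresses: int):
    """Return (home records by address, synonym records that collided)."""
    home = {}
    synonyms = []
    for key in keys:
        address = address_of(key, num_addresses)
        if address not in home:
            home[address] = key      # first record at this address: home record
        else:
            synonyms.append(key)     # same address already used: synonym record
    return home, synonyms


home, synonyms = store_records([15, 22, 8, 30], num_addresses=7)
print(home, synonyms)
```

In this sketch the synonym records are only collected in a list; as note 63 says, they have to be stored elsewhere by some other method.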
61
(FAQ) Questions concerning the characteristics of these organization methods are frequently asked. Know the
characteristics of each type of file organization.
62
(Hints & Tips) A special case of direct organization is relative organization, in which the key values are consecutive like
1, 2, 3, … and the key value is itself the storage address of the record.
63
Synonym/Home: Address conversion can take different key values but produce the same storage value. This is called a
“synonym” (word meaning the same thing). If the result of key conversion stores a record, this record is called the home
record, and another record that could not be stored there is called a synonym record. A synonym record needs to be stored
elsewhere by some other method (e.g. by list).
Indexed organization
These are files with an index, and they are organized such that the user can access the records
by looking up their addresses by index. Sequential access, direct access, and dynamic access
are all possible. In each case, the actual records are accessed only after their addresses on the
medium are looked up using the index, so not only does the medium contain a basic data area
(prime domain) where the data is stored but also an index area. Further, in order to prevent a
situation where records cannot be added to the basic data area, an overflow area64 is also
reserved.
Partitioned organization
In these files, sequential organization files are grouped into units called members, each of
which is given a name. Then a directory containing these names and their leading addresses is
created. Access is allowed to these members. Think of a member as a set of multiple files
organized sequentially. Direct access can find the beginning of a member, and sequential access
can be used to find a record in that member. Partitioned organization files are used as storage
locations for program files and libraries but not very much as data files.
64
(Note) There are two types of overflow areas: cylinder overflow area and independent overflow area. Overflow records
from various tracks are stored in a cylinder overflow area; if a cylinder overflow area becomes full, additional records are
stored in an independent overflow area shared by all of the files.
A file system has a hierarchical structure consisting of files and directories (directories are file
registers). At the top of the hierarchical structure is the root directory, and directories under it are
called subdirectories.66
[Figure: hierarchical structure of directories and files. The legend distinguishes directories from files.]
File manipulations
When searching for a file, we designate the path showing in which directory the file is located.
There are two methods for doing this. For instance, if the hierarchical structure is as shown in the
figure above, we can designate the path in the following ways:67
• Absolute path
• Relative path
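The difference between the two designations can be illustrated with Python's posixpath module. The directory tree below (/A1/B1/file1, with /A1/B2 as the current directory) is hypothetical and is not the tree from the figure, and the "/" separator is used here in place of "\".

```python
import posixpath

current_dir = "/A1/B2"         # hypothetical current directory
absolute = "/A1/B1/file1"      # absolute path: starts from the root directory

# Relative path: the same file designated starting from the current directory.
relative = posixpath.relpath(absolute, start=current_dir)

# Resolving the relative path against the current directory gives the absolute path back.
resolved = posixpath.normpath(posixpath.join(current_dir, relative))
print(relative, resolved)
```

The ".." component in the relative path means "move up to the parent directory".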
65
(Hints & Tips) What UNIX and MS-DOS call “directory” is called “folder” in Windows and MacOS.
66
(Hints & Tips) Whereas directories and files can be made under a directory, files and directories cannot be made under a
file.
67
(FAQ) Concerning hierarchical file systems, there are exam questions like “Choose an appropriate designation as an
absolute path or as a relative path.” In those questions, the symbol for separating directories and files, as well as its use,
will be explained in the question text.
68
(Hints & Tips) Here we are using the symbol “\” to separate directories and files, but some operating systems use the
symbol “/” instead.
69
Current directory: It is a directory in which the user is working at the moment.
Another role of the control program, which is the “OS” in a narrow sense, is “memory
management.” Memory management makes the most effective use of the memory as well as
compensating for any lack of the main memory capacity. To this end, it effectively uses auxiliary
storage as part of the memory.70
Partitioned method
When a program is placed in the main memory, the main memory is partitioned into several
partitions, into which the program is loaded.72 Without memory management, fragmentation
occurs, causing a situation which prevents programs from being stored even though empty space
exists. Hence, to combine all empty areas together, compaction73 is necessary.
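The effect of compaction can be illustrated with a toy model. In this sketch, the main memory is assumed to be a list of equally sized cells, None marks a free cell, and the program names are hypothetical. It is not how an actual OS represents memory.

```python
def compact(memory):
    """Move every occupied cell to the front so the free cells form one area.

    `memory` is a list in which None marks a free cell and any other
    value names the program occupying that cell.
    """
    used = [cell for cell in memory if cell is not None]
    return used + [None] * (len(memory) - len(used))


def largest_free_run(memory):
    """Length of the longest run of contiguous free cells."""
    best = run = 0
    for cell in memory:
        run = run + 1 if cell is None else 0
        best = max(best, run)
    return best


# Fragmented memory: 4 free cells in total, but no run longer than 2.
memory = ["A", None, None, "B", None, "C", None, "C"]
print(largest_free_run(memory))           # 2
print(largest_free_run(compact(memory)))  # 4
```

Before compaction, a program needing 3 contiguous cells cannot be stored even though 4 cells are free; after compaction it can.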
Swapping refers to executing programs while moving them back and forth between the main
memory and auxiliary storage. If a program is entered with higher priority than the priority level
of the currently executed program, the new program is immediately loaded into the main memory
and is executed. However, if there is no space in the main memory, any program in the main
memory can be moved to the auxiliary storage. Hence, this system compensates for a lack of main
memory capacity by utilizing auxiliary storage. However, if swapping occurs frequently, it means
that programs are switched back and forth many times, thus reducing the processing efficiency of
the computer system.
[Figure: swapping between main memory and auxiliary storage. Program X is swapped out (rolled out) to auxiliary storage, and Program A is swapped in (rolled in) to main memory.]
70
Memory leak: Sometimes, for some reason, memory secured dynamically by an application is not released and
remains allocated in the main memory. This is called a memory leak. To eliminate a memory leak, the unreleased
memory must be reclaimed (garbage collection).
71
Real memory system or “Real Storage (RS)”: This refers to the actually existing memory; it is the main memory.
72
(Note) In partitioned method, multiple programs can be stored simultaneously, so multitasking is possible.
73
Compaction: It means collecting empty memory areas to form a continuous area; also known as garbage collection.
Relocation (Relocatable)
Relocation refers to the function wherein a program already assigned to a certain area is re-stored
in another location. A program whose structure allows it to be relocated is called a relocatable
program.74
Overlay method
The overlay method works around the physical limitations of the main memory: programs are divided into
segment units, and only the necessary segments75 are loaded into the main memory to be executed.
The entire program is stored in auxiliary storage, and the main memory contains only frequently
used segments. Exclusive segments, which are never used simultaneously, are loaded from
auxiliary storage to the main memory on an as-required basis.
For example, suppose that Segment A is a main routine used with high frequency while Segments
B and C are subroutines called exclusively by Segment A. While Segment B is being executed,
Segment C is in auxiliary storage. When Segment C is called, it is loaded in the area of Segment B.
Consequently, the entire memory capacity of the program is “A + B + C,” but the capacity of the
main memory is sufficient if it is at least the greater of “A + B” or “A + C.”
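The capacity calculation in this example can be sketched as follows. The segment sizes are hypothetical values chosen only for illustration.

```python
def overlay_memory_needed(root_size, exclusive_sizes):
    """Main memory needed when the root segment stays resident and the
    exclusive segments take turns sharing a single overlay area."""
    return root_size + max(exclusive_sizes)


sizes = {"A": 40, "B": 30, "C": 20}  # hypothetical segment sizes in KB

total = sum(sizes.values())
needed = overlay_memory_needed(sizes["A"], [sizes["B"], sizes["C"]])

print(total)   # 90  (A + B + C)
print(needed)  # 70  (the greater of A + B and A + C)
```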
Virtual Memory
Virtual memory provides a large capacity of storage space regardless of the size of the main
memory.76 Programs are stored in virtual memory (normally in auxiliary storage), and only the
parts necessary for execution are loaded into the main memory.
Since the program is loaded into virtual memory, the instructions and data are given virtual
addresses, which need to be converted to actual addresses (main memory addresses) for the
execution of the program. This conversion is implemented by hardware called DAT (Dynamic
Address Translator).
We discuss the virtual memory strategies, which include three methods: page, segment, and
segment-page.
74
(Note) “Relocatable” means that compaction is possible.
75
Segment: It is a logical processing unit of a program. Here, we can regard a segment as a subroutine.
76
Virtual memory system or “Virtual Storage (VS)”: It is a conceptual storage that does not actually exist. A program
to be executed appears to be loaded into virtual memory, which is a large memory space, while only the portions (pages or
segments) of the program with high frequency of use, data, and other parts necessary for the execution get loaded into the
main memory.
In the page method, the program is partitioned into units of a fixed size, called pages. A page then
becomes the unit for loading into the real memory. Pages are managed by a page table, which has
one entry for each page of virtual memory. If the corresponding page is in the real memory, the
page fault bit becomes 0. This page fault bit then indicates whether or not the corresponding page
is in the real memory.
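[Figure: a page table with one entry per virtual-memory page (Page 1 to Page n); each entry holds the page fault bit and the location of the page in the real memory.]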
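The lookup performed through the page table can be sketched as below. This is only an illustration of the idea: the page size, the table contents, and the names translate and PageFault are assumptions, and in a real system the conversion is done by the DAT hardware rather than by software like this.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when the referenced page is not in the real memory."""


# page number -> (page fault bit, frame number in the real memory)
# Following the text, a page fault bit of 0 means the page is in the real memory.
page_table = {
    0: (0, 5),
    1: (1, None),
    2: (0, 2),
}


def translate(virtual_address):
    """Convert a virtual address to a real address using the page table."""
    page, displacement = divmod(virtual_address, PAGE_SIZE)
    fault_bit, frame = page_table[page]
    if fault_bit == 1:
        raise PageFault(f"page {page} must be paged in first")
    return frame * PAGE_SIZE + displacement


print(translate(100))  # 20580 (page 0 is held in frame 5: 5 * 4096 + 100)
```

Calling translate(PAGE_SIZE) refers to page 1, whose page fault bit is 1, so the sketch raises PageFault instead of returning an address.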
Segment method
In this method, programs and logical sets of data are considered segments. Virtual addresses consist
of segment numbers and addresses within the segments. The paging method is only for memory
management and as such, the programs do not need to be written with pages in mind. In contrast,
in the segment method, in which segments have different capacities, the programs must be written
in consideration of the segment sizes.
Segments are logical processing units, so they can be treated as subroutines. However, the flexible
lengths are sometimes inconvenient to manage, and the usage efficiency of the main memory may
be reduced.
Segment-page method
This is an improved version of the segment method, in which segments are further partitioned into
pages. Real addresses are accessed in the order of “segment → page → relative displacement
within the page.”77
77
Relative displacement within the page: It is an address assigned such that the beginning of the page has displacement
0.
Paging Algorithms
If a page necessary for processing is not found in the real memory, an interruption called a page
fault occurs, and the page is read into the real memory from virtual memory. This is called
page-in. On the other hand, page-out is to move an unnecessary page out to virtual memory.
Page-in and page-out are together called paging.78
If paging occurs frequently, the time spent executing the control program increases, reducing
performance. This is called thrashing. To minimize the occurrence of thrashing,
various algorithms have been proposed for selecting the pages to be paged out.
79
Common page-out methods and their properties are shown below.
78
(Hints & Tips) Swapping and paging are similar, but note that swapping takes place in program and segment units
whereas paging takes place in units called pages, which are parts of a program.
79
(FAQ) There are exam questions that require specific tracking of page-ins and page-outs. For example, if the order in
which pages are used is “1, 3, 2, 3, 5, 2,” and if the main memory has page 3, which page will be the first to be paged out?
For these questions, have clear understanding of ideas like LRU and FIFO.
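Tracking of this kind can be sketched with a small simulation of FIFO and LRU. The sketch assumes that the main memory can hold 3 pages; the function names and the second reference sequence are hypothetical.

```python
from collections import OrderedDict, deque


def fifo_page_outs(references, frames):
    """Return the pages paged out, in order, under FIFO replacement."""
    resident = deque()
    paged_out = []
    for page in references:
        if page in resident:
            continue  # hit: FIFO order is unchanged
        if len(resident) == frames:
            paged_out.append(resident.popleft())  # oldest page-in leaves
        resident.append(page)
    return paged_out


def lru_page_outs(references, frames):
    """Return the pages paged out, in order, under LRU replacement."""
    resident = OrderedDict()  # least recently used entry comes first
    paged_out = []
    for page in references:
        if page in resident:
            resident.move_to_end(page)  # hit: mark as most recently used
            continue
        if len(resident) == frames:
            victim, _ = resident.popitem(last=False)
            paged_out.append(victim)
        resident[page] = True
    return paged_out


refs = [1, 3, 2, 3, 5, 2]        # order of use quoted in the FAQ
print(fifo_page_outs(refs, 3))   # [1]
print(lru_page_outs(refs, 3))    # [1]

other = [1, 2, 3, 1, 4, 2]       # hypothetical second sequence
print(fifo_page_outs(other, 3))  # [1]
print(lru_page_outs(other, 3))   # [2, 3]
```

For the first sequence both methods page out page 1 first. The second sequence shows how they diverge: page 1 is the oldest page-in but was used recently, so FIFO pages it out while LRU keeps it.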
Quiz
Q1 Explain the roles of task management.
Q2 What type of file organization is this? Sequential organization files are grouped into
units called members, each of which is given a name. A directory is created, including
these names and leading addresses, and access is allowed to these members.
Q3 Explain swapping.
Q4 What is the unit of loading into the main memory in the page method?
A2 Partitioned organization
A3 Swapping refers to execution as the program keeps switching back and forth between the
main memory and auxiliary storage. If a program is entered with higher priority than the
priority level of the currently executed program, the new program is immediately loaded into
the main memory and is executed. However, if there is no space in the main memory, any
program in the main memory can be moved to the auxiliary storage. Hence, this system
compensates for a lack of main-memory capacity by utilizing auxiliary storage.
A4 Page
Various system configurations are being used to reduce the cost and increase the efficiency of
computer systems. These include client/server systems to distribute the load, dual systems to
improve reliability, and duplex systems.
A client sends service requests to a server, receives the results of data processed by the server, and
displays the results.
The clients and the servers distribute their processes in an attempt to spread the load of computer
processing. In addition, by sharing resources, the user can reduce waste. For instance, by
connecting a high-speed printer to a server, the clients can share the high-speed printer. Having
one high-speed printer may be less expensive than preparing a low-speed printer for each of the
clients, although this depends on the number of clients.
Computers used as servers generally have higher performance than the client computers. To
clarify the functions of various servers, they can be named by their functions, such as file servers,
database servers, print servers, and communication servers.81
80
Client/server system: In multiprogramming, if there is one operational computer, the user can run both the client and
the server within the one computer. In other words, a client/server process does not mean that everything is distributed. It is
one method to achieve distributed processing.
81
(Hints & Tips) If a service request made by a client cannot be provided by the server, that server can become a client
and request another server to perform the requested process.
[Figure: a client sends a service request to a server, and the server returns the process result.]
Types of Server
The table below shows the types of server, depending on the provided functions.82
• The system could be confusing unless the server administrator is clearly identified.
• The performance deteriorates if the use gets concentrated on specific servers.
• The performance of the entire system depends on the network performance.84
82
(Hints & Tips) Clients and servers do not necessarily have to have the same OS. Where there are multiple servers, they
do not have to have the same OS either. In addition, if there are multiple clients, they need not have the same OS either.
83
(Note) If client/server type programs are logically divided into three layers (presentation layer, function (application)
layer, and data layer), such a system is called a 3-layer client server system. By distinguishing 3 layers by function, such a
system strives to enhance system performance and efficiency for development and maintenance.
84
(FAQ) There have been many exam questions regarding the knowledge of client/server systems. Most of them are about
the role of a client or that of a server, so be sure to know these things well.
Simplex System
This system consists of one CPU only. Reliability and the processing capabilities are inferior in
comparison with other configurations, but it is economical. This configuration is commonly
used.85
Dual System
This is a system configuration in which two CPUs perform the same processing and compare
the processing results to each other. This configuration is applied when the process is not
allowed to stop, even for a moment. If one CPU fails, the system cuts off the failed CPU and
continues processing on the other CPU. Reliability is extremely high, but this system is
expensive.86 87
[Figure: dual system in which two CPUs compare their processing results.]
85
DCE (Data Circuit-terminating Equipment): This unit converts signals received from communication line, sends
them to data terminals, and also executes exactly the opposite operation. Normally, this unit is connected at the end of a
communication line and functions as an interface with a computer. A modem (modulator-demodulator) is used on an
analog line, and DSU (Digital Service Unit) is used on a digital line.
86
CCU (Communication control unit): This unit controls the reception and transmission of data, performs error
control, and assembles and decomposes characters.
87
DISC: Auxiliary storage
88
(FAQ) There are many exam questions on characteristics of dual, duplex, and multiprocessor systems. The key term for
each system is as follows: “comparing process results” for dual, “switching the units” for duplex, and “sharing the main
memory” for multiprocessor systems.
89
(Note) If a failure occurs in the primary system, it takes time to switch to the secondary system. This is because the
batch processing or whatever else is being executed in the secondary system must be suspended, and the OS must be
booted for the online system. A hot standby system configuration can solve this by standing by, ready to switch at any time.
In this case, the OS for the online system stays on, so switching can occur immediately.
90
LCMP (Loosely Coupled Multiprocessor): Each CPU has its own main memory and independent OS. CPUs are
joined by a high-speed network or shared path. This is a configuration where independent computer systems are connected
via a network.
91
TCMP (Tightly Coupled Multiprocessor): One main memory and one OS are shared in this configuration. Each CPU
can perform identical processes, so even if one CPU fails, the processing can continue, albeit with lower performance. This
configuration is highly reliable and thus is used in systems where a high level of processing capability is required.
92
MM: Main memory
Depending on how the computers are placed physically, there are two types of processing:
centralized processing and distributed processing.
Centralized Processing
Centralized processing is a system configuration in which one computer is connected with many
terminals, and the one computer alone does all of the processing. It is easy to maintain the
consistency of data, and it is easy to manage the resources. These merits contributed to the
popularity of this configuration in which a general-purpose computer is used as the host in
centralized processing. Below is a summary of relative comparison with distributed processing.
Distributed Processing
Distributed processing is a system configuration in which multiple computers connected via a
network perform the processing. Since the processing is done through a network, the processing
time is longer than that of centralized processing. But, the merit is that a failure of one computer
does not affect the entire system. Below is a summary of relative comparison with centralized
processing.95
93
Grosch's Law: It states that “performance is directly proportional to the square of the price.” If the price of a computer
doubles, the performance quadruples. However, technological advancement has reduced the prices of devices significantly,
so this law is no longer applicable.
94
Backlog: It means systems, software, programs, etc. that are necessary to develop but the development of which has not
even begun. The term often refers to those that are held back in the IT department within a company.
95
(FAQ) There are exam questions where you are required to identify the characteristics of centralized processing and
distributed processing. For example, questions may be of the form “Which of the following is an appropriate characteristic
of a centralized processing system?” Know the advantages and disadvantages of each processing type.
As shown in the following table, distributed processing can be classified according to the
distribution status of functions and loads. It is said that vertical load distribution does not exist in
reality.
Configuration / Function    Function distribution               Load distribution
Horizontal distribution     Horizontal function distribution    Horizontal load distribution
Vertical distribution       Vertical function distribution      -
A horizontal function distribution system is a system in which computers are classified according
to type of application and type of data; examples include processing function distribution and
database distribution. For instance, in financial institutions, host computers may be classified into
those in an information system and those in an accounting system; this classification is based on
the type of processing, so it is an example of processing function distribution. Database
distribution means that computers are located in appropriate locations based on the contents of
data.
A horizontal load distribution system is a system in which multiple computers perform processes jointly when an application is
executed. When a process is requested, an idle computer responds. In this mode, if one computer
fails, the process switches to another computer and is continued. Hence, this system is quite
effective in time of failure. A tightly coupled multiprocessor system is an example of this type.
A vertical function distribution system is a system where the processing function is shared among workstations belonging to
individual users as well as computers shared by multiple users. Here, there is a vertical
relationship in regard to the processing function. A client/server system is a typical example of a
vertical function distribution system.96
96
(Hints & Tips) A client/server system appears as if it were horizontal distribution, but it is properly classified under
vertical function distribution. Since one server performs processes of multiple clients, there is a vertical relationship in
functions.
From the standpoint of processing modes, system configurations can be classified into two
categories: batch processing and real-time processing. They can also be classified by whether or
not they are connected to a network.
Processing mode         Operation mode                   Connection method
Batch processing97      Center batch processing          Offline
                        Remote batch processing          Online
Real-time processing    Interactive mode processing      Online
                        Online transaction processing    Online
                        Real-time control                Online
• The computer can be used efficiently because the processing is done all at once.
• It is suitable for routine and repetitive processing (standard tasks).
• The results are not immediately obtained because the processing is collectively done.
In center batch processing systems, the processing takes place at a central computer center; in
remote batch processing,98 the batch processing is performed from a terminal at hand via a
communication line.
Online transaction processing requires specialized (dedicated) terminals with a high level of
usability aimed at enhancing the processing efficiency, such as bank ATMs and terminals for
issuing reserved-seat tickets at reservation windows. In addition, since data is shared by many
terminals, there is a risk that simultaneous access to the same data could cause problems such as
deadlock100 or data destruction. Hence, attempts are made to enhance the data maintainability.
Real-Time Control
In general, real-time control refers to the method by which data is processed in real time once a
processing request is made and the result is immediately reported to the requester. This concept
also includes online transaction processing. However, in a narrow sense, this refers more
specifically to the processing mode at places like manufacturing plants, where the system is
interlinked with a sensor system tracking the physical motions of objects to be controlled,
processing corresponding to the external signals is immediately executed, and the results are
immediately sent back to units on production lines (e.g. robots) as control signals. Production
control systems at steel plants and automobile factories are examples of this type. Another
example is 24-hour monitoring of an electrical system managed by computers; if there is
something wrong with the system, the system reports it to the maintenance company in real time,
or the unit that has detected it gets cut off in real time.101
99
TSS (Time Sharing System): The system authorizes multiple programs to be executed in a specific order for an
extremely short duration at a time (several milliseconds at a time). This procedure is repeated so that the execution of each
program can get completed within a certain period of time. From the user's viewpoint, there is no mutual interference, so
each user feels as if he or she were the only one exclusively using the computer system.
100
Deadlock: It is a situation where the system gets stuck because multiple tasks (programs) try to access the same
resource (file, database, etc.) and go into the waiting mode.
101
(FAQ) Many exam questions involve characteristics of remote batch processing and interactive processing. Both are
online, but note that remote batch is a type of batch processing while interactive processing is a type of real-time
processing.
Quiz
Q1 Explain the roles of clients and servers in a client/server system.
A2 Duplex system: Distinguishable points are the existence of two CPUs and the
switching units.
To evaluate computer systems, various methods are available. While good performance (high
processing speed) is important, fault-tolerance (high reliability) is also significant.
To evaluate the comprehensive performance of computer systems, including their software and
hardware, we can use various criteria such as response time, throughput, and turn-around time.
Indexes to evaluate performance, especially the hardware, include instruction mix and
benchmark.
Response time
This is the amount of time between the completion of input at an input unit and the beginning of
output at an output unit. For example, when a processing request is made at the keyboard of the
computer, this time refers to the amount of time it takes until the result is shown on the display
unit or until the printing begins. This is mainly used to evaluate the performance of an online
system.
Throughput refers to the amount/number of jobs that can be processed by the computer system within a
certain unit of time, or the amount of time required to process a certain job. This processing time
includes the exclusive CPU time and process-waiting time such as preparation for I/O operation
and clean-up time.
102
(Hints & Tips) There are instruction mixes and benchmarks for evaluating the performance of computers, and
instruction mixes are for hardware evaluation. However, even if hardware is very fast, the entire system performance
becomes poor if the performance of the OS is poor. Hence, the performance of hardware is often used only for reference.
Technically, turn-around time refers to the amount of time it takes information to make the rounds of the system.
In batch processing, this is the duration between submission of a program at the window and the
time when the results are obtained. In business operations, this is the duration from the time when
a client places an order to the time when the ordered product is shipped and reaches the client.
Instruction Mix103
An instruction mix is used to compare the performance of hardware in computer systems. Even if
the hardware is fast, if the performance of the OS is poor, the performance of the entire system
becomes inferior. An instruction mix takes an average program and calculates the average
execution time per instruction and the MIPS value,104 based on the execution frequency of
each instruction.
Under these conditions, let us do some specific calculations of the MIPS value.
First, let us calculate the average instruction execution time. The execution time of each
instruction is expressed in microseconds ($10^{-6}$ seconds). The average instruction execution time is the
sum (over all instructions) of the products of the execution time of instructions and their respective
frequencies.
\[
\begin{aligned}
\text{Average instruction execution time}
&= 0.1 \times 10^{-6} \times 0.4 + 0.2 \times 10^{-6} \times 0.3 + 0.5 \times 10^{-6} \times 0.3 \\
&= (0.04 + 0.06 + 0.15) \times 10^{-6} \\
&= 0.25 \times 10^{-6} \ \text{(seconds/instruction)}
\end{aligned}
\]
The average number of instructions executed per second is the inverse of the average
instruction execution time, so it is obtained as follows:
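Carrying the calculation through, the inverse of 0.25 microseconds per instruction is 4 million instructions per second, that is, 4 MIPS. A minimal Python sketch of the same arithmetic is shown here; the variable names are illustrative, and the execution times and frequencies are the ones used in the calculation above.

```python
# (execution time in microseconds, execution frequency) per instruction type
instruction_mix = [(0.1, 0.4), (0.2, 0.3), (0.5, 0.3)]

# Weighted average execution time per instruction, in microseconds.
avg_time_us = sum(time * freq for time, freq in instruction_mix)

# Instructions per second = 1 / (avg_time_us * 1e-6).
# Dividing that by one million gives the MIPS value, which is 1 / avg_time_us.
mips = 1.0 / avg_time_us

print(round(avg_time_us, 6))  # 0.25
print(round(mips, 6))         # 4.0
```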
103
(Note) An instruction mix for scientific calculations is called “Gibson mix,” and one for business calculations is called
“commercial mix.”
104
MIPS (Million Instructions Per Second): This is the performance index expressing the number of machine
instructions, in millions ($10^6$), that can be executed per second. This is just for the performance of hardware, so again it is
used only for reference.
105
(FAQ) Exams do have questions where you are asked to calculate MIPS values given an instruction mix or to calculate
the average clock count per instruction. You would want to be familiar with these calculation questions.
106
Clock: This refers to the frequency of a clock signal generated by a circuit called a clock generator. Since instructions
inside the CPU are synchronized to this clock signal as they are executed, the higher the clock frequency is, the more
instructions can be executed in a given period of time. For example, if the clock frequency is 200 MHz, there are $200 \times 10^6$
clock signals per second. In general, one instruction takes several clocks.
107
FLOPS: It stands for floating-point operations per second. This is an index expressing the number of floating-point
operation instructions executed per second. If it is expressed in millions ($10^6$), it is called MFLOPS.
Benchmark
A benchmark is used to compare and evaluate the comprehensive performance of computers,
including the hardware and the OS, by measuring the standard program execution time.108
2.4.2 Reliability
The level of reliability required for information systems varies depending on the purpose for
which the systems are used. Sometimes economy must be sacrificed to achieve a
high level of reliability. In other situations, reliability is required not only of
the operation of the system but also of the information handled by the system.
Reliability Indexes
Reliability is the degree to which system operation is stable. The ideal case is that the system does
not fail, but there is no system that does not ever fail.
RAS/RASIS
Both of the terms RAS and RASIS are acronyms of elements that help computer systems to
operate in a stable manner. RAS stands for the first three elements of RASIS:
R: Reliability
A: Availability
S: Serviceability
I: Integrity
S: Security
108
(Note) An example of a technical calculation benchmark is SPECmark, and an example of a transaction benchmark is
TPC. TPC-C is the frequently used benchmark under TPC which directly responds to actual business applications.
109
Availability: Availability refers to the probability that the system is maintaining its functions (operating) at any given
time or the percentage of the duration when the functions are maintained during a certain period of time.
Bathtub curve
The bathtub curve is used to illustrate the concept of hardware lifecycle. Hardware may fail during
the initial period of its operation due to defective parts, etc., but the probability at which these
failures occur decreases gradually as repairs and replacements are made. After that, because of
wear and tear of various parts, the probability of failures increases, and eventually its life is
determined to be over. This curve is shown below.110
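[Figure: bathtub curve. Vertical axis: failure rate; horizontal axis: time. The curve passes through the early failure period,111 the stable failure period,112 and the wear-out failure period.113]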
Fail-soft
This refers to the function in which, when a failure occurs, the failed part gets cut off and the
system continues to operate, perhaps with a lower performance level (fall back114). In a duplex
system, normally the two systems are independently processing data, but if one system should fail,
the configuration would switch the processing to the other system and would carry on the
processing. In addition, when a failure occurs in multiprocessors, the system continues its services
by cutting off the failed processor. This too is a system configuration with fail-soft in mind.
Fail-safe
This refers to the function in which, when a failure occurs, the system locks its functions in a safe
mode established in advance to control the extent of the impact of the failure.115 This is just like
the measure where all railroad lights turn red when an accident has occurred. In system
configurations where two systems compare the processing results of each other, such as in a dual
system, when the compared results are different, the system in which a failure is determined to
have occurred is cut off while the operation continues on.
110
(Note) The bathtub curve is so named because the graph showing the relationship between the failure probability and
time resembles the shape of a bathtub.
111
Early failure period: It is a period of failures at the beginning of unit use. These failures become less frequent as time
passes.
112
Stable failure period: The unit is stable during this period, with less frequent failures.
113
Wear-out failure period: A certain period of time has passed, and failures become more frequent during this period.
114
Fall-back: In a fail-soft computer system, processing continues at a lower level of functionality; this is called a
fall-back or a fall-back operation.
115
(Hints & Tips) Fail-soft and fail-safe are similar words, so do not confuse them.
Fool-proof
This term refers to a measure that prevents unintended use of a program from causing a
failure, especially when an unspecified number of users use the same program. If one individual is using a
particular program, the way the program is written does not create a major problem, but when
there are an unspecified number of users, how the program gets used is hard to predict.116
2.4.3 Availability
Points
• MTBF is the time when the system is operating properly, and
MTTR is the time when it is being repaired.
• Availability is the ratio of the time when the system is operating
properly.
One of the indexes in RASIS is “A” for availability, which means the operation rate. The
availability is calculated using MTBF and MTTR as follows:
[Figure: timeline in which operating periods T1, T2, T3, ..., Tn alternate with repair periods D1, D2, D3, ..., Dn.]
MTBF (Mean Time Between Failures) is the average length of time that the system continues to operate without a failure. The
larger MTBF is, the more reliable the system is. Therefore, this is used as an index of reliability
(“R” in RASIS).117
(Here, “n” is the number of intervals the system was operating without failure.)
116
(FAQ) There are exam questions concerning what each of the letters RASIS means as an index for computer system
reliability. At least know what RAS stands for.
117
(Note) Functions that improve MTBF include error detection, automatic 1-bit error correction, instruction re-try, etc.
These are functions that prevent the computer system from coming to a stop. Functions that improve MTTR include log
output. By looking up logs, the cause of failure can sometimes be identified. Remote maintenance also helps detect a
failure promptly, enhancing MTTR.
MTTR (Mean Time To Repair) is the average length of time required for repair when a failure occurs. The shorter the
repair time is, the better the system is. Therefore, it is used as an index of serviceability (“S” in
RASIS).
(Here, “n” is the number of intervals the system was operating without failure.)
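The three quantities can be computed together as in the sketch below. It assumes the standard definitions MTBF = (T1 + T2 + ... + Tn) / n, MTTR = (D1 + D2 + ... + Dn) / n, and availability = MTBF / (MTBF + MTTR); the operating and repair times are hypothetical values in hours.

```python
def mtbf(operating_times):
    """Mean time between failures: average length of the operating periods."""
    return sum(operating_times) / len(operating_times)


def mttr(repair_times):
    """Mean time to repair: average length of the repair periods."""
    return sum(repair_times) / len(repair_times)


def availability(operating_times, repair_times):
    """Ratio of the time when the system is operating properly."""
    m, r = mtbf(operating_times), mttr(repair_times)
    return m / (m + r)


T = [120, 180, 150]  # hypothetical operating periods T1..T3 (hours)
D = [4, 6, 5]        # hypothetical repair periods D1..D3 (hours)

print(mtbf(T))                       # 150.0
print(mttr(D))                       # 5.0
print(round(availability(T, D), 4))  # 0.9677
```

Lengthening the operating periods (larger MTBF) or shortening the repairs (smaller MTTR) both raise the availability.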
Calculation of Availability
To calculate the availability, the serial connection and parallel connection sections must be
calculated differently. The basic ideas are described below.118
The availability of an entire serial connection system as shown here is the product of the
availabilities of each unit. Here, P1, P2, and P3 are the availabilities of the respective units
shown in the figure.
Availability = P1 * P2 * P3
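Suppose that we have, as shown below, a system in parallel connection where the system
operates as long as at least one of Units 1, 2, and 3 is operating. Here, the availability is
calculated using the fact that the probability of the entire sample space is 1.
[Figure: Unit 1 (P1), Unit 2 (P2), and Unit 3 (P3) connected in parallel]
Availability
= 1 – (Prob. that Units 1, 2, and 3 fail simultaneously)
= 1 – (1 – P1) * (1 – P2) * (1 – P3)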
118
(FAQ) There is always a question involving a calculation of availability. Make sure you understand correctly how to
calculate it.
Suppose that there is a system which operates if Unit 1 is operating AND at least one of Units 2
and 3 is operating. In this case, we consider that Unit 1 and the parallel section (Units 2 and 3)
are serially connected.
[Figure: Unit 1 (P1) connected in series with the parallel section made up of Unit 2 (P2) and Unit 3 (P3)]
Availability
= (Availability of Unit 1) * {1 – (Prob. that Units 2 and 3 fail simultaneously)}
= (Availability of Unit 1) * {1 – (Prob. that Unit 2 fails) * (Prob. that Unit 3 fails)}
= P1 * {1 – (1 – P2) * (1 – P3)}
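The serial and parallel rules can be combined mechanically. The following is a small sketch; the helper names `serial` and `parallel` and the sample availabilities are assumptions chosen for illustration, and the units are assumed to fail independently of each other.

```python
from math import prod


def serial(*availabilities):
    """All units must operate: multiply the availabilities."""
    return prod(availabilities)


def parallel(*availabilities):
    """At least one unit must operate: 1 minus the probability that all fail."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical values for P1, P2, and P3.
p1, p2, p3 = 0.9, 0.8, 0.8

# Unit 1 in series with the parallel section (Units 2 and 3),
# i.e. P1 * {1 - (1 - P2) * (1 - P3)}.
combined = serial(p1, parallel(p2, p3))
print(round(combined, 4))
```

Nesting the two helpers mirrors the way the text treats the parallel section as a single unit in a serial connection.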
Let us consider more complicated configurations.119 Even though the two systems below may
appear similar, the availabilities are different. Here, the letter α in the figure indicates the
availability.
[Configuration 1] The two parallel sections (inside the dotted lines) are serially connected.
[Figure: two parallel pairs of units, each unit with availability α, connected in series]
[Configuration 2] The two serially connected units (inside the dotted lines) are connected in
parallel.
[Figure: two serial pairs of units, each unit with availability α, connected in parallel]
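Applying the serial and parallel rules to the two configurations shows how they differ. This is a sketch only: the function names and the sample value α = 0.9 are assumptions for illustration, and independent failures are assumed.

```python
def configuration_1(alpha):
    """Two parallel pairs connected in series."""
    pair = 1 - (1 - alpha) ** 2        # one parallel section
    return pair ** 2                   # two such sections in series


def configuration_2(alpha):
    """Two serial pairs connected in parallel."""
    chain = alpha ** 2                 # one serial section
    return 1 - (1 - chain) ** 2        # two such sections in parallel


alpha = 0.9  # hypothetical availability of every unit
a1 = configuration_1(alpha)
a2 = configuration_2(alpha)
print(round(a1, 4), round(a2, 4))
```

Expanding both formulas gives a difference of 2α²(1 – α)², so Configuration 1 is never less available than Configuration 2 under these assumptions.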
119
(Hints & Tips) Note that similar configurations have different availabilities.
Quiz
Q1 Explain the meanings of the following terms: “MIPS,” “response time,” “throughput,”
and “turn-around time.”
Q5 Calculate the availability for the entire system configuration shown below. A and B are
units, each of which has an availability of 0.97. The entire system is assumed to be in
operation if at least one of the units is operating.
A1
MIPS: It is an acronym standing for “million instructions per second.” This
indicates the number of instructions that can be executed in one second,
expressed in millions (10^6). It is one of the performance indexes of
hardware.
Response time: It is the length of the time interval from data transmission to the return of
processing results. It is one of the performance indexes of an online
system.
Turn-around time: It is the length of the time interval from a job request to the complete
output. Mainly a concept used in batch processing, this is an index of
system evaluation including its operation.
A2 RASIS is an acronym of elements that help computer systems operate in a stable manner.
R: Reliability, A: Availability, S: Serviceability, I: Integrity, S: Security
A3 MTBF: (Mean Time Between Failures) This is the average length of time that the
system continues to operate without a failure.
MTTR: (Mean Time To Repair) This is the average length of time required for repair
when a failure occurs.
2.5 System Applications
Various systems have been developed using networks and databases. Close to our daily life are
the Internet and database services (generally called commercial databases). Examples of
applications of multimedia systems include 3D graphics.
2.5.1 Network Applications
Today, our information society has networks spanning all over the world like a gigantic web.
Both systems that provide the networks themselves and application systems that add value on
top of them are available.
This is a rather new area, so there are not many exam questions on this topic, and they are
relatively easy. Most of the questions simply require knowledge of the terms, so be sure to
memorize them to improve your exam scores.
Uses of Infrastructure
“Infrastructure” means “foundation” or “basis.” In computer systems, this word refers to the
foundation of software and hardware to form the systems. For instance, in network construction,
various components such as communication lines, communication units, and the charge system of
the communication lines are parts of what is known as the communications infrastructure.
WWW (World Wide Web)
This is an information search system in the hypertext format, developed by researchers at
CERN.120 Since information distributed all over the world is mutually linked by this network
using hypertext,121 a name meaning “global spider web” was given to it.
WWW is a mechanism on the server that records information in the form of an Internet homepage.
Software that accesses WWW and displays it on a screen is called a Web browser or simply a
“browser.”
The Internet
120
CERN (Conseil Europeen pour la Recherche Nucleaire) (European Council for Nuclear Research): It is a particle
physics research institute jointly funded and operated by 12 European countries, but generally it is known as the institute
which developed WWW on the Internet. Its name has now been changed to Laboratoire Europeen pour la Physique des
Particules, but the abbreviation remains the same.
121
Hypertext: It is a structure in which pointers are placed within texts so that links can be made to jump from those
pointers to other texts and pictures. To create a document in the hypertext format, one uses HTML (HyperText Markup
Language). To identify a WWW server address, we can use URL (Uniform Resource Locator).
It is a collection of networks all over the world connected together by TCP/IP. There is no
government organization or designated organization managing it in an integrated manner. Instead,
the technical support and resource management are done by volunteer organizations.
ARPANET,122 created by the United States Defense Department, set the foundations for the
Internet.
Intranets
An intranet is a company-wide network applying the technology of the Internet. Normally, a
firewall123 is set up between the Internet and a company-wide network in order to prevent leakage
of confidential information of the company.
Extranets
An extranet is an intranet extended over numerous companies. In general, an extranet is built by
connecting intranets to the Internet.
[Figure: the intranet at Company A and the intranet at Company B connected via the Internet]
Mobile communication
Mobile communication124 is an environment in which the network can be accessed from any
location. Today, mobile phone communication is the mainstream, so we can send and receive
e-mails on the Internet and obtain a variety of information, all with one telephone.
Satellite communication
Satellite communication is a wireless communications system using a communications satellite.
A broadcasting station transmits (uplink) a huge amount of information to a geostationary
communications satellite located about 36,000 km above the equator (in a geostationary orbit),
and the information is distributed all at once (downlink) to various receiving stations on the
earth. A large amount of information can thus be transmitted to many points.
CATV
New types of services using CATV (Community Antenna TeleVision, or Cable TV) are being
considered and discussed commercially, such as Internet connection, telephone services,
experiments involving PHS (Personal Handy-phone System), and VOD (Video On Demand).
CATV is expected to be a major part of the infrastructure in the multimedia era.125
122
ARPANET (Advanced Research Project Agency Network): This is a nationwide computer network developed under
the sponsorship of the Advanced Research Project Agency (ARPA) of the United States Department of Defense. It is the
predecessor of the Internet.
123
Firewall: It is the mechanism which is located between the Internet and a company-wide intranet to manage data
communication and to protect the internal network from external attacks and invalid access. The word could also refer to
this functional role.
124
Mobile/mobile computing: “Mobile” refers to any information device that can be carried around, including cell
phones, PHSs (personal handy-phone systems), and notebook PCs. “Mobile computing” refers to the mode of using any of
these information devices to have access to the company network from the outside.
125
(Note) CATV began as a facility to remedy poor reception in remote areas and as a community facility in rural regions.
Today, urban CATV, which can provide broadcast services on many channels, is getting attention as a new-generation
component of the infrastructure. Coaxial cables are used for distribution so that high-quality images can be received.
Application Systems
A network application system is a social system using a network. Specifically it includes the
following:
Internet shopping
This is a system in which the user can shop at a virtual store set up on web pages. To make a
payment, the shopper can use his or her credit card or go to a nearby convenience store to pay.
Groupware
This is software for communication within an organization or for information sharing. It has
functions such as electronic mail, schedule sharing, document sharing, and workflow.
Debit cards
This is a service whereby a cash card issued by a bank can be used to make payments. The money
for the payment is directly withdrawn from the bank account in real time.126
126
Debit cards have been traditionally called bank POS; cash cards are used instead of cash.
2.5.2 Database Applications
One type of database application system is a data warehouse. Application systems in which
databases are applied in business include corporate accounting systems,127 inventory
management systems,128 document management systems,129 and sales support systems.130
Data Warehouse
A data warehouse is a company-wide database to support decision-making. The idea is to have
a large amount of data stored, organized, and used to help make business decisions. Sometimes
it is called an informational database.
Data Mining
This refers to a technology or method of drawing out the tendencies, trends, correlations, and
patterns necessary for management and marketing by interactively exploring a large amount of
raw data.
Whereas a data warehouse normally analyzes various data based on some hypothesis, data
mining discovers trends and patterns in order to establish the hypothesis.
Data Mart
A data mart is a database which stores data obtained from a data warehouse. The data stored in
a data mart is selected and summarized according to the purposes of a specific user group.
Whereas a data warehouse contains information for the entire company, a data mart has a
relatively small amount of data tailored for the target users.
127
Corporate accounting system: It is a system in which the accounting procedures of a corporation are computerized in
an attempt to make the accounting tasks more efficient and quicker and to obtain timely understanding of the business and
managerial records.
128
Inventory management system: It is a system to keep the production (purchase) and demand in balance, managing the
inventory such as products and raw materials kept by the company at an optimum amount. In a retail store such as a
supermarket, the sales information entered at POS terminals is collected and analyzed so that the demands can be predicted
and more products are automatically ordered, taking into account safe inventory volumes and optimum amounts to
purchase.
129
Document management system: It is a system in which a corporation manages various types of documents and
sources; document search is possible from a variety of fields such as the storage location or contents of the document. It is
an attempt to make document management and document preparation more efficient, e.g. to avoid duplicate preparation of
the same document.
130
Sales support system: It is a system that supports making sales plans and business plans, based on accumulated sales
information.
OLAP
OLAP (OnLine Analytical Processing) is the concept of analytical application in which the end
user discovers problems and solutions by directly searching and organizing a database; the goal
is to achieve quick data access and to provide a function for easy analysis.131
OLTP
OLTP (OnLine Transaction Processing) is the processing mode in which multiple terminals
connected online to a host computer send messages to it. According to each message received,
the host computer performs the processing, including access to a series of databases, and
immediately returns the results to the terminal.
Databases used by OLTP are called business databases or, sometimes, operational databases.
These are terms in contrast with informational databases.
Application Systems
Various systems that use databases have been developed. Today, it is not an overstatement to say
that most of the systems in operation use databases. Some of the great advantages of using
databases are as follows:
• Data can be easily accumulated.
• Data can be managed in an integrated manner.
• Data can be easily processed.
• Data can be easily searched.
131
(FAQ) There have been exam questions concerning data warehouses. Know accurately the meanings of data
warehouses, OLAP, and OLTP.
132
Standard/non-standard tasks: Standard tasks are those for which processing procedures are fixed, such as daily
business procedures and daily input of sales data. Non-standard tasks are those for which processing procedures vary case
by case. Creation of analysis documents, for instance, requires different processes depending on the purpose of use, so it is
considered non-standard.
2.5.3 Multimedia Systems
Multimedia refers to handling not just characters and text but also mixtures of still images,
moving pictures, audio, and other communication media. The term also refers to devices and
software used in multimedia communication.
Type                      Explanations
Artificial reality (AR)   It is the technology of creating a virtual world inside the computer, with a sense of reality.
                          Special equipment is not necessary to experience the artificial reality.
Virtual reality (VR)      It is the technology of creating a fictitious world and having people experience and feel
                          that world as though it were real.
                          Dedicated 3D display units and special input equipment are used.
                          Examples: pre-experience of virtual surgery, flight simulator, etc.
133
Expert system: It is a system created with the knowledge base of various specialists (experts) in a variety of fields;
given certain conditions, the system applies the knowledge based on certain rules so that problems can be solved as if they
were solved by the experts.
134
Computer graphics (CG): It is the technology of creating images via computers, or images made by such technology.
There are methods where the computer processes already existing images, and there are other methods where the computer
creates images themselves. The latter method is called CGI (computer-generated images).
135
VRML (virtual reality modeling language): language specifications to describe 3DCG used on the Internet.
Internet broadcasting
This is broadcasting using multimedia on the Internet. With the use of streaming distribution
technology,136 programs are broadcast in real time on the Internet. Compared with the
conventional broadcasting business, equipment costs much less, and global information
transmission is possible. In addition, on-demand service137 can also be provided, so we can
tune in whenever we wish to watch a particular program.
Internet broadcasting comes in various formats. On-demand broadcasting stores the contents on
a server and distributes them per request from a user. Live broadcasting (Internet live)
distributes live programs, such as concerts, simultaneously to multiple users.
Non-linear editing
This is a method of video editing where images are digitized and video is produced in free
order using a computer. It is easy to correct images, switch the order in which they appear, and
create a different version. Incidentally, the conventional method in which video images are
dubbed in the order of their completion is called linear editing.
Video on demand
This refers to the service of instantly sending a video program requested by the viewer via, for
example, bidirectional CATV. The service provider stores many video programs on its video
server and distributes the one requested by the viewer.
A video server must respond to simultaneous access by many viewers, each of whom requests
that a program be sent from its beginning. Hence, it is necessary to construct an image database
and connect it to individual mobile terminals and household receivers via broadband
communication lines such as cable, wireless, etc.
136
Streaming: It is the technology of playing data back while it is still being received. It enables Internet broadcasting
and playback of contents without waiting time. For streaming distribution, the line speed must exceed the data rate of the
content; however, Internet lines are generally slow, so normally the data is compressed to enable real-time transmission.
Conventionally, playback used to be time-consuming since the data had to be downloaded first and then played back. With
streaming, however, playback is done while data is being received.
137
On-demand: It is a function to provide what is requested whenever requested.
Quiz
Q1 What is an intranet?
Question 1
Q1. There is a system which manages the file area in units of blocks. Each block contains
eight sectors, and one sector is 500 bytes. How many sectors in total would be
assigned to store two files, one consisting of 2,000 bytes and the other of 9,000 bytes?
Here, the sectors occupied by management information, such as directories, can be
ignored.
a) 22 b) 26 c) 28 d) 32
Answer 1
Correct Answer: d
Files are saved in units of blocks of 8 sectors. Eight sectors, as shown below, are 4,000 bytes.
8 * 500 = 4,000 (bytes)
Hence, even if the data stored in a block is less than 4,000 bytes, the entire 4,000-byte block is
allocated.
Next, we find the number of sectors necessary for each of the 2,000-byte and 9,000-byte
files.
Capacity required for the 2,000-byte file = 2,000 / 4,000 = 0.5 (block)
→ 1 block (= 8 sectors) required
Capacity required for the 9,000-byte file = 9,000 / 4,000 = 2.25 (blocks)
→ 3 blocks (= 24 sectors) required
Hence, to save files of 2,000 bytes and 9,000 bytes, the total number of sectors allocated to the
two files is 32 as shown below.
8 + 24 = 32 (sectors)
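The rounding-up step in this answer is a ceiling division. The following is a minimal sketch of the calculation; the constant and function names are assumptions introduced for illustration.

```python
SECTOR_BYTES = 500        # one sector, as given in the question
SECTORS_PER_BLOCK = 8     # one block, as given in the question


def sectors_allocated(file_bytes):
    """Sectors assigned to a file when space is handed out in whole blocks."""
    block_bytes = SECTOR_BYTES * SECTORS_PER_BLOCK
    # Ceiling division: a partly used block still occupies the whole block.
    blocks = (file_bytes + block_bytes - 1) // block_bytes
    return blocks * SECTORS_PER_BLOCK


total_sectors = sum(sectors_allocated(size) for size in (2_000, 9_000))
print(total_sectors)
```

The 2,000-byte file takes 1 block and the 9,000-byte file takes 3 blocks, which gives the 32 sectors of answer d).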
Question 2
Q2. Which of the following is the appropriate term for the process of breaking down data
and storing it on multiple hard disks, as shown in the figure below? Here, b0 to b15
represent the sequence in which data is stored on the data disk in units of bits, and
p0 to p3 represent the parity used to identify disk failure.
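[Figure: a control unit distributes the data over four data disks and one parity disk]
Data disk 1   Data disk 2   Data disk 3   Data disk 4   Parity disk
b0            b1            b2            b3            p0 (b0 to b3)
b4            b5            b6            b7            p1 (b4 to b7)
b8            b9            b10           b11           p2 (b8 to b11)
b12           b13           b14           b15           p3 (b12 to b15)
a) Striping b) Disk cache c) Blocking d) Mirroring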
Answer 2
Correct Answer: a
Striping is to distribute one block of data onto two or more disks and write simultaneously. By
striping, each block can be read and written in parallel, so the input/output speed increases.
Striping is defined as a technology for RAID. As shown in the figure above, this configuration
contains a disk dedicated to the parity; such a configuration is called RAID2, RAID3, or
RAID4.
b) Disk cache is placed between a hard disk and the main memory; it is a buffer (buffer
memory) to improve the apparent speed of the hard disk.
c) Blocking is to handle each logical set of multiple records as one physical record (block).
d) Mirroring is to prepare multiple disks and write the same data onto separate disks
simultaneously, i.e., a multi-disk configuration. If one of the disks fails, the operation
continues with the remaining disks only. This configuration is called RAID1.
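To illustrate how a dedicated parity disk can work, the sketch below assumes that each parity bit is the XOR (even parity) of the four data bits in its row. The text does not state the parity scheme, so this is an assumption, and the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def parity(bits):
    """Even parity of one stripe: XOR of all its data bits (assumed scheme)."""
    return reduce(xor, bits, 0)


def rebuild(stripe, parity_bit, failed_index):
    """Recover the bit of a failed data disk from the survivors and the parity."""
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    return reduce(xor, survivors, parity_bit)


stripe = [1, 0, 1, 1]              # hypothetical values for b0 to b3
p0 = parity(stripe)                # value stored on the parity disk
recovered = rebuild(stripe, p0, failed_index=2)
print(p0, recovered)
```

Because XOR is its own inverse, combining the surviving bits with the parity bit yields the missing bit, whichever single data disk has failed.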
Question 3
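Q3. Which of the following is arranged in the order of the effective memory access speeds
from fastest to slowest?
     Cache memory                         Main memory
     Access time (ns)    Hit ratio        Access time (ns)
A    (none)              -                15
B    (none)              -                30
C    20                  0.6              70
D    10                  0.9              80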
a) A, B, C, D b) A, D, B, C
c) C, D, A, B d) D, C, A, B
Answer 3
Correct Answer: b
Cache memory (buffer memory) is memory which is placed between the CPU and the main
memory to adjust speed differences between the two. The effective access speed can be
increased by adding high-speed buffer memory, and by reading and writing on this cache
memory as much as possible.
Let tc be the access time of the cache memory, tm be the access time of the main memory, and
h be the hit ratio. The effective memory access time is then calculated as follows:
Effective memory access time = h * tc + (1 – h) * tm
The hit ratio is the probability that the data to be read is in the cache memory. The higher the
hit ratio is, the faster the effective memory access speed becomes.
For A through D, we need to calculate the effective memory access time. For A and B, there is
no cache memory, so the access time of the main memory is the effective memory access time.
Here, we can consider h = 0. The numbers below indicate the order of each, from fastest to
slowest (from the shortest effective memory access time to the longest).
A: 15 (ns) (1)
B: 30 (ns) (3)
C: 0.6 * 20 + (1 – 0.6) * 70 = 40 (ns) (4)
D: 0.9 * 10 + (1 – 0.9) * 80 = 17 (ns) (2)
Hence, if we arrange Memory A through D from fastest to slowest in terms of the effective
memory access time, the order is “A, D, B, C.”
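The comparison can be checked with a few lines of code. This is a sketch only: the function name and the dictionary layout are assumptions, while the access times and hit ratios are the ones used in the calculations above.

```python
def effective_access_ns(t_main, t_cache=None, hit_ratio=0.0):
    """Effective access time h * tc + (1 - h) * tm; without a cache it is tm."""
    if t_cache is None:
        return float(t_main)
    return hit_ratio * t_cache + (1 - hit_ratio) * t_main


# (main memory ns, cache ns or None, hit ratio) for memories A to D
memories = {
    "A": (15, None, 0.0),
    "B": (30, None, 0.0),
    "C": (70, 20, 0.6),
    "D": (80, 10, 0.9),
}

times = {name: effective_access_ns(*spec) for name, spec in memories.items()}
order = sorted(times, key=times.get)   # shortest effective access time first
print(order)
```

Sorting by effective access time reproduces the order A, D, B, C of answer b).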
Question 4
Q4. When a certain file was copied from one directory to another on a hard disk in a PC,
file fragmentation occurred. Which of the following is an appropriate description
concerning this situation?
Answer 4
Correct Answer: d
Fragmentation means that a file saved on the hard disk could not secure one continuous area
and thus is saved across multiple non-contiguous areas. When fragmentation occurs, various
parts of the hard disk must be accessed, reducing the processing efficiency. However, the file
size does not change, as the file is simply saved in a divided manner.
a) If the disk is physically copied, the situation does not change. It must be copied logically.
b) One file was copied on the hard disk on which files had already been saved, so no other file
except the one that was copied was affected. Physically nothing was changed.
c) The fragmentation will be solved if the copy destination has a continuous empty area
whose size is larger than the file size.
Question 5
Q5. The table shown below gives processing times for a CPU and I/O devices to execute 5
stand-alone tasks. Which task can be executed simultaneously with the “High” priority
task so that the CPU idle time from task execution start to end can be zero? Here, each
task uses a different I/O device and is performed concurrently. The overhead of the OS
can be ignored.
Unit: ms
     Priority   Stand-alone task processing time
     High       CPU (3) → I/O (3) → CPU (3) → I/O (3) → CPU (2)
a)   Low        CPU (2) → I/O (5) → CPU (2) → I/O (2) → CPU (3)
b)   Low        CPU (3) → I/O (2) → CPU (2) → I/O (3) → CPU (2)
c)   Low        CPU (3) → I/O (2) → CPU (3) → I/O (1) → CPU (4)
d)   Low        CPU (3) → I/O (4) → CPU (2) → I/O (5) → CPU (2)
Answer 5
Correct Answer: c
We check the operation of each “low-priority” task to see which one uses the CPU while the
“high-priority” task is not using the CPU. The input/output units are different, so there is no
waiting for the input/output (I/O) units.
The “high-priority” task uses I/O units twice, each for 3 ms. If the use of the CPU by the
“low-priority” task takes exactly 3 ms, the CPU has no idle time. With this said, let us now
consider each task listed in the answer group.
For instance, consider Task (a). During the 3-ms period when the “high-priority” task uses an
I/O unit for the first time, the “low-priority” task uses the CPU for 2 ms. Hence, 1 ms of CPU
idle time will result.
Similarly, any “low-priority” tasks like (b) and (d), with 2 ms of CPU use, will cause CPU idle
time. In contrast, Task (c) has 4 ms of CPU use, but this comes in the end, so no idle time will
occur if this is after the completion of the “high-priority” task.
Hence, the “low-priority” task that can completely eliminate the CPU idle time until the
execution of both tasks is completed is “CPU (3), I/O (2), CPU (3), I/O (1), CPU (4).”
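The reasoning above can be confirmed with a small simulation in 1-ms steps. This is a sketch under the assumptions stated in the question (one CPU, a separate I/O device per task, no OS overhead, and the high-priority task always getting the CPU first); the function and variable names are hypothetical.

```python
def cpu_idle_ms(high, low):
    """Count idle CPU milliseconds until both tasks have finished.

    Each task is a sequence of (kind, ms) phases, where kind is "CPU" or "IO".
    """
    # Mutable copies of the phase lists; index 0 is the high-priority task.
    tasks = [[[kind, ms] for kind, ms in high], [[kind, ms] for kind, ms in low]]
    idle = 0
    while any(tasks):                      # some task still has phases left
        cpu_taken = False
        for phases in tasks:               # visited in priority order
            if not phases:
                continue
            if phases[0][0] == "CPU":
                if cpu_taken:
                    continue               # must wait for the CPU in this step
                cpu_taken = True
            phases[0][1] -= 1              # one millisecond of progress
            if phases[0][1] == 0:
                phases.pop(0)              # phase finished; next one starts later
        if not cpu_taken:
            idle += 1
    return idle


HIGH = [("CPU", 3), ("IO", 3), ("CPU", 3), ("IO", 3), ("CPU", 2)]
CANDIDATES = {
    "a": [("CPU", 2), ("IO", 5), ("CPU", 2), ("IO", 2), ("CPU", 3)],
    "b": [("CPU", 3), ("IO", 2), ("CPU", 2), ("IO", 3), ("CPU", 2)],
    "c": [("CPU", 3), ("IO", 2), ("CPU", 3), ("IO", 1), ("CPU", 4)],
    "d": [("CPU", 3), ("IO", 4), ("CPU", 2), ("IO", 5), ("CPU", 2)],
}

idle_by_task = {name: cpu_idle_ms(HIGH, low) for name, low in CANDIDATES.items()}
print(idle_by_task)
```

Only task (c) leaves the CPU with no idle time; each of the other candidates leaves at least one idle millisecond.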
Question 6
Q6. The state transition diagram below shows a task (process) state transition on a
multitasking computer. When does the task state change from the running state to the
ready state?
[Figure: state transition diagram linking the running, ready, and waiting states]
a) A task with higher priority compared to its own has moved to the ready state.
b) A task has been generated by the job scheduler.
c) An I/O operation has completed.
d) An I/O operation has been requested.
Answer 6
Correct Answer: a
When a task (process) is generated, it proceeds to the ready state and is entered into the queue.
After that, it goes into the running state depending on the task priority and goes into the waiting
state when an I/O operation occurs. Such changes in the task state are called the state
transitions of the task.
[Figure: state transition diagram of the ready, running, and waiting states, with the transitions labeled A through F]
We now explain the state transitions of tasks. The numbers in parentheses ( ) are for state
explanations, and A through F are for transition explanations.
Transition   Explanation
A            A task is generated; moved to the ready state
B            By priority, moved to the running state
C            Moved to the waiting state (waiting for I/O, etc.) due to an event
D            Moved to the ready state after completion of an event
E            Moved to the ready state due to another task with high priority
F            The task in the running state is completed.
As you can see from these tables, transition from the running state to the ready state is
transition E. This is when a task whose priority is higher than the task being executed goes
into the ready state.
b) This describes transition A in the state transition figure for the tasks.
c) An I/O operation is an operation of input or output. When this I/O operation is completed, an
I/O interruption occurs and the task moves into the ready state. This describes transition D in
the state transition figure for the tasks.
d) An I/O operation is requested when the task issues an instruction for input or output. If this
happens, a supervisor call occurs, and the task moves from the running state to the waiting
state to wait for the completion of the I/O. This is transition C in the state transition figure
for the tasks.
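The transition table can be written as a small state machine. In the sketch below, the state names "new" and "terminated" and the function name `follow` are assumptions added for illustration; the letters A through F are the transitions explained above.

```python
# (current state, transition) -> next state
TRANSITIONS = {
    ("new", "A"): "ready",            # task generated
    ("ready", "B"): "running",        # dispatched by priority
    ("running", "C"): "waiting",      # waits for an event such as I/O
    ("waiting", "D"): "ready",        # event completed
    ("running", "E"): "ready",        # preempted by a higher-priority task
    ("running", "F"): "terminated",   # task completed
}


def follow(transitions, state="new"):
    """Apply a sequence of transition letters and return the final state."""
    for letter in transitions:
        key = (state, letter)
        if key not in TRANSITIONS:
            raise ValueError(f"transition {letter} is not allowed from state {state}")
        state = TRANSITIONS[key]
    return state


print(follow("ABE"))      # generated, dispatched, then preempted
print(follow("ABCDBF"))   # runs, waits for I/O, runs again, completes
```

Only transition E leads from the running state to the ready state, which corresponds to answer a).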
Question 7
Q7. Which of the following is an appropriate statement concerning a client/server system?
a) The client and the server must use the same kind of OS.
b) The server sends data processing requests and the client processes those requests.
c) A server can support a client function that enables it to request processing of
another server if necessary.
d) The server functions must be allocated to different computers, such as a file server
and print server.
Answer 7
Correct Answer: c
A client server system is a system made up of processing units called clients and servers. A
client performs data input/output and other processes through a server while a server controls
all input and output that depend on the hardware according to its type. Normally, a client unit is
a unit equipped with data processing functions such as a personal computer or a workstation, so
it can perform applications on its own. In fact, it also performs processes only a client can
perform, such as displaying text and drawing figures. A server, on the other hand, accesses
databases and performs printing processes in response to requests by clients. Further, if a server
is not able to perform a process, it can request another server to do that. Here, the first server
becomes a client because it is requesting another to perform a process.
a) Different operating systems do not cause any problems as long as the protocol is established.
We can have a combination of servers with UNIX and clients with Windows.
b) It is a client that sends requests for processing. It is a server that performs the requested
processes.
d) In a small-scale system, a server and a client can even be on the same computer. The X
Window System of UNIX is an example of this type. As long as the clients and servers can be
connected via a network, building them on the same platform (OS) causes no problem.
Question 8
Q8. When comparing a distributed processing system, which consists of multiple computer
systems located in a wide area, with centralized processing systems that operate in a
single center, which of the following is the most appropriate feature of centralized
processing systems?
Answer 8
Correct Answer: c
a) Since the host computer performs all processing, the system gets shut down until the host
computer is recovered. If this is a serious failure, it is possible that the downtime of the
system becomes long.
b) In a centralized processing system, all requests must be answered by the host computer alone.
If the contents of the requested items vary significantly in level, the host computer cannot
meet all those requests, causing backlog accumulation. Incidentally, backlog can also refer to
a system waiting to be developed.
d) A centralized processing system processes all tasks at the host computer, so it is cumbersome
to respond to each type of business task separately. Even if we want to extend the system,
often there are tasks that cannot be suspended. With the introduction of a new technology,
some tasks may not be able to be processed any longer.
Question 9
Q9. For one job, which of the following formulas appropriately expresses the relationship
between turnaround time, CPU time, I/O time, and process waiting time? Here, other
types of overhead time are ignored.
Answer 9
Correct Answer: c
Process waiting time is the time until the start of a CPU process or an I/O process of the job
entered into the computer system. Turn-around time (TAT) is the time interval from submitting
a job to receipt of the results. This concept is mainly used in batch processing. TAT includes
both the CPU processing time and I/O time. Hence, process waiting time is obtained by
subtracting CPU processing time and I/O processing time from TAT.
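In the figure below, the shaded areas represent process waiting time.
[Figure: timeline of the turn-around time of one job, with the process waiting time shaded]
Hence, for one job, the formula expressing the relationship among turn-around time, CPU time,
I/O time, and process waiting time is as follows:
Turn-around time = CPU time + I/O time + process waiting time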
Question 10
Q10. LAN facilities are installed as shown in the figure below. Using the server connected
to LAN3, the client on LAN1 is performing a business application. Data transmission
is normally performed via Router 1. If a failure occurs in Router 1, Routers 2 and 3
are used for transmission between LAN 1 and LAN 3. What is the availability of the
LAN equipment connecting LAN1 and LAN3? Here, the failure rate of each router is
0.1, no switch-over time is required in case of a failure, and failures in LAN facilities
other than the routers are not taken into account.
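[Figure: the client is on LAN 1 and the server is on LAN 3. Router 1 connects LAN 1 and LAN 3 directly; Router 2 connects LAN 1 and LAN 2, and Router 3 connects LAN 2 and LAN 3.]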
Answer 10
Correct Answer: b
Note that “Router 1” and “the serial connection of Routers 2 and 3” are in parallel connection.
Since the failure rate of each router is 0.1, the availability is 0.9. Hence, the configuration is as
shown below. The values in the boxes are availability of each unit.
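[Figure: between the client and the server, Router 1 (0.9) is connected in parallel with the serial connection of Router 2 (0.9) and Router 3 (0.9)]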
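Applying the serial and parallel rules from Section 2.4.3 to this configuration gives the availability of the path between LAN 1 and LAN 3. The sketch below assumes that the routers fail independently; the variable names are chosen for illustration.

```python
FAILURE_RATE = 0.1
router = 1 - FAILURE_RATE                  # availability of each router: 0.9

# Backup path: Router 2 and Router 3 in series.
backup_path = router * router

# Router 1 in parallel with the backup path: the connection is lost
# only when both the main path and the backup path are down.
lan_availability = 1 - (1 - router) * (1 - backup_path)
print(round(lan_availability, 3))
```

The backup path alone has an availability of 0.81, and putting it in parallel with Router 1 raises the overall figure to 1 – 0.1 * 0.19 = 0.981.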
Question 11
Q11. Which of the following is an appropriate statement concerning VR (Virtual Reality)?
a) Using technology such as CG, VR expresses the world created inside a computer as
if it were the real world.
b) For the purpose of improving GUI, it does not display an image incrementally from
the top, but first displays a rough mosaic-like image and gradually sharpens it.
c) VR tests whether or not hypothetical results can be obtained from computer
simulations such as those of wind tunnel tests used for automobile or aircraft
design.
d) VR makes abilities such as human recognition and inference possible on a
computer.
Answer 11
Correct Answer: a
Chapter Objectives
System development means the creation of software to
operate computers. In general, this is performed in the
order of requirement analysis, external design, internal
design, programming, and testing, but various
methodologies have been proposed, depending on the
situation of system development. In Section 1, we will
learn the methodologies of system development as well
as programming languages, groups of tools, and
evaluation of software quality, all of which support system
development. In Section 2, we will learn specific
procedures of system development and methods of
testing.
1
Non-procedural: Sometimes programming languages that are not procedural are called non-procedural programming
languages. They are characterized by the property that the order in which instructions are written in the program does not
match the order of execution. Generally, parameters are given, and the processes are executed according to the contents of
the parameter definitions.
Programming Languages
The characteristics of commonly used programming languages are organized below.
Procedural/functional/logic/object-oriented
Language    Characteristics
COBOL       A business-processing language. The language specifications were established by CODASYL.
C           Developed by AT&T² to write the OS for UNIX³. Allows easy portability.
Fortran     Developed by IBM as a computing language for science and technology.
Pascal      A structured programming language developed for the purpose of teaching students.
Lisp        A list-processing language developed at MIT⁴. Used for research in artificial intelligence, etc.
Prolog      A language with an inferential mechanism. Developed at the University of Marseille in France.
C++         An object-oriented language and an extension of C. Largely upward-compatible with C.
Java        Developed by Sun Microsystems, based on C++. Runs on any OS that provides a Java virtual machine.
Smalltalk   Developed by Xerox at its Palo Alto laboratory. Dialogue-type and programmable.
2
AT&T: American Telephone and Telegraph, a telecommunications company, oldest in the world and largest in the
United States.
3
(Note) C is a language developed to write an operating system for UNIX, but since it is so easy to use, today a wide
range of programs are written in it, including business applications and operating systems.
4
MIT: Massachusetts Institute of Technology.
5
Page description language: It is a language used to define printing image for the printer when printing a document
using a page printer. Identical images can be printed even if printers have different resolutions.
6
CGI (Common Gateway Interface): It is a mechanism that takes requests from a WWW browser, calls an external
program requested, and returns the execution results to the WWW browser.
Script Languages
A script language is a language that uses text (characters) to describe procedures to be executed
by the computer. The processing procedures described by a script language are called scripts.
They are used mainly as macros in database software and spreadsheet software. In the sense
that these languages describe procedures, they are similar to procedural programming
languages; however, scripts are characterized as being event-driven.⁷ Also, a development
environment with a GUI is often provided so that end users can easily write programs.⁸
Processes frequently used in a program or processes shared by multiple programs are set aside
as separate programs and are shared among many programs. Such a shared program is called a
subroutine (subprogram), and a variety of structures are used according to the conditions of
use.
Program Structures
According to the structure, programs can be classified as shown below.
Reentrant: Multiple tasks⁹ can use the program at the same time.
Recursive
A procedure is said to be recursive if the definition of that procedure refers to the procedure
itself. A program in which the definition of a subroutine or a function uses the subroutine or the
function itself is called a recursive program. Such a reference within itself is known as a
recursive call. It can be used in most programming languages; older versions of COBOL and
Fortran are exceptions (later standards of both languages added support for recursion).
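As an illustration of a recursive call, here is a minimal Python sketch. The factorial function is a standard textbook example chosen for illustration; it is not taken from the original text.

```python
def factorial(n):
    """Return n! by having the function call itself."""
    if n <= 1:                     # base case: stops the recursion
        return 1
    return n * factorial(n - 1)    # recursive call on a smaller problem

print(factorial(5))  # 120
```

Each call waits for the result of the call it makes on `n - 1`, so a base case is required to guarantee that the chain of calls ends.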
7
Event-driven: It is a program that is triggered by an event and starts up to respond to and process the event. An event is
any conditional change, such as a press on the keyboard. Programs that start up when the user clicks on an icon are
event-driven.
8
(FAQ) There are exam questions on combinations of common languages and their classification. For instance, know that
COBOL is procedural, Lisp is functional, and Java is object-oriented.
9
Task: It is a processing unit obtained when the processing of a program is divided into small units.
Reusable
This term refers to the program structure that allows multiple programs (tasks) to share the use
of the program without reloading the program into main memory each time. If the program can
be used simultaneously by multiple tasks, it is called reentrant;10 otherwise, it is called
serially reusable. A program with a structure to allow reentry is called a reentrant program.
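The idea of a reentrant program can be sketched in Python as below. This is only an illustration under assumptions: Python threads stand in for tasks, and the routine `sum_to` is a hypothetical example. The point is that the procedure part is shared, while all changeable data lives in local variables that each caller gets separately.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_to(n):
    """Reentrant routine: the code is shared, but the changeable data
    (total, i) is local, so every task works on its own copy."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

# Several tasks use the same routine at the same time.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(sum_to, [10, 100, 1000, 10000]))

print(results)  # [55, 5050, 500500, 50005000]
```

If `total` were instead a variable shared by all callers, simultaneous use could corrupt it, and the routine could at best be used one task at a time (serially reusable).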
Subroutines (Subprograms)
A subroutine refers to a part of a program which is repeatedly used within the program to
execute common procedures. If multiple programs execute the same procedures, those
procedures can be combined as one program so that the multiple programs can share their use.
Such a program is also called a subroutine.
Open subroutine
An open subroutine11 is a subroutine that is embedded wherever a program needs it as many
times as the program needs it.
Closed subroutine
A closed subroutine is created independently of programs that need it as a subroutine. If a
program needs the subroutine, it executes a subroutine-call instruction (usually a CALL
statement) to deliver the control to the subroutine.
The figure below illustrates the concept of a closed subroutine. The processes are executed in
the order (1), (2), (3),… By the CALL statement, the program jumps to the entrance of the
subroutine, and by the RETURN statement, it returns to the instruction following the CALL
statement (return point).12
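[Figure: Closed subroutine. The main program executes in the order (1), (2), (3), …; the CALL statement transfers control to the entrance of the subroutine, and the RETURN statement (return instruction) transfers it back to the return point.]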
10
(Note) In a reentrant program, the unchangeable parts (mainly procedural parts) and changeable parts (mainly data) are
separated so that multiple programs can use it at the same time by sharing the use of the unchangeable parts while securing
only the changeable parts according to the programs that call the reentrant program. In general, most online-processing
programs have the reentrant structure.
11
Open subroutine: It can be implemented as a macro in assembler language, a copy library in COBOL, and “#include”
in C.
12
(FAQ) With regard to program structure, many recent exam questions have involved recursion and reentry. Recursion is
calling itself, and reentry is being simultaneously called up by multiple programs.
A programming language uses expressions similar to daily language so that programs can be
easily written. However, computers cannot understand the instructions of any programming
language as is. Hence, it is necessary to convert those programs written in programming
languages into a format that computers can understand. This conversion is performed by
programs called language processors.¹³
Language Processors
A language processor is a program that translates (converts) source programs to machine
language. Language processors are as follows:
[Figure: Types of language processors]
In addition, there are also preprocessors,15 which convert source programs to a compiler
language, not to machine language.
Procedures of Compiler
A compiler language is translated to machine language in the order below. A program translated
into machine language is called an object program (also an object module).
Lexical analysis: Decomposing the source program into variables and tokens (smallest linguistic units)
Syntax analysis: Analyzing the program according to the language syntax
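Lexical analysis can be illustrated with a small tokenizer. The following is a minimal Python sketch for a tiny, hypothetical expression language; the token names and patterns are assumptions made for this example and are not part of any real compiler.

```python
import re

# Token patterns for a tiny, hypothetical expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=()]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Decompose a source string into (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":      # whitespace is discarded
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("total = price * 3"))
# [('IDENT', 'total'), ('OP', '='), ('IDENT', 'price'), ('OP', '*'), ('NUMBER', '3')]
```

The resulting token list is what the next step, syntax analysis, would check against the grammar of the language.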
13
(FAQ) There have been many exam questions where you are to select characteristics of interpreters and compilers. Be
sure to completely understand the characteristics of each. Questions concerning the procedures of a compiler have also
been frequently asked on the exams.
14
Compiler language: It is a programming language whose source programs are translated into object programs by a
compiler. It is also called a higher-level language and includes COBOL, Fortran, Pascal, PL/I, and C. A compiler language
uses expressions similar to what humans use in daily living, so it is easy to understand and easy to learn.
15
Preprocessor: It is a program that takes source programs and processes them before a compiler translates them to machine
language. For example, the preprocessor for the C language supports functions such as replacing symbolic names
(character strings) in the source programs with the numerical values defined for them and pulling in library files
referenced by the source programs. These are designated by “#define” and “#include,” respectively.
Load modules (executable programs) are the programs that can actually be executed. Object
programs, which are simply translated by a language processor, cannot be executed. Through a
linkage-editing program (linkage editor), what is required for execution needs to be added to
the object program.
A linkage editor, in linking two or more object programs, calls function programs and
subroutines used by the object programs from the software library and links them to the object
programs. This is also called a linker.
Interpreters do not have object programs. Rather, they execute each instruction as they translate
the instructions one by one. Generators create load modules directly from the parameters they are given.
Execution of Program
To execute a program, it is necessary to store the program to be executed in the main memory
or in a virtual memory. This function is performed by a loader.
A loader stores a load module in the main memory, and then the computer takes out one
instruction at a time from the load module, interprets it, and executes it.
A development environment includes hardware necessary to build a system and software such
as system development support tools.
16
(Note) Linking a subroutine during the creation of a load module is called a static link. In contrast, it is also possible to
link a necessary subroutine when the program is executed. This method is called a dynamic link.
[Figure: CASE tools and the development processes they support]
Development processes: requirement definition, external design, internal design, program design, programming, testing, operation and maintenance¹⁸
CASE tool                                  Role
Upstream CASE                              Supports design processes
Downstream CASE                            Supports development processes
Maintenance CASE                           Supports maintenance processes
Integrated CASE¹⁷                          Supports all processes
Common CASE                                Supports documentation, project management, etc.
CASE for providing development platform    Defines interfaces among existing CASE
Repository¹⁹
17
Integrated CASE: These are tools that support the entire system development process. Initially, the idea of integrated
CASE was to have one CASE that covers all the processes; however, the reality was that partial CASE was in use, and the
idea that it is better to use these existing tools became more popular. Therefore, integrated CASE is now developed as a
means to provide interfaces between various tools so that design information can be communicated smoothly.
18
(Hints & Tips) Some common CASE tools have functions to support the entire development process. However, these
are to be distinguished from integrated CASE. Common CASE manages areas besides design information, such as
documentation (tables, graphs, figures) support, project management, and systems configuration management.
19
Repository: It is a database in CASE tools storing a variety of information, also known as a software engineering
database or storage. By consolidated management of the design information using a repository, it is possible to check for
consistency and completeness as well as to automate development processes.
Software Packages
A software package is “general-purpose software that general users can commonly use.” Amid
the various kinds of software packages, business packages have received greater attention
recently, as they support efficient business processes.
ERP
CRM
CRM (Customer Relationship Management) is the concept that all departments which have
contact opportunities with customers should share and manage customer information and
contact history so that any questions from the customers can be answered appropriately.
Companies attempt to promote the expansion of their customers by integrating all
communication channels including telephone, fax, the Web, and e-mail, reinforcing the
relationships with their customers and providing services that meet individual customer needs.
SFA
SFA (Sales Force Automation) is the concept of using information technology to restructure
the entire range of sales activities that support corporate profits. For example, more sales
can be generated by managing previous contact records
for each customer on computers. Moreover, transfer of work to new staff can be made more
smoothly.
CTI
CTI (Computer Telephony Integration) is technology that provides a high level of telephone
services by combining the information processing functions of computers and the
communication functions of telephone switches.
A process model is a model of system development method seen from the perspective of the
processes involved; a cost model is a model from the perspective of the costs.
Process Models
A process model is a model abstracting the process of system development. By establishing a
process model, the procedures of system development are given a direction or a guideline. The
table below shows various models and their characteristics.
Name                Characteristics
Waterfall model     Each phase of the development process flows from upstream to downstream, without going back.
                    • Each phase is reviewed at the time of its completion for quality management.
                    • It is difficult to clarify all of the requirements in the initial stages of the development.
                    • There are always activities that require iteration.
Prototyping model   A prototype of the user interface is developed to clarify the requirements.
                    • The requirements are clarified in early stages.
                    • Final stages will have few corrections and reviews.
Spiral model²³      Subsystems are developed independently.
                    • Scales of simultaneous development can be controlled.
                    • Development staff can be secured in a stable manner.
23
Spiral model: It is a process model in which the methods of both the waterfall model and the prototyping model are
incorporated. If a large-volume application can be divided into mutually highly independent components, for each
component, either the waterfall model or the prototyping model is applied.
The figure below shows examples of development phases using the waterfall model and the
prototyping model. The task contents during the development process in the prototyping model
are identical to those in the waterfall model. As for the user interface, requirement analysis and
testing are repeated until the specifications are finalized. For other parts, the waterfall model is
used. 24
[Figure: Development phases in the waterfall model and the prototyping model. A review follows each phase.]
Phase                  Activities
Requirement analysis   Defining requirements of the system
                       Defining requirements of the hardware and software
                       Planning the development structure and schedule
External design        Designing the system without taking the computer into account
                       Screen design, form design, etc.
Internal design        Designing the system, taking the computer into account
                       File organization, file design, etc.
Programming            Structured design of the program
                       Designing interfaces between programs
                       Coding, unit testing, etc.
Cost Models
Software cost is the cost incurred in each process of the lifecycle of software development
(Software Development Life Cycle: SDLC). A cost model is a model for quantifying this cost
using measures such as the productivity and quality of the software. Cost models and
their characteristics are shown in the following table.
Name                         Characteristics
COCOMO model                 The programmer's workload is calculated in terms of cost based on a mathematical
                             formula, using a statistical model, consisting of basic, intermediate, and advanced
                             (detailed) levels.
FP (Function Point) method   The numbers of the five elements (input, output, inquiry, logical files, and
                             interfaces) are obtained and added up with weights. Based on the assumption that
                             this weighted sum is in correlation with the scale of the software development, the
                             development size is estimated. The view held by this method is that what the users
                             really need is not the programs but the functions.
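The weighted sum used by the FP method can be sketched as follows. This is a minimal Python sketch: the weights and the counts are assumptions chosen for illustration (in practice, weights depend on the assessed complexity of each element), and the names `WEIGHTS` and `function_points` are hypothetical.

```python
# Illustrative weights (assumed; real estimates choose weights by complexity).
WEIGHTS = {"input": 4, "output": 5, "inquiry": 4, "logical_file": 10, "interface": 7}

def function_points(counts, weights=WEIGHTS):
    """Weighted sum of the counts of the five function types."""
    return sum(counts[kind] * weight for kind, weight in weights.items())

# Hypothetical counts for a small system.
counts = {"input": 10, "output": 6, "inquiry": 4, "logical_file": 3, "interface": 2}
print(function_points(counts))  # 130
```

Here 10×4 + 6×5 + 4×4 + 3×10 + 2×7 = 130. The estimate depends only on the functions the users see, not on the programming language or the number of program lines.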
24
(FAQ) Many questions on the waterfall model and the prototyping model have appeared. Be sure to correctly
understand the characteristics of each.
25
(Hints & Tips) Since the idea of the waterfall model is so clear and easy to understand, many projects have applied this
method. Since the work proceeds step by step, it is used in relatively large-scale projects. On the other hand, the
prototyping model shows its effectiveness in developing relatively small-scale applications.
Requirement analysis means carefully identifying and organizing the requirements of a system.
The results of requirement analysis are visualized by DFDs and E-R diagrams. Another method
of analysis conducted from a completely different perspective is analysis by object orientation.
A process expressed in a DFD is characterized by having not only an input data flow but also
an output data flow, which delivers the results of processing. Neither flow appears alone.²⁶
[Figure: Example DFD for order processing. Labels: Customer, Order information, Acceptance of order, Order-receiving information, Inventory check, Shipping instruction, Shipping information, Handling information, Delivery information]
26
(FAQ) Questions involving DFD have often appeared on the exams. Many of them ask about the meanings of the
symbols, so at least understand the meaning of each symbol. Others include questions regarding items to note concerning
statements in DFD. Each process has input and output, so a diagram always shows an input data flow and an output data
flow, such as “→ ○ →.” If either one is missing, the DFD contains an error.
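[Figure: E-R diagram relating the branch manager (1) to employees (n) and employees (n) to products (m), with the product attributes product name, price, and size; the legend marks n-to-m (many-to-many).²⁷]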
The above diagram indicates that one branch manager (1) manages several employees (n). Also,
each employee handles several (m) products and each product is handled by several (n)
employees. Each product has the attributes of product name, price, and size.
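The relationships described above can be modeled in code. The following is a minimal Python sketch; the class names follow the entities in the text, while the sample names and values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    name: str
    price: int
    size: str

@dataclass
class Employee:
    name: str
    products: list = field(default_factory=list)   # n-to-m: handles several products

@dataclass
class BranchManager:
    name: str
    employees: list = field(default_factory=list)  # 1-to-n: manages several employees

desk = Product("Desk", 200, "L")
chair = Product("Chair", 80, "M")
alice = Employee("Alice", [desk, chair])
bob = Employee("Bob", [chair])
manager = BranchManager("Carol", [alice, bob])

# Many-to-many: one product can be handled by several employees.
handlers = [e.name for e in manager.employees if chair in e.products]
print(handlers)  # ['Alice', 'Bob']
```

Picking an employee identifies exactly one manager, but picking the product "Chair" identifies two employees, which is what distinguishes the 1-to-n relation from the n-to-m relation.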
Object Orientation
Object orientation means to model the data and their operation (process) together. Integrating
data and operation is called encapsulation. Properties shared by the data are extracted, and the
data is organized into classes in a hierarchical structure. The lower-level classes inherit the
properties of the upper-level classes in this structure, and the properties thus inherited are
referred to as “inheritances.”28
For example, consider the relationship between the automobile and the bus. If the automobile is
defined as a “vehicle,” and the bus is defined as a “vehicle to carry people,” the property of
“being a vehicle” is common to both, so the automobile is the superclass while the bus is its
subclass.29 Using the inheritance function, then, the bus can simply be defined in terms of
“carrying people.”
[Figure: Class hierarchy]
Superclass:  Automobile (vehicle)
Subclasses:  Bus (to carry people), Truck (to carry cargo)
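The automobile example can be written as a small class hierarchy. This is a minimal Python sketch; the method name `describe` is an assumption introduced for illustration.

```python
class Automobile:
    """Superclass: holds the property common to all subclasses."""
    def describe(self):
        return "a vehicle"

class Bus(Automobile):
    """Subclass: only the added property needs to be defined."""
    def describe(self):
        return super().describe() + " to carry people"

class Truck(Automobile):
    """Subclass added later: again, only the difference is written."""
    def describe(self):
        return super().describe() + " to carry cargo"

for vehicle in (Automobile(), Bus(), Truck()):
    print(type(vehicle).__name__, "is", vehicle.describe())
```

Both subclasses inherit "being a vehicle" from the superclass, so adding the truck requires designing only the part "to carry cargo."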
27
(Note) Relations between entities are called correspondences (cardinalities). The relation between the branch manager
and the employees is 1-to-n or 1-to-many, and the relation between the employees and the products is n-to-m or
many-to-many. This relation between the branch manager and the employees indicates that each employee has one branch
manager but the branch manager has multiple employees. If an employee is picked, then the branch manager of the
employee is uniquely identified, but picking a branch manager does not identify one unique employee associated with
him/her. In contrast, the relation between the employees and the products is such that neither does picking an employee
identify a unique product, nor does picking a product identify a unique employee.
28
(Hints & Tips) In object orientation, we need to design only the part(s) to be added. For instance, to add the truck, only
the part “to carry cargo” needs to be designed. However, in reality, it is difficult to identify the common properties and
properties to be added.
29
Superclass/Subclass: An upper-level class is called a superclass whereas a lower-level class is called a subclass.
Review Methods
A review is a discussion meeting held at the end of each process in order to avoid carrying the
existing problems over to the next process in system development.
• There should be 4 to 6 participants (if too many are present, there may be no consensus
reached).
• Documents should be distributed in advance. (Problems should be listed in advance.)
• The purpose is to find errors. (Measures to eliminate them are to be discussed later.)
• The meeting should be limited to 1 to 2 hours. (If a longer meeting is necessary, have it on
another day.)
• Management should not attend the meeting. (This could lead to personnel evaluation.)
Type            Functions
Design review   This is for each design process (external, internal, programming) of system
                development. This is for evaluation of various design documents and validation of
                interfaces, etc.
Walk-through    This is for all processes of system development. In early phases, not only the
                development personnel but also end users participate in it.
Inspection      This is for all processes of system development. It is to be performed
                systematically under the direction of a moderator.³¹ Problems pointed out should
                be made known to the entire project. Inspection conducted in the programming
                phase is specifically called code inspection.³²
30
(FAQ) The meaning of the term “review” and the difference between walk-through and inspection are often asked in
exam questions.
31
Moderator: A moderator is a manager who is trained to conduct reviews and can handle errors detected. The moderator
selects reviewers called inspectors who have the ability and expertise to assess the deliverables of each process.
32
Code inspection: Code inspection specifically refers to inspection of source programs. In code inspection, the source
programs are checked and validated on a line-by-line basis.
Growth Curve
In the testing phase, the relationship between the accumulated number of detected errors and
time (period) is said to be similar to the growth curve. Characteristics of the growth curve are
as shown below. The growth curve is sometimes called the S-shape curve.
[Figure: Growth curve. Vertical axis: accumulated number of errors; horizontal axis: period]
This growth curve indicates that errors are not easily detected in the beginning, that after that
the number of detected errors gradually increases, and that, ideally, few new errors are detected
in the end, so the accumulated number levels off.³³ ³⁴
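The S shape can be reproduced numerically. The sketch below assumes a logistic curve, which is one common S-shaped growth curve; the text does not name a specific formula, and the parameter values are hypothetical.

```python
import math

def logistic(t, total_errors, rate, midpoint):
    """Accumulated number of detected errors at time t (logistic curve, assumed)."""
    return total_errors / (1 + math.exp(-rate * (t - midpoint)))

# Hypothetical parameters: 100 errors in total, steepest detection at week 10.
for week in range(0, 21, 5):
    print(week, round(logistic(week, 100, 0.5, 10), 1))
```

The printed values rise slowly at first, fastest around the midpoint, and then flatten out near the total, which is the pattern expected when testing proceeds as planned.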
Error-Planting Model
The error-planting model is also called the error-spreading model or the bug-embedding
model. In this model, errors are intentionally placed in the program. Then the proportion of
the planted errors that are detected is applied, by proportional distribution, to the other errors detected
to obtain an estimate for the total number of errors in the program. Today, this model has been
improved so as not to spread errors into the program. Rather, two independent testing groups
perform testing on the same program, and the number of errors detected by each group is used
to estimate the total number of errors.
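Both estimates can be sketched in a few lines. This is a minimal Python sketch under stated assumptions: planted and original errors are assumed equally easy to detect, and the two-group estimate uses the number of errors found by both groups, a detail the text does not spell out. The function names and figures are hypothetical.

```python
def estimate_by_planting(planted, planted_found, real_found):
    """Estimate the total number of original errors.

    Assumes planted_found / planted == real_found / total_real.
    """
    return real_found * planted / planted_found

def estimate_by_two_groups(found_by_a, found_by_b, found_by_both):
    """Estimate the total number of errors from two independent test groups
    (assumed formula: errors found by A times errors found by B, divided by
    the errors found by both)."""
    return found_by_a * found_by_b / found_by_both

# Hypothetical figures.
print(estimate_by_planting(planted=20, planted_found=16, real_found=40))       # 50.0
print(estimate_by_two_groups(found_by_a=30, found_by_b=25, found_by_both=15))  # 50.0
```

In the first case 16 of 20 planted errors (80%) were found, so the 40 original errors found are taken to be 80% of the total, giving 50.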
33
(FAQ) There are exam questions to select the correct growth curve. For instance, there may be several graphs in the
answer group, and the question may say, “Which of the following curves shows that the testing is performed as planned?”
Know the characteristics of the S-shape curve.
34
(Hints & Tips)
[Figure: Accumulated number of errors versus period, showing Graph “A” and Graph “B” against the standard curve]
Graph “A” shows that the number of errors is larger than the standard bug curve. We can infer that the test data is so good
that many errors are detected. However, we can also infer that errors are found early because the quality of the software is
poor. On Graph “B,” the accumulated number of errors does not stabilize, so we can infer that the software quality must be
very poor.
Quality characteristic   Definition
Functionality            Functions and purposes match up.
Reliability              Specified functions work under specified conditions, and recovery from a failure is easy.
Usability                The purpose and function of use are clear, and the operation is easy.
Efficiency               The execution time is fast, using the resources effectively.
Maintainability          Changes and repairs are easy.
Portability              It can easily be moved to another environment.
Quiz
Q1 List programming languages that are each classified as “procedural,” “functional,”
“logic,” and “object-oriented.”
Q2 Explain “reentrant.”
Q4 List three typical process models and explain the characteristics of each.
A2 This term refers to the program structure where a program can be used by multiple
tasks at the same time; it is one type of “reusable” program structure, which allows
multiple programs (tasks) to share the use of one program without having to load the
program into the main memory repeatedly.
A4 Waterfall model: Each phase of the development process proceeds from upstream to
downstream without going back.
Prototyping model: A prototype of the user interface is developed to clarify the
requirements.
Spiral model: Subsystems are developed independently.
A5 Both inspection and walk-through are methods of conducting a review and are similar
to each other; however, they differ in the method of operation and follow-up processes:
Inspection is carried out under the direction of a moderator who has been trained to
conduct reviews. Errors found in inspection are to be made known to the entire project.
The most typical method of system development is the waterfall model. Here, we follow the
phases of the waterfall model and organize the contents of activities at each phase.
External design refers to designing the system without taking the computer into account. The
results of activities are summarized in the external design specifications for review.35
35
(Hints & Tips) On the IT Engineer Exams, this is called “external design,” but some reference books may call it
“functional design,” “overall design,” or “outline design.” The contents of activities are essentially the same.
• Consider the source of data, amount of data, number of items and digits, attributes, etc.
• Place input items so that they can flow from top to bottom and left to right.
• Try to standardize the screen layout and operability.
• Keep message representations consistent.
• Consider the possibilities of aborting an operation midway or re-starting from the previous
screen.
• Set up sequence and positions in consideration of the relationship among the items.
• Ensure that the title appropriately expresses the printed contents.
• Clearly distinguish various dates such as the date prepared, date reported, and date approved.
• Position the items with appropriate spacing.
• Plan for the entire report to have sufficient empty space.
• Consider the design so that critical items can be immediately identified.
Code Design
Codes need to have the functions listed in the table below. When designing, we must consider
various properties such as commonality, systematization, expandability, and clarity.37
Function         Explanation
Identification   A function to distinguish data. Codes can distinguish between two people with the same first and family names.
Classification   A grouping function; classifying by affiliation code, etc.
Listing          A sorting function; if the digits are aligned, data can be sorted by date of birth.
Checking         A function to check input values; for instance, by adding a check digit.³⁸
36
(FAQ) The following type of questions has been frequently asked on the exams: “Which of the following is an
appropriate description on points to remember in screen design and form design?” Read the descriptions carefully to
answer them.
37
(Note) Examples of codes include consecutive codes and digit-specific codes. Consecutive codes are consecutive
numbers assigned to data listed in order from the beginning. These are useful when the number of data is fixed. In
digit-specific codes, the data is classified into large classes, middle classes, and small classes with certain standards in a
hierarchical structure, and each group has consecutive codes. Digits can be lengthy, but they are suitable for computer
processing. The zip code system is an example of this type.
38
Check digit: It is a 1-digit code obtained by performing a certain calculation, defined in advance, on each digit of a
numerical item. When a code is entered, the same calculation is performed to obtain the 1-digit number, which is then
compared with the check digit. If they are the same, the code is considered valid; otherwise, it is considered invalid. It is a
method for detecting errors.
Internal design refers to designing the system taking into account the computer that is planned
to be used. The results of activities are summarized in the internal design report for review.
Functional partitioning, structuring Identifying the functions and grouping them by processing contents
▼
File design Specifically deciding the file medium, organization, layout, etc.
▼
Input/output detailed design Deciding input/output medium, method, and check method
▼
Preparation of internal design specs Summary
File Design
The organization and medium of files need to be decided according to the purpose of their use.
For backup, various media are used, including magneto optical disks (MO), floppy disks (FD),
CD-R, CD-RW, DVD-R, DVD-RAM, DAT, etc.39
According to the purpose of their use, the organization method of files is selected from options
including sequential, direct, indexed sequential, and partitioned organizations.40 If a large
number of records exist and most of them are to be read and updated, sequential organization
is suitable. For random processing, direct organization is suitable.
39
(Hints & Tips) An appropriate medium is chosen based on what is being backed up. The approximate capacity of each
medium is as follows:
MO: 128, 320, 540, 640 MB/ 1.3 GB
FD: 1.2/ 1.44 MB
CD-R: 640/700 MB
CD-RW: 700 MB
DVD-R: 4.7/ 8.5 GB
DVD+R: 4.7/8.5 GB
DVD-RAM: 4.7 GB
DVD-RW: 4.7 GB
DVD+RW: 4.7 GB
DAT: 24 GB maximum
40
(Hints & Tips) Partitioned organization is hardly ever used in a data file. It is almost always used in library files.
Check Methods
Among the methods used for checking input data, some of the typical methods are organized in
the table below:
Code (XXXXXY) = Base code (XXXXX) + Check digit (Y)
The check digit method often uses modulus 11. In modulus 11, weights 2, 3, … are assigned to
each digit of the base code from the lowest digit. The product of the weight and the numerical
value for each digit is then calculated; if a resulting product is a 2-digit number, its digits are
separated. The sum of the products is then found. Finally, this sum is divided by 11, and the
remainder is the check digit. Below is an example when the base code is “12345.”
Since the division is by 11, this method is called modulus 11. Now, dividing a number by 11 may
cause a remainder of 10. In this case, the check digit is defined as 0. Besides this, there is also a
method called modulus 10.41
41
Modulus 10: The idea of calculating the check digit by modulus 10 is exactly like the idea of modulus 11, except that
the weighted sum is divided by 10, not 11. Modulus 11 and modulus 10 are both considered able to detect most input
errors.
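The check digit calculation described above can be sketched in Python. This sketch follows the steps in the text; treating "the digits are separated" as adding the two digits of a product individually is an assumption, so it is exposed as the option `split_products`. The function names are hypothetical.

```python
def check_digit(base_code, modulus=11, split_products=True):
    """Compute a check digit following the steps in the text.

    Weights 2, 3, ... are assigned from the lowest (rightmost) digit.
    If split_products is True, a 2-digit product contributes its digits
    separately (e.g. 12 -> 1 + 2); this reading of the text is an assumption.
    """
    total = 0
    for weight, char in enumerate(reversed(base_code), start=2):
        product = weight * int(char)
        if split_products:
            total += sum(int(d) for d in str(product))
        else:
            total += product
    remainder = total % modulus
    return 0 if remainder == 10 else remainder   # a remainder of 10 becomes 0

def is_valid(code, **options):
    """Check a full code whose last character is the check digit."""
    return check_digit(code[:-1], **options) == int(code[-1])

print(check_digit("12345"))                        # 3 (digit sum 14; 14 mod 11)
print(check_digit("12345", split_products=False))  # 6 (sum 50; 50 mod 11)
print(is_valid("123453"))                          # True
print(is_valid("123543"))                          # False (two digits transposed)
```

For the base code "12345" the products are 5×2=10, 4×3=12, 3×4=12, 2×5=10, and 1×6=6. Passing `modulus=10` gives the modulus 10 variant with the same weighting.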
When software is modeled in terms of processes and data separately, structured design is the
process-oriented technique. Alternatively, there is another technique in which design activities
proceed, focusing attention on data structure. Structured design techniques include bubble
charts, STS partitioning method, and TR partitioning method. Techniques focusing attention on
data structure include the Jackson method and the Warnier method.42
The structured design is the technique of designing based on the approach of structuring. The
structure chart is used for this method. In this method, the design activities proceed from general
overall ideas to specific details, so it is sometimes called “stepwise refinement” (top-down
approach or module partitioning43).
Bubble charts
These charts use bubbles (circles) to represent processes that convert input data into output data
and are the same as DFDs. For every system, there is only one initial bubble, but as break-down
(structuring) continues, bubbles and data flow become more complex. Below is an example of a
sales management system broken down.
[Figure: Bubble chart of a sales management system. Labels: Order information, Order allocation, Allocation information]
Like DFD, a bubble chart is used to partition the functions of the system. It is also used to partition
the functions of a program to construct a hierarchy.
42
(FAQ) Along with structured design techniques and the Jackson method, exam questions covering an overview of the
software methods have frequently appeared; these include questions like “Which of the following is an example of …?”
and the answer group often lists terms.
43
Module partitioning: It is a process whereby the program functions are discussed, divided into functional units, and put
into a function hierarchy. Each functional unit to which the functions are partitioned is called a module. A program thus
consists of modules.
In STS partitioning, the program structure is divided up into a source (input), a transform (processes),
and a sink (output), and each of these is defined as one function. Once STS partitioning is done,
other techniques such as DFD and bubble chart are used.
Control module
In STS partitioning, after dividing into three modules, we add a control module to control these
modules. This method is used when writing programs for batch processing.
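As a rough illustration of STS partitioning with a control module, the sketch below splits a tiny batch job into source, transform, and sink functions. The function names and the job itself (totaling numeric records) are assumptions made up for this example.

```python
def source(raw_lines):
    """Source (input): read the records and convert them to numbers."""
    return [int(line) for line in raw_lines if line.strip()]


def transform(values):
    """Transform (process): the central processing, here a simple total."""
    return sum(values)


def sink(total):
    """Sink (output): format the result for output."""
    return f"TOTAL={total}"


def control(raw_lines):
    """Control module: calls the three partitioned modules in order."""
    values = source(raw_lines)
    total = transform(values)
    return sink(total)


print(control(["10", " ", "20", "5"]))
```

Each of the three modules can be tested and replaced on its own, because only the control module knows the order in which they run.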
In TR partitioning, the transactions are partitioned by function with respect to the branching flow of
data, and they are formed into modules. The example below deals with a part of a payroll
program. If the function of “updating files” consists of three functions, namely updating the base
salary, updating allowances (stipends), and updating deductions, then each function constitutes a
module:
[Figure: the module “Updating files” with three subordinate modules, one for each update function]
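The payroll example can be sketched as a dispatch on the transaction type, with one module per transaction. The transaction codes ("BASE", "ALLOW", "DEDUCT"), the record layout, and the function names are hypothetical and serve only to show the shape of TR partitioning.

```python
def update_base_salary(record, amount):
    """Module for base-salary transactions."""
    return {**record, "base_salary": amount}


def update_allowances(record, amount):
    """Module for allowance (stipend) transactions."""
    return {**record, "allowances": amount}


def update_deductions(record, amount):
    """Module for deduction transactions."""
    return {**record, "deductions": amount}


# Hypothetical transaction codes mapped to the module that handles each one.
HANDLERS = {
    "BASE": update_base_salary,
    "ALLOW": update_allowances,
    "DEDUCT": update_deductions,
}


def update_files(record, transaction):
    """'Updating files': branch to the module that matches the transaction."""
    kind, amount = transaction
    handler = HANDLERS.get(kind)
    if handler is None:
        raise ValueError(f"unknown transaction type: {kind}")
    return handler(record, amount)


employee = {"base_salary": 1000, "allowances": 0, "deductions": 0}
print(update_files(employee, ("ALLOW", 50)))
```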
44
Maximum abstract point: It is a concept in STS partitioning, where the program is partitioned into three modules. The
boundary between the source and the transform is called the maximum abstract input point, and the boundary between the
transform and the sink is called the maximum abstract output point. The former is the point where the input data is
maximally abstracted, at which the data has been transformed so far that it ceases to be input data. The latter is the
point where the output data is maximally abstracted, at which (going backward from the output) the data is first
recognized in the form of output data.
The table below shows symbols used in the Jackson method and their meanings.
Symbol: A with the child elements B° and C°
Name: Selection
Meaning: Either B or C is selected by A (one at a time).
45
The Warnier method: It is a design technique that is based on data structure, like the Jackson method. It is
characterized by drawing the so-called Warnier diagram, similar to a flowchart.
In structured design, the validity of the finally partitioned modules is evaluated based on
various evaluation criteria such as structure and independence.
Structural Evaluation
For module partitioning, the following characteristics need to be considered:
• Size: Are they too small or too big? (Criteria should be set.) The proper size differs
depending on the language used (about 300 steps for COBOL).
• Function: Are there unnecessary functions? Are there multiple functions? (If there are
multiple functions, partition them again.)
• Interface: Are there too many parameters? (In such a case, review the partitioning.)
Evaluation of Independence
To evaluate the independence of modules, there are two measures—module strength and module
coupling. Partitioning is considered good if its modules have a high level of independence. The
weaker the module coupling46 is and the stronger the module strength47 is, the more independent
the modules are.
Types of strength          Strength   Independence   Coupling   Types of coupling
Coincidental strength      Weak       Low            Strong     Content coupling
Logical strength                                                Common coupling
Classical strength                                              External coupling
Procedural strength                                             Control coupling
Communicational strength                                        Stamp coupling
Informational strength
Functional strength        Strong     High           Weak       Data coupling
46
Module coupling: It is a measure of how closely modules are related to one another; the weaker the module coupling is,
the more independent they are.
47
Module strength: It is a measure of how closely the component elements within a module are related to one another;
the stronger the module strength is, the more independent the modules are.
Module strength has the seven levels as shown in the following table.
Module strength: Contents
Coincidental strength: The program is simply divided or duplicate functions are eliminated. There are
no special relationships among the functions within the module.
Logical strength: The module has multiple related functions and chooses the processing based on
parameter (argument) conditions.
Classical strength: The module unifies various modules executed at a designated time and executes
multiple functions sequentially. An initial-setup module is an example.
Procedural strength: The module executes multiple serial functions; the relationship within the
module is close, and the various functions cannot be executed independently.
Communicational strength: The module executes multiple serial functions just as in procedural
strength, but data is transferred between functions.
Informational strength: The module unifies multiple functions that handle the same data structure, has
an input point and an output point for each function, and can call each function separately.
Functional strength: The module consists of one function only, and all the instructions are to execute
the one function and therefore are closely related.
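To make the difference between two of these levels concrete, the sketch below contrasts a module of logical strength with modules of functional strength. The function names and the string-editing task are assumptions chosen for illustration.

```python
def edit_name(mode: str, name: str) -> str:
    """Logical strength: several related functions live in one module,
    and the argument 'mode' chooses which processing is performed."""
    if mode == "trim":
        return name.strip()
    if mode == "upper":
        return name.upper()
    raise ValueError(f"unknown mode: {mode}")


def trim_name(name: str) -> str:
    """Functional strength: the module performs exactly one function."""
    return name.strip()


def uppercase_name(name: str) -> str:
    """Functional strength: again one function, callable on its own."""
    return name.upper()


print(edit_name("trim", "  Ann  "))
print(trim_name("  Ann  "))
```

In the first version the caller must pass a flag that steers the internal logic of the called module, which is the situation usually classified as control coupling; the second version passes only the data that is needed.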
3.2.5 Programming
Programming requires structured logic. By structuring the logic, the level of complexity can be
reduced and programs can be written so that they will be easy to understand. This leads to higher
productivity and improved maintainability in development.
Structure Theorem
“In a valid program, which has one entrance and one exit, no infinite loop, and no statement that is
never executed, the logic can be written using only three basic control structures:
sequence, selection (decision), and repetition (iteration).” This is called the Structure Theorem. It
is very important to keep “structuring” in mind when writing a program. The basic principle is to
write a “goto-less program” (a program that does not use “goto” statements).
48
(FAQ) There are many exam questions on the independence of modules. Be sure to correctly understand the contents of
module strength and module coupling.
Name: Explanation
Sequence: The program consists of statements executed sequentially, without “goto”
statements or logical decisions.
Selection (“if-then-else” type): The function to be executed depends on whether or not a certain
condition holds.
Multi-branch (“case” type): Multiple branches are designated depending on the value of a variable or a
processing result.
Do-while loop (Pre-test loop): The condition is determined at the beginning of a repeated process
(repeated while the condition holds); depending on the condition, the
repeated process may not occur at all.
Repeat-until loop (Post-test loop): The condition is determined at the end of a repeated process (the
loop is terminated if the condition holds); regardless of the condition, the repeated
process occurs at least once.
Fixed count loop: The process is repeated a fixed number of times, which is determined on entry into
the loop.
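The three kinds of repetition can be compared in a short Python sketch. Python has no repeat-until statement, so the post-test loop is emulated here with `while True` and `break`; the function names are made up for this example.

```python
def sum_pre_test(values):
    """Pre-test (do-while) loop: the condition is checked before each pass,
    so the body may run zero times."""
    total, i = 0, 0
    while i < len(values):
        total += values[i]
        i += 1
    return total


def count_post_test(n):
    """Post-test (repeat-until) loop: the body runs first and the exit
    condition is checked afterwards, so the body runs at least once."""
    runs = 0
    while True:
        runs += 1
        n -= 1
        if n <= 0:  # "until n <= 0"
            break
    return runs


def fixed_count(times):
    """Fixed count loop: the number of passes is known on entry."""
    passes = []
    for i in range(times):
        passes.append(i)
    return passes


print(sum_pre_test([]))      # body never executes
print(count_post_test(0))    # body still executes once
print(fixed_count(3))
```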
The following is a set of flowcharts showing the basic control structures detailed above.50
[Figure: flowcharts of sequence, selection (Yes/No), multi-branch (cases 1, 2, and “Miscellaneous”), pre-test and post-test repetition in both decision notation and loop notation, and the fixed count loop (number of times repeated)]
49
(Hints & Tips) The three basic control structures are “sequence,” “selection,” and “repetition.” “Sequence” and
“selection” are as shown in the table, but “repetition,” when first proposed, meant do-while loop.
50
(Hints & Tips) The loop notation shows the condition to end the repetition. In other words, it shows the condition under
which the repetition is terminated. However, in a do-while loop, the repeated process is executed as long as the condition
holds, so we must be careful in denoting the condition using the loop notation and the decision notation.
Method: Explanation
Pseudo-coding (Pseudo language): Pseudo-code is similar to program code, but allows the use of natural
language (e.g., English) for abstraction of functions.
Decision table: Relations between the conditions for and the contents of the processing are
expressed in the form of a table.
NS chart: The logical structure is expressed without using arrows. As a visual aid, this is
easy to read.
Structure chart: A tree structure is used to express the logic.51
In system development, the design work is performed top-down, but testing is performed
bottom-up. In other words, we take the approach of starting with details and moving toward the
whole. This is called stepwise integration. In integration tests, in order to perform the tests
efficiently, we must carefully choose the order in which the modules are tested.
Order of Tests
Tests are conducted in the following order:
51
Structure chart: It is also known as a tree-structure chart. To achieve structured programming, various types of
structure charts have been proposed by developers and research institutes. Some well-known examples are PAD (Hitachi),
SPD (NEC), YAC (Fujitsu), and HCP (NTT).
Unit test
A unit test is a quality test for modules (smallest units within a program). In a test of the entire
program it can be difficult to identify the cause of an error, so unit tests are performed for each
module as a unit. In unit tests, we perform function tests52 and structure tests.53 Since modules do
not function by themselves, we prepare drivers and stubs.
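A minimal sketch of a driver and a stub around a module under test is shown below. The payroll-style module, the fixed tax value, and all names are assumptions introduced only for illustration.

```python
def compute_net_pay(gross, tax_lookup):
    """Module under test. 'tax_lookup' stands for a lower-level module
    that is not yet available."""
    return gross - tax_lookup(gross)


def tax_lookup_stub(gross):
    """Stub: stands in for the lower-level module and simply returns
    a fixed value."""
    return 100


def driver():
    """Driver: stands in for the upper-level module. It calls the module
    under test with prepared data and checks the result."""
    result = compute_net_pay(1000, tax_lookup_stub)
    return result == 900


print("unit test passed" if driver() else "unit test failed")
```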
Integration test
With multiple modules linked together, we test these linked programs (load modules) in integration
tests. The main goal here is to examine the interface between modules as well as the input and
output.
Methods for integration tests include top-down tests, bottom-up tests, big-bang tests,55 and
sandwich tests.56
System test
This is the last test conducted in the development division and is used to examine whether the
required specifications are satisfied. For instance, we address questions such as “Are there any
performance problems (performance test)?” and “Can it endure heavy loads (load test)?” We also
test exceptional items and measures to be taken when a failure occurs.
52
Function test: It is a validation test, based on module specifications, to verify that all functions that the module is
supposed to have are satisfied.
53
Structure test: It is a validation test, based on module specifications and source program, to verify that the logic of the
module is sound.
54
(Hints & Tips) A driver is a program that simulates the functions of an upper-level module, and a stub is a program that
simulates the functions of a lower-level module. In general, a stub simply returns a value and is therefore easy to write,
whereas a driver controls calls and is therefore often complicated.
55
Big-bang test: It is a test wherein all the modules that have completed the unit tests are linked all at once and tested. If
the program is small-scale, this could reduce the number of testing procedures; however, if an error occurs, it is difficult to
identify where the error has occurred.
56
Sandwich test: It is a test where lower-level modules are tested bottom-up and higher-level modules are tested
top-down. This is the most realistic type of testing.
Type: Characteristics
Top-down test
• Tests proceed from upper-level modules to lower-level modules.
• Requires stubs to simulate lower-level modules not yet tested.
• Interfaces between modules can be sufficiently tested.
• Initially, parallel work is difficult.
• Effective in testing newly developed systems.
Bottom-up test
• Tests proceed from lower-level modules to upper-level modules.
• Requires drivers to simulate upper-level modules not yet tested.
• Functions of the program can be sufficiently tested.
• Parallel work is possible from the initial stages of the test.
• Effective in developing new systems by modifying existing systems.
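[Figure: module hierarchy with Module 1 at the top and Modules 2-1, 2-2, and 2-3 below it; the top-down test proceeds downward and the bottom-up test proceeds upward]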
The purpose of testing a program is to verify that the program runs according to the specifications
and to eliminate errors embedded in the program. To this end, sometimes error data is intentionally
entered. There are two test techniques that are proposed: black box tests and white box tests.
Name: Explanation
Equivalence partitioning: The range of input values is partitioned into several classes, and a test value
is picked from each class as a representative value (e.g., the median value of the class).
Boundary value analysis: The range of input values is partitioned into several classes, and the boundary
values (limit values) for the classes are picked as test values.
57
(FAQ) Frequently we see exam questions such as “Which of the following is an appropriate description of a top-down
test?” and “Which of the following is an appropriate description of a bottom-up test?” Be sure you understand the
difference between the methods of top-down and bottom-up tests as well as the roles of stubs and drivers.
For example, suppose that in a numerical (integer) item, 0 through 30 are valid data values and
other integers are erroneous, prompting an error message to be displayed. Further, suppose that
values 26 through 30 prompt a warning message to be displayed as warning data values. Here,
under equivalence partitioning, for instance, the set of test data values (-5, 15, 27, 50) may be
selected. Under boundary value analysis, the set of the boundary values of the classes (-1, 0, 25,
26, 30, 31) is selected.
[Figure: number line divided into an invalid equivalence class58 (below 0), a valid equivalence class (0 through 30), and an invalid equivalence class (above 30)]
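The example above can be checked with a small sketch. The function `classify` is a hypothetical stand-in for the program under test; the two lists are the test data sets named in the text.

```python
def classify(value: int) -> str:
    """Hypothetical program under test: 0 through 30 is valid,
    26 through 30 additionally triggers a warning, all else is an error."""
    if value < 0 or value > 30:
        return "error"
    if value >= 26:
        return "warning"
    return "valid"


# Equivalence partitioning: one representative value per class.
EQUIVALENCE_TESTS = [-5, 15, 27, 50]

# Boundary value analysis: the values on both sides of every class boundary.
BOUNDARY_TESTS = [-1, 0, 25, 26, 30, 31]

for value in EQUIVALENCE_TESTS + BOUNDARY_TESTS:
    print(value, classify(value))
```

Boundary values are worth testing separately because a mistake such as writing `>` instead of `>=` changes the result only at the boundary itself.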
For instance, suppose there is a program with a structure shown in the figure below. Data prepared
for each of the test criteria is as follows:
In instruction coverage, data going through the path “a, b, d” are prepared since this path goes
through every instruction. In other words, only the data following the “Yes” case in the
“condition” is prepared. In decision condition coverage, data for the “Yes” and “No” cases, i.e.,
data going through both “a, b, d” and “a, c, d” is prepared.
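[Figure: flowchart in which path a leads to a condition; the “Yes” branch b passes through an instruction, the “No” branch c bypasses it, and both branches join at d]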
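The difference between the two criteria can be sketched by recording what a set of test inputs exercises. The program below is a hypothetical stand-in with a single condition and one instruction on the “Yes” branch, mirroring the structure described above.

```python
def run(condition, executed, outcomes):
    """Hypothetical program: one decision, with an instruction only on
    the 'Yes' branch. Records what was exercised."""
    outcomes.add(condition)          # which way the decision went
    if condition:
        executed.add("instruction")  # the only instruction in the program


def coverage(test_inputs):
    """Return (instruction coverage met, decision coverage met)."""
    executed, outcomes = set(), set()
    for condition in test_inputs:
        run(condition, executed, outcomes)
    instruction_covered = executed == {"instruction"}
    decision_covered = outcomes == {True, False}
    return instruction_covered, decision_covered


print(coverage([True]))          # only the "Yes" case
print(coverage([True, False]))   # both the "Yes" and "No" cases
```

With only the “Yes” case every instruction is executed, yet the “No” outcome of the decision is never taken, which is why the second criterion needs both paths.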
58
Valid equivalence class/ Invalid equivalence class: In a black box test, a range of correct data values is called a valid
equivalence class, and a range of erroneous data values is called an invalid equivalence class.
59
(FAQ) There are exam questions in which you are to prepare test data for equivalence partitioning and boundary value
analysis. Understand fully what these terms mean, and make sure that you are able to prepare test data.
60
(FAQ) Every exam has questions on the meanings of black box tests and white box tests. Be sure to know these.
The others, i.e., condition coverage, decision/condition coverage, and multiple condition coverage
are techniques used for multiple conditions. For instance, suppose that we consider the multiple
conditions “a and b” here, where number (1) is the case in which both a and b are true, numbers (2)
and (3) are the cases in which exactly one of them is true, and number (4) is the case in which both
are false.
In condition coverage,61 as combinations having true and false cases in the multiple conditions,
numbers (2) and (3) are tested. However, the multiple conditions “a and b” are false in both of
these numbers, so the case in which both are true is not tested. In decision/condition coverage,
the case in which both are true is included, so numbers (1), (2), and (3) are all tested. In
multiple condition coverage, every combination, i.e., (1), (2), (3), and (4), is tested.
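The three criteria can be checked mechanically for the condition “a and b”. In the sketch below, the assignment of (2) to “a true, b false” and (3) to “a false, b true” is an assumption, since only the roles of (1) and (4) follow from the text; the function names are likewise made up for this example.

```python
# Case number -> (a, b). The order of (2) and (3) is assumed.
CASES = {
    1: (True, True),
    2: (True, False),
    3: (False, True),
    4: (False, False),
}


def condition_coverage(case_ids):
    """Each individual condition takes both the value true and false."""
    a_values = {CASES[i][0] for i in case_ids}
    b_values = {CASES[i][1] for i in case_ids}
    return a_values == {True, False} and b_values == {True, False}


def decision_coverage(case_ids):
    """The whole decision 'a and b' evaluates to both true and false."""
    results = {CASES[i][0] and CASES[i][1] for i in case_ids}
    return results == {True, False}


def multiple_condition_coverage(case_ids):
    """Every combination of the individual conditions is tested."""
    return {CASES[i] for i in case_ids} == set(CASES.values())


for ids in ([2, 3], [1, 2, 3], [1, 2, 3, 4]):
    print(ids, condition_coverage(ids), decision_coverage(ids),
          multiple_condition_coverage(ids))
```

Decision/condition coverage corresponds to a test set for which both `condition_coverage` and `decision_coverage` hold.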
61
(FAQ) Concerning instruction coverage and decision condition coverage, exam questions ask for specific test data. It is
best to actually prepare test data and check.
Quiz
Q1 List the types of tasks done in external design.
Q3 To increase the level of independence of modules, what should one do with module
strength and module coupling? Also, name the type of strength and coupling referred
to here.
Q4 Describe briefly each of the following checking methods: “numeric check,” “format
check,” “limit check,” “range check,” and “sequence check.”
A1
• Verification of the requirement analysis (Checking the requirements of the user)
• Definition and development of subsystems (Dividing the system into functional units)
• Screen design, form design (Designing input format, output format)
• Code design (Structures of employee code, product code, etc.)
• Logic data design (Identifying data; deciding on data structure and items)
• Preparation of external design specifications
A2
• Functional partitioning, structuring (Identifying the functions and grouping them by
processing contents)
• File design (Specifically deciding the file medium, organization, layout, etc.)
• Input/output detailed design (Deciding input/output medium, method, and check method)
• Preparation of internal design specifications
A3
Increase the module strength (ideally to functional strength) and weaken the module coupling
(ideally to data coupling).
A4
Numeric check: Check if the numerical item is really a number
Format check: Check if the data is in the right format and that the digits are correctly
aligned
Limit check: Check if the value is within the upper and lower bounds
Range check: Check if the value is within the correct range
Sequence check: Check if the key items are listed in sequence
A5
It is the method whereby test cases are designed based on the external specifications of the
program. Regardless of the program logic, test data is prepared based on the external
specifications.
A6
It is the method whereby test cases are designed based on the internal specifications of the
program. There are several test criteria: listed in ascending order of rigor (strictness), they are
instruction coverage, decision condition coverage (branch coverage), condition coverage (branch
condition coverage), decision/condition coverage, and multiple condition coverage.
Q1. Which of the following is an appropriate statement concerning the optimization of a compiler?
a) It generates intermediate code for the interpreter instead of generating object code.
b) It generates object code that runs on a machine different from the computer on which the
compiler runs.
c) It generates object code that displays the name of the routine to which the control is passed
or the content of a variable at a certain point in time when the program is executed.
d) It analyzes program code and generates object code so that the processing can become
more efficient during execution.
Answer 1
Correct Answer: d
The optimization of a compiler means eliminating the redundancy of the object program.
a) Optimization means the elimination of redundancy; it is the compiler that generates intermediate
code. Hence, this statement is not an explanation of optimization.
d) Optimization increases the processing efficiency during execution through various means including
removing unnecessary parentheses and pre-calculating operations involving only constants.
Q2. Which of the following is an appropriate statement with regard to the method for defining an
element, which is a minimum unit for constructing an XML document?
a) A start tag and an end tag are paired up for the construction, and neither tag can be omitted.
b) A data element is constructed so that it can be placed between a start tag and an end tag.
In some cases, however, no data exists.
c) In an XML document, multiple root elements can be defined to represent a hierarchical
structure.
d) Comment information is added to represent the type of element. This is identified as the
element name.
Answer 2
Correct Answer: b
a) It is correct that the data is surrounded by a start tag and an end tag, but when there is no
data, a special empty-element tag of the form <element name/>, distinct from the start tag, can be
used instead. For instance, this may be written as <img src="filename"/>. Hence, a start tag and an
end tag are not always paired.
b) In principle, the data is surrounded by a start tag and an end tag. If there is no data, that is
acceptable.
c) In XML, all elements are in nested structure. An element may directly contain other multiple
elements, but there is no element that is directly contained in multiple elements. Here, the one that
includes others is called a “parent,” and that which is included is called a “child.” Consequently,
an XML document has exactly one root element.
[Figure: nested element structure of an XML document]
d) Comment information is not contained in the data and is practically ignored. Comments are
surrounded by “<!--” and “-->.”
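The points made for options a) through d) can be tried out with Python's standard `xml.etree.ElementTree` module. The element names and file name in the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as a single XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One root element containing an ordinary element and an empty element.
doc = '<catalog><item>pen</item><img src="pen.png"/></catalog>'
root = ET.fromstring(doc)
item = root.find("item")
img = root.find("img")

print(item.text)          # data between the start tag and the end tag
print(img.text)           # an empty element carries no data
print(img.get("src"))     # but it can still carry attributes

# Two root elements do not form a well-formed document.
print(is_well_formed("<a/><b/>"))

# A comment is not treated as element data by the default parser.
commented = ET.fromstring("<a><!-- note --></a>")
print(len(commented))
```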
Q3. Which of the following statements describes the characteristic of the waterfall model, which
guarantees the consistency of system development?
Answer 3
Correct Answer: d
The waterfall model is a process model in which system development proceeds from upstream phase
to downstream phase in sequence: “basic planning → external design → internal design → program
design → programming → testing → installation, operation, maintenance.” Since the flow of the
development process is divided for each phase, it is easy to grasp an overview of the entire project.
Project management is also considered easier because the work flows from upstream to downstream
sequentially. However, since there is basically no going back, it has the disadvantage that the
development efficiency drops if the process requires regression.
In the waterfall model, a review is conducted at the end of each phase so that a bug is not carried on to
the next phase. If there is a bug discovered in a downstream phase (phase after programming), the cost
required for system modification (cost for regression) is extremely high. Therefore, bugs must be
discovered in an upstream phase (a design phase between basic planning and program design).
a) This is one of the characteristics of the waterfall model, but this simply describes how the process
proceeds; it does not guarantee the consistency of system development.
b) This is one of the characteristics of the waterfall model, but it describes the classification of the
contents of activities; this does not guarantee the consistency of system development.
c) A project team is organized for system development in general, not just in the waterfall model.
d) The design phase of the waterfall model is stepwise refinement. Contents of the previous phase are
carried over to the next phase; this guarantees the consistency of system development.
a) Since activities proceed sequentially through basic planning, external design, internal
design, program design, programming, and testing, it is possible to get a good overview of
the entire project and it is easy to determine the schedule and allocate resources.
b) Since a trial model is created at an early stage of system development, it is possible to
eliminate vagueness and differences of perception between the user department and the
developing department.
c) The characteristics of the software are divided into those for which the specifications are
fixed and do not require changing and those for which the specifications require changing.
Then, the process of creating, reviewing, and changing the code according to those
specifications is repeated.
d) A large application is divided into highly independent components; then processes of
design, coding, and testing are repeated for them, gradually expanding the scope of
development of the program.
Answer 4
Correct Answer: b
Prototyping is a method in which a prototype (trial model) is made for the parts directly visible to the
system user (screen, form, etc.) and systems are developed based on the feedback obtained from users
who have tested the prototype. Hence, the statement b) is appropriate.
a) explains the waterfall model, and d) explains the spiral model.
Answer 5
Correct Answer: a
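An E-R diagram shows the relationships between entities (actual objects), indicating each entity with
a box and the corresponding relations between entities with connecting lines or arrows whose form
distinguishes 1-to-1, 1-to-many, and many-to-many relations. The following is an example of an E-R
diagram:
[Figure: example entity box listing the entity name, attribute 1, attribute 2, and so on, with a legend for the 1-to-1, 1-to-many, and many-to-many relation symbols]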
An underlined attribute is a primary key attribute. An entity type is an entity having data subject to
management. Normally, the entities are expressed with nouns such as “client” and “product.”
“Instances” are entities with values.
Client
Client code   Client name       Client address      (Entity type)
1011          George Bush       Crawford, Texas     (Instance)
1021          William Clinton   Hope, Arkansas      (Instance)
Q6. The figure below shows a certain level in a hierarchical DFD. Which is the most
appropriate method of describing DFD of the level immediately below? Assume that
the processes in the level immediately below Process n are numbered processes of the
form n-1, n-2, etc.
[Figure: four candidate lower-level DFDs. Options a), b), and d) each contain Processes 1-1, 1-2, and 1-3; option c) contains Processes 1-1, 1-2, 2-1, and 2-2.]
Answer 6
Correct Answer: b
In DFD, the processes (circles) get broken down in order. Hence, when they are broken down, multiple
processes on a higher level cannot be merged together on a lower level.
In DFD given in this question, note that Process 1 has two input dataflow arrows as well as two output
dataflow arrows.
a) While Process 1 has two input dataflow arrows, each of its child processes has only one input
dataflow arrow.
b) Child Process 1-1 has one input dataflow arrow, and so does Child Process 1-2. The total is 2. As
for output dataflow, there is one arrow from Child Process 1-1 and another from Child Process 1-3,
a total of 2. Hence, this may be a break-down of DFD in the question.
c) Since DFD breaks down one process, combinations like {(1-1), (1-2)} and {(2-1), (2-2)} are
acceptable whereas a combination like {(1-1), (1-2), (2-1), (2-2)}, in which multiple processes on
an upper level are combined, is not.
d) The numbers of input dataflow arrows and output dataflow arrows are correct, but every process
must have at least one input dataflow arrow and at least one output dataflow arrow. There is no
input dataflow arrow for Process 1-2.
Q7. The following table gives the number of items by category and the weighting factor for
user functions of an application program. Information is based on the function point
method. How many function points does this application program have? Here, the
correction coefficient of complexity is 0.75.
User function type        Number of items   Weighting factor
External input            1                 4
External output           2                 5
Internal logical file     1                 10
External interface file   0                 7
External inquiry          0                 4
a) 18 b) 24 c) 30 d) 32
Answer 7
Correct Answer: a
In the function point method, the number of function points is obtained as follows:
- Multiply the number of functions (number of items) by the corresponding weighting factor
- Find the sum of these products
- Multiply the sum by the complexity (correction coefficient of complexity) to obtain the answer
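Applying these three steps to the table in the question gives (1 × 4 + 2 × 5 + 1 × 10 + 0 × 7 + 0 × 4) × 0.75 = 24 × 0.75 = 18. The same arithmetic as a short sketch, with variable and function names chosen only for this example:

```python
# (user function type, number of items, weighting factor) from the question
USER_FUNCTIONS = [
    ("External input", 1, 4),
    ("External output", 2, 5),
    ("Internal logical file", 1, 10),
    ("External interface file", 0, 7),
    ("External inquiry", 0, 4),
]


def function_points(functions, correction_coefficient):
    """Sum of (items x weight), multiplied by the correction coefficient."""
    unadjusted = sum(count * weight for _, count, weight in functions)
    return unadjusted * correction_coefficient


print(function_points(USER_FUNCTIONS, 0.75))
```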
Q8. Which of the following is the preferred procedure for improving reliability and
maintainability in software module design?
Answer 8
Correct Answer: b
In module design, increasing the independence of modules should be considered in order to improve
their reliability and maintainability. If modules are highly independent, they are unaffected by other
modules, thereby enhancing their reliability. Furthermore, their maintainability can be improved
because a modification made on one module does not affect the others.
Evaluation criteria to measure the independence of modules include module strength and module
coupling.
Module strength measures the level (strength, height) of relations within each module. The stronger
the relations within modules are, the more independent the modules are. Module coupling measures
the level (strength, height) of relations between modules. The smaller (weaker) the relations between
modules are, the more independent the modules are.
Q9. Which of the following is an appropriate statement concerning the white box test?
a) Tests are performed sequentially combining modules from the lower level to the
higher level.
b) Tests are performed sequentially combining modules from the higher level to the
lower level.
c) Tests are performed while paying attention to the internal structure of the module.
d) Tests are performed to check whether or not functions work according to the
specifications, regardless of the internal structure of the modules.
Answer 9
Correct Answer: c
The white box test is a method which focuses on the control flow of the program, prepares the test data
going through critical paths of the program, and performs the test. Since the internal structure and the
logic of the program are carefully examined, we can test detailed functions from the standpoint of the
programmer. However, functions that are in the specifications but are not implemented in the program
cannot be detected, because no test data is prepared for them.
Chapter Objectives
Today, many types of network, such as LANs, WANs,
and the Internet, are appearing. In this chapter, we will
learn the basic technology concerning information
communication networks. In Section 1, we will learn by
focusing on protocols. By setting protocols, different
types of computer can communicate with one another.
In Section 2, we will study specific communication
technologies, including how data is sent and received,
etc. In Section 3, we will learn the structures and usage
of a variety of networks including LANs and the
Internet.
Point
• The OSI basic reference model and TCP/IP are the typical protocols.
• TCP/IP is used on the Internet.
1
Protocol: It is a set of rules (conventions) for communication. A protocol stipulates the types, semantics, expression
formats, and exchange procedures of control messages for communication. Typical protocols include TCP/IP and OSI.
Observing a common protocol makes it possible to communicate between different types of computer.
2
(FAQ) The roles of each layer of the OSI basic reference model are almost always on the exams. In particular, the functions
of the network layer, transport layer, and session layer often appear on the exams.
The following figure shows the correspondence between the OSI basic reference model and
TCP/IP.3
OSI basic reference model   TCP/IP protocols               TCP/IP layer
Application layer           Telnet, FTP, SMTP, POP, etc.   Application layer
Presentation layer          (same as above)                Application layer
Session layer               (same as above)                Application layer
Transport layer             TCP                            Transport layer
Network layer               IP                             Internet layer
Data link layer             LAN (Ethernet, etc.)           Network interface layer
Physical layer              (same as above)                Network interface layer
3
(FAQ) The correspondence between TCP/IP and the OSI basic reference model has frequently appeared on past exams.
Know that TCP corresponds to the transport layer while IP corresponds to the network layer.
IP Addresses
An IP address is a 32-bit network address used on the Internet and can be classified into several
classes according to the network size. Each class is identified by the leading bit pattern of 1 to 3
bits. The network part is unique in the world, and the host part can be systematically defined by
each network separately. Below is a schematic figure of how an IP address is structured. Class
A has a leading bit of “0,” Class B has two leading bits of “10,” and Class C has three leading
bits of “110.”
Class     Leading bits   Network part   Host part   Use
Class A   0              7 bits         24 bits     Applied to large networks
Class B   10             14 bits        16 bits     Applied to medium-size networks
Class C   110            21 bits        8 bits      Applied to small networks
(Each address is 32 bits in total.)
Because a 32-bit address must identify every computer on the Internet, the number of usable IP
addresses has become insufficient. Hence, 128-bit IP addresses, defined in IPv6, are
now in use to a certain extent.
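The class of an address follows directly from the leading bits of its first octet, as the sketch below shows. The function names and the sample addresses are chosen only for illustration, and addresses outside Classes A to C are simply reported as "other".

```python
def ip_class(address: str) -> str:
    """Classify a dotted-decimal IPv4 address by its leading bits."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a valid IPv4 address: {address}")
    first = octets[0]
    if first >> 7 == 0b0:      # leading bit 0
        return "A"
    if first >> 6 == 0b10:     # leading bits 10
        return "B"
    if first >> 5 == 0b110:    # leading bits 110
        return "C"
    return "other"


# Host-part length in bits for each class, as in the table above.
HOST_BITS = {"A": 24, "B": 16, "C": 8}


def host_part(address: str) -> int:
    """Return the host part of a Class A, B, or C address as an integer."""
    bits = HOST_BITS[ip_class(address)]
    value = int.from_bytes(bytes(int(p) for p in address.split(".")), "big")
    return value & ((1 << bits) - 1)   # keep only the low-order host bits


for sample in ("10.1.2.3", "172.16.0.1", "192.168.1.1"):
    print(sample, ip_class(sample), host_part(sample))
```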
Transmission control refers to control used to transmit data between communication devices
via a transmission line. Specifically, it includes line control, synchronization control, error
control, and data link control.
Establishing a data link means to establish a communication line and to identify the other party
(transmission destination). Mutual communication becomes possible only after establishing a
data link.
Typical procedures include the basic procedure (BSC) and the HDLC procedure.
4
FTP: File Transfer Protocol
5
SMTP: Simple Mail Transfer Protocol
6
POP: Post Office Protocol
7
Telnet: It is a virtual terminal protocol for a computer at a remote location.
To control transmission rights, various methods are used, including the contention method and
the polling/selecting method. Data is transmitted in block units while transmission and
reception are being verified.
Contention method
The contention method works as follows: between two computers connected point-to-point,10
one wishing to transmit data sends a transmission request. When a positive response is received
from the other party, transmission privilege is given, and data transfer begins.
Polling/selecting method
The polling/selecting method is used in a multi-drop system.11 A host surveys (polls) each
terminal in sequence to see if the terminal requests transmission. If so, the terminal is given
transmission privilege, and data is received by the host. The host then asks the terminal if
reception is possible. If the terminal gives a positive answer (or the terminal is selected), the
data is sent.
8
(FAQ) The basic procedure is also known as BSC (Binary Synchronous Communication). Many exam questions involve
the meaning of polling and selecting in the basic procedure. Know the meanings of these terms well.
9
Synchronization: It is required to match the timing of sending and receiving of signals when data is transmitted and
received between communication units
10
Point-to-point: It is a two-point system, or direct connection system. Two or more terminals are connected to a computer,
and each terminal has a dedicated line.
11
Multi-drop system: Multiple terminals are connected to a single line. A “control station” manages data communication
with all terminals, and this station controls all sub-stations (terminals) centrally.
Frame
F A C I FCS F
01111110 8 bits 8 bits arbitrary 16 bits 01111110
F Flag sequence: A bit string showing the beginning and the end of a frame
A Address field: Address of the transmission destination
C Control field: Various control information
I Information field: Data transmitted
FCS Frame check sequence: Check bit by the CRC method12 using A through I
12
CRC (Cyclic Redundancy Check): It is a code used to detect an error in one block of data
13
(Note) In HDLC, the bit “0” is inserted whenever there are at least five consecutive 1's. In so doing, it ensures that no bit
pattern is identical to the flag sequence. For instance, if a data sequence is “01111110,” the bit “0” is inserted so the sequence
becomes “011111010.”
14
(FAQ) There are exam questions concerning the roles of each field of HDLC and characteristics of HDLC. Be sure to
know that HDLC is bit-oriented (anything can be sent).
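The zero-bit insertion described in the note on HDLC above can be sketched as follows. This is an illustrative Python sketch, not taken from any HDLC implementation; the function names stuff, unstuff, and frame are assumptions made here, and bits are represented as strings of "0" and "1" for readability.

```python
FLAG = "01111110"  # flag sequence marking the beginning and the end of a frame


def stuff(bits):
    """Insert a 0 after every run of five consecutive 1s."""
    out = []
    ones = 0
    for b in bits:
        out.append(b)
        if b == "1":
            ones += 1
            if ones == 5:
                out.append("0")  # inserted bit
                ones = 0
        else:
            ones = 0
    return "".join(out)


def unstuff(bits):
    """Remove the bit that follows every run of five consecutive 1s.

    Assumes the input was produced by stuff(), so that bit is always 0.
    """
    out = []
    ones = 0
    skip_next = False
    for b in bits:
        if skip_next:
            skip_next = False
            ones = 0
            continue
        out.append(b)
        if b == "1":
            ones += 1
            if ones == 5:
                skip_next = True
        else:
            ones = 0
    return "".join(out)


def frame(contents):
    """Surround the stuffed contents (A through FCS) with flag sequences."""
    return FLAG + stuff(contents) + FLAG


if __name__ == "__main__":
    print(stuff("01111110"))  # the example from the note above
```

Because the stuffed contents never contain six consecutive 1s, the flag pattern cannot appear between the two flags, which is why any bit pattern can be sent.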
Quiz
Q1 Show the correspondence between the OSI basic reference model and TCP/IP.
A1
Application layer
Presentation layer
Session layer
Transport layer TCP
Network layer IP
Data link layer
Physical layer
A2
• Bit-oriented (possible to transmit an arbitrary bit pattern)
• Continuous transfer (possible to transmit up to a certain number of frames without
waiting for a response)
• Strict error check (using CRC)
• Full duplex (full duplex communication possible even in multi-drop lines)
Transmission technology is used to transmit data at high speed, efficiency, and quality. More
specifically, it includes technology in error control, synchronization control, and duplexing.
Error control refers to improving the quality of data transmission through detecting errors in
data transmission and, in some cases, correcting errors. Typical checking methods include
parity check and CRC.
…………
15
LRC/VRC: Parity check applied to each string of bits in the same horizontal position of each character (horizontal parity
check) is called LRC (Longitudinal Redundancy Check); parity check applied to each character in the vertical direction
(vertical parity check) is called VRC (Vertical Redundancy Check).
Below is a figure where the bits in the shaded area are erroneous in odd parity. Normally, the
number of 1's should be odd, but here it is even, indicating that there is an error.16
Before correction:
0 0 0 0 1 1 1 1 1
1 1 1 1 0 0 0 0 1
1 0 1 0 0 0 1 0 1 ← The number of 1's is even.
1 1 1 0 0 0 0 0 0
0 0 0 1 1 1 0 0 0
0 1 0 1 0 1 1 0
↑ The number of 1's in the fifth column is even.
After correction:
0 0 0 0 1 1 1 1 1
1 1 1 1 0 0 0 0 1
1 0 1 0 1 0 1 0 1 ← The number of 1's is odd.
1 1 1 0 0 0 0 0 0
0 0 0 1 1 1 0 0 0
0 1 0 1 0 1 1 0
↑ The number of 1's in the fifth column is odd.
If 2 bits are erroneous, as shown below, the number of 1's is even while it should be odd, both
in horizontal and vertical parities. However, there are two possible combinations of errors,
making it impossible to correct them.
First possible combination of errors:
0 0 0 0 1 1 1 1 1
1 1 1 1 0 0 0 0 1
1 0 1 0 0 0 1 0 1 ← The number of 1's is even.
1 1 1 0 0 0 0 0 0
1 0 0 1 1 1 0 0 0 ← The number of 1's is even.
0 1 0 1 0 1 1 0
↑ The number of 1's is even in the first column and in the fifth column.
Second possible combination of errors:
0 0 0 0 1 1 1 1 1
1 1 1 1 0 0 0 0 1
0 0 1 0 1 0 1 0 1 ← The number of 1's is even.
1 1 1 0 0 0 0 0 0
0 0 0 1 0 1 0 0 0 ← The number of 1's is even.
0 1 0 1 0 1 1 0
↑ The number of 1's is even in the first column and in the fifth column.
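The way horizontal and vertical parity together locate a single erroneous bit, and fail to do so for two erroneous bits, can be sketched in Python as below. This is only an illustrative sketch using odd parity as in the figures above; the function names are assumptions made here, and an error in a parity bit itself is deliberately not handled.

```python
def failed_checks(rows, column_parity):
    """Return (rows, columns) whose count of 1s is even, i.e. violates odd parity.

    Each entry of rows is a list of data bits followed by its own parity bit;
    column_parity holds one parity bit per data column.
    """
    bad_rows = [r for r, row in enumerate(rows) if sum(row) % 2 == 0]
    bad_cols = [
        c for c, parity in enumerate(column_parity)
        if (sum(row[c] for row in rows) + parity) % 2 == 0
    ]
    return bad_rows, bad_cols


def correct_single_error(rows, column_parity):
    """Flip the bit at the one failing row/column pair and return its position.

    Returns None when no check fails, and raises ValueError when the failures
    do not point at exactly one data bit (for example, with two erroneous bits).
    Positions are counted from 0.
    """
    bad_rows, bad_cols = failed_checks(rows, column_parity)
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("cannot locate the error uniquely")
    r, c = bad_rows[0], bad_cols[0]
    rows[r][c] ^= 1  # invert the erroneous bit
    return r, c


if __name__ == "__main__":
    received = [
        [0, 0, 0, 0, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 0, 0, 0, 0, 1],
        [1, 0, 1, 0, 0, 0, 1, 0, 1],  # one bit in this row is wrong
        [1, 1, 1, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 1, 1, 1, 0, 0, 0],
    ]
    print(correct_single_error(received, [0, 1, 0, 1, 0, 1, 1, 0]))
```

With two erroneous bits, two rows and two columns fail, so two different pairs of bit positions are equally plausible and the sketch refuses to guess.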
A code in which check bits are added to the information bits for error detection is called a Hamming code.17
16
(FAQ) Questions concerning parity check do appear on the exams, like “Which bit column has erroneous data if odd parity
is used?” These are very easy questions since all we have to do is to count the number of 1's.
17
Hamming code: A code in which check bits are added to the information bits is generally called a Hamming code. Not
only can it detect errors, but it can also correct them. Parity check is a specific example of Hamming code check.
18
ITU-T (International Telecommunications Union-Telecommunications Standardization Sector): As one sector of the
ITU, this organization considers technology, operation, and fees concerning telecommunications, prepares standards, and
issues the standards as recommendations.
To send and receive data correctly, the sender and the receiver adjust the timing of
transmission; this is referred to as synchronization. The computers or terminals of the sender
and the receiver must perform synchronization according to the data contents.
While no data is being sent, the line is kept in the condition of “1,” identical to the stop bit. When the start bit “0” is
received, the bits that follow are read at a fixed cycle. For this reason, the cycle must be determined
between the sender and the receiver in advance.
No communication
19
(Hints & Tips) Bit synchronization is sometimes called the asynchronous method or start-stop synchronization. As a
means of synchronization, this method uses the so-called “asynchronous” method, which does NOT mean “not
synchronizing.” Be careful not to misinterpret this term.
SYN SYN SYN
← Transmission direction
← Transmission direction22
20
(Note) Character synchronization is also called the continuous synchronization method or the SYN synchronization
method. Since the SYN code is formed with 8 bits, the same number as for a character, the data following the SYN is
received in units of 8 bits. This system is used in mid- to high-speed terminals. This method is the synchronization method
used in the basic procedure.
21
(Note) Block synchronization is also called flag synchronization or frame synchronization. In HDLC, the bit pattern
“01111110” is used as the flag sequence.
22
(FAQ) Questions concerning bit synchronization are frequently seen on the exams. Remember that the first bit is “0” and
the last bit is “1” for each character. Further, there have been exam questions that give the number of bytes (number of
characters) of data as well as the line speed and ask you how many seconds it takes for the data to be transmitted. In bit
synchronization, a start bit and a stop bit are added to each character, so remember that each character takes 10 bits.
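The transmission-time calculation mentioned in the FAQ above can be sketched as follows. This is a minimal Python sketch; the function name and the sample figures (1,200 characters on a 2,400 bps line) are hypothetical and chosen only to show the arithmetic.

```python
def transfer_seconds(num_chars, line_bps, data_bits=8, start_bits=1, stop_bits=1):
    """Time needed to send num_chars characters with start-stop synchronization.

    Each character costs start_bits + data_bits + stop_bits on the line,
    which is 10 bits with the default values.
    """
    bits_per_char = start_bits + data_bits + stop_bits
    return num_chars * bits_per_char / line_bps


if __name__ == "__main__":
    # Hypothetical exam-style figures: 1,200 characters on a 2,400 bps line.
    print(transfer_seconds(1200, 2400))
```

Counting only the 8 data bits per character would underestimate the time by 20 percent, which is the usual trap in such questions.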
Multiplexing refers to carrying the communications of multiple computers over one transmission
line simultaneously. Communication costs can be reduced by dividing one high-speed line
into multiple low-speed channels. There are three transmission methods: simplex,
half-duplex, and full-duplex, depending on the direction of data flow.
Multiplexing Methods
There are two types of multiplexing: FDM and TDM.
FDM is the method of multiplexing with a frequency division multiplexer 23 to divide the
transmission frequency bandwidth of an analog line into multiple small bands and to use each
channel as an independent communication channel. For example, a line with a bandwidth of 48
kHz may be partitioned into 12 channels, each of which has a bandwidth of 4 kHz, so that they can
be used as 12 telephone lines. Each of the divided channels can then be used for either analog
transmission or digital transmission. In digital mobile phones and digital television broadcasting,
digital transmission is performed in communication channels resulting from frequency partition.
TDM is the scheme of dividing one digital line into multiple low-speed channels. For instance,
if a line whose speed is 64 Kbps is connected to 16 terminals, then each terminal has a speed of
up to 4 Kbps.
In TDM, one digital line is partitioned by time, and transmission and reception alternate (get
switched) in time intervals of certain length. This switching unit is called a TDM (Time
Division Multiplexer).24
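The two examples above (a 48 kHz band split into 12 channels, and a 64 Kbps line shared by 16 terminals) amount to simple divisions, and the time-slot switching of TDM can be pictured as taking one unit of data from each terminal in turn. The following Python sketch illustrates this; the function names are assumptions made here, and the slot model is a simplification rather than a description of a real multiplexer.

```python
def fdm_channel_bandwidth(total_khz, channels):
    """Bandwidth of each channel when a band is divided evenly by frequency."""
    return total_khz / channels


def tdm_channel_speed(line_kbps, terminals):
    """Maximum speed per terminal when a digital line is divided evenly by time."""
    return line_kbps / terminals


def tdm_interleave(streams):
    """Place one unit from each terminal on the line in turn, slot by slot.

    streams holds one sequence of data units per terminal; all sequences are
    assumed to have the same length.
    """
    slots = []
    for units in zip(*streams):
        slots.extend(units)
    return slots


if __name__ == "__main__":
    print(fdm_channel_bandwidth(48, 12))  # kHz per channel
    print(tdm_channel_speed(64, 16))      # Kbps per terminal
    print(tdm_interleave(["aaa", "bbb", "ccc", "ddd"]))
```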
23
Frequency Division Multiplexer (FDM): It is a multiplexing unit used for frequency division multiplexing.
24
Time Division Multiplexer (TDM): It is a unit used to partition one digital transmission line by time so that the line can
be used as multiple communication channels.
[Figure: TDM. Terminals (1) to (4) are connected by terminal lines, and their data is carried on one high-speed line to the processing unit in the repeating time-slot order (1), (2), (3), (4). Transmission direction →]
For instance, if a channel with a transmission speed of 2.5Gbps per wavelength is multiplexed into
4 channels, transmission at a total speed of 10Gbps can be achieved.26
Transmission Methods
Transmission can be classified into three methods by the way data flows; they are simplex,
half-duplex, and full-duplex. One transmission line consists of a pair of two communication
media; it is called the two-wire system. There is another system, called the four-wire system, in
which there are two pairs of communication lines (4 media): one pair for sending, and the other
for receiving. In general, the four-wire system is used for full-duplex while the two-wire system is
used for half-duplex.27
25
DWDM: The DWDM (Dense WDM) technology is an area of current research; it is a way to achieve even
higher-capacity data transmission by increasing the number of wavelengths of WDM or narrowing the gaps between channels.
It is said that DWDM makes super-high-capacity data transmission possible, replacing Gbps with Tbps (terabits per second, where
one tera is 10^12).
26
(FAQ) There seem to be no new exam questions on FDM and TDM available, as they have been used up in past exams.
Any question on TDM can be answered as long as you know that multiple logic channels can be used because of
time-partition of one line. Future exam questions will more than likely involve WDM.
27
(Note) Multiplexing enables a two-wire system to be used for full-duplex communication.
4.2.4 Switching
Point
• There are two types of switching: circuit switching and store-and-forward switching.
• There are two types of store-and-forward switching: packet switching and message switching.
The line to be used in communication differs depending on whether or not the party with whom
we are communicating is fixed. If the party is fixed, a dedicated circuit28 is used. If the party
changes, a switching circuit is used, as represented by the public telephone network.
Circuit Switching
Under circuit switching, the transmitter calls up the other party by dialing to set up a physical
circuit, as represented by the telephone service. This enables high-speed and high-quality data
transfer, but both parties are required to use the same speed and same transmission control system.
Store-and-Forward Switching
Under store-and-forward switching, the transmitted data is first stored in a switching unit, the
receiver is selected, and then the stored data is transferred to the next switching unit or to DTE.29
Although the transmission speed and quality are poorer than those of circuit switching systems, it
is not necessary that the transmitter and the receiver have the same speed nor that they use the
same transmission control system. It is suitable when the amount of data transmitted at one time is
small and when the communication traffic is light.
Among store-and-forward switching systems, there are message exchange systems, where storing
and switching occur in message units, and packet exchange systems, where messages are
partitioned into packets of a fixed size and transferred in packet units.
In message exchange, generally the message contents are transmitted without modification. For
instance, it is used for electronic mail on the Internet and for foreign exchange dealing systems
between banks.
28
Dedicated line: It is a communication line which is set up between communication points desired by the users and can be
used exclusively by these users. Generally the fees for dedicated lines are charged on a monthly basis, determined by the
communication distance and transmission speed. There are analog dedicated lines stipulated by frequency bands and digital
dedicated lines stipulated by data transmission speed.
29
DTE (Data Terminal Equipment): It is a unit that has the functions of a data transmitter or a data receiver or both and is
equipped with the data communication function. In general, these include computers and terminals that can be connected to
modems (modulator-demodulators).
In packet exchange, data is divided into packets30 of certain size (a block of data); then to each
packet, the forwarding address, data attributes, and error check codes are added before the packet
is transmitted onto the communication medium. Since the lines are not exclusive to any user
except when the data is actually being transmitted or received, the channels can be multiplexed,
and the lines can then be used efficiently.31
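[Figure: Packet switching. The data is decomposed into packets A, B, and C; each packet is stored and forwarded by the switching units of the digital packet network, and the packets are assembled again as A, B, C on the receiving side.]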
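The decomposing and assembling steps of packet exchange can be sketched as follows. This Python sketch is illustrative only: the packet layout (a dictionary with a destination, a sequence number, the data, and a check value) and the one-byte sum used as the error check code are simplifying assumptions made here, not the format of any real packet network.

```python
def decompose(message, size, destination):
    """Divide message (bytes) into packets of at most size bytes each."""
    packets = []
    for seq, start in enumerate(range(0, len(message), size)):
        payload = message[start:start + size]
        packets.append({
            "dest": destination,          # forwarding address
            "seq": seq,                   # position of this packet in the message
            "payload": payload,
            "check": sum(payload) % 256,  # stand-in for a real error check code
        })
    return packets


def assemble(packets):
    """Rebuild the message, whatever order the packets arrived in."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    for p in ordered:
        if sum(p["payload"]) % 256 != p["check"]:
            raise ValueError("error detected in packet %d" % p["seq"])
    return b"".join(p["payload"] for p in ordered)


if __name__ == "__main__":
    sent = decompose(b"HELLO, PACKET SWITCHING", 8, "terminal-B")
    arrived = list(reversed(sent))  # packets may arrive in a different order
    print(assemble(arrived))
```

Because every packet carries its own address and sequence number, the packets of several users can share the same line, which is the efficiency gain described above.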
Quiz
Q1 List the methods of synchronization control.
A1 Bit synchronization
Character synchronization
Block synchronization
A2 This is the method in which data is divided into packets and sent out onto the communication
medium.
30
Packet: In data communication, it is a block of data along with added control information such as the forwarding
address. By transmitting and receiving data by partitioning them into multiple packets, one prevents intermediate
communication lines between the two locations from being exclusively used, resulting in more efficient use of the
communication circuits. Further, since the route can be selected flexibly, when a part of one line fails, another route can be
used as a replacement.
31
(FAQ) Questions on packet exchange will appear on the exams. Know that communication is possible between different
computers and terminals with different speeds.
4.3 Networks
Introduction
4.3.1 LANs
LAN stands for “Local Area Network.” It is a network connecting various units that are spread
out over a relatively small area, such as within one building or site.
Topology of LAN
The word “topology” here refers to a connection configuration of a network. Typical topologies of
LAN include the star, ring, and bus networks.32
[Figure: Star, ring, and bus networks. In the star network, the terminals are connected to a control unit.]
32
(Note) Star network: Terminals are connected to the unit that controls communication.
Ring network: Terminals are connected to form a ring (circle).
Bus network: Terminals are connected to transmission routes called buses.
Token passing
In this method, control information called a token is circulated in a certain direction on LAN. The
computer that receives the token gets the transmission privilege, adds the destination address and
the data to the token, and sends them out. This is used for a ring-type or a bus-type LAN.33
The maximum length is the length of the cable between the two terminators in bus-type LAN; the
length of the ring in ring-type LAN; and the maximum transmission distance in star-type LAN.
The maximum length of FDDI is stated as 200km, but in ring-type LAN, sometimes the cables are
doubled up as a precaution against failures. In such a case, the maximum length will be 100km.
Wireless LAN
Wireless LAN uses transmission channels other than cables, such as radio waves and infrared rays.
Most of the cables can be eliminated, so it hardly takes any labor to install or move terminals.
However, there are limitations in speed and distance, and it may be affected by interference from
electro-magnetic noise generated by other devices. Other disadvantages include the high cost per
terminal.35
33
(Note) When the token-passing method is applied to a ring-type LAN, it is called the token ring method; if it is applied to
a bus-type LAN, it is called the token bus method. In the token passing method, it is necessary to decide the order in which
the token is circulated.
34
(Hints & Tips) Note that 10BASE-T, etc. is a star-type LAN. The device that plays the role of the control unit is called a
hub.
35
(Note) Specifications for wireless LAN, established by the IEEE802 Committee, include IEEE802.11a, IEEE802.11b, etc.
The term Internet means “a network of networks”; the Internet is a global-scale network that interconnects the
networks of organizations. TCP/IP is used as the protocol, and communication is based on IP addresses.
Intranets and extranets using Internet technologies have also been widely used.36 37
The WWW provides mechanisms such as hypertext mentioned above. To view the contents, we
need browsing software called a WWW browser, such as Internet Explorer and Firefox.
Internet Services
The Internet uses TCP/IP, so most of the services that can be used with TCP/IP are available on the
Internet. The main services are shown in the following table.
Name: Explanation
Telnet: Standard protocol for virtual terminals. Used to interact with remote computers.
FTP: File Transfer Protocol. Standard protocol for transferring files. Both text files and binary files can be transmitted.
Electronic mail: Function by which the user can send/receive messages to/from one or more people.
Transmission is possible even when the other party is not connected to a
computer. However, for sending and receiving messages, a mail address is
required. Protocols used for electronic mail include SMTP and POP3.38 39
36
(Note) On the Internet, to identify a network or a terminal, a 32-bit IP address (IPv4) is commonly used, but each of these
must be unique in the whole world. As the Internet gains popularity, running out of IP addresses is a real issue. Currently,
128-bit addresses called IPv6 are also in use.
37
One of the ways to address the insufficiency of IP addresses is DHCP (Dynamic Host Configuration Protocol). DHCP is a
protocol that automatically assigns IP addresses and necessary information to computers that are temporarily connected to the
Internet. When the communication is over, the IP address is automatically collected, and the same IP address is assigned to
another computer.
38
SMTP/POP3: SMTP is the protocol for sending e-mails, and POP3 is for receiving e-mails. POP3 is the latest version of
POP.
39
(FAQ) Frequently there are exam questions on SMTP and POP. Remember that SMTP is a protocol for transmitting
e-mails while POP is for receiving them.
Intranets
An intranet is an in-house (company-wide) network using Internet technologies. Normally,
between an in-house network and the Internet, a defense system called a firewall is installed to
prevent critical in-house information from leakage. With the popularity of the Internet and
introduction of user-friendly WWW browsers, it is now possible to construct systems such as
document sharing, electronic bulletin boards, and electronic mail systems at low costs.
[Figure: Intranets. Division networks, each with a WWW server, are connected to form the in-house network.]
Extranets
An extranet is a network in which intranets are extended between companies. In general, intranets
are connected to the Internet to construct an extranet.
[Figure: Extranet. The intranet of Company A and the intranet of Company B are connected through the Internet.]
40
URL (Uniform Resource Locator): This is the information that identifies the location of a homepage on the Web,
consisting of a protocol name, host name, file name, etc.
41
HTML (HyperText Markup Language): It is the means to write a document in the hypertext format. It uses reserved
words contained between “<” and “>” called tags to specify the text formatting, image file display position, link designation,
and script declaration. If this is opened using a WWW browser, the browser interprets and displays its contents. To specify
the address of a WWW server, URL (Uniform Resource Locator) is used.
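The parts of a URL named in the footnote above (protocol name, host name, file name) can be picked apart with the Python standard library, as in the following sketch. The function name url_parts and the sample URL are hypothetical; urlsplit itself is part of the standard urllib.parse module.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the parts named in the footnote above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,  # e.g. http
        "host": parts.hostname,    # name of the WWW server
        "file": parts.path,        # location of the file on that server
    }


if __name__ == "__main__":
    # Hypothetical URL used only for illustration.
    print(url_parts("http://www.example.com/docs/index.html"))
```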
42
(FAQ) The correspondence between the OSI basic reference model and connection units between LANs appears often on
the exams. Be sure to know that the router corresponds to the network layer, the bridge to the data link layer, and the repeater
to the physical layer.
43
Filtering: It is the function whereby the system, based on the transmitter's address, decides whether or not to accept the
packet (allow it through) and discards unnecessary packets. By the filtering function, extraneous packets are prevented from
entering LAN.
44
MAC address: It is a 48-bit (6-byte) device number assigned to a LAN card used when a terminal is connected to a
network. In principle, there are no two cards in the world with the same MAC address.
45
(Hints & Tips) A hub that relays packets with the protocol of the data link layer is called a switching hub. Meanwhile, a
hub that relays packets with the protocol of the physical layer is called a repeater hub.
46
Backbone LAN / Branch LAN: The backbone LAN refers to the transmission routes constituting the main section of the
network within an organization. Optical fiber cables are used as the transmission medium, so the communication is
high-speed and high-capacity. The backbone LAN plays the role of connecting two or more branch LANs. A branch LAN is a LAN set up for
a division or a department of the organization. It is mid- to small-scale and is used for communication between
workstations, PC communications, and file/printer sharing in a system spread out on the premises.
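[Figure: Connection through an analog line network: terminal, modem (modulation/demodulation), NCU, analog line network, NCU, modem (modulation/demodulation), CCU, center computer.]
[Figure: Connection through a digital network: computer, TA, DSU, digital network, DSU, TA, computer.]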
A unit called TA (Terminal Adapter) may be required between DSU and the terminal.47 TA allows
telephones, fax machines, and PCs, which have traditionally been used on analog lines, to operate
on ISDN lines. Most of the time, TAs are necessary.
47
(Hints & Tips) DSU is often installed inside TA and thus is not directly visible.
Name Explanation
Basic rate interface consisting of two B channels and one D channel (2B+D); maximum
144Kbps
Primary rate interface consisting of multiple B channels and one D channel (23B+D, 24B,
4H0, etc.); maximum 1,536Kbps 49
Having multiple channels means that the interface can be used as multiple lines. For instance, in the basic rate
interface, there are two B channels, so they can be used as two lines of 64Kbps each or as one line of
128Kbps. The details of each channel are shown below.
Name: Explanation; Speed
D channel: Signal channel for control information. Can be used as a B channel in packet switching; basic rate interface: 16Kbps, primary rate interface: 64Kbps50
B channel: User information channel; 64Kbps
H channel: User information channel exceeding 64Kbps; H0 (384Kbps), H11 (1,536Kbps), H12 (1,920Kbps)51 52
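The maximum speeds quoted for the two interfaces follow directly from the channel speeds above, as the following Python sketch shows. The function and constant names are chosen here for illustration.

```python
B_CHANNEL_KBPS = 64  # speed of one B channel


def interface_speed_kbps(b_channels, d_channel_kbps):
    """Total speed of an interface made of B channels plus one D channel."""
    return b_channels * B_CHANNEL_KBPS + d_channel_kbps


if __name__ == "__main__":
    print(interface_speed_kbps(2, 16))   # basic rate interface, 2B+D
    print(interface_speed_kbps(23, 64))  # primary rate interface, 23B+D
    # The H channel speeds are whole multiples of the B channel speed.
    print(384 // B_CHANNEL_KBPS, 1536 // B_CHANNEL_KBPS, 1920 // B_CHANNEL_KBPS)
```

The first two results reproduce the 144Kbps and 1,536Kbps figures given for the basic rate and primary rate interfaces.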
ADSL
ADSL (Asymmetric Digital Subscriber Line) is the technology for high-speed data transfer using
existing telephone lines. This can be used simply by connecting an ADSL modem to the
conventional equipment. The speeds are 0.5M to 1Mbps upstream and 1.5M to 40Mbps
downstream. The transmission speeds upstream and downstream differ in this “asymmetric”
digital subscriber line. It shows its power in downloading massive data such as video-on-demand
and Web pages containing video data.
Quiz
Q1 Explain the CSMA/CD method.
A1 The computer about to transmit data first checks that no data is being transmitted on the
transmission route and then sends the data. If data is being transmitted, the computer waits for a certain
amount of time and then tries again. If a collision is detected during transmission, the data is re-sent
after a waiting period. This method is used mainly in a bus-type LAN.
A2
Telnet: Standard protocol for virtual terminals
Used to interact with remote computers
FTP: File Transfer Protocol
Standard protocol for transferring files
Both text files and binary files can be transmitted.
Electronic mail: Function by which the user can send/receive messages to/from one or more
people
Transmission is possible even when the other party is not connected to a
computer. However, for sending and receiving messages, a mail address is
required.
Q1. Which of the following is an appropriate description concerning the network layer of the OSI
basic reference model?
a) The network layer performs the routing and relaying so that data can be transferred
between end-systems.
b) Among the various layers, the network layer is closest to the user and provides functions
such as file transfer and e-mail.
c) The network layer absorbs the differences in characteristics of physical communication
media and provides a transparent transmission route to upper-level layers.
d) The network layer provides a transmission control protocol (error check, re-transmission
control, etc.) between adjacent nodes.
Answer 1
Correct Answer: a
Process A Process B
Application layer Meaning contents Application layer
Presentation layer Expression contents Presentation layer
Session layer Dialogue Session layer
Transport layer Data transfer unit Transport layer
Network layer Data Network layer Data Network layer
Data link layer Frame Data link layer Frame Data link layer
Physical layer Electric signals Physical layer Electric signals Physical layer
Transmission medium Relay node Transmission medium
The network layer stipulates the method of selecting the communication route and the relay
method. It models a communication network; it selects the communication route between end
nodes, relays data, and sends it along. It is implemented by protocols such as the X.25 protocol
in switching systems such as packet switching and circuit switching. The function of the
network layer is, therefore, to select the route to the other computer.
Q2. Which of the following protocols is used to automatically set up the IP address that a PC uses
to connect to LAN at startup time?
Answer 2
Correct Answer: a
DHCP (Dynamic Host Configuration Protocol) is a protocol that automatically sets up network
parameters. When terminals (clients) start up, an IP address is dynamically assigned to each client, and
when the session ends, the assigned IP addresses are collected.
b) FTP (File Transfer Protocol) is a protocol for transferring files on a TCP/IP network.
c) PPP (Point-to-Point Protocol) is a protocol for WAN used for network connection, not
necessarily on TCP/IP. Generally, PPP is used for dial-up connection to the Internet; the user
does not need to obtain an IP address.
d) SMTP (Simple Mail Transfer Protocol) is a protocol for sending and receiving electronic
mails between mail servers on a TCP/IP network. This is also used when an electronic mail is
sent from a mail client (terminal) to the mail server.
Q3. The character “T” (ASCII code 1010100) was sent via a data transfer using start/stop
synchronization with even-parity error detection. If the character is received correctly, what is
the bit string that is received? Here, the bits are sent in the following order: start bit (0); the
character code, from the least significant bit to the most significant bit; parity bit; and stop bit
(1). The bits are written in the sequence in which they are received, starting from the left.
Answer 3
Correct Answer: b
Since the character length is 7 bits and one parity bit is added, a character is 8 bits long. The
start bit “0” is also added before the bit string for the character, and the stop bit “1” is added at
the end. Hence, altogether, the character will be 10 bits long. Since even parity is used, the
number of 1s in the 8 bits (for the character itself) will be even (possibly 0).
0 XXXXXXXX 1
Stop bit (value “1”)
character (even parity)
start bit (value “0”)
a) 0001010101
The number of 1s in this part is 3—odd parity.
b) 0001010111
The number of 1s in this part is 4—even parity.
c) 1001010110
The stop bit is “0.”
The start bit is “1.”
d) 1001010111
The start bit is a “1.”
Hence, the bit string, when correctly received, is 0001010111, which is (b).
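The construction of the received bit string can be verified with a short Python sketch. The function name start_stop_frame is an assumption made here; the bit order follows the statement of the question (start bit, character code from the least significant bit, parity bit, stop bit).

```python
def start_stop_frame(code, parity="even"):
    """Build the bit string for one character in start/stop synchronization.

    code is the character code written with the most significant bit first,
    for example "1010100" for the character T.
    """
    data = code[::-1]  # sent from the least significant bit
    ones = data.count("1")
    if parity == "even":
        parity_bit = "1" if ones % 2 == 1 else "0"
    else:
        parity_bit = "0" if ones % 2 == 1 else "1"
    return "0" + data + parity_bit + "1"  # start bit ... stop bit


if __name__ == "__main__":
    print(start_stop_frame("1010100"))
```

The character code contains three 1s, so the even-parity bit must be 1, which is what distinguishes option (b) from option (a).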
Q4. Audio is sampled 11,000 times per second, and sampled values are each recorded as 8-bit data.
In this system, how many seconds of audio can be recorded on a floppy disk whose capacity is
1.4 × 10^6 bytes?
Answer 4
Correct Answer: b
Since the audio is sampled 11,000 times per second, and each sampling produces 8 bits of data, the
amount of data transferred per second is as follows:
Number of bits transferred per second = 11,000 (times/sec) × 8 (bits/time)
= 88,000 (bits/sec)
The capacity of a floppy disk is stated to be 1.4 × 10^6 bytes, so to use consistent units, we convert
the number of bits of data transferred per second into bytes as follows:
Number of bytes transferred per second = 88,000 (bits/sec) / 8 (bits/byte)
= 11,000 (bytes/sec)
Since 11,000 bytes are transferred every second onto a floppy disk whose capacity is 1.4 × 10^6
bytes, the number of seconds of the audio data that can be recorded on this floppy disk is as
follows:
Amount that can be recorded on a floppy disk = (1.4 × 10^6) / 11,000
= (1.4 / 1.1) × 10^2
= 1.272727… × 10^2
= 127 (rounded to the nearest integer).
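The same calculation can be written as a short Python sketch, which makes it easy to try other sampling rates or capacities. The function name recordable_seconds is chosen here for illustration.

```python
def recordable_seconds(capacity_bytes, samples_per_sec, bits_per_sample):
    """Seconds of audio that fit when every sample is stored as is."""
    bytes_per_sec = samples_per_sec * bits_per_sample / 8
    return capacity_bytes / bytes_per_sec


if __name__ == "__main__":
    seconds = recordable_seconds(1.4e6, 11000, 8)
    print(round(seconds))  # rounded to the nearest integer, as in the answer above
```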
Q5. Three IP routers are connected by leased lines as shown in the figure below. Which of the
following statements appropriately describes the operation of router A in relaying a TCP/IP
packet from terminal A to terminal B?
[Figure: Terminal A is attached to router A, terminal B to router B, and terminal C to router C; the routers are connected by leased lines.]
Answer 5
Correct Answer: c
A router verifies the IP address of the addressee to which the received text (packet) is sent,
determines an appropriate route, and delivers it to the destination. In the data link layer of the OSI
basic reference model, data can be transferred only between adjacent nodes or on the same
segment, but a router sends packets to a designated router by relaying them through the network
layer.
a) If the access control methods of LANs are all identical, a bridge performs this function. A
router can connect a network with LAN whose access control method may be different,
and it only relays to a designated route.
b) The relay route of a packet is not fixed. Routes are determined based on those which are
set up in the routers or based on the information exchanged between routers, and the best
route is selected as the relay route.
d) It is a bridge that performs relays using MAC addresses.
Q6. Which of the following medium access control methods in LAN provides the function of
detecting a data frame collision on transmission media?
a) CSMA/CA b) CSMA/CD
c) Token-passing bus d) Token-passing ring
Answer 6
Correct Answer: b
A medium access control method is a method for sending frames (transmission units) on LAN. As
a rule, when a terminal transmits a frame, other terminals need to hold all transmission until the
frame reaches the destination. CSMA/CD (Carrier Sense Multiple Access with Collision
Detection) is a medium access control method for bus-type LAN and star-type LAN. The terminal
that wishes to transmit data checks to see if any communication established by other terminals is
being done on the transmission medium; the terminal then sends the data if there is no
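communication taking place. If there is communication taking place, the terminal waits for a
certain period of time and then attempts to re-send the data. If two terminals nevertheless start
transmitting at the same time, the collision of their frames is detected on the transmission medium,
and each terminal retransmits after waiting for a random period of time.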
a) CSMA/CA (Carrier Sense Multiple Access with Collision Avoidance) is a medium access
control method for mid-speed LAN whose transmission speed is 1Mbps to 2Mbps.
c) Token passing bus (token bus) is an application of token passing on a bus-type LAN.
Token passing is a medium access control method for ring-type LAN and bus-type LAN.
Transmission authorization data, called a token, is constantly going around LAN, and the
terminal that has obtained the token gets the authorization for data transmission. A
terminal wishing to transmit data gets the token and, in its place, releases the data it wishes
to send. Once the transmission is finished, the token is released to the network again.
d) Token passing ring (token ring) is an application of token passing on a ring-type LAN.
Q7. Which of the following is an appropriate description concerning the function of a proxy server
used on the Web?
a) A proxy server converts private IP addresses used on an intranet into global IP addresses,
and vice-versa.
b) A proxy server dynamically assigns an IP address to a client when the client connects to the
network.
c) When a client connected to an internal network communicates with an external server, a
proxy server acts as a relay and establishes connection to the server on behalf of the client.
d) A proxy server has a correspondence table of host names and IP addresses, and it notifies a
client of the IP address of a host when the client sends a query.
Answer 7
Correct Answer: c
A proxy server is a server set up to maintain security and achieve high-speed access when making
connection to the Internet from an internal network. It prevents unauthorized access into the
internal network, and it also relays and manages access from the internal network to the outside
Internet.
a) The function that converts private IP addresses to global IP addresses and vice-versa is a
function of IP masquerade or NAT (Network Address Translation). These functions are
normally supported on routers (gateways).
b) This is an explanation of DHCP (Dynamic Host Configuration Protocol).
d) This is an explanation of DNS (Domain Name System).
Chapter Objectives
A database is an organized set of data which is
accumulated collectively for purposes of data sharing,
integrated management, and a high level of
independence. Databases have several categories, but
presently, hierarchical database, network database,
and relational database serve as the major databases.
Among these, relational database is the mainstream
database today. In Section 1, we will learn about
databases from a theoretical viewpoint, discussing
their structures and development methods. In Section
2, we will learn how to make use of SQL, the
programming language used to manipulate relational
databases. In Section 3, we will mainly learn about
DBMS, the software for efficient use of the databases.
A data model is an expression summarizing data items related to one another under certain
rules. To create a database, one first produces a data model and normalizes it by eliminating
unnecessary information and duplicate items. Then, the database is specifically designed so that
search, update, and deletion of data can be performed efficiently. Types of data model include
conceptual data models, logical data models, and physical data models. The rules that
implement each of these are called conceptual schema, external schema, and internal schema.
Abstracting and organizing the structure of real-world information, which is the object to be
made into a database, and then expressing it, is called data modeling. A data model can be a
conceptual, logical, or physical data model. These are related as shown in the figure below.1
Object world
    ▼ (Abstracting)2
Conceptual data model: E-R model3
    ▼ (DBMS selection)
Logical data model: Relational model, network model, hierarchical model
    ▼ (Data manipulation)
Physical data model: Relational database, network database, hierarchical database
1
(Note) A data model is a conceptual expression (model) of data; it could also refer to the rules of expression.
2
Abstracting: It means extracting the most characteristic elements of the object and removing everything else. We can
create a database that can be shared if we, in creating the database model, first extract those elements common to all tasks
from among all the data subject to the tasks that need to be systematized.
3
(Hints & Tips) Conceptual models include E-R models discussed in Section 5.1.3. Logical data models have relational
models, network models, and hierarchical models. Further, if we take a logical data model and make a database specifically
from it, we can get a physical data model.
Independence of Data
The independence of data means that the “program does not get changed when data changes.”
Since multiple programs share the same set of data, it is not necessary to create as many data
sets as the number of individual programs. Hence, the data must be organized systematically.
One type of software to achieve the independence of data is a database management system
(DBMS), which keeps the data independent by using 3-layer schema.
3-layer Schema
A schema is a description of the framework of a database. In ANSI/X3/SPARC,4 schemata are
classified into conceptual schema, external schema, and internal schema. These are called
3-layer schema. The figure below shows their relations.
[Figure: 3-layer schema. External schema → Conceptual schema → Internal schema → Database]
In general, the user of a database utilizes the database through an external schema.5
Name                Explanation
External schema     Definition of the database seen from the program or the user. This uses a
                    part of the conceptual schema. In relational databases, this is called a
                    view; in network databases, this is called a subschema. This exists for
                    each program and user.
Conceptual schema   This is the data to be contained in the database, defined according to the
                    data model; a definition of the real data as a whole. It is called a table in
                    a relational database and a schema in a network database.
Internal schema     This is a definition to specifically achieve the conceptual schema for an
                    external storage. It consists of information such as the medium,
                    organization method, and buffer length.
Concerning relational database and network database, see Section 5.1.2. “Logical data models.”
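The relation between a table (conceptual schema) and a view (external schema) can be tried out with a small sketch. It assumes Python's standard sqlite3 module as a stand-in for a DBMS; the view name Phone_view and the added Department column are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Conceptual schema: the table defines the data as a whole.
conn.execute(
    "CREATE TABLE Employee_tbl ("
    " Employee_number TEXT PRIMARY KEY, Name TEXT, Tel_number TEXT)"
)
conn.execute(
    "INSERT INTO Employee_tbl VALUES ('00100', 'Paul Smith', '03-3456-0001')"
)

# External schema: a view exposing only what one program needs.
# Phone_view is a hypothetical name used for this sketch.
conn.execute(
    "CREATE VIEW Phone_view AS SELECT Name, Tel_number FROM Employee_tbl"
)
before = conn.execute("SELECT * FROM Phone_view").fetchall()

# The conceptual schema changes: a new column is added to the table.
conn.execute("ALTER TABLE Employee_tbl ADD COLUMN Department TEXT")

# A program that reads through the view sees the same result as before.
after = conn.execute("SELECT * FROM Phone_view").fetchall()
print(before == after)
```

Because the program reads only through the view, the change to the table does not force a change to the program, which is the independence of data in miniature.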
4
ANSI/X3/SPARC: The ANSI (American National Standards Institute) is a non-profit organization that establishes the
industrial standards of the United States. X3 is the committee within the ANSI which discusses the standards associated with
information processing. The SPARC (Standards Planning And Requirements Committee) is the committee that is involved
with international issues.
5
(FAQ) Many exam questions ask about the schema types and their characteristics. Know clearly the differences among
conceptual schema, external schema, and internal schema.
Logical data models contain relational model, network model, and hierarchical model.
These data models, when they are implemented, become relational databases, network
databases, or hierarchical databases.6
Network Database
A network database is different from a hierarchical database in that the parent records and
child records do not have 1-to-n (1:n) correspondences; rather, they are in many-to-many (m:n)
correspondence. In other words, a parent record may have multiple child records, and
conversely, a child record may have multiple parent records.7 A network database is sometimes
called a CODASYL database.8 The structure of a network database is as shown below.
Here, for example, “Susie” belongs to the “swimming club” only, but “Tommy” belongs to
both the “track & field club” and the “baseball club.”
6
(Note) Hierarchical databases and network databases together are sometimes called structure databases.
7
(Hints & Tips) A hierarchical database can be regarded as a network database in which each child has only one parent.
8
CODASYL database: A network database refers to any database based on the language specifications proposed by
CODASYL; hence, a network database is also called CODASYL database. CODASYL stands for the Conference On DAta
SYstems Languages. This organization consists of the United States government, computer manufacturers, and users. This is
the organization that has developed and is maintaining the business-oriented programming language COBOL. It developed
COBOL in 1960 and conducted research in database languages later.
Relational Database
A relational database is a database in which data is expressed in a two-dimensional table.
Each row of the table corresponds to a record, and each column is an item of the records. The
underlined columns indicate the primary key.9
Employee_tbl
Employee_number Name Tel_number
00100 Paul Smith 03-3456-0001 ← Row (pair, tuple, record)
00200 Rick Martin 03-3456-0011
00300 Billy Graham 03-3456-0010
00400 John Wilson 03-3456-0200
Each table is always named. In the above example, the name is “Employee_tbl.” The columns
are “Employee_number,” “Name,” and “Tel_number.” A row is a set of data like “00100, Paul
Smith, 03-3456-0001.” In other words, we can say that “Employee_tbl consists of 4 rows and 3
columns (4 by 3).”
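As a rough sketch of how a primary key behaves, the table above can be created with Python's standard sqlite3 module (an assumption made for illustration; the text does not prescribe a particular DBMS). The duplicate row inserted at the end is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_tbl ("
    " Employee_number TEXT PRIMARY KEY, Name TEXT, Tel_number TEXT)"
)
rows = [
    ("00100", "Paul Smith", "03-3456-0001"),
    ("00200", "Rick Martin", "03-3456-0011"),
    ("00300", "Billy Graham", "03-3456-0010"),
    ("00400", "John Wilson", "03-3456-0200"),
]
conn.executemany("INSERT INTO Employee_tbl VALUES (?, ?, ?)", rows)

# A second row with an existing Employee_number violates the primary key.
try:
    conn.execute(
        "INSERT INTO Employee_tbl VALUES ('00100', 'Another Person', '03-0000-0000')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Employee_tbl").fetchone()[0]
print(duplicate_rejected, row_count)
```

The table still holds 4 rows and 3 columns after the rejected insert.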
Bachman Diagram
A Bachman diagram describes the parent-child relation between records in a network database.
A parent is called an owner while a child is called a member. Below, terms like “enrollment”
and “component” describe the parent-child relations and are called parent-child set types. The
actual contents (values) in Bachman diagrams are called occurrences.10
[Figure: Bachman diagram with the parent-child set types “Enrollment” and “Component”]
9
Primary key: It is a column or a set of columns that uniquely identifies a row of the table. In the same table, primary key
values cannot be repeated. Here in “Employee_tbl,” “Employee_number” is the primary key. If the values of one column
are not unique, a combined (composite) key can be defined by combining multiple columns.
10
Occurrence: It is a specific value in a Bachman diagram. For example, if A, B, and C are three of the “students” in the
example here, A, B, and C are occurrences.
Hierarchical models, network models, and relational models are all data models with the
assumption that DBMS is used.11 However, data and information used in the real world are not
necessarily limited to those compatible with DBMS. One of the methods for expressing
real-world data structures as faithfully as possible is the E-R model. An E-R model is
expressed by using an E-R diagram.
An E-R diagram is a technique, used in designing files or databases, for expressing results
obtained by grasping the objects to be managed and data items. The objects of management and
analysis are referred to as entities, which are associated with one another by relationships. The
elements constituting entities and relationships are called attributes.
Correspondence Relations
In an E-R diagram, the 1-to-many relation “one company has multiple employees” is indicated
by the following diagram. Note that, as seen here, sometimes the attributes are omitted.12
[Figure: E-R diagram. Company → Employee, linked by the relationship “Employment”]
Here, “Company” and “Employee” are linked by the relation called “Employment.” The
relation between “Company” and “Employee” is 1-to-many (→), so for one company, there are
multiple (many) employees. If an employee is chosen, there is only one company related to
him/her, so the company is uniquely identified. However, choosing one company does not
uniquely identify an employee because there are multiple employees. Hence, the party that
can be uniquely identified is the “1” in the “1-to-many.”13
11
DBMS (DataBase Management System): It is software dedicated to the maintenance and operation of databases.
12
(FAQ) There are exam questions on how to interpret E-R diagrams. Be sure that you can identify 1-to-many, 1-to-1, and
many-to-many relations. The relation between employees and departments in a company with multiple employees, some of
whom may belong to multiple departments, is “many-to-many.”
13
(Hints & Tips) Be careful as it is easy to reverse “1-to-many” relations. If picking one data value can uniquely identify an
associated member, the identified member is the “1.” If unique identification is not possible, the other is “many.”
[Figure: E-R diagram. Company ↔ Shareholder, linked by the relationship “Share-holding”]
Here, the double arrow indicates a many-to-many relation between “Company” and
“Shareholder.” This suggests that a shareholder may hold shares of multiple companies and that
a company may have multiple shareholders. In other words, we can interpret this figure as
follows: “There are multiple companies, each of which has multiple shareholders.”
Normalization (data normalization) means to maintain the consistency and integrity of data by
eliminating redundant data. There are first normal forms, second normal forms, and third
normal forms. Normalization is a concept that is used only for relational databases. The
mainstream databases used today are relational databases, so normalization is an extremely
important theme.
Non-normal Form
A non-normal form is a form in which items are simply listed. In general, repeated items are
also included. In the figure below, the combination (ProductNumber, Quantity, UnitPrice) is
repeated. In the following explanations, the underlined items indicate the primary keys. Here,
the fact that the InvoiceNumber is used as the primary key assumes that there is no duplication
of the InvoiceNumber.14
14
(Hints & Tips) There cannot be two records whose primary key values are the same. In this example of the non-normal
form, since the InvoiceNumber is the primary key, there are no duplicate InvoiceNumbers.
In the example of the non-normal form above, there is repetition, so the first normal form has
only two records as shown below.
15
(Note) Dependency on the primary key means that each item can be identified by the values of the primary key.
16
Complete functional dependency/ Partial functional dependency: In the first normal form, “Quantity” is determined
by the entire primary key “InvoiceNumber + ProductNumber.” As seen in this example, dependency on the entire
combination of primary-key items is called complete functional dependency. The unit price, on the other hand, depends only
on one of the primary-key items (in this example, on “ProductNumber”). When an item is determined by only part of the
primary key, we call it partial functional dependency.
Strictly speaking, a second normal form can be defined as “a first normal form in which all non-key items are in complete
functional dependency.”
So, when the records of a non-normal form are modified into a third normal form, the data
gets separated into four tables: “Detail_table,” “Product_table,” “Invoice_table,” and
“Customer_table.” A third normal form is characterized by the property that no items are
duplicated except for the primary key items.17
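The separation into four tables can be sketched in a few lines of Python. The invoice data below is hypothetical, and the column assignment for “Detail_table” and “Product_table” is an assumption based on the dependencies described in the notes (Quantity depends on InvoiceNumber + ProductNumber, UnitPrice on ProductNumber alone, CustomerName on CustomerNumber).

```python
# Hypothetical non-normal-form invoices: each has a repeating group of
# (ProductNumber, Quantity, UnitPrice) items.
invoices = [
    {"InvoiceNumber": "N001", "CustomerNumber": "C01",
     "CustomerName": "Alpha Trading",
     "Items": [("P10", 2, 500), ("P20", 1, 1200)]},
    {"InvoiceNumber": "N002", "CustomerNumber": "C01",
     "CustomerName": "Alpha Trading",
     "Items": [("P10", 5, 500)]},
]

invoice_table = {}   # InvoiceNumber -> CustomerNumber
customer_table = {}  # CustomerNumber -> CustomerName (removes transitive dependency)
detail_table = {}    # (InvoiceNumber, ProductNumber) -> Quantity
product_table = {}   # ProductNumber -> UnitPrice (removes partial dependency)

for inv in invoices:
    invoice_table[inv["InvoiceNumber"]] = inv["CustomerNumber"]
    customer_table[inv["CustomerNumber"]] = inv["CustomerName"]
    # Flattening the repeating group corresponds to the first normal form.
    for product_number, quantity, unit_price in inv["Items"]:
        detail_table[(inv["InvoiceNumber"], product_number)] = quantity
        product_table[product_number] = unit_price

print(len(detail_table), product_table, customer_table)
```

After the split, the customer name and each unit price are stored once, no matter how many invoices refer to them.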
Reference Constraints
If there is no contradiction in the data contained in a database, we say that the database is
consistent. The various conditions that data must satisfy to keep the database consistent are
called integrity constraints. Integrity constraints include reference constraints, existence
constraints, update constraints, and format constraints. 18 19
17
Transitive functional dependency: Customer names in the second normal form can be identified because the primary key
“InvoiceNumber” identifies the customer number, which identifies the customer name. In other words, the invoice number
indirectly identifies the customer name. This type of indirect dependency is called transitive functional dependency.
Strictly speaking, a third normal form can be defined as “a second normal form in which no non-key items are in transitive
functional dependence.”
18
(FAQ) There are exam questions that give records in a non-normal form, as well as some assumptions, and then ask you to
choose the third normal form from the answer group. If you follow the procedures described in this book to obtain the third
normal form, you will certainly get the answer, but you may run out of time. So it is necessary to intuitively find the correct
third normal form. You should try many questions for practice, but you can identify the third normal form by the property that
“there are no duplicate items except for the primary key items.”
19
Existence constraints: It means constraints that the existence of particular data requires the existence of some other data.
For instance, a child record cannot be added unless there is a parent record in existence.
Update constraints: It means constraints that a new item must satisfy certain given conditions in order to be registered. For
instance, the value “6” cannot be registered if the value must be between 1 and 5, inclusive.
Format constraints: It means constraints that an item must be in a format that satisfies certain given conditions. For instance,
text cannot be registered in an item that requires numerical entry.
(Invoice_table)    InvoiceNumber | CustomerNumber
                   (The CustomerNumber is an external key, also called a foreign key, of the Customer_table.)
    ▼ Reference
(Customer_table)   CustomerNumber | CustomerName
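A reference constraint of this kind can be tried with Python's standard sqlite3 module, used here only as a convenient stand-in for a DBMS. The customer and invoice values are hypothetical. Note that SQLite checks such constraints only after the PRAGMA shown below is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces references only when enabled
conn.execute(
    "CREATE TABLE Customer_table (CustomerNumber TEXT PRIMARY KEY, CustomerName TEXT)"
)
conn.execute(
    "CREATE TABLE Invoice_table ("
    " InvoiceNumber TEXT PRIMARY KEY,"
    " CustomerNumber TEXT REFERENCES Customer_table(CustomerNumber))"
)
conn.execute("INSERT INTO Customer_table VALUES ('C01', 'Alpha Trading')")
conn.execute("INSERT INTO Invoice_table VALUES ('N001', 'C01')")  # customer exists

# An invoice that refers to a customer who does not exist is rejected.
try:
    conn.execute("INSERT INTO Invoice_table VALUES ('N002', 'C99')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```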
Among the various kinds of operations on relational databases, relational operations and set
operations are the most important. In a relational database, a table is treated as a set of rows,
and data is extracted from it by row, by column, or by value. Extracting processes include
manipulation such as selection, projection, and join. These are called relational operations. In
contrast, there are other operations whereby two tables in a relational database are used to
create a new table; these are called set operations. Set operations include union, intersection,
and difference.
Relational Operations
The meanings of relational operations are listed in the following table. Various data is extracted
by combining these basic operations.
Operation Function
Selection Extracting rows satisfying certain conditions
Projection Extracting specific columns (attributes)
Join Connecting multiple tables for equivalent columns
Projection extracts a specific column. In the following figure, for instance, only “Department”
is extracted. Selection extracts certain rows, so, for instance, every row whose “Age” is “23” is
extracted. Join connects equivalent columns, so, for instance, two tables are joined by
“Name.”20 21
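The three relational operations can be sketched in plain Python, with each table held as a list of rows. The tables and the helper names selection, projection, and join are hypothetical, written only to mirror the definitions above.

```python
# Hypothetical tables, each a list of rows (dictionaries keyed by column name).
employees = [
    {"Name": "Ann", "Department": "Sales", "Age": 23},
    {"Name": "Bob", "Department": "Accounting", "Age": 31},
    {"Name": "Cid", "Department": "Sales", "Age": 23},
]
phones = [
    {"Name": "Ann", "Tel": "0001"},
    {"Name": "Cid", "Tel": "0003"},
]

def selection(table, predicate):
    """Extract the rows that satisfy a condition."""
    return [row for row in table if predicate(row)]

def projection(table, columns):
    """Extract specific columns (duplicates are kept, as in a plain SELECT)."""
    return [{c: row[c] for c in columns} for row in table]

def join(left, right, column):
    """Connect two tables where the named column has equal values."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]

aged_23 = selection(employees, lambda row: row["Age"] == 23)
departments = projection(employees, ["Department"])
joined = join(employees, phones, "Name")
print(aged_23, departments, joined, sep="\n")
```

Each helper returns a new list and leaves the original tables untouched, in the same way that the results of relational operations are not saved in the database.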
20
(Hints & Tips) Results of relational and set operations are displayed as new tables but are NOT actually saved in the
database. These are simply stored in the work area as intermediate results.
21
(FAQ) Many exam questions involve the meanings of relational operations. Be sure you know that selection extracts
“rows” and projection “columns.” Be sure to know also that join is an operation that combines multiple tables.
Set Operations
A set operation, based on the mathematical theory of sets, includes the following:22
Operation Function
Union Extracting rows which are in at least one of the two tables
Intersection Extracting rows which are in both tables
Difference Extracting rows which are in one table but not in the other
For instance, union extracts rows that appear in Table A or Table B; note that “Billy” appears in
both tables, so it is extracted only once. Intersection extracts rows that appear in both Table A
and Table B. In this case, only “Billy” is extracted.
The order of operations does not matter in union or intersection, but in difference, the order
does matter. Different orders produce different results. “A – B” produces rows that are in Table
A but not in Table B. Here, “Billy” is excluded, so “Susan” and “Henry” are extracted. In
contrast, “B – A” produces those in Table B and not in Table A, so “Billy” is once again
excluded, resulting in “John” and “Nancy” being extracted.
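The example above maps directly onto Python's built-in set type. This is a minimal sketch that assumes each table has a single column, so that a row can be represented by the name alone.

```python
# Table A and Table B from the example, one name per row.
table_a = {"Billy", "Susan", "Henry"}
table_b = {"Billy", "John", "Nancy"}

union = table_a | table_b          # rows in at least one table
intersection = table_a & table_b   # rows in both tables
a_minus_b = table_a - table_b      # rows in A but not in B
b_minus_a = table_b - table_a      # rows in B but not in A

print(sorted(union))
print(sorted(intersection))
print(sorted(a_minus_b), sorted(b_minus_a))
```

Swapping the operands changes nothing for union and intersection, but it changes the result of difference.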
22
Sorting/ the four basic operations: A relational database is equipped not only with relational and set operations but also
with the sorting functions and the four basic operations. Sorting is the function of ordering data in ascending or descending
order of a certain column. The four basic operations apply to numeric attributes and extract the results of doing certain
arithmetic (four basic) operations. For instance, it can extract the results of multiplying the values of a certain column by 10.
Quiz
Q1 List the three-layer schema and explain the roles of each schema.
Q2 List logical data models and explain briefly the characteristics of each.
Q4 List the types of relational operations and explain the process of each.
A1 External schema: Definition of the database seen from the program or the user. This uses a
part of the conceptual schema. In relational databases, this is called a
view; in network databases, this is called a subschema. This exists for
each program and user.
Conceptual schema: This is the data to be contained in the database, defined according to the
data model; a definition of the real data as a whole. It is called a table in a
relational database and a schema in a network database.
Internal schema: This is a definition to specifically achieve the conceptual schema for an
external storage. It consists of information such as the medium,
organization method, and buffer length.
A2 Hierarchical model: The relations between parents and children are 1:n.
Network model: The relations between parents and children are m:n.
Relational model: Table format
A database language is a language used in defining and deleting databases and tables and
searching and updating data.
A language used in defining and organizing databases is called a data definition language (DDL)
while a language used to search, update, add, and delete data is called a data manipulation
language (DML). In general, the database administrator uses DDL to edit the database while the
system developer uses DML to develop systems using the database. Typical database languages
include SQL for relational databases, and NDL23 for network databases.
23
NDL (Network Database Language): It is a database language for network databases, used to define schema and
manipulate databases. NDL consists of the following functions: schema-defining language to define the structure of the
database; subschema-defining language to define views; data-manipulating language to manipulate the data in the database;
and module language to execute the procedures of a variety of data-manipulating languages.
24
Module language: It is a language written in a data manipulation language; it processes databases when it is called from a
higher-level language such as COBOL.
Independent language
This refers to a system that provides a programming language different from general-purpose
programming languages so that the functions provided by the data-management system can be
used within the language functions. It uses SQL and NDL interactively, in the style of commands.
Host language
This refers to a system for database manipulation wherein DML is embedded into programs
written in higher-level languages such as COBOL, Fortran, and C. Here, the higher-level
languages are called the host languages.
Methods for embedding DML into a program include the module language system and the
embedded sublanguage system. In the module language system, the database-manipulation
section of the program is developed as a subroutine, and the program calls the subroutine by a
“call” statement. In the embedded sublanguage system, DML is written directly within the
program.
Cursor Function
The cursor function is used when processing rows (records) of a relational database by using a
procedural language. It considers a query result (derived table) by DML as a file so that it can be
processed using a programming language.
With the cursor function, files used by existing programs can be switched to databases easily. In
SQL, the following manipulation statements are available.25
25
(Note) In DECLARE CURSOR, the cursor name is defined. Following DECLARE CURSOR, a SELECT statement is
written, which is a query written in DML. This procedure produces a resulting table. Then, by the FETCH statement, the
table is read beginning at the first row.
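The row-by-row processing that a cursor provides can be illustrated with Python's standard sqlite3 module. This is only an analogy: the Python cursor object is not the SQL DECLARE CURSOR statement itself, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_tbl (Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Employee_tbl VALUES (?, ?)",
    [("Harry", 43), ("Steve", 31), ("Billy", 35)],
)

# Roughly corresponds to declaring a cursor for a SELECT statement and opening it.
cursor = conn.cursor()
cursor.execute("SELECT Name, Age FROM Employee_tbl ORDER BY Name")

names = []
while True:
    row = cursor.fetchone()  # roughly corresponds to FETCH: one row per call
    if row is None:          # no more rows in the derived table
        break
    names.append(row[0])

cursor.close()
print(names)
```

The loop treats the query result like a sequential file, reading one record at a time until the end is reached.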
5.2.2 SQL
Points
• The statement that extracts data is “SELECT.”
• In subqueries, designate “SELECT” using a “WHERE” phrase.
Here, we explain SQL in detail, which is the database language for relational databases. In SQL, SELECT
statements are used to extract data from a database.
[Table] Employee_tbl
Name    Department         HomeCountry   Age
Harry   Sales              Italy         43
Josh    General Affairs    Germany       48
Randy   Human Resources    USA           36
Steve   Sales              UK            31
Billy   Sales              France        35

Projection: SELECT Name, Department FROM Employee_tbl
Name    Department
Harry   Sales
Josh    General Affairs
Randy   Human Resources
Steve   Sales
Billy   Sales

Selection (extracting rows where the age is 35 or above)
Name    Department         HomeCountry   Age
Harry   Sales              Italy         43
Josh    General Affairs    Germany       48
Randy   Human Resources    USA           36
Billy   Sales              France        35
26
(Hints & Tips) In a SELECT statement, the WHERE phrase can be omitted. If omitted, the conditions for extraction are
dropped, so all designated items will be extracted.
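The effect of the column list and the WHERE phrase can be checked with a short sketch that uses Python's standard sqlite3 module. The rows are sample data modeled on “Employee_tbl,” and the condition “Age >= 35” is one way of writing “the age is 35 or above.”

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_tbl (Name TEXT, Department TEXT, HomeCountry TEXT, Age INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee_tbl VALUES (?, ?, ?, ?)",
    [
        ("Harry", "Sales", "Italy", 43),
        ("Josh", "General Affairs", "Germany", 48),
        ("Randy", "Human Resources", "USA", 36),
        ("Steve", "Sales", "UK", 31),
        ("Billy", "Sales", "France", 35),
    ],
)

# Projection: name the columns to extract; no WHERE, so every row is returned.
projected = conn.execute("SELECT Name, Department FROM Employee_tbl").fetchall()

# Selection: WHERE keeps only the rows that satisfy the condition.
selected = conn.execute("SELECT * FROM Employee_tbl WHERE Age >= 35").fetchall()

# Both at once: two columns of the rows that satisfy the condition.
both = conn.execute(
    "SELECT Name, Department FROM Employee_tbl WHERE Age >= 35"
).fetchall()

print(len(projected), len(selected), len(both))
```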
Join
Using a SELECT statement, we can join multiple tables through specified columns. Below is an
example joining “Employee_tbl” and “Department_tbl.” It is not permitted to have the same
column name in the same table, but if there are identical column names in different tables, they are
distinguished in the form “tablename.columnname.” “Employee_tbl.Department =
Department_tbl.Department” is the key joining the two tables.
[Figure: “Employee_tbl” (Name, Department, HomeCountry, Age) joined with “Department_tbl” gives a table with the columns Name, Department, HomeCountry, Age, DeptLeader, Location]
IN and BETWEEN
In WHERE, we can specify complex conditions combined with AND or OR. IN designates an
OR condition while BETWEEN designates an AND condition.
To extract from “Employee_tbl” those names of people whose ages are 22, 28, or 35, there are
two methods, as shown below. Both methods produce the same result.
• SELECT Name FROM Employee_tbl WHERE Age IN (22, 28, 35)
• SELECT Name FROM Employee_tbl WHERE Age = 22 OR Age = 28 OR Age = 35
To extract from “Employee_tbl” those names of people whose ages are 22 to 28, inclusive, there
are two methods, as shown below. Both methods produce the same result.
• SELECT Name FROM Employee_tbl WHERE Age BETWEEN 22 AND 28
• SELECT Name FROM Employee_tbl WHERE Age >= 22 AND Age <= 28
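That each pair of statements gives the same result can be confirmed with a small sketch in Python's standard sqlite3 module. The names and ages are hypothetical sample data, and the range condition uses the standard SQL form BETWEEN 22 AND 28.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_tbl (Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Employee_tbl VALUES (?, ?)",
    [("Amy", 22), ("Ben", 25), ("Cal", 28), ("Dee", 35), ("Eve", 40)],
)

def names(sql):
    """Run a query and return the extracted names in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))

q_in = "SELECT Name FROM Employee_tbl WHERE Age IN (22, 28, 35)"
q_or = "SELECT Name FROM Employee_tbl WHERE Age = 22 OR Age = 28 OR Age = 35"
q_between = "SELECT Name FROM Employee_tbl WHERE Age BETWEEN 22 AND 28"
q_and = "SELECT Name FROM Employee_tbl WHERE Age >= 22 AND Age <= 28"

print(names(q_in), names(q_or))
print(names(q_between), names(q_and))
```

BETWEEN includes both end values, which is why “Amy” (22) and “Cal” (28) appear in the range result.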
27
(Note) If columns have the same name, variables can be used in the way shown below. Here, variable X is assigned to the
employee table and variable Y to the Department_tbl.
SELECT Name, X.Department, HomeCountry, DeptLeader, Location
FROM Employee_tbl X, Department_tbl Y
WHERE X.Department = Y.Department
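The join statement in the note above can be run as written with Python's standard sqlite3 module. The table contents, including the leader names and locations, are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_tbl (Name TEXT, Department TEXT, HomeCountry TEXT, Age INTEGER);
CREATE TABLE Department_tbl (Department TEXT, DeptLeader TEXT, Location TEXT);
INSERT INTO Employee_tbl VALUES ('Harry', 'Sales', 'Italy', 43);
INSERT INTO Employee_tbl VALUES ('Josh', 'General Affairs', 'Germany', 48);
INSERT INTO Employee_tbl VALUES ('Randy', 'Human Resources', 'USA', 36);
INSERT INTO Department_tbl VALUES ('Sales', 'Leader A', 'Tokyo');
INSERT INTO Department_tbl VALUES ('General Affairs', 'Leader B', 'Osaka');
""")

# X and Y are the names assigned to the two tables; X.Department is needed
# because both tables have a column called Department.
sql = """
SELECT Name, X.Department, HomeCountry, DeptLeader, Location
FROM Employee_tbl X, Department_tbl Y
WHERE X.Department = Y.Department
"""
joined = sorted(conn.execute(sql).fetchall())
for row in joined:
    print(row)
```

In this sample, “Randy” does not appear in the result because “Department_tbl” has no row for “Human Resources”; this form of join keeps only the rows for which the key matches in both tables.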
ORDER BY
ORDER BY is used to extract data in ascending or descending order by a certain column.
Below is an example where the names are to be sorted from “Employee_tbl” in ascending or
descending order. ASC is used for ascending order and can be omitted. For descending order,
DESC is used.
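Sorting can be tried with the following sketch, which uses Python's standard sqlite3 module and hypothetical sample names. In SQL the clause is spelled ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_tbl (Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Employee_tbl VALUES (?, ?)",
    [("Harry", 43), ("Josh", 48), ("Randy", 36), ("Steve", 31), ("Billy", 35)],
)

def column(sql):
    """Run a query and return the first column as a list, keeping the row order."""
    return [row[0] for row in conn.execute(sql)]

ascending = column("SELECT Name FROM Employee_tbl ORDER BY Name ASC")
default_order = column("SELECT Name FROM Employee_tbl ORDER BY Name")  # ASC omitted
descending = column("SELECT Name FROM Employee_tbl ORDER BY Name DESC")

print(ascending)
print(descending)
```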
[Table] Sales_tbl
ProductNumber   SalesAmount
G01             100
G02             50
G01             200
G02             100

SELECT ProductNumber, SUM(SalesAmount)
FROM Sales_tbl
GROUP BY ProductNumber

ProductNumber   SalesAmount
G01             300   (Total of G01)
G02             150   (Total of G02)
28
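(Hints & Tips) When “GROUP BY” is used, every column designated by SELECT must either be contained in the column
names designated in “GROUP BY” or be the argument of a set function. The column names designated in “ORDER BY”
should likewise be contained in the columns designated by SELECT. This is a syntax requirement of SQL.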
29
Set function: It is a function prepared by database software. It can find various values such as the total and maximum
values for a specific column. Set functions include the following: SUM (total), MAX (maximum value), MIN (minimum
value), AVG (average value), and COUNT (number of values).
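Grouping and the set functions can be tried with Python's standard sqlite3 module. The four sample rows are an assumption, chosen so that the totals per product come to 300 and 150.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_tbl (ProductNumber TEXT, SalesAmount INTEGER)")
conn.executemany(
    "INSERT INTO Sales_tbl VALUES (?, ?)",
    [("G01", 100), ("G02", 50), ("G01", 200), ("G02", 100)],
)

# One result row per ProductNumber, with the amounts of that group added up.
totals = conn.execute(
    "SELECT ProductNumber, SUM(SalesAmount) FROM Sales_tbl "
    "GROUP BY ProductNumber ORDER BY ProductNumber"
).fetchall()

# Without GROUP BY, the set functions are applied to the table as a whole.
summary = conn.execute(
    "SELECT MAX(SalesAmount), MIN(SalesAmount), AVG(SalesAmount), COUNT(*) "
    "FROM Sales_tbl"
).fetchone()

print(totals)
print(summary)
```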
Subqueries
We can make a query on one table and then use the result of that query to make another query. The
first of these queries is called a subquery, which is performed by using IN.
First, let us describe the method which does not use subqueries. The SQL statement that extracts
the “ProductModel” of the product ordered by “CustomerNumber” A100 from tables “Order_tbl,”
“Order_detail_tbl,” and “Product_tbl” is as follows:30
SELECT ProductModel
FROM Order_tbl, Order_detail_tbl, Product_tbl
WHERE CustomerNumber = 'A100' AND
Order_tbl.OrderNumber = Order_detail_tbl.OrderNumber AND
Order_detail_tbl.ProductNumber = Product_tbl.ProductNumber
Here, the “OrderNumber” 100 by “CustomerNumber” A100 in the “Order_tbl” is joined with the
“OrderNumber” of the “Order_detail_tbl.” As a result, the first five lines of the
“Order_detail_tbl,” i.e. “ProductNumber” 301, 302, 301, 401, and 402 are extracted; in addition,
joined with the “ProductNumber” of the “Product_tbl,” the “ProductModel” is extracted.
Let us now write this using the format of a subquery with IN.
As shown above, we can make one query and, using the result of that query, make another query.31
The query contained in IN is called a subquery. The SELECT statement of this subquery is
executed first, and the extracted information, “Order_detail_tbl.ProductNumber” and
“Product_tbl.ProductNumber” are joined. Here, the product numbers extracted by IN are “301,
302, 301, 401, and 402,” but since “301” is duplicated, the duplication is eliminated. Consequently,
the four lines “301, 302, 401, and 402” are extracted. 32
30
(Note) Sometimes EXISTS is used for subqueries. IN and EXISTS are different functions, but the execution results are
almost identical.
31
(Note) We can write NOT before IN. In this case, NOT IN (subquery) gives the negation of the subquery result.
32
(FAQ) Every exam is certain to have questions on the extracted result by a SELECT statement in SQL. Be thoroughly
familiar with the use of the SELECT statement.
Further, duplication can be removed. As shown below, one can remove the product model
duplication by designating DISTINCT before the product model. Television and VCR are
duplicated, so each of these can be consolidated.33
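The difference in the number of extracted rows between the join, the subquery with IN, and DISTINCT can be checked with the sketch below, which uses Python's standard sqlite3 module. The table contents are hypothetical, arranged so that the product numbers for customer A100 are 301, 302, 301, 401, and 402, and the nested IN statement is only one possible way of writing the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Order_tbl (OrderNumber INTEGER, CustomerNumber TEXT);
CREATE TABLE Order_detail_tbl (OrderNumber INTEGER, ProductNumber INTEGER);
CREATE TABLE Product_tbl (ProductNumber INTEGER, ProductModel TEXT);
INSERT INTO Order_tbl VALUES (100, 'A100'), (200, 'B200');
INSERT INTO Order_detail_tbl VALUES
  (100, 301), (100, 302), (100, 301), (100, 401), (100, 402), (200, 501);
INSERT INTO Product_tbl VALUES
  (301, 'Television'), (302, 'VCR'), (401, 'Television'), (402, 'VCR'), (501, 'Radio');
""")

join_sql = """
SELECT ProductModel
FROM Order_tbl, Order_detail_tbl, Product_tbl
WHERE CustomerNumber = 'A100' AND
      Order_tbl.OrderNumber = Order_detail_tbl.OrderNumber AND
      Order_detail_tbl.ProductNumber = Product_tbl.ProductNumber
"""

# The innermost SELECT runs first; each IN tests membership in its result.
in_sql = """
SELECT ProductModel FROM Product_tbl
WHERE ProductNumber IN
  (SELECT ProductNumber FROM Order_detail_tbl
   WHERE OrderNumber IN
     (SELECT OrderNumber FROM Order_tbl WHERE CustomerNumber = 'A100'))
"""
distinct_sql = in_sql.replace("SELECT ProductModel", "SELECT DISTINCT ProductModel", 1)

via_join = [row[0] for row in conn.execute(join_sql)]
via_in = [row[0] for row in conn.execute(in_sql)]
via_distinct = [row[0] for row in conn.execute(distinct_sql)]
print(len(via_join), len(via_in), len(via_distinct))
```

With this sample data the join returns five rows, the IN form returns four because product number 301 is counted once, and DISTINCT reduces the result to one row per product model.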
33
(Hints & Tips) Note that the extracted number of records varies with the conditions specified by WHERE. IN is the same
as the OR condition, so identical values are discarded. DISTINCT also removes identical values, but it is designated
immediately before the column name designated by SELECT.
Quiz
Q1 Explain the roles of DDL and DML.
Q3 Give the data extracted from the table “Student_list_tbl” by the following SQL statement:
SELECT Name FROM Student_list_tbl
WHERE Major = 'Physics' AND Age < 20
[Table] Student_list_tbl
Name Major Age
Paul Newman Physics 22
John Wayne Chemistry 20
Tom Hanks Biology 18
Robert Redford Physics 19
Clint Eastwood Mathematics 19
A1 DDL: Data Definition Language: A language system used to define schema based on a data
model
DML: Data Manipulation Language: A language system used to manipulate databases by the
user
A2 It is the function used when processing rows (records) of a relational database using a
programming language. Since query results (derived tables) of DML consist of multiple rows,
these rows can be read line by line (row by row) just as a programming language processes
files.
A3 Interpreting the SQL statement, we see that this is the “manipulation of extracting from the
student list the names of the students whose major is physics and who are less than 20 years of
age.” Hence, the extracted data is “Robert Redford.”
In order to ensure the reliability of data, various controls are applied to a database system. In a
distributed database, it is important to maintain its consistency.
Database control functions include the access control function and shared resources management.
They also support recovery from database failures.
Access Control
In general, for access control to maintain the integrity of a database, exclusive access control34 is
conducted. Exclusive access control prohibits multiple users from accessing the same data at the
same time. Through this control, multiple users can use the same database without causing
contradictions. However, if all transactions35 are only referential, exclusive access control is not
necessary.
In some instances, exclusive access control may cause a deadlock. A deadlock is a situation in
which two transactions are waiting for each other to release the lock. An image of a deadlock is
shown below.36
[Figure: deadlock. Transaction A and Transaction B each wait for the other to release its lock]
34
Exclusive access control: It means locking a part of the database while it is updated by one transaction so that other
transactions can be prohibited from accessing the same part of the database.
35
Transaction: It is a processing unit of the data which is sent from a terminal to the host computer. It is also called a
message.
36
(Note) If each transaction locks all necessary resources at the beginning of its processing and does not lock them during
the processing, a deadlock can be avoided. Or, if all transactions have an identical order of locking, a deadlock can be
avoided.
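The second point of the note (an identical order of locking) can be sketched as follows. This is a minimal illustration, not part of the exam material: the two records, the lock names, and the transfer function are hypothetical. Both threads always take lock_x before lock_y, so neither can hold one lock while waiting for the other.

```python
import threading

# Two hypothetical records, each protected by its own lock.
lock_x = threading.Lock()
lock_y = threading.Lock()
balance = {"x": 100, "y": 100}

def transfer(src, dst, amount):
    # Always acquire the locks in the same global order (x, then y),
    # regardless of the direction of the transfer.
    with lock_x:
        with lock_y:
            balance[src] -= amount
            balance[dst] += amount

def run_transfers():
    # Transaction A moves x -> y while Transaction B moves y -> x.
    a = threading.Thread(target=lambda: [transfer("x", "y", 1) for _ in range(1000)])
    b = threading.Thread(target=lambda: [transfer("y", "x", 1) for _ in range(1000)])
    a.start()
    b.start()
    a.join()
    b.join()
    return balance

result = run_transfers()
```

If one thread instead locked y first and the other locked x first, each could end up holding one lock and waiting for the other, which is the deadlock described above.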
    Name         Explanation
A   Atomicity    Property that there is no intermediate stage at the end of processing;
                 either all of the processing is completed or none of it is performed.
C   Consistency  Property that, regardless of the completion condition of a transaction,
                 the contents of a database cannot have contradictions.
I   Isolation    Property that the processing results cannot be different whether
                 multiple transactions are executed simultaneously or sequentially.
D   Durability   Property that results are not ruined by failures and other factors once
                 the transactions are finished.
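Atomicity can be illustrated with a small sketch using Python's built-in sqlite3 module. The account table, its CHECK constraint, and the transfer function are assumptions made for this example only. The second update violates the constraint, so the first update is undone as well and no intermediate state remains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money as one transaction: both updates happen, or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# A has only 100, so withdrawing 500 violates the CHECK constraint.
ok = transfer(conn, "A", "B", 500)
balances = dict(conn.execute("SELECT name, balance FROM account"))
```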
Recovery Management
There are various recovery methods, depending on the situation of database failure.
Failure type         Recovery method  Explanation
System failure       System restart   A computer system failure such as a physically erroneous operation
                                      (1) Return to the state at the point in time when the data was backed up 38
                                      (2) Rewrite sequentially using post-update information of the log 39
Transaction failure  Roll-back        A logically erroneous operation due to program failure, etc.
                                      (1) Roll back the failed data only, using the pre-update information
                                          of the log.
                                      (2) Re-execute the transaction(s).
Medium failure       Roll-forward     A problem with a medium such as a magnetic disk
                                      (1) Replace the medium
                                      (2) Return to the state at the point in time when the data was backed up
                                      (3) Rewrite sequentially using post-update information of the log
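The difference between roll-back and roll-forward can be sketched with a toy log. The log layout (transaction name, key, before value, after value) and the dictionaries standing in for the database are assumptions for illustration; a real DBMS log is far more elaborate.

```python
# Each log entry records the pre-update (before) and post-update (after) value.
log = [
    {"tx": "T1", "key": "x", "before": 10, "after": 20},
    {"tx": "T2", "key": "y", "before": 5, "after": 7},
    {"tx": "T1", "key": "z", "before": 1, "after": 2},
]

def roll_back(db, log, failed_tx):
    """Undo one failed transaction, newest entry first, using pre-update values."""
    for entry in reversed(log):
        if entry["tx"] == failed_tx:
            db[entry["key"]] = entry["before"]
    return db

def roll_forward(backup, log):
    """Rebuild the database from a backup by replaying post-update values in order."""
    db = dict(backup)
    for entry in log:
        db[entry["key"]] = entry["after"]
    return db

backup = {"x": 10, "y": 5, "z": 1}    # state when the backup was taken
current = {"x": 20, "y": 7, "z": 2}   # state just before the failure

after_rollback = roll_back(dict(current), log, "T1")   # transaction failure
after_rollforward = roll_forward(backup, log)          # medium failure
```

Roll-back touches only the data written by the failed transaction (T2's update to y survives), while roll-forward starts from the backup and reapplies every logged update.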
37
(FAQ) There are exam questions on the meaning and necessity of exclusive access control. Remember that exclusive
access control is the function that prohibits access to the same location (record) at the same time. Understand also that
exclusive access control is carried out to maintain the integrity of data.
38
Back up: it is a duplicate of the contents of the entire database on a medium such as a magnetic tape, copied at regular
time intervals. Normally the copied contents are the data immediately before the startup or immediately after the shutdown of
an online system. If the system is operating 24 hours a day, backup is often carried out when the transactions are the fewest,
such as around midnight.
39
Log file (journal file): It means data record in which conditions of the database before and after the updating are recorded
whenever the contents of the database are updated. In the operation of some systems, only the contents prior to the updating
are recorded.
A distributed database is the technology of taking databases kept on multiple computers connected
to a network and making them appear as if they were a single database. Therefore, it is not
necessary for users to be aware of which computer actually has the necessary data.
Two-Phase Commitment
Two-phase commitment is a mechanism that ensures the integrity of a distributed database and is
important in ensuring the integrity when the database is updated.40
Two-phase commitment has two phases. In the first phase, the party requesting synchronization
makes a request to processing parties for a guarantee of update operation. At this point, all
processing parties are secure. 41 Then, each processing party returns either COMMIT or
ROLLBACK 42 to the party requesting synchronization. In the second phase, the
synchronization-requesting party decides whether to commit or roll back, considering the response
from each processing party. Specifically, even if only one of the processing parties returns
ROLLBACK, the requesting party chooses rollback.
The figure on the next page shows the process flow under normal circumstances. We now use this
figure to explain two-phase commitment. “ACK” in the figure is a response message indicating
normal completion.
Site A and Site B are the locations where the distributed database is located. The host is the
computer that controls this distributed database. When the database is updated, the host gives an
updating command (1) to each site. Upon receipt, each site temporarily updates the database. This
is a condition where the database can be updated any time, but the database has not yet been
updated physically. Further, each site prepares itself for the updating and deleting at any time,
once it receives a secure command (2) from the host. This is the first phase. If any of the sites have
trouble in this phase, the database updating is cancelled at all of the sites.
Next, after confirming that each site is ready for the updating (ACK), the host sends a
commitment command (3) to all of the sites sequentially. Site A, upon receiving this commitment
command, carries out the actual updating and reports the normal completion of the process to the
host (ACK). The host then sends the commitment command to Site B, which, likewise, carries out
the actual updating and reports the normal completion to the host (ACK). The host then confirms
that the entire database is actually updated (4). This is the second phase.43
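The flow described above can be sketched as a small simulation. The Site class, its method names, and the can_prepare flag are hypothetical; the sketch only shows the decision rule (a single ROLLBACK response cancels the update at every site) and ignores network failures and timeouts.

```python
class Site:
    """A hypothetical site holding one copy of the data."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.value = 0        # committed value
        self.pending = None   # temporary update, not yet committed

    def prepare(self, new_value):
        # Phase 1: update temporarily and enter the secure status.
        if not self.can_prepare:
            return "ROLLBACK"
        self.pending = new_value
        return "COMMIT"

    def commit(self):
        # Phase 2: make the temporary update permanent.
        self.value, self.pending = self.pending, None

    def rollback(self):
        # Phase 2: discard the temporary update.
        self.pending = None

def two_phase_commit(sites, new_value):
    votes = [site.prepare(new_value) for site in sites]   # phase 1
    decision = "COMMIT" if all(v == "COMMIT" for v in votes) else "ROLLBACK"
    for site in sites:                                    # phase 2
        if decision == "COMMIT":
            site.commit()
        else:
            site.rollback()
    return decision

ok_sites = [Site("A"), Site("B")]
decision_ok = two_phase_commit(ok_sites, 42)

bad_sites = [Site("A"), Site("B", can_prepare=False)]
decision_bad = two_phase_commit(bad_sites, 42)
```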
40
Commitment: It means finalizing a database updating. Only after this, the process result is maintained. When the
application executes a COMMIT command, the update becomes finalized.
41
Secure status: It is a status in which it is possible to complete a processing or to return to the previous status.
42
ROLLBACK: It means stopping processing and returning related information back to what it was before the processing.
This is performed when some trouble occurs during the transaction processing, causing the processing not to be completed
normally. Rollback may be done by DBMS but can also be executed by a ROLLBACK command through an application.
43
(FAQ) Many exam questions link database failure conditions with roll-back and roll-forward. These are sure points you
can earn if you simply understand the meanings of roll-back and roll-forward.
[Figure: two-phase commitment under normal circumstances. Phase 1: (1) updating command, temporary updating at each site, ACK. Phase 2.]
Replication
Replication is the mechanism of automatically reflecting updated contents in a copy (replica) of
the database on the network. The objective of replication is to enhance the responsiveness of the
database access in a distributed database environment.
Replicas of the master data are placed on other servers on the network, and when data is updated,
the change is automatically reflected onto the replicas. However, the updating of the replicas is
carried out asynchronously from the master data.
In general, updating can be only performed on the master, and replicas are only for reference, in
order to maintain database integrity.
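Asynchronous updating of a replica can be sketched as follows. The dictionaries, the queue, and the function names are assumptions for illustration; the point is only that a read from the replica may return old data until the queued changes have been applied.

```python
from collections import deque

master = {}
replica = {}
pending = deque()   # changes made on the master but not yet applied to the replica

def update_master(key, value):
    # Updates are accepted on the master only; the change is queued for the replica.
    master[key] = value
    pending.append((key, value))

def sync_replica():
    # Runs later, asynchronously from the update on the master.
    while pending:
        key, value = pending.popleft()
        replica[key] = value

update_master("stock", 10)
stale = replica.get("stock")   # the replica has not caught up yet
sync_replica()
fresh = replica.get("stock")
```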
Quiz
Q1 Explain the roles of DDL and DML.
Q3 List what is necessary for recovery when a database fails due to a transaction failure. What is
this recovery method called?
A2 The ACID characteristics are a concept for maintaining data consistently without
contradictions. The term ACID is an acronym for Atomicity, Consistency, Isolation, and
Durability.
A3 What is necessary is the pre-update information of the log (journal). This is the rollback
method.
A4 This is a mechanism that ensures the integrity of a distributed database. It consists of two
phases. The first is the phase in which a party requesting synchronization makes a request to
processing parties for an update-guarantee process. The second is the phase in which the party
that made the synchronization request considers the response from each processing party and
determines whether to commit or roll-back.
Question 1
Difficulty: * Frequency: ***
Q1. Which of the following operations extracts specific columns from tables in a relational
database?
Answer 1
Correct Answer: b
Question 2
Difficulty: ** Frequency: ***
Q2. Which of the following is the third normal form of the “Skill_record” table? Here, the
underlined items represent the primary keys.
Skill_code Skill_name
Emp_ID Name
Skill_code Skill_name
d) Emp_ID Skill_code
Skill_code Skill_name
Answer 2
Correct Answer: c
Since the skill records come with no explanations, we have no choice but to speculate on the
primary key. Here, we may assume it is the employee ID. From the repeated part of the skill
records, we can see that one ID number can correspond with multiple skill codes. This implies that
one employee can have multiple skills. Hence, the years of experience cannot be determined until
two items (the employee ID and the skill code) are known. We make the assumption that one
employee does not have the same skill code more than once. Otherwise, as explained later, the skill
code cannot be the primary key by itself.
First, remove repetition of “Skill_code, Skill_name, Experience_years” from the skill records. The
data is then partitioned into multiple records, but if the employee ID is made the primary key by
itself, duplication would occur. Hence, it is necessary to define the combination of the employee
ID and the skill code as the primary key. This is a first normal form.
Next, note that the name is functionally dependent only on the employee ID (partially functionally
dependent with respect to the primary key), the skill name is functionally dependent only on the
skill code (partially functionally dependent with respect to the primary key), and
“Experience_years” is functionally dependent on “ID, Skill_code” (completely functionally
dependent with respect to the primary key), since an employee could have multiple skills.
Now, we need to eliminate the partial functional dependencies. The result is a second normal form. For
explanations, we assign a name to each table.
After this partitioning, each table has only one non-key item, so there cannot be any transitive
functional dependency. Hence, this is a third normal form.
b) The name can be uniquely determined by the employee ID only, so there is partial
functional dependence. Hence, this is a first normal form.
Q3. Which of the following is an appropriate description concerning the primary key of a relational
database?
a) Rows cannot be searched unless a search condition is specified for a column specified in
the primary key.
b) If a column storing numerical values is specified in the primary key, then that column
cannot be used as a subject of arithmetical operations.
c) Rows with identical primary-key values are not present in a single table.
d) It is not possible to form the primary key comprising multiple columns.
Answer 3
Correct Answer: c
In a relational database, the primary key is set by combining one or more items in order to identify
a row (record). The primary key cannot contain duplicate values within one table.
b) A primary key is an item or a set of items (columns) that has no duplicate values in the
table; being part of the primary key does not change the attributes of the data. Hence, the
values can be used for operations as well.
Q4. Which of the following is an appropriate description concerning relational database views?
Answer 4
Correct Answer: d
A view (virtual table) is a description of the database seen from the standpoint of an application. It
is a table separately defined by combining necessary items from multiple tables or a single table.
Since only the items necessary for the users are defined, the scope in which the data is used can be
limited, enabling the protection and integrity of the data. Further, a view derived from a single
table can be updated, but a view derived from multiple tables cannot be updated. Even if a view is
derived from a single table, it cannot be updated under the following conditions:
• DISTINCT: In DISTINCT, rows with duplicate values are combined into one row, so
which row of the original table needs to be updated cannot be determined.
• Set functions, calculations: Values obtained by calculation (for example, the value
obtained by the SUM operation, i.e. the total) cannot be updated. In this case, the original
data should be updated.
• Subqueries
• GROUP BY, HAVING
c) Since only necessary items are extracted and defined from the original table, the view is
not related to the structure of the original table.
Q5. Which of the SQL statements below can acquire Table B from Table A?
[Table] A
emp_ID  name            dept_code  salary
10010   Lucy Brown      101        2,000
10020   Mike Gordon     201        3,000
10030   William Smith   101        2,500
10040   John Benton     102        3,500
10050   Tom Cage        102        3,000
10060   Mary Carpenter  201        2,500
[Table] B
dept_code  emp_ID  name
101        10010   Lucy Brown
101        10030   William Smith
102        10040   John Benton
102        10050   Tom Cage
201        10020   Mike Gordon
201        10060   Mary Carpenter
Answer 5
Correct Answer: d
Table B extracts the department codes, employee IDs, and names from Table A. This is specified as
follows:
Note that the records are sorted by department code and, within the same department, by employee
ID. This is specified by the following statement. Remember that ASC means “in ascending order”;
if this is omitted, the default is ASC.
Hence, to get Table B from Table A, we use the following SQL statement:
SELECT dept_code, emp_ID, name   -- pick the dept. code, emp. ID, and name
FROM A                           -- from Table A
ORDER BY dept_code, emp_ID       -- sort in ascending order of the dept. code and the emp. ID
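The statement can be checked with Python's built-in sqlite3 module. This is a sketch that assumes the table is simply named A and that salaries are stored as plain integers; the rows are those of Table A in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE A (emp_ID INTEGER, name TEXT, dept_code INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?, ?)",
    [
        (10010, "Lucy Brown", 101, 2000),
        (10020, "Mike Gordon", 201, 3000),
        (10030, "William Smith", 101, 2500),
        (10040, "John Benton", 102, 3500),
        (10050, "Tom Cage", 102, 3000),
        (10060, "Mary Carpenter", 201, 2500),
    ],
)

# ASC is the default, so it can be omitted from ORDER BY.
table_b = conn.execute(
    "SELECT dept_code, emp_ID, name FROM A ORDER BY dept_code, emp_ID"
).fetchall()

for row in table_b:
    print(row)
```

The rows come back grouped by department code and, within each department, in ascending order of employee ID, which matches Table B.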
Q6. In a distributed database system, which of the following methods is used to inquire whether
multiple sites performing a series of transaction processes can be updated, and can perform a
database updating process after confirming that all sites can be updated?
Answer 6
Correct Answer: a
In the second phase, the requesting party examines the response contents of the distributed parties
and instructs either COMMIT or ROLLBACK to each database. If any one of the databases has
responded with ROLLBACK, the requesting party instructs ROLLBACK to all of the distributed
parties. On the other hand, if they all respond with COMMIT, then the requesting party sends a
COMMIT command to all of the distributed parties. At this point, the distributed databases are
actually updated.
Chapter Objectives
When information is exchanged through a network,
security needs to be ensured, as there is a risk of
information leakage and tampering that can occur out
of the user's sight. In addition, computer viruses are
highly rampant, and, therefore, it is imperative that
computer systems and data be protected from these
threats. In Section 1, we will learn methods for
ensuring security. Meanwhile, the software and data
need to be standardized in order to exchange
information via a network. Standardization means to
set common formats and structures for information.
Information can be exchanged without performing any
special operations if the information is assembled in
accordance with certain standards. In Section 2, we
will learn the trends in standardization.
6.1 Security
6.2 Standardization
6.1 Security
Introduction
Security means maintaining the safety of computer systems and network systems. One way to
prevent unauthorized access is by requiring the user to enter his or her user ID and password. It
is also effective to encrypt data to prevent data leakage to a third party.
Encryption
Encryption means to scramble information by a certain pattern so that a third party cannot
understand its contents. It is a highly effective method for protecting information saved in a
computer system. Depending on the combination of keys for encryption and for decryption,
there are private key cryptography (common key cryptography) and public key
cryptography. The concept of encryption is shown below. 1
[Figure: the sender encrypts plain text with the encryption key, and the receiver decrypts the encrypted text with the decryption key to recover the plain text]
Plain text: data before encryption
Encrypted text: data after encryption
Cryptography   Explanations
Private key    The same key is used as the encryption key and the decryption key
cryptography   (symmetric conversion). The key must be kept secret from others.
               DES and FEAL are examples of this system.
Public key     The encryption key is public while the decryption key is kept secret from
cryptography   others. The message is encrypted by the public key of the receiver and
               decrypted by the private key of the receiver (non-symmetric conversion).
               RSA is an example.2 3
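The two schemes in the table can be contrasted with a toy sketch. Neither part is secure: the XOR cipher merely stands in for a common key system such as DES, and the RSA part uses tiny primes chosen for this example so that the numbers stay readable. All names are hypothetical.

```python
# Common key (symmetric): the same key both encrypts and decrypts.
def xor_cipher(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Public key (asymmetric): toy RSA with tiny primes, for illustration only.
p, q, e = 61, 53, 17                 # e and n form the public key
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent, kept secret by the receiver

def rsa_encrypt(m: int) -> int:
    # Anyone can encrypt with the receiver's public key (e, n).
    return pow(m, e, n)

def rsa_decrypt(c: int) -> int:
    # Only the holder of the private exponent d can decrypt.
    return pow(c, d, n)

secret = xor_cipher(b"hello", b"key")
recovered = xor_cipher(secret, b"key")

cipher_number = rsa_encrypt(65)
plain_number = rsa_decrypt(cipher_number)
```

In the symmetric case both parties must share b"key" in advance; in the RSA case only e and n need to be published.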
1
(Hints & Tips) Encryption cannot prevent data falsification because it only makes the data unreadable. Also, there is a risk
of the encryption pattern breaking if the same key is used for a long time. Be aware that encryption does not offer a perfect
solution.
2
(Note) Be aware that in the public key cryptography, encryption is performed using the receiver's public key. The public
key is available on the network, and anyone can obtain it. However, the decryption key is kept secret. This system is
characterized by the fact that symmetric conversion is almost impossible and that the decryption is not possible using the
encryption key.
3
DES/RSA: An example of a private key cryptography is DES (Data Encryption Standard). An example of a public key
cryptography is RSA (Rivest-Shamir-Adleman), named after the initials of the three people who invented it.
Authentication
Authentication means verifying that the user is, without a doubt, a valid user. There are various
authentication methods as listed in the table below.
Method              Explanations
Entity              A technology of identifying whether the party with whom we are
authentication      communicating is valid
                    Often the combination of a user ID and a password is used. Various means
                    are used, including call-back,4 private key cryptography, and public key
                    cryptography.
Message             A technology of detecting any possible falsification in a transferred text or
authentication      file
                    If falsified, the check bit will be changed.
Digital signature5  A technology of assuring the validity of a document, typically using public
                    key cryptography
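Message authentication can be sketched with Python's standard hmac and hashlib modules. HMAC-SHA256 is one common way to compute such a check value and is used here as an assumption, not as the method the table has in mind; the key and messages are made up.

```python
import hashlib
import hmac

def make_tag(message: bytes, shared_key: bytes) -> bytes:
    # The sender computes a check value from the message and a shared secret key.
    return hmac.new(shared_key, message, hashlib.sha256).digest()

def is_authentic(message: bytes, tag: bytes, shared_key: bytes) -> bool:
    # The receiver recomputes the check value and compares it with the one received.
    return hmac.compare_digest(make_tag(message, shared_key), tag)

key = b"shared-secret"
tag = make_tag(b"pay 100 to Bob", key)

unchanged = is_authentic(b"pay 100 to Bob", tag, key)
tampered = is_authentic(b"pay 900 to Bob", tag, key)
```

Here unchanged is True and tampered is False: any change to the message changes the check value, so the falsification is detected.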
Access Management
Access management means preventing unauthorized access to resources (such as data) in a
computer system.
To this end, user IDs and passwords are registered in advance in the system, and the user is
required to enter his or her user ID and password to gain access to resources and the network.
The right to access resources is called the access right.
Type Explanations
Individual's knowledge Password, etc.
Individual's possession ID card, IC card, optical card, etc.
Individual's characteristics Fingerprint, voice print, hand shape, retina pattern, signature, etc.
4
Call-back: It is a method by which the receiver disconnects communication and then reconnects by calling the sender
back.
5
Digital signature: It is a method which applies a public key cryptography. There are various ways to implement this, but
the simplest method is to use the public key cryptography in reverse.
The sender encrypts the text with his or her own private (secret) key and adds his or her name in plain text. The receiver,
based on the plain name, obtains the public key of the sender (or assumed to be the sender) and uses that public key to
decrypt the text. If the decrypted message is readable, the sender is verified; otherwise, the receiver determines that it was
sent by someone in disguise.
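"Using public key cryptography in reverse" can be sketched with the same kind of toy RSA numbers. This is an illustration under stated assumptions: the primes are tiny, the SHA-256 digest is reduced modulo n only so that it fits the toy key, and the function names are hypothetical. It is not a secure signature scheme.

```python
import hashlib

p, q, e = 61, 53, 17                 # e and n form the sender's public key
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))    # the sender's private (secret) key

def digest(message: bytes) -> int:
    # Reduce the hash modulo n so that it fits the toy key size.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    # The sender transforms the digest with his or her private key.
    return pow(digest(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    # The receiver reverses the transformation with the sender's public key.
    return pow(signature, e, n) == digest(message)

signature = sign(b"I agree to the contract")
valid = verify(b"I agree to the contract", signature)
forged = verify(b"I agree to the contract", (signature + 1) % n)
```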
With the growing popularity of the Internet, there is a trend that occurrences of various types of
damage through networks are on the rise. Hence, it has become increasingly crucial to take
measures against computer viruses and to protect against unauthorized access from the outside.
Let us explain these issues in detail, including specific measures to be taken.
Computer Viruses
A computer virus, or, simply, a virus, is a malicious program that enters a system via
networks or storage media, and destroys, falsifies, or steals data. A computer virus reproduces
itself through networks and storage media. In addition, macro-viruses6 can be made easily by
almost anyone, so the damage is spreading widely.
Function                 Characteristics
Self-contagion function  It causes infection by reproducing itself by its own function or
                         by reproducing itself onto another system using a system
                         function.
Incubation function      It contains certain conditions for attack, such as a specific date
                         and time, duration of time, and number of processes; then the
                         virus keeps itself hidden until the attack begins.
Symptom-presentation     It has functions to destroy files such as programs and data or to
function                 execute operations which are not intended by the system's designer.
6
Macro-virus: It is a computer virus that abuses macro functions of spreadsheet and word-processing software.
Conventional computer viruses required knowledge at the level of machine language, so it was difficult for common end
users to create them. However, macro-viruses can be written in programming languages, so they can be created relatively
easily. Often they are hidden in files attached to an e-mail.
7
(FAQ) In the past exams, many questions involving computer viruses have appeared. Be sure you know well the definition
of a computer virus, measures to avoid virus infection, measures to take in case of an infection, and matters related to
vaccines.
8
(Hints & Tips) A vaccine recognizes patterns of already discovered viruses. Hence, it may not be able to handle new viruses.
So the basic idea is to take precaution so that the computer does not get infected with a virus. Once an infection is discovered,
it is crucial to take immediate actions so that the infection does not spread any further.
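The reason a vaccine may miss new viruses can be seen in a minimal pattern-matching sketch. The pattern names and byte sequences are invented for this example and do not correspond to any real virus or product.

```python
# Hypothetical pattern file: name -> byte pattern of an already discovered virus.
KNOWN_PATTERNS = {
    "demo-virus-1": b"\xde\xad\xbe\xef",
    "demo-virus-2": b"EVIL_MACRO",
}

def scan(data: bytes) -> list[str]:
    """Return the names of all known patterns found in the data."""
    return [name for name, pattern in KNOWN_PATTERNS.items() if pattern in data]

infected = scan(b"Dear user, EVIL_MACRO attached")
unknown = scan(b"a brand new virus with no registered pattern")
```

The second scan returns an empty list even though the data is described as a virus, because its pattern has not been registered yet. This is why the pattern file must be kept up to date and why prevention remains the basic measure.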
Anti-Virus Measures
In order to prevent computers from being infected by a computer virus, the following points need
to be observed:
• Install vaccine (anti-virus) software
• Do not copy software illegally
• Do not execute suspicious programs
• Set up passwords and access privileges
• Perform backup periodically
• Do not share disks (clarification of management)
• Do not open suspicious e-mails
If we detect a virus infection, we need to contact the administrator immediately to ask for
instructions. Acting on our own judgment can cause further damage by the computer virus.
Firewalls
A firewall is a system (mechanism) that protects an internal network such as an in-house LAN
from unauthorized access from the outside. More specifically, it is installed between the
internal network and the external network, such as the Internet. All communication between the
inside and the outside takes place through the firewall. A firewall restricts services available for
each user and identifies access from the outside to determine whether or not to allow access
into the internal network. Sometimes a computer installed as a firewall is equipped with the
functions of a proxy server9 as well.
[Figure: the firewall is placed between the Internet and the in-house LAN, to which the computers are connected]
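How a firewall decides whether to allow access can be sketched as a simple rule table. The rules, addresses, and port numbers below are assumptions for illustration; real firewalls inspect many more fields and keep track of connection state.

```python
from ipaddress import ip_address, ip_network

# Hypothetical rules, checked in order: (source network, destination port, action).
# A port of None means "any port".
RULES = [
    (ip_network("192.168.0.0/16"), None, "allow"),   # traffic from the in-house LAN
    (ip_network("0.0.0.0/0"), 80, "allow"),          # Web access from anywhere
    (ip_network("0.0.0.0/0"), None, "deny"),         # everything else
]

def decide(src: str, port: int) -> str:
    """Return the action of the first rule that matches the packet."""
    for network, rule_port, action in RULES:
        if ip_address(src) in network and rule_port in (None, port):
            return action
    return "deny"   # default when no rule matches

web_from_outside = decide("203.0.113.5", 80)
telnet_from_outside = decide("203.0.113.5", 23)
telnet_from_inside = decide("192.168.1.10", 23)
```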
9
Proxy server: It is a server installed for security protection and high-speed access when an internal network is connected to
the Internet. It prevents unauthorized entry into the internal network and relays and manages access from the internal network
to the Internet. This function is identical to that of a firewall, so frequently the proxy-server function is carried out by the
firewall machine. Additionally, a proxy server has the proxy-response function: data sent from the Internet can be stored here
temporarily. Later, when the same Web site is accessed, the access can be made faster by returning the stored data from the
proxy server.
Computer crime is the act of entering an information system with a malicious intent and
carrying out an action such as destroying data. With the spread of networks, unauthorized users
who access networks have begun to appear. Computer viruses are one common means of
computer crime.
Computer crimes include manipulating online systems of banks, hacking into remote
computers via networks, and placing traps on public domain software.
Falsification
Falsification refers to the act of intentionally changing data or programs in a computer and
includes falsifying or modifying a document, replacing storage media with a false version
prepared in advance, rewriting data, and erasing data. It is difficult to prevent tampering, but
one effective way to detect tampering is message authentication. An example of tampering is
the salami method.
The salami method is a way to steal assets little by little from a large quantity of resources.
For instance, a user may open a fictional bank account and transfer one or two cents from the
other accounts to his or her own account.
Destruction
Destruction is the act of erasing critical data or a program stored in a computer or disabling
system devices or storage media by physically destroying them. Examples of destruction
include the Trojan horse, logic bombs,10 and e-mail bombs.11
A Trojan horse hides special instructions in a program in such a way that the usual processing is
not affected, and then executes unauthorized functions while the program carries out its usual tasks.
Once a certain condition is satisfied, it may destroy all the files in the computer or steal user
IDs and passwords. To detect a Trojan horse, one can keep a trusted backup copy of a program as the
genuine copy and compare a suspicious program with that backup copy. However,
it is said that there is no really effective method to prevent a Trojan horse other than checking the
source program at the time the program is written or modified.12 13
10
Logic bomb: It is an application of the method used by the Trojan horse. It embeds into the system a process to destroy
the system when a certain condition is satisfied (time, situation, frequency, etc.).
11
E-mail bomb: It is an act of sending a large number of e-mails or large-size e-mails to a particular person within a short
period of time, leading to a failure of the e-mail system.
12
(FAQ) In the past exams, questions related to computer crime, including items regarding computer viruses, have appeared.
Know well the Trojan horse and scavenging.
13
(Note) Another type of computer crime is superzapping. This is an abuse of the special function that the system has for
emergencies (for instance, a utility program which has access to all the files and through which one can change the contents
of those files).
Leak
A leak is a collective term referring to the theft of data or copies from an information
system. The methods include placing a transmitter on an output unit, mixing confidential data
into an output report, and making confidential data appear to be something else by encrypting it.
An example of a leak is scavenging (trash hunting).
Scavenging is the act of stealing information from a computer after a job is executed. One may
steal information from a document thrown away as trash or information left on the hard disk or
in memory. One effective method to prevent scavenging is erasing all information in memory
used for temporary storage and on the hard disk upon completion of a job.14
Tapping
Tapping is the act of illegally intercepting data on a network and stealing information or
illegally accessing a computer system. Targets of tapping include not only computer data, but
also audio data. Encryption is an effective way to prevent tapping.
Disguise
Disguise is the act of stealing someone else's user ID and password and acting on a network
using the stolen identity. By doing so, the unauthorized user steals confidential information that
only the authorized user should access, or commits wrongdoing and blames the authorized user
for what he or she has done. Digital signatures are effective in preventing disguise.
14
(Note) Various methods are in use to maintain security in communication, including “calling number identification” and
“closed user group.” “Calling number identification” is a way to notify the call-receiving party of the telephone number of
the party making the call. “Closed user group” is to register, in advance, the addresses of the terminals that are allowed to
make connection with the electronic exchange unit and make connection only with those terminals.
Quiz
Q1 Enter appropriate words in the blank boxes in the table below.
Q2 In general, a computer virus is defined as having at least one of three functions. List the three
functions.
A1
Encryption key Decryption key
Common key cryptography Private key (common key) Private key (common key)
Public key cryptography Public key of the receiving party Private key of the receiving party
6.2 Standardization
Introduction
Standardization Organizations
The following table includes well-known standardization organizations. 16 17
Name  Explanation
ISO   International Organization for Standardization
      This is an international organization that works to unify and stipulate standards in
      the industry-related fields. In each field, there is a technical committee (TC), and
      under TC are subcommittees (SC) and working groups (WG).
ITU   International Telecommunication Union
      This is an international organization that standardizes telecommunications
      technologies as well as standardizes and recommends international standards for
      communications of all kinds. ITU-T 18 is responsible for telecommunications while
      ITU-R 19 is responsible for radio and wireless systems.
15
(Hints & Tips) Among the standardizing activities of ISO, work in electrical and electronic fields is jointly done with the
IEC. Work in telecommunication is jointly done with ITU. Sometimes ISO cooperates with local organizations like ANSI as
necessary.
16
IEC (International Electrotechnical Commission): It is a standardization organization set up for the purpose of unifying
international standards in electrical and electronic fields. It works closely with ISO, and jointly developed standards are
published as ISO/IEC standards.
17
IEEE (Institute of Electrical and Electronics Engineers): This group has powerful influence in setting standards in
areas such as LAN and various interfaces.
18
ITU-T (International Telecommunication Union – Telecommunication Standardization Sector): This is the sector
that discusses technologies, operation, and fees related to telephone and telegraph; it also issues its own standards as
recommendations. Its major recommendations include the I series for ISDN, the V series for analogue lines, and the X series
for digital lines.
19
ITU-R (International Telecommunication Union – Radiocommunication Sector): This is the sector that assigns radio
frequencies and standardizes radio systems, handling satellite communication, fixed wireless communication, mobile
communication, television broadcast, etc.
ISO 9001 stipulates the requirements concerning quality management systems for corporations
and organizations in the following cases:
• When it is necessary to verify that the company has the ability to provide products that satisfy
the client's requirements or applicable required standards
• When the company wishes to achieve improved customer satisfaction
Hence, it is a standard for quality management systems, not a standard for products. Since ISO
9001 is internationally recognized, companies that obtain this can gain international trust.
20
(FAQ) Some exam questions ask about the roles of international standardization organizations including ISO and ITU.
However, most exam questions assume that you have prior knowledge of these organizations. A good example of such a
question is, “Which image compression format is being standardized by a joint organization of ISO and IEC?” Be sure to
know at least the names and roles of the standardization organizations mentioned in this book.
When exchanging data, one of the following processes is necessary: to deliver data after
making the data compatible with the format of the receiving party; or to receive data in a
different format and then change the format to the receiver's format. Either way, if the formats
are inconsistent, it is necessary to change them for the other party. However, this “change of
format” can be eliminated if there are standard formats and everyone uses those formats. The
data to be standardized here include character codes and file formats.
Character Codes
A character code is a code assigned to each character and symbol for the purpose of processing
those characters and symbols on computers. Character codes that can be processed vary depending
on the computer.
Code      Explanations
EBCDIC    Extended Binary Coded Decimal Interchange Code
          Character code established by IBM for general-purpose computers.
          A set of 8 bits represents one character.
Unicode   Standard for expressing the characters used all over the world in one
(UCS-2)   integrated character code
          All characters are expressed using 2 bytes.
          This is adopted as part of international standards by the ISO.
In addition to the codes listed in the table above, there are ASCII21 and EUC,22 among
others.23
21
ASCII (American Standard Code for Information Interchange): It is a character code set established by ANSI, setting
codes for the alphabet letters, numerals, special characters, and control characters such as the new-line (return) code, each
using 7 bits. ASCII is adopted as part of international standards by the ISO (ISO 646).
22
EUC (Extended Unix Code): It is a character code used mainly on UNIX. It can process 2-byte characters as well as
1-byte characters. It is an international standard code established by AT&T and includes Japanese EUC, Korean EUC, and
Chinese EUC, etc.
23
(Note) Unicode was extended after its initial standardization to use 3 or more bytes. Hence, today it is defined such that
every character uses 4 bytes in Unicode (UCS-4).
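The bit and byte counts quoted for these character codes can be checked with a short sketch. It assumes Python's built-in codecs: "cp037" is one EBCDIC variant picked for illustration, and "utf-16-be" and "utf-32-be" stand in for UCS-2 and UCS-4, which they match for a basic letter such as "A".

```python
# Illustrative only: byte counts for one character under several encodings.
# Codec names are Python's; "cp037" is one EBCDIC variant chosen for the demo.
def encoded_size(text: str, codec: str) -> int:
    """Return the number of bytes used to encode text with the given codec."""
    return len(text.encode(codec))

ch = "A"
sizes = {
    "ascii": encoded_size(ch, "ascii"),          # 7-bit code stored in 1 byte
    "cp037": encoded_size(ch, "cp037"),          # EBCDIC: 8 bits per character
    "utf-16-be": encoded_size(ch, "utf-16-be"),  # same as UCS-2 for this character
    "utf-32-be": encoded_size(ch, "utf-32-be"),  # same as UCS-4
}
for codec, size in sizes.items():
    print(f"{codec:10s} {size} byte(s)")

# The same letter has different code values in ASCII and EBCDIC.
print(hex(ch.encode("ascii")[0]), hex(ch.encode("cp037")[0]))
```

The last line shows why a "change of format" is needed when systems use different codes: the letter is the same, but the stored value is not.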
Image Files
An image file is a file in which a still image like a photograph or an illustration is digitized as a
file. There are various file formats as listed below.
| Format | Explanation |
|---|---|
| JPEG | Joint Photographic Experts Group: A joint organization of ISO and ITU-T for coding color still images, or the compression/decompression method established by this organization. |
| GIF | Graphics Interchange Format: An image format developed by CompuServe, a large online service company in the US. It supports color or monochrome images with 256 or fewer colors. |
| BMP | Format that saves images as bitmap data; the standard graphics format used by Windows. |
| TIFF | Tagged Image File Format: Data are expressed using tags in data blocks within files. The tags specify the data format. |
24
(Note) MPEG also codes audio and sound along with the moving images. MPEG has MPEG-1, MPEG-2, MPEG-3,
MPEG-4, and more. MPEG-1 is for storage media such as CD-ROM and is an ISO/IEC standard (ISO/IEC 11172). MPEG-2 is
an upgraded version of MPEG-1 and is for HDTV (high-definition television) as well as image transmission using broadband
ISDN. MPEG-4 is a high-performance coding of moving images and audio designed for the Internet and radio communication
(mobile communication).
25
(FAQ) Exam questions on the characteristics of SGML, HTML, and CSV have frequently appeared. The key word for
each is “markup” for SGML, “hyperlink” for HTML, and “comma” for CSV.
In order to exchange transaction data between companies, the data being exchanged need to be
standardized. One of these standardization concepts is EDI.26 Another is STEP, which is an
exchange standard for product model data. If optimum software products can be freely joined
together, including software written by various software manufacturers, a better system can be
constructed. For this, software also needs to be standardized.
| Standard | Explanation |
|---|---|
| EDI | Electronic Data Interchange (electronic transaction, electronic data exchange) |
| CALS | Commerce At Light Speed. All product-related information is shared from specifications, development, and design to procurement, operation, and maintenance. It is designed to improve productivity, shorten the development period, and reduce costs. |
| EC | Electronic Commerce. Sales are made not at a store or through mail order but on the Internet. Electronic money27 is being considered as a means of electronic payment. |
| STEP | Standard for the Exchange of Product Model Data (ISO 10303 standard). International standard for the exchange of product model data. |
Open Systems
An open system is a computer system constructed in such a way that, by standardizing the
specifications, hardware and software can function without conflict, regardless of the
manufacturer. In distributed processing systems, hardware from different manufacturers is often
connected together to construct a system; thus, hardware and software need to be standardized.
26
(Hints & Tips) EDIFACT adopted in the U.S. and Europe is used to make data exchange overseas more efficient.
27
Electronic money: It is a method of payment using IC or PC communication, characterized by the feature that physical
bills and currency are not used. It is used as a means to make payments in e-commerce on a network such as the Internet. In
addition, there are IC cards, as small as a business card, equipped with a microprocessor on which an amount is recorded, so
that the user can carry it just like cash.
Standardization of Software
For the standardization of object-oriented software, there are the following specifications,
standards, and standardization organizations.
| Name | Explanation |
|---|---|
| CORBA | Common Object Request Broker Architecture. Shared specifications so that objects can exchange messages with each other in a distributed system environment. This is established by OMG (Object Management Group). |
| EJB | Enterprise JavaBeans. Standard specifications to construct Java distributed object-oriented applications. It is possible to combine components using tools from different vendors. It is compatible with CORBA. |
| RFC | Request for Comments. A group of documents on technical proposals and comments which is compiled by IETF.28 It is available on the Internet and can be obtained by FTP or e-mail. TCP/IP-related protocols, etc., are written in RFC. |
| OMG | Object Management Group. A non-profit organization promoting the popularization and standardization of object-oriented technology. It establishes the industrial standards (OMA)29 in the field of object-oriented technology. |
28
IETF (Internet Engineering Task Force): It is an organization for designing and developing Internet protocols and
architectures. This group is open to network designers and researchers, and anyone can join.
29
OMA (Object Management Architecture): It consists of ORB (object request broker, functions and software used by
objects to exchange information with each other), which exchanges messages between objects, a fundamental concept in
distributed object orientation; a group of object services, which provide services (CORBA-services) based on ORB; objects
that make up application parts; and other components. The common specifications of ORB, which is central, are CORBA.
Quiz
Q1 Describe the contents of the ISO 9000 series.
A1 The ISO 9000 series is a set of international standards for quality-assuring structures and
guidelines indicating the ability of a company or an organization to provide products required
by customers. Since the ISO 9000 series is internationally recognized, a company that obtains
this can gain an international reputation for reliability.
A2 (1) JPEG
(2) MPEG
Q1. Which of the following procedures enables a sender to send an encrypted document to a
receiver by using public key cryptography?
a) The sender encodes the document by using his own public key, and the receiver decodes
the document by using his own private key.
b) The sender encodes the document by using his own private key, and the receiver decodes
the document by using a public key.
c) The sender encodes the document by using the receiver's public key, and the receiver
decodes the document using his own private key.
d) The sender encodes the document by using the receiver's private key, and the receiver
decodes the document by using his own public key.
Answer 1
Correct Answer: c
Public key cryptography is a system in which the encryption key is publicly released while the
decryption key is kept secret. The released key is called the public key, and the one kept secret is
called the private key. Unlike in common key cryptography (secret key cryptography), where a separate
shared key is needed for each communicating pair, each receiver needs only one encryption key and one
decryption key, so the management of keys is easier. Further, since the encryption key is publicly
shared, no secret key has to be sent to the other party. However, because the computation is more
complex than in common key cryptography, encryption and decryption can be time-consuming.
In public key cryptography, the sender encrypts the message by using the receiver's public key
while the receiver decrypts the message using his or her own private key.
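The key roles in the correct answer can be illustrated with a toy RSA sketch. RSA is only one example of a public key system and is not named in the text; the primes, exponent, and message below are hypothetical and far too small for real use.

```python
# Toy RSA with tiny primes: for illustration only, never for real security.
p, q = 61, 53                 # receiver's secret primes (hypothetical values)
n = p * q                     # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent, coprime to phi
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)           # published by the receiver
private_key = (d, n)          # kept secret by the receiver

def encrypt(message: int, key: tuple[int, int]) -> int:
    """Encrypt an integer message with the receiver's public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)

def decrypt(cipher: int, key: tuple[int, int]) -> int:
    """Decrypt with the private key that pairs with the public key used."""
    exponent, modulus = key
    return pow(cipher, exponent, modulus)

message = 65                                  # must be smaller than n
cipher = encrypt(message, public_key)         # sender uses receiver's public key
recovered = decrypt(cipher, private_key)      # receiver uses own private key
print(cipher, recovered)
```

Only the holder of the private key can recover the message, which is why the sender never needs access to any secret key.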
a) In public key cryptography, what has been encrypted by the public key is decrypted by
the private key paired with that public key. In option (a), the public key belongs to
the sender whereas the private key belongs to the receiver, so they do not make a pair.
d) A private key is kept secret. In this answer, it is stated that “the sender encrypts the
document by using the receiver's private key,” but the sender does not have the private key
of the receiver.
a) Even if a program file in which a virus lies hidden exists on the computer, as long as the
user does not intentionally activate the file, the computer will not be infected.
b) Viruses destroy the main memory physically and trigger operations not intended by the
computer user.
c) A computer that is updated with the latest engines and signature files for detecting and
exterminating viruses will not become infected.
d) In the virus extermination process, the user can avoid infection from the boot sector by
using an OS startup disk that is not infected by the virus.
Answer 2
Correct Answer: d
When a computer is turned on, the first drive where the system goes to read the program is called the
startup drive, and the hard disk or the floppy disk used as the startup drive is called the startup disk. A
startup disk is prepared in advance to be used in case of emergencies. When the boot sector is infected
with a virus, the OS is to be started up from the startup disk which is not infected by the virus.
a) A boot sector virus (or simply called “boot virus”) enters the boot sector in which the boot
program (the program that starts up the OS from the hard disk) is stored and attacks the
computer when it is turned on. Thus, virus infection can occur even when the user is
unaware.
b) Since a virus is a program, it does not destroy hardware although it does destroy software.
c) An engine (software) that detects and eliminates viruses is called a vaccine (vaccine
software, computer vaccine). A signature file stores information on previously discovered
viruses. Vaccines detect and exterminate viruses by cross-checking the target programs and
data with the signature file. Therefore, a vaccine cannot detect or remove a new virus not
yet registered in the signature file.
Q3. Which of the following is an appropriate description concerning ISO 9001:2000 certification?
Answer 3
Correct Answer: d
The ISO 9000 series is a collective term referring to the multiple international standards established by
ISO concerning the quality management systems of companies. The standard for certification is ISO
9001, and other standards are items that show guidelines for obtaining the ISO 9001 certification. It is
not a standard for products; rather, it certifies internationally the quality processes of the companies or
organizations based on the following view points:
• They have the ability to provide products that satisfy the customer's requirements or applicable
required standards.
• They are working to improve customer satisfaction.
Incidentally, ISO 9001 was revised in December 2000. The required items that used to be distributed
are organized into four categories: “management responsibility,” “resource management,” “product
realization,” and “measurement, analysis, and improvement.” It is characterized by the concept of a
quality management system and continuous improvement of the quality management system.
The “2000” in “ISO 9001: 2000” denotes the fact that it was revised in the year 2000.
a) A certified company must be examined every year or every six months, and a complete
re-assessment is required every three years. Hence, continuous activities are necessary.
b) There are many certifying organizations. An organization wishing to be certified can
choose any certifying organization at its discretion. However, each country has only one
accreditation organization that examines and approves certifying bodies.
c) It applies to all industries.
Chapter Objectives
In modern society, computers are used in various fields. In our daily lives,
we use personal computers at home and work. Computers are also used in
corporate accounting systems and production management as well as train
seat- and ticket-reservation systems. In this chapter, we will acquire
knowledge concerning the development of such systems. In Section 1, we
will study information strategies used by companies. Section 2 covers
corporate accounting, and Section 3 covers business management. In
Section 4, we will study specific examples of information systems using
computers.
Management control is an activity that integrates an organization to direct it toward the next
action to be taken. It is often said that a company consists of human resources, products,
finances, and information. Management control relates the flow of these components with one
another, coordinates them, and generates higher value by a guideline called a “management
strategy.”
CIO
CIO (Chief Information Officer) is the highest-ranking officer in charge of overseeing
information systems. Unlike an officer simply in charge of managing the information systems
department, CIO is responsible for developing information strategies to effectively utilize the
information resources in corporate management. In general, an officer in charge of supervising
the information systems department serves as CIO.1
KJ Method
The KJ method is named after the initials of Jiro Kawakita, the inventor of the method. In this
method, various ideas are generated to solve one problem, and these ideas are grouped together
and related with one another. When an information system is designed, the first step is to listen
to the opinions given by the users. The various comments and information collected during this
step include many contradictions and conflicts, especially when the number of the users
becomes large. The method can be used to effectively identify a true universal need among
these contradictions and conflicts.
1
(FAQ) CIO's meaning and roles have been asked in the past exams. In addition to being knowledgeable about the
information systems, CIO is also required to take responsibility for developing information strategies. Therefore, CIO is
required to possess a wide range of knowledge including the industry in general, the business of the company, and general
administrative functions.
Brainstorming
Brainstorming is a type of meeting which is held under the guideline that absolutely no
criticism is allowed on remarks made by the participants. It is characterized by four principles:
criticism is forbidden; comments are freely made; quantity is more important than quality; and
piggybacking on someone else's idea and position-switching are welcome. These principles
encourage participants to express their own ideas and opinions freely without any restrictions,
and, therefore, innovative ideas are expected to be generated during the course of the
brainstorming.
On the other hand, there is Off-JT (Off the Job Training), which is generally considered as
typical classroom-style education. This training is targeted for certain employees and is
conducted outside the usual workplace separate from their daily work.
2
(Note) One method of learning how to conduct information-gathering interviews is role-playing. Four people form one
team; one of them acts as the interviewer while another acts as the respondent. Then, the remaining two serve as observers
and make comments after completing the role-playing.
3
(Note) A project is an organization that is formed to achieve clearly defined objectives in terms of schedule, cost, and
technical performance under predefined time limits; it is dissolved upon achieving its objective. Unlike typical corporate
organizations, a project has an objective, a beginning, and an end. In most cases, the job is done by a group of people.
DSS
DSS (Decision Support System) is a system that supports decision-making by managers and
administrators facing non-structured problems (non-routine tasks). For decision-making in
non-routine tasks, it is difficult to have the necessary information defined in advance or to have
solution models prepared. Hence, DSS is equipped with a database function,4 model base
function,5 and human interface function.6 Using these functions, the user can search for a solution
to a non-routine task.
SIS
SIS (Strategic Information System) is an information system that actively uses information
technology as a part of its corporate strategy to obtain a competitive edge. This includes
home-delivery systems for courier services and POS analysis systems at convenience stores.
BPR
BPR (Business Process Reengineering) is the work of modifying the actual business contents
and/or organization and restructuring the business field, based on an analysis of the business contents
and business flow, and redesigning them for optimization in order to achieve the target level of profit or
customer satisfaction.
4
Database function: It is a function that allows free search and analysis of necessary data when a problem occurs.
5
Model base function: It is a function that chooses appropriate solution models as needed, such as a simulation model or a
mathematical model, and performs trial and error.
6
Human interface function: It is an interface function that allows the database function and model base function to be used
easily and interactively.
Business models
A business model is a framework for making business concepts concrete. In other words, it is a
template for how to carry out business to generate profits. Business models have received greater
public attention as the advancement of IT (Information Technology) has made it possible to
differentiate a business by combining it with computers and the Internet, and to apply for patents
on such combinations. A patent on a business model is called a business-model patent.
e-business is a new business structure which takes advantage of the Internet and computers. In an
environment of expanding networks, this is an innovative business structure connecting that
expansion with the expansion of transactions. It is achieved by defining a business model and
making changes in business processes, rules, and organization.
Dot com business (.com business) is a collective term referring to general business activities
using the Internet. The term “dot com (.com)” comes from the domain name “.com,” which indicates a
commercial organization (company). Corporations actively doing business on the Internet are called
“.com (dot com) companies” or “e-companies.”8
SOHO
SOHO (Small Office Home Office) is a term coined by joining the phrases small office and home
office. The former is an attempt to use business resources in and out of the company effectively
through networks such as the Internet. The latter refers to working at home by obtaining necessary
information by accessing the company server from home and working via network communication.
This is a business mode popularized with the growth of the Internet.
7
Virtual company: It is a corporate structure where a company is set up virtually on a network and is managed by multiple
people.
8
EC (Electronic Commerce): It is a method of selling goods and services on a network such as the Internet instead of at a
store or through conventional mail order. A business can be started with little capital, and the operating costs can be
significantly reduced because there is no store and just a few people are managing the business. Also it is possible to provide
different information to different customers.
Quiz
Q1 Explain what CIO is.
Q4 Explain BPR.
A3 1. Criticism is forbidden.
2. Comments are freely made.
3. Quantity is more important than quality.
4. Piggybacking on someone else's idea and position-switching are welcome.
A4 BPR is the work of modifying the actual business contents and/or organization, as well as
restructuring the business field, based on an analysis of the business contents and business
flow, and redesigning for optimization, in order to achieve the target level profit or customer
satisfaction.
Corporate accounting is the procedure of reporting the activity status of a corporation to related
parties in and out of the corporation; it can be classified into financial accounting and
management accounting.
[Figure: the balance sheet shows assets on one side and liabilities, capital, and net income on the other; the profit and loss statement shows expenses and net income on one side and revenue on the other.]
9
Trial balance sheet: It is a table prepared to check whether or not the transaction is correct when the account is finalized.
The total amounts of debit and credit will always be equal.
Depreciation12
Depreciation is a method of reducing the value of a fixed asset by allocating the cost of acquiring
the fixed asset13 as an expense over its useful life according to a certain method. As shown below,
there are several methods, including the straight-line and declining-balance methods.
| Method | Description |
|---|---|
| Straight-line method | Find the difference between the acquisition cost and the residual value14 of the asset. Divide it by the useful life, and that fixed amount is deducted each period as depreciation. Depreciation for each period = (acquisition cost - residual value) / (useful life) |
| Declining-balance method15 | A certain fixed depreciation percentage is multiplied by the current book value (undepreciated value) of the fixed asset to obtain the depreciation expenses for the period. Depreciation for each period = book value (undepreciated value) × fixed rate |
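The two depreciation formulas can be compared with a small sketch. The acquisition cost is hypothetical; the 10% residual value and the 6-year life follow the notes in this section. The way the fixed rate is derived here (so that the book value reaches the residual value at the end of the useful life) is an assumption, although it does reproduce the rate of 0.319 quoted for 6 years.

```python
def straight_line(cost: float, residual: float, life: int) -> list[float]:
    """Depreciation per period = (acquisition cost - residual value) / useful life."""
    return [(cost - residual) / life] * life

def declining_balance(cost: float, rate: float, life: int) -> list[float]:
    """Depreciation per period = book value (undepreciated value) x fixed rate."""
    book, schedule = cost, []
    for _ in range(life):
        amount = book * rate
        schedule.append(amount)
        book -= amount            # the book value shrinks every period
    return schedule

cost = 1_000_000          # hypothetical acquisition cost
residual = 0.10 * cost    # residual value assumed to be 10% of the cost
life = 6                  # useful life in years

# Assumption: the rate is chosen so the book value reaches the residual value
# after `life` years; for 6 years this gives about 0.319.
rate = round(1 - (residual / cost) ** (1 / life), 3)

sl = straight_line(cost, residual, life)
db = declining_balance(cost, rate, life)
print(rate)
print([round(x) for x in sl])
print([round(x) for x in db])
```

The straight-line schedule is flat, while the declining-balance schedule is front-loaded: the expense is largest in the first period and falls as the book value falls.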
10
(Hints & Tips) Capital is the fund prepared by the company itself and is also called equity capital. Liabilities are capital
borrowed from someone and are called borrowed capital. The capital and liabilities (equity capital and borrowed capital) are
together called total capital.
11
(Note) Profits include the following:
Gross profit = sales – cost of sales
Operating income = gross profit – selling and general administrative expenses
Ordinary profit = operating income + non-operating income – non-operating expenses
Generally, the ordinary profit is the corporate profit that gets evaluated.
12
(FAQ) There will be exam questions that give an account item and an amount and ask you to calculate the operating
income and ordinary profit. Know the formulas for various profits well. There are also exam questions where you have to
calculate depreciation expenses. Understand well the meaning of the calculation formulas for the straight-line method and the
declining-balance method.
13
Acquisition cost: amount paid when the asset was purchased.
14
Residual value: value of the asset anticipated at the end of its useful life. Generally, this is 10% of the cost paid to acquire
the asset.
15
(Hints & Tips) The rate for the declining-balance method is determined by the depreciation duration. For example, if
computers are depreciated over 6 years, the rate is 0.319.
In break-even point analysis, we graph the relation “sales = unit sale price × quantity sold” by
plotting the quantity sold on the horizontal axis and the sales on the vertical axis. This graph (sales
line) is normalized so that it becomes a 45-degree line increasing to the right. On the other hand,
since the fixed cost is constant regardless of the quantity sold, it is shown as a horizontal line. The
variable cost can be indicated by the formula “variable cost = unit price of manufacturing ×
quantity sold,” so it becomes a line (the value is 0 if no units are sold), also increasing to the right,
according to the quantity sold. The sum of the fixed cost and the variable cost is then drawn as the
total cost line.
In a graph obtained by following the above procedure, the intersection point of the sales line and
the total cost line is the break-even point.17
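[Figure: break-even chart. Horizontal axis: quantity sold; vertical axis: cost, sales. The chart shows the sales line, the fixed cost line, the total cost line (fixed cost + variable cost), the break-even point, and the loss and profit regions.]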
16
Fixed cost: It is a cost incurred regardless of the sales, including personnel expenses (payroll), rent, and utility expenses.
Variable cost: It is a cost incurred depending on the quantity sold, such as the material cost.
17
(Hints & Tips) In principle, the sales line and the total cost line do intersect. The slope of the sales line is the unit
sale price, whereas the slope of the total cost line is the unit price of manufacturing (the total cost line is “fixed cost +
unit price of manufacturing × quantity sold”). Since the unit sale price includes the unit price of manufacturing as well as
markup (profit), that is, “unit price of manufacturing + markup = unit sale price,” the slope of the sales line is greater
than the slope of the total cost line.
18
For the break-even point sales, the following equation holds:
Break-even point sales = (fixed cost) / (1 - variable cost / sales)
= (fixed cost) / (1 - variable cost ratio)
= (fixed cost) / (contribution margin ratio)
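The break-even formula above can be checked numerically with a minimal sketch. All figures are hypothetical, and the function name is chosen only for illustration.

```python
def break_even_sales(fixed_cost: float, variable_cost: float, sales: float) -> float:
    """Break-even point sales = fixed cost / (1 - variable cost / sales)."""
    variable_cost_ratio = variable_cost / sales
    contribution_margin_ratio = 1 - variable_cost_ratio
    return fixed_cost / contribution_margin_ratio

# Hypothetical figures for one period
sales = 1000.0
variable_cost = 600.0
fixed_cost = 300.0

bep = break_even_sales(fixed_cost, variable_cost, sales)

# At the break-even point, sales minus total cost (fixed + variable) is zero.
profit_at_bep = bep - (fixed_cost + bep * variable_cost / sales)
print(bep, profit_at_bep)
```

With a variable cost ratio of 0.6, each unit of sales contributes 0.4 toward the fixed cost, so sales of about 750 are needed to cover a fixed cost of 300.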
Financial Analysis
Financial analysis is conducted to evaluate the management records and financial conditions by
analyzing the safety and profitability of a company. For indexes in financial analysis, relational
ratios are often used.19
Ratios of safety
Ratios of safety are ratios whereby the debt-paying ability of the company is evaluated. They are
shown in the following table.
| Ratio | Description |
|---|---|
| Current ratio | The ratio of current assets, which have relatively high liquidity, to current liabilities that will be due shortly. This indicates the short-term paying ability of the company. 200% or more is desirable. |
| Quick ratio | The ratio of quick assets, which have high liquidity, to current liabilities. This indicates the immediate paying ability of the company, more certain than the current ratio. 100% or more is desirable. |
| Fixed ratio | The ratio of fixed assets to equity capital. It is safest to procure fixed assets with equity capital, so the smaller this ratio is, the more desirable it is. 100% or less is desirable. |
| Debt equity ratio | The ratio of liabilities to equity capital. The less debt there is with respect to the equity capital, the safer it is; hence, a small value is desirable here. 100% or less is desirable. |
| Capital ratio20 | The ratio of equity capital to the total capital, indicating stability. The more equity capital there is with respect to the total capital, the better it is; therefore, a large value is desirable here. 50% or more is desirable. |
18
(Hints & Tips) Note that the profit used in break-even point analysis is the operating income.
19
Relational ratio: It is the ratio of an account item to other account items, expressed in percentage. Comparison among the
account items on B/S is called stationary analysis while comparison among account items on P/L is called dynamic analysis.
20
(Note) The calculation formula for each of the ratios of safety is as follows:
Current ratio = (current assets) / (current liabilities) × 100
Quick ratio = (quick assets) / (current liabilities) × 100
Fixed ratio = (fixed assets) / (equity capital) × 100
Debt equity ratio = (liabilities) / (equity capital) × 100
Capital ratio = (equity capital) / (total capital) × 100
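The five formulas above can be applied to a small balance sheet as follows. This is a sketch with hypothetical amounts; the class and field names are chosen for illustration, and total capital is taken as liabilities plus equity capital, as defined in this section.

```python
from dataclasses import dataclass

@dataclass
class BalanceSheet:
    current_assets: float
    quick_assets: float
    fixed_assets: float
    current_liabilities: float
    liabilities: float        # total liabilities (borrowed capital)
    equity_capital: float

    @property
    def total_capital(self) -> float:
        # Total capital = equity capital + borrowed capital
        return self.liabilities + self.equity_capital

def safety_ratios(bs: BalanceSheet) -> dict[str, float]:
    """Return each ratio of safety as a percentage."""
    return {
        "current ratio": bs.current_assets / bs.current_liabilities * 100,
        "quick ratio": bs.quick_assets / bs.current_liabilities * 100,
        "fixed ratio": bs.fixed_assets / bs.equity_capital * 100,
        "debt equity ratio": bs.liabilities / bs.equity_capital * 100,
        "capital ratio": bs.equity_capital / bs.total_capital * 100,
    }

# Hypothetical amounts
bs = BalanceSheet(current_assets=400, quick_assets=250, fixed_assets=600,
                  current_liabilities=200, liabilities=500, equity_capital=500)
ratios = safety_ratios(bs)
for name, value in ratios.items():
    print(f"{name:18s} {value:6.1f}%")
```

In this example the current, quick, debt equity, and capital ratios all meet the desirable levels listed in the table, while the fixed ratio of 120% exceeds the desirable limit of 100%.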
Ratios of profitability
Ratios of profitability are indexes showing how much profit the company is making. They are
classified as shown in the table below.
| Ratio | Description |
|---|---|
| Ratio of return on equity | This shows how much profit there is with respect to the capital, i.e., the efficiency of the capital use. The larger the profit, the better it is, so a large value is desirable. |
| Ratio of profit to net sales | This indicates how much profit there is with respect to the sales. The larger the profit, the better it is, so a large value is desirable. |
| Turnover ratio | This indicates the degree to which the assets and capital are used within one accounting period. Since it is desirable to have large sales with little assets and capital, a large value is desirable. |
Specifically, there are ratios as listed below. As the capital can change during one period, the mean
value at the beginning and at the end of the period is commonly used.21
Ratio of operating income to total capital = (operating income) / (total capital) × 100
Ratio of operating income to sales = (operating income) / (sales) × 100
Turnover of total capital = (sales) / (total capital) × 100
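A minimal sketch of these three ratios follows, using the mean of the total capital at the beginning and at the end of the period. All amounts are hypothetical, and the turnover is multiplied by 100 only because the formula above is written that way.

```python
def average(begin: float, end: float) -> float:
    """Mean of the values at the beginning and at the end of the period."""
    return (begin + end) / 2

def profitability(operating_income: float, sales: float,
                  capital_begin: float, capital_end: float) -> dict[str, float]:
    total_capital = average(capital_begin, capital_end)
    return {
        "operating income to total capital": operating_income / total_capital * 100,
        "operating income to sales": operating_income / sales * 100,
        "turnover of total capital": sales / total_capital * 100,
    }

# Hypothetical amounts
result = profitability(operating_income=30, sales=150,
                       capital_begin=90, capital_end=110)
for name, value in result.items():
    print(f"{name:36s} {value:6.1f}")
```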
Costs
Costs are the expenses required to manufacture and sell products. Depending on what expenses
are included, they can be classified as shown in the following figure.
[Figure: cost composition]
Sales price = total cost + sales profit
Total cost = manufacturing cost + sales expenses + general administrative cost
Manufacturing cost = manufacturing direct cost + manufacturing indirect cost
Manufacturing direct cost = direct materials cost + direct labor cost + direct expenses
Depending on how costs are incurred, they can be classified into materials cost, labor cost, and
expenses. Materials cost is the cost incurred by consuming materials. Labor cost is the cost incurred
by consuming labor. Costs other than materials cost and labor cost are expenses.
On the other hand, if the costs are considered to be related to the product, they can be classified
into direct and indirect costs. Direct costs are the costs that can be directly calculated for a specific
product. Indirect costs are the costs that cannot be calculated for a specific product; they are
distributed to various products according to certain criteria.
21
(Hints & Tips) When calculating ratios of profitability and using the total capital in the calculation, the average capital
value is often used. This is because the capital (including liabilities) is different at the beginning and at the end of the period.
The average capital value is the average of the total capital at the beginning and at the end of the period.
Inventory Evaluation
Inventories are defined as assets that will be converted to cash by sales or be consumed for
manufacturing the products. Examining the actual amount of the inventories is called inventory
evaluation. There are several methods, as shown below, for inventory evaluation.
| Method | Description |
|---|---|
| Last-in first-out method (LIFO) | Inventory is evaluated with the assumption that new products in the inventory go out first, leaving old products (inventory close to the beginning of the period). |
| Moving average method | A new unit price is calculated using the residual amount and newly delivered amount each time products are brought in. Average unit price = (remaining balance + newly delivered value) / (residual quantity + quantity newly delivered) |
| First-in first-out method (FIFO) | Inventory is evaluated with the assumption that old products in the inventory go out first. New inventory (inventory close to the end of the period) remains. |
| Periodic average method | The inventory prices and quantities are totaled to find the average, regardless of the time, at the beginning or at the end of a period. |
The inventory value changes depending on which inventory evaluation method is used. Each
company decides which method is to be adopted.22
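The effect of the evaluation method can be seen in a small sketch. The receipts and the issued quantity are hypothetical, with the purchasing unit price rising over time and all units issued after the last receipt; the function names are chosen for illustration.

```python
from collections import deque

# (quantity, unit price) receipts in order of delivery; hypothetical figures
receipts = [(10, 100.0), (10, 120.0), (10, 140.0)]
issued = 15   # units shipped after all receipts

def remaining_value(receipts, issued, newest_first):
    """Value of the stock left after issuing units from one end of the receipts."""
    lots = deque(receipts)
    to_issue = issued
    while to_issue > 0:
        qty, price = lots.pop() if newest_first else lots.popleft()
        if qty > to_issue:
            # Put back the unissued part of this lot
            rest = (qty - to_issue, price)
            if newest_first:
                lots.append(rest)
            else:
                lots.appendleft(rest)
            to_issue = 0
        else:
            to_issue -= qty
    return sum(q * p for q, p in lots)

def moving_average_unit_price(receipts):
    """Recalculate the average unit price each time products are brought in."""
    quantity, unit_price = 0, 0.0
    for qty, price in receipts:
        balance = quantity * unit_price + qty * price  # remaining balance + new value
        quantity += qty
        unit_price = balance / quantity
    return unit_price

left = sum(q for q, _ in receipts) - issued
fifo = remaining_value(receipts, issued, newest_first=False)
lifo = remaining_value(receipts, issued, newest_first=True)
moving = left * moving_average_unit_price(receipts)
print(fifo, moving, lifo)
```

With rising prices, FIFO leaves the newest and most expensive units in stock and so gives the highest value, LIFO gives the lowest, and the moving average falls in between.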
22
(Hints & Tips) The result of inventory evaluation depends on the economic conditions. For products whose purchasing
unit price is gradually increasing, the unit price is higher for the units delivered into the inventory later; in this case, the FIFO
method will result in the highest evaluation. If, on the other hand, the purchasing unit price is gradually dropping, the unit
price is higher for those units that were delivered to inventory first. Hence, the LIFO method will result in the
highest evaluation. The periodic-average and moving-average methods will result in intermediate values between the results of
the LIFO and FIFO methods.
Quiz
Q1 When finalizing accounts at the end of a period, the following profit-and-loss statement was
obtained. Calculate the operating income for the period.
A1 Operating income
= sales – cost of sales – selling, general and administrative expenses
= 150 – 100 – 20
= 30
30 (million dollars)
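The calculation in A1 can be written out step by step. The sketch uses only the figures shown in the answer; the variable names are chosen for illustration.

```python
# Figures from A1, in million dollars
sales = 150
cost_of_sales = 100
sga_expenses = 20      # selling, general and administrative expenses

gross_profit = sales - cost_of_sales              # gross profit = sales - cost of sales
operating_income = gross_profit - sga_expenses    # operating income = gross profit - SG&A
print(gross_profit, operating_income)
```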
A2 Current ratio = (current assets) / (current liabilities) × 100
7.3.1 IE
Points
• Common methods of IE include ABC analysis and QC seven tools.
• ABC analysis is used to find critical control points such as inventory control.
IE (Industrial Engineering) is a system of engineering techniques and methods for optimally designing, operating, and
controlling human resources, products (machines, equipment, raw materials, auxiliary materials
and energy), finances, and information, in order to set out management objectives and to achieve
those objectives while taking into account a harmony with the environments (both social and
natural environments). It is defined as a variety of activities related to the entire process of
production management.23
Seven QC Tools
The seven QC tools are tools used to analyze mainly quantitative numerical data as shown in the
following table.
Tool | Description
Cause & effect diagram | Component elements of a certain theme are carefully analyzed, and this diagram clearly displays the structure. Due to its shape, it is sometimes called a “fishbone.”
Pareto diagram | Starting with an item with the largest quantity, the cumulative total is connected by a line while the actual quantity of each item is displayed with a bar graph. Important items and causes are then chosen from the large amount of data.
Histogram | The range of data is partitioned into subintervals, and the frequency of data in each subinterval is counted and displayed with a bar graph. The variation is then understood from the distribution condition, shape, and average.
Scatter diagram | By the look of data spread in the diagram, we can understand the existence and strength of a correlation between two attributes (such as cause and effect).
Check sheet | This is used to collect data for each item and to check for any lack of verification. It is a collective term of diagrams/tables which are easy to understand simply by checking.
Stratification | This refers to classifying obtained data and survey results into items. It is necessary to use graphs so that the difference among the items can be seen at a glance.
Control chart | This is a diagram used to study whether or not a particular process is in stable condition and to maintain the process in stable condition.
23
(Hints & Tips) There are many definitions of IE as well as many scopes of application. As for the scope of application, the
general consensus and interpretation is that it includes the analysis methods and process control centered on work research
(not just to determine the efficient work methods by studying and analyzing the work methods and work conditions; this also
includes a system of analysis methods for setting fair standard duration). However, some believe that, in a broad sense, IE
could involve anything related to management control while, in a more limited sense, it is limited to production management.
[Figure: examples of QC tools, including a check sheet (counts per one-hour time slot, total 125), a stratified graph by category (such as Agriculture and Miscellaneous) for the years 1992 to 2001, and a control chart with an upper control limit and a center line]
ABC Analysis
Goods in an inventory can be grouped by item of goods, and then each group can be arranged in
descending order by inventory price (inventory configuration ratio) or by sales revenue (sales
revenue configuration ratio). Then, the cumulative sums can be shown on the same graph so that
the inventory can be categorized and managed in 3 groups—Groups A, B, and C. This method is
called ABC analysis.25 For ABC analysis, we use a Pareto diagram as shown below.
24
(Hints & Tips) For “check sheets” and “stratification,” there is no specific figure or graph format. We can use an
appropriate diagram or graph for the objective. The figures shown here are just examples.
25
(FAQ) There are many exam questions on ABC analysis. Understand its viewpoint and where to apply it. In application,
you may see this on programming tests. For example, as a source for discussing which programs should be managed at a high
level, the number of errors for each program may be shown in a Pareto diagram.
[Figure: Pareto diagram for ABC analysis; group A is marked for high-level control, and groups B and C are marked for low-level control]
In ABC analysis, product group A requires critical (or high-level) control while product groups B and C require relatively low-level control. When the items are accumulated in descending order, product group A covers the range up to about 70% of the cumulative configuration ratio, product group B the range from about 70% to 90%, and product group C the range from 90% upward.
ABC analysis is a technique of analysis and control based on the Pareto Principle.26
26
Pareto Principle: Only a few factors have significant impact on a certain event while most factors have very little impact.
On this basis, product group A has the highest priority to be managed in ABC analysis.
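The grouping described above can be sketched in a few lines. This is only an illustration: the item names and sales figures are hypothetical, the 70% and 90% boundaries are the approximate values given in the text, and the rule for an item that straddles a boundary (it goes to the lower-priority group) is an assumption.

```python
def abc_classify(sales_by_item, a_limit=0.70, b_limit=0.90):
    """Assign each item to group A, B, or C by cumulative configuration ratio."""
    total = sum(sales_by_item.values())
    # Arrange items in descending order of sales revenue.
    ranked = sorted(sales_by_item.items(), key=lambda kv: kv[1], reverse=True)
    groups = {}
    cumulative = 0
    for item, amount in ranked:
        cumulative += amount
        ratio = cumulative / total       # cumulative configuration ratio
        if ratio <= a_limit:
            groups[item] = "A"
        elif ratio <= b_limit:
            groups[item] = "B"
        else:
            groups[item] = "C"
    return groups

# Hypothetical sales revenue per item.
sales = {"P1": 420, "P2": 250, "P3": 130, "P4": 90, "P5": 70, "P6": 40}
print(abc_classify(sales))
# {'P1': 'A', 'P2': 'A', 'P3': 'B', 'P4': 'B', 'P5': 'C', 'P6': 'C'}
```

Here two of the six items account for 67% of the revenue, which is the situation the Pareto Principle describes.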
An analytical method called PERT (Program Evaluation and Review Technique) is used for
schedule control and process control of a large-scale project. In PERT, after an arrow diagram
(chart showing relationships of activities and numbers of days required) is prepared, various
analyses are performed to develop an optimum schedule.
Arrow Diagrams
An arrow diagram is suitable for managing large-scale projects in which multiple activities run
parallel. An example of an arrow diagram is shown below.27
[Figure: arrow diagram with nodes ① to ⑧ and activities A to I; the legend identifies a node, an activity, the days required, and a dummy activity28]
In this arrow diagram, activity A is the initial work that begins the whole process, and its duration is 4 days. Following activity A, activities B, C, and D are carried out in parallel. Further, at node ⑥, activities E and F merge. This indicates that activity H cannot be initiated until both of these activities are completed.
27
(Note) In an arrow diagram, letters A, B, … I, shown along the arrows are each called activities. In reality, activity titles
such as systems design and programming are necessary. The numbers along the arrows refer to the numbers of days required,
which represent durations required for those activities. Instead of numbers of days, we can use hours, dates, or years.
However, within one arrow diagram, this unit needs to be consistent.
28
Dummy activity: It is an activity without any content. A dotted line is used to indicate synchronization. When C is
finished, activity F can begin immediately, but activity I cannot begin until activities C and G are both completed. The
number of days required for a dummy task is considered 0.
Forward calculation
We calculate the earliest node times29 as we follow the diagram starting at node ①. At nodes where multiple activities merge, we calculate the number of days required along each incoming path and make the maximum the earliest node time for that merge node. This is because the next activity cannot be initiated until all the activities merging at that node are completed. The calculation result at each node is to be written in the top box at the node. Consequently, we can calculate the number of days required for the entire project.
[Figure: forward calculation on the arrow diagram]
①→②: 0 + 4 = 4
②→③: 4 + 6 = 10
②→④: 4 + 5 = 9
②→⑤: 4 + 2 = 6
③→⑥: 10 + 1 = 11
④→⑥: 9 + 2 = 11
  Take the larger one (if the numbers are equal, take either one). → 11
④→⑦: 9 + 0 = 9
⑤→⑦: 6 + 1 = 7
  Take the larger one. → 9
Basically, if we take the earliest node time at one node and add the number of days required for
the next activities, we get the earliest node time for the next node, unless the next node is where
multiple activities merge. In that case, we should pay closer attention.
[Figure: activity A runs from node ① to node ② and requires a days. If the earliest node time of ① is x, then the earliest node time of ② = earliest node time of ① + number of days required for activity A = x + a]
In forward calculation, we set the earliest node time of the first node to “0.” Next, since activity A takes 4 days to complete, node ② can be departed 4 days later. In other words, activities B, C, and D can each begin 4 days later. At node ⑦, however, activities merge. The path “A, C, dummy activity” takes 9 days, while the path “A, D, G” takes only 7 days. Activity I cannot begin until both activity G and the dummy activity (actually activity C) are finished, so it cannot begin until 9 days later. Hence, for the earliest node time at node ⑦, we need to take the larger of the two numbers.
29
Earliest node time: It is the earliest time at which an activity may be started when all preceding activities are completed
as rapidly as possible.
Backward calculation
Now we calculate the latest node times,30 beginning at node ⑧ and working our way backward. Where multiple activities diverge, we calculate the value along each outgoing path and take the minimum, which becomes the latest node time at that node. The reason is that, since activities diverge at that node, the latest node time needs to be determined by the activity that must begin earliest. The calculation result at each node is to be entered in the bottom box.
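[Figure: backward calculation on the arrow diagram]
Node ⑧: copy the earliest node time (15).
⑥←⑧: 15 - 3 = 12
⑦←⑧: 15 - 6 = 9
③←⑥: 12 - 1 = 11
④←⑥: 12 - 2 = 10
④←⑦: 9 - 0 = 9
  Take the smaller one. → 9
⑤←⑦: 9 - 1 = 8
②←③: 11 - 6 = 5
②←④: 9 - 5 = 4
②←⑤: 8 - 2 = 6
  Take the smallest one. → 4
①←②: 4 - 4 = 0 (The last is always 0.)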
Basically, if we take the latest node time at one node and subtract the number of days required for
the previous activity, we get the latest node time for the previous node, unless the node is where
multiple activities diverge. In that case, we should pay closer attention.
[Figure: activity A runs from node ① to node ② and requires a days. If the latest node time of ② is x, then the latest node time of ① = latest node time of ② - number of days required for activity A = x - a]
By using forward calculation, we found that the project takes 15 days to complete. At the last node, the earliest node time and the latest node time are identical. By using backward calculation, we now calculate the latest time at which each node must be reached so that node ⑧ is still reached on day 15. For instance, take node ⑥. Node ⑧ needs to be reached 15 days after the project starts, and activity H takes 3 days. Hence, activity H can begin as late as 12 days after the project is initiated. Further, at node ④, if we consider the path “F, H,” we see that activity F can begin as late as 10 (= 15 - 3 - 2) days after the start; however, along the path “dummy activity, I,” the dummy activity must begin no later than 9 (= 15 - 6 - 0) days after the start. Hence, at node ④, the latest node time must be set to 9.
30
Latest node time: It is the latest time at which an activity may be started without delaying the minimum completion time
of the project.
Critical Paths
A path connecting nodes where the latest node time and the earliest node time are identical is
called a critical path. Activities along a critical path have no extra time, so these activities cannot
be delayed. If they are delayed, the number of days required for the entire project will change.
In the arrow diagram shown above, a critical path is “A→C→dummy activity→I” (①→②→④→⑦→⑧).
In order to shorten a project schedule, it is necessary to shorten the number of days required for
activities on a critical path.
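The forward calculation, the backward calculation, and the search for a critical path can be sketched as a short program. This is an illustrative sketch only: the activity table is reconstructed from the worked calculations in the text, the function names are hypothetical, and the code assumes that every arrow runs from a lower node number to a higher one, as it does in this diagram.

```python
# name -> (start node, end node, days required); the dummy activity takes 0 days.
ACTIVITIES = {
    "A": (1, 2, 4), "B": (2, 3, 6), "C": (2, 4, 5), "D": (2, 5, 2),
    "E": (3, 6, 1), "F": (4, 6, 2), "dummy": (4, 7, 0), "G": (5, 7, 1),
    "H": (6, 8, 3), "I": (7, 8, 6),
}

def node_times(activities):
    """Return (earliest, latest) node times as dictionaries keyed by node."""
    nodes = sorted({n for start, end, _ in activities.values() for n in (start, end)})
    # Forward calculation: at a merge node, take the maximum.
    earliest = {n: 0 for n in nodes}
    for n in nodes:
        for start, end, days in activities.values():
            if end == n:
                earliest[n] = max(earliest[n], earliest[start] + days)
    # Backward calculation: copy the last node, then take the minimum at branches.
    latest = {n: earliest[nodes[-1]] for n in nodes}
    for n in reversed(nodes):
        for start, end, days in activities.values():
            if start == n:
                latest[n] = min(latest[n], latest[end] - days)
    return earliest, latest

def critical_activities(activities):
    """Activities with no extra time between their start and end nodes."""
    earliest, latest = node_times(activities)
    return [name for name, (start, end, days) in activities.items()
            if earliest[start] + days == latest[end]]

earliest, latest = node_times(ACTIVITIES)
print(earliest[8])                      # 15
print(latest[1])                        # 0
print(critical_activities(ACTIVITIES))  # ['A', 'C', 'dummy', 'I']
```

The test used for a critical activity (earliest time at its start node plus its duration equals the latest time at its end node) is one common way to express "no extra time"; for this diagram it gives the same path as the definition in the text.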
Dummy Activities
A dummy activity is an activity without substance and is expressed by a dotted line. This is used
only to indicate the time relation between two activities. If there are two activities between two
nodes as shown below, this expression on the left can only show one activity, so we insert dummy
activity d as shown on the right.32
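[Figure: on the left, activities A and B are both drawn between nodes ① and ②; on the right, activity A runs from ① to ②, activity B runs from ① to ③, and dummy activity d runs from ③ to ②]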
31
(Hints & Tips) If the schedule (number of days) of some activities in an arrow diagram is shortened, a critical path may
change as well. There are situations where shortening an activity on a critical path by 2 days results in the entire project being
shortened by only one day. When the number of days required is shortened at some node, the earliest and latest node times
need to be re-calculated.
32
(FAQ) There are many exam questions that give you an arrow diagram and then ask you to calculate the number of days
required or to find a critical path. If the number of days required is the only thing you need to find, forward calculation is
sufficient. However, it is advisable to always perform backward calculation to verify the accuracy of the calculation. In
backward calculation, the latest node time at the very first node will always be 0. If you don't get 0, you must have made a
calculation error.
Linear programming (LP) is a method that is effective in answering questions where the constraint conditions and the objective function are all linear (first-degree). For instance, it can be used to plan production at a factory when the supply of resources is limited, or to minimize the transportation cost for distribution.
“In order to manufacture 1 ton of product A, we need 4 tons and 9 tons of raw materials P and Q,
respectively. For product B, we need 8 tons and 6 tons of these materials, respectively. In addition,
the profits resulting from products A and B are 20,000 and 30,000 dollars per ton, respectively.
However, we only have 40 tons of material P and 54 tons of material Q.”
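Let us examine how much of each product should be produced to maximize the overall profit. Let x and y be the amount (in tons) of products A and B to be produced, respectively. The following table summarizes the information above.
Item | Product A | Product B | Available amount
Raw material P (tons) | 4 | 8 | 40
Raw material Q (tons) | 9 | 6 | 54
Profit (10,000 dollars per ton) | 2 | 3 |
Amount produced (tons) | x | y |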
From the table, we can write down the following expressions for the constraint conditions and the
objective function.
• Constraint conditions:
4x + 8y ≤ 40 (for material P)
9x + 6y ≤ 54 (for material Q)
x ≥ 0, y ≥ 0 (non-negative condition)34
• Objective function:
Z = 2x + 3y (to maximize Z; Z is the profit in units of 10,000 dollars)
33
Constraint conditions/Objective function: Constraint conditions are expressions regarding supply limits of the materials,
etc. and are almost always expressed by inequalities. The objective function is the expression to maximize the profit, etc.
Linear programming is finding the maximum value of the objective function subject to the constraint conditions.
34
(Hints & Tips) The non-negative condition simply means that the value cannot be negative. x and y are amounts to be
produced and therefore will never be negative.
Solving Method
We now change the linear inequalities expressing the constraint conditions to the corresponding equations and graph them as lines on the coordinate plane to indicate the range defined by the inequalities, using the x- and y-axes. Note the non-negative conditions: x ≥ 0, y ≥ 0.
$$4x + 8y \le 40 \;\rightarrow\; 4x + 8y = 40 \;\rightarrow\; y = -\frac{1}{2}x + 5$$
$$9x + 6y \le 54 \;\rightarrow\; 9x + 6y = 54 \;\rightarrow\; y = -\frac{3}{2}x + 9$$
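[Figure: graph of the line 9x + 6y = 54, passing through (0, 9) and A (6, 0), and the line 4x + 8y = 40, passing through C (0, 5) and (10, 0); the two lines intersect at B (4, 3), and O (0, 0) is the origin]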
The area in which the constraints are satisfied is the region on the graph surrounded by points O, A, B, and C. For any linear objective function, it can be proved that a point (x, y) where the objective function achieves its maximum value is one of the vertices of this area.35
Hence, we now evaluate the function Z by substitution at the four vertices O, A, B, and C.
O: Z = 2 × 0 + 3 × 0 = 0
A: Z = 2 × 6 + 3 × 0 = 12
B: Z = 2 × 4 + 3 × 3 = 17
C: Z = 2 × 0 + 3 × 5 = 15
The value of Z is maximized at point B (x, y) = (4, 3). Therefore, the maximum profit will be
generated when 4 tons of product A and 3 tons of product B are made.36
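The vertex method used above can also be checked mechanically. The sketch below is an illustration under stated assumptions, not a general LP solver: it handles only two variables, writes every constraint in the form a·x + b·y ≤ c (so the non-negative conditions become -x ≤ 0 and -y ≤ 0), and uses hypothetical function names.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint (a, b, c) means a*x + b*y <= c.
CONSTRAINTS = [
    (4, 8, 40),   # material P
    (9, 6, 54),   # material Q
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]

def intersection(c1, c2):
    """Intersection of the two boundary lines, or None if they are parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    # Cramer's rule, with exact fractions to avoid rounding errors.
    return (Fraction(r1 * b2 - r2 * b1, det), Fraction(a1 * r2 - a2 * r1, det))

def feasible(point):
    x, y = point
    return all(a * x + b * y <= c for a, b, c in CONSTRAINTS)

def maximize(p, q):
    """Maximize Z = p*x + q*y by evaluating Z at every feasible vertex."""
    vertices = set()
    for c1, c2 in combinations(CONSTRAINTS, 2):
        point = intersection(c1, c2)
        if point is not None and feasible(point):
            vertices.add(point)
    return max((p * x + q * y, (x, y)) for x, y in vertices)

best_z, best_point = maximize(2, 3)
print(best_z, best_point)   # Z = 17 at (x, y) = (4, 3)
```

The candidate points (10, 0) and (0, 9) are rejected because each violates the other material constraint, which leaves exactly the four vertices O, A, B, and C.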
35
(Note) The optimum solution in linear programming is often the intersection of the lines. You will need to know how to
solve a system of linear equations.
36
(FAQ) The frequency at which linear programming questions appear on the exams is rather high, but most of the Morning
Exam questions will simply ask you to find the constraint expressions. The Afternoon Exam questions, however, will ask you
to find the solution. Hence, you will need to know how to solve these questions by reading the graphs.
Inventory control is a system of optimally managing the inventory of items such as products and raw materials that the company has stored. There are two ordering methods: the periodic ordering system and the fixed order quantity system. The EOQ (economic order quantity) formula is used to determine the appropriate quantity to be ordered based on the balance between the inventory expense and the ordering expense.
Ordering Systems
The system where the orders are placed at fixed intervals and the quantity ordered varies due to
changing demands and other factors is called the periodic ordering system. In contrast, the
system where the quantity ordered stays constant and orders are placed whenever the inventory is
depleted is called the fixed order quantity system. The characteristics of each system are shown
in the following table. 37
37
(Hints & Tips) The periodic ordering system is often applied to the A items in ABC analysis. The fixed order quantity
system is generally applied to the B and C items in ABC analysis. A items are usually expensive, so it is necessary to keep the
inventory low; hence, the demand for these must be carefully estimated.
Variable Description
M Firm demand for a fixed period (e.g., a year)
K Ordering expense per order
P Purchasing unit price
c Inventory maintaining rate38
x Quantity ordered per order
n Number of orders for a fixed period (e.g., a year)
If n orders are placed in a year and the quantity ordered each time is the same, the following relations hold:
Ordering expense = n × K
Demand (M) = n × x
Hence n = M / x, and the ordering expense can be written as (MK) / x.
Now, suppose that x units are in the inventory immediately after a delivery. Another order will be placed when the delivered units are consumed and the inventory becomes 0. Hence, the average inventory can be considered as x / 2. Therefore, the inventory expense can be expressed as follows:
$$\text{Inventory expense} = \frac{x}{2} \times c \times P = \frac{xcP}{2}$$
From the above, the total expense necessary to control the inventory (T) is as follows:
$$T = \frac{MK}{x} + \frac{xcP}{2}$$
If the value of x (= Q) that minimizes T is identified, this value is the optimum quantity to order. As the graph shows, considering the point where the ordering expense equals the inventory expense, the optimum quantity to be ordered can be calculated as follows:39
$$\frac{MK}{x} = \frac{xcP}{2}$$
$$x = \sqrt{\frac{2MK}{cP}} \quad (x > 0)$$
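The EOQ calculation can be tried with concrete numbers. In the following sketch the demand, expenses, price, and rate are hypothetical values chosen so that the result is a round number; the variable names follow the table of variables above, and the function names are assumptions.

```python
import math

def eoq(M, K, P, c):
    """Optimum quantity per order: the x at which MK/x equals xcP/2."""
    return math.sqrt(2 * M * K / (c * P))

def total_expense(x, M, K, P, c):
    """Ordering expense plus inventory expense for order quantity x."""
    return (M / x) * K + (x / 2) * c * P

# Hypothetical values: yearly demand 1,000 units, 100 dollars per order,
# unit price 80 dollars, inventory maintaining rate 25%.
M, K, P, c = 1000, 100, 80, 0.25

Q = eoq(M, K, P, c)
print(Q)                              # 100.0 units per order
print(M / Q)                          # 10.0 orders per year
print(total_expense(Q, M, K, P, c))   # 2000.0
```

Evaluating total_expense at quantities slightly below or above Q (for example 90 or 110) gives a larger total, which matches the shape of the total expense curve.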
38
Inventory maintaining rate: It is the rate of cost necessary for maintaining the inventory: “purchasing unit price *
inventory maintaining rate = inventory expense.”
39
(FAQ) There will be exam questions where you are asked to calculate the difference between the periodic ordering system
and the fixed order quantity system, optimum quantity to be ordered using the EOQ formula, and the number of orders to be
placed. Understand the characteristics of the periodic ordering system and the fixed order quantity system. You do not need to
memorize the EOQ formula as this will be given in question texts. Be sure, though, that you can use it in calculation.
[Figure: expense plotted against the quantity to be ordered, showing the inventory expense, the ordering expense, and the total expense; Q marks the optimum quantity to be ordered]
Probability is a value representing the likelihood that an event occurs. For example, the probability of getting “1” in a roll of a die is 1/6. Statistics numerically clarifies the tendencies of a population from a sample. To do this, it is necessary to calculate values such as the mean and variance. When the population is large, the normal distribution is often used.
Probability
When a die is rolled, one of the outcomes 1 through 6 will result. The number that comes up is called a random variable; here it takes the values 1 through 6. The probability of each of these outcomes is 1/6, totaling 1.
Consider the following example now. Company X purchases its products from Companies A, B,
and C, with percentages 50%, 30%, and 20%, respectively. Suppose that each of these companies
has a defective rate of 1%, 3%, and 3%, respectively. One product purchased by X was randomly
chosen, and it was defective. What is the probability that this was purchased from Company A?
For example, products from Company A make up 50% of all the products. The defective rate is
1%, meaning that the defective rate of Company A among all the products is obtained as follows:
Defective rate of Company A over all products = 50% × 1% ……..(1)
Similar calculations can be performed for Companies B and C as follows:
Defective rate of Company B over all products = 30% × 3% ……..(2)
Defective rate of Company C over all products = 20% × 3% ……..(3)
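The sum of (1), (2), and (3) is the defective rate over all products, and the defective rate of Company A over all products is (1), so the probability that the defective product was from Company A is calculated as follows:
$$\frac{(1)}{(1) + (2) + (3)} = \frac{0.5 \times 0.01}{0.5 \times 0.01 + 0.3 \times 0.03 + 0.2 \times 0.03} = \frac{0.005}{0.020} = 0.25$$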
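The same reasoning can be written as a short calculation. This sketch uses the percentages from the example; the dictionary and function names are hypothetical, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

# Share of purchases and defective rate for each supplier (from the example).
share = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
defective_rate = {"A": Fraction(1, 100), "B": Fraction(3, 100), "C": Fraction(3, 100)}

def probability_from(supplier):
    """Probability that a product known to be defective came from the supplier."""
    # Defective rate of each company over all products: expressions (1) to (3).
    over_all = {name: share[name] * defective_rate[name] for name in share}
    return over_all[supplier] / sum(over_all.values())

print(probability_from("A"))   # 1/4
print(probability_from("B"))   # 9/20
print(probability_from("C"))   # 3/10
```

Although Company A supplies half of the products, its low defective rate means that only one in four defective products comes from it.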
Let $\bar{x}$ be the mean of the sample values $x_1, x_2, \ldots, x_n$, V be the variance, and σ be the standard deviation.41 Then, the following equations hold:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
$$V = \sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
Since the meanings of these equations may be hard to understand, we will explain these concepts
specifically. Suppose five sample values have been taken out from the population and they are as
follows:
3, 2, 7, 7, 6
The mean is the sum of these sample values divided by the number of values, 5. Below, the divisor 5 is the number of sample values (sample size).
Mean = (3 + 2 + 7 + 7 + 6) ÷ 5 = 25 ÷ 5 = 5
To calculate the variance, find the difference between each of the sample values and the mean, square each difference, add up these terms, and then divide the sum by the number of sample values. Below, the value 5 subtracted from each sample value is the mean.
Variance = {(3 - 5)² + (2 - 5)² + (7 - 5)² + (7 - 5)² + (6 - 5)²} ÷ 5 = (4 + 9 + 4 + 4 + 1) ÷ 5 = 22 ÷ 5 = 4.4
The standard deviation is the positive square root of the variance, √4.4 ≈ 2.10.
In addition to the above, other measures, including the mode and the median, can be used in order
to make estimates concerning the population.43
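These measures can be computed directly from their definitions. The sketch below uses the five sample values from the text and divides by the number of values n, as in the variance formula given here; the variable names are hypothetical, and the median line assumes an odd number of values.

```python
import math

sample = [3, 2, 7, 7, 6]
n = len(sample)

mean = sum(sample) / n                               # 25 / 5
variance = sum((x - mean) ** 2 for x in sample) / n  # 22 / 5
std_dev = math.sqrt(variance)                        # positive square root

ordered = sorted(sample)                             # [2, 3, 6, 7, 7]
median = ordered[n // 2]                             # middle value (n is odd)
mode = max(set(sample), key=sample.count)            # most frequent value

print(mean, variance, round(std_dev, 2))   # 5.0 4.4 2.1
print(median, mode)                        # 6 7
```

Note that the mean (5), the median (6), and the mode (7) all differ for this sample, which is why they are treated as separate measures.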
40
Population/Sample: In a sample study, the entire set being studied is called the population, and a subset taken from the
population is called the sample. Since the population is often unknown, we make estimates about the population based on the
sample.
41
Expected value/Variance/Standard deviation: Expected value (or mathematical expectation) is the mean value of the
random variable. Variance measures the spread (variation) of the random variable; if the variance is small, the data values are
relatively close to the mean value, and we say that the “variation is small.” The positive square root of the variance is called
the standard deviation.
42
(Hints & Tips) Know the properties of standard deviation.
• The standard deviation does not change if a constant a is added to each data value.
• The standard deviation gets multiplied by a if each data value is multiplied by a.
43
Mode/Median: Mode is the most frequently occurring value in the sample. Median is the value in the middle when the
sample values are sorted in order. If there are odd sample values, the middle value is unique. If there are even values, take the
Binomial Distribution
Binomial distribution is the discrete probability distribution in which P(x) represents the probability that an event with probability P occurs exactly x times in n independent trials. Sometimes this is denoted B(n, P). For example, when a die is rolled n times, the number of times “1” comes up follows B(n, 1/6). The expected value μ and the variance V in binomial distribution are expressed as follows:44
μ = nP,  V = nP(1 - P)
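The formulas for the expected value and the variance can be confirmed numerically. The following sketch assumes the standard probability function of the binomial distribution, C(n, x) · P^x · (1 - P)^(n - x), which is not written out in the text; the example of 10 rolls of a die and the names used are hypothetical.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """Probability that an event with probability p occurs exactly x times in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Hypothetical example: the number of times "1" comes up in 10 rolls of a die.
n, p = 10, Fraction(1, 6)
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * prob for x, prob in enumerate(probabilities))
variance = sum((x - mean) ** 2 * prob for x, prob in enumerate(probabilities))

print(sum(probabilities))   # 1
print(mean)                 # 5/3, equal to n * p
print(variance)             # 25/18, equal to n * p * (1 - p)
```

Exact fractions are used so that the computed mean and variance match nP and nP(1 - P) exactly rather than approximately.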
Normal Distribution
Normal distribution is a continuous probability distribution, and it approximates binomial distribution when the probability P is not small and n is large. Sometimes it is denoted N(μ, σ²), where μ denotes the mean and σ is the standard deviation. In practice, however, we use the standard normal distribution.
The standard normal distribution N(0, 1) is obtained when any normal distribution is converted by the equation u = (x - μ) / σ. Here, x is a sample value, and u is the corresponding standardized value, which has mean 0 and standard deviation 1.
Let us now show an example of the standard normal distribution. Note that the standard normal
distribution is symmetric.
u     P(u)
0.0   0.5000
0.5   0.3085
1.0   0.1587
1.5   0.0668
2.0   0.0228
2.5   0.0062
3.0   0.0013
[Figure: standard normal curve; P(u) is the area under the curve to the right of u]
In the standard normal distribution, if u = 2.0, then P(u) represents the area of the region under the
curve satisfying “2.0 < u < ∞.” If u = 0.0, then it is the area of “0.0 < u < ∞.” Since this is exactly
the right half of the standard normal distribution, the area is 0.5 (50%).
Let us now work through an example using the standard normal distribution. Suppose that the dimension of a certain product manufactured at a certain fabrication process is normally distributed with a mean of 200mm and a standard deviation of 2mm. Assuming that the standard is 200mm±2mm, let us calculate the probability that the product is defective.
Since this product has the mean 200mm and standard deviation 2mm, the distribution can be expressed by normal distribution N(200, 2²). Thus, the limits 200±2 (the dimension range of the product is 198mm to 202mm) can be converted to the standard normal distribution as follows:
$$u = \frac{198 - 200}{2} = -1.0, \qquad u = \frac{202 - 200}{2} = 1.0$$
Therefore, the product is considered good if its dimension is between -1.0 and 1.0 in the standard normal distribution, and the area of the region -∞ to -1.0 as well as the region 1.0 to ∞ gives the probability that the product is defective. Searching the standard normal distribution table for P(u) for u = 1.0, one gets P(u) = 0.1587. Hence, since the distribution is symmetric, the probability we are looking for can be calculated as follows:
$$0.1587 \times 2 = 0.3174 \quad (\text{about } 31.7\%)$$
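The table values and the defect probability can be reproduced without a printed table. This sketch relies on the complementary error function in Python's standard math module, using the identity P(u) = 0.5 · erfc(u / √2) for the upper-tail area; the function names are hypothetical.

```python
import math

def upper_tail(u):
    """P(u): area under the standard normal curve to the right of u."""
    return 0.5 * math.erfc(u / math.sqrt(2))

def defect_probability(mean, std_dev, lower, upper):
    """Probability that a normally distributed dimension falls outside [lower, upper]."""
    u_low = (lower - mean) / std_dev     # standardized lower limit
    u_high = (upper - mean) / std_dev    # standardized upper limit
    # By symmetry, the area to the left of u_low equals the area to the right of -u_low.
    return upper_tail(-u_low) + upper_tail(u_high)

print(round(upper_tail(1.0), 4))                          # 0.1587
print(round(upper_tail(2.0), 4))                          # 0.0228
print(round(defect_probability(200, 2, 198, 202), 3))     # 0.317
```

The result agrees with the value obtained from the table; the small difference in the fourth decimal place comes from the table entry 0.1587 being rounded before it is doubled.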
Correlation Coefficient
Two quantities are said to have a correlation if there is a tendency that when one increases, so does
the other, or when one increases, the other decreases. The numerical value that quantifies
correlation is the correlation coefficient (r), which is interpreted in the table below.
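As an illustration of how such a coefficient is obtained, the sketch below computes the commonly used Pearson correlation coefficient. The text does not give a formula, so the choice of this definition is an assumption, and the data and function name are hypothetical.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r of two equally long lists of values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
print(correlation(xs, [2, 4, 6, 8, 10]))   # 1.0  (one increases, so does the other)
print(correlation(xs, [10, 8, 6, 4, 2]))   # -1.0 (one increases, the other decreases)
```

Values of r lie between -1 and 1; the closer r is to either end, the stronger the correlation, and values near 0 indicate little or no linear relationship.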
Quiz
Q1 Explain ABC analysis.
Q2 Which of the following histograms shows the distribution with the largest variance?
[Figure: three histograms offered as answer choices, each plotting frequency against data class]
A1 The management method in which goods in an inventory are grouped according to product
class, and then each group can be arranged in descending order by inventory price (inventory
configuration ratio) or by sales revenue (sales revenue configuration ratio) and the cumulative
sums are shown on the same graph so that the inventory can be managed in 3 groups—Groups
A, B, and C.
Various information systems are used in companies. These information systems can be
classified into engineering systems, represented by FA, and business systems, represented by
POS.
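[Figure: information systems of a manufacturing company. Systems shown: MIS45, CAP46/MRP47, CAD, CAPP48, CAM, CAE49, FMC, FMS, and FA. Functions shown: management planning, production planning, research and development, product design, process planning, schedule planning, and manufacturing process control]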
45
MIS: Management Information System
46
CAP: Computer Aided Planning
47
MRP: Material Requirement Planning
48
CAPP: Computer Aided Process Planning
49
CAE: Computer Aided Engineering
Method Description
Wire frame model Expression method for a 3-dimensional shape, using vertices and edges
Surface model Faces placed between wires
Expressing the intersection line of two planes and cross sections
Solid model Expressing the interior solid below the surfaces
CAD, in a broad sense, could refer to all computer-aided processes involving design, but generally
the term refers to computer support in regard to shapes and drawing of parts. To clarify this
distinction, the process of concept design and analysis evaluation is referred to as CAE.50 51
[Figure: design process. Stages shown: concept design, analysis evaluation, detail design, documentation and drawing, and process design]
50
FMS (Flexible Manufacturing System): It means automation of assembly lines compatible with flexible and
multi-model, small-quantity production. It consists of assembling machinery, robots, conveyers, unmanned transportation
vehicles, and automatic storage facilities.
51
FMC (Flexible Manufacturing Cell): It means automation of cell processes. A cell is the smallest unit of fabrication and
assembly in manufacturing.
52
(Hints & Tips) CAM receives data from CAD as they interact with each other; CAM then prepares manufacturing
instruction data. In reality, because CAD and CAM are linked closely, they are together called CAD/CAM.
FA (Factory Automation)
FA is a system that organically integrates and manages the entire production system including
production planning, ordering, fabrication, assembling, testing, inspecting, transporting, storing,
and delivering. Conceptually, FA contains all of CAD, CAM, and CAE, including CAT, 53
assembly, fabrication, and process control.
53
CAT (Computer Aided Testing): It is a system where computers are used to conduct various characteristic tests on parts
and products during the developing process of a product. This may also refer to a system in which computers are used to
inspect products during the manufacturing process.
54
(Hints & Tips) The use of a POS system could bring about the following results: shorter waiting time at checkout, certain
and accurate cash register entry, improved customer service, assessment of sales promotion effects, reduction in employee
training, and automation of tallying (statistical) work.
[Figure: POS system. Elements shown: customer, POS system, database, and wholesaler/supplier. Functions shown: information collection, inventory control, revenue management, and management control]
Electronic Banking
Electronic banking is a system in which computers of financial institutions and computers or
terminals of individuals and corporations are connected by communications lines whereby data
such as fund transfer and balance inquiries is transferred electronically. It includes the following
services.
Name Description
Firm banking Connecting financial institutions and corporations
Fund transfer such as inter-account transfer and deposit
Inquiries of deposits and withdrawals, etc.
Home banking Connecting financial institutions and individuals
Account balance inquiries
Fund transfer such as inter-account transfer and deposit
Application for fixed-term deposit, etc.
Internet banking Providing banking services on the Internet
PCs can be used to check balance, pay utility bills, and transfer money.
Card Systems
Card systems including those listed in the table below are provided to maintain the customer base or to bring in new customers. Payment methods and the need for an ID depend on the card. Bank POS cards are used as debit cards.55
Type | Point card | Prepaid card | Credit card | Bank POS card
Format | point service | amount prepaid; no name | amount paid later; paid all at once or by installments | amount paid immediately
Purpose | keeping customers | keeping customers | attracting customers | attracting and keeping customers
Identification function56 | Yes | No | Yes | Yes
Payment function | No | Yes | Yes | Yes
Record function | Yes | Yes | No | No
Groupware
Groupware is a family of software used to efficiently communicate and share information within
an organization such as a company.57 Whereas a business system handles company-wide regular
tasks, groupware is used for decision-making in irregular types of tasks such as schedule control of
meetings.
To do joint work in a group more efficiently, it is highly effective to use PCs and networks. For
instance, a group can use a PC-LAN to send and receive e-mails, manage schedules of jobs and
meetings, and communicate within the group to get the joint work done smoothly. Tools of
groupware include the following.
55
Debit card: It means a service in which a cash card issued by a bank can be used to make payment when shopping. The
amount of payment is directly deducted in real-time from the bank account.
56
(Hints & Tips) Identification function is the function to verify the identity of the person. Payment function means the
ability to make payment just as cash (bills). For instance, a prepaid card does not have the ID function, so whoever has the
prepaid card can make purchases. Record function refers to the capability of a card to keep a record on itself.
57
(Hints & Tips) The original meaning of groupware was intellectual joint work as a group. However, then it would imply
that the human work itself is groupware. So, in its revision, this term now refers to a system that supports joint work in an
organization by the use of computers, by providing a variety of services via a network, including electronic mail, electronic
bulletin boards, electronic conference, and conference room reservations.
PC Communication
PC communication means connecting PCs by a communications line to a host computer providing
a PC communications network so that PCs can communicate with one another and receive various
services, including electronic mail, electronic bulletin boards, electronic conference rooms, and
information services.58 In information-providing services, all kinds of information are provided,
such as news, weather forecasts, sports updates, market updates, corporate information, and
classified advertisements.
Commercial Databases
A commercial database is a database that provides, for a fee, business information such as
information on a company, science/technology-related information, and patent information.
Generally, this service uses databases via a communication line such as a PC communication
service.59
Quiz
Q1 Explain CAD.
Q2 What is the name of the card system whereby payments can be made by a bank-issued cash
card and the money is immediately paid?
A1 CAD is one of the systems that constitute an FA system; it is a system that uses computers,
displays, automatic drawing machines, and other devices to carry out design and drawing tasks
in an interactive format automatically.
58
(Hints & Tips) PC communication and the Internet are similar in that both provide services through communications lines.
However, in PC communication, a company providing the PC communication has a host computer installed, and services are
provided through this host computer. On the other hand, the Internet does not have a designated host computer.
59
(Note) One of the means of data transfer between companies is EDI (Electronic Data Interchange). EDI is a standardized
data format for electronic commerce and its procedures.
Q1. Which of the following is an appropriate description concerning the development of an overall
information system plan?
a) CIO collects all systemization requests from each user department and proceeds in
sequence, starting with those that can be launched immediately.
b) CIO makes adjustments with business plans, studies technology trends, etc. and establishes
an overall plan as a mid/long-term plan. Next, CIO obtains approval and support for the
plan from top management.
c) The leaders of the individual user departments work as key persons and consolidate
individual plans to form an overall plan.
d) Specialists in telecommunications in the information systems department develop an
overall plan, taking into consideration leading-edge technologies.
Answer 1
Correct Answer: b
The overall plan of an information system requires setting strategies and targets of the information
system and writing down the entire subject area to be included in the information system in an
outline form. Policies such as the organization of the system building, applicable tasks, and
information technology are to be clarified, the entire schedule is to be established, and approximate
investment effects are to be estimated.
CIO (Chief Information Officer) is the highest-ranking officer in charge of information systems.
From the top management viewpoint, CIO sets a mid- to long-term plan. CIO also directs the
implementation of the plan with approval and instructions from the top management (Chief
Executive Officer). Typically, the officer in charge of the information systems department becomes
CIO. CIO is not only required to have knowledge of information systems but is also held responsible
for establishing computerization strategies; therefore, he or she must have a wide range of
knowledge encompassing the industry in general, the business of the company, and general
administrative functions.
a) CIO may gather systems requirements to establish computerization strategies, but this is not the
main task of CIO.
c) The overall plan is not formed by summarizing various individual plans that come from the
bottom up. Rather, the plan is established top-down and is implemented.
d) The overall plan is established by CIO. The experts on information technologies in each
department implement the plan under the CIO's direction.
Question 2
Difficulty: ** Frequency: **
Q2. What is the sales cost in thousands of US$ for the current term if product inventory sales at the
beginning of the term were $20,000, product purchasing costs for the term were $100,000, and
product inventory sales at the end of the term were $30,000?
a) 50 b) 70 c) 90 d) 110
Answer 2
Correct Answer: c
The sales cost is the expense incurred for the sales of the product. In this case, it is the total of the
product purchasing costs for the products sold.
The product inventory sales at the beginning of the term are the assessed value of the products in the
inventory at the beginning of the term. The product purchasing costs for the term are the purchasing
cost of the products purchased during this term. The product inventory sales at the end of the term are
the assessed value of the products in the inventory at the end of the term.
Hence, the product costs for the products sold during this term are the product inventory sales at the
beginning of the term plus the product purchasing costs for the term, minus the product inventory sales
at the end of the term. This then is the sales cost.
Sales cost = product inventory sales at the beginning of the term + product purchasing
costs for the term – product inventory sales at the end of the term
= $20K + $100K – $30K
= 120 – 30
= 90 (thousand US dollars).
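The same arithmetic can be checked with a short Python sketch. The function name `cost_of_sales` and its parameter names are illustrative assumptions, not terms from the exam.

```python
def cost_of_sales(opening_inventory, purchases, closing_inventory):
    """Sales cost = opening inventory + purchases - closing inventory."""
    return opening_inventory + purchases - closing_inventory


# Values from Q2, in US dollars
result = cost_of_sales(20_000, 100_000, 30_000)
print(result // 1000)  # 90 (thousand US dollars)
```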
a) If fixed costs do not change, the break-even point rises when variable cost ratio declines.
b) If fixed costs do not change, the break-even point falls by half when variable cost ratio falls
to half their original level.
c) Sales at the break-even point are equal to the sum of fixed and variable costs.
d) If variable cost ratio does not change, the break-even point rises when fixed costs decline.
Answer 3
Correct Answer: c
A break-even point is a point where the sales and expenses are equal to each other, indicating that the
profit is 0. If the sales during a certain period are less than the sales at the break-even point, a loss will
result; if they exceed the sales at the break-even point, a profit will result.
In break-even point analysis, the focus is placed on the relationship between fixed costs and variable
costs. Fixed costs are expenses that are incurred with a certain fixed amount regardless of any changes
in sales. These include land, lease, depreciation, insurance fees, real-estate taxes, and others. Variable
costs are, on the other hand, expenses that change in correlation with the sales; they include materials
costs, for example.
Let S be sales, F be fixed costs, V be variable costs, and P be the profit (target profit). The following
relationship holds:
Profit = sales – (variable costs + fixed costs) → Sales = fixed costs + variable costs + profit
S=F+V+P ……(1)
Variable costs (V) are expenses directly proportional to sales, so if the constant of proportion is v, then
the following equation holds:
S = F + vS + P ……(3)
In either (1) or (3), we can solve for sales S that makes profit P equal 0, and that will be the break-even
point.
Break-even point sales = fixed costs / (1 – variable costs / sales) = F / (1 – V/S)
= fixed costs / (1 – variable cost ratio) = F / (1 – v) ……(4)
A graph showing the break-even point is called a break-even point chart. In the chart, the sales at the
point where the total cost line (fixed costs plus variable costs) and the sales line intersect is the break-even point.
[Figure: break-even point chart; vertical axis: cost and sales; labeled line: sales line]
As shown in the chart above, if the sales are less than the break-even point sales, there is a loss;
inversely, if the sales exceed the break-even point sales, there is a profit.
The break-even point sales are the sales that make the profit 0, so the profit in Equation (1) is 0. In
other words, sales are equal to the sum of fixed costs and variable costs.
a) In the formula for finding the break-even point sales, if fixed costs do not change and variable
cost ratio decreases, the denominator increases (1 – variable cost ratio), so the break-even point
sales decreases.
b) Substitute 0.5 for v in Equation (4) for break-even point sales, and then compare that result to
the result of plugging in 0.25 (which is half of 0.5, the first variable cost ratio). Note that the
result is not halved.
Break-even point sales (v = 0.5): S = F ÷ (1 – 0.5) = 2F
Break-even point sales (v = 0.25): S = F ÷ (1 – 0.25) ≈ 1.333F
d) In the equation to find the break-even point sales, if the fixed costs decrease while variable cost
ratio remains the same, the numerator (fixed costs) decreases, reducing the break-even point
sales.
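As a rough check of options a), b), and d), Equation (4) can be evaluated directly. The sketch below is illustrative only: the function name `break_even_sales` and the fixed-cost figure of 100 are assumptions chosen for the example.

```python
def break_even_sales(fixed_costs, variable_cost_ratio):
    """Equation (4): S = F / (1 - v), valid for 0 <= v < 1."""
    if not 0 <= variable_cost_ratio < 1:
        raise ValueError("variable cost ratio must be in [0, 1)")
    return fixed_costs / (1 - variable_cost_ratio)


F = 100.0  # hypothetical fixed costs

base = break_even_sales(F, 0.5)              # 2F
half_ratio = break_even_sales(F, 0.25)       # about 1.333F, so not halved
lower_fixed = break_even_sales(F / 2, 0.5)   # lower fixed costs, lower break-even point

print(base, round(half_ratio, 3), lower_fixed)
```

At the break-even point computed for v = 0.5, sales of 2F equal fixed costs F plus variable costs 0.5 × 2F, which is the relationship stated in option c).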
Answer 4
Correct Answer: d
ABC analysis is a method where an inventory is grouped according to product items and then each
group is sorted in descending order by the inventory price (inventory configuration ratio) or the sales
revenue (sales revenue configuration ratio); the cumulative sum is then calculated so that the inventory
can be managed for each product item. The result of ABC analysis is expressed with a Pareto diagram.
In ABC analysis, the inventory is categorized into 3 groups: Group A is carefully managed while
Groups B and C are managed with relatively lower priority. This is based on the Pareto Principle,
which states that, for many events, only a few factors have significant impact while most other factors
have very little impact.
[Figure: Pareto diagram of inventory price and cumulative %, divided into Groups A, B, and C at the 70% and 90% levels]
As shown in the graph, item groups are listed in descending order by price (configuration ratio). A
curve is then drawn by connecting the cumulative sums. The items are grouped such that Group A
makes up about the first 70% of the cumulative configuration ratio, Group B the range from 70% to 90%, and Group C the
remaining items. Different management methods are applied for each of these groups.
In general, Group A receives close management attention, and the periodic ordering system is applied.
For Group B, the fixed order quantity system using the EOQ formula is applied. For Group C, the
fixed order quantity system where an order is placed when the inventory reaches a certain level, or the
2-bin system, is used.
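The grouping step described above can be sketched in Python. This is a minimal illustration, assuming the 70% and 90% cut-off levels from the text; the function name `abc_classify` and the item values are hypothetical.

```python
def abc_classify(item_values, a_limit=0.70, b_limit=0.90):
    """Assign each item to Group A, B, or C by cumulative share of total value.

    Items are sorted in descending order by value. An item is placed in
    Group A while the cumulative share is within a_limit, in Group B while
    it is within b_limit, and in Group C otherwise.
    """
    total = sum(item_values.values())
    groups = {}
    cumulative = 0
    ranked = sorted(item_values.items(), key=lambda kv: kv[1], reverse=True)
    for item, value in ranked:
        cumulative += value
        share = cumulative / total
        if share <= a_limit:
            groups[item] = "A"
        elif share <= b_limit:
            groups[item] = "B"
        else:
            groups[item] = "C"
    return groups


# Hypothetical inventory values per item
inventory = {"P1": 450, "P2": 200, "P3": 150, "P4": 80, "P5": 60, "P6": 40, "P7": 20}
groups = abc_classify(inventory)
print(groups)
```

With these sample values, two items account for 65% of the total and fall into Group A, which reflects the Pareto Principle that a few items carry most of the value.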
a) This is an explanation of the basket analysis (simultaneous purchase analysis). Basket analysis
identifies the cross-selling opportunities by analyzing “which product and which product tend
to be purchased together (i.e., there is a correlation).” For example, there is a well-known
correlation: “in supermarkets, disposable diapers and beer are often sold simultaneously.” It
was then discovered that men sent to the store to buy diapers often end up buying beer as well.
Consequently, when a store placed diapers and beer close to each other, the sales grew.
By finding these correlations, stock of an item seemingly unrelated to some crucial product
could be expanded and the sales could grow. The name comes from the idea of looking into
customers' shopping baskets to find correlations.
Basket analysis is used in a variety of fields such as purchase data analysis in the retail industry
and relational analysis on option requests at telephone service companies.
c) This is an explanation of the Delphi method as repeated surveys are mentioned. The Delphi
method is a logical projection technique used in long-term future projection and technology
projection; it is classified under intuitive methods. Intuitive methods are methods of projection
or prediction based on human experience and knowledge.
The Delphi method takes advantage of the feedback characteristic. In this method, opinions of
a large sample of people are collected and analyzed through questionnaires, and the results of
the surveys are summarized, shown to the respondents, and then the survey process is repeated.
This method has many advantages. First, it is effective when projecting unpredictable and
discontinuous technology changes as it employs an intuitive method. It can also help avoid
being influenced by the group dynamics that tend to come from regular face-to-face meetings,
etc. In addition, when a comment collected from the survey is different from the majority's
opinion, invaluable new ideas can be obtained from the reasons added by the respondent.
Hence, the formulation and selection of survey questions are vital to the success of this method.
Q5. The table below indicates weather changes at a particular location. For example, on the day
following a clear day, there is a 40% chance that the weather will be clear, a 40% chance that it
will be cloudy, and a 20% chance that it will be rainy. If the change in weather is a simple
Markov process, what is the probability that the weather is clear two days after it rains?
Unit: %
           Clear next day   Cloudy next day   Rainy next day
Clear            40               40                20
Cloudy           30               40                30
Rainy            30               50                20
a) 15 b) 27 c) 30 d) 33
Answer 5
Correct Answer: d
A Markov process means that the probability that an event occurs at a particular time depends on
events that happened prior to that time. In a Markov process, to find the probability of an event in the
future based on probabilities of past events, we need to go back a finite number of steps. In a simple
Markov process, we go back only one step.
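The probability that the weather is clear two days following the given rainy day is found by summing
over the three possible kinds of weather on the intermediate day:
(rainy → clear)(clear → clear) + (rainy → cloudy)(cloudy → clear) + (rainy → rainy)(rainy → clear)
= 0.3 × 0.4 + 0.5 × 0.3 + 0.2 × 0.3 = 0.12 + 0.15 + 0.06 = 0.33, that is, 33%.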
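The two-step probability can also be checked numerically. The following Python sketch uses the transition table from Q5; the names `P`, `STATES`, and `two_step_probability` are illustrative assumptions.

```python
STATES = ("clear", "cloudy", "rainy")

# Transition probabilities from the table in Q5 (outer key: today, inner key: next day)
P = {
    "clear":  {"clear": 0.4, "cloudy": 0.4, "rainy": 0.2},
    "cloudy": {"clear": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"clear": 0.3, "cloudy": 0.5, "rainy": 0.2},
}


def two_step_probability(start, end):
    """Sum over every possible weather on the intermediate day."""
    return sum(P[start][mid] * P[mid][end] for mid in STATES)


prob = two_step_probability("rainy", "clear")
print(round(prob * 100))  # 33
```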
Q6. In the arrow diagram shown below, after each activity was reviewed, it was identified that only
activity “D” can be reduced by three days. How many days can be reduced to complete all
the activities (“A” through “H”)? Here, a dotted-line arrow indicates a dummy activity.
[Figure: arrow diagram with nodes 1 through 7 and activities A (5 days), B (3 days), C (5 days), D (10 days), E (3 days), F (12 days), G (3 days), and H (6 days)]
a) 0 b) 1 c) 2 d) 3
Answer 6
Correct Answer: b
We calculate the earliest node time at each node before and after the shortening. We assume that
the dummy activity takes 0 days.
Each of the numbers shown below indicates the earliest node time at the respective node.
[Figure: the arrow diagram annotated with earliest node times: node 1 = 0, node 2 = 5, node 3 = 10, node 4 = 20, node 5 = 23, node 6 = 23, node 7 = 29]
By reducing the number of days activity D takes by 3 days (from 10 days to 7 days), the earliest
node times at the shaded nodes change.
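[Figure: the arrow diagram after shortening activity D, annotated with earliest node times: node 1 = 0, node 2 = 5, node 3 = 10, node 4 = 17, node 5 = 20, node 6 = 22]
The earliest node time at node 7 therefore changes from 29 days to 28 days, so the overall schedule
is shortened by only 1 day.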
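The earliest node times can be recomputed with a short Python sketch. The arrow connections used here (A: 1→2, B: 2→4, C: 2→3, D: 3→4, E: 4→5, F: 3→6, dummy: 5→6, G: 5→7, H: 6→7) are an assumption inferred from the earliest node times given in this answer, since the diagram itself is not reproduced in the text; the function names are illustrative.

```python
def earliest_node_times(activities, start=1):
    """Compute the earliest node time of each node.

    activities: list of (from_node, to_node, days). Nodes are assumed to be
    numbered so that every arrow goes from a lower to a higher number, which
    lets us process the arrows in sorted order.
    """
    times = {start: 0}
    for src, dst, days in sorted(activities):
        times[dst] = max(times.get(dst, 0), times[src] + days)
    return times


def network(d_days):
    """Assumed arrow diagram for Q6, with the duration of activity D as a parameter."""
    return [
        (1, 2, 5),        # A
        (2, 4, 3),        # B
        (2, 3, 5),        # C
        (3, 4, d_days),   # D
        (4, 5, 3),        # E
        (3, 6, 12),       # F
        (5, 6, 0),        # dummy activity
        (5, 7, 3),        # G
        (6, 7, 6),        # H
    ]


before = earliest_node_times(network(10))
after = earliest_node_times(network(7))
print(before[7], after[7], before[7] - after[7])  # 29 28 1
```

Once D is shortened, the path through F and H (1→2→3→6→7) becomes the longest one, which is why the full three days are not gained.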
Q7. Which of the following provides comprehensive support to a series of production-related tasks
with the use of a computer?
Answer 7
Correct Answer: a
b) EOS (Electronic Ordering System) is a system to efficiently help stock items at the store
and to reduce residual inventory items. POS system analyzes the sales tendency for each
item, and this sales information is used to help stock the goods at the store.
c) OA (Office Automation) is the idea of bringing in office machines and equipment such as
workstations and word processors to enhance the efficiency of information processing in
the office.
d) POS (Point Of Sales) is a system that collects sales information in real-time at the cash
register and analyzes the information. Barcodes attached to or printed on the products are
read by a barcode reader, and the information is automatically collected.
Q8. Which of the following systems exchanges data between enterprises and is used in EC
(Electronic Commerce)?
Answer 8
Correct Answer: b
EDI (Electronic Data Interchange) defines the data format for electronic data exchange on a
network and its procedures so that electronic commerce can take place between different
companies.
a) CA (Certificate Authority) is an agency that certifies that a public key is valid when, for
electronic commerce, etc., digital signatures based on a public-key cryptography are used.
c) SET (Secure Electronic Transactions) is the specifications for secure processing of credit
card payments on the Internet. It was developed jointly by Visa International and
MasterCard International of the United States.
d) SSL (Secure Sockets Layer) is a security protocol between a WWW server and a WWW
browser. It enables authentication and encryption by combining public-key and
private-key cryptography.
FE(Morning) Trial
Part 2
This part contains a full FE exam (morning exam and afternoon exam)
consisting of questions used in past exams. Answers and comments
are provided for each question.
Fundamental IT Engineer
Examination (Morning)
Trial
Q1. Which of the following is a decimal number that can be expressed in binary
floating-point without any possible rounding (round-off) error?
Q2. There is a non-zero integer whose number of digits is D in decimal and B in binary.
Which of the following expressions correctly describes the relationship between D and
B?
a) D ≈ 2 log10 B b) D ≈ 10 log2 B
c) D ≈ B log2 10 d) D ≈ B log10 2
Q3. Which of the following numeric values or expressions represents an n-digit binary
number consisting entirely of ones, “1111…111”? Here, a negative number is
expressed in two’s complement.
a) –(2^(n–1) – 1) b) –1 c) 0 d) 2^n – 1
a) A, D, F, G b) B, C, E, H c) B, E d) C, E, H
Q5. The calculation time for solving a system of linear equations on a computer is
proportional to the cube (third power) of the number of unknowns in the equations. If it
takes 2 seconds on a computer to solve a system of linear equations involving 100
unknowns, how many seconds does it take on a computer with four times the processing
speed to solve a system of linear equations involving 1,000 unknowns?
a) 5 b) 50 c) 500 d) 5,000
Q6. There is an 8-bit code whose most significant bit is a parity bit. Which of the following
bitwise operations can be used to obtain the lower 7 bits other than the parity bit?
Q7. As a result of inspecting 100 parts, 11 were found with defect A, 7 with defect B, and 4
with defect C. Moreover, 3 were detected with both A and B, 2 with both A and C, and
none were found with both B and C. How many parts were free of defects?
a) 78 b) 83 c) 85 d) 88
Q8. Which of the following truth tables represents the logical formula
Z = X • Y + X • Y ? Here, “ • ” is used for the logical product, “+” for the logical
sum, and “Ā” (A with an overbar) for the logical negation of “A.”
a) b)
X Y Z X Y Z
0 0 0 0 0 0
0 1 0 0 1 1
1 0 0 1 0 1
1 1 1 1 1 0
c) d)
X Y Z X Y Z
0 0 0 0 0 1
0 1 1 0 1 0
1 0 1 1 0 0
1 1 1 1 1 1
Q9. How many bits are at least required to uniquely represent the English capital letters (A
through Z) and the numeric characters (0 through 9) with the same number of bits?
a) 5 b) 6 c) 7 d) 8
Q10. Which of the following character strings is accepted by the finite automaton in the
diagram shown below? The symbol indicates the initial state and the symbol
indicates the accept state.
[Figure: state transition diagram of the finite automaton, with transitions labeled 0 and 1]
Q11. A particular syntax is described by using the syntax diagram shown below. The
numeric representations such as –100, 5.3, and +13.07 conform to this syntax.
[Figure: syntax diagram composed of a sign (+ or −), Numeral, a decimal point, and Numeral]
Based on this notation, which of the following numeric representations conforms to the
syntax specified in the figure shown below?
[Figure: syntax diagram composed of a sign (+ or −), Numeral, a decimal point, Numeral, the letter E, a sign (+ or −), and Numeral]
Q12. In binary tree traversal, there are three different methods depending on the traversal
sequence.
(1) Pre-order: Scans in the order of node, left subtree, and right subtree
(2) In-order: Scans in the order of left subtree, node, and right subtree
(3) Post-order: Scans in the order of left subtree, right subtree, and node
When the tree illustrated below is traversed in pre-order, which of the following
indicates a sequence of the output node values?
[Figure: binary tree; root a; second level b, e; third level c, d, f, g; fourth level h, i, j, k]
a) abchidefjgk b) abechidfjgk
c) hcibdajfegk d) hicdbjfkgea
a) b) c) d)
1 3 3 6
7 4 7 4
3 6 1 3
Q14. Which of the following appropriately describes a characteristic of the hash method used
in table searches?
Q15. Which of the following sort algorithms is illustrated in the flowchart below?
a) It does not require periodic refresh to retain stored information, and all or part of
information can be erased and rewritten electrically.
b) All stored information can be erased using ultraviolet light and rewritten.
c) Since it can read data at high speed, it is often used as cache memory.
d) It requires periodic refresh and is widely used as main memory.
Q17. Which of the following logical circuits with two inputs and one output can generate 0
for X only when inputs A and B are both 1s?
[Figure: logic circuit with inputs A and B and output X]
Q18. In a certain computer, one instruction is executed in the order of steps 1 through 6 in the
table shown below. How many nanoseconds are required to execute 6 instructions
using pipeline processing in the figure shown below? Here, it takes 10 nanoseconds to
execute each step, and there is no instruction, such as branch and jump, that stalls the
pipeline processing.
5 Fetch data
6 Execute operation
Fig. Pipeline Processing to Execute Instructions
a) 50 b) 60 c) 110 d) 300
Q19. Which of the following technologies is suited for a multimedia system that allows one
instruction to execute the same operation on two or more data concurrently?
Q20. A certain computer’s average instruction execution time is 0.2 µsec. What is this
computer’s performance in terms of MIPS?
Q22. There are two CPUs X and Y that are configured in the figure shown below. Both have
exactly the same conditions except the access time of the cache memory and main
memory in the table shown below. When a certain program runs on both CPUs,
processing time is the same for both. In this case, what is the hit ratio of the cache
memory? Here, no factors other than CPU processing have an impact on the hit ratio.
a) Data is written only into the cache memory when CPU performs the write operation.
b) Data is written simultaneously into both the cache and main memory.
c) Changes to data in the main memory take place when the data is pushed out of the
cache memory.
d) Because of a relatively low frequency of memory access, the bus occupancy ratio is
also low.
Q24. On a hard disk, a sequential organization file consists of unblocked fixed length
records. A program reads and processes all data in this file sequentially. When the
file organization or reading method is changed, which of the following is an
appropriate solution to achieve the shortest time in which the program needs to read
the data? Here, multiprocessing is not considered.
a) By dividing and storing the data into separate files and accessing these files
sequentially.
b) By creating an indexed organization file and reading the data by using a key of each
record.
c) By creating a direct organization file and reading only the necessary data.
d) By blocking records and increasing the number of records acquired in a single
physical read operation.
Q26. Which of the following appropriately describes the features of USB 1.1?
a) USB 1.1 adopts a high-speed transfer method that is suitable for data to be delivered
in real time, such as audio and video. USB 1.1 allows devices to be connected in a
daisy-chain or tree topology, and permits connection even in the absence of a PC
acting as host.
b) Peripheral devices are connected through a PC acting as host. USB 1.1 supports
multiple modes of data transfer; generally, a printer or a scanner uses full speed
mode, and a keyboard or a mouse uses low speed mode.
c) USB 1.1 is a serial interface that is originally designed for connecting modems, but is
also used for connecting peripheral devices to a PC.
d) USB 1.1 is a parallel interface that connects hard disks, laser printers and other
peripheral devices to small computers, including PCs.
Q28. About how many megabytes (or Mbytes) in memory are required to display a screen of
1,024 horizontal pixels and 768 vertical pixels when the video memory stores 24 bits of
color information per pixel? Here, 1 Mbyte is 10^6 bytes.
Q29. In basic computer architecture, which of the following is the method that both programs
and data are read into the storage device of a computer prior to execution?
Q30. When three tasks run standalone, their priority levels are shown in the table below, and
each operation sequence and processing time of the CPU and I/O devices are also
described in the table. How many milliseconds of the CPU idle time are there from
the instant all three tasks become executable at a time until the execution of all tasks is
completed? Here, no conflict occurs in I/O operations, and the overhead of the OS
itself can be ignored.
a) 2 b) 3 c) 4 d) 5
Q31. Which of the following appropriately describes an expression that represents the
relationship between turnaround time, CPU time, I/O time, and process waiting time?
Here, all other types of overhead time can be ignored.
Q34. Which of the following appropriately describes the API (Application Program
Interface) in an OS?
Q35. In the hierarchical file system shown below, when the current directory is B1, which of
the following is the relative path name of the file C2? Here, a symbol “..” in the path
name indicates a parent directory. A backslash “\” appearing at the head of a path name
means the root directory, whereas a backslash “\” in the middle indicates a delimiter of
the directories or file names. The boxes in the figure represent directories.
A1 A2
C1 C2
a) ..\A1\B2\C2 b) ..\B2\C2
c) A1\B2\C2 d) B1\..B2\C2
Q36. In the function layer of a 3-layer client/server system, which of the following
combinations of two functions is processed?
a) The input of search conditions and the assembling of the data processing conditions
b) The input of search conditions and data access
c) The assembling of the data processing conditions and data manipulation
d) Data access and data manipulation
a) The multiple processors share a hard disk, and each processor is controlled by its
own OS. Processing power is enhanced by distributing workload on a per-job
basis.
b) The multiple processors share the main memory and are controlled by a single OS.
In principle, a task in the system can be executed by any of the processors, so
processing power is enhanced by distributing workload in small pieces.
c) Normally, one of the processors is in the standby state. When a failure occurs in the
active system, processing is continued by switching over to the standby processor.
d) Two parallel connected processors concurrently perform the same processing and
compare their results with each other. If one of the processors fails, it is removed
and processing is continued.
a) In OLTP (Online Transaction Processing), MIPS values are used to evaluate system
performance.
b) Response time and turnaround time are performance evaluation indexes from the
viewpoint of a system operation administrator.
c) Generally, response time is improved as the utilization ratio of system resources
becomes higher.
d) The number of transactions or jobs that can be processed in a unit of time is important
for evaluating system performance.
Q39. In a parallel system shown below, at least how many subsystems are required to
increase the availability of the entire system to 99% or more? Here, the availability of
each subsystem is 70%. The entire system is running as long as one subsystem is
running.
Subsystem
Subsystem
…
Subsystem
a) 3 b) 4 c) 5 d) 6
Q40. In a hierarchical DFD, a part of DFD at a certain level is shown below. Which of the
following illustrates the appropriate method for describing the immediately lower level
of DFD? Here, the processes in the immediately lower level of Process n are
numbered as follows: Process n-1, Process n-2, etc.
a) b)
1–2
1–1
1–1
1–3
1–3 1–2
c) d)
1–2 1–3
1–1
1–1 2–1
2–2 1–2
a) A recursive program can call itself either directly or indirectly through another
program it has called.
b) A recursive program can be located and executed at any address in the main memory.
c) A recursive program can produce correct results even if it is simultaneously called by
multiple tasks.
d) A recursive program can be repeatedly executed without reloading.
a) The sequence of computation is specified by data flow, not by control flow. The
data used by an instruction is not used by this or the other instructions after that.
b) The control of computation is sequentially passed from instruction to instruction.
The transfer of data between instructions is done indirectly by referencing the
memory through “variables.” The instructions and the definition of data are
separated.
c) Data is hidden from the outside world and operated indirectly by a procedure called a
“method.” A program is a collection of encapsulations of data and methods.
d) A program is composed of data and instructions (symbols of operation) that represent
nest-structured operational expressions and functions. “Instruction execution”
means “calculation (evaluation) of that particular expression or function.”
a) COBOL is suited for business data processing and executed by using an interpreter.
b) C is a system description language, and a program written in C needs compiling
prior to its execution.
c) The language specifications of Java depend on the platform, and Java is executed by
using an interpreter.
d) Perl is suitable for writing programs that run on a client, and a program written in
Perl needs compiling prior to its execution.
Q45. Which of the following should be approved after completion of the external design of
the system?
Q48. Which of the following appropriately explains white box testing that is one of the
program testing methods?
a) The lowest-level modules in the program structure are first tested. Next, they are
integrated into the higher-level modules, and the integrated modules are tested. In
this way, this method repeats integration and testing at higher level in sequence.
b) The highest-level module in the program structure is first tested. Next, the
lower-level modules called by the highest module are integrated and tested. In this
way, this method repeats integration and testing at lower level in sequence.
c) This testing method focuses attention on the external specifications of a program,
and all possible combinations of the input values are tested. This method includes
techniques such as equivalence partitioning, boundary value analysis, and
cause-effect graph.
d) This testing method focuses attention on the internal structure of a program. The
program logic is examined and tested so that all paths can be executed. This
method includes techniques such as instruction coverage and condition coverage.
Q49. When a calculation method of check digit shown below is used for appending a check
digit to a given data value, which of the following is the correct result? Here, the data
value is 7394, the weight factor assigned to each position is “1, 2, 3, 4,” and the base 11
(modulus 11) is assumed.
[Method]
1) Multiply each digit of the data by the corresponding digit of the weight factor, and
then sum up the results.
2) Divide the sum of step 1 by the base to obtain the remainder.
3) Subtract the remainder of step 2 from the base, and append the resulting one’s place
value to the end of the data value as a check digit.
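The three steps of the method can be sketched in Python as follows. This is only an illustration: the function name `append_check_digit` and the sample data value "1234" are assumptions, chosen so that the sketch does not use the data value given in the question.

```python
def append_check_digit(data, weights, base=11):
    """Append a check digit to a string of decimal digits.

    1) Multiply each digit by its weight and sum the results.
    2) Take the remainder of the sum divided by the base.
    3) Subtract the remainder from the base and keep the one's place value.
    """
    total = sum(int(digit) * weight for digit, weight in zip(data, weights))
    remainder = total % base
    check_digit = (base - remainder) % 10  # one's place of (base - remainder)
    return data + str(check_digit)


# Hypothetical data value: 1*1 + 2*2 + 3*3 + 4*4 = 30, 30 mod 11 = 8, 11 - 8 = 3
print(append_check_digit("1234", (1, 2, 3, 4)))  # 12343
```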
Q50. In a GUI screen, which of the following is the most significant point to remember, in
order to provide an efficient user interface both for users who are accustomed to
keyboard operations and for those who are not?
a) Minimizing direct input from the keyboard and enabling selection from lists using
the mouse
b) Placing important items, such as mandatory fields, at the top of the screen, regardless
of the format of the input form
c) Making both mouse and keyboard interfaces available for frequently performed
operations
d) Making it possible to execute frequently used functions by double-clicking the
mouse
Q52. Which of the following appropriately describes how to use stubs in the testing phase?
Q53. A sales company is developing an application that can provide data to sales personnel
at branch offices across the country from a server located in the head office, using the
company’s own intranet. When a system test is done in the LAN environment of the
head office, which of the following is a difficult item to verify? Here, the company’s
internal network consists of LAN in the head office, LANs in the branch offices, and
communication lines connecting these offices.
Q54. In the logic test shown below, which of the following test cases is needed to achieve
decision condition coverage (branch coverage)?
True
A OR B
False
Instruction
a) b) c) d)
A B A B A B A B
False True False True False False False True
True False True True True False
True True
Q55. In a system development project, PERT is used to create an implementation plan and
find a critical path. Which of the following can be figured out of the critical path?
a) The activities that require the most attention in terms of system quality
b) The activities whose implementation sequence can be changed
c) The activities that are directly connected to delay of the whole project
d) The most costly activities
Q56. Which of the following appropriately describes the function point method?
Q57. Which of the following appropriately describes the purpose of appending a check digit
to a customer code?
Q58. The charge for using a computer system is determined in consideration of various
criteria such as the usage of resources and the number of users. Which of the following
graphs shows a declining metered rate system (or a diminishing charge system)?
Here, the horizontal axis indicates the usage amount, and the vertical axis indicates the usage
charge.
[Figure: four graphs, a) through d), each plotting the usage charge against the usage amount from
the origin 0]
Q59. Which of the following devices is installed so that a computer system will not shut down
due to a sudden failure of the external power supply?
a) CVCF b) UPS
c) Private electric power generator d) Backup power-receiving equipment
Q60. In software maintenance, which of the following tests is used to make sure that side
effects of corrections or changes are not happening?
Q61. Which of the following appropriately describes protocols that are used in the session
layer of the OSI basic reference model?
a) There are protocols for error detection and recovery process for the sequence and
loss of transmitted data, multiplexing of data, etc.
b) There are protocols for remote data access, file transfer, etc.
c) In order to do transparent, error-free data transfer between adjacent systems, there
are protocols for error control, recovery control procedures, send/receive timing, etc.
d) In order to establish a logical communication path and support orderly data
exchange, there are protocols for interoperation control, exception reporting, etc.
Q62. As shown in the figure below, the 16-bit data is arranged in a square of 4 by 4 blocks,
and parities are appended to rows and columns. Up to how many bits of error can be
corrected by means of this method? Here, the shaded areas in the figure indicate
parities.
1 0 0 0 1
0 1 1 0 0
0 0 1 0 1
1 1 0 1 1
0 0 0 1
a) 0 (uncorrectable) b) 1
c) 2 d) 3
Q63. When messages consisting of 90 characters each are transferred at a speed of 14,400
bps on the start/stop line, how many messages can be sent in 1 minute? Here, each
character consists of 8 bits without parity, and the 1-bit start and 1-bit stop signals are
used. The actual usage ratio of the line is 80%.
a) 12 b) 16 c) 768 d) 960
Q64. When a file with an average size of 1,000 bytes is transferred every two seconds
between terminals connected through a leased line with a communication speed of
64,000 bps, which of the following is closest to the line usage ratio (%)? Here, during
file transfer, control information equivalent to 20% of the transfer amount is appended.
Q65. When the collision lamp of the 10Base-T hub remains solidly lit, which of the following
appropriately describes the status of LAN?
a) The gateway is used for protocol conversion only at lower layers from the first layer
through the third layer in the OSI basic reference model.
b) The bridge relays frames based on the IP address.
c) The repeater amplifies signals between segments to extend the transmission distance.
d) The router relays frames based on the address at the MAC layer.
Q67. A certain record consists of items A through F. The combination of items A and B is
the primary key for this record. Moreover, item F can be identified by item B.
Which of the following is in the 3rd normal form of this record?
A B C D E F
a) A B C D E B F
b) A B C D E B F
c) A B F C D E B F
d) A C D E B C D E B F
a) Projection combines the query results of one table with those of another table to form
a single table.
b) Projection extracts from a table the rows that match a specific condition.
c) Projection extracts only the specific columns from a table.
d) Projection forms a new table by combining groups that match particular conditions
in two or more tables.
Q69. Based on the “Product” table shown below, a “Profitable Product” table is created by
using [View definition]. Which of the following update processes decreases the
number of rows appearing in the “Profitable Product” table?
Product
Product code Product name Model Sales price Purchase price
S001 PC T T2003 1,500 1,000
S003 PC S S2003 2,000 1,700
S005 PC R R2003 1,400 800
[View definition]
CREATE VIEW Profitable_Product
AS SELECT * FROM Product
WHERE Sales_price - Purchase_price >= 400
Q70. Which of the following appropriately describes the log file in DBMS?
a) A log file is created by periodically writing updated data in the main memory onto a
disk, so as to shorten database recovery processing time in the event the system goes
down.
b) A log file is created by constantly writing a copy of the same data into a database on
a separate disk or into a database at another site, so that the system can be
immediately restored in the event of a disk failure.
c) A log file is created by duplicating the content of the database on a per-disk basis, so
as to restore the database from disk failure.
d) A log file is obtained by writing the data values preceding and following data updates
in order to keep records of the database updates, for use in database recovery.
Q71. The figure shown below is a conceptual diagram of public key cryptography. Which
of the following is the appropriate combination to be inserted in A and B?
[Figure: Sender: Plain Text → Encryption (key A) → Encrypted Text;
Receiver: Encrypted Text → Decryption (key B) → Plain Text]
A B
a) Receiver’s public key Receiver’s private key
b) Receiver’s private key Receiver’s public key
c) Sender’s public key Receiver’s private key
d) Sender’s private key Receiver’s public key
Q72. Which of the following appropriately describes security in use of the Internet?
Q73. Which of the following standards has the objective of achieving customer satisfaction
through effective use of a quality management system that includes preventive
processes for nonconforming products?
Q74. When the relationship between the preset price and expected demand of a given product
is approximated by a linear expression, which of the following is the appropriate value
that should be put in box A below?
Q76. Some items calculated from a profit and loss statement are included in the table shown
below. In this case, how much is the break-even point in dollars?
Unit: $
Item Amount
Total sales 10,000
Variable cost 8,000
Fixed cost 1,000
Profit 1,000
Q77. Which of the following is appropriate as a case to which the Delphi method is
applicable?
a) A network diagram is created with arrows connecting the individual activities and
indicating their order relationships. This is useful for identifying process
bottlenecks and preparing schedules.
b) A center line and a pair of upper and lower limit lines are drawn, and the
characteristic values of products are plotted. This is useful for detecting quality
problems and abnormal situations in process, eliminating the causes of problems,
and preventing problem recurrences.
c) The number of product defects and the amount of loss are categorized on a
cause-by-cause basis, accumulated and sorted in descending order. This makes it
possible to identify items whose improvement is highly effective.
d) Factors considered possible causes of a problem are arranged in a shape such as a fish
skeleton. This makes it possible to identify the root causes of the problem, and is
useful in solving it.
Q79. In order to compare last year's hiring examination with this year's, the company had a
large number of employees take both examinations. Then, the correlation coefficient
and regression line were obtained by plotting their scores from last year's examination
on the x-axis and their scores from this year's examination on the y-axis. Which of the
following is the appropriate statement that can be concluded from the results below?
[Results]
The correlation coefficient was 0.8.
The slope of the regression line was 1.1.
The value of the y-intercept of the regression line was 10.
a) Based on the value of the y-intercept of the regression line, it is understood that a
person whose score on this year’s exam was 0 could score about 10 on last year’s
exam.
b) Based on the slope of the regression line, it is understood that the average score on
this year's examination is about 1.1 times that of last year's examination.
c) Based on the slope and the y-intercept of the regression line, it is understood that
scores on this year's examination tend to be higher than those on last year's
examination.
d) Based on the slope of the regression line and the value of the correlation coefficient,
it is understood that this year's examination is of high quality.
Q80. In a certain factory, three products A, B, and C are manufactured using the same
material M. Table 1 shows the time required to manufacture 1 kg of each of products
A, B, and C, the necessary amount of material M, and the profit. Table 2 shows the
amounts of resources that can be allocated every month. At this factory, they want to
know the quantities of products A, B, and C that will yield the highest profit. Which of
the following is the most appropriate method to solve this issue?
Table 1 Manufacturing conditions
Product                                      A   B   C
Time required to manufacture (hours/kg)      2   3   1
Necessary amount of material M (liters/kg)   2   1   2
Profit ($/kg)                                8   5   5

Table 2 Allocatable amounts of resources
Manufacturing time (hours/month)       240
Amount of material M (liters/month)    150
Now, take the n-bit binary number with all ones (1’s) and convert it into two’s complement.
    111…11
      ↓ each bit is reversed
    000…00
+)       1 ← 1 is added
    000…01 = (1)10
Since the result of converting it into two’s complement is (1)10, the number whose bits are all ones
(1’s) is –1.
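The conversion above can be checked with a short sketch. The function names and the 8-bit and 16-bit widths are chosen here only for illustration; they do not come from the question.

```python
def twos_complement(value, n_bits):
    """Return the two's complement of an n-bit pattern: reverse every bit, then add 1."""
    mask = (1 << n_bits) - 1           # n ones, e.g. 0b11111111 for n_bits = 8
    return ((value ^ mask) + 1) & mask


def as_signed(pattern, n_bits):
    """Interpret an n-bit pattern as a signed two's-complement integer."""
    if pattern & (1 << (n_bits - 1)):  # sign bit set -> negative value
        return pattern - (1 << n_bits)
    return pattern


all_ones = 0b11111111                     # 8-bit pattern whose bits are all ones
magnitude = twos_complement(all_ones, 8)  # absolute value of the number
signed_value = as_signed(all_ones, 8)     # the number itself
```

Under these assumptions `magnitude` is 1 and `signed_value` is -1, which agrees with the explanation.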
Here, since we are solving a system of linear equations with 1,000 variables, we substitute
x = 1,000 = 10^3 in the equation (2):
  Y = 0.5 × 10^-6 × (10^3)^3
    = 0.5 × 10^-6 × 10^9
    = 0.5 × 10^3
    = 500 (seconds).
Therefore, the answer is to simply perform the logical AND operation with (01111111)2 = (7F)16.
In the following explanations, “x” means either one (1) or zero (0).
A)      x x x x x x x x ← 8-bit code
   AND) 0 0 0 0 1 1 1 1 = (0F)16
        0 0 0 0 x x x x
Here, only the lower four (4) bits are produced.
B)      x x x x x x x x ← 8-bit code
   OR)  0 0 0 0 1 1 1 1 = (0F)16
        x x x x 1 1 1 1
Here, only the higher four (4) bits are produced, and the lower four (4) bits are all 1s.
C)      x x x x x x x x ← 8-bit code
   AND) 0 1 1 1 1 1 1 1 = (7F)16
        0 x x x x x x x
Here, the most significant bit (parity bit) is zero (0), and the lower seven (7) bits stay the same.
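The three mask operations can be tried on a concrete value. The sample code 0b11000001 and the names below are hypothetical and serve only to illustrate the masks.

```python
PARITY_CLEAR_MASK = 0x7F  # (01111111)2: keeps the lower 7 bits, clears the top bit


def clear_parity_bit(code):
    """Set the most significant bit of an 8-bit code to 0 and keep the other bits."""
    return code & PARITY_CLEAR_MASK


code = 0b11000001                 # hypothetical 8-bit code with the parity bit set
low_nibble = code & 0x0F          # case A: only the lower four bits remain
high_with_ones = code | 0x0F      # case B: lower four bits are forced to 1
cleared = clear_parity_bit(code)  # case C: parity bit cleared
```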
(A ∪ B ∪ C) is the set of parts with defect A, defect B, or defect C. Hence, the set of defect-free
parts is the complement of this set, and so the answer can be calculated by subtracting the number of
parts included in (A ∪ B ∪ C) from the total number 100.
Let us denote the number of elements in the set (A ∪ B ∪ C) as n(A ∪ B ∪ C). We then have the
following expression:
  n(A ∪ B ∪ C) = n(A) + n(B) + n(C) − n(A ∩ B) − n(A ∩ C) − n(B ∩ C) + n(A ∩ B ∩ C)   (1)
Now, we substitute the numbers given in the question.
  n(A) = 11 (parts with defect A; may also have defects B and/or C)
  n(B) = 7 (parts with defect B; may also have defects A and/or C)
  n(C) = 4 (parts with defect C; may also have defects A and/or B)
  n(A ∩ B) = 3 (parts with defects A and B; may also have defect C)
  n(A ∩ C) = 2 (parts with defects A and C; may also have defect B)
  n(B ∩ C) = 0 (parts with defects B and C; may also have defect A)
Substitute these values in the equation (1), and we get
  n(A ∪ B ∪ C) = 11 + 7 + 4 − 3 − 2 − 0 + n(A ∩ B ∩ C)
               = 17 + n(A ∩ B ∩ C).
Now, since n(B ∩ C) = 0, (B ∩ C) is the empty set. Hence, (A ∩ B ∩ C) is also the empty set,
and we conclude that n(A ∩ B ∩ C) = 0.
  ∴ n(A ∪ B ∪ C) = 17.
Therefore, the number of defect-free parts, that is, the number of elements in the complement of
(A ∪ B ∪ C), is obtained as follows:
  100 − n(A ∪ B ∪ C) = 100 − 17
                     = 83 (parts).
Since the number of defect-free parts is 83, the correct answer is (b).
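As a quick check of the arithmetic, equation (1) can be written as a small function. This is only a sketch; the function and variable names are made up for illustration.

```python
def union_size(n_a, n_b, n_c, n_ab, n_ac, n_bc, n_abc):
    """Inclusion-exclusion for three sets, as in equation (1)."""
    return n_a + n_b + n_c - n_ab - n_ac - n_bc + n_abc


TOTAL_PARTS = 100
# n(A ∩ B ∩ C) is 0 because n(B ∩ C) is 0.
defective = union_size(11, 7, 4, 3, 2, 0, 0)
defect_free = TOTAL_PARTS - defective
```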
Below, we explain why the equation (1) holds.
Consider a Venn diagram of the sets A, B, and C whose seven regions are numbered as follows:
① A only, ② B only, ③ C only, ④ A and B only, ⑤ B and C only, ⑥ A and C only, and ⑦ A, B, and C.
n(A), n(B), and n(C) can be expressed as follows:
  n(A) = ① + ④ + ⑥ + ⑦
  n(B) = ② + ④ + ⑤ + ⑦
  n(C) = ③ + ⑤ + ⑥ + ⑦
Thus, n(A) + n(B) + n(C) can be expressed as follows:
  n(A) + n(B) + n(C) = ① + ④ + ⑥ + ⑦ + ② + ④ + ⑤ + ⑦ + ③ + ⑤ + ⑥ + ⑦
                     = (① + ② + ③ + ④ + ⑤ + ⑥ + ⑦) + ④ + ⑤ + ⑥ + ⑦ + ⑦
On the other hand, n(A ∪ B ∪ C) is the sum of ① through ⑦, so we get n(A) + n(B) + n(C) as
below:
  n(A) + n(B) + n(C) = (① + ② + ③ + ④ + ⑤ + ⑥ + ⑦) : n(A ∪ B ∪ C)
                     + (④ + ⑦)                     : n(A ∩ B)
                     + (⑤ + ⑦)                     : n(B ∩ C)
                     + (⑥ + ⑦)                     : n(A ∩ C)
                     − ⑦                           : n(A ∩ B ∩ C)
This gives us the formula:
  n(A) + n(B) + n(C) = n(A ∪ B ∪ C) + n(A ∩ B) + n(A ∩ C) + n(B ∩ C) − n(A ∩ B ∩ C)
  ∴ n(A ∪ B ∪ C) = n(A) + n(B) + n(C)
                 − n(A ∩ B) − n(A ∩ C) − n(B ∩ C) + n(A ∩ B ∩ C)
The logical formula Z = X • Ȳ + X̄ • Y is the exclusive logical sum (also called exclusive OR). The
result of its logical operation is 0 if the logical variables X and Y have the same value and 1 if X and Y
are different.
Hence, the truth table for the logical expression Z = X • Ȳ + X̄ • Y is as follows:
  X  Y  Z
  0  0  0
  0  1  1
  1  0  1
  1  1  0
If you cannot immediately recognize the formula as the exclusive logical sum, you can also actually do
the operation Z = X • Ȳ + X̄ • Y and verify the answer.
  X  Y  X̄  Ȳ  X • Ȳ  X̄ • Y  Z = X • Ȳ + X̄ • Y
  0  0  1  1    0      0            0
  0  1  1  0    0      1            1
  1  0  0  1    1      0            1
  1  1  0  0    0      0            0
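The same verification can be done mechanically. The sketch below reads the formula as Z = (X AND NOT Y) OR (NOT X AND Y) and compares it with the built-in exclusive OR operator; the function name `z` is used only for illustration.

```python
def z(x, y):
    """Z = (X AND NOT Y) OR (NOT X AND Y) for bits x and y (0 or 1)."""
    not_x, not_y = 1 - x, 1 - y
    return (x & not_y) | (not_x & y)


# One row (X, Y, Z) per combination of inputs, in the order used by the truth table.
truth_table = [(x, y, z(x, y)) for x in (0, 1) for y in (0, 1)]
```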
In general, using n-bit binary numbers, we can express values ranging from 0 through 2^n − 1, that is,
2^n different patterns, if we do not take negative numbers into consideration. In this question, we
need 36 characters, so n must satisfy the following condition.
  2^n ≥ 36
It is a bit complicated to solve this for n exactly, so here is an easier way as follows:
  2^5 = 32 < 36 < 2^6 = 64
Based on the above result, 6 bits are sufficient to express 36 different patterns.
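The search for the smallest sufficient number of bits can be sketched as a loop. The function name is hypothetical.

```python
def bits_needed(patterns):
    """Smallest n such that n bits give at least `patterns` different bit patterns."""
    n = 0
    while (1 << n) < patterns:  # 1 << n is 2 to the power n
        n += 1
    return n


bits_for_36 = bits_needed(36)
```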
A character string is accepted if it begins at the initial state ( ) and ends at the accept state
The diagram is followed from left to right; for example, the first syntax diagram in the question
indicates that the character string does not have to begin with + or − and that a numeral can be
repeated after that. The radix point “.” is never located at the beginning.
[Figure: syntax diagram fragment in which a numeral can be repeated]
For these reasons, −100, 5.3, and +13.07 do conform to the syntax shown in the example.
In the second syntax diagram given in the question, since the order “(+, −, or nothing) → numeral(s)
→ radix point” is specified, there must be a numeral before the radix point. Further, another part
shows the order “E → (+, −, or nothing) → numeral(s),” so “E” must be followed by +, −, or a
numeral. Taking this fact into consideration, you can check each expression listed in the answer
group.
a) The radix point “.” must be preceded by a numeral, but “–” appears instead of a numeral.
b) Since 5.2 conforms to the order “numeral(s) →radix point → numeral(s)” and + or – is omitted at
the beginning, it conforms to the syntax. Further, E−07 is of the order “E → − → numeral(s),” so
it conforms to the syntax also. Hence, 5.2E−07 conforms to the syntax.
c) “E” must be followed by +, −, or a numeral.
d) “E” cannot be immediately preceded by a radix point.
the left subtree, follows the path a → b → c, and reaches “h.” After “h,” it scans node “c” and
proceeds to its right subtree “i.” Hence, the scan runs “h → c → i →…(hcibdajfegk),” which is the
answer option (c).
Post-order traversal goes in the order of “left subtree, right subtree, and node,” so it runs “h → i → c
→…(hicdbjfkgea),” which is the answer option (d).
Pre-order, in-order, and post-order traversal methods are called depth-first traversal; when scanning a
binary tree under these methods, the following patterns hold:
In a binary tree as shown below, if the values are output as you pass on the left side of the nodes, it is
pre-order traversal. If they are output when you pass under the nodes, it is in-order traversal. If they
are output when you pass on the right side of the nodes, it is post-order traversal.
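The three depth-first traversals can be sketched as follows. The tree used here is an assumption: it is reconstructed from the in-order and post-order strings quoted in the explanation (hcibdajfegk and hicdbjfkgea), since the original figure is not reproduced in this text.

```python
# Assumed tree, reconstructed from the traversal strings in the explanation.
# Each entry maps a node to its (left child, right child); leaves are omitted.
TREE = {
    "a": ("b", "e"),
    "b": ("c", "d"),
    "c": ("h", "i"),
    "e": ("f", "g"),
    "f": ("j", None),
    "g": (None, "k"),
}


def traverse(node, order):
    """Return the visiting sequence as a string; order is 'pre', 'in' or 'post'."""
    if node is None:
        return ""
    left, right = TREE.get(node, (None, None))
    left_part = traverse(left, order)
    right_part = traverse(right, order)
    if order == "pre":
        return node + left_part + right_part   # node, left subtree, right subtree
    if order == "in":
        return left_part + node + right_part   # left subtree, node, right subtree
    return left_part + right_part + node       # left subtree, right subtree, node


in_order = traverse("a", "in")
post_order = traverse("a", "post")
pre_order = traverse("a", "pre")
```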
4
6 6
5 7 7 7
1 1 1 1 1 1
6 3 (final result)
7 7 7
1 1 1
a) Hashing is a method for determining the storage location; it is unrelated to what is used as the data
structure.
b) There are occasions in which hashing gives the same address for different key values. This is
called a collision. In hashing, collisions cannot be prevented. If a collision happens, the data
already stored there is called home, and the data that caused the collision is called a synonym.
d) The data is converted to the address using a hash function, so the conversion takes a constant
amount of time. In other words, it does not depend on the size of the table (the number of
elements in the table). If a collision occurs, re-hashing takes place, and the method of re-hashing
does impact the time required for search.
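A collision and its handling can be illustrated with a small sketch. Everything here is an assumption chosen for illustration: the table size 7, the division-method hash function, the keys 10 and 17, and linear probing as the re-hashing method (one of several possible methods).

```python
TABLE_SIZE = 7  # hypothetical table size


def hash_address(key):
    """Division-method hash function: the remainder is the storage address."""
    return key % TABLE_SIZE


def insert(table, key):
    """Store key; on a collision, probe the next addresses (linear probing)."""
    addr = hash_address(key)
    for _ in range(TABLE_SIZE):
        if table[addr] is None:
            table[addr] = key
            return addr
        addr = (addr + 1) % TABLE_SIZE  # re-hash: try the next slot
    raise OverflowError("hash table is full")


table = [None] * TABLE_SIZE
home_addr = insert(table, 10)     # stored at its own hash address (home)
synonym_addr = insert(table, 17)  # same hash address as 10 (synonym), stored elsewhere
```

The extra probing step for the synonym is the part of the search time that depends on the re-hashing method.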
Next, we follow the flowchart of the question. The initial value of “i” is 1 (1 → i), and “i” is
increased by 1 each time (by “i + 1 → i” in the loop) until “i = n,” at which point the process escapes
the loop. Further, understand “A(i): A(i + 1)” to mean that both sides of this array are compared to
each other. For example, if “i = 1,” A(1) and A(2) are compared. If “i = 2,” then A(2) and A(3) are
compared. Comparing the neighboring elements and swapping them if they are out of sequence are
characteristics of the bubble sort algorithm.
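The comparison of neighboring elements can be sketched as below. This is a generic bubble sort in ascending order, not a transcription of the flowchart in the question, whose exact loop structure is not reproduced here; the function names are hypothetical.

```python
def bubble_pass(a, n):
    """Compare A(i) with A(i + 1) for i = 1 .. n - 1 (1-based) and swap if out of order."""
    for i in range(1, n):
        if a[i - 1] > a[i]:                  # A(i) : A(i + 1)
            a[i - 1], a[i] = a[i], a[i - 1]  # swap the neighboring elements


def bubble_sort(values):
    """Repeat the pass over a shrinking range until the whole list is in order."""
    a = list(values)
    for n in range(len(a), 1, -1):
        bubble_pass(a, n)
    return a


first_pass = [5, 3, 8, 1]
bubble_pass(first_pass, len(first_pass))  # the largest value moves to the end
sorted_values = bubble_sort([5, 3, 8, 1])
```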
0 0 0 1
0 1 0 1
1 0 0 1
1 1 1 0
If you do not recognize the logical product, you may carry out the logical operations of the answer
group and check the results.
  A  B   a) A AND B   b) NOT (A AND B) (NAND)   c) A OR B   d) A XOR B (Exclusive OR)
  0  0       0                   1                  0                 0
  0  1       0                   1                  1                 1
  1  0       0                   1                  1                 1
  1  1       1                   0                  1                 0
As shown in the diagram in the question, the first instruction requires an execution time of 6 steps, but
each of the other instructions starts one step after the preceding one, so it adds only one step to the
total. Here, since six instructions are executed, the first instruction takes 6 steps while each of the
other five instructions adds only one step, a total of 5 steps. Hence, it takes a total execution time of
11 steps to complete the process.
Execution time for 6 instructions = 11 (steps) × 10 (nanoseconds/step)
= 110 (nanoseconds)
You can verify the result by using the figure illustrated below.
                      Number of steps executed
                      1  2  3  4  5  6  7  8  9 10 11
First instruction     1  2  3  4  5  6
Second instruction       1  2  3  4  5  6
Third instruction           1  2  3  4  5  6
Fourth instruction             1  2  3  4  5  6
Fifth instruction                 1  2  3  4  5  6
Sixth instruction                    1  2  3  4  5  6
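The step count generalizes to any pipeline depth and instruction count. The sketch below assumes an ideal pipeline in which each instruction starts exactly one step after the previous one, as in the figure; the function name is hypothetical.

```python
def pipeline_time_ns(stages, instructions, step_ns):
    """Total time for an ideal pipeline with no stalls."""
    # The first instruction needs all stages; each later one adds a single step.
    steps = stages + (instructions - 1)
    return steps * step_ns


total_ns = pipeline_time_ns(stages=6, instructions=6, step_ns=10)
```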
Number of instructions per second = 1 / (average instruction execution time)
  = 1 / (0.2 × 10^-6 (seconds/instruction))
  = (1 / 0.2) × 10^6 (instructions/second)
  = 5.0 × 10^6 (instructions/sec.)
  → 5.0 MIPS
If the CPU detects an interrupt, the OS stores the status of the program being executed prior to the
interrupt into PSW (Program Status Word). Then, it investigates the cause of the interrupt and
transfers control to the processing routine. Another interrupt may also occur during an interrupt
processing, so multiple interrupts are controlled by assigning each processing priority, depending on
the type of interrupt.
Clock mechanism interrupt (Timer interrupt): Expiration of specific time (Interval timer), specific
  time stamp
I/O interrupt: I/O completed, I/O unit status change (printer out of paper, etc.)
External signal interrupt: Due to instructions from the system console or external signals
Internal interrupt
  Program interrupt (exceptional interrupt): Overflow, invalid instruction code (undefined
    instruction code), divide-by-zero (operation exception), memory protection exception
  Interruption for calling control program (instruction interrupt): I/O operation request, task
    switching, page fault, calling the control program function (supervisor call: SVC)
The answers a), c), and d) in the answer group are classified as internal interrupts.
Type Characteristics
CD-ROM CD-Read Only Memory
CD-R CD-Recordable: data can be written only once (no erasure or rewriting)
CD-RW CD-Rewritable: after complete erasure, re-writing is allowed
a) Disc at once (DAO) is a way of writing on CD-R and CD-RW. It can write data on the entire disc
all at once. No additional writing is allowed.
b) Track at once (TAO) is a way of writing on CD-R. It can write data in track units. After data is
written, additional writing is allowed.
c) Packet writing is a way of writing on CD-R. Whereas the writing unit for normal CD-R and
CD-RW is a track, this method divides data into smaller blocks for writing. Since writing is done
in smaller units, CD-R can be used in the same manner as a floppy disk or MO.
d) Multi-session is a way of writing on CD-R and CD-RW. It can write data in multiple sessions. To
write using multi-session, the CD-R drive must support writing by the TAO method. A session is
an area in which multiple tracks are joined together.
The reasons that the other options in the answer group are not correct are as follows:
a) Audio and video require high-speed transfer, but USB 1.1 connects relatively low-speed devices.
Daisy chaining is used with SCSI. Connection in a tree topology without a host PC corresponds to
10BASE-T.
c) This is an explanation of RS-232C.
d) This is an explanation of SCSI.
The TFT (Thin Film Transistor) method is one in which the screen dots are controlled by thin film
transistor (TFT). It has superior contrast, grayscale, and response speed. The STN (Super Twisted
Nematic) method has a simple structure, so the manufacturing costs are low; however, its display
quality is inferior to the quality of TFT. The DSTN (Dual-scan Super Twisted Nematic) method is an
upgrade of STN with a higher response speed.
The following table shows a relative comparison between LCDs and OLEDs. A circle means the
superior one.
Thinness  Large panel  Field of view  Life  Power consumption  Manufacturing costs
LCD ○ ○
About the same
OLED ○ ○ ○
[Figure: time chart from 0 to 22 milliseconds. The bursts in each row are listed below in time
order, with their lengths in milliseconds.]
CPU       High-3  Mid-2  Low-1  High-2  Mid-2  Low-2  High-2  Mid-2  Low-1
High I/O  5  5
Mid I/O   6  5
Low I/O   5  4
Hence, in the time scale shown in the chart, the CPU idle time is 2 milliseconds between 6 and 8, 1
millisecond between 10 and 11, and 1 millisecond between 17 and 18; that is, a total of 4 milliseconds
is the idle time.
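The idle time can be reproduced with a small millisecond-by-millisecond simulation. This is only a sketch under stated assumptions: the burst lengths are read from the chart, each task has its own I/O unit, and the CPU always goes to the highest-priority task that is ready. The function name and data layout are hypothetical.

```python
def simulate(tasks):
    """tasks: burst lists in descending priority; bursts alternate CPU, I/O, CPU, ...

    Returns (CPU idle time, finish time), both in milliseconds.
    """
    state = [{"bursts": b, "idx": 0, "left": b[0]} for b in tasks]
    now = idle = 0
    while any(s["idx"] < len(s["bursts"]) for s in state):
        active = [s for s in state if s["idx"] < len(s["bursts"])]
        # An even burst index means the task wants the CPU; the first such task
        # in priority order runs during this millisecond.
        running = next((s for s in active if s["idx"] % 2 == 0), None)
        if running is None:
            idle += 1  # every unfinished task is doing I/O
        for s in active:
            if s is running or s["idx"] % 2 == 1:  # CPU holder or task in I/O
                s["left"] -= 1
                if s["left"] == 0:                 # burst finished, move to the next
                    s["idx"] += 1
                    if s["idx"] < len(s["bursts"]):
                        s["left"] = s["bursts"][s["idx"]]
        now += 1
    return idle, now


HIGH = [3, 5, 2, 5, 2]  # CPU 3, I/O 5, CPU 2, I/O 5, CPU 2
MID = [2, 6, 2, 5, 2]
LOW = [1, 5, 2, 4, 1]
idle_ms, finish_ms = simulate([HIGH, MID, LOW])
```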
CPU waiting time I/O unit waiting time CPU waiting time Ending process
↓ ↓ ↓ ↓
a) “..” indicates the parent directory, which is Directory A1. The next part “\A1” points to Directory
A1 under Directory A1, but there is no such directory in this figure.
c) “A1” points to Directory A1 under the current directory B1, but there is no such directory in this
figure.
d) “B1” points to Directory B1 under the current directory B1, but there is no such directory in
this figure.
In the function layer, search conditions sent from the presentation layer are assembled as the
processing conditions to access the database and sent to the data layer. Then, the response from the
data layer is manipulated to meet the request of the presentation layer and then sent to the presentation
layer.
a) Search conditions are sent to the presentation layer, and the data manipulation conditions are
assembled in the function layer.
b) Search conditions are sent to the presentation layer, and data access takes place in the data layer.
d) Data access takes place in the data layer, and data is manipulated in the function layer.
CPU
Main
memory Auxiliary memory
CPU
In this formula, the probability that every subsystem is unavailable is subtracted from the whole
(probability 1).
We substitute p = 0.7 (70%) into the equation for availability and calculate the availability of this
n-subsystem parallel configuration (call it A).
  A = 1 − (1 − 0.7)^n
    = 1 − 0.3^n
We want the availability A to be at least 99% (0.99), so the following inequality must hold.
  1 − 0.3^n ≥ 0.99
  1 − 0.99 ≥ 0.3^n
  0.01 ≥ 0.3^n
  ∴ 0.3^n ≤ 0.01
This inequality involves an exponential function, so it takes a long time to solve it by hand.
Therefore, plug in integers, starting at n = 2, and see if the condition is satisfied.
  n = 2: 0.3^2 = 0.09 > 0.01
  n = 3: 0.3^3 = 0.027 > 0.01
  n = 4: 0.3^4 = 0.0081 ≤ 0.01
Hence, when n = 4, the inequality 0.3^n ≤ 0.01 holds. As a result, at least 4 subsystems are required.
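The trial-and-error search can be sketched as a loop over n. The function name is hypothetical.

```python
def min_parallel_subsystems(p, target):
    """Smallest n for which 1 - (1 - p)**n reaches the target availability."""
    n = 1
    while 1 - (1 - p) ** n < target:
        n += 1
    return n


required = min_parallel_subsystems(p=0.7, target=0.99)
```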
External design → Internal design → Program design → Programming → Testing → Installation, operation
The external design is the system design without regard to a computer. It involves the definition and
deployment of subsystems, designs of the screen and reports, code design, and logical data design.
Then, external design documents are prepared.
b) This is a design technology focusing on the data structure. The Jackson method and the Warnier
method take this approach.
c) This explains emulation.
d) This is generally done in systems development and is not an explanation of reverse engineering.
In addition, white box testing has various techniques such as instruction coverage, decision condition
coverage (branch coverage), condition coverage (branch condition coverage), decision
condition/condition coverage, and multiple condition coverage. Thus, the answer is (d).
(a) is an explanation of bottom-up testing, (b) is top-down testing, and (c) is black-box testing.
Data   :  7     3     9     4
          ×     ×     ×     ×
Weight :  1     2     3     4
          =     =     =     =
Sum    :  7  +  6  + 27  + 16  = 56
[Step 2] Divide the sum by the base (11) and find the remainder as follows:
56 ÷ 11 = 5 remainder 1
[Step 3] Subtract the remainder from the base (11). The digit in the resulting one's place is the check
digit as follows:
11 − 1 = 10 → the digit in the one's place, 0, is the check digit
Hence, the result of appending the check digit to the given data is 73940.
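The steps can be sketched as a function. The function name is hypothetical, and the rule applied is the one described in the steps: weighted sum, remainder on division by the base 11, then the one's place of (base − remainder).

```python
def check_digit(data, weights, base=11):
    """Weighted-sum check digit: one's place of (base - (sum mod base))."""
    total = sum(int(d) * w for d, w in zip(data, weights))  # 7*1 + 3*2 + 9*3 + 4*4
    remainder = total % base
    return (base - remainder) % 10                          # keep only the one's place


data = "7394"
code_with_check_digit = data + str(check_digit(data, [1, 2, 3, 4]))
```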
b) Both of these test cases turn out to be “true,” so we do not have two different results: true and
false.
     A      B      A OR B
     F (0)  T (1)  T (1)
     T (1)  F (0)  T (1)
c) Two sets of test data are prepared: a true case and a false case. These can be used for branch
coverage.
     A      B      A OR B
     F (0)  F (0)  F (0)
     T (1)  T (1)  T (1)
d) All three sets of data give true cases, so we do not have two different results: true and false.
     A      B      A OR B
     F (0)  T (1)  T (1)
     T (1)  F (0)  T (1)
     T (1)  T (1)  T (1)
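The check made for options b) through d) can be sketched in code: a set of test cases achieves branch coverage of the decision "A OR B" only if the decision evaluates to both true and false. The names below are hypothetical.

```python
TEST_SETS = {
    "b": [(False, True), (True, False)],
    "c": [(False, False), (True, True)],
    "d": [(False, True), (True, False), (True, True)],
}


def achieves_branch_coverage(cases):
    """True if the decision 'A OR B' takes both outcomes over the test cases."""
    outcomes = {a or b for a, b in cases}
    return outcomes == {True, False}


covering = [name for name, cases in TEST_SETS.items() if achieves_branch_coverage(cases)]
```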
Function                 Explanation
External input           The number of points determined by means of the type and complexity
                         degree of the external input
External output          The number of points determined by means of the type and complexity
                         degree of the external output
External inquiry         The number of points determined by means of the type and complexity
                         degree of the external inquiry
Internal logic file      The number of points determined by means of the type and complexity
                         degree of the file accompanying access
External interface file  The number of points determined by means of the type and complexity
                         degree of interface linked to another system
[Figure: the unit processed consists of the original data (XXXXXX) followed by the check digit (○)]
The check digit is obtained by performing a certain operation on the original data. When the data is
entered, the entire unit including the check digit is entered. The computer then uses the same operation
used to create the check digit to verify the check digit entered. If the calculated check digit matches
the check digit entered, the input is determined to be valid.
Check digits are generally used to check codes. If the original code has 5 digits, a check digit is
appended, and the data is processed as 6-digit code.
b) This is an explanation of the array function for codes. If customer codes are assigned in
consecutive numbers, the customers can be listed in the order in which they were entered.
c) This is an explanation of the identification function for codes. For example, if customer codes
are given like “MINATO-011,” then it is immediately obvious that the customer is in
“MINATO.” Another example may be a product code “TV-001,” which immediately identifies
the product as a television.
d) This is an explanation of the classification function for codes. For example, if customer codes are
of the form “1-100,” where the leading digit is a region code, the data can be geographically
classified.
Graphs (a) and (d) are the opposite of a diminishing charge system since the rate of increase of the
usage charge increases as well.
Graph (b) cannot be right because the rate of increase goes down in a diminishing charge system; the
charge does not become constant.
As a result, this parity check can detect and correct up to 1 bit error.
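The correction of a single bit can be sketched as follows. The data and parity bits are the ones shown in the figure of the question; reading them as even parity is consistent with those values. The sketch assumes that the error is in a data bit and that the parity bits themselves arrive intact. All names are hypothetical.

```python
def parities(data):
    """Even parity of every row and every column of a bit matrix."""
    rows = [sum(r) % 2 for r in data]
    cols = [sum(c) % 2 for c in zip(*data)]
    return rows, cols


def correct_single_error(data, row_parity, col_parity):
    """Flip the bit at the intersection of the one failing row and the one failing column."""
    rows, cols = parities(data)
    bad_rows = [i for i, (got, want) in enumerate(zip(rows, row_parity)) if got != want]
    bad_cols = [j for j, (got, want) in enumerate(zip(cols, col_parity)) if got != want]
    fixed = [list(r) for r in data]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        fixed[bad_rows[0]][bad_cols[0]] ^= 1
    return fixed


DATA = [[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 0], [1, 1, 0, 1]]
ROW_PARITY = [1, 0, 1, 1]
COL_PARITY = [0, 0, 0, 1]

received = [list(r) for r in DATA]
received[2][1] ^= 1  # inject a 1-bit error in row 3, column 2
repaired = correct_single_error(received, ROW_PARITY, COL_PARITY)
```

With two or more errors the failing row and column no longer identify a unique bit, which is why the sketch leaves such data unchanged.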
Hence, the number of messages that can be sent per second is as follows:
  The number of messages that can be sent per second
    = 1 / (transmission time for one message)
    = 144 / 9 (messages)
The number of messages that can be sent per minute is as follows:
  The number of messages that can be sent per minute
    = The number of messages that can be sent per second × 60
    = (144 / 9) × 60 = 960 (messages)
However, the usage ratio of the line is 80%, so the effective number of messages that can be sent per
minute is: 960 × 0.8 = 768 (messages).
  = 1,200 / 2
  = 600 (bytes/sec)
Each byte is 8 bits, so the number of bits transferred per second is as follows:
  The number of bits transferred per second = 600 × 8
                                            = 4,800 (bits/sec)
(2) Calculation of the line usage ratio
Since the communication speed is 64,000 bits/sec and the amount of data transferred is 4,800 bits/sec,
the line utilization rate is calculated as follows:
  Line utilization rate = (4,800 (bits/sec) / 64,000 (bits/sec)) × 100
                        = (48 / 640) × 100
                        = 0.075 × 100
                        → 7.5 (%)
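The two steps can be combined into one short sketch. The function and parameter names are hypothetical; the values come from the question (1,000-byte files every 2 seconds, 20% control information, 64,000 bps).

```python
def line_usage_percent(file_bytes, interval_s, overhead_percent, line_bps):
    """Line usage ratio in percent, including the control information overhead."""
    bytes_per_transfer = file_bytes * (100 + overhead_percent) / 100  # 1,000 → 1,200
    bits_per_second = bytes_per_transfer / interval_s * 8             # 4,800 bits/sec
    return bits_per_second * 100 / line_bps


usage = line_usage_percent(file_bytes=1_000, interval_s=2, overhead_percent=20, line_bps=64_000)
```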
A B C D E F
A B C D E B F
This SQL statement calculates the difference (sales price – purchase price) for each of the products in
the Product table and extracts the rows where the value is 400 or more. When this SQL is executed,
two rows are extracted as shown in the following table.
Product code  Model  Sales price  Purchase price  Sales price – Purchase price  Extraction result
S001          T2003  1500         1000            500                           Extracted
S003          S2003  2000         1700            300
S005          R2003  1400         800             600                           Extracted
a) When the sales price of model R2003 is updated to 1300, (sales price – purchase price) becomes as
follows:
(sales price – purchase price) = 1300 – 800 = 500 >= 400
This was already extracted before the change and will be extracted after the change, so the number
of rows extracted will not change.
b) When the purchase price of model R2003 is updated to 900, (sales price – purchase price) becomes
as follows:
(sales price – purchase price) = 1400 – 900 = 500 >= 400
This was already extracted before the change and will be extracted after the change, so the number
of rows extracted will not change.
c) When the purchase price of model S2003 is updated to 1500, (sales price – purchase price)
becomes as follows:
(sales price – purchase price) = 2000 – 1500 = 500 >= 400
This was not extracted before the change and will be extracted after the change, so the number of
rows extracted will increase.
d) When the sales price of model T2003 is updated to 1300, (sales price – purchase price) becomes as
follows:
(sales price – purchase price) = 1300 – 1000 = 300 < 400
This was extracted before the change but will not be extracted after the change, so the number of
rows extracted will decrease.
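The four updates can be tried against an in-memory database. This is a sketch using Python's standard sqlite3 module, not the DBMS assumed by the question; the column names with underscores follow the view definition, and the UPDATE statements are written here from the descriptions in options a) through d).

```python
import sqlite3

ROWS = [
    ("S001", "PC T", "T2003", 1500, 1000),
    ("S003", "PC S", "S2003", 2000, 1700),
    ("S005", "PC R", "R2003", 1400, 800),
]

UPDATES = {
    "a": "UPDATE Product SET Sales_price = 1300 WHERE Model = 'R2003'",
    "b": "UPDATE Product SET Purchase_price = 900 WHERE Model = 'R2003'",
    "c": "UPDATE Product SET Purchase_price = 1500 WHERE Model = 'S2003'",
    "d": "UPDATE Product SET Sales_price = 1300 WHERE Model = 'T2003'",
}


def profitable_rows(update_sql=None):
    """Number of rows in the Profitable_Product view after an optional update."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Product (Product_code TEXT, Product_name TEXT, Model TEXT, "
        "Sales_price INTEGER, Purchase_price INTEGER)"
    )
    con.executemany("INSERT INTO Product VALUES (?, ?, ?, ?, ?)", ROWS)
    con.execute(
        "CREATE VIEW Profitable_Product AS SELECT * FROM Product "
        "WHERE Sales_price - Purchase_price >= 400"
    )
    if update_sql:
        con.execute(update_sql)
    count = con.execute("SELECT COUNT(*) FROM Profitable_Product").fetchone()[0]
    con.close()
    return count


before = profitable_rows()
after = {option: profitable_rows(sql) for option, sql in UPDATES.items()}
```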
ISO 9001 was revised in December 2000; requirements that had been distributed before were
organized into four categories as follows:
Management responsibility
Resource management
Product realization
Measurement, analysis, and improvement
It is characterized by items such as the concept of quality management systems and their continuous
improvement.
d) ISO 14001 is the international standard established by ISO so that corporations and organizations
can carry on business activities while taking the global environment into consideration. It certifies
the results of the organization's environment management, such as human resource development
and system establishment, to reduce the burden on the environment.
ISO 14001 is a standard for certification; third-party organizations (certification bodies) registered
with the government perform the assessment.
(2) Since x = 1000 and y = 60000, substitute these into the equation [2], and we get the following:
  60000 = 1000 × a − 3000a
        = 1000a − 3000a
        = −2000a
  ∴ a = −60000 / 2000 = −30
Hence, we can now find b as follows:
b = −3000a
= −3000 × (−30)
= 90000
Thus, the equation [1] is changed as follows:
y = −30x + 90000 ……[3]
(3) Substitute x = 1500 into the equation [3] to find the expected demand.
y = −30 × 1500 + 90000
= −45000 + 90000
= 45000 (units)
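The arithmetic in steps (2) and (3) can be checked with a short C sketch. It assumes, as in the working above, that the line has the form y = ax + b with b = −3000a, so the line crosses the x-axis at x = 3000. The function names demand_slope and expected_demand are hypothetical and chosen only for this illustration.

```c
#include <assert.h>

/* Hypothetical helper: slope a of the line y = a*x + b that passes through
   (x1, y1) and crosses the x-axis at x0, so that b = -x0 * a. */
double demand_slope(double x1, double y1, double x0)
{
    assert(x1 != x0);            /* otherwise the line would be vertical */
    return y1 / (x1 - x0);
}

/* Value of y at x on that same line. */
double expected_demand(double x, double x1, double y1, double x0)
{
    double a = demand_slope(x1, y1, x0);
    double b = -x0 * a;
    return a * x + b;
}
```

Calling expected_demand(1500, 1000, 60000, 3000) follows the same substitution as equation [3].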
[Figure: break-even chart. Horizontal axis 0 to 12,000, vertical axis 0 to 9,000; the fixed costs line and the break-even point are marked.]
(1) Draw the fixed costs line.
The fixed costs are constant ($1,000), so this is a line parallel to the x-axis.
Fixed cost line: y = 1,000
(2) Calculate the variable costs.
Variable costs are costs that are directly proportional to the quantity sold. If the variable costs are
y and the total sales are x, we have the following equation:
Variable costs: y = α x (α is the constant of proportionality.) ……(A)
From the table given in the question, we see that when x (total sales) is $10,000, y (variable costs)
is $8,000. Substituting these values into the equation (A), we can get the result as follows:
8,000 = α × 10,000. ∴ α = 8,000 ÷ 10,000 = 0.8
(3) Draw the total costs line.
The total costs are the sum of the variable costs and fixed costs. The total costs line is then
obtained by adding the expressions of (1) and (2) above.
Variable costs line (total costs line): y = 0.8x + 1,000 ……(B)
(4) Draw the total sales line.
Draw the total sales line so that it makes a 45-degree angle with the x-axis.
Total sales line: y = x ……(C)
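The break-even point is where lines (B) and (C) intersect: setting x = 0.8x + 1,000 gives 0.2x = 1,000, so x = 5,000. A minimal C sketch of that calculation follows; the function name break_even_sales is hypothetical.

```c
#include <assert.h>

/* Sales amount x at which total sales (y = x) equal total costs
   (y = variable_ratio * x + fixed_costs). */
double break_even_sales(double fixed_costs, double variable_ratio)
{
    assert(variable_ratio < 1.0);   /* otherwise the two lines never cross */
    return fixed_costs / (1.0 - variable_ratio);
}
```

With fixed costs of 1,000 and a variable cost ratio of 0.8, the function returns approximately 5,000 (up to floating-point rounding).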
The Delphi method takes advantage of the feedback characteristic. In this method, opinions of a large
sample of people are collected and analyzed through questionnaires, and the results of the survey are
summarized, shown to the respondents, and then the survey process is repeated. This method has many
advantages. First, as it employs an intuitive method, it is effective when applied to discontinuous
changes of technology. It can also help avoid the influence of the group dynamics that tend to
arise in regular face-to-face meetings. In addition, when a comment collected from the survey is
different from the majority’s opinion, invaluable new ideas can be obtained from reasons added by the
respondent. Hence, how the questions are set is an important key to the success of this method.
a) Cause analysis is to pick out and analyze the root causes of problems found in operations and
systems. There is no particular set method for this, but at least in this case, it does not repeat the
same survey as in the Delphi method.
b) A segment is a unit that has been minutely partitioned. A segment of cell phone service users may
be defined by classification according to gender or age group. In segment analysis, the frequency
of service use is studied for each of the finely segmented classes. For example, the analysis may
survey all cell phone service users based on their age as shown below:
c) Analysis of population dynamics is the study of population change from various perspectives. For
instance, it may study the number of childbirths and number of deaths. In this case, it will be
time-series analysis, so the survey is not repeated to the same subjects as in the Delphi method.
[Figure: chart marking the center (average) and an abnormal value]
a) A network diagram with arrows connecting individual activities and indicating their order
relationships is also known as an arrow diagram. Process bottlenecks are identified by analyzing
the activities along a critical path.
b) ABC analysis is suitable here.
d) Cause and effect diagram (or fishbone diagram) can be used.
a) This year's score being 0 means y = 0. Then, the value of x (last year's score) is as follows:
0 = 1.1x+10
1.1x = −10
∴ x = −9.090…
≅ −9.1 (rounded to the nearest tenth).
Hence, last year's score would be −9.1, not 10.
b) The average score needs to be calculated by adding individual scores. For instance, if the last
year's average is 50, i.e., x = 50, then we can see the relationship between x and y as follows:
y = 1.1 × 50 + 10
= 65
∴ y / x = 65 / 50 = 1.3
Regardless of the individual scores, the average score is not 1.1 times the score last year.
c) As seen under (b) above, the person who scored 50 last year would have scored 65 this year.
Hence, we can conclude that points were easier to earn on this year's exam than on last year's.
d) The analysis is only about the scores, and we cannot evaluate the contents of the examinations.
Further, the statement that “this year's exam scores were good” does not necessarily mean that
points were easier to earn.
b) The least squares method is used to estimate the relationship between two quantities. For
instance, if there is a fact that those who score high in mathematics also score high in science,
then science and mathematics test scores are plotted in a scatter diagram, with many sample
points. From the scatter diagram, one can obtain a relational expression between the scores of
mathematics and science. Once the equation is found, when you know the score of one test,
either mathematics or science, you can predict the other score.
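As a rough illustration of how such a relational expression can be obtained, the following C sketch fits a straight line y = slope × x + intercept to sample points using the standard least squares formulas. The function name least_squares_fit and its parameters are assumptions made for this example, not something defined in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Fit y = slope * x + intercept to n sample points by least squares. */
void least_squares_fit(const double *x, const double *y, size_t n,
                       double *slope, double *intercept)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    size_t i;

    assert(n >= 2);
    for (i = 0; i < n; i++) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    double denom = (double)n * sxx - sx * sx;
    assert(denom != 0.0);        /* all x values equal: no unique line */
    *slope = ((double)n * sxy - sx * sy) / denom;
    *intercept = (sy - *slope * sx) / (double)n;
}
```

Here x could hold the mathematics scores and y the science scores; once slope and intercept are known, a science score can be predicted from a mathematics score.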
d) The fixed order quantity system is a method of ordering in which the quantity to be ordered
remains constant but the time of ordering varies depending on the fluctuation of the demand. It
is the idea of letting the fluctuation of the demand be absorbed in the fluctuation of ordering
intervals. An order is placed when the inventory goes down below a certain level, so it is also
called the order point method.
Fundamental IT Engineer
Examination (Afternoon)
Trial
Question Nos. Q1 - Q5 Q6 - Q7 Q8 - Q9
Question Selection Compulsory Select 1 of 2 Select 1 of 2
Examination Time 150 minutes
Conditional expression
A repetition process with the condition at the top.
Process
The Process is executed while the Conditional expression
is True.
[Operator]
Operation                               Operator
Multiplication and division operation   * /
Addition and subtraction operation      + -
Relational operation                    > < >= <= = ≠
Logical sum, exclusive logical sum      or xor            Low
true false
Q1. Read the following descriptions of logical operations and full adders, and then answer
Subquestions 1 through 3.
(1) The logical circuit symbols for the main logical operations are as follows.
Logical product
Logical operation name Logical sum (OR) Exclusive logical sum (XOR)
(AND)
Logical circuit A A A
Z Z Z
symbols B B B
(2) Below is a figure which shows a full adder that adds binary numbers digit by digit with
considerations for carry. The table shown below is the truth table for that full adder.
[Figure: Full adder k. Ak, Bk: input; Zk: output]
Subquestion 1
From the answer group below, select the correct answer to be inserted in the blank in
the truth table of the full adder.
Answer group:
a) 0 0 b) 0 1
c) 1 0 d) 1 1
Subquestion 2
From the answer group below, select the correct answer to be inserted in the blank in
the logical circuit of the full adder.
Ak
Bk Zk
Ck
Ck+1
Answer group:
a) b) c)
Subquestion 3
When a logical circuit is configured with full adders to add n-digit binary numbers represented as
two’s complement, the addition of the most significant digits (An, Bn and Cn) causes an overflow
(the shaded part of the full adder truth table). A logical circuit for detecting this can be configured
with one XOR. Select from the answer group below the correct combination of X and Y inputs to
this logical circuit.
A1 A2 An
Full adder 1
Full adder 2
Full adder n
Z1 Z2 Zn
B1 B2 ... Bn
C1 C2 C2 C3 ... Cn Cn+1
X
V
Y
(Overflows when V = 1 )
Answer group:
Q2. Read the following description about the relational database, and then answer
Subquestions 1 through 3.
The following relational database consists of an employee table and an employee skill table.
Employee Table
Employee number   Employee name   Department
0001              Brown           A1
0002              Charles         A2
0003              Taylor          B1
0004              Williams        D3
0005              Parker          A1
0006              James           B1

Employee Skill Table
Employee number   Skill code   Date registered
0001              FE           19991201
0001              DB           20010701
0002              NW           19980701
0002              FE           19990701
0002              SW           20000701
0005              NW           19991201
Subquestion 1
From the answer groups below, select the correct answers to be inserted in the blanks
in the following description.
True is returned for the EXISTS phrase when the sub-query result exists, and false is returned if it
does not exist. When the following SQL statement is executed, the number of selected employees is
A .
The EXISTS phrase evaluates each respective row if the specified sub-query contains a reference
to a table different from that in the main query. When the following SQL statement is executed,
the number of selected employees is B .
a) 0 b) 1 c) 2 d) 3
e) 4 f) 5 g) 6
Subquestion 2
The following SQL statement outputs some employees’ employee numbers. Who will be selected?
From the answer group below, select the correct answer.
Answer group:
Subquestion 3
You want to add the employee’s name to the information obtained in Subquestion 2. From the
answer group below, select the correct answer to be inserted in the blank in the
following SQL statement.
Answer group:
Q3. Read the following program description, and then answer Subquestions 1 through 3.
[Program description]
Subquestion 1
A character string type array LINE[i] (i = 1, 2, …, MAX) and integer type variable TAIL are
defined in the program. The character string Li on Line i is stored in LINE[i] and an index of the
array element (“0” if empty), which always corresponds to the last line, is stored in the variable
TAIL.
Variable CP contains an index of the array element of the line to be processed. From the answer
group below, select the sub-program(s) for which the amount (order) of computation needed remains
constant, regardless of the number of lines (select all applicable answers).
TAIL
n n Ln
MAX
Fig. 1 Example: Implementation Using an Array
Answer group:
Subquestion 2
The program was changed so that it can handle a bi-directional list using pointers, as shown in
Figure 2. Element i in the list consists of the pointer to the element storing Line i–1, the character
string Li at Line i, and the pointer to the element storing Line i+1. If an applicable element does
not exist, “0” is stored as the pointer value.
The variable HEAD is a pointer to the element which stores the first line. The variable TAIL is a
pointer to the element which stores the last line.
The variable CP is a pointer to the element containing the line to be processed. From the answer
group below, select the sub-program(s) for which the amount (order) of computation needed remains
constant, regardless of the number of lines (select all applicable answers).
Answer group:
Subquestion 3
From the answer group below, select the correct answers to be inserted in the blanks in
the following text.
A bi-directional list using pointers was implemented using three arrays. Figure 3 is an example of a
case in which the maximum number of lines is 10. The index of the array element containing the
line to be processed is stored in the variable CP. The index of the array element containing the first
line is stored in the variable HEAD. The index of the array element containing the last line is stored
in the variable TAIL. The index of the first array element in the empty list is stored in the variable
EMPTY.
Assume that Lines L1, L2, L3, L4, and L5 are stored as shown in Figure 3 and CP = 8. If DELETE()
is executed to delete the array element containing L3, then the values of HEAD and TAIL are not
changed and EMPTY = 8, PREV[9] = A , NEXT[2] = B , and NEXT[8] =
C . Assume that the deleted array element is added to the beginning of the empty list.
Answer group:
a) 0 b) 1 c) 2 d) 3
e) 7 f) 8 g) 9 h) 10
Q4. Read the following program description and the program itself, and then answer
Subquestions 1, 2 and 3.
[Program Description]
The subprogram HeapSort is a program to sort integer values that are stored in an array in
ascending order by heapsorting.
(1) The Num items of integers (Num >= 2) to be sorted are stored in an array of global variables
A[1], A[2], ... , A[Num].
(2) The heap sort uses a binary tree to sort data. In order to represent a binary tree with an
array, when a certain node corresponds to A[i], the node for the left child corresponds to
A[2*i] and the node for the right child corresponds to A[2*i+1]. In the figure below,
the circles represent the nodes, the numbers inside the circles represent the node values, and
the actual array elements that store the values are shown next to the circles.
(3) As shown in the figure, the heap is a binary tree in which the value of each node is greater
than or equal to the values of its children.
[Program]
B /* Compare 2 elements */
A[Top] < A[L]
Swap(Top, L)
MakeHeap(L, Last)
Subquestion 1
From the answer groups below, select the correct answers to be inserted in the blanks
in the above program.
a) L ← Top b) L ← Top + 1
c) L ← Top * 2 d) L ← Top * 2 + 1
Subquestion 2
From the answer group below, select the correct answers to be inserted in the blanks
in the following description.
Using the heap in the figure, when Steps (iii) and (iv) of (4) in the section [Program Description]
are executed just once and Step (ii) is completed, the index of the array element in which the
Answer group:
a) 2 b) 3 c) 4 d) 5
e) 6 f) 7 g) 8 h) 9
Subquestion 3
The subprogram InitHeap that initially creates a heap can be created using MakeHeap.
From the answer group below, select the correct answer to be inserted in the blank in
the following program.
MakeHeap(Idx, Last)
Answer group:
Q5. Read the following description on program design, and then answer Subquestions 1 and
2.
You are going to design a program for a game that raises and lowers a flag. In the game, the player
responds to instructions (raise/lower red flag and/or white flag) using the input device by operating
buttons. A certain type of window is displayed when the correct response is made and another type
is displayed when an incorrect response is made. This pattern is repeated only for a set number of
times and the number of correct responses is displayed as the score. The game hardware
configuration and the windows for correct responses and incorrect responses are shown in Figure 1
below.
Initialization
value
Correct
response
Program
Up Up
Correct
Down Down
response
• The output device gives an instruction by voice and displays the correct response and the
player’s response after the response detection time has elapsed. After the game ends, the
score is displayed.
• The input device detects only the first button pressed after the instruction has been given and
notifies the program of this button.
[Explanation of Program]
(1) The program internally holds the current status of the flag.
Red flag status: Up or Down
White flag status: Up or Down
(2) The program reads the initial value file and sets the initial status of both the red flag and white
flag to “Down”.
(3) The program randomly selects a flag for the raise/lower instruction and a selectable
instruction, based on the status of the flags given in the table, and then sends that information
to the output device.
(4) There are 5 types of operations which can be received from the input device: raise the red flag,
lower the red flag, raise the white flag, lower the white flag, and move neither flag (when a
response is not detected within the response detection time).
(5) The program detects the player’s response. If correct, it adds “1” to the number of
correct responses and sends the correct response window to the output device. If
incorrect, it sends the incorrect response window to the output device and, after the incorrect
window display time in the initial value file has elapsed, it sets the flag in the window to the
correct state.
(6) The program repeats steps (3) through (5) the number of repetitions specified in the initial
value file.
(7) The program sends the player’s score to the output device.
Subquestion 1
Figure 2 below is a state transition diagram of the flags. From the answer groups below, select the
correct answers to be inserted in the blanks in the following description about Figure 2.
S0 indicates the initial status and the “raise the red flag” response corresponds to state transition (i)
from S0 to S1. In this case, S1 is the A state. And, the response corresponding to state
transition (ii) of S3 to S0 is B and that for state transition (iii) of S3 to S2 is C .
(i) S1
S0 S2
(iii)
(ii) S3
Subquestion 2
Figure 3 below is a flowchart of this program. The initial status is set in the initialization
processing. Other than the initialization processing and the setting of the number of repetitions,
what two processes reference the content of the initial value file?
Main process
Start start
Yes
Correct?
Repetition
No Correct count ← Correct count +1
Score display
Incorrect response Correct response
processing processing
End
Fig. 3 Flowchart
Answer group:
Select one question from Q6 and Q7. If two questions are selected, only the first
question will be graded.
Q6. Read the following description of a C program and the program itself, and then answer
Subquestion.
[Program Description]
This program parses a tagged character string, and picks up tag values into separate elements.
(1) The syntax for tagged character strings is defined as follows. Symbols used in syntax
notation are defined as given in Table 1 below. In addition, <, >, and / are used as tokens.
(3) The tagged_structure may include another tagged_structure. (See Figure 1 below.)
<STUDENT>BILL<AGE>14</AGE></STUDENT>
Low-level
tagged_structure
High-level
tagged_structure
[Program]
typedef struct {
char *tag;
int depth;
char *value;
} ELEMENT;
elmtbl[elmnum].tag = A ;
elmtbl[elmnum].depth = level;
for (; *mlstr != '>'; mlstr++);
*mlstr = '\0';
/* Tag_value processing */
elmtbl[elmnum].value = B ;
C ;
while ( D )
Subquestion
From the answer groups below, select the correct answers to be inserted in the blanks
in the above program.
Q7. Read the following description of a Java program and the program itself, and then
answer Subquestions 1 and 2.
[Program Description]
This program draws a point that moves within a rectangular space as shown below.
The point is represented with class Point, and this class stores the coordinates (x, y) that represent
the position of the point and the speed. The speed is a positive value and represents the distance
moved in the x-axis direction and y-axis direction per unit time.
The rectangular space in the figure is given by class Space, and the following class methods can be
called.
The abstract class Motion uses the point given by Point as its initial value within the constructor,
and repeatedly draws, moves and erases the point in this order to display the movement of the point.
The coordinates of the point after moving are given by the method update.
SimpleMotion is the subclass of Motion, and it implements methods update and main.
Method update expresses, within a rectangular space, the movement of a point that moves in a
straight line at a fixed speed, and bounces off the sides of the rectangle. Method main tests the
program.
Note that it is assumed that the initial values of the coordinates of Point generated by method
main are within the rectangular space, and collisions between points do not need to be taken into
account.
[Program 1]
[Program 2]
[Program 3]
Subquestion 1
From the answer groups below, select the correct answers to be inserted in the blanks
in the above programs.
Subquestion 2
From the answer group below, select the correct answer for the movement of point P in the
following figure when method run is executed. Assume that point P is what is given as Point to
the constructor of class SimpleMotion, and that the value of speed is 1.
Also assume that all the blanks in the program have the correct answers.
Answer group:
Select one question from Q8 and Q9. If two questions are selected, only the first
question will be graded.
Q8. Read the description of the following C program and the program itself, and then answer
Subquestions 1 through 4.
[Program Description]
M[7][0] M[7][7]
End point
(3) Except for the start point and end point, all squares on the outside edge of the maze (squares
whose row or column value within M is either “0” or “7”) are walls.
(4) Global variables are used by this program as follows.
Variable name Use
M Stores the maze data (two-dimensional array)
x Row of square inside maze
y Column of square inside maze
dir Direction of move while going through maze
Assume that initial values are loaded for these global variables.
The function maze searches through the maze to find the path from the entrance to the exit.
[Program]
(Line No.)
 1  #define UP 0
 2  #define RIGHT 1
 3  #define DOWN 2
 4  #define LEFT 3
 5  #define ROAD 0x00 /* Path code */
 6  #define WALL 0xff /* Wall code */
 7  #define SMAX 8
 8  #define ENTRANCE 0xf0 /* Start point code */
 9  #define EXIT 0xf1 /* End point code */
10
11  int rcheck(void);
12  int fcheck(void);
13  void go(void);
14  void maze(void);
15
16  int M[SMAX][SMAX], x, y, dir;
17
18  void maze()
19  {
20      while ( M[y][x] != EXIT ) {
21          if ( ( rcheck() == ROAD ) ||
22               ( rcheck() == EXIT ) ) {
23              dir = ( dir+1 ) % 4;
24              go();
25          }
26          else if ( ( fcheck() == ROAD ) ||
27                    ( fcheck() == EXIT ) ) go();
28          else dir = ( dir+3 ) % 4;
29      }
30      return;
31  }
32
33  int rcheck()
34  {
35      if ( dir == UP ) return M[y][x+1];
36      else if ( dir == RIGHT ) return M[y+1][x];
37      else if ( dir == DOWN ) return M[y][x-1];
38      else return M[y-1][x];
39  }
40
41  int fcheck()
42  {
43      if ( dir == UP ) return M[y-1][x];
44      else if ( dir == RIGHT ) return M[y][x+1];
45      else if ( dir == DOWN ) return M[y+1][x];
46      else return M[y][x-1];
47  }
48
49  void go()
50  {
51      if ( dir == UP ) y--;
52      else if ( dir == RIGHT ) x++;
53      else if ( dir == DOWN ) y++;
54      else x--;
55  }
Subquestion 1
Given the maze shown in Figure 2 below, what is the path output by this program? Select the correct
answer from the answer group below. 1 through 8 shown in the figure are used to represent the
location of squares in the maze. In this program, the value of the global variables x, y, and dir
are as follows at the time the function maze is called:
x = 1
y = 0
dir = DOWN
Start point
M[0][0] M[0][7]
1 3 ③
4 8
2 5 7
M[7][0] M[7][7]
End point
Fig. 2 Locations of Squares in Maze
Answer group:
Subquestion 2
Given the maze shown in Figure 3 below, what is the path output by this program? Select the correct
answer from the answer group below. In this program, the value of the global variables x, y, and
dir are as follows at the time the function maze is called.
x = 1
y = 0
dir = DOWN
Start point
M[0][0] M[0][7]
1 4
2 3
M[7][0] M[7][7]
End point
Answer group:
Subquestion 3
The following statement was added to the program to output the positions of the squares, including
the start points and end points, being passed through and the directions of the moves. What is the
best place to add this statement? Select the correct answer from the answer group below.
Assume that #include <stdio.h> is placed at the top of the program.
printf("dir=%d y=%d x=%d\n", dir, y, x);
Answer group:
Subquestion 4
Assume that the following function lcheck is used in place of rcheck to find maze solutions, and
that the function lcheck replaces rcheck on Lines 11, 21, and 22.
If this is done, what other changes must be made to the program? Select the correct answer from the
answer group below.
int lcheck()
{
    if ( dir == UP ) return M[y][x-1];
    else if ( dir == RIGHT ) return M[y-1][x];
    else if ( dir == DOWN ) return M[y][x+1];
    else return M[y+1][x];
}
Answer group:
Q9. Read the following description of a Java program and the program itself, and then
answer Subquestions 1 and 2.
[Program Description]
This is a program for an electronic calculator that performs addition, subtraction, multiplication,
and division operations on integers. The I/O component is supplied by a test program, and can be
used to test the main calculator component program.
(1) Class CalculatorEvent is an event that is generated when a calculator key is pressed.
The value of the field type represents the kind of event. The type is DIGIT, OPERATOR, or
CLEAR, representing respectively a number key (0 through 9) on the calculator, an operation
key (e.g., +) or the equal (=) key, and the Clear key (C). When the type is DIGIT,
the numerical value corresponding to the number key is stored in the field value. When the
type is OPERATOR, the character representing the type of operation or ‘=’ is stored in the field
value. When the type is CLEAR, value is not used.
(2) The interface CalculatorOutput declares the method display that displays numerical
values and errors on the calculator.
(3) The class Calculator is the main calculator itself.
Method eventDispatched receives events, and performs operations, etc., in accordance
with the event type.
Note that the results of the operations—addition, subtraction, multiplication, and division—of
two numerical values match the results of the same operations on Java’s int type.
(4) Class CalculatorTest is a program to test Calculator.
CalculatorOutput is implemented as an anonymous class. In this implementation,
method display outputs the numerical value or character string specified by
System.out. Method main generates CalculatorEvent from the character string
given by the argument args[0], and calls method eventDispatched of Calculator.
The correspondences between the characters and calculator keys are as shown in the
following table.
For example, the character string “2+7=” represents the calculator keys 2, +, 7, and = being pressed,
in that order. When the character string is passed to the method main as argument args[0], the
program outputs the following:
2
2
7
9
[Program 1]
[Program 2]
[Program 3]
public class Calculator {
private int accumulator = 0, register = 0;
private int operator = 0;
private CalculatorOutput output;
[Program 4]
Subquestion 1
From the answer groups below, select the correct answers to be inserted in the blanks
in the above programs.
a) CalculatorEvent(type, 0)
b) new CalculatorEvent(type, 0)
c) return new CalculatorEvent(type, 0)
d) super(type, 0)
e) this(type, 0)
a) implements CalculatorOutput()
b) interface CalculatorOutput()
c) new CalculatorOutput()
d) new Temp() implements CalculatorOutput
e) public class Temp implements CalculatorOutput
a) c - '0', CalculatorEvent.DIGIT
b) c, CalculatorEvent.DIGIT
c) CalculatorEvent.DIGIT
d) CalculatorEvent.DIGIT, c
e) CalculatorEvent.DIGIT, c - '0'
Subquestion 2
The following table shows the final output results when method main is executed with the
character strings below as the arguments.
From the answer group below, select the correct answer to be inserted in the blank in
the table.
Assume that all the blanks in the program have the correct answers.
3+4*5= 35
3*4***= D
3*4=+5 E
3+4/0= F
Answer group:
a) 0 b) 3 c) 4 d) 5
e) 7 f) 12 g) 17 h) 53
i) / by zero j) Error
Trial Exam
Answers & Comments on Afternoon Questions
Q1:
An adder is a circuit that adds binary bits and can be a half adder or a full adder. A half adder is a
circuit that does not consider carrying from lower (less significant) digits and is used in the operation
of the lowest order (least significant) digit. A full adder is a circuit that does take into account carrying
from lower digits and is used in addition of digits other than the lowest digit.
Ck + 1 Zk
Because Ck + 1 = 1 and Zk = 0, we have the following:
Input Output
Ck Ak Bk Ck + 1 Zk
1 0 1 1 0
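[Figure: logical circuit of the full adder. Ak and Bk feed an XOR gate (Ak XOR Bk) and an AND gate (Ak AND Bk); the shaded gate takes (Ak XOR Bk) and Ck as inputs; an OR gate outputs Ck+1.]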
As this figure shows, if we let P be the result of the logical operation in the shaded box, the result
of the logical sum (OR) of P and “Ak AND Bk” is Ck + 1. In addition, P is the result of some binary
operation involving “Ak XOR Bk” and Ck. From the answer group, it is clear that this operation is
the logical product (AND), logical sum, or exclusive logical sum (XOR). So we can organize all
this information as shown below.
Ck Ak Bk P Ak AND Bk (P OR Ak AND Bk) = Ck + 1
0 0 0 ? 0 0
0 0 1 ? 0 0
0 1 0 ? 0 0
0 1 1 ? 1 1
1 0 0 ? 0 0
1 0 1 ? 0 1
1 1 0 ? 0 1
1 1 1 ? 1 1
Here, the logical sum of P and “Ak AND Bk” is Ck + 1. So we can “estimate” what the value of P is
as follows. Since the operation here is the logical sum, if Ck + 1 is “0,” both arguments (inputs)
would have to be “0.” But if Ck + 1 is 1, then either both arguments are “1” or one of the arguments
is “1.”
Ak AND Bk Ck + 1 Estimated value of P
0 0 0
0 0 0
0 0 0
1 1 1, 0
0 0 0
0 1 1
0 1 1
1 1 1, 0
Now, from the estimated values of P, we can estimate the operation of “Ak XOR Bk” and Ck.
Again, the answer group offers only the logical product, logical sum, and exclusive logical sum, so
we limit our consideration to these.
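The elimination described above can also be carried out mechanically. The following C sketch tries each of the three candidate operations for P = (Ak XOR Bk) op Ck and checks whether P OR (Ak AND Bk) reproduces the Ck + 1 column for all eight input rows. The enum and function names are hypothetical, introduced only for this check.

```c
#include <assert.h>

/* Candidate operations for the shaded gate. */
enum gate { GATE_AND, GATE_OR, GATE_XOR };

int apply_gate(enum gate g, int p, int q)
{
    assert((p | q) <= 1);        /* inputs are single bits */
    switch (g) {
    case GATE_AND: return p & q;
    case GATE_OR:  return p | q;
    default:       return p ^ q;
    }
}

/* Carry out of a full adder: 1 when at least two of the inputs are 1. */
int expected_carry(int a, int b, int c)
{
    return (a + b + c) >= 2;
}

/* Returns 1 if P = (Ak XOR Bk) g Ck makes P OR (Ak AND Bk) equal to
   Ck+1 for all eight input rows, otherwise 0. */
int gate_matches_truth_table(enum gate g)
{
    int a, b, c;
    for (c = 0; c <= 1; c++)
        for (a = 0; a <= 1; a++)
            for (b = 0; b <= 1; b++) {
                int p = apply_gate(g, a ^ b, c);
                if ((p | (a & b)) != expected_carry(a, b, c))
                    return 0;
            }
    return 1;
}
```

Of the three candidates, only the logical product passes: with Ck = 1 and Ak = Bk = 0, both the logical sum and the exclusive logical sum would make P = 1 and therefore Ck + 1 = 1, which contradicts the table.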
Input Output
Ck Ak Bk Ck + 1 Zk
Case 1 0 1 1 1 0
Case 2 1 0 0 0 1
Below, we add Cn, An, and Bn for each of these two cases.
(1) Case 1
0 = Cn
1 = An
+) 1 = Bn
10
Cn + 1 Zn
In this case, An and Bn are highest bits, both of which are 1, implying that we are adding two
negative numbers. However, Zn, the highest bit of the result of the operation, is 0, indicating a
positive number. Hence, the operation is not performed correctly. This is because an overflow has
occurred, pushing a “1,” which indicates a negative number, over to Cn + 1 (which has disappeared).
(2) Case 2
1 = Cn
0 = An
+) 0 = Bn
01
Cn + 1 Zn
Here, An and Bn are highest bits, both of which are 0, so we are adding two positive numbers.
However, Zn, the highest order bit of the result of the operation, is “1,” indicating a negative
number. Hence, this operation is not performed correctly either. This is because an overflow has
occurred, pushing a “0,” which indicates a positive number, over to Cn + 1 (which has disappeared).
An Bn Cn Zn Cn + 1
0 0 1 1 0
1 1 0 0 1
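In both overflow rows of this table, Cn and Cn + 1 differ, while in the non-overflow rows of the full adder truth table they are equal. One way to see this in action is the C sketch below, which chains full adders for an n-digit addition and returns V = Cn XOR Cn + 1. It is an illustration under that assumption, and the function name add_with_overflow is hypothetical.

```c
#include <assert.h>

/* Adds two n-bit two's complement values bit by bit with full adders.
   Stores the n-bit sum in *sum and returns V = Cn XOR Cn+1,
   which is 1 when the addition overflows. */
int add_with_overflow(unsigned a, unsigned b, int n, unsigned *sum)
{
    int k, carry = 0, carry_into_msb = 0;
    unsigned z = 0;

    assert(n >= 1 && n <= 16);
    for (k = 0; k < n; k++) {
        int ak = (int)((a >> k) & 1u);
        int bk = (int)((b >> k) & 1u);
        if (k == n - 1)
            carry_into_msb = carry;               /* Cn */
        z |= (unsigned)(ak ^ bk ^ carry) << k;    /* Zk */
        carry = (ak & bk) | ((ak ^ bk) & carry);  /* Ck+1 */
    }
    *sum = z;
    return carry_into_msb ^ carry;                /* Cn XOR Cn+1 */
}
```

For 4-digit values, adding 0111 and 0001 corresponds to Case 2 (two positive numbers giving a negative result), and adding 1000 and 1000 corresponds to Case 1.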
Q2:
EXISTS in an SQL statement is evaluated for the main query (the outer SELECT statement). For each
row of the result of the main query, if the result of the sub-query (the SELECT statement within
parentheses) has at least one row, the value “true” is returned; if it has no rows, the value “false” is
returned. If the value is “true,” the contents of the column designated by the SELECT statement in the
main query are extracted.
We thus conclude that all lines of the employee skill table do satisfy the condition, so all the
employee numbers from the employee skill table will be extracted. This table has six rows.
B:
The SELECT statement in the main query picks one line from the employee table, and its result is
delivered to the sub-query. In the sub-query, the contents of the delivered line are evaluated, and
the result is returned to the main query. The main query then evaluates the result of the sub-query
concerning the extracted row and determines whether or not to extract it.
In the sub-query, the employee table (A) and the employee skill table (B) are joined, and the rows
where the skill code is “FE” get extracted. For example, the first line in the employee table (0001,
Brown, A1) is joined to the employee skill table by the employee number “0001” as shown below:
Now, there are two rows in the employee skill table with the same employee number “0001”, but
only one of these lines has the skill code “FE”, the first row. Here, the result of the sub-query is
“true” (there is a row), so the “employee name” designated in the main query gets extracted from
the employee table. Similarly, the second row of the employee table (0002, Charles, A2) shares the
same employee number with Rows 3, 4, and 5 of the employee skill table, but only Row 4 has the
skill code “FE”. Hence, the employee name is extracted. In contrast, there are no rows in the
employee skill table that share the same employee number as Rows 3, 4, or 6 of the employee table.
So these employee names are not extracted. As for the fifth row (0005, Parker, A1) of the
employee table, there is a row in the employee skill table with the same employee number;
however, the skill code is not “FE,” so the employee name is not extracted.
Summarizing all of this, we see that two rows (Brown and Charles) are extracted as shown below.
In this figure, solid lines indicate relations that are objects of extraction; dotted lines indicate that
the same employee number exists but the name is not extracted because the skill code is not “FE.”
The idea of EXISTS is covered above. However, the end result here is that the employee names of
the rows satisfying the sub-query condition (skill_code = ‘FE’) are extracted.
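The row-by-row evaluation of EXISTS described above can be reproduced with a small script. The following is a minimal sketch using Python's built-in sqlite3 module. The table and column names (employee, employee_skill, emp_no, emp_name, dept, skill_code), the employee numbers, names, and departments for Rows 3, 4, and 6, and the skill codes other than "FE" are assumptions made for illustration, because the original tables and the exact query text are not reproduced in this explanation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_no TEXT, emp_name TEXT, dept TEXT);
CREATE TABLE employee_skill (emp_no TEXT, skill_code TEXT);
""")

# Rows 1, 2, and 5 follow the explanation; the other rows are hypothetical.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("0001", "Brown", "A1"),
    ("0002", "Charles", "A2"),
    ("0003", "Name3", "A1"),   # hypothetical row, no skill rows
    ("0004", "Name4", "A2"),   # hypothetical row, no skill rows
    ("0005", "Parker", "A1"),
    ("0006", "Name6", "A2"),   # hypothetical row, no skill rows
])

# Six skill rows; "XX" and "YY" are hypothetical non-FE skill codes.
conn.executemany("INSERT INTO employee_skill VALUES (?, ?)", [
    ("0001", "FE"), ("0001", "XX"),
    ("0002", "XX"), ("0002", "FE"), ("0002", "YY"),
    ("0005", "XX"),
])

# For each employee row, the sub-query is evaluated with that row's emp_no.
rows = conn.execute("""
SELECT emp_name
FROM employee A
WHERE EXISTS (SELECT *
              FROM employee_skill B
              WHERE A.emp_no = B.emp_no AND B.skill_code = 'FE')
ORDER BY A.emp_no
""").fetchall()

names = [r[0] for r in rows]
print(names)
```

Under these assumed data, only the employees who have at least one "FE" row in the skill table are returned, which matches the two rows identified in the explanation.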
If we do not consider DISTINCT, because the employee numbers of rows having the
solid-line correspondence are taken out of B1, what is extracted is (0001, 0001, 0002, 0002,
0002, 0002, 0002, 0002). However, because the SELECT statement in the main query designates
DISTINCT, duplicate values are eliminated, and consequently only (0001, 0002) is extracted.
Therefore, the employee numbers of those with multiple skills are extracted. Incidentally, if an
employee number has only one line (employee number 0005), the skill code always matches,
so it will not be extracted.
Q3:
A global variable is a variable that can be referenced from every subprogram. A variable used in a
program can be a global variable or a local variable. A local variable is a variable valid only within the
program that defines it.
a) DELETE() deletes the line designated by CP. By deletion, the next line and lines below that all
move up by one line, so for instance the removal of Line 1 requires shifts Line 2 → Line 1, Line
3 → Line 2, … , Line n → Line (n – 1), which is (n – 1) moves. If Line 2 is removed, Line 1
stays the same, but Line 3 → Line 2, Line 4 → Line 3, … , Line n → Line (n – 1), which is (n –
2) moves. Hence, the number of moves depends on the value of n, suggesting that the amount of
computation is not constant.
b) GET() returns the character string of the line designated by CP. This simply takes out the
character string from the position designated by CP, so the amount of computation is constant.
c) INSERT() inserts a new line x in the line designated by CP. By insertion, all lines below the
inserted line move down by one line. For example, if a line is to be inserted before the first line,
Line n → Line (n + 1), Line (n – 1) → Line n, … , Line 1 → Line 2, which is n moves, and then
the insertion into Line 1 occurs. If a line is to be inserted before the second line, Line 1 stays the
same, but Line n → Line (n + 1), Line (n – 1) → Line n, … , Line 2 → Line 3, which is (n – 1)
moves, and then the insertion into Line 2 occurs. Thus, the number of moves depends on the
value of n, suggesting that the amount of computation is not constant.
d) LAST() returns the line number n of the last line (the number of lines in the text). If the text is
empty, this returns 0. As shown in Figure 1, this is stored in TAIL, so this simply extracts the
value of TAIL. Therefore, the amount of computation is constant.
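The move counts in a) through d) can be checked with a short sketch. The class below is a hypothetical Python model of a text held in a contiguous array (the name ArrayText and its methods are assumptions for illustration, not the exam's program); delete and insert return the number of line moves so that the dependence on n is visible.

```python
class ArrayText:
    """Hypothetical model: line k of the text is stored at index k - 1."""

    def __init__(self, lines):
        self.lines = list(lines)

    def last(self):
        # Comparable to reading TAIL: no scan over the lines is needed.
        return len(self.lines)

    def get(self, cp):
        # Direct access to the line designated by cp.
        return self.lines[cp - 1]

    def delete(self, cp):
        """Delete line cp; return how many lines had to move up."""
        n = len(self.lines)
        moves = 0
        for i in range(cp, n):          # Line (i + 1) -> Line i
            self.lines[i - 1] = self.lines[i]
            moves += 1
        self.lines.pop()                # drop the now-duplicated last slot
        return moves

    def insert(self, cp, x):
        """Insert x as line cp; return how many lines had to move down."""
        n = len(self.lines)
        self.lines.append(None)         # make room for Line (n + 1)
        moves = 0
        for i in range(n, cp - 1, -1):  # Line i -> Line (i + 1), from the bottom
            self.lines[i] = self.lines[i - 1]
            moves += 1
        self.lines[cp - 1] = x
        return moves


text = ArrayText(["L1", "L2", "L3", "L4", "L5"])   # n = 5
moves_delete_first = text.delete(1)                 # n - 1 moves
moves_insert_first = text.insert(1, "new")          # n moves for the current n = 4
print(moves_delete_first, moves_insert_first, text.lines)
```

With five lines, deleting Line 1 takes four moves; inserting before Line 1 of the remaining four lines also takes four moves, one per existing line.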
a) DELETE() changes the pointer connection as shown below. An “×” shows a pointer that is cut,
and a thick arrow indicates a pointer that is connected.
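[Figure: three consecutive lines LL, LCP (designated by CP), and LR. PLR is the pointer from LL
to the right, PCPL and PCPR are the left and right pointers of LCP, and PRL is the pointer from
LR to the left.]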
(1) PCPR is stored in PLR. This is so that LL points to LR because LCP is being deleted.
(2) PCPL is stored in PRL. This is so that LR points to LL because LCP is being deleted.
This process only involves operations (1) and (2), so the amount of computation is constant.
b) GET( ) simply reads LCP at the position designated by CP, so the amount of computation is
constant.
c) INSERT(x) changes the pointer connection as shown below. Assume that LA is the line to be
added and that the pointer of this added line is known.
[Figure: lines LL, LCP (designated by CP), and LR with the pointers PLR, PCPL, PCPR, and PRL.]
d) There is no information on the number of lines anywhere, so to count the number of lines, it is
necessary to actually count, beginning at HEAD and following pointers. Or, the number of lines
can be counted from TAIL. Hence, LAST() takes longer to process when there are more lines.
Therefore, the amount of computation is not constant.
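HEAD = 4, CP = 4, TAIL = 6
Each element is shown as element number: (pointer to the previous line, line, pointer to the next line).
Before DELETE():
4: (0, L1, 2)   2: (4, L2, 8)   8: (2, L3, 9)   9: (8, L4, 6)   6: (9, L5, 0)
After DELETE():
4: (0, L1, 2)   2: (4, L2, 9)   8: (2, L3, 9)   9: (2, L4, 6)   6: (9, L5, 0)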
Next, we switch the pointer of the empty list. Since the element number 8 is deleted, this becomes
the head of the empty list. So EMPTY changes “from 3 to 8.” Further, we make sure that NEXT[8]
of the element number 8 deleted will point to “3,” which is the head of the empty list prior to the
deletion.
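The pointer updates for the list version, including the switch of the empty list, can be sketched as follows. This is a minimal illustration in Python, assuming array names PREV, NEXT, and LINE and a dictionary named state; only NEXT, HEAD, TAIL, and EMPTY appear in the explanation, and the rest are hypothetical. The element numbers and pointer values follow the figure above, and the handling of deletion at either end of the list is an assumption.

```python
NULL = 0                      # 0 marks "no element", as in the figure

PREV = [NULL] * 10            # hypothetical name for the backward pointers
NEXT = [NULL] * 10
LINE = [None] * 10

# element number: (previous, line, next), taken from the figure
for elem, (p, text, n) in {
    4: (0, "L1", 2), 2: (4, "L2", 8), 8: (2, "L3", 9),
    9: (8, "L4", 6), 6: (9, "L5", 0),
}.items():
    PREV[elem], LINE[elem], NEXT[elem] = p, text, n

state = {"HEAD": 4, "TAIL": 6, "EMPTY": 3}   # EMPTY = head of the empty list


def delete(cp):
    """Unlink element cp with a fixed number of pointer updates."""
    left, right = PREV[cp], NEXT[cp]
    if left != NULL:
        NEXT[left] = right            # the left neighbor now points past cp
    else:
        state["HEAD"] = right         # assumption: cp was the first line
    if right != NULL:
        PREV[right] = left            # the right neighbor now points back past cp
    else:
        state["TAIL"] = left          # assumption: cp was the last line
    NEXT[cp] = state["EMPTY"]         # cp points to the old head of the empty list
    state["EMPTY"] = cp               # cp becomes the new head of the empty list


def last():
    """Count lines by following pointers from HEAD; work grows with the line count."""
    count, elem = 0, state["HEAD"]
    while elem != NULL:
        count += 1
        elem = NEXT[elem]
    return count


lines_before = last()
delete(8)
lines_after = last()
print(lines_before, lines_after, NEXT[2], PREV[9], state["EMPTY"], NEXT[8])
```

The delete function touches the same small set of pointers regardless of how many lines exist, whereas last has to walk the whole chain.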
Q4:
Points
• In a heap, the root value is the largest, and a parent value is greater than a child value.
• HeapSort repeats root extraction and reconstruction of a heap.
A heap is a binary tree in which data are placed from shallow to deep nodes and, on the same depth
level, from left to right, and the values have the following restriction:
Parent element value > child element value (or parent element value < child element value)
Hence, the elements with larger (smaller) values are gathered around the root, and those with smaller
(larger) values are close to the leaves. Since the root has the element with the largest (smallest) value,
this data structure is suitable for extracting the largest (smallest) value.
A heap can be represented by an array in the following way. If a node corresponds to A[i], its child
node on the left corresponds to A[2×i], and the child node on the right corresponds to A[2×i + 1].
Then, an array A contains the elements in the following way:
Index 1 2 3 4 5 6 7 8 9 10
A[] 91 86 72 72 45 69 24 55 1 12
[Figure: arrows on the array linking a parent element to its left child and its right child.]
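The index relationship between a parent and its children can be checked directly on the array above. The following is a small Python sketch; the helper name children and the variable is_heap are assumptions for illustration. A[0] is left unused so that the indices start at 1 as in the table, and the comparison uses >= so that equal values would also be accepted.

```python
# A[0] is a placeholder so that the heap occupies indices 1..10.
A = [None, 91, 86, 72, 72, 45, 69, 24, 55, 1, 12]
last = len(A) - 1


def children(i, last):
    """Indices of the existing children of node i (left = 2*i, right = 2*i + 1)."""
    return [c for c in (2 * i, 2 * i + 1) if c <= last]


# Every parent must be at least as large as each of its children.
is_heap = all(A[i] >= A[c]
              for i in range(1, last + 1)
              for c in children(i, last))

print(is_heap, children(4, last), children(5, last), children(6, last))
```

Node 5 has only a left child (index 10), and nodes 6 through 10 have no children, which is why only indices up to Last / 2 need to be examined when a heap is built.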
B:
The case in which “R < Last” does not hold is the case in which there is no right child. This
is because Last is the index of the very last element, and R is the index of its right child node.
In this case, we check if there is a left child. If there is, we compare the value of the parent node
(whose index is Top) and the value of the left child (whose index is L). Whether or not there is
a left child can be determined by “L < Last” just as “R < Last” used before. Hence, we
insert “L < Last” here.
The root A[1] and the last node A[10] are switched. Then, the heap is reconstructed.
Index 1 2 3 4 5 6 7 8 9 10
A[] 91 86 72 72 45 69 24 55 1 12 (before the switch)
A[] 12 86 72 72 45 69 24 55 1 91 (after the switch)
Comparison of A[1] with its children A[2] and A[3]:
A[] 12 86 72 72 45 69 24 55 1 91
The value 86 (A[2], the left child) is the largest value as the result of the comparisons, so A[1]
and A[2] are switched.
Index 1 2 3 4 5 6 7 8 9 10
A[] 12 86 72 72 45 69 24 55 1 91 (before the switch)
A[] 86 12 72 72 45 69 24 55 1 91 (after the switch)
Next, with A[2] (which has been just switched) as the parent, the same process is executed.
Index 1 2 3 4 5 6 7 8 9 10
A[] 86 12 72 72 45 69 24 55 1 91
As the result of the comparisons, 72 (A[4], the left child) is the largest value, so A[2] and A[4]
are switched.
Index 1 2 3 4 5 6 7 8 9 10
A[] 86 12 72 72 45 69 24 55 1 91 (before the switch)
A[] 86 72 72 12 45 69 24 55 1 91 (after the switch)
Next, with A[4] (which has just been switched) as the parent, the same process is executed.
Index 1 2 3 4 5 6 7 8 9 10
A[] 86 72 72 12 45 69 24 55 1 91
As the result of the comparisons, 55 (A[8], the left child) is the largest value, so A[4] and A[8]
are switched.
Index 1 2 3 4 5 6 7 8 9 10
A[] 86 72 72 12 45 69 24 55 1 91 (before the switch)
A[] 86 72 72 55 45 69 24 12 1 91 (after the switch)
Next, we make comparisons with A[8] as the parent, but there is no child node, so the process
ends here. Consequently, swapping has occurred 4 times.
The final state of array A is as shown below, so the value 12 is stored as the array element whose
index number is “8.”
Index 1 2 3 4 5 6 7 8 9 10
A[] 86 72 72 55 45 69 24 12 1 91
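The trace above can be replayed with a few lines of code. The following Python sketch is an illustration only: the function name sift_down and its boundary tests are written for this sketch and need not match the conditions used in the exam program. It switches the root with the last node, rebuilds the heap over indices 1 through 9, and counts the swaps.

```python
def sift_down(A, top, last):
    """Rebuild the heap below A[top] within indices 1..last; return the swap count."""
    swaps = 0
    while True:
        left, right = 2 * top, 2 * top + 1
        largest = top
        if left <= last and A[left] > A[largest]:
            largest = left
        if right <= last and A[right] > A[largest]:
            largest = right
        if largest == top:            # the parent already holds the largest value
            return swaps
        A[top], A[largest] = A[largest], A[top]
        swaps += 1
        top = largest                 # continue with the child that was switched


# A[0] is a placeholder so that indices start at 1.
A = [None, 91, 86, 72, 72, 45, 69, 24, 55, 1, 12]
last = 10

A[1], A[last] = A[last], A[1]         # switch the root and the last node
swaps = 1 + sift_down(A, 1, last - 1) # the extracted maximum at A[10] is excluded

print(A[1:], swaps, A.index(12))
```

Counting the initial root and last-node switch, the run performs four swaps and leaves the value 12 at index 8, in agreement with the trace.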
The left child of A[j] is (A[j × 2]), and the right child is (A[j × 2 + 1]), so the parent of
A[k] is A[k / 2]. Hence, the index of the last node with a child is the quotient of the number
of elements (Last) by 2, i.e., (Last / 2). In this particular case, since Last = 10, the index of
the last node with a child is (10 / 2 = ) 5. Hence, we need to reconstruct heaps while each time
subtracting 1 from Idx, which begins with (Last / 2), until Idx = 1 (while Idx ≧ 1).
Therefore, we insert “Idx: Last / 2, Idx >= 1, -1.”
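The loop "Idx: Last / 2, Idx >= 1, -1" can be seen in context in a complete heap sort. The following Python sketch is one possible arrangement under the same 1-based indexing; the function names sift_down, build_heap, and heap_sort and the sample data are assumptions for illustration, not the exam program itself.

```python
def sift_down(A, top, last):
    """Rebuild the heap below A[top] within indices 1..last."""
    while True:
        left, right = 2 * top, 2 * top + 1
        largest = top
        if left <= last and A[left] > A[largest]:
            largest = left
        if right <= last and A[right] > A[largest]:
            largest = right
        if largest == top:
            return
        A[top], A[largest] = A[largest], A[top]
        top = largest


def build_heap(A, last):
    """Start at the last node that has a child (last // 2) and work back to the root."""
    for idx in range(last // 2, 0, -1):    # Idx: Last / 2, Idx >= 1, -1
        sift_down(A, idx, last)


def heap_sort(A, last):
    """Sort A[1..last] in ascending order by repeated root extraction."""
    build_heap(A, last)
    for end in range(last, 1, -1):
        A[1], A[end] = A[end], A[1]        # move the current maximum to the end
        sift_down(A, 1, end - 1)           # rebuild the heap on the remaining part


# Hypothetical input; A[0] is a placeholder so that indices start at 1.
data = [None, 12, 1, 55, 24, 69, 45, 72, 72, 86, 91]
heap = list(data)
build_heap(heap, 10)
heap_sort(data, 10)
print(heap[1], data[1:])
```

Nodes with an index greater than Last / 2 have no children, so starting the loop there skips work that could never cause a swap.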
Q5:
Q6:
A:
Referring to Table 2, the pointer to “STUDENT” is stored in elmtbl[elmnum].tag. The initial
value of elmnum is 0, so first, the pointer to “STUDENT” (address of the letter “S”) is stored in
elmtbl[0].tag. Therefore, “mlstr” is inserted here.
Further, in the next statement, the depth of the nesting is set, and the reading is skipped until “>.”
The fact that “\0” is stored in the position of “>” indicates that the process is completed.
B:
This is processing a tag value, so according to Table 2, we see that the pointer to “BILL” is stored
in elmtbl[elmnum].value. However, in the “for” statement three lines up from this, the
index stops at the position of “>.” So 1 needs to be added and the position should be at “B.”
Therefore, “mlstr + 1” is inserted.
C:
After processing the tag value, we scan for the next “<” or “</.” This is the next “for” statement
after the blank B. Just as in the beginning tag process, the index stops at the position of “<,” so the
program determines whether or not the next character is “/.” If it is, the depth of the nesting does
not change, but if not, there is a lower-level tag structure. Therefore, here, we need to add 1 to the
index by inserting “elmnum++”.
D:
As explained under C, if the next character is not “/”, there is a lower-level tag. In this case,
“parse_ml_date” is recursively called, and the analysis of the text continues. Therefore, here
we insert the condition that the character is not “/.”
This is “*(mlstr + 1) != ‘/’”.
Q7:
A thread is a unit of program execution, and it can be started as a separate execution unit at the
same time as the method main, which is executed initially. In Java, there are two methods of
creating a thread. The first is to inherit the class Thread, and the second is to implement a
Runnable interface. In this program, the second of these methods is used.
A:
In the abstract class Motion, in addition to the constructor Motion and method “run( )”,
which continues to display the movement of the point, the abstract method “update” is defined
to determine the movement of the point. In the method “run ( )”, a point is drawn at the
coordinate position defined by “while(true){…}”. The “while” statement does the
following: (1) drawing the point, (2) stopping for 40 milliseconds, (3) blank A, and (4) erasing the
point. Here, there is a need to move the point between (2) and (4) above, so in blank A, it is
necessary to call “update()” which moves the point and changes the contents of the point.
Therefore, we need to insert “point = update(point).” Note that “Point current = point;”
immediately before blank A saves the point as it was before the move, so that
“Space.erase(current)” immediately after blank A can erase the old point.
B:
Blank B is inside “new Thread (…)” of “for (int i = 0; i < point.length; i++)”,
which is a repeated process. This indicates that as many threads as the number of points to be
drawn are generated. For a thread, as mentioned above, the Runnable interface must be
implemented, but the class Motion is an abstract class and cannot be generated as a thread. In
other words, the class SimpleMotion, which inherits this abstract class Motion, is generated
as a thread. Therefore, “new SimpleMotion(points[i])” is inserted here.
From [Program 3], we see that in the method “update()” x and y values are each calculated
from directionX and directionY, respectively. Line 2 defines the initial values of
directionX and directionY; both are 1. So X is positive, and Y is positive. Hence, both x
and y are moving in the increasing direction, i.e., arrow (ii). The answer is “The point moves as
shown by arrow (ii).”
Q8:
Points
• Directions of motion are ↑ for up, → for right, ↓ for down, and ← for left.
• “rcheck( )” is the function that determines if it is possible to move to the right.
Note first that in Lines 1 through 4, UP, RIGHT, DOWN, and LEFT are defined as constants
representing the up, right, down, and left directions, respectively, and corresponding to 0, 1, 2, and 3.
The question itself does not give any specifics on the directions, but the item names give clues. In
Subquestion 1, the initial condition is x = 1 and y = 0, pointing to the “Start point” of Figure 1. Here,
the only possible direction of motion is down, so you see that DOWN indeed means the down direction.
Now, consider the processes of the functions “rcheck() ”, “fcheck()”, and “go()”.
(1) rcheck( )
When “dir = UP” this returns “M[y][x + 1]”, so it is returning the value of the square to the
right. If “dir = RIGHT” it returns “M[y + 1][x]”, so it is returning the value of the square
below. If “dir = DOWN” it returns “M[y][x – 1]”, which is the value of the square to the left,
and if “dir = LEFT” it returns “M[y – 1][x]”, which is the value of the square directly above
it. In the figures below, each of the arrows (↑, →, ↓, and ←) indicates the direction determined by the
value of “dir,” and the shaded square is the position of the returned value.
In each of the figures above, if the gray area is either ROAD or EXIT, then the program can proceed
to the shaded square. So, to proceed to the shaded square, we let “dir = (dir + 1) %4”. The table
below shows what happens to the “dir” by this operation.
The new “dir” points to the square to the right of the moving direction. Hence, the program first
checks if it is possible to move to the square on the right. Note that if it is possible, the point moves
to that square. If not, the program calls fcheck() and decides in which direction the point should
move next.
(2) fcheck()
Consider just as in rcheck().
dir = UP: M[y – 1][x] → Returns the value of the square above
dir = RIGHT: M[y][x + 1] → Returns the value of the square to the right
dir = DOWN: M[y + 1][x] → Returns the value of the square below
dir = LEFT: M[y][x – 1] → Returns the value of the square to the left
If the shaded area in each of the figures above is either ROAD or EXIT, then the program can proceed
in the direction of the shaded area, so the program directly calls “go()” and proceeds. This is
checking whether or not it is possible to go to the next square in the moving direction.
Hence, you can see that if rcheck() determines that proceeding to the square on the right is not
possible, then the program checks if it is possible to move to the next square in the direction of
motion.
Now, if the program calls rcheck() or fcheck() and finds that neither motion is possible, then,
as you can see from Line 28, the value of “dir” is changed in the direction of the left square. When
this happens, “go()” is not called, so it involves only the change of “dir.”
The relationship between the current direction and the new direction (shaded) is as follows. Note that
the direction is turned by 90 degrees to the left (counterclockwise).
[Figure: four 3 × 3 grids (columns x–1, x, x+1; rows y–1, y, y+1), one each for dir = UP,
dir = RIGHT, dir = DOWN, and dir = LEFT, with the arrow ↑, →, ↓, or ← in the center square.]
(3) go()
The relationship between the value of “dir” and the direction of motion is as follows. Note that,
unlike “rcheck()” and “fcheck()”, “go()” actually changes the coordinates.
dir = UP :y-- → Moves to the square above
dir = RIGHT :x++ → Moves to the square on the right
dir = DOWN :y++ → Moves to the square below
dir = LEFT :x-- → Moves to the square on the left
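The three lists of offsets are consistent with one another: the square that rcheck() examines is the square that fcheck() or go() would use after a 90-degree turn to the right. The following Python sketch checks this; the names FRONT, turn_right, turn_left, and right_offset are assumptions for illustration, and offsets are written as (dx, dy) so that the square M[y + dy][x + dx] is meant.

```python
UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3

# (dx, dy) of the square ahead, as used by fcheck() and go()
FRONT = {UP: (0, -1), RIGHT: (1, 0), DOWN: (0, 1), LEFT: (-1, 0)}


def turn_right(d):
    """90 degrees clockwise: UP -> RIGHT -> DOWN -> LEFT -> UP."""
    return (d + 1) % 4


def turn_left(d):
    """90 degrees counterclockwise; adding 3 is the same as subtracting 1 modulo 4."""
    return (d + 3) % 4


def right_offset(d):
    """Offset of the square on the right-hand side of direction d (what rcheck() looks at)."""
    return FRONT[turn_right(d)]


right_side = {d: right_offset(d) for d in (UP, RIGHT, DOWN, LEFT)}
print(right_side)
```

Using (d + 3) % 4 instead of (d - 1) % 4 keeps the intermediate value non-negative, which matters in C, where the remainder of a negative number can be negative.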
Next, “rcheck()” is called again, and the square to the right of the direction of motion
(M[1][0]) is checked, but again this is a square to which moving is not allowed. So
“fcheck()” is called, and the square in the direction of motion (M[2][1]) is checked.
Proceeding to that square is possible, so now the program moves to M[2][1]. These processes
are repeated, and the path moves from Square 1 to Square 2.
[Figure: the maze, entered at M[0][1] in the downward direction, with squares labeled 1 through 8,
a, and b.]
↓
1 ← ← ← ← 4
↓ ↑
2 → → → → 3
Square 1: The program looks to the right of the direction of motion, but it cannot move to the right.
So it moves forward in the direction of motion.
Square 2: The program looks to the right, but it cannot move there. So it looks forward, but
forward movement is also impossible. So it changes the direction to the left (→).
Square 3: Again, the program checks the square on the right and the square ahead. Neither
motion is possible, so it changes the direction to the left of the direction of motion,
changing to ↑.
Square 4: Again, the program checks the square on the right and the square ahead. Neither
motion is possible, so it changes the direction to the left of the direction of motion,
changing to ←.
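The order of decisions at each square (right first, then forward, otherwise turn left) can be simulated on a small example. The following Python sketch is an illustration only: the maze layout, the characters "#", ".", and "E" for wall, road, and exit, and the names cell, passable, and solve are all assumptions, since Figure 1 and the program text are not reproduced here. As in the explanation, the start is x = 1, y = 0 with the direction DOWN.

```python
UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3
STEP = {UP: (0, -1), RIGHT: (1, 0), DOWN: (0, 1), LEFT: (-1, 0)}  # (dx, dy)

# Hypothetical maze: "#" = wall, "." = road, "E" = exit; MAZE[y][x]
MAZE = [
    "#.###",
    "#...#",
    "###.#",
    "###E#",
]


def cell(x, y):
    """Contents of square (x, y); anything outside the maze counts as a wall."""
    if 0 <= y < len(MAZE) and 0 <= x < len(MAZE[y]):
        return MAZE[y][x]
    return "#"


def passable(x, y):
    return cell(x, y) in ".E"


def solve(x, y, d, max_steps=100):
    """Follow the right-hand wall; return the list of squares visited."""
    path = [(x, y)]
    for _ in range(max_steps):            # guard against endless loops
        if cell(x, y) == "E":
            break
        rdx, rdy = STEP[(d + 1) % 4]      # square on the right of the motion
        fdx, fdy = STEP[d]                # square ahead
        if passable(x + rdx, y + rdy):    # like rcheck(): turn right, then go
            d = (d + 1) % 4
            x, y = x + rdx, y + rdy
            path.append((x, y))
        elif passable(x + fdx, y + fdy):  # like fcheck(): go straight
            x, y = x + fdx, y + fdy
            path.append((x, y))
        else:                             # neither is possible: turn left, do not move
            d = (d + 3) % 4
    return path


path = solve(1, 0, DOWN)
print(path)
```

In this assumed maze the point moves down one square, turns left because both the right and the front are blocked, follows the corridor, then takes the first opening on its right to reach the exit.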
Whereas “rcheck()” checked the right side of the direction of motion, “lcheck()” checks the
left side. The program is to use “lcheck()” to see if it is possible to move to the left. If it is
possible, the direction of motion is to be turned by 90 degrees to the left. For this, we change Line
23 so that, just like Line 28, the motion gets turned by 90 degrees to the left. Since Line 23 is
changed to make this 90-degree turn to the left, Line 28 needs to be modified so that it rotates the
direction to the right by 90 degrees.
Hence, we see that the following changes are necessary:
(Line number)
23 dir = (dir+1) % 4; → dir = (dir+3) % 4; /* 90 degrees to the left */
28 dir = (dir+3) % 4; → dir = (dir+1) % 4; /* 90 degrees to the right */
Therefore, the changes are the following: “+1 on Line 23 must be changed to +3 and +3 on Line
28 must be changed to +1.”
Q9:
B:
Blank B is the argument when the instance “calc” of the class Calculator is generated. Based
on the fact that CalculatorOutput is implemented as an anonymous class and that the method
“display” outputs the designated numerical value or character string to System.out in this
implementation, one can figure out that blank B must generate the anonymous class that implements
the interface CalculatorOutput. Also, the argument delivered to the constructor of the class
Calculator is a variable of the CalculatorOutput interface type, and the class that
implements the interface “CalculatorOutput” must implement two (one for each type) methods
“display.”
For these reasons, in this section containing blank B, the interface CalculatorOutput is to be
implemented by an unnamed class (anonymous class), and, at the same time, it must be instantiated
with the new keyword. Hence, here, the answer needs to include the name of the interface to be
implemented by an anonymous class. The answer then is “new CalculatorOutput().”
Incidentally, an anonymous class is used in various situations, such as when we take an existing
class or interface, modify it partially to create a new class, and wish to use the new class locally
within a class. In such a case, implementing and inheriting take place without the “extends” or
“implements” keywords.
C:
Blank C corresponds to the argument used when the instance “event” is generated when the
input value is 0 through 9. The argument is delivered, considering cases like = and +, and the first
argument is CalculatorEvent.DIGIT. For the second argument, the program can simply
deliver the number entered, but the characters “0” through “9” need to be converted to numbers 0
through 9. Since the characters “0” through “9” correspond to 0x30 through 0x39 in hexadecimal
numbers, conversion to numerical values can be accomplished simply by subtracting the
hexadecimal number 0x30, which is the code of the character “0.” Therefore, what needs to be
inserted to blank C is “CalculatorEvent.DIGIT, c - ‘0’.”
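The character-to-number conversion can be illustrated independently of the calculator program. The following Python sketch uses the hypothetical helper digit_value; ord returns the character code, so the subtraction corresponds to the expression c - '0' in the answer.

```python
def digit_value(c):
    """Convert one of the characters '0'..'9' to the number 0..9."""
    return ord(c) - ord("0")          # ord("0") is 0x30


values = [digit_value(c) for c in "0123456789"]
print(hex(ord("0")), hex(ord("9")), values)
```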
E:
At the time “3*4 =” is entered, both the display and the value of the variable “accumulator”
are 12. When “+5” follows, the value displayed in the end is the last input value “5.” This is
because “register” becomes 0 when the equal sign is pressed. With “register” 0, until another
number is added to the input value 5, the display shows “5.” Hence, the display result is “5.”
F:
At the time “3 + 4 / 0” is entered, “accumulator” contains 7, which is the result of the
calculation “3 + 4”, while “register” stores the last input value 0. Then, when the last equal
key is pressed, division by 0 occurs, triggering the exception processing
“ArithmeticException”. Hence, “Error” is displayed.
The product names appearing in this book are trademarks or registered trademarks of the
respective manufacturers.