CH08 COA11e


Computer Organization and Architecture

Designing for Performance


11th Edition, Global Edition

Chapter 8
Input/Output

Copyright © 2022 Pearson Education, Ltd. All Rights Reserved


Figure 8.1
Generic Model of an I/O Module



External Devices
• Provide a means of exchanging data between the external environment and the computer
• Attach to the computer by a link to an I/O module
– The link is used to exchange control, status, and data between the I/O module and the external device
• Peripheral device
– An external device connected to an I/O module

Three categories:
• Human readable
– Suitable for communicating with the computer user
– Video display terminals (VDTs), printers
• Machine readable
– Suitable for communicating with equipment
– Magnetic disk and tape systems, sensors and actuators
• Communication
– Suitable for communicating with remote devices such as a terminal, a machine readable device, or another computer



Figure 8.2
Block Diagram of an External Device



Keyboard/Monitor
• Most common means of computer/user interaction
• User provides input through the keyboard
• The monitor displays data provided by the computer

International Reference Alphabet (IRA)
• Basic unit of exchange is the character
– Associated with each character is a code
– Each character in this code is represented by a unique 7-bit binary code
– 128 different characters can be represented
• Characters are of two types:
– Printable
▪ Alphabetic, numeric, and special characters that can be printed on paper or displayed on a screen
– Control
▪ Have to do with controlling the printing or displaying of characters
▪ Example is carriage return
▪ Other control characters are concerned with communications procedures

Keyboard Codes
• When the user depresses a key it generates an electronic signal that is interpreted by the transducer in the keyboard and translated into the bit pattern of the corresponding IRA code
• This bit pattern is transmitted to the I/O module in the computer
• On output, IRA code characters are transmitted to an external device from the I/O module
• The transducer interprets the code and sends the required electronic signals to the output device either to display the indicated character or perform the requested control function

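
The 7-bit character coding described on this slide can be illustrated with a small sketch. It assumes the IRA code points coincide with 7-bit ASCII, with codes 0x00 to 0x1F and 0x7F treated as control characters; the function names are hypothetical and chosen only for this illustration.

```python
def classify_ira(code: int) -> str:
    """Return 'control' or 'printable' for a 7-bit IRA code (assumed equal to ASCII)."""
    if not 0 <= code <= 0x7F:
        raise ValueError("IRA codes are 7 bits wide (0..127)")
    # Assumption: 0x00-0x1F and 0x7F (DEL) are the control characters.
    return "control" if code < 0x20 or code == 0x7F else "printable"


def key_to_bits(ch: str) -> str:
    """Bit pattern a keyboard transducer would send to the I/O module for one key."""
    code = ord(ch)
    classify_ira(code)  # raises if the character does not fit in 7 bits
    return format(code, "07b")


if __name__ == "__main__":
    print(key_to_bits("A"))         # 7-bit pattern for the letter A
    print(classify_ira(ord("\r")))  # carriage return is a control character
```

With 7 bits there are 2^7 = 128 code points, which matches the count given on the slide.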
I/O Functions

The major functions for an I/O module fall into the following categories:

Control and timing
• Coordinates the flow of traffic between internal resources and external devices

Processor communication
• Involves command decoding, data, status reporting, address recognition

Device communication
• Involves commands, status information, and data

Data buffering
• Performs the needed buffering operation to balance device and memory speeds

Error detection
• Detects and reports transmission errors
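
As one possible illustration of the error-detection function, the sketch below attaches an even-parity bit to a 7-bit character code and checks it on receipt. Parity is only one common scheme and is an assumption here; the slide does not say which mechanism a given I/O module uses, and the function names are hypothetical.

```python
def add_even_parity(code7: int) -> int:
    """Place an even-parity bit in bit 7 above a 7-bit character code."""
    code7 &= 0x7F
    parity = bin(code7).count("1") % 2  # 1 when the count of 1 bits is odd
    return (parity << 7) | code7


def parity_ok(byte: int) -> bool:
    """True when the received 8-bit value contains an even number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 0


if __name__ == "__main__":
    sent = add_even_parity(0x43)
    corrupted = sent ^ 0x04          # flip one bit in transit
    print(parity_ok(sent), parity_ok(corrupted))
```

A single flipped bit changes the parity and is detected; two flipped bits cancel out, which is why parity alone catches only odd numbers of bit errors.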



Figure 8.3
Block Diagram of an I/O Module



Programmed I/O

Three techniques are possible for I/O operations:


• Programmed I/O
– Data are exchanged between the processor and the I/O module
– Processor executes a program that gives it direct control of the I/O operation
– When the processor issues a command it must wait until the I/O operation is
complete
– If the processor is faster than the I/O module this is wasteful of processor time
• Interrupt-driven I/O
– Processor issues an I/O command, continues to execute other instructions, and
is interrupted by the I/O module when the latter has completed its work
• Direct memory access (DMA)
– The I/O module and main memory exchange data directly without processor
involvement
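
The cost of programmed I/O can be made concrete with a toy simulation. The `SimulatedModule` class, its `polls_until_ready` parameter, and the poll counts are all hypothetical; the point is only that the processor spends its time re-testing status while it waits.

```python
class SimulatedModule:
    """Hypothetical I/O module that needs a fixed number of status checks per word."""

    def __init__(self, data, polls_until_ready=3):
        self._data = list(data)
        self._delay = polls_until_ready
        self._countdown = polls_until_ready

    def has_data(self) -> bool:
        return bool(self._data)

    def status_ready(self) -> bool:
        # Each status test lets the simulated device make a little progress.
        if self._countdown > 0:
            self._countdown -= 1
            return False
        return True

    def read_word(self):
        self._countdown = self._delay  # the next word needs time again
        return self._data.pop(0)


def programmed_io_read(module):
    """Processor-controlled transfer: busy-wait on status, then move one word."""
    memory, wasted_polls = [], 0
    while module.has_data():
        while not module.status_ready():
            wasted_polls += 1          # processor does nothing useful here
        memory.append(module.read_word())
    return memory, wasted_polls


if __name__ == "__main__":
    print(programmed_io_read(SimulatedModule([10, 20, 30])))
```

In this model every word costs three fruitless status tests, which is the wasted processor time the slide refers to.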



Table 8.1
I/O Techniques

                                            No Interrupts     Use of Interrupts
I/O-to-memory transfer through processor    Programmed I/O    Interrupt-driven I/O
Direct I/O-to-memory transfer                                 Direct memory access (DMA)



I/O Commands

• There are four types of I/O commands that an I/O module may
receive when it is addressed by a processor:
1) Control
– used to activate a peripheral and tell it what to do

2) Test
– used to test various status conditions associated with an I/O module and its
peripherals

3) Read
– causes the I/O module to obtain an item of data from the peripheral and place it in an
internal buffer

4) Write
– causes the I/O module to take an item of data from the data bus and subsequently
transmit that data item to the peripheral
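
A minimal sketch of how an I/O module might dispatch on the four command types follows. `Command` and `ToyModule` are hypothetical names, and the behavior of each branch is simplified from the descriptions above rather than taken from any real device.

```python
from enum import Enum


class Command(Enum):
    CONTROL = 1
    TEST = 2
    READ = 3
    WRITE = 4


class ToyModule:
    """Hypothetical I/O module with one internal buffer and one peripheral."""

    def __init__(self, peripheral_data):
        self.peripheral_in = list(peripheral_data)  # data the peripheral can supply
        self.peripheral_out = []                    # data sent to the peripheral
        self.buffer = None
        self.active = False

    def execute(self, command, data_bus=None):
        if command is Command.CONTROL:
            self.active = True                      # activate the peripheral
            return None
        if command is Command.TEST:
            return {"active": self.active, "buffer_full": self.buffer is not None}
        if command is Command.READ:
            self.buffer = self.peripheral_in.pop(0)  # peripheral -> internal buffer
            return self.buffer
        if command is Command.WRITE:
            self.peripheral_out.append(data_bus)     # data bus -> peripheral
            return None
        raise ValueError(f"unknown command: {command!r}")


if __name__ == "__main__":
    module = ToyModule([7])
    module.execute(Command.CONTROL)
    print(module.execute(Command.TEST))
    print(module.execute(Command.READ))
```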



Figure 8.4
Three Techniques for Input of a Block of Data



I/O Instructions
• With programmed I/O there is a close correspondence between the I/O-related instructions that the processor fetches from memory and the I/O commands that the processor issues to an I/O module to execute the instructions
• Each I/O device connected through I/O modules is given a unique identifier or address
• The form of the instruction depends on the way in which external devices are addressed
• When the processor issues an I/O command, the command contains the address of the desired device
• Thus each I/O module must interpret the address lines to determine if the command is for itself

Memory-mapped I/O
• There is a single address space for memory locations and I/O devices
• A single read line and a single write line are needed on the bus



I/O Mapping Summary
• Memory-mapped I/O
– Devices and memory share an address space
– I/O looks just like memory read/write
– No special commands for I/O
▪ Large selection of memory access commands available

• Isolated I/O
– Separate address spaces
– Need I/O or memory select lines
– Special commands for I/O
▪ Limited set
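
The two addressing schemes can be contrasted with a small model. The classes, the `IO_BASE` boundary, and the method names are hypothetical; real address maps and instruction names differ between processors.

```python
class MemoryMappedBus:
    """One address space: addresses at or above IO_BASE select device registers."""

    IO_BASE = 0xFF00  # hypothetical boundary between memory and I/O registers

    def __init__(self):
        self.memory = {}
        self.devices = {}

    def _target(self, addr):
        return self.devices if addr >= self.IO_BASE else self.memory

    def write(self, addr, value):   # the same operation reaches memory or a device
        self._target(addr)[addr] = value

    def read(self, addr):
        return self._target(addr).get(addr, 0)


class IsolatedBus:
    """Two address spaces: separate operations play the role of the select line."""

    def __init__(self):
        self.memory = {}
        self.ports = {}

    def store(self, addr, value):   # ordinary memory write
        self.memory[addr] = value

    def load(self, addr):           # ordinary memory read
        return self.memory.get(addr, 0)

    def out(self, port, value):     # special I/O write command
        self.ports[port] = value

    def in_(self, port):            # special I/O read command
        return self.ports.get(port, 0)


if __name__ == "__main__":
    bus = IsolatedBus()
    bus.store(0x10, 1)
    bus.out(0x10, 2)                # same number, different address space
    print(bus.load(0x10), bus.in_(0x10))
```

In the isolated model the same numeric address can name both a memory location and a port, which is why a separate select signal or dedicated I/O instructions are needed.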
Figure 8.5
Memory-Mapped and Isolated I/O



Interrupt-Driven I/O
The problem with programmed I/O is that the processor has to wait a long time for the I/O module to be ready for either reception or transmission of data

An alternative is for the processor to issue an I/O command to a module and then go on to do some other useful work

The I/O module will then interrupt the processor to request service when it is ready to exchange data with the processor

The processor executes the data transfer and resumes its former processing
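
The sequence above can be sketched as a tick-by-tick simulation. The function name, the `ticks_per_word` parameter, and the one-tick cost of servicing an interrupt are assumptions made for illustration only.

```python
def interrupt_driven_read(words, ticks_per_word=3):
    """Toy model: the processor does useful work until the module interrupts it."""
    memory, useful_work = [], 0
    pending = list(words)
    countdown = ticks_per_word
    while pending:
        countdown -= 1
        if countdown == 0:
            # The module raises an interrupt; the handler transfers one word
            # and reissues the read command for the next one.
            memory.append(pending.pop(0))
            countdown = ticks_per_word
        else:
            # No interrupt pending: the processor executes other instructions.
            useful_work += 1
    return memory, useful_work


if __name__ == "__main__":
    print(interrupt_driven_read([10, 20, 30]))
```

In this model two of every three ticks go to other work, time that a busy-waiting processor would spend testing the module's status.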



Figure 8.6
Simple Interrupt Processing



Figure 8.7
Changes in Memory and Registers for an
Interrupt



Design Issues

Two design issues arise in implementing interrupt I/O:
• Because there will be multiple I/O modules, how does the processor determine which device issued the interrupt?
• If multiple interrupts have occurred, how does the processor decide which one to process?



Device Identification
Four general categories of techniques are in common use:
• Multiple interrupt lines
– Between the processor and the I/O modules
– Most straightforward approach to the problem
– However, it is impractical to dedicate more than a few bus lines or processor pins to interrupt lines
– Consequently even if multiple lines are used, it is likely that each line will have multiple I/O modules attached to it
• Software poll
– When the processor detects an interrupt it branches to an interrupt-service routine whose job is to poll each I/O module to determine which module caused the interrupt
– Time consuming
• Daisy chain (hardware poll, vectored)
– The interrupt acknowledge line is daisy chained through the modules
– Vector – address of the I/O module or some other unique identifier
– Vectored interrupt – processor uses the vector as a pointer to the appropriate device-service routine, avoiding the need to execute a general interrupt-service routine first
• Bus arbitration (vectored)
– An I/O module must first gain control of the bus before it can raise the interrupt request line
– When the processor detects the interrupt it responds on the interrupt acknowledge line
– Then the requesting module places its vector on the data lines
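
The difference between a software poll and a vectored (daisy-chain) scheme can be sketched as follows. The `Module` class, the vector values, and the handler table are hypothetical; the sketch only contrasts "test every module in software" with "the requesting module identifies itself".

```python
class Module:
    """Hypothetical I/O module with an interrupt flag and a vector."""

    def __init__(self, name, vector):
        self.name = name
        self.vector = vector
        self.interrupt_pending = False


def software_poll(modules):
    """General interrupt-service routine: test each module in turn.

    Returns the interrupting module and how many status tests were needed.
    """
    for tests, module in enumerate(modules, start=1):
        if module.interrupt_pending:
            return module, tests
    return None, len(modules)


def daisy_chain_acknowledge(modules):
    """The acknowledge signal passes down the chain in order.

    The first requesting module answers by supplying its vector.
    """
    for module in modules:
        if module.interrupt_pending:
            return module.vector
    return None


if __name__ == "__main__":
    chain = [Module("disk", 0x20), Module("printer", 0x21), Module("nic", 0x22)]
    handlers = {m.vector: f"{m.name}_service_routine" for m in chain}
    chain[2].interrupt_pending = True
    print(software_poll(chain)[1])                   # status tests performed
    print(handlers[daisy_chain_acknowledge(chain)])  # routine selected by vector
```

In the daisy chain, position in the chain also settles which module answers when several requests are pending at once.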
Figure 8.8
Use of the 82C59A Interrupt Controller



Figure 8.9
The Intel 8255A Programmable Peripheral
Interface



Figure 8.10
The Intel 8255A Control Word



Figure 8.11
Keyboard/Display Interface to 8255A



Drawbacks of Programmed and Interrupt-Driven I/O

• Both forms of I/O suffer from two inherent drawbacks:
1) The I/O transfer rate is limited by the speed with which the processor can test and service a device
2) The processor is tied up in managing an I/O transfer; a number of instructions must be executed for each I/O transfer

• When large volumes of data are to be moved a more efficient technique is direct memory access (DMA)



Figure 8.12
Typical DMA Block Diagram



Figure 8.13
DMA and Interrupt Breakpoints during an
Instruction Cycle



Figure 8.14
Alternative DMA Configurations



Figure 8.15
8237 DMA Usage of System Bus



Fly-By DMA Controller

• Data does not pass through and is not stored in DMA chip
– DMA can only transfer data between an I/O port and a memory address
– Not between two I/O ports or two memory locations
• Can do memory to memory via register
• 8237 contains four DMA channels
– Can be programmed independently
– Any one of the channels may be active at any moment
– These channels are numbered 0, 1, 2, and 3
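
A toy model of a fly-by transfer is sketched below. The `FlyByDMA` class and its methods are hypothetical and much simpler than the real 8237 (no bus arbitration, no modes, no real register semantics); the sketch only shows that each channel holds an address and a count, and that data moves from port to memory without being stored in the controller.

```python
class FlyByDMA:
    """Hypothetical four-channel controller holding only address and count per channel."""

    CHANNELS = 4

    def __init__(self, memory):
        self.memory = memory
        self.channels = [None] * self.CHANNELS

    def program(self, channel, address, count):
        """Processor involvement at the start: set start address and word count."""
        if not 0 <= channel < self.CHANNELS:
            raise ValueError("channels are numbered 0, 1, 2, and 3")
        self.channels[channel] = {"address": address, "count": count}

    def transfer(self, channel, port):
        """Move words from an I/O port (an iterator) straight into memory.

        Returns True once the count is exhausted (terminal count).
        """
        state = self.channels[channel]
        while state["count"] > 0:
            # The word goes port -> memory; the controller keeps no copy of it.
            self.memory[state["address"]] = next(port)
            state["address"] += 1
            state["count"] -= 1
        return True


if __name__ == "__main__":
    memory = [0] * 8
    dma = FlyByDMA(memory)
    dma.program(channel=2, address=3, count=4)
    dma.transfer(2, iter([9, 8, 7, 6]))
    print(memory)
```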



Table 8.2
Intel 8237A Registers
Bit | Command                       | Status                   | Mode                                              | Single Mask                       | All Mask
D0  | Memory-to-memory E/D          | Channel 0 has reached TC | Channel select (D0-D1)                            | Select channel mask bit (D0-D1)   | Clear/set channel 0 mask bit
D1  | Channel 0 address hold E/D    | Channel 1 has reached TC | Channel select (D0-D1)                            | Select channel mask bit (D0-D1)   | Clear/set channel 1 mask bit
D2  | Controller E/D                | Channel 2 has reached TC | Verify/write/read transfer (D2-D3)                | Clear/set mask bit                | Clear/set channel 2 mask bit
D3  | Normal/compressed timing      | Channel 3 has reached TC | Verify/write/read transfer (D2-D3)                |                                   | Clear/set channel 3 mask bit
D4  | Fixed/rotating priority       | Channel 0 request        | Auto-initialization E/D                           |                                   |
D5  | Late/extended write selection | Channel 1 request        | Address increment/decrement select                |                                   |
D6  | DREQ sense active high/low    | Channel 2 request        | Demand/single/block/cascade mode select (D6-D7)   |                                   |
D7  | DACK sense active high/low    | Channel 3 request        | Demand/single/block/cascade mode select (D6-D7)   |                                   |

E/D = enable/disable
TC = terminal count
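
As an illustration of how the Status column might be read in software, the sketch below decodes a status byte. It assumes that a set bit means the condition holds, and that bits D4 to D7 report pending requests for channels 0 to 3 respectively; the function name is hypothetical.

```python
def decode_8237a_status(status: int) -> dict:
    """Split a status byte into per-channel terminal-count and request flags.

    Assumed layout: D0-D3 = channel 0-3 has reached TC,
                    D4-D7 = channel 0-3 request pending.
    """
    if not 0 <= status <= 0xFF:
        raise ValueError("status register is 8 bits wide")
    return {
        "reached_tc": [ch for ch in range(4) if status & (1 << ch)],
        "request": [ch for ch in range(4) if status & (1 << (ch + 4))],
    }


if __name__ == "__main__":
    print(decode_8237a_status(0b0010_0001))
```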
Direct Cache Access (DCA)

• DMA is not able to scale to meet the increased demand due to dramatic increases in data rates for network I/O
• Demand is coming primarily from the widespread deployment
of 10-Gbps and 100-Gbps Ethernet switches to handle
massive amounts of data transfer to and from database
servers and other high-performance systems
• Another source of traffic comes from Wi-Fi in the gigabit range
• Network Wi-Fi devices that handle 3.2 Gbps and 6.76 Gbps
are becoming widely available and producing demand on
enterprise systems



Figure 8.16
Xeon E5-2600/4600 Chip Architecture



Cache-Related Performance Issues (1 of 2)
Network traffic is transmitted in the form of a sequence of protocol blocks called packets or protocol data units

The lowest, or link, level protocol is typically Ethernet, so that each arriving and departing block of data consists
of an Ethernet packet containing as payload the higher-level protocol packet

The higher-level protocols are usually the Internet Protocol (IP), operating on top of Ethernet and the
Transmission Control Protocol (TCP), operating on top of IP

The Ethernet payload consists of a block of data with a TCP header and an IP header

For outgoing data, Ethernet packets are formed in a peripheral component, such as an I/O controller or network interface controller (NIC)

For incoming traffic, the I/O controller strips off the Ethernet information and delivers the TCP/IP packet to the host CPU
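
The layering described on this slide can be sketched with placeholder headers. The byte strings `[ETH]`, `[IP]`, and `[TCP]` are stand-ins invented for this illustration, not real header formats, and the function names are hypothetical.

```python
ETH_HEADER = b"[ETH]"   # placeholder, not a real Ethernet header
IP_HEADER = b"[IP]"     # placeholder, not a real IP header
TCP_HEADER = b"[TCP]"   # placeholder, not a real TCP header


def encapsulate(payload: bytes) -> bytes:
    """Outgoing path: wrap application data in TCP, then IP, then Ethernet."""
    tcp_segment = TCP_HEADER + payload
    ip_packet = IP_HEADER + tcp_segment
    return ETH_HEADER + ip_packet


def strip_ethernet(frame: bytes) -> bytes:
    """Incoming path: the I/O controller removes the Ethernet information."""
    if not frame.startswith(ETH_HEADER):
        raise ValueError("not an Ethernet frame in this toy format")
    return frame[len(ETH_HEADER):]


if __name__ == "__main__":
    frame = encapsulate(b"hello")
    print(frame)
    print(strip_ethernet(frame))   # the TCP/IP packet delivered to the host
```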



Cache-Related Performance Issues (2 of 2)

• For both outgoing and incoming traffic the core, main memory, and cache are all involved
• In a DMA scheme, when an application wishes to transmit data, it places that data in an application-assigned buffer in main memory
– The core transfers this to a system buffer in main memory and creates the necessary TCP and IP headers, which are also buffered in system memory
– The packet is then picked up via DMA for transfer via the NIC
– This activity engages not only main memory but also the cache
– Similar transfers between system and application buffers are required for incoming traffic



Packet Traffic Steps:

Incoming
• Packet arrives
• DMA
• NIC interrupts host
• Retrieve descriptors and headers
• Cache miss occurs
• Header is processed
• Payload transferred

Outgoing
• Packet transfer requested
• Packet created
• Output operation invoked
• DMA transfer
• NIC signals completion
• Driver frees buffer

Direct Cache Access Strategies

Simplest strategy was implemented as a prototype on a number of Intel Xeon processors between 2006 and 2010
• This form of DCA applies only to incoming network traffic
• The DCA function in the memory controller sends a prefetch hint to the core as soon as the data is available in system memory
• This enables the core to prefetch the data packet from the system buffer

Much more substantial gains can be realized by avoiding the system buffer in main memory altogether
• The packet and packet descriptor information are accessed only once in the system buffer by the core
• For incoming packets, the core reads the data from the buffer and transfers the packet payload to an application buffer
• It has no need to access that data in the system buffer again
• Cache injection
• Implemented in Intel’s Xeon processor line, referred to as Direct Data I/O



Figure 8.17
Comparison of DMA and DDIO



Evolution of the I/O Function
1. The CPU directly controls a peripheral device.
2. A controller or I/O module is added. The CPU uses programmed I/O without interrupts.
3. Same configuration as in step 2 is used, but now interrupts are employed. The CPU need not spend time waiting for an I/O operation to be performed, thus increasing efficiency.
4. The I/O module is given direct access to memory via DMA. It can now move a block of data to or from memory without involving the CPU, except at the beginning and end of the transfer.
5. The I/O module is enhanced to become a processor in its own right, with a specialized instruction set tailored for I/O.
6. The I/O module has a local memory of its own and is, in fact, a computer in its own right. With this architecture a large set of I/O devices can be controlled with minimal CPU involvement.



Figure 8.18
I/O Channel Architecture



Universal Serial Bus (USB)
• Widely used for peripheral connections
• Is the default interface for slower speed devices
• Commonly used high-speed I/O
• Has gone through multiple generations
– USB 1.0
▪ Defined a Low Speed data rate of 1.5 Mbps and a Full Speed rate of 12 Mbps
– USB 2.0
▪ Provides a data rate of 480 Mbps
– USB 3.0
▪ Higher speed bus called SuperSpeed in parallel with the USB 2.0 bus
▪ Signaling speed of SuperSpeed is 5 Gbps, but due to signaling overhead the usable data rate is up to
4 Gbps
– USB 3.1
▪ Includes a faster transfer mode called SuperSpeed+
▪ This transfer mode achieves a signaling rate of 10 Gbps and a theoretical usable data rate of 9.7 Gbps

• Is controlled by a root host controller which attaches to devices to create a local network with a hierarchical tree topology
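
The gap between signaling rate and usable data rate quoted for SuperSpeed and SuperSpeed+ can be reproduced with a short calculation. It assumes the line codes commonly cited for these modes, 8b/10b for SuperSpeed and 128b/132b for SuperSpeed+, and ignores protocol overhead above the line code; the function name is hypothetical.

```python
def usable_rate_gbps(signaling_gbps: float, data_bits: int, coded_bits: int) -> float:
    """Data rate left after line coding sends `coded_bits` for every `data_bits`."""
    return signaling_gbps * data_bits / coded_bits


if __name__ == "__main__":
    # Assumed line codes: 8b/10b (SuperSpeed) and 128b/132b (SuperSpeed+).
    print(usable_rate_gbps(5, 8, 10))
    print(round(usable_rate_gbps(10, 128, 132), 1))
```

Under these assumptions the results agree with the 4 Gbps and 9.7 Gbps figures on the slide.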
FireWire Serial Bus
• Was developed as an alternative to small computer system interface (SCSI) to
be used on smaller systems, such as personal computers, workstations, and
servers
• Objective was to meet the increasing demands for high I/O rates while avoiding
the bulky and expensive I/O channel technologies developed for mainframe and
supercomputer systems
• IEEE standard 1394, for a High Performance Serial Bus
• Uses a daisy chain configuration, with up to 63 devices connected off a single
port
• 1022 FireWire buses can be interconnected using bridges
• Provides for hot plugging which makes it possible to connect and disconnect
peripherals without having to power the computer system down or reconfigure
the system
• Provides for automatic configuration
• No terminations and the system automatically performs a configuration function
to assign addresses
SCSI

• Small Computer System Interface


• A once common standard for connecting peripheral devices
to small and medium-sized computers
• Has lost popularity to USB and FireWire in smaller systems
• High-speed versions remain popular for mass memory
support on enterprise systems
• Physical organization is a shared bus, which can support
up to 16 or 32 devices, depending on the generation of the
standard
– The bus provides for parallel transmission rather than serial, with a bus
width of 16 bits on earlier generations and 32 bits on later generations
– Speeds range from 5 Mbps on the original SCSI-1 specification to 160 Mbps on SCSI-3 U3
Thunderbolt
• Most recent and fastest peripheral connection technology to become available for general-purpose use
• Developed by Intel with collaboration from Apple
• The technology combines data, video, audio, and power into a single high-speed connection for peripherals such as hard drives, RAID arrays, video-capture boxes, and network interfaces
• Provides up to 10 Gbps throughput in each direction and up to 10 Watts of power to connected peripherals



InfiniBand

• I/O specification aimed at the high-end server market


• First version was released in early 2001
• Heavily relied on by IBM zEnterprise series of mainframes
• Standard describes an architecture and specifications for data flow
among processors and intelligent I/O devices
• Has become a popular interface for storage area networking and
other large storage configurations
• Enables servers, remote storage, and other network devices to be
attached in a central fabric of switches and links
• The switch-based architecture can connect up to 64,000 servers,
storage systems, and networking devices



PCI Express and SATA
PCI Express
• High-speed bus system for connecting peripherals of a wide variety of types and speeds

SATA
• Serial Advanced Technology Attachment
• An interface for disk storage systems
• Provides data rates of up to 6 Gbps, with a maximum per device of 300 Mbps
• Widely used in desktop computers and in industrial and embedded applications



Ethernet
• Predominant wired networking technology
• Has evolved to support data rates up to 100 Gbps and distances from a few meters to tens of km
• Has become essential for supporting personal computers, workstations, servers, and massive data storage devices in organizations large and small
• Began as an experimental bus-based 3-Mbps system
• Has moved from bus-based to switch-based
– Data rate has periodically increased by an order of magnitude
– There is a central switch with all of the devices connected directly to the switch
• Ethernet systems are currently available at speeds up to 100 Gbps



Wi-Fi
• Is the predominant wireless Internet access technology
• Now connects computers, tablets, smart phones, and other electronic devices such as video cameras, TVs, and thermostats
• In the enterprise has become an essential means of enhancing worker productivity and network effectiveness
• Public hotspots have expanded dramatically to provide free Internet access in most public places
• As the technology of antennas, wireless transmission techniques, and wireless protocol design has evolved, the IEEE 802.11 committee has been able to introduce standards for new versions of Wi-Fi at higher speeds
• Current version is 802.11ac (2014) with a maximum data rate of 3.2 Gbps
Figure 8.19
IBM z13 I/O Channel Structure
(Figure labels: logical partitions, channel subsystems, subchannel sets. Up to 85 partitions per system; up to 15 partitions per channel subsystem; 6 channel subsystems; 4 subchannel sets per channel subsystem; up to 64k channels per subchannel set.)
Figure 8.20
IBM z13 I/O System Structure



Summary
Chapter 8: Input/Output
• External devices
• I/O modules
• Programmed I/O
• Interrupt-driven I/O
• Direct memory access
• Direct Cache Access
• I/O channels and processors
• External interconnection standards
• IBM z13 I/O structure
