[Computer Architecture] Chapter 1. Computer Abstractions and Technology
Introduction
Let’s take a broad look at the core concepts of how a computer works.
- Contents
What we will see in this category
We have a long journey ahead, exploring in detail how a computer operates and how its architecture is organized. We will cover these important points later:
- How high-level programs are translated into machine language
- The hardware/software interface
- What determines program performance, and by how much
- How hardware designers improve performance
- What parallel processing is
Levels of Program Code
Have you ever wondered how your source code actually runs? How does code written in a programming language like C, C++ or Python operate on your computer?
You may have heard the term "compile" in your very first programming class. Yes, the compiler converts your high-level language into assembly language, here for MIPS (Microprocessor without Interlocked Pipelined Stages). Then the assembler converts the assembly language into machine code (binary code) for MIPS.
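The assembler step can be sketched in a few lines of Python. This is only an illustration, not a real assembler: the function name `assemble_r_format` is hypothetical, it handles only a handful of register-to-register (R-format) instructions, and the small register and function-code tables are a subset of the standard MIPS numbering.

```python
# Register numbers follow the standard MIPS convention (subset only).
REGISTERS = {"$zero": 0, "$t0": 8, "$t1": 9, "$s0": 16, "$s1": 17, "$s2": 18}

# Function codes for a few R-format instructions (their opcode field is 0).
FUNCT = {"add": 0x20, "sub": 0x22, "and": 0x24, "or": 0x25}


def assemble_r_format(line):
    """Translate e.g. 'add $t0, $s1, $s2' into a 32-bit machine word.

    Hypothetical helper for illustration; real assemblers handle far more.
    """
    mnemonic, operands = line.split(maxsplit=1)
    # Assembly order is destination, source 1, source 2.
    rd, rs, rt = (REGISTERS[name.strip()] for name in operands.split(","))
    opcode, shamt = 0, 0
    # Field layout: opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)
    return (
        (opcode << 26) | (rs << 21) | (rt << 16)
        | (rd << 11) | (shamt << 6) | FUNCT[mnemonic]
    )


word = assemble_r_format("add $t0, $s1, $s2")
print(f"{word:032b}")   # the 32 bits the processor actually sees
print(f"0x{word:08x}")  # 0x02324020
```

The same line of assembly always maps to the same 32-bit pattern, which is why the assembler step is a mostly mechanical translation compared with compilation.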
What is a Computer?
A computer has these components:
- processor
- memory (cache as SRAM, main memory as DRAM, disk storage as HDD or SSD)
- input (mouse, keyboard)
- output (display, printer)
- interconnection or network
We will primarily focus on the processor, the cache and memory system.
The processor is the combination of a datapath and control. The datapath is the road that data travels along and where operations on it are performed, and control tells the datapath what to do.
Now, let’s see how the processor works.
Input devices place data into memory.
The processor fetches one binary instruction from memory, and control decodes it.
Control passes the decoded signals to the datapath, and the datapath executes the instruction.
Next, the processor fetches the next instruction from memory.
So, control decides which instruction comes next, decodes it, and sends the signals that operate the datapath components. The datapath contains the functional units that execute instructions (e.g. an adder) and the storage locations.
After the program completes, the resulting data is sent to the output devices.
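The fetch, decode, and execute steps above can be sketched as a toy interpreter. Everything here is an assumption made for illustration: the tuple instruction format, the three operations, and the four-register machine do not correspond to MIPS or any real processor.

```python
def run(program, memory):
    """Run a toy program made of (op, dest, a, b) tuples (hypothetical format)."""
    registers = [0] * 4  # datapath storage locations
    pc = 0               # index of the next instruction to fetch
    while pc < len(program):
        instruction = program[pc]     # fetch
        op, dest, a, b = instruction  # decode (the job of control)
        if op == "load":              # execute (the job of the datapath)
            registers[dest] = memory[a]
        elif op == "add":
            registers[dest] = registers[a] + registers[b]
        elif op == "store":
            memory[dest] = registers[a]
        else:
            raise ValueError(f"unknown operation: {op}")
        pc += 1                       # control selects the next instruction
    return memory


memory = [7, 5, 0]          # input data already placed in memory; slot 2 holds the result
program = [
    ("load", 0, 0, None),   # r0 = memory[0]
    ("load", 1, 1, None),   # r1 = memory[1]
    ("add", 2, 0, 1),       # r2 = r0 + r1
    ("store", 2, 2, None),  # memory[2] = r2
]
print(run(program, memory))  # [7, 5, 12]
```

The `if`/`elif` chain stands in for control and the register list stands in for the datapath's storage; real hardware does the same work with wires and signals rather than branches.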
Instruction Set Architecture
Instruction Set Architecture (ISA), or simply architecture: the abstract interface between the hardware and the lowest-level software. It encompasses all the information necessary to write a machine language program, including instructions, registers, memory access, I/O, etc.
The combination of the basic instruction set (the ISA) and the operating system interface is called the Application Binary Interface(ABI).
- ABI = ISA + OS interface.
- An ABI plays a role similar to an API in a high-level language: it is the interface that software running above the OS targets.
Now, let’s look at an overview of one of the most famous ISAs, the MIPS ISA.
- Instruction Categories
- Computational
- Load/Store
- Jump and Branch
- Floating Point (with coprocessor)
- Memory Management
- Special
- 3 instruction formats (R, I, and J), all 32 bits wide.
- Where the ISA fits: at the boundary between software and hardware. (Figure not shown.)
Performance Metrics
How can I choose a computer that delivers the performance I need at low cost? Here we will see which factors in the architecture contribute to overall system performance, and the relative importance and cost of these factors.
Basically, there are 4 important factors related to performance. We will look at them one by one in detail in future chapters.
- Algorithm: Determines the number of operations executed.
- Programming language, compiler and architecture: Determine the number of machine instructions executed per operation on a given architecture.
- Operating system: The interface between hardware and software, which supports the running program.
- Processor, memory system and I/O system: Determine how fast (and how many at a time) instructions are executed. They also determine how fast I/O operations are executed.
Now, let’s see how performance can be calculated.
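- Performance is defined as the inverse of execution time:
$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$
If X is n times faster than Y, then:
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$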
Simple example: If computer A runs a program in 10 seconds and computer B runs the same program in 15 seconds, how much faster is A than B? Answer: We know that A is n times faster than B if
$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = n$$
The performance ratio is 15 / 10 = 1.5, so A is 1.5 times faster than B.
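The same comparison can be written as a small helper. The function name `times_faster` is hypothetical, and it assumes both times were measured for the same program.

```python
def times_faster(time_x_s, time_y_s):
    """Return n such that X is n times faster than Y.

    Performance is 1 / execution time, so the performance ratio
    Performance_X / Performance_Y equals time_y_s / time_x_s.
    """
    if time_x_s <= 0 or time_y_s <= 0:
        raise ValueError("execution times must be positive")
    return time_y_s / time_x_s


# Computer A takes 10 s, computer B takes 15 s for the same program.
print(times_faster(10, 15))  # 1.5
```

A result below 1 means X is actually the slower machine.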
- Execution time is taken to be CPU time, because a program's elapsed time also depends on the devices it waits on. CPU time counts only the time the CPU spends working on a task; it does not include time spent waiting for I/O or running other programs.
CPU execution time can be calculated as:
$$\text{CPU time} = \text{CPU clock cycles} \times \text{Clock cycle time} = \frac{\text{CPU clock cycles}}{\text{Clock rate}}$$
Simple example: A program runs in 10 seconds on computer A, which has a 2 GHz clock. What clock rate must computer B have to run this program in 6 seconds? Unfortunately, to accomplish this, computer B will require 1.2 times as many clock cycles as computer A to run the program.
Answer :
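One way to work this out uses the relation CPU time = clock cycles / clock rate: first recover A's clock cycles, scale them by 1.2 for B, then divide by B's target time. The variable names below are illustrative only; the numbers come from the example.

```python
clock_rate_a_hz = 2e9  # computer A: 2 GHz
cpu_time_a_s = 10.0    # computer A: 10 seconds
cpu_time_b_s = 6.0     # target time on computer B
cycle_factor_b = 1.2   # B needs 1.2 times as many clock cycles as A

# CPU time = clock cycles / clock rate, so clock cycles = CPU time * clock rate.
clock_cycles_a = cpu_time_a_s * clock_rate_a_hz
clock_cycles_b = cycle_factor_b * clock_cycles_a

# Rearranged again: clock rate = clock cycles / CPU time.
clock_rate_b_hz = clock_cycles_b / cpu_time_b_s

print(f"{clock_rate_b_hz / 1e9:.1f} GHz")  # 4.0 GHz
```

So computer B needs a 4 GHz clock, twice A's clock rate, to finish in 6 seconds despite executing more cycles.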
- The CPU clock cycles a program needs depend on which instructions it executes, because different instructions take different numbers of clock cycles. For example, an integer add might take 1 clock cycle, an integer multiply 5 clock cycles, and a floating-point multiply 20 clock cycles. So, CPU clock cycles must be calculated from the average number of cycles per instruction.
- Clock cycles Per Instruction (CPI) is the average number of clock cycles each instruction takes to execute. CPI is a way to compare two different implementations of the same ISA, since both execute the same number of instructions for a given program.
Simple example: Computers A and B implement the same ISA. Computer A has a clock cycle time of 250 ps and an average CPI of 2.0 for some program, and computer B has a clock cycle time of 500 ps and an average CPI of 1.2 for the same program. Which computer is faster and by how much?
Answer :
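A sketch of the calculation, using CPU time = instruction count x CPI x clock cycle time. Because both computers implement the same ISA, the instruction count is the same on both and cancels out, so the code compares the time per instruction. The function name is hypothetical.

```python
def time_per_instruction_ps(cpi, clock_cycle_ps):
    """Average time one instruction takes, in picoseconds."""
    return cpi * clock_cycle_ps


time_a_ps = time_per_instruction_ps(cpi=2.0, clock_cycle_ps=250)  # computer A
time_b_ps = time_per_instruction_ps(cpi=1.2, clock_cycle_ps=500)  # computer B

# The instruction count is identical on both machines, so it cancels in the ratio.
ratio = time_b_ps / time_a_ps

print(f"A: {time_a_ps:.0f} ps, B: {time_b_ps:.0f} ps per instruction")
print(f"A is {ratio:.2f} times faster than B")  # A is 1.20 times faster than B
```

Computer A wins even though its CPI is higher, because its clock cycle is half as long.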
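- So, CPU clock cycles can be calculated as
$$\text{CPU clock cycles} = \text{Instruction count} \times \text{CPI}$$
- And the average CPI can be calculated as a weighted average over the instruction classes:
$$\text{CPI} = \frac{\text{CPU clock cycles}}{\text{Instruction count}} = \sum_{i=1}^{n} \text{CPI}_i \times \frac{\text{Instruction count}_i}{\text{Instruction count}}$$
Simple example: One program is compiled in two alternative ways, resulting in two different code sequences that use instructions in classes A, B, and C. Calculate the average CPI of sequence 1 and sequence 2.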
Answer :
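- Finally, the basic CPU performance equation is then
$$\text{CPU time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$$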
These equations separate the three key factors that affect performance.
- We can measure the CPU execution time by running the program.
- The clock rate is usually given.
- We can measure the overall instruction count by using profilers or simulators, without knowing all of the implementation details.
- CPI varies by instruction type.
In addition, let me introduce which parts of a program affect these factors:
Simple example: Consider several operation types, each with a measured frequency and cycle count. Answer these questions.
How much faster would the machine be if a better data cache reduced the average load time to 2 cycles?
Answer: CPU time new = IC x CPI x CC with a new CPI of 1.6. Since 2.2 / 1.6 = 1.375, the machine is 37.5% faster.
How does this compare with using branch prediction to save a cycle off the branch time?
Answer: CPU time new = IC x CPI x CC with a new CPI of 2.0. Since 2.2 / 2.0 = 1.1, the machine is 10% faster.
What if two ALU instructions could be executed at once?
Answer: CPU time new = IC x CPI x CC with a new CPI of 1.95. Since 2.2 / 1.95 = 1.128, the machine is 12.8% faster.
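The three answers can be checked with a short script. Important assumption: the measurement table for this example is not shown here, so the instruction mix below (ALU 50% at 1 cycle, load 20% at 5 cycles, store 10% at 3 cycles, branch 20% at 2 cycles) is a hypothetical one chosen because it reproduces the CPI values 2.2, 1.6, 2.0 and 1.95 quoted in the answers. It may differ from the original measurements.

```python
def average_cpi(mix):
    """Weighted average CPI; mix maps class name -> (frequency, cycles)."""
    return sum(freq * cycles for freq, cycles in mix.values())


# ASSUMED instruction mix (not from the source table; see the note above).
base = {"ALU": (0.5, 1), "Load": (0.2, 5), "Store": (0.1, 3), "Branch": (0.2, 2)}

scenarios = {
    "better data cache": {**base, "Load": (0.2, 2)},     # loads take 2 cycles
    "branch prediction": {**base, "Branch": (0.2, 1)},   # one cycle saved per branch
    "two ALU ops at once": {**base, "ALU": (0.5, 0.5)},  # ALU cost is halved
}

base_cpi = average_cpi(base)
print(f"base CPI = {base_cpi:.2f}")
for name, mix in scenarios.items():
    new_cpi = average_cpi(mix)
    # IC and clock cycle time are unchanged, so the speedup is the CPI ratio.
    gain_percent = (base_cpi / new_cpi - 1) * 100
    print(f"{name}: CPI = {new_cpi:.2f}, {gain_percent:.1f}% faster")
```

Under this assumed mix, the cache improvement helps most because loads are both frequent and expensive, which is the usual reason to look at frequency times cycles rather than cycles alone.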