
 

Memory organization and access of 80x86 processors

 

1)      The different types of memory

Picture from http://www.faculty.iu-bremen.de/course/FundCS2/Lectures/OSN-02%20Hardware.e.pdf

 

A couple of general remarks about memory:

o       There are several memory levels, each of which serves a different purpose and has a different access time.

o       The fastest access is for the register set located directly on the CPU. Unfortunately, the space available is limited.

o       Due to this, the introduction of cache memory was necessary. Cache memory is fast-access SRAM memory, which is typically more expensive and therefore also used in limited amounts. Cache memory usually keeps frequently used data to which it provides faster access than the main RAM. For 80486 and later processors, there are two different levels of cache, one inside the CPU (Level 1) and one which is external to the CPU (Level 2).

o       The main memory (RAM) is slower to access but has a much larger capacity.

 

2)      Segmentation: a characteristic feature of 80x86 processors

Picture from http://webster.cs.ucr.edu/Page_asm/ArtofAssembly/pdf/ch04.pdf

o       Segmentation is characteristic of the 80x86 processors and was introduced along with them.

o       Our current perception of memory is something like a linear stack or array of bytes with a single-index address (this is called flat or linear addressing).

o       In contrast, segmented addressing specifies an address as an offset within a given segment. The segment value is 16 bits wide on all 80x86 processors. This resembles a two-dimensional array.

o       This gives the programmer the ability to attach blocks of variables (a segment) to a particular piece of code.

o       It also allows partitioning of programs into modules which operate independently, as the segments are theoretically and ideally independent of each other. The best example for the use of this feature is object oriented programming.

o       Moreover, segmentation extends the addressable memory of the processor (from 64 KB to 1 MB on the 8086 processor).

o       Segmented addressing does not imply that the physical memory must also be non-linear. 80x86 memory is still linear, and there is a function which converts a segment:offset address to a physical location. In processors using only real addressing mode, this function is very simple: the CPU multiplies the segment value by sixteen (10h), which is equivalent to shifting it left by four bits, and adds the offset portion. With the 80286, Intel changed this function by introducing the protected addressing mode and thereby extended the memory available for addressing.
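
The real-mode conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of any particular chip: the function name real_mode_physical is hypothetical, and the wrap to 20 bits assumes an 8086 with 20 address lines.

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Convert a real-mode segment:offset pair to a physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Multiply the segment by 16 (shift left by 4 bits), add the offset,
    # and wrap to 20 bits (assumption: 8086 with 20 address lines).
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_physical(0x1100, 0x0F00)))  # 0x11f00
```

The largest segment value combined with the largest offset is what lets a 16-bit processor reach the full 1 MB range.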

 

3)      Real and protected mode addressing

 

o       Real mode addressing in 80x86 refers to strictly converting one address value (or two in the case of segmentation) into a physically meaningful location in the RAM. This limits the addressable memory of a 16-bit processor to 64 KB without segmentation and to 1 MB with segmentation.

o        Segmentation introduces an additional problem, which is that several different segment:offset addresses refer to the same physical address. For example, 11F0:0, 1100:F00, and even 1080:1700 all correspond to physical address 11F00h. The problem is solved by introducing a normalized address, which limits the offset portion to values between 0 and 0Fh, so that every physical address has exactly one normalized form (here 11F0:0).
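
Normalization can be sketched as follows. The helper name normalize is hypothetical, and the 20-bit wrap again assumes an 8086-style address bus; the idea is simply to move as much of the address as possible into the segment part.

```python
def normalize(segment: int, offset: int) -> tuple[int, int]:
    """Return the normalized segment:offset pair (offset in 0..0Fh)."""
    # Physical address, wrapped to 20 bits (assumption: 8086 address bus).
    physical = ((segment << 4) + offset) & 0xFFFFF
    # The upper 16 bits become the segment, the low 4 bits the offset.
    return physical >> 4, physical & 0xF


for seg, off in [(0x11F0, 0x0000), (0x1100, 0x0F00), (0x1080, 0x1700)]:
    nseg, noff = normalize(seg, off)
    print(f"{seg:04X}:{off:04X} -> {nseg:04X}:{noff:X}")
```

All three aliases from the example reduce to the same pair, which makes normalized addresses directly comparable.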

o       In 80286 and later processors, a different addressing mode, namely the protected addressing mode, was introduced. Rather than using a function to determine the physical address, protected mode processors use a look up table. Segment registers simply point to data structures that contain the information needed to access a location. Every protected mode program must include a table of "descriptors", which are 8 byte data structures that define the start and end of a segment. Depending on the type of segment, a descriptor may have other information such as access rights and the like. A typical descriptor contains the following information, packed into an 8 byte record:

·         Segment start: absolute 32 bit address (24 bits on the 80286)

·         Segment limit: Maximum address this segment can reference

·         Segment status: privilege level, segment present, segment available, segment type, etc.

(contents of descriptor according to http://www.ganssle.com/articles/aprot1.htm)

o       Given the information about the starting point and the length of the segment, the CPU computes the physical address as follows:

Picture from http://webster.cs.ucr.edu/Page_asm/ArtofAssembly/pdf/ch04.pdf

o       As one can imagine, the protected mode is particularly important because the segment limit and status let the CPU stop a program from overwriting machine instructions or another segment's data during execution. In real mode, a program can go on writing past the end of its own segment into the following one, altering the data stored there.
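
The table lookup and the limit check can be sketched like this. It is a deliberately simplified model: the names Descriptor, ProtectionFault and translate are hypothetical, the limit is treated as the highest valid offset, and privilege levels, granularity and paging are left out.

```python
from dataclasses import dataclass


class ProtectionFault(Exception):
    """Raised when an access violates the segment's descriptor."""


@dataclass(frozen=True)
class Descriptor:
    base: int              # segment start (absolute address)
    limit: int             # highest valid offset inside the segment
    present: bool = True   # "segment present" status bit
    writable: bool = True  # simplified stand-in for the access rights


def translate(table, index, offset, write=False):
    """Look up a descriptor and return base + offset, or raise a fault."""
    if not 0 <= index < len(table):
        raise ProtectionFault("no such descriptor")
    desc = table[index]
    if not desc.present:
        raise ProtectionFault("segment not present")
    if not 0 <= offset <= desc.limit:
        raise ProtectionFault("offset beyond segment limit")
    if write and not desc.writable:
        raise ProtectionFault("segment is not writable")
    return desc.base + offset


# Illustrative table: entry 0 is a read-only code segment, entry 1 is data.
table = [
    Descriptor(base=0x100000, limit=0x0FFF, writable=False),
    Descriptor(base=0x200000, limit=0xFFFF),
]
print(hex(translate(table, 1, 0x10)))  # 0x200010
```

A write through entry 0, or any offset above a segment's limit, raises ProtectionFault instead of silently touching a neighbouring segment, which is the behaviour real mode lacks.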

 

4)      Addressing modes

 

80x86 processors provide several ways of addressing operands. Among these, the most common ones are:

  • Immediate Mode (memory is not accessed) - operand is part of the instruction (for example, a constant is encoded in the instruction).
  • Register Addressing (memory is not accessed) - operand contained in a register
  • Direct Mode (memory accessed once) - operand field of instruction contains address of operand.
  • Indirection (memory accessed twice) - sometimes called "deferred"; operand field of instruction contains address of the address of the operand. This is like a "pointer" which can be modified.
    • 80x86 uses Register Indirect Addressing - effective address of operand contained in a register
  • Indexing: Base + Register - effective address obtained by adding value of operand field to contents of register (also called displacement addressing)
    • Fixed Base (address) + Variable Register Offset (operand field contains a base) Array type addressing.
    • Fixed Register + Variable Offset (address) (operand field contains a displacement) Record type addressing.
    • PC + offset: relative addressing (operand field contains a displacement) Used by near and short jump instructions on the Intel 80x86.
  • Indexing With Scaling: Base + Register Offset * Scaling Factor (operand field contains base address). Useful for array calculations where size of component is multiple bytes long.
  • Auto-indexing: Register Indirect or Indexing with auto-increment/decrement of register.
  • Indexing Indirect: Indexing coupled with indirection 
    • postindexing: indexing after indirection
    • preindexing: indirection after indexing
  • Stack Addressing: push & pop - a variant of register indirect with auto-increment/decrement using the SP register implicitly 
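
Most of the memory modes in the list above are special cases of one formula: base + index * scale + displacement. The sketch below illustrates that; the function effective_address, the register values and the table address are all hypothetical, and the scale factors 1, 2, 4 and 8 are those of the 32-bit addressing of the 80386 and later.

```python
def effective_address(base=0, index=0, scale=1, displacement=0, width=32):
    """Return base + index * scale + displacement, wrapped to `width` bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & ((1 << width) - 1)


# Hypothetical register contents and addresses, for illustration only.
ebx, esi = 0x1000, 3
table = 0x2000

print(hex(effective_address(displacement=table)))        # direct
print(hex(effective_address(base=ebx)))                  # register indirect
print(hex(effective_address(base=ebx, displacement=8)))  # record-type indexing
print(hex(effective_address(index=esi, displacement=table)))           # array of bytes
print(hex(effective_address(index=esi, scale=4, displacement=table)))  # array of 4-byte items
```

Immediate and register operands never reach this calculation because they do not refer to memory at all.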

For more information see http://www3.wittenberg.edu/bshelburne/Comp255S/AddressingModes.htm

 

5)      Cache organization and the segment descriptor cache

Picture from http://www.faculty.iu-bremen.de/course/FundCS2/Lectures/OSN-02%20Hardware.e.pdf

Ø      The idea behind cache memory is to keep frequently used blocks of memory in more expensive but faster memory. Cache memory is always accessed first to determine whether the data is there. Cache performance relies on both spatial and temporal locality, meaning that

o       The nearby memory cells of the cell which is being used are likely to also be used soon, and should be stored in the cache

o       Memory cells, which have been accessed recently, are likely to be used again soon and should be stored in the cache.

(See http://www.faculty.iu-bremen.de/course/FundCS2/Lectures/OSN-02%20Hardware.e.pdf)

Ø      Studies show that beyond a certain cache size (around 128 KB), further increases bring little additional performance improvement.

Ø      Cache memory is organized in slots of fixed size (e.g. 16 bytes), such that an entire block from main memory can be mapped to a slot in the cache. The question now arises how to identify which blocks of main memory are mapped into the cache. This information is encoded in the main memory address, which is divided into at least two fields: the tag, used to identify which block is in the cache; the offset within the block (compare the offset in 2); and optionally the slot/set, used to determine which slots in the cache a block could be mapped to.
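
As a rough sketch of this field split, the function below cuts an address into tag, slot/set and offset. The name split_address and the default geometry (16-byte slots, 256 sets) are assumptions chosen for illustration; both sizes must be powers of two so that the fields are plain bit ranges.

```python
def split_address(address, line_size=16, num_sets=256):
    """Split an address into (tag, set_index, offset) bit fields."""
    for n in (line_size, num_sets):
        if n <= 0 or n & (n - 1):
            raise ValueError("sizes must be powers of two")
    offset_bits = line_size.bit_length() - 1   # 16 bytes -> 4 bits
    set_bits = num_sets.bit_length() - 1       # 256 sets -> 8 bits
    offset = address & (line_size - 1)
    set_index = (address >> offset_bits) & (num_sets - 1)
    tag = address >> (offset_bits + set_bits)
    return tag, set_index, offset


tag, set_index, offset = split_address(0x12345)
print(hex(tag), hex(set_index), hex(offset))  # 0x12 0x34 0x5
```

With num_sets=1 the slot/set field disappears and only tag and offset remain, which corresponds to the two-field case mentioned above.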

Ø      There are several ways to map main blocks of memory to cache memory slots.

o                    The fully associative cache, in which the caching controller can place a block of bytes in any one of the cache lines present in the cache memory. This is very flexible but the extra circuitry to achieve full associativity is expensive and can slow down the memory subsystem.

o                    The direct mapped cache (also known as the one-way set associative cache), in which a block of main memory is always loaded into the same cache line in the cache. One of the problems with this is that it may not make effective use of all the cache memory: two frequently used blocks that map to the same cache line keep evicting each other, even while other cache lines sit unused.

o                   The n-way set associative cache is a mixture between direct and associative mapping. The cache is divided into sets of cache lines. The CPU selects a particular set using some subset of the address bits, just as for direct-mapping. Within each set there are n cache lines. The caching controller uses a fully associative mapping algorithm to select one of the n cache lines within the set.

(According to http://webster.cs.ucr.edu/Page_AoAWin/HTML/MemoryArchitecturea2.html)
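
The three mapping schemes can be compared with a small simulation. This is a sketch under stated assumptions: the class SetAssociativeCache and the helper run are hypothetical, and least-recently-used replacement is chosen here for simplicity (the text above does not specify a replacement policy). Setting ways=1 gives a direct-mapped cache, and ways equal to the number of lines gives a fully associative one.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy n-way set associative cache that only counts hits and misses."""

    def __init__(self, num_lines, ways, line_size=16):
        if num_lines % ways:
            raise ValueError("num_lines must be a multiple of ways")
        self.ways = ways
        self.line_size = line_size
        self.num_sets = num_lines // ways
        # One ordered dict of tags per set; the order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = self.misses = 0

    def access(self, address):
        block = address // self.line_size
        index = block % self.num_sets      # address bits select the set
        tag = block // self.num_sets       # remaining bits identify the block
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)         # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)      # evict the least recently used tag
        lines[tag] = True
        return False


def run(ways):
    """Alternate between two blocks that collide in a direct-mapped cache."""
    cache = SetAssociativeCache(num_lines=8, ways=ways)
    for _ in range(4):
        for address in (0x000, 0x080):
            cache.access(address)
    return cache.hits, cache.misses


for ways in (1, 2, 8):
    print(ways, run(ways))
```

In this access pattern the direct-mapped configuration misses on every access because both blocks compete for the same line, while the two-way and fully associative configurations miss only on the first touch of each block.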

Ø      Most cache designs are direct-mapped, two-way set associative, or four-way set associative. The 80x86 family CPUs use all three (depending on the CPU and cache). The on-chip cache of the 80486 is unified, meaning that data and instructions are contained in the same cache; the Pentium and later processors split the Level 1 cache into separate instruction and data caches.

Ø      One aspect of cache memory is essential to main memory access: the segment descriptor cache (directly related to the descriptor table discussed in 3), which was implemented beginning with the 80286 processor. This cache is updated each time a segment register is loaded and is essential to protected mode operation. Without it, the microprocessor would have to consult the descriptor tables on every memory access to determine the segment base, limit, and access rights. Because of their size, these tables are kept in slower main memory, so each memory access would require several additional memory accesses.

Ø      Although the layout of the segment descriptor cache differs between almost all processors, a typical structure (also showing the relationship between the descriptor table entry and the segment descriptor cache) is shown below, for the 8-byte descriptor table entries of 32-bit processors.

 

Figure 1: Combining fields from the descriptor table into the segment-descriptor cache.

Offset (bits)    Description
63..56           Base[31:24]
55               G
54               D/B
53               0
52               AVL
51..48           Limit[19:16]
47               P
46..45           DPL
44               S
43..40           Type
39..16           Base[23:00]
15..00           Limit[15:00]

These fields are combined into three entries of the segment-descriptor cache: Segment Base, Segment Access Rights, and Segment Limit.

Picture from http://www.x86.org/ddj/aug98/aug98.htm
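
The bit positions listed in Figure 1 can be turned into a small decoder that reassembles the scattered base and limit fields. This is a sketch only: the function decode_descriptor is hypothetical, and the example value is a commonly seen flat code-segment descriptor used here purely for illustration.

```python
def decode_descriptor(raw):
    """Split a 64-bit descriptor into (base, limit, access_rights)."""
    # Limit[15:00] sits in bits 15..00, Limit[19:16] in bits 51..48.
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)
    # Base[23:00] sits in bits 39..16, Base[31:24] in bits 63..56.
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)
    access = {
        "type": (raw >> 40) & 0xF,   # bits 43..40
        "s":    (raw >> 44) & 0x1,   # bit 44
        "dpl":  (raw >> 45) & 0x3,   # bits 46..45
        "p":    (raw >> 47) & 0x1,   # bit 47
        "avl":  (raw >> 52) & 0x1,   # bit 52
        "d_b":  (raw >> 54) & 0x1,   # bit 54
        "g":    (raw >> 55) & 0x1,   # bit 55
    }
    return base, limit, access


base, limit, access = decode_descriptor(0x00CF9A000000FFFF)
print(hex(base), hex(limit), access)
```

Doing this reassembly once, when the segment register is loaded, is what lets the processor reuse the combined base, limit and access rights on later memory accesses.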

 

6)      Main memory and programming

o       In practice, programs run under an operating system. How does the operating system organize the main memory?

 

Picture from http://www.briceg.com/linux/aoal/MemoryAccessandOrg.html

o       An operating system like Linux or Windows tends to put different types of data into different sections (or segments) of main memory.

o       The lowest memory addresses are reserved by the operating system and are usually user inaccessible (attempts to access this part of the memory will result in a protection fault).

o       The remaining six areas in memory are reserved for different types of data. These sections of memory include the stack section, the heap section, the code section, the READONLY section, the STATIC section, and the STORAGE section. The properties specified for each of these areas in memory (via descriptors) are one more check which prevents the programmer from misusing the data. For example, the user is not allowed to write data into the code section.

 

References:

http://www.briceg.com/linux/aoal/MemoryAccessandOrg.html

http://www.ganssle.com/articles/aprot1.htm

http://www3.wittenberg.edu/bshelburne/Comp255S/AddressingModes.htm

http://webster.cs.ucr.edu/Page_asm/ArtofAssembly/pdf/ch04.pdf

http://www.faculty.iu-bremen.de/course/FundCS2/Lectures/OSN-02%20Hardware.e.pdf

http://www.x86.org/articles/pmbasics/

http://wgc.chem.pu.ru/OS2/concepts/conc2.html

http://condor.depaul.edu/~jourada/97summer/CacheMemory/sld005.htm

http://webster.cs.ucr.edu/Page_AoAWin/HTML/MemoryArchitecturea2.html

http://www.x86.org/ddj/aug98/aug98.htm