Rutgers University CS 211 Sections 1 and 2, Fall 2005 Third Exam

Rutgers University CS 211 Sections 1 and 2, Fall 2005 Computer Architecture Third Exam

Read these instructions carefully. On the front of your bluebook, write your name and section number (1 or 2). Choose TWO of the following questions to answer and write your answers in the bluebook provided. Do not answer more than two questions; we will only grade the first two answers we encounter.

Consider the following C function:
```
void foo (int n) {
        int i, *p;
                                                                                
        for (i=1; i<=n; i++) {
                p = (int *) malloc (i);
                free (p);
        }
}
```
Suppose that malloc uses a single explicit free list to keep track of free blocks. free puts a block at the head of the list in constant time. The asymptotic time complexity of this function in terms of n is given by the total number of times a link in the free list is traversed (assume constant time for allocating a new block from system memory). For instance, if an ideal version of malloc traverses only one link per iteration, then the asymptotic complexity would be O(n).
- What is the asymptotic complexity of this function if a first-fit allocation policy is used?
- What is the asymptotic complexity of this function if a best-fit allocation policy is used?
Consider the following fragment of assembly language, with line numbers added for reference:
```
1         movl    $9, %edx
2 .L1:
3         addl    %eax, %esi
4         addl    %esi, %eax
5         addl    %ebx, %ecx
6         addl    %ecx, %ebx
7         decl    %edx
8         jns     .L1
```
Assume the following:
- %eax, %ebx,%ecx, and %esi could have any values before the loop.
- This code is executed on a machine with a 5-stage pipeline as described in class: Fetch, Decode, Execute, Memory, Writeback.
- Branch prediction is always correct for this code.
- There will be no structural hazards.
- Data hazards are resolved by stalling until operands are available in the register file.
Without changing the meaning of the code, rewrite it so that it will execute faster. That is, your new code should run faster and all of the register values at the end of the loop should be the same as they would be for the unmodified code.
Assume the following:
- An L1 data cache hit costs 1 cycle.
- An L1 data cache miss costs 10 cycles.
- A TLB hit costs 1 cycle.
- A TLB miss costs 10 cycles if the page table entry is cached, 100 cycles otherwise.
- An L2 cache miss costs 100 cycles.
- A page fault costs 10,000 cycles. Once a page fault occurs, the page in question is brought into the L1 cache at no extra charge.
- Virtual memory is 1 gigabyte (2³⁰ bytes).
- Pages are 1 kilobyte (2¹⁰ = 1024 bytes).
- The L1 data cache is big enough; the only misses are compulsory.
- On a context switch, all entries in the L1 cache and TLB are marked as invalid.
- When a process begins, none of its pages are in physical memory or in the cache.
- The caches use physical tags.
Answer the following questions:
- How many page table entries are there?
- Suppose that a context switch has just occurred. A movl instruction tries to read a virtual address x. The page table entry for x is in the L2 cache and there is a valid mapping to a physical page for x, but that page is not in the L2 cache. How many cycles will this access take?
- Suppose that a movl instruction tries to read a virtual address y. There is an address translation for y in the TLB and the physical page for y is in the L1 cache. How many cycles will this access take?