CS 3853 Spring 2010, Homework 1

Due on Thursday, February 4, 2010 at 11:59pm.

  1. Consider the following sequence of hexadecimal values:
    55 89 E5 83 EC 08 83 E4 F0 31 C9 BA 01 00 00 00 B8 0D 00 00 00 01 D1 01 CA 48 79 F9 31 C0 C9 C3
    
    
    This sequence of bytes represents a subroutine in Intel 80386 machine language in 32-bit mode.
  2. For this part, you will count the number of machine language instructions executed by a C program for two different levels of compiler optimization. You will do this compiling the C program into assembly language, then placing instrumentation code into the assembly language program. The instrumentation code keeps a running total of the number of instructions executed so far, then prints out that number at the end of the program.

    Do the following:

    1. Obtain the prog.c C program by clicking on the link in this sentence.
    2. Compile the C program to assembly language on the main computers. Log on to one of those machines and place the C program into your directory. Compile the program to assembly lanuage with the following command line:
      gcc -S -o prog-O0.S -O0 prog.c
      
      The -O0 option tells the compiler to compile with optimization level zero, i.e., no optimizations. The -S option tells the compiler to compile to assembly language instead of to an executable program, and to place the assembly language in a file called prog-O0.S.
    3. Edit the assembly language program and divide it into basic blocks. Rather than count each individual instruction, you will count groups of instructions, called basic blocks, together. A basic block is a sequence of instructions that are executed together. A basic block begins either after a label or after a jump instruction, and end either at a label or a jump instruction. Place a comment before each basic block and white space around it so that the division of the program into basic blocks is clear. Comments in an assembly language program begin with '#' and continue until the end of the line.
    4. Place definitions in the assembly language program for an integer variable called insn_count and a printf format string that says "%d instructions executed".
    5. Count the number of instructions in each basic block. At the beginning of each basic block, insert an instruction that adds that number to insn_count. For instance, if there are four instructions in a basic block, then you would insert the instruction addl $4,insn_count at the beginning of that block (but after any label).
    6. Add code to the very end of the program, just before push $0/call exit, to print out insn_count using printf and the format string defined earlier.
    7. Assemble and run the instrumented program with the following commands:
      gcc -o prog-O0 prog-O0.S
      ./prog-O0
      
    8. Repeat steps 2 through 7, using instead the following command line to compile the program:
      gcc -S -o prog-O3.S -O3 prog.c
      
      The -O3 option tells the compiler to compile with optimization level three, i.e., aggressive optimizations.

    Additional Information

    Example

    Here is an example for a small C program. The following C program computes and prints the 29th Fibonacci number to standard output:
    #include <stdio.h>
    #include <stdlib.h>
    
    int main () {
    	int	i, a, b;
    
    	a = 1;
    	b = 1;
    	for (i=0; i<14; i++) {
    		a = a + b;
    		b = b + a;
    	}
    	printf ("%d\n", a);
    	exit (0);
    }
    
    The source code file is named sample.c. When this program is compiled with gcc version 2.95.4 using the command line gcc -O0 -S sample.c, the following file called sample.s is produced:
    	.file	"sample.c"
    	.version	"01.01"
    gcc2_compiled.:
    .section	.rodata
    .LC0:
    	.string	"%d\n"
    .text
    	.align 4
    .globl main
    	.type	 main,@function
    main:
    	pushl %ebp
    	movl %esp,%ebp
    	subl $24,%esp
    	movl $1,-8(%ebp)
    	movl $1,-12(%ebp)
    	movl $0,-4(%ebp)
    	.p2align 4,,7
    .L3:
    	cmpl $13,-4(%ebp)
    	jle .L6
    	jmp .L4
    	.p2align 4,,7
    .L6:
    	movl -12(%ebp),%eax
    	addl %eax,-8(%ebp)
    	movl -8(%ebp),%eax
    	addl %eax,-12(%ebp)
    .L5:
    	incl -4(%ebp)
    	jmp .L3
    	.p2align 4,,7
    .L4:
    	addl $-8,%esp
    	movl -8(%ebp),%eax
    	pushl %eax
    	pushl $.LC0
    	call printf
    	addl $16,%esp
    	addl $-12,%esp
    	pushl $0
    	call exit
    	addl $16,%esp
    	.p2align 4,,7
    .L2:
    	leave
    	ret
    .Lfe1:
    	.size	 main,.Lfe1-main
    	.ident	"GCC: (GNU) 2.95.4 20011002"
    
    By placing more assembly language instructions and a few extra definitions in this file, as well as some comments for clarity, the following program has been instrumented to count the number of instructions executed:
    	.file	"sample.c"
    	.version	"01.01"
    
    # this section has some definitions needed for counting instructions
    
    # declare an integer called 'insn_count'.  it is automatically initialied to 0
    # by the assembler
    	.comm	insn_count,4,4
    # a constant string used for the 'printf' call that will print out
    # the number of instructions executed
    
    .count_string:
    	.string	"%d instructions executed\n"
    
    gcc2_compiled.:
    .section	.rodata
    .LC0:
    	.string	"%d\n"
    
    .text
    	.align 4
    .globl main
    	.type	 main,@function
    
    # basic block #1
    main:
    	addl $6,insn_count # six instructions in this block
    	pushl %ebp
    	movl %esp,%ebp
    	subl $24,%esp
    	movl $1,-8(%ebp)
    	movl $1,-12(%ebp)
    	movl $0,-4(%ebp)
    	.p2align 4,,7
    # basic block #2
    .L3:
    	addl $2,insn_count # two instructions in this block
    	cmpl $13,-4(%ebp)
    	jle .L6
    
    # basic block #3
    	addl $1,insn_count # one instruction in this block
    	jmp .L4
    	.p2align 4,,7
    # basic block #4
    
    .L6:
    	addl $4,insn_count # four instructions in this block
    	movl -12(%ebp),%eax
    	addl %eax,-8(%ebp)
    	movl -8(%ebp),%eax
    	addl %eax,-12(%ebp)
    # basic block #5
    .L5:
    	addl $2,insn_count # two instructions in this block
    	incl -4(%ebp)
    	jmp .L3
    	.p2align 4,,7
    # basic block #6
    .L4:
    	addl $9,insn_count # nine instructions up to 'call exit' in this block
    	addl $-8,%esp
    	movl -8(%ebp),%eax
    	pushl %eax
    	pushl $.LC0
    	call printf
    	addl $16,%esp
    	addl $-12,%esp
    
    # code to print out the instruction count
    
    	movl insn_count,%eax
    	pushl %eax
    	pushl $.count_string
    	call printf
    	addl $16,%esp
    
    	pushl $0
    	call exit
    	addl $16,%esp
    	.p2align 4,,7
    .L2:
    	leave
    	ret
    .Lfe1:
    	.size	 main,.Lfe1-main
    	.ident	"GCC: (GNU) 2.95.4 20011002"
    

    You may not work together on this assignment with other classmates or receive assistance from any person other than your professor. You may consult literature or the Internet for assistance; if you do, indicate the source in your writeup.

    Requirements

    Send the following four items to the teaching assistant, as separate email attachments, on or before 11:59pm, Thursday, February 4, 2010.
    1. Your answer to the first part.
    2. Your instrumented and commented prog-O0.S.
    3. Your instrumented and commented prog-O3.S.
    4. A paragraph describing your observations of what happened when you ran the programs. How many instructions did prog-O0 execute? How many instructions did prog-O3 execute?

    Late assignments will not be accepted.

    Your professor knows that most of you don't know 80386 assembly and machine language. You will need to be resourceful and listen carefully in class and in the recitation.