PHY201 - Worksheet 5, F00

Numerical Calculation of erf(x) - Part IV

Timing routines and compiler optimisation switches

Aleksandar Donev - Dr. Phillip Duxbury

Due Friday October 6th


Benchmarking Fortran Routines

In this worksheet you will use the previously developed module Erf_Series for calculating $\mathrm{erf}(x)$ via a Taylor series. You will probably find it useful to reuse pieces of your code from worksheet 4 as well. We will learn how to measure the execution time of a Fortran routine such as ErfOfReal. Also, since our project is getting big and we are reusing a lot of code, you will learn how to compile modules separately and then link them. Along the way, you will hopefully learn something about compiler optimization switches as well.

You will need to read section 2.7.2 from the manual and review all the previous material covered in the first four worksheets. The next worksheet will introduce arrays for the first time and we will use them to plot the results from this worksheet using a Fortran graphics library called DISLIN.

Fortran 90 timing subroutine `SYSTEM_CLOCK`

The Fortran manual you received does not discuss this routine, but it is simple and very useful. Its syntax is:

SYSTEM_CLOCK([COUNT=clock_count],[COUNT_RATE=clock_rate],...)

where both the COUNT and the COUNT_RATE arguments are optional. Each combination of computer and compiler has its own fast clock. The clock counter advances at a fixed number of ticks per second from a starting point chosen by the system, so only the difference between two readings is meaningful. The counting rate is returned in the integer clock_rate, while the current value of the clock counter is returned in the integer clock_count.

This will become clear via an example. Let's assume you want to calculate the time it takes for a given action to complete. Then you would use:

   INTEGER :: clock_start,clock_end,clock_rate
   REAL(KIND=sp) :: elapsed_time
   ...
   CALL SYSTEM_CLOCK(COUNT_RATE=clock_rate) ! Find the rate
   CALL SYSTEM_CLOCK(COUNT=clock_start) ! Start timing
      ...Do your calculation here, for example:...
      ...erf=ErfOfReal(x)...
   CALL SYSTEM_CLOCK(COUNT=clock_end) ! Stop timing
   ! Calculate the elapsed time in seconds. Convert to REAL before
   ! dividing; integer division would truncate the result:
   elapsed_time=REAL(clock_end-clock_start,sp)/REAL(clock_rate,sp)
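
The fragment above can be turned into a complete program along the following lines. This is only a sketch: the kind parameter sp is assumed to be the default real kind, and the square-root loop is a placeholder workload rather than the error function.

   PROGRAM clock_demo
      IMPLICIT NONE
      INTEGER, PARAMETER :: sp=KIND(1.0) ! Assumed single-precision kind
      INTEGER :: i,clock_start,clock_end,clock_rate
      REAL(KIND=sp) :: elapsed_time,total
      CALL SYSTEM_CLOCK(COUNT_RATE=clock_rate) ! Ticks per second
      IF (clock_rate==0) STOP "No system clock available"
      CALL SYSTEM_CLOCK(COUNT=clock_start) ! Start timing
      total=0.0_sp
      DO i=1,1000000 ! Placeholder workload
         total=total+SQRT(REAL(i,sp))
      END DO
      CALL SYSTEM_CLOCK(COUNT=clock_end) ! Stop timing
      ! Convert both integers to REAL before dividing, otherwise any
      ! interval shorter than one second would be truncated to zero
      elapsed_time=REAL(clock_end-clock_start,sp)/REAL(clock_rate,sp)
      PRINT *,"clock rate (ticks/s) =",clock_rate
      PRINT *,"elapsed time (s)     =",elapsed_time
      PRINT *,"workload result      =",total ! Printed so the loop result is used
   END PROGRAM clock_demo

The printed clock rate tells you the resolution of the clock: one tick lasts 1/clock_rate seconds, and shorter intervals cannot be measured directly.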

Timing the `Erf` function

One of the difficulties of timing a routine like Erf is that it is very fast, of the order of several microseconds ($\mu s$). Most clocks can only resolve milliseconds or tens of milliseconds. The basic strategy in this case is to call the routine n_repetitions times in a DO loop, measure the total execution time, and divide by n_repetitions.

Write a new module called, for example, Erf_Timing, containing a single routine, say ErfTiming, which takes x as an argument and returns the average time (in seconds) that one call to the routine took. Put this module in a separate file.

Your function may look something like (by now you should know how to place this in a module and properly declare the arguments):

  FUNCTION ErfTiming(x) RESULT(elapsed_time)
    ...
    CALL SYSTEM_CLOCK(COUNT=clock_start)
    DO i=1,n_repetitions
      erf=ErfSeries(x)
    END DO
    CALL SYSTEM_CLOCK(COUNT=clock_end)
    ...Return the elapsed time...
    ...Remember to divide by n_repetitions!...
  END FUNCTION ErfTiming
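
One possible way to fill in the skeleton above is sketched below. Several things here are assumptions made only so that the example compiles on its own: the Erf_Series module shown is a simplified stand-in (replace it with your own module from the earlier worksheets, kept in its own file), its fixed tolerance of 1.0E-6 and the value of n_repetitions are arbitrary choices, and the kind parameter sp is taken to be the default real kind.

   MODULE Erf_Series ! Stand-in only, NOT the module from the earlier worksheets
      IMPLICIT NONE
      INTEGER, PARAMETER :: sp=KIND(1.0)
   CONTAINS
      FUNCTION ErfSeries(x) RESULT(erf_x)
         REAL(KIND=sp), INTENT(IN) :: x
         REAL(KIND=sp) :: erf_x,power,term
         INTEGER :: n
         power=x ! Holds (-1)**n * x**(2n+1) / n!, starting at n=0
         erf_x=0.0_sp
         DO n=0,200
            term=power/REAL(2*n+1,sp)
            erf_x=erf_x+term
            IF (ABS(term)<1.0E-6_sp) EXIT ! Assumed fixed tolerance
            power=-power*x*x/REAL(n+1,sp)
         END DO
         erf_x=1.1283792_sp*erf_x ! Multiply by 2/SQRT(pi)
      END FUNCTION ErfSeries
   END MODULE Erf_Series

   MODULE Erf_Timing
      USE Erf_Series ! Provides ErfSeries and the kind parameter sp
      IMPLICIT NONE
      INTEGER, PARAMETER :: n_repetitions=100000 ! Assumed value
   CONTAINS
      FUNCTION ErfTiming(x) RESULT(elapsed_time)
         REAL(KIND=sp), INTENT(IN) :: x
         REAL(KIND=sp) :: elapsed_time,erf
         INTEGER :: i,clock_start,clock_end,clock_rate
         CALL SYSTEM_CLOCK(COUNT_RATE=clock_rate)
         CALL SYSTEM_CLOCK(COUNT=clock_start)
         DO i=1,n_repetitions
            erf=ErfSeries(x)
         END DO
         CALL SYSTEM_CLOCK(COUNT=clock_end)
         ! Total time in seconds, then the average time per call
         elapsed_time=REAL(clock_end-clock_start,sp)/REAL(clock_rate,sp)
         elapsed_time=elapsed_time/REAL(n_repetitions,sp)
      END FUNCTION ErfTiming
   END MODULE Erf_Timing

   PROGRAM timing_check
      USE Erf_Timing
      IMPLICIT NONE
      PRINT *,"seconds per call at x=1:",ErfTiming(1.0_sp)
   END PROGRAM timing_check

Both modules and the small test program are shown in one file only to keep the sketch self-contained.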

Play with the value of n_repetitions. Reasonable values are anywhere from 10,000 to 1,000,000. You will know whether the value is large enough if the results of your program do not fluctuate when you execute the program several times. You will know it is too large if you have to wait for more than a few minutes for your program to complete.
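
As a rough guide to choosing n_repetitions, here is a worked estimate. The clock rate $r$ and the time per call $t$ used below are illustrative assumptions, not measurements. If a clock reading can be off by about one tick, the relative error $\delta$ of the measured total time $T = n\,t$ for $n$ repetitions is roughly

   $\delta \approx \frac{1/r}{n\,t} = \frac{1}{r\,n\,t}$

With $r = 1000$ ticks per second and $t = 5\,\mu s$, choosing $n = 10^4$ gives $T = 0.05$ s and $\delta \approx 2\%$, while $n = 10^6$ gives $T = 5$ s and $\delta \approx 0.02\%$.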

Advanced: Compiler optimization and timing

The above will work in most cases, but it has a danger that needs to be pointed out. Smart compilers may notice that the body of the DO loop above does not change from one iteration to the next, and so execute it only once instead of n_repetitions times. This is an instance of compiler optimization, one of the great strengths of Fortran, which is discussed in section 2.7.2 of the manual. It is especially likely when the compiler has been informed (say by the PURE attribute, introduced in Fortran 95) that the Erf routine is "harmless", that is, it has no side effects.

To amend this, a common strategy is to turn compiler optimization off for the timing routine (ErfTiming) only. On many compilers this is done by adding the -O0 switch when compiling (in words, set the optimization level to 0). For example:

>   f90-vast -O0 timing.f90 ...
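
A different safeguard, not the one this worksheet asks for, is to make every pass through the loop contribute to a value that is printed afterwards, so the results are actually used. The sketch below illustrates the idea under stated assumptions: Work is a hypothetical stand-in for ErfSeries, and sp and n_repetitions are chosen arbitrarily. A sufficiently aggressive compiler may still simplify such a loop, so treat this as a complement to -O0 rather than a guarantee.

   PROGRAM keep_the_loop
      IMPLICIT NONE
      INTEGER, PARAMETER :: sp=KIND(1.0) ! Assumed single-precision kind
      INTEGER, PARAMETER :: n_repetitions=100000 ! Assumed value
      INTEGER :: i,clock_start,clock_end,clock_rate
      REAL(KIND=sp) :: x,checksum,elapsed_time
      x=1.0_sp
      checksum=0.0_sp
      CALL SYSTEM_CLOCK(COUNT_RATE=clock_rate)
      CALL SYSTEM_CLOCK(COUNT=clock_start)
      DO i=1,n_repetitions
         checksum=checksum+Work(x) ! Every result feeds the running sum
      END DO
      CALL SYSTEM_CLOCK(COUNT=clock_end)
      elapsed_time=REAL(clock_end-clock_start,sp)/REAL(clock_rate,sp)
      PRINT *,"time per call (s) =",elapsed_time/REAL(n_repetitions,sp)
      PRINT *,"checksum          =",checksum ! Printing uses the sum
   CONTAINS
      FUNCTION Work(y) RESULT(w) ! Hypothetical stand-in for ErfSeries
         REAL(KIND=sp), INTENT(IN) :: y
         REAL(KIND=sp) :: w
         w=EXP(-y*y)
      END FUNCTION Work
   END PROGRAM keep_the_loop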

The main program

Last time you wrote a program in which you calculated the error function for n_points values in the interval [x_min,x_max] and wrote it to a file. This time just remove the file I/O statements and replace the call,

   erf=ErfOfReal(x) ! Or erf=ErfSeries(x)

with

   elapsed_time=ErfTiming(x)

Then print the value of x and the elapsed time in microseconds (of course, if you want to, you can simply write these to a file and make as few modifications as possible). Again, let the user enter the number of output points n_points, x_min and x_max, and the precision $\varepsilon$. For example, the output of your code may look like:

[donev@gauss erf]$ ./erf_timing.x
Enter: x_min, x_max, n_points and error: 0.0,4.5,5,1E-6
x=     0.0000     time =     1.0000     us
x=     1.1250     time =     4.5000     us
x=     2.2500     time =     7.7000     us
x=     3.3750     time =     14.600     us
x=     4.5000     time =     22.500     us
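
The structure of such a main program might be sketched as follows. The evenly spaced grid reproduces the x values in the sample run above (0, 1.125, ..., 4.5 for n_points=5). Everything else is an assumption: the internal ErfTiming is a hypothetical stand-in that times a placeholder calculation, so that the sketch compiles on its own, and the precision $\varepsilon$ is left out because how it reaches your series routine depends on your earlier code.

   PROGRAM erf_timing_main
      ! In your project, USE your own Erf_Timing module here and
      ! delete the stand-in function at the bottom
      IMPLICIT NONE
      INTEGER, PARAMETER :: sp=KIND(1.0) ! Assumed single-precision kind
      INTEGER :: i,n_points
      REAL(KIND=sp) :: x,x_min,x_max,dx,elapsed_time
      WRITE(*,"(A)",ADVANCE="NO") "Enter: x_min, x_max and n_points: "
      READ *,x_min,x_max,n_points
      dx=0.0_sp
      IF (n_points>1) dx=(x_max-x_min)/REAL(n_points-1,sp)
      DO i=1,n_points
         x=x_min+REAL(i-1,sp)*dx ! Evenly spaced points in [x_min,x_max]
         elapsed_time=ErfTiming(x) ! Seconds per call
         ! Multiply by 1.0E6 to convert seconds to microseconds
         WRITE(*,"(A,F11.4,A,G12.5,A)") "x=",x,"     time =", &
            1.0E6_sp*elapsed_time,"     us"
      END DO
   CONTAINS
      FUNCTION ErfTiming(x) RESULT(elapsed_time) ! Hypothetical stand-in
         REAL(KIND=sp), INTENT(IN) :: x
         REAL(KIND=sp) :: elapsed_time,dummy
         INTEGER, PARAMETER :: n_repetitions=100000
         INTEGER :: j,clock_start,clock_end,clock_rate
         CALL SYSTEM_CLOCK(COUNT_RATE=clock_rate)
         CALL SYSTEM_CLOCK(COUNT=clock_start)
         dummy=0.0_sp
         DO j=1,n_repetitions
            dummy=dummy+EXP(-x*x) ! Placeholder work, not the erf series
         END DO
         CALL SYSTEM_CLOCK(COUNT=clock_end)
         elapsed_time=REAL(clock_end-clock_start,sp)/REAL(clock_rate,sp)
         elapsed_time=elapsed_time/REAL(n_repetitions,sp)
      END FUNCTION ErfTiming
   END PROGRAM erf_timing_main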

Advanced: Notes on compiling

A quick note about something that came up in class and the manual also emphasizes: Include the statement,

   IMPLICIT NONE

in all your programs and modules, after any USE statements (which must always come first). This way the compiler will check that you have declared all your variables.
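
A minimal illustration of the ordering is given below. The module name Constants and the program name are made up for this example.

   MODULE Constants ! Hypothetical helper module
      IMPLICIT NONE
      INTEGER, PARAMETER :: sp=KIND(1.0)
   END MODULE Constants

   PROGRAM implicit_demo
      USE Constants ! USE statements come first
      IMPLICIT NONE ! Then IMPLICIT NONE, before any declarations
      REAL(KIND=sp) :: elapsed_time
      elapsed_time=0.5_sp
      ! With IMPLICIT NONE, a misspelling such as "elapsed_tme=0.5_sp"
      ! is a compile-time error instead of silently creating a new variable
      PRINT *,elapsed_time
   END PROGRAM implicit_demo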

By now you have probably realized that you need to put all modules USEd by other modules at the top of the file, before they are USEd. This is because the compiler needs to compile these first, generate the needed information and then USE it. In larger projects, which our now month-old error function series is slowly becoming, it is wise to keep each module in a separate file and compile it individually. Although it is not necessary that you do this, it will make things much easier and you will avoid a lot of copying, pasting and repetitive work.

This is how it is done: assume we have a file Module.f90 containing a module that is used by the main program, or by another module, in the file Program.f90. First, compile only the module file, without producing an executable (the switch is -c):

>   f90-vast -c Module.f90 -o Module.o

This will produce an object file, Module.o, from all the subroutines in the module, along with a file Module.vo in the current directory containing information about the module. These files will be used by all compilations that use the module. Stay in the same directory to make your life easier. Now you can compile the executable and link in the object file:

>   f90-vast Program.f90 Module.o -o Program.x
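
For this worksheet the same steps might look as follows. The file names Erf_Series.f90 and Erf_Timing.f90 are assumptions, and so is combining -O0 with -c on one command line. The order matters: a module must be compiled before anything that USEs it.

>   f90-vast -c Erf_Series.f90 -o Erf_Series.o
>   f90-vast -O0 -c Erf_Timing.f90 -o Erf_Timing.o
>   f90-vast Program.f90 Erf_Series.o Erf_Timing.o -o Program.x

Only the timing module is compiled with optimization turned off, so the series routine itself is still timed as it would normally be compiled.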

Compiler usage is not trivial, especially with Fortran 90, but the same principles apply to any programming language, so it is well worth your time to play with this compiler!

Phil Duxbury
2000-10-02