| Improving execution times in a Java program can be very important 
              part of making it a usable tool. There are various profiler 
              tools available commercially that give detailed information on the 
              time taken by various parts of a program. The Sun J2SE java program, 
              in fact, includes the options -Xprof 
              and -Xrunhprof 
              to produce time profiles. There are some optimization steps that the programmer 
              can make to improve the performance. For example, if you are sure 
              that you will not be overriding a method then you should you use 
              the modifier final. 
              The compiler can then inline the method since the interpreter 
              will not have to search for possible overriding methods.  Inline puts the program code directly in the execution 
              flow and therefore avoids jumping to and back from another section 
              of memory. For AWT components we saw that double buffering and 
              other techniques (see Chapter 
              11: Supplements: AWT Flicker ) reduce the number of calls to 
              the host windowing system and this speeds execution considerably. 
              (Swing components already employ double bufferiing.) Avoiding multiple drawing calls reduces drawing times. 
              We mentioned in a Chapter 6: 
              Java : Drawing Polygons, a considerable speedup can be obtained 
              by using a single drawPolyline() 
              call instead of multiple drawLine()calls 
              when drawing a curve. Timing with System.currentTimeMillis()  When you are unsure about the relative speed of different 
              coding techniques, you can test them with the static method System.currentTimeMillis(). 
              This method returns a long value equal to the number of milliseconds 
              since a standard date in 1970. That number in itself is of little 
              value, but the difference between two millisecond time readings 
              can be meaningful. Thus, you can bracket a code section of interest 
              with this method and find the difference in milliseconds to get 
              an idea of how long that section of code takes to run. On today's high performance machines, you often need 
              to loop over an operation a large number of times to produce a time 
              in the millisecond range. Such a timing measurement has come to 
              be known as a microbenchmark and is fraught with difficulties 
              for a variety of reasons including just-in-time and dynamic compilation 
              techniques, compilation warm-up periods, the use of separate compile 
              threads, background compilation, compiler optimizations, dead code 
              removal, and others.  Nevertheless, many people still rely on System.currentTimeMillis() 
              to measure execution time differences. We explain here how that 
              is normally done and attempt to ameliorate many of the concerns 
              that make microbenchmarks untrustworthy.  With multiprocessing happening in the operating system 
              and multithreading in the JVM, the measured times will vary from 
              run to run, so a statistical average is more meaningful than a single 
              run. It is wise to unload any unnecessary programs that might steal 
              time slices away from the program you are testing.  Demo Timing Application 
              Program As an example, consider the java.util.Vector 
              class we discussed in Chapter 
              10 : Java. It is often very useful compared to a regular fixed-length 
              array because of Vector's 
              ability to remove and add elements. However, in situations where 
              high performance is required, you may find that an array is preferable 
              because Vector 
              offers noticeably slower performance than an array.  One reason that Vector 
              is slow is because it is synchronized 
              for thread safety. An alternative that can be used in most situations 
              is the ArrayList 
              class which does not have the synchronization overhead. In the test 
              code below we compare timings of object arrays, Vector, 
              and ArrayList 
              objects. For consistency, we populate each with Integer 
              objects. In the main() 
              method shown below in the class TimeTest, 
              we read in some command line parameters and then call doTests() 
              several times. Inside doTests() 
              we call four methods that test fetching Integer 
              objects from four different container types - Vector, 
              ArrayList, 
              regular object arrays, and an ArrayList 
              collection object that takes advantage of the new generics feature 
              in Java 5.0 (see Chapter 10). 
              If you want to run this code on a pre-5.0 Java system, you'll need 
              to comment out the generics version.  The program uses the following methods to check the 
              four container types:  
              testv() - vectortestal() - ArrayList
 testali() - ArrayList<Integer>
 testa() - array
  In each case, the fetches from the containers are 
              in a loop in order to make the total time large enough to measure. 
              We have to do something with the Integer 
              objects fetched from the containers or else the compiler will notice 
              that nothing is changing inside the loop and remove the loop altogether. 
              So for our timing loop we first cast the Object 
              type retrieved from the container to an Integer 
              and then we accumulate a total sum of all the values after converting 
              from Integer 
              type to int type using the intValue() 
              method on Integer.  
              
                 
                  | TimeTest.java |   
                  | import java.util.*; 
 /**
 * Run timing tests on four types of code:
 *  1 - Vector element access
 *  2 - ArrayList element access
 *  3 - ArrayList element access.
 *  4 - Object array access (using Integer 
                    objects
 **/
 public class TimeTest
 {
 private int fWarmup, fNum, fOuter, fSleep;
 
 /** Fill timing parameters. **/
 public TimeTest (int warmup, int sleep, int num, 
                    int outer) {
 fWarmup = warmup;
 fSleep = sleep;
 fNum = num;
 fOuter = outer;
 } // ctor
 
 /** Run tests. **/
 public static void main (String[] args) {
 int warmup=0, sleep=0, num=0, outer=0, 
                    total=0;
 if (args.length == 5) {
 warmup = Integer.parseInt 
                    (args[0]);
 sleep  = Integer.parseInt 
                    (args[1]);
 num    = 
                    Integer.parseInt (args[2]);
 outer  = Integer.parseInt 
                    (args[3]);
 total  = Integer.parseInt 
                    (args[4]);
 System.out.println ("Testing 
                    with warmup = " + warmup +
 ", sleep 
                    = " + sleep + ", num = " + num +
 ", outer 
                    = " + outer + ", total = " + total);
 }
 else {
 System.err.println 
                    ("Usage: java TimeTest warmup " +
 "sleep loop 
                    outer total");
 System.exit 
                    (1);
 }
 
 System.out.println ("V\tAL\tALI\tarray");
 TimeTest tt = new TimeTest (warmup, 
                    sleep, num, outer);
 for (int i = 0; i < total; i++) {
 tt.doTests ();
 }
 } // main
 
 /** Carry out the different set of timing tests.**/
 public void doTests () {
 long vectime    = 
                    testv ();
 long arlisttime = testal ();
 long alitime    = 
                    testali ();
 long arraytime  = testa 
                    ();
 
 System.out.println (vectime + "\t" 
                    + arlisttime +
 "\t" + alitime + "\t" 
                    + arraytime);
 } // doTests
 
 /**
 * Create a Vector and an array whose 
                    elements reference
 * Integer objects. A time comparison 
                    is made between accessing
 * the Integer objects with the Vector 
                    and the array.
 **/
 public long testv () {
 // Create Vector and fill elements 
                    with Integer objects
 Vector vec = new Vector (fNum);
 for (int i=0; i < fNum; i++) {
 vec.add (new 
                    Integer (i));
 }
 
 // Now test access times by looping 
                    through the Vector
 // to access each Integer element.
 
 // First warmup the hotspot compiler.
 long sum = 0;
 for (int i = 0; i < fWarmup; i++) 
                    {
 sum += ((Integer)(vec.elementAt 
                    (i))).intValue ();
 }
 // And give it time to finish the 
                    JIT compile
 // in the background.
 if (fSleep > 0) {
 try {
 Thread.sleep 
                    (fSleep);
 }
 catch (InterruptedException 
                    e)
 {}
 }
 // Then do the loop for real and time 
                    it.
 long t1 = System.currentTimeMillis 
                    ();
 for (int j = 0; j < fOuter; j++) {
 for (int i 
                    = 0; i < fNum; i++) {
 sum 
                    += ((Integer)(vec.elementAt (i))).intValue ();
 }
 }
 long t2 = System.currentTimeMillis 
                    ();
 long elapsed = t2 - t1;
 vec.clear ();
 vec = null;
 System.gc ();
 return elapsed;
 } // testv
 
 /** Create Object array and fill elements with 
                    Integer objects.**/
 public long testa () {
 Object[] array = new Object[fNum];
 for (int i=0; i < fNum; i++){
 array[i] = 
                    new Integer(i);
 }
 
 // Now test access times by looping 
                    through the array
 // to access each Integer element.
 
 // First warmup the hotspot compiler.
 int sum = 0;
 for (int i=0; i < fWarmup; i++) {
 sum += ((Integer) 
                    (array[i])).intValue();
 }
 // And give it time to finish the 
                    JIT compile inthe background.
 try {
 Thread.sleep (fSleep);
 }
 catch (InterruptedException e) {}
 
 // Then do the loop for real and time 
                    it.
 long t1 = System.currentTimeMillis();
 for (int j=0; j < fOuter; j++) {
 sum = 0;
 for (int i=0; 
                    i < fNum; i++) {
 sum 
                    += ((Integer) (array[i])).intValue();
 }
 }
 long t2 = System.currentTimeMillis();
 long elapsed = t2 - t1;
 array = null;
 System.gc ();
 return elapsed;
 } // testa
 
 /** Create ArrayList and fill elements with Integer 
                    objects.**/
 public long testal () {
 ArrayList al = new ArrayList (fNum);
 for (int i=0; i < fNum; i++){
 al.add (new 
                    Integer(i));
 }
 
 // Now test access times by looping 
                    through the array list
 // to access each Integer element.
 
 // First warmup the hotspot compiler.
 int sum = 0;
 for (int i=0; i < fWarmup; i++) {
 sum += ((Integer)(al.get 
                    (i))).intValue();
 }
 // And give it time to finish the 
                    JIT compile inthe background.
 try {
 Thread.sleep (fSleep);
 }
 catch (InterruptedException e) {}
 
 // Then do the loop for real and time 
                    it.
 long t1 = System.currentTimeMillis();
 for (int j=0; j < fOuter; j++) {
 sum = 0;
 for (int i=0; 
                    i < fNum; i++) {
 sum 
                    += ((Integer)(al.get (i))).intValue();
 }
 }
 long t2=System.currentTimeMillis();
 long elapsed = t2 - t1;
 al.clear ();
 al = null;
 System.gc ();
 return elapsed;
 } // testal
 
 /** Create ArrayList and fill elements 
                    with Integer objects. **/
 long testali () {
 ArrayList ali = new ArrayList 
                    (fNum);
 for (int i=0; i < fNum; i++){
 ali.add (i);
 }
 
 // Now test access times by looping 
                    through the array list
 // to access each Integer element.
 
 // First warmup the hotspot compiler.
 int sum = 0;
 for (int i=0; i < fWarmup; i++) {
 sum += ali.get 
                    (i);
 }
 // And give it time to finish the 
                    JIT compile in the background.
 try {
 Thread.sleep (fSleep);
 }
 catch (InterruptedException e) {}
 
 // Then do the loop for real and time 
                    it.
 long t1 = System.currentTimeMillis 
                    ();
 for (int j=0; j < fOuter; j++) {
 sum = 0;
 for (int i=0; 
                    i < fNum; i++) {
 sum 
                    += ali.get (i);
 }
 }
 long t2 = System.currentTimeMillis();
 long elapsed = t2 - t1;
 ali.clear ();
 ali = null;
 System.gc ();
 return elapsed;
 } // testali
 
 } // class TimeTest
 |  In the testv() 
                method, for example, we first populate a Vector 
                with Integer 
                objects. Then we begin to fetch from the Vector 
                and accumulate the sum. We are using Sun's Hotspot(tm) 
                compiler, which uses adaptive optimization, so we want to give 
                it time to warm up, we first perform a short summation 
                loop without timing. The number of warm up loops is specified 
                by one of the command line parameters. After warming up we sleep 
                a short while.  The purpose of the sleep is to relinquish the processor 
                so that the Hotspot compiler can finish a background compile of 
                the loop if needed. Finally we begin the timing loop by calling 
                  System.currentTimeMillis(), 
                followed by the loop itself (actually an inner and outer loop) 
                and then another call to System.currentTimeMillis() 
                so we can calculate the elapsed time. We then clear the Vector, 
                set the reference to null, and request garbage collection (System.gc 
                ()). The elapsed time is returned to doTests() 
                where it is printed to the console.  The command line parameters adjust the number of 
                warm up loops, the sleep time, the number of inner and outer loops 
                to be timed, and the total number of times to run the tests. If 
                you download this code you can experiment with the various parameters 
                to see what effect they have on the timing results. The results 
                can vary widely based on the platform in use. In a run of the program, via the command line we 
                set the five parameters as follows:   
              num warmup iterations 
                = 3000 sleep = 300 msecs
 num inner loop interations = 3000
 num outer loop iterations = 3000
 num test runs 
                = 4   
              The following four runs show typical output variations: 
                 
                  | C:\Java\...>java 
                    TimeTest 3000 3000 3000 3000 4 Testing with warmup = 3000, sleep = 10, num = 3000, outer 
                    = 3000, total = 4
 V       AL      AL<I>   
                    array
 1032    203     187     
                    47
 1031    219     187     
                    47
 1062    218     203     
                    110
 1031    203     187     
                    63
 
 C:\Java\...>java TimeTest 3000 10 3000 3000 4
 Testing with warmup = 3000, sleep = 10, num = 3000, outer 
                    = 3000, total = 4
 V       AL      AL<I>   
                    array
 1062    219     188     
                    46
 1406    328     343     
                    109
 1391    328     359     
                    110
 1407    328     360     
                    109
 
 C:\Java\...>java TimeTest 3000 10 3000 3000 4
 Testing with warmup = 3000, sleep = 10, num = 3000, outer 
                    = 3000, total = 4
 V       AL      AL<I>   
                    array
 1047    204     360     
                    110
 1406    219     
                    359     47
 1406    344     
                    343     109
 1406    344     
                    360     109
 |   
                For this case we see that the Vector 
              is slowest, the ArrayList 
              is up to four to five times faster than the Vector, 
              the ArrayList<Integer> 
              is varies from faster to slower than the non-generic version, and 
              the simple object array is fastest of all; up to 14 times faster 
              than the Vector 
              and 2-4 times faster than the ArrayList. 
             While the exact times will vary considerably from 
              platform to platform, these general relavtive timings will hold. 
              So from these tests you would clearly not want to use a Vector 
              in a time critical code segment and use a plain array where time 
              is paramount. So cleary we see that such timing tests can be valuable 
              even if the absolute numbers are not especially meaningful.  Other performance tips 
               Here are some tips on performance coding. See the 
              reference section below for links to a number of sites with extensive 
              lists of performance coding recommendations.  
               
                Local variables run faster than instance variables.
 
The long 
                and double 
                type variables typically require extra time to access and modify 
                since they contain additional bytes compared to int 
                and float. 
                However, significantly slower performance is not true of all JVMs 
                on all platforms since some take advantage of particular processor 
                capabilities.
 
" The JVM bytecode is weighted towards int 
                operations so use int 
                type except where you specifically need to use one of the other 
                types. Math operations with the shorter integer types are widened 
                in the JVM to int. 
                For example, a sum of two byte values results in an int 
                value (that may be cast back to a byte 
                if the result is to be stored into a byte 
                variable).
 
Use x += A; 
                assignment commands rather than x 
                = x + A; The first requires 1 instruction and the latter 
                4.
 
If you are concatenating lots of strings, use StringBuffer 
                and its append() 
                method instead of the String 
                "+" operator. If a Java 5.0 JVM is available, use StringBuilder 
                instead.
 
As demonstrated in the TimeTest 
                example above, the original Java utility classes like Vector, 
                 Enumeration, 
                and Hashtable 
                are slow because they use synchronization for thread safety. Although 
                synchronization overhead has been significantly reduced in modern 
                versions of Java, if you do not require synchronization, then 
                the newer utilities from the Java Collections Framework are normally 
                superior. The new classes ArrayList 
                and HashMap 
                generally replace Vector 
                and Hashtable, 
                respectively, and Iterator 
                is preferred over Enumeration.
 
 
                Avoid creating lots of objects if possible, 
                  especially in loop in which speed is important. Creating an 
                  object takes time and uses up memory resources.
 
 
                Avoid method calls in a loop as much as possible. 
                  That is, do your loops inside of methods rather than calling 
                  methods inside of loops.
 
 
                Performance can vary significantly among different 
                  JVMs for the same platform. Do timing measurements for all the 
                  JVMs that would potentially be used with your program
 References & Web Resources Latest update: Oct. 3, 2005 |