No Next Item No Next Conference Can't Favor Can't Forget Item List Conference Home Entrance    Help
View Responses


Grex Consumer Item 105: Superficial C/Java Comparison
Entered by remmers on Mon Oct 18 15:56:36 UTC 1999:

 I'm bored, so I think I'll post the following: 

 Here's a C program to count the number of whitespace characters in
 standard input:
 
        #include <stdio.h>
        #include <ctype.h>
        
        int main(void)
        {
          int c;
          int sp = 0;
          while ((c = getchar()) != EOF)
            if (isspace(c)) ++sp;
          printf("spaces: %d\n", sp);
        }
 
 
 Here's a Java program to do the same:
 
        import java.io.*;
        class JavaWhite {
            public static void main(String[] args)
            throws IOException {
                BufferedInputStream in = 
                    new BufferedInputStream(System.in, 1024);
                int c;
                int sp = 0;
                while ((c = in.read()) != -1)
                    if (Character.isWhitespace((char)c)) ++sp;
                System.out.println("spaces: " + sp);
            }
        }
 
 Running the C program on a 1Mb file and timing it with /bin/time gives
 the result:
 
 real        0.1
 user        0.1
 sys         0.0
 
 Reading the same input file, the Java program's times are:
 
 real        0.9
 user        0.8
 sys         0.1
 
 Comments?

6 responses total.



#1 of 6 by scott on Mon Oct 18 19:32:50 1999:

Interesting item for the Consumer conference.

Is this a compiled Java program?  


#2 of 6 by i on Tue Oct 19 00:57:43 1999:

My first thought was that the c program should enjoy a much larger speed
advantage over the java.  A peek at the c code shows why it's "slow" - 
it's optimized for source code compactness & readability.  (Though 
counting "whitespace characters" (there's several of 'em) instead of
just spaces makes it much slower on 'most any general-purpose CPU.)


#3 of 6 by remmers on Wed Oct 20 18:52:06 1999:

(Opps!!  I meant to post this in "jellyware", not "consumer".
Well, feel free to comment anyway. You might turn this into a
discussion the relative benefits of C and Java for consumers...)

<remmers blushes>

Out of curiosity, how would you make the C version faster?


#4 of 6 by i on Sat Oct 23 04:38:22 1999:

<i peeks at a few things & plays around>
It looks like the local version of C takes care of the first three
things i suspected might drag down your code's speed - overhead of
calling getchar(), overhead of calling isspace(), and non-lookup-
table logic in isspace().  That leaves the overhead of reading one
character at a time from stdin and (minor) bit-fiddling in the local
isspace()'s lookup table.  

Running your code against /vmunix, i got times of 1.3/1.2/0.1 by 
compiling with gcc -O2.  Hacking it to take input in large blocks
(with fread), use its own isspace() lookup table, and unrolling the 
innermost loop a bit, i got times of 0.7/0.5/0.1 (same gcc -O2).

Though my hacks double the speed, it's still managing less than
3MB/sec.  I'm not really impressed with the assembly code that 
gcc puts out, but i doubt that even hand-crafted assembly code
would get much more than another 2x speed-up.  Oh, well - i was
(wrongly) assuming grex's CPU's rated a few more MIPS than that. 


#5 of 6 by remmers on Wed Oct 27 17:38:14 1999:

Hm, that was some nice optimizing you did.

I entered a version of this item in the place I originally intended
for it -- the "jellyware" conference.  Discussion there is heading
off in a different direction, though.


#6 of 6 by i on Mon Sep 3 02:44:12 2001:

I briefly played with this again.  My idea was to build an array

unsigned char is16space[256*256],

initialize it to 

is16space[i] =  2 if isspace[i>>8] and isspace[i&0xFF],
                1 if      "        xor      "         ,
                0 otherwise,

and use that to sum over the input buffer twice as fast (by doing the
sum 2 bytes at a time vs. 1 at a time).  (Yes, may be *slower* if the
effective L1 data cache size isn't fairly large.)  My quick & dirty 
attempt abused typing more than gcc could stand, however, and i'm not
feeling interested enough tonight spend more time. 

One of these years i'll learn enough of the local assembly language to
do this "right".... 

Response not possible - You must register and login before posting.

No Next Item No Next Conference Can't Favor Can't Forget Item List Conference Home Entrance    Help

- Backtalk version 1.3.30 - Copyright 1996-2006, Jan Wolter and Steve Weiss