Sunday 29 July 2007

Why Ant?

Since I'm relatively new to the Java universe, I come with some preconceived ideas. One of them was that when you need to build something, you need to write some makefiles. So I never could understand why should the Java crowd need one another tool? Admittedly, make has got its quirks, but it is there, it's stable and it's understood. And the makefiles are human-readable! So I searched the internet for an answer. A typical statement would look like this:

Ant has been on the top of many a developer's list as the revolutionary tool that got them out of the world of make. For those of you not familiar with make, it'll be enough t say that it just isn't the best tool to use for building Java projects, since it isn't platform-independent and it isn't that easy to use.*
Not easy to use??? Come angain? What about Ant's XML-disease:
Ant ... uses an XML configuration file, the infamous build.xml. Ant has enjoyed heavy popularity with its many advantages, but it also has some drawbacks. The build.xml files can be extremely terse, and their use requires the developer to learn the syntax up front.*
...the disease of forcing humans to use a machine-oriented markup language to configure the build process! But the main gripe, as I understood it, was make's usage of tabs to demarcate the commands to be invoked for specific rules:
Makefiles are inherently evil as well. Anybody who has worked on them for any time has run into the dreaded tab problem. "Is my command not executing because I have a space in front of my tab!!!" said the original author of Ant way too many times. Tools like Jam took care of this to a great degree, but still have yet another format to use and remember.**
You kidding? I personally didn't have any problems with tabs, never ever. That cannot be the reason for one another tool! Only when I read this I understood:
Gnu make+Unix, windows and NMAKE, etc. While single-IDE or single-platform development worked in small, in-house projects, it didn't cut it for open source dev, where one person may have a solaris box with make on, but the other developer is on a windows system, another on a version of debian-unstable they built themselves. There's is/was not enough unity of infrastructure to stabilise on tools, so James Duncan Davidson had to write one. Ant. ***
Well, the real reason is the portability: the problem, which JVM solved for us. But as I see it, there is another side to this. Did you notice the phrase "the other developer ... on a windows system"? That's it: Microsoft is killing make, is killing the Unix legacy with Java's hands! Because there's no make installation on Windows, not because make isn't platform independent (it' written in C, you have only to recompile it). And because it's rm not del, and dir not ls. Microsoft couldn't kill Unix toolchain all alone, but ironically Java helped here a great deal by defining a new level of abstraction, which requires new tools. Ok, at last, the riddle is solved...

But wait, there's a moral in this story too! Strictly speaking even two morals. First: there's a lot of half-truths on the internet, and second: the truth isn't easily revealed, but it pays to search for it.

---
* http://www.onjava.com/pub/a/onjava/2006/03/29/maven-2-0.html
** this I understood as a received wisdom coming right from the source: http://ant.apache.org/
*** http://www.1060.org/blogxter/entry?publicid=0E729BD9B8A06F372CC136D402810C82

Saturday 28 July 2007

Google Solvers: Python vs. Perl, Java and C++

It happened a couple of months ago. My distinguished colleague Stefan Z., a dyed in the wool C/C++ developer, educated on Dijkstra, Knuth and Wirth, wanted to learn Python to extend his horizon. As he's already got the basic free Python books from the web, he proceeded to coding of a simple example. He chose the Google problem: find the solution(s) of the equation "wwwdot - google = dotcom"*, and coded the greedy (or is it the brute-force?) solution. He showed it too me, and true to the motto "Doing science for fun and profit ;-)" I guessed it would be interesting to compare this code for speed and for visual appeal to its Perl equivalent. And then to Java.

Python code looked like this:
#!/usr/bin/python

print "Program Start."

for W in range(10):
for D in range(10):
for O in range(10):
for T in range(10):
print "looking for W=", W, "D=", D, "O=", O, "T=", T
for G in range(10):
for L in range(10):
for E in range(10):
for C in range(10):
for M in range(10):
a = 100000*W+10000*W+1000*W+100*D+10*O+T
b = 100000*G+10000*O+1000*O+100*G+10*L+E
s = a - b
r = 100000*D+10000*O+1000*T+100*C+10*O+M
if s == r:
print "FOUND the solution: a=", a, "b=", b, "s=", s, "r=", r
print " W=", W, "D=", D, "O=", O, "T=", T, "G=", G, \
"L=", L, "E=", E, "C=", C, "M=", M

print "Program End."
short and to the point!

The Perl code like that:

#!/usr/bin/perl -w

print "Program Start.\n";

for $W (1..10) {
for $D (1..10) {
for $O (1..10) {
for $T (1..10) {
print "\nlooking for W=", $W, "D=", $D, "O=", $O, "T=", $T;
for $G (1..10) {
for $L (1..10) {
for $E (1..10) {
for $C (1..10) {
for $M (1..10) {
$a = 100000*$W+10000*$W+1000*$W+100*$D+10*$O+$T;
$b = 100000*$G+10000*$O+1000*$O+100*$G+10*$L+$E;
$s = $a - $b;
$r = 100000*$D+10000*$O+1000*$T+100*$C+10*$O+$M;
if ($s == $r) {
print "\nFOUND the solution: a=", $a, "b=", $b, "s=",
$s, "r=",$r;
print "\n W=", W, "D=", D, "O=", O, "T=", T, "G=", G,
"L=", L, "E=", E, "C=", C, "M=", M;
}
}
}
}
}
}
}
}
}
}

print "\nProgram End."
so it wasn't that unreadable: actually a little more readable than Python by its usage of direct range expressions and curly braces to denote scopes. On the other side the brace doesn't add any useful information in this special example but can't be omitted, so maybe Python isn't so bad after all?

In Java code I skipped unneccesary braces a thing which... would not be possible in Perl:

public class GoogleSolver
{
public static void main(String[] args)
{
System.out.println("Program Start.");
int a=0, b=0, s=0, r=0;

for (int W=1; W <11; W++)
for (int D=1; D <11; D++)
for (int O=1; O <11; O++)
for (int T=1; T <11; T++)
{
System.out.println("looking for W="+ W+ " D="+ D+ " O="+ O+ " T="+ T);
for (int G=1; G <11; G++)
for (int L=1; L <11; L++)
for (int E=1; E <11; E++)
for (int C=1; C <11; C++)
for (int M=1; M <11; M++)
{
a = 100000*W+10000*W+1000*W+100*D+10*O+T;
b = 100000*G+10000*O+1000*O+100*G+10*L+E;
s = a - b;
r = 100000*D+10000*O+1000*T+100*C+10*O+M;
if (s == r)
{
System.out.println("FOUND the solution: a="+ a+ "b="+
b+ "s="+ s+ "r="+ r);
System.out.println(" W="+ W+ "D="+ D+ "O="+ O+ "T="+
T+ "G="+ G+ "L="+ L+ "E="+ E+ "C="+
C+ "M="+ M);
}
}

System.out.println("Program End.");
}
}
so ist wasn't overly verbose except for System.out.println and the "public static main void" beast. Now the measurements on my Pentium IV (ca 3 GHz) with Redhat Enterprise Linux 3:

Perl: Program End. 2526.790u 0.310s 42:11.34 99.8% 0+0k 0+0io 244pf+0w
Python: Program End. 4494.230u 0.330s 1:15:00.51 99.8% 0+0k 0+0io 337pf+0w
Java: Program End. 38.847u 0.150s 0:38.86 100.3% 0+0k 0+0io 1pf+0w

Pretty interesting stuff here: Perl outperformed Python twice and Java was lightning fast! It seems Python's VM implementation isn't very sophisticated. OK, these are no scientifical experiments, but the numbers are reproducible and telling: Perl is pretty fast, and usable for some number crunching, Python is painlessly slow :-(, and Java is just good! Both Python and Perl are installed in a standard way in the /usr/bin directory, so there's no unfair network overhead involved! And Java deserves praise, considering that I used the most inefficient way of constructing the output strings!

Ok, I thought, now let's try something really fast, and I quickly recoded the algorithm in C++. Here I skipped unneccesary braces as well:

#include <iostream>
using namespace std;

int main()
{
cout << "Program Start.\n";
int a=0, b=0, s=0, r=0;

for (int W=1; W <11; W++)
for (int D=1; D <11; D++)
for (int O=1; O <11; O++)
for (int T=1; T <11; T++)
{
cout << "\nlooking for W="<< W<< "D="<< D<< "O="<< O<< "T="<< T;
for (int G=1; G <11; G++)
for (int L=1; L <11; L++)
for (int E=1; E <11; E++)
for (int C=1; C <11; C++)
for (int M=1; M <11; M++)
{
a = 100000*W+10000*W+1000*W+100*D+10*O+T;
b = 100000*G+10000*O+1000*O+100*G+10*L+E;
s = a - b;
r = 100000*D+10000*O+1000*T+100*C+10*O+M;
if (s == r)
{
cout << "\nFOUND the solution: a="<< a<< "b="<< b<< "s="
<< s<< "r="<< r;
cout << "\n W="<< W<< "D="<< D<< "O="<< O<< "T="
<< T<< "G="<< G << "L="<< L<< "E="<< E<< "C="
<< C << "M="<< M;
}
}
}

cout << "\nProgram End.";
}
The code is little less verbose than of Java example, but will it run better? For all we know it should! Then I run my measurements and got a little shock:

C++: Program End. 37.420u 0.060s 0:37.92 98.8% 0+0k 0+0io 197pf+0w
Java: Program End. 38.847u 0.150s 0:38.86 100.3% 0+0k 0+0io 1pf+0w

So C++ was better, but by a very small margin!** You can see the JIT technologiy is working very good for this example: the JIT compiler must run only unfrequently and then the machine code is invoked all the time! Ok, so maybe this marketing drivel about Java speed being the same as C++ isn't all that wrong? Admittedly, C++ streams aren't the latest cry in performance, but I wanted to write standard code, without tweaks for optimization...

But wait, I thought. I remember some article on whole program optimization, and it said that Java people think C++ cannot optimize and that only JVM can. Think, think... Did you used the -O switch??? Of course I didn't, so the measurement is pretty unfair on C++ here. I repeated the measurement with gcc's -O2 optimization:

C++: Program End. 8.030u 0.010s 0:08.41 95.6% 0+0k 0+0io 199pf+0w

OK, that was reeeaaaaally fast, indeed.

So recapitulating: Perl was 2 times faster than Python, Java 2 orders of magnitude faster than both of them and C++ 1 order of magnitude faster than Java. So no surprises here. Except for Python's bad performance. So programming in the dynamically typed languages isn't always so much fun!*** In C++ you know when you are using language construct, that tey'll be rather fast, and when you are using STL you'll have your big-Oh complexity documented. But when you're using Python you don't know anything! Or you must google for implementation details of lists, tuples etc, and do some code tweaking. By the way, does anyone know why Perl is faster than Python? Does its VM do some instruction cashing that Python's doesn't? Would Jython be faster, as it can enjoy benefits of the massively optimized JVM? Maybe I should try it out...

----
* This is typically Google. They adversized one time with this line: "www.{first 10-digit prime found in consecutive digits of e}.com" or similar. The idea was to to find the people who can solve the problem inside of {}, i.e. use the prime distribution theorem, and hire them.

** On another machine it was even a little slower:
C++: Program End.44.552u 0.077s 0:44.64 99.9% 0+0k 0+0io 0pf+0w

*** see for example Uncle Bob's:http://www.artima.com/weblogs/viewpost.jsp?thread=4639

Thursday 19 July 2007

C++ sucks

Ok, I admit it's rather an unusual title for the first entry in a blog that's supposed to go against the fads of the hour! Well, let me explain...

Lately, I was porting some code of a major telco player from their standard hardware to a new board. Admittedly, it was embedded code, but nothing hard-realtime an no microcontroller stuff - just a totally regular PowerPC processor.

And this code sucks! It's not even that bad, it's object-oriented and has some separation of concerns, but it's all huge classes, huge methods, no efforts to avoid repetitive code, lot of copy and paste: you'd think it's Java, with its lack of possibilities to invent some clever shortcuts and hide the boilerplate code! This was C++ code and it sucked: ergo does C++ suck? I've never thought I'd say that, but really, I've never been so annoyed with C++ code (or any other code for that matter). Everything was so straightforward, so a=b+c, so "churn out the LOCs", so unimaginative! It reminded me somehow of bloated, repetitive and vacuous Java syntax, but it was much worse than any bad Java code I've seen. I could just see missed design opportunuties.

Sorry, it shouldn't be a rant about Java (perhaps only a little bit about its syntax...) but about C++ code. The question here is: can a programming language stop an unimaginative programmer from writing ugly code? There are two canonical answers to this question: the modern one - yes it can, through a clean language design, and the oldschool, hardcore one - no, you can write garbage in each programming language. They have both both some merit, but my private view is a different one. I think, that with a language like C++ you can always hide the ugliness away, using the proverbial "additional layer of indirection", i.e. defining some higher level abstractions instead of thinking in straightforward, procedural ways. This is the lesson that OO taught us, and a basic one! So you cannot put the blame on the programming language's design, it's rather a problem of lack of freedom in your thought, of getting bogged down with the details, of fear of simplicity. I'm not saying here that you can write garbage in each language, as some languages are better at generating it than another. I'm saying that a better language is only one half of the deal, and we must add the another half.

Ok, but there's more to it. There's another aspect to the title of this entry: do typed languages suck? And further: is iterator pattern a sign of weakness of typed languages? Are design patterns just workarounds (like J2EE patterns obviously are)? But more about it in next installments of this blog.

Wednesday 18 July 2007

intro

Hi!

This will be my technical blog: I hope to be able to share a couple of thoughs on software and programming languages (hence the unimaginative title) in rather an irregular manner. This blog is related to my freelancing work done under the name http://www.ib-krajewski.de/ in Munich, Germany.

I'm in first line a C++ kind of a guy: doing it for a quite long time and liking it rather well. But as my interest lies in the technology rather than in evangelism, I was as well enjoing (or not) programming in Java, Perl, Python and some other less known languages (you're right, Ruby is not on the list). Technologically I come from academia and the research on distributed systems and networks, later enhanced with network management, client-server systems, and somentimes even web applications. I liked designing frameworks for other people and objects for my own systems. I hope I'll be able to say something new about languages, programming and aestetics, but it may as well degenate into rants... So be forewarned!

Marek