Monday, 31 December 2007

Some jokes for the new year

Happy new year everybody!

What about some (programming) jokes for a smooth beginning of a new year? Let's start with an old one:

How many software developers does it take to change a lightbulb? 10 to discuss the requirements, 10 more to do the analysis, 10 more to do the design, and one to write the code, 12 years later.
Well, that's definitely a "waterfall-model" one! Do you know any XP jokes? I don't think there are any, as Agility (with the capital A!) is much too cool nowadays to be joked about ;-)! So, lest you think I'm a hopeless oldtimer, here is a more modern one, actually a cartoon* :

Well, a discussion could begin about Java, its low-level interfaces, its megatons of frameworks, etc... but not this time.

And here's another one, this time a more higher-order, functional one (Haskell, if I'm right)**:

Isn't it cute?

And, at last, have a look at a day in a programmer's life: http://www.muza.pl/7001.html?cc=0.6113369317218645&mode=image&img=0#gallery. It's really hard sometimes!!!


PS: If you know some XP or Agile jokes, please let me know!

---
* sorry, don't remember the source :-((
** source: http://arcanux.org/lambdacats.html, thanks to Philip Wadler ;-)

Saturday, 15 December 2007

C++ pocket lambda library


What I planned to do at first was to write an article for my old homepage describing my private C++ lambda library implementation, as I wanted to jazz up my website (and to brag a little too...). And it was only because I couldn't get that article written that this very blog was born. So at last I'm going to do what I should have done long ago, despite the pre-Christmas madness and the fact that I've got to find a new project in the middle of the holiday season!

1. The Problem


I like STL. But one thing always drives me crazy! In C++ you cannot write this:
  for_each(v.begin(), v.end(),
           new class<T> Homomorphoize { void operator()(T* t) { do_it(t); } }() );
nor even that:
  class<T> Homomorphoize { void operator()(T* t) { do_it(t); } };
  for_each(v.begin(), v.end(), Homomorphoize() );
You must declare the doer functor Homomorphoize in the global scope to be able to use it in the for_each template instantiation. Yada, yada, yada? Let's see what it means in practice. One of my C++ files looked roughly like this:
  //--- helper classes: no inner classes in C++ :-((
  ...
  class Finalizer
  {
    public: void operator()(Worker* w) { ...
    ...
  };
  class Starter
  {
    void operator()(Worker* w) { ....
    ...
  };
  class NotifSrcFilter : public Worker::TestEventObj
  {
    virtual bool test(D_ev* ev) const { ....
  };
  
  // --- main
  int main() {
    ...
    // and start
    for_each(m_workers.begin(), m_workers.end(), Starter());
    ...
And as I like the STL rather a lot, there will always be a couple of such classes at the start of each of my C++ files. So what is so wrong with it? Certainly you could live with this, couldn't you? Well, I don't like the separation of code parts pertaining to a single task. Typically the functors defined this way are rather short and will be used in one single place. So IMHO it's a waste of time looking up the functor definition from the place where it is used in some STL algorithm. And it always annoys me when I'm reading the code: the long prelude which must be skipped to arrive at the real code!

2. The solution


So what could be done? There are several alternatives.
  1. Write the code in Perl, Python, Groovy, etc.: these languages have some support for lambda expressions, and many of us would readily jump at the occasion; however, in a commercial environment it is not always an option. And besides, not everything would be possible in Python, for example, as it doesn't support full-blown lambda functions, only lambda expressions! In Java there are anonymous classes and they can be used to do exactly that, but they aren't full-blown closures either*.

  2. Write a general functor which can be parametrized before it's used in an STL algorithm: I think I've seen such an implementation in the "Imperfect C++" book, but I was not really convinced. For brevity's sake I won't discuss this technique here.

  3. Use the Boost lambda library: it's exactly what we need here! However... it's not part of the standard C++ library, and so it's sometimes not an option in a commercial environment! And it's big: did you look into the code? It includes other parts of Boost and is not very comprehensible (I didn't understand much)! We'll see later that this is inevitable as the complexity of the library increases, but I don't want to do everything that's possible! There is only a small subset of the general lambda functionality which I'd use in many programs.
So the result of this investigation was a decision to write a small, lightweight and comprehensible lambda library for my own use. I simply wanted to clean up my code even in a Boost-free environment!

The first thing was to get the requirements right: in this case, to decide what the most frequent use cases are. I had a look at my code, and my initial idea was to support some basic things like:
  find_if(vec.begin(), vec.end(), lambda(_$1 <= 2));
  for_each(vec.begin(), vec.end(), lambda(cout << _$1 <<  "\n"));
  sort(vec_ptr.begin(), vec_ptr.end(), lambda(*_$1 <= *_$2));
  find_if(vec_ptr.begin(), vec_ptr.end(), lambda(_$1->getValue() != 7));
  for_each(vec_ptr.begin(), vec_ptr.end(), lambda(delete _$1));
As I said, we don't want a big, complete lambda library, only a pocket edition. Here you can see the lambda() function notation, an idea which I abandoned rather soon. Frankly, I didn't know how to implement it. I knew from general descriptions of the Boost lambda library that the lambda expression should be generated by the compiler via operator overloading, not via a function**. So let's continue with the operator overloading idea. But how exactly should the overloading code work?

Then Bjarne came to the rescue: he presented the following technique back in 2000***:
  struct Lambda { };  // placeholder type (to represent free variable)
  template<class T> struct Le // represents the need to compare using <=
  {
     // ...
     bool operator()(T t) const { return val<=t; }
  };

  template<class T> Le<T> operator<=(T t, Lambda) { return Le<T>(t); }
  
  // usage:
  Lambda x;
  find_if(b,e,7<=x); // generates find_if(b,e,Le<int>(7));
                     // roughly: X x = Le<int>(7); for(I p = b; p != e; ++p) x(*p);
Do you see how simple it is? The Le<> class represents the comparison and is generated whenever a comparison is needed. How does the compiler know it is needed? Through the templated <= operator, which is invoked as soon as the compiler sees the usage of a Lambda-typed variable. So the Lambda variable is just bait for the compiler, to lure it into generating our special Le<> class! This is the reason why it's referred to as a placeholder variable.

Having clarified that, we can write code to do the following:
  find_if(vec.begin(), vec.end(), _$1 <= 2);
  find_if(vec.begin(), vec.end(), _$1 == 1);
OK, we've tacitly made some design decisions: we won't use the lambda() notation, and we don't need to declare the placeholder variable, as it'll be part of the library. Moreover, we decided how the placeholder variables (yes, there will be more of them) are named. As I come from the Unix and shell tradition, the $1 syntax seemed more natural to me than the functional languages' "_" placeholder as used in Boost. But the $1 syntax made my assembler suffer (sic!) and I decided on the middle ground: _$1!
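
To make this concrete, here is a minimal sketch of how Bjarne's trick translates into the _$1 notation for the first use case above. It is deliberately simplified (the real placeholder<> class with its lambda_expr base appears in the next section), and the names placeholder1/LeVal are made up just for this sketch:

  struct placeholder1 { };            // the _$1 bait type
  placeholder1 _$1;

  template <class T> struct LeVal     // represents: element <= value
  {
    T val;
    LeVal(T t) : val(t) { }
    bool operator()(T arg) const { return arg <= val; }
  };

  // overload <= for "placeholder on the left, value on the right":
  template <class T> LeVal<T> operator<=(placeholder1, T t) { return LeVal<T>(t); }

  // usage: find_if(vec.begin(), vec.end(), _$1 <= 2);
  //        generates find_if(vec.begin(), vec.end(), LeVal<int>(2));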

3. The fun begins


Encouraged by the initial success, we'd like to write more lambda expressions using the same basic technique. Let us try to implement this:
  vector<int*> vp(10);
  find_if(vp.begin(), vp.end(), *_$1 );  // hack: means find if not 0
i.e. we'd like to dereference the current element (a pointer) before using it. Well, this requires a somewhat different approach: an operator on our placeholder variable. Not very difficult stuff. We extend our placeholder class like this:
  template <int Int> class placeholder : public lambda_expr
  {
    // ...
    public: Deref operator*() { return Deref(); }
  };

  placeholder<1> _$1;
  placeholder<2> _$2;   // etc...
So the * operation on a placeholder returns an instance of the Deref class, which looks like this:
  struct Deref : public lambda_expr
  {
      Deref() {}
      template<class T> T operator()(T* t) const { return *t; }
  };
i.e. it will, in turn, dereference its argument (the element handed over by the STL algorithm) when invoked via the function call operator! Simple! In the same manner we can define an Addr<> class and overload the placeholder's address-of operator, which allows for the following code:
  // init a vector of integers
  vector<int> v10(10);
  for_each(v10.begin(), v10.end(), _$1 = 7);
  
  // construct an index vector
  vector<int*> vp(10);
  transform(v10.begin(), v10.end(), vp.begin(), &_$1);  // store ptrs to v10!!
  sort(vp.begin(), vp.end(), *_$1 <= *_$2);  // now sort the index instead of the data
Cool, we have obtained pointers to all elements of v10 and stored them away! And before that we initialized the source vector's elements to the value 7! How did we do it? As the astute reader will probably know, we defined an Assgn<> class, which is returned by the overloaded assignment operator of the placeholder<> class, plus the Addr<> class already mentioned (a sketch of both follows at the end of this section). The sort using dereferenced comparison is a piece of cake for us! Or is it? Well, here we have placeholders on both sides of the comparison. In math-speak, we no longer need a unary operator on placeholders (like *, & or ++), we need a binary one! So let's define it:
  struct Le2 : public lambda_expr
  {
      Le2() { }
      template<class T> bool operator()(T a, T b) const { return a <= b; }
  };

  Le2 operator<=(placeholder<1>, placeholder<2>) { return Le2(); }
Not really different from what we've had before, is it? But wait, didn't we forget something? What about dereferencing the pointers before the comparison? Well, we need one more layer for that:
  template <class S, class T> struct Le2_forw : public lambda_expr
  {
    S e1;
    T e2;

    Le2_forw(S s, T t) : e1(s), e2(t) { }
    template <class U> bool operator()(U a, U b) const { return e1(a) <= e2(b); }
    // ...
  };

  template <class S, class T>
    Le2_forw<S, T> operator<=(S s, T t) { return Le2_forw<S, T>(s, t); }
Now we are using the forwarder Le2_forw<> to store whatever expressions were used on both sides of the <= comparison, remember them and then invoke them when the final comparison is done. Uff! OK, but surely this one will work without forwarding layers:
  find_if(vp.begin(), vp.end(), *_$1 == 1);
Or at least it should: we have defined the operators for comparing placeholders to a value and for dereferencing them! Well, no: in the simple form in which they are coded, they cannot be combined! We need to write an extra class to handle this combination:
  template<class T> struct EqDeref : public lambda_expr
  {
    T val;
    EqDeref(T t) : val(t) { }
    bool operator()(T* t) const { return val==*t; }
  };

  template<class T> EqDeref<T> operator==(Deref, T t) { return EqDeref<T>(t); }
Exactly as in the forwarding case above. A little annoying, isn't it? We need forwarders and/or combined operators en masse! But on the other hand, it's still simple: annoying, but simple. When you compare this to a more advanced, layered, orthogonal implementation, there is one clear advantage: simplicity. If some expression doesn't compile, I can extend the code without any problems. It's a simple library.
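
For completeness, here is a rough sketch of the Addr<> and Assgn<> classes mentioned above, in the same style as Deref (a simplified reconstruction, not the exact code of my library; it assumes the placeholder<> and lambda_expr machinery shown earlier):

  struct Addr : public lambda_expr
  {
    template <class T> T* operator()(T& t) const { return &t; }
  };

  template <class T> struct Assgn : public lambda_expr
  {
    T val;
    Assgn(T t) : val(t) { }
    template <class U> void operator()(U& u) const { u = val; }
  };

  // added inside placeholder<Int>:
  //   Addr operator&() { return Addr(); }
  //   template <class T> Assgn<T> operator=(T t) { return Assgn<T>(t); }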

4. And the fun ends


Everything is so simple! Let's try some more easy lines:
  for_each(v.begin(), v.end(), cout << "--- " << _$1 << "\n"); 
Uh-oh... Now we don't even have a binary or ternary operator - we have a unary operator which can be chained and so effectively becomes an n-ary one, where n is unlimited!

Well, how about this:
  struct XXX { int value; bool isSeven() { return value == 7; } };
  vector<XXX> vx(10);
  find_if(vx.begin(), vx.end(), _$1.value == 7);
  find_if(vx.begin(), vx.end(), _$1.isSeven()); 
We cannot overload the . operator in C++ as of today. But this is obviously a very useful piece of functionality, so maybe we can come up with something similar?
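
One workaround that comes to mind (a sketch only, not part of the library as described so far) is to fall back on the ->* operator, which can be overloaded, and to accept a member function pointer instead of the nicer dot syntax:

  template <class R, class C> struct MemFnCall : public lambda_expr
  {
    R (C::*pmf)();
    MemFnCall(R (C::*p)()) : pmf(p) { }
    R operator()(C& c) const { return (c.*pmf)(); }   // call the member on the element
  };

  template <int Int, class R, class C>
  MemFnCall<R, C> operator->*(placeholder<Int>, R (C::*pmf)())
  {
    return MemFnCall<R, C>(pmf);
  }

  // hypothetical usage: find_if(vx.begin(), vx.end(), _$1 ->* &XXX::isSeven);

Not as pretty as the dot, but it stays within the same machinery as the rest of the library.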

And then, let us be brave and try this:
  for_each(v10.begin(), v10.end(), if(_$1 == 1, _$1 = 7));
  for_each(v10.begin(), v10.end(), if(_$1 == 7, ext_counter++)); 
i.e. to execute an action depending on the value of the current element under the iterator. And what about function currying and closures? Arrgh, problems... But they can be solved. Look out for part 2 of this article!

PS: As you can see, there is room for improvement in the implementation, but I only wanted to get the code to work.

---
* you can write this in Java:
  Collections.sort(anagrams,
                   new Comparator<List<String>>() {
                         public int compare(List<String> o1, List<String> o2) {
                           return o2.size() - o1.size();
                         }
                       });
cool, I like it! But you cannot change any variable from the outer scope, as there are no closures in Java (∀ j∈J: j <= 6)! So the ext_counter example from above wouldn't work, and besides: the Java algorithms library is no match for the STL! (comments encouraged...)

** Now I think that it would perhaps be possible to add the lambda() notation, but only to discard it in the process and go on with operator overloading.

*** Speaking C++ as a Native (Multi-paradigm Programming in Standard C++): http://conferences.fnal.gov/acat2000/program/plenary/STROUSTRUP.pdf.

Sunday, 4 November 2007

Web Services, SOA, and the (r)REST


OK, it's time to talk about SOAP and web services (and many other things ;-). Why now? Well, I want to write about it before I start reading Nicolai Josuttis' book about SOA, just to record my current (and possibly superficial) impressions.

Because Nicolai is known to me as the author of fine C++ books, I expect rather a lot from this one, and suspect that reading it will shatter some of my safe and secure beliefs. If that turns out to be so, I'll write a new post to restore the honour of SOAP. But now, ad rem! Down to business!

I remember how a couple of years ago I was receiving one email after another: new SOAP-based protocol A invented, new SOAP-based protocol B approved, new SOAP-based protocol C proposed. And I was wondering - why are people reinventing the wheel for the n-th time after RPC, DCE and CORBA? Why so many protocols? One is daunted by the sheer number of needed specs, regulative bodies, and documents. Wasn't it supposed to be about simplicity? It's the Web! Here I'll cite my first source - "Dave Podnar's 5 Stages of Dealing with Web Services"*:
  1. Denial - It's Simple Object Access Protocol, right?
  2. Over Involvement - OK, I'll read the SOAP, WSDL, WS-I BP, JAX-RPC, SAAJ, JAX-P,... specs. Next, I'll check the Wiki and finally follow an example showing service and client sides.
  3. Anger - I can't believe those #$%&*@s made it so difficult!
  4. Guilt - Everyone is using Web Services, it must be me, I must be missing something.
  5. Acceptance - It is what it is, Web Services aren't simple or easy.
So, as it seems, it was about simplicity in the beginning, and then it got more and more complicated. Why? As a SOAP book author** said: "Web Services are hard because they are a form of distributed computing and distributed computing is just about the hardest problem in computer science"!

Well, of course it's about distributed computing, but let me make my point here: web services got so complicated because they are too low-level a model of distributed computing! SOAP is basically RPC done with XML, and RPC deals with location-transparent procedure calls. This paradigm never had much success - look at the demise of IIOP and RMI - because it's too fine-grained. Compare this remote call model to another model of distributed computing - Google's MapReduce system (to my mind the most successful distributed computing system today)***. It works at a totally different level: it's about gathering and transforming information in a distributed manner, hiding away details like remote calls and node failures!

Now enter REST (representational state transfer)****, yet another model for distributed computing. Let me explain it to those of you doing embedded programming or databases: REST is the principle the Web itself works by. There is a set of resources which we can access using URLs, i.e. knowing only their location. So, instead of using UDDI+WSDL+SOAP we just send a plain HTTP request - something like a GET on http://some.site/blogs/123/entries (the URL is made up, of course), or a POST to the same URL carrying the entry data - and we'll get the data (usually XML) in response, or we'll create a new blog entry. Simple? Yes, it is! Compare it to the 5 stages of web services!

Well, one must admit, there are a lot of public SOAP interfaces on the Web at the moment: look at Google's and Yahoo's maps, Amazon's bookstore and eBay's auctions. But these interfaces are increasingly perceived as old-fashioned, because RESTful services scale better and are simpler to use. For example, Google discontinued its SOAP-based search API and is rumored to be striving to get away from SOAP-based APIs entirely. As for Yahoo and Amazon, which offer SOAP and REST based APIs in parallel, 85% of their users allegedly choose REST.

This is perhaps a good place for a well-known quotation. It has already made its rounds over the Web, but I'll use it here nevertheless, as it is simultaneously fun and instructive*****:
"WS-* is North Korea and REST is South Korea. While REST will go on to become an economic powerhouse with steadily increasing standards of living for all its citizens, WS-* is doomed to sixty years of starvation, poverty, tyranny, and defections until it eventually collapses from its own fundamental inadequacies and is absorbed into the more sensible policies of its neighbor to the South.

The analogy isn’t as silly as it sounds either. North Korean/Soviet style “communism” fails because it believes massive central planning works better than the individual decisions of millions of economic actors. WS-* fails because it believes massive central planning works better than the individual decisions of millions of web sites."
Let me close this post with a little story. At BEA's Arch2Arch Summit in Frankfurt, Germany, this summer I attended a panel discussion about SOA with some high-level BEA people. BEA is a middleware company, Java-centric and very much into web services. One of the questions at the panel discussion was the simple and succinct: "Why do you think SOA is so good?". The answer of the president of the German subsidiary was: "Because it's about customer orientation!". The answer of BEA's chief architect was: "Because there's no programming language binding (as in CORBA) - the protocol is language-neutral!".
After hearing that I asked myself: is that really something new? Isn't the industry repeating itself? It was the same feeling I described at the start of this post, when I heard about new SOAP protocols: are people trying to solve something, or to generate more buzz and revenue? But my gut feeling may be wrong - admittedly SOAP is too low-level an abstraction, but SOA is building a layer on top of it. On the other hand, SOA is normally implemented using SOAP. So SOA=SOAP=WS-* in the end? And why does the corporate world love it? Is it doomed to failure like North Korea? Could we do SOA with RESTful services?

PS:
I updated my August-07 "iPhone Presentation" post. After I've read Nicolai's book, I'll probably update this one too.

---
* see http://rmh.blogs.com/weblog/2006/04/dave_podnars_5_.html and discussion http://www.theserverside.com/news/thread.tss?thread_id=40064
** books homepage (http://soabook.com/), and excerpt http://soabook.com/blog/?p=5
*** for example see the presentation: "Using mapreduce on Large Geographic Datasets (Barry Brumitt, Google)", http://video.google.com/videoplay?docid=6202268628085731280
**** defined in Roy Fielding's dissertation: "Architectural Styles and the Design of Network based Software Architectures, 2000", www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
***** see here for the post and discussion: http://cafe.elharo.com/xml/north-and-south/

Tuesday, 30 October 2007

What can you learn from the bad guys?

Really, is there something you can learn from them? Confucius said a long time ago: "I'm always glad to meet another voyager: if it's a pleasant man, I can learn what I can do better; if it's an unpleasant man, I can learn what I shouldn't do!". But can we learn something in a positive sense? In order to be able to answer this, let us return to the realm of IT.

What do the really bad IT guys do? They break into systems! They use a host of techniques, and one which is particularly interesting is "SQL injection" (I personally prefer the more colourful - if slightly incorrect - name "SQL hijacking"). Let's explain it with an SQL example. Suppose we build an SQL query string like this:
query = "SELECT * FROM users WHERE name = '" + userName + "'";
Now, if we supply a user name string like, say, "dummy' or name is not null --", the resulting SQL query looks like this*:
SELECT * FROM users WHERE name = 'dummy' or name is not null --'
And this will get you the list of all users in the system! If the submitted username is something like "xxx'; DROP TABLE users; --", the attacker can even get nasty on the database! Simple but effective!

So what, you'd say, how many bad guys will have direct access to my SQL database? Well, you'd be surprised! If you have a Web application which accesses an SQL database, the following scenario is conceivable. The bad guy invokes your SQL database indirectly by requesting a URL of the form:
your.site.de/your_app/ListUsersAction.do?ID=11;UPDATE+USER+SET+TYPE="admin"+WHERE+ID=23
which, when appended to the "SELECT * FROM users WHERE id=" string, will grant the attacker admin privileges! Or, when he requests a URL of the form:
your.site.de/your_app/ListUsersAction.do?ID=11;GO+EXEC+cmdshell('format+C')
the disaster is looming! I don't want to discuss this in detail, as this is not a hacking instruction page, but I hope you've got the general idea: this is a real problem for Web application development. Well, that's all very bad behaviour and all, but the question is: why should we learn anything from this?

Recently I was charged with extending a J2EE web application for one of my clients. They were using a proprietary JSP custom tag library**, whose sources were not available (unfortunately, in a big company that's not so unusual). The custom tag library was rather nice, but as the code was gone, it was not extensible. For example, I couldn't specify the onClick attribute for the custom button tag:

<somelib:button action="someapp/SomeAction.do" image="..." ... />
Can you see it? Think, think... Yes! Let us try some JSP/Javascript injection:

<somelib:button action="someapp/SomeAction.do \" onClick=\"alert('Gotcha!'); \" " image="..." ... />
which produces, when interpreted to HTML, a wonderful:

... url="someapp/SomeAction.do" onClick="alert('Gotcha!')" image="..."
The reason why this works*** is the fact that \" isn't interpreted as the end-of-string marker, as it is escaped with a "\"! So we have a kind of double bottom here - on one level we have a regular string, which, when interpreted, is transformed into something semantically different - that's the second bottom. Isn't it beautiful? On one level (tags) we have a meaningless string, which on the lower level (HTML) gets injected into the page code and starts living its own life! Like a seed coming to life after a hibernation phase. Since then I have used this technique in several other places, and it never ceases to amaze me. Look at this: <somelib:help key="start_task" tooltip="Help \" align=\"right\" "/>.

So now we can answer the title question: from the bad guys, we can learn how to extend a non-extensible JSP custom tag library! And yes, Confucius was right: we can learn from every person. Indeed.

---
* here "--" comments out the trailing apostrophe, but we might as well add some ineffective condition like "and name='yyy" to balance the apostrophes.
** for those of you not doing J2EE: a custom tag is a sort of macro generating HTML code when interpreted on the server side
*** this works on Tomcat, and as Tomcat is the reference implementation of a Servlet container... guess what.

Saturday, 20 October 2007

Some plans, or what's to be expected

...from this blog? As this blog has been stalling a bit recently (I began several articles in parallel and they are each in a different phase of completion), I felt it would be a good idea to restate the purpose of this blog. For me: to motivate myself to complete at last some of the most promising articles; and for my readers (yes, don't laugh, I know that nread >= 1!): to get some perspective on whether it'll be worth reading this blog next time.

When I started, I had two purposes on my mind:

1. to write about some of my technical work which I found interesting
2. to give some interesting vistas on some general topics in SW development - provided I can find some ;-)

So you can see this blog wasn't intended as uncoordinated blathering, but rather as a limited collection of articles: I simply haven't done that many interesting things or found that many new ideas. This can be explained by the fact that this blog is a sort of replacement for my old, nonfunctional, dilapidated homepage, which I really should update sometime soon. But these days starting a blog is much simpler than re-working an old homepage (it's Web 2.0 now!). So let's extemporize a list of topics I've got in the pipeline:

1. Technical topics

  • design and implementation of my own small lambda expression library in C++ (as it was fun)
  • modal logic and testing of multithreaded applications (as it shows the limits of knowledge)
  • using lock-free synchronization (as C++ programmers always worry about performance, maybe too much?)
  • subtle Java vs C++ differences (as I like language-design discussions)

2. General SW-engineering topics

  • why is everybody keen on scripting languages (sociology applied ;-)
  • Python's language design (coming to terms with Python hype)
  • waterfall and XP methods (a somewhat different comparison)
  • web services, SOAP and the (r)REST
  • does speed really kill in SW-engineering?
  • the new antipattern (some ruminations on singletons)
  • C++ myths in the Java community (as most of the Java people know C++ only from fairy tales, I think)
  • from DLL to XML hell (sometimes it must be a rant)
  • Idiot's sorts (I mean sort algorithms...)

etc...

OK, let's get real. I'd be happy to complete the posts on the first three topics from list 1 and the first four topics from list 2. In addition I'll sometimes do a "follow up" on some earlier blog entry. The one I definitely must write is a follow-up on the "Google Solvers"* post: there is definitely something that has to be added to that one! Sometimes I'll simply extend some older article if it isn't too long, for example the "iPhone presentation"** post: at the time of its writing I purposely avoided examining any of my guesses; now it's time to compare my intuitions with reality. And there are some other topics I've forgotten, but I may remember them again sometime. As my distinguished colleague Stefan Z. would say, "Software engineering is like war, ...", so I'll keep on fighting.

----
* http://ib-krajewski.blogspot.com/2007/07/google-solvers.html
** http://ib-krajewski.blogspot.com/2007/08/iphone-presentation-with-afterthoughts.html

Tuesday, 18 September 2007

shell as a "modern" programming language

I don't know how many of you read ACCU's Overload magazine, but I do. In the latest issue* we find an article by Thomas Guest with the flippant title "He Sells Shell Scripts to Intersect Sets", which demonstrates implementations of the set-theoretic operations (i.e. union, intersection, difference, etc.) in the UNIX shell language. As interesting a pastime as these techniques are, I was struck by the final words of the article:

"For me, it’s not just what the shell tools can do, it’s the example they set. Look again at some of the recipes presented in this article and you’ll see container operations without explicit loops. You’ll see flexible and generic algorithms. You’ll see functional programming. You’ll see programs which can parallel-process data without a thread or a mutex in sight; no chance of shared memory corruption or race conditions here.

.....the intent shines through as clearly as ever: we have a compact suite of orthogonal tools, each with its own responsibility, which cooperate using simple interfaces. We would do well to emulate this model in our own software designs."
The second part of the quote was not new to me: in fact I've heard it uncountably many (pun intended) times from my distinguished colleague Stefan Z.: a set of simple tools which can be combined through a common set of interfaces. You've probably read this many times before. But what I really liked is the mercilessly pointed-out (timeless?) modernity of the shell: its high-level programming model, its extensibility, its focus on concepts and on hiding details away. This stands in sharp contrast to the self-proclaimed "modern" programming languages of the day: Java and C# (or is this already Python?), which rather don't excel in this regard.

We so often take these features of shell programming for granted that it must sometimes be pointed out to make us think about it. These features are not a matter of course! After the day-to-day struggle with the low-level aspects of the "modern" programming languages, it sometimes pays off to turn your attention to the shell and contemplate its design once again (and dream of a better world ;-).

If you are a Java programmer, read it. You'll like it. It's a lesson in abstraction.

---
* Overload 80, August 2007: http://accu.org/index.php/journals/c78/


Reprise:


Speaking about set operations: relational databases are built around a set algebra (as relations are sets in, ehm, set theory), so the shell file operations could in principle be implemented with SQL! If only we could turn the Apache logs from the abovementioned article into relational tables...

Well, why not! Look at SQLite - an embeddable, zero-configuration, C-language database (it's used in Google Gears, for example). This database has one pretty nifty feature known as "virtual tables"*: you only have to write some callback functions which will be called by the SQLite database engine to read data stored in other formats. For example, you could treat the filesystem as a database table (or tables), i.e. fire SQL queries at it! So, leaving out all the technical details, it's possible with the help of virtual tables to scan our logfile using SQL! I guess that's rather cool.
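
Just to make the idea tangible, here is a small C++ sketch of how such a query could be fired from a program through the SQLite C API. The access_log virtual table (and its columns) is purely hypothetical - it would have to be implemented first, via sqlite3_create_module() and the xOpen/xNext/xColumn/... callbacks; sqlite3_open()/sqlite3_exec() are the regular API calls:

  #include <sqlite3.h>
  #include <cstdio>

  // callback invoked by sqlite3_exec() for every result row
  static int print_row(void*, int argc, char** argv, char**)
  {
    for (int i = 0; i < argc; ++i)
      std::printf("%s ", argv[i] ? argv[i] : "NULL");
    std::printf("\n");
    return 0;
  }

  int main()
  {
    sqlite3* db = 0;
    sqlite3_open(":memory:", &db);

    // assumption: the "access_log" virtual table was registered beforehand
    const char* sql =
      "SELECT client_ip FROM access_log WHERE status = 404 "
      "EXCEPT SELECT client_ip FROM access_log WHERE status = 200;";

    char* err = 0;
    if (sqlite3_exec(db, sql, print_row, 0, &err) != SQLITE_OK)
    {
      std::printf("query failed: %s\n", err);
      sqlite3_free(err);
    }
    sqlite3_close(db);
  }

A set difference (EXCEPT) on a logfile - exactly the kind of set operation the Overload article implements with shell tools.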

---
* http://www.sqlite.org/cvstrac/wiki?p=VirtualTables

Monday, 10 September 2007

Do you GoF?

My first encounter with design patterns in action was a memorable one. It was about 10 years ago, shortly after the GoF book appeared, and everybody who considered themselves somebody was supposed to speak about it. I too pored briefly over the book, but was not a very dedicated student of it, and did it rather out of a sense of obligation. Not by any means to be able to solve a design-patterns crossword (yes, things like that exist in the vast open spaces of the web*)!

But what about the announced encounter? Well, there was a guy on the project I was in at that time, and he used nearly every pattern he could. The trouble was, his program didn't work! This gave me the first impression of the design patterns: it's something for weirdos, something to brag about, nothing for the people who really want to get the job done!

Then I re-read the GoF book two or three times and found some of the techniques interesting, some not so much, and some rather dull. Now, 10 years later, rather surprisingly for myself, I have arrived at the opinion that I never really used the design patterns that much!!! I realized, for example, that the much beloved patterns of the Java community** never had much appeal to me. For example, I never used a factory pattern: why should I, though? If I need an object, I simply create it using its constructor. Nothing simpler than that! In the same way I never needed an Inversion of Control framework - I simply used Dependency Inversion, i.e. I parametrized the lower-layer objects from the upper-layer ones! I used the singleton pattern, but only to sort out the C++ problems with static object initialization, not to ensure a single instance of something. If you need it once, create it once, as Uncle Bob rightly said***. OK, I must confess my sins: I used singletons instead of global objects :-((, but that was my laziness, and - yes, you guessed it - nobody dared to gripe about this, as it was a "design pattern"! I never used the visitor pattern: it's simply too complicated, too confusing, and its variants (like the asynchronous visitor) don't make the situation any better. It's an "aha" moment when you realize that the visitor pattern is not about visiting but about adding new methods to classes!
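
To illustrate what I mean by parametrizing the lower-layer objects from the upper layer, here is a minimal C++ sketch (the class names are made up for the example):

  #include <iostream>

  // the interface the lower layer depends on...
  struct Logger
  {
    virtual void log(const char* msg) = 0;
    virtual ~Logger() { }
  };

  // ...a lower-layer class that knows nothing about concrete implementations...
  class Engine
  {
    Logger& m_log;
  public:
    explicit Engine(Logger& log) : m_log(log) { }   // parametrized by the caller
    void run() { m_log.log("Engine::run()"); }
  };

  // ...and the upper layer, which decides what to plug in:
  struct ConsoleLogger : public Logger
  {
    void log(const char* msg) { std::cout << msg << "\n"; }
  };

  int main()
  {
    ConsoleLogger log;
    Engine engine(log);   // no factory, no IoC container - just pass the dependency in
    engine.run();
  }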

And yes, I admit, there may be some problems where I'd use the abstract factory pattern, and there is always a problem where I will use the observer pattern, but here it is, my main point about GoF-ing: it is not that difficult to work out the equivalent solution on your own, really! Moreover, having a pattern given to you ready to use (i.e. ready to copy and paste) is actually pernicious, as you won't gain the deeper understanding of what you are doing, which can only arise from intensive occupation with the subject at hand. I know, the "deeper understanding" phrase sounds rather Harry Potter'esque, but I think it contains some truth.

So maybe the real value is that they establish a "pattern language" for the solutions of SW problems? I think so. You can refer to a portion of your code like "I use an observer for this and a singleton for that" and don't have to describe the solution in detail - only its main idea.

Alas, the design patterns are presented to developers rather as a catalogue of ready-made, off-the-shelf solutions to be copied and pasted every time you have a problem, or worse, every time you want to look smart to other people. And in this way they stifle creativity, stopping you from starting your own journey. But they were invented (if I'm interpreting Ch. Alexander's notion of the architectural pattern correctly) to let you start your journey from a higher level.

PS: There is another opinion on design patterns out there: Mark Dominus maintains that a design pattern is really an indicator of a flaw in a programming language, i.e. that a next-generation language will make it obsolete. His example is the iterator pattern. I must admit I still haven't given it enough thought. On the other hand, Rod Johnson makes a case against design patterns as well - in this case against the J2EE design patterns. He accuses them of being mere workarounds for the design flaws of the J2EE platform.

---
* design patterns crossword: http://www.vokamis.com/products/cword/app/enterGame.php?ns=/a/a&or=V&amp;amp;amp;amp;amp;h=128&pub=2&ex=http://www.softwaresecretweapons.com/jspwiki/Wiki.jsp?page=GangOfFourSoftwareDesignPatternsJavaScriptCrossword
** see for example Rod Jonson's book, Expert One-on-one J2EE Design and Development, Chap. 4: http://www.theserverside.com/tt/articles/article.tss?l=RodJohnsonInterview
*** Uncle's Bob blog: http://butunclebob.com/ArticleS.UncleBob.SingletonVsJustCreateOne

Sunday, 2 September 2007

servlets in C++ or 2 small C++ ideas

Recently I was reading Stefan Wörthmüller's article "C++ Versus JEE"*. Basically Stefan argues that:

"If you want to write a server application, you need control. Maybe even control about your failures. But you need control."
And that of course is what C++ will give you. For the details refer to the article. However another sentence got stuck in my mind:
"I still wonder why there are so few approaches for Web frameworks/toolkits or application servers written in C++. C++ is made for such a task."
So I've got this big ;-) idea: why shouldn't I take the Servlet specification, restate the interfaces in C++, and code the Servlet classes and (perhaps) a servlet container! Why not use the language which is known to me (and many other server-side programmers) and combine it with an interface which is known to many web developers (i.e. Java programmers), to get the best of the two worlds? This would be a big success!!! OK, I'll tell you why not: no spare time.
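
Just to show what I have in mind, here is a rough sketch of how the central interfaces could be restated in C++. This is my own hypothetical transcription of the Servlet idea, not any existing library's API:

  #include <string>
  #include <ostream>

  class HttpRequest
  {
  public:
    virtual std::string getParameter(const std::string& name) const = 0;
    virtual std::string getHeader(const std::string& name) const = 0;
    virtual ~HttpRequest() { }
  };

  class HttpResponse
  {
  public:
    virtual void setContentType(const std::string& type) = 0;
    virtual std::ostream& getOutputStream() = 0;
    virtual ~HttpResponse() { }
  };

  class HttpServlet
  {
  public:
    virtual void doGet(HttpRequest& req, HttpResponse& resp) = 0;
    virtual void doPost(HttpRequest& req, HttpResponse& resp) = 0;
    virtual ~HttpServlet() { }
  };

  // a concrete servlet would then look quite familiar to a Java web developer:
  class HelloServlet : public HttpServlet
  {
  public:
    void doGet(HttpRequest&, HttpResponse& resp)
    {
      resp.setContentType("text/html");
      resp.getOutputStream() << "<html><body>Hello from C++!</body></html>";
    }
    void doPost(HttpRequest& req, HttpResponse& resp) { doGet(req, resp); }
  };

The container part (socket handling, request parsing, servlet lifecycle) is of course where the real work would be.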

But I still think this idea is not bad. It goes in the same vein as my other idea about C++: why should I use scripting languages for scripting? I can do everything just as well in C++. OK, not everything - I would have to write some template libraries to plug the holes - but in principle I could. I didn't pursue this idea very long, but after some time I found the same reasoning explained more lucidly in the rationale of Boost's "Filesystem" library:
"The motivation for the library is the need to perform portable script-like operations from within C++ programs. The intent is not to compete with Python, Perl, or shell languages, but rather to provide portable filesystem operations when C++ is already the language of choice."
Then there is the Boost string algorithms library, which makes Perl-like manipulation of strings possible... We are almost there.

But enough digressing! The main theme of this entry is C++ servlets! So back to web programming.

For some time I tried to find some C++ web frameworks by googling, but nothing useful emerged. So I sort of gave up. Till today. I was googling for documentation of the OSE RTOS by Enea when I stumbled upon the OSE library. OSE, whassat? Never heard of it! Another legacy do-it-all framework? It turned out that this library had long been popular in Australia, and more: it has HTTP Servlet support! At last!! It supports HTTP Session objects, XML, and its HTTP Daemon class seems to be a sort of servlet container. From what I've seen, its interface is different from Java's HttpServlet class, but it is a start! Maybe one could combine it with a non-JSF view technology (Velocity?) to enhance it further? So maybe I will be able to pursue my idea after all? But is it worth the candle? It would be only a copy of an existing technology, certainly not a "killer app" and not a new approach. Or maybe not?

----
* Dr. Dobb's Journal Jun 21, 2007, http://www.ddj.com/dept/cpp/199905990

Saturday, 11 August 2007

Erlang's change of fortunes

This morning I had a conversation with my distinguished colleague Stefan Z. while hanging out at the coffee-vending machine in the tea kitchen at our client's office. As we are both working for the telecom industry, we came to discuss Erlang. No, not that celebrated guy, but a rather obscure programming language designed at the Ericsson Corporation.

For the fun of it, let's first look at some snippets of Erlang code!

define a data structure:
  Student={person,{name,{first,yakkety},{last,yak}},{footsize,1}}.
access a member of data struct (by pattern matching, cool!!!):
  {_,{_,{_,HisName},_},_} = Student.
printing yakkety:
  io:fwrite("his name is: ~w~n", [HisName]).
define a list:
  Fruits = [{apples,10},{pears,6},{prunes,3}].
append something to a list:
  Fruits1 = [{oranges,4},{lemons,1} | Fruits].
get the head of the list (by pattern matching, cool!!!):
  [ListHead | Fruits2] = Fruits1.
simple function:
  incr(Num) -> Num + 1.
recursive function:
  fac(0) -> 1;
  fac(Num) when Num > 0 -> Num * fac(Num-1).
infinite loop receiving messages (by pattern matching, cool!!!) and sending responses back (by the ! operator):
  loop() ->
    receive
      {From, {request, Param1, Param2}} ->
        From ! {self(), Param1 + Param2},
        loop()
    end.
a foreach loop (note the lambda function definition between fun and end!):
  extract2ndElements(List) ->
    lists:map(fun({_,Second}) -> Second end, List).

Ok, it was fun, but we mustn't digress too much...

Erlang in itself is rather a paradox: you normally associate telco equipment programming with the lowest of the low: asm, C99 or EC++ (castrated C++). Now, in the '80s, some guys come along and manage to sell a functional language for this task to corporate management! This alone is incredible! But wait, there's more! Up to now Erlang has been a niche language used almost exclusively at Ericsson. Now this seems to be changing. But first we need some history.

In the '80s Erlang was designed to be slow but reliable. This is due to the very nature of functional languages: you don't have shared state, there are no side effects, all data is copied between functions. For safety's sake even the variables can be assigned only once! If you apply that to parallel computing, you don't have to worry about synchronization and critical sections - the "share nothing" semantics. But it will be rather slow. In this case it was a deliberate decision: they wanted not simply a reliable language, but a "highly reliable language"! And a distributed one. So to speak, a natural fit for the telco environment. As an example: the AXD301 switch achieved an incredible 99.9999999% reliability! That's 9 nines of availability! And we were struggling to achieve 5 nines* of availability with C++ on Carrier Grade, High Availability (HA) Linux and a custom HA middleware platform in my last project! By the way, the AXD301 software has 1.7 million lines of Erlang, making it the largest functional program ever written**! Really impressive stuff IMHO.

I guess you have read (or heard about) Herb Sutter's article titled "The free lunch is over"***. So you know that the "next big thing" in software is supposed to be multicore scalability. So hear this now: as multicores are getting pretty cheap, Erlang can at last be both fast and reliable, as it can easily be distributed over several cores! That's definitely good news! We've seen the incredible reliability which can be achieved, and we have a natural programming model for distributed computation. Why? In short, as there is no shared data:

  1. programs are easily distributable,

  2. easily made fault tolerant by composing them from worker and observer processes,

  3. easily made scalable (because of 1. we simply add more processors).**

So could distributed computing be the saviour of the functional programming? Think about Google's MapReduce!

---
* For those not working with HA systems:
      Availability %   Downtime per year
      99.9999          30 seconds !!!!
      99.999           5 minutes
      99.99            50 minutes
      99.9             9 hours
** says Philip Wadler: http://wadler.blogspot.com/2005/05/concurrency-oriented-programming-in.html

*** on DDJ or www.gotw.ca/publications/concurrency-ddj.htm

Monday, 6 August 2007

iPhone presentation with afterthoughts


Yesterday my distinguished colleague Stefan Z. and I saw the iPhone for the first time (over here in Germany it isn't easily available yet; it'll come sometime in November). It was presented to us by our co-consultant in the telco field, a long-time Mac and Objective-C freak. He is pushing Apple technology wherever he can, and has had considerable success with his Mac-based telco testbed infrastructure recently.

He's the archetypical nerd - totally enthusiastic about technology. To help you get into the mood of the conversation, I offer a couple of juicy quotations: "...y'all haven't got a clue!", "...here ends your tether but I'm only beginning.", "...all the telco companies can shut down now!".

He showed us the iPhone GUI and it was impressive: it was just as it always should have been! You can operate it holding it in one hand and just using your thumb on the touch screen! His basic message was this: no other company can duplicate this on a mobile phone (or on any other operating system for that matter)! OK, sometimes he tends towards massive exaggeration, but this one got me thinking.

I guess my colleague is essentially right, and it boils down to the technology (not management). As I don't know much about the iPhone I'll be shamelessly wallowing in conjecture now, but read on.

The first thing I thought was: yeah, that certainly wouldn't be possible with a JavaPhone! They are doing it in Objective-C and Cocoa, as "Java and GUI don't mix" and "Friends don't let friends Swing" (guess who said this*). And Java is the programming language of choice today! You can even do real-time programming with JRockit's deterministic garbage collection of late! Java proponents do not conceal that Java's stronghold is more and more enterprise application computing**.

On the other side, Java critics may say that this stronghold is rather a kind of ghetto (Java is the next COBOL - where did I hear that?). Steve Jobs said: "Java's not worth building in. Nobody uses Java anymore. It's this big heavyweight ball and chain..."*** OK, an exaggeration - we are using Java on a project here - but you get the idea: Java is now a language for big, heavyweight, corporate applications. Nothing exciting to be expected here.

Is there another language/system which can do something that cool? Yes, on the desktop (well, in the browser...) you can do some cool things with Flash, maybe with JavaScript. But for the regular GUI there's nothing comparable, I fear. So the Objective-C freak may well be right.

---
 * it was the creator of Tomcat and Ant

 ** for example "Why Java?" on http://www.wantii.com/wordpress/?p=5

*** see http://www.informit.com/discussion/index.asp?postid=d1e63fde-10d5-404b-8a14-6ff0b92c1ee1 for some lively discussion on that phrase, you can google for "Java? It's so 90-ties" for some more Java critique

Reprise:


As I said, I didn't know first thing about iPhone at the time I wrote this entry. But gradually I've learnt some new things, and now I can compare my then guesses to the facts. 

1. Yes, I was basically right: the iPhone uses Cocoa*, and you program Cocoa with Objective-C. So you cannot program it with Java - the iPhone is just 100% anti-Java. But curiously, you can do it anyway, only through a detour. Just take the GWT (Google Web Toolkit), write a Web application in Java and GWT will translate it to JavaScript, which will be executed by the iPhone's Safari browser!** I find it somehow a strange twist of fate! BTW, the Google Phone*** is rumored to be 100% pro-Java...

2. Yes, I was partially right about GUI programming: yes, Java isn't up to the task, but Sun seems to have noticed this and is working on JavaFX: "a new family of Sun products based on Java technology and targeted at the high impact, rich content market"****. The reasons why a new take was needed were summed up as the following series of questions:****
  • Why does it take a long time to write GUI programs?
  • How can we avoid the “Ugly Java technology GUI” stereotype?
  • Why do Flash programs look different than Java platform programs?
  • Why does it seem easier to write web apps than Swing programs?
  • How can I avoid having an enormous mass of listener patterns?
So these are some problems! As it seems, Java isn't the one-size-fits-all language of yore anymore, but is complemented by a host of new languages using the Java VM (JavaFX, Groovy, JRuby...). So maybe it's really true: "Java is the new COBOL".

---




Sunday, 29 July 2007

Why Ant?

Since I'm relatively new to the Java universe, I come with some preconceived ideas. One of them was that when you need to build something, you need to write some makefiles. So I never could understand why the Java crowd should need yet another tool. Admittedly, make has got its quirks, but it is there, it's stable and it's understood. And the makefiles are human-readable! So I searched the internet for an answer. A typical statement would look like this:

Ant has been on the top of many a developer's list as the revolutionary tool that got them out of the world of make. For those of you not familiar with make, it'll be enough to say that it just isn't the best tool to use for building Java projects, since it isn't platform-independent and it isn't that easy to use.*
Not easy to use??? Come again? What about Ant's XML disease:
Ant ... uses an XML configuration file, the infamous build.xml. Ant has enjoyed heavy popularity with its many advantages, but it also has some drawbacks. The build.xml files can be extremely terse, and their use requires the developer to learn the syntax up front.*
...the disease of forcing humans to use a machine-oriented markup language to configure the build process! But the main gripe, as I understood it, was make's use of tabs to demarcate the commands to be invoked for specific rules:
Makefiles are inherently evil as well. Anybody who has worked on them for any time has run into the dreaded tab problem. "Is my command not executing because I have a space in front of my tab!!!" said the original author of Ant way too many times. Tools like Jam took care of this to a great degree, but still have yet another format to use and remember.**
You're kidding? I personally never had any problems with tabs, never ever. That cannot be the reason for yet another tool! Only when I read the following did I understand:
Gnu make+Unix, windows and NMAKE, etc. While single-IDE or single-platform development worked in small, in-house projects, it didn't cut it for open source dev, where one person may have a solaris box with make on, but the other developer is on a windows system, another on a version of debian-unstable they built themselves. There's is/was not enough unity of infrastructure to stabilise on tools, so James Duncan Davidson had to write one. Ant. ***
Well, the real reason is portability: the very problem which the JVM solved for us. But as I see it, there is another side to this. Did you notice the phrase "the other developer ... on a windows system"? That's it: Microsoft is killing make, is killing the Unix legacy with Java's hands! Because there's no make installation on Windows, not because make isn't platform-independent (it's written in C, you only have to recompile it). And because it's rm not del, and ls not dir. Microsoft couldn't kill the Unix toolchain all alone, but ironically Java helped here a great deal by defining a new level of abstraction, which requires new tools. OK, at last the riddle is solved...

But wait, there's a moral in this story too! Strictly speaking, even two morals. First: there are a lot of half-truths on the internet; and second: the truth isn't easily revealed, but it pays to search for it.

---
* http://www.onjava.com/pub/a/onjava/2006/03/29/maven-2-0.html
** this I understood as a received wisdom coming right from the source: http://ant.apache.org/
*** http://www.1060.org/blogxter/entry?publicid=0E729BD9B8A06F372CC136D402810C82

Saturday, 28 July 2007

Google Solvers: Python vs. Perl, Java and C++

It happened a couple of months ago. My distinguished colleague Stefan Z., a dyed-in-the-wool C/C++ developer, educated on Dijkstra, Knuth and Wirth, wanted to learn Python to extend his horizons. As he'd already got the basic free Python books from the web, he proceeded to code a simple example. He chose the Google problem: find the solution(s) of the equation "wwwdot - google = dotcom"*, and coded the greedy (or is it brute-force?) solution. He showed it to me, and true to the motto "Doing science for fun and profit ;-)" I guessed it would be interesting to compare this code for speed and for visual appeal with its Perl equivalent. And then with Java.

Python code looked like this:
#!/usr/bin/python

print "Program Start."

for W in range(10):
  for D in range(10):
    for O in range(10):
      for T in range(10):
        print "looking for W=", W, "D=", D, "O=", O, "T=", T
        for G in range(10):
          for L in range(10):
            for E in range(10):
              for C in range(10):
                for M in range(10):
                  a = 100000*W+10000*W+1000*W+100*D+10*O+T
                  b = 100000*G+10000*O+1000*O+100*G+10*L+E
                  s = a - b
                  r = 100000*D+10000*O+1000*T+100*C+10*O+M
                  if s == r:
                    print "FOUND the solution: a=", a, "b=", b, "s=", s, "r=", r
                    print " W=", W, "D=", D, "O=", O, "T=", T, "G=", G, \
                          "L=", L, "E=", E, "C=", C, "M=", M

print "Program End."
short and to the point!

The Perl code looked like this:

#!/usr/bin/perl -w

print "Program Start.\n";

for $W (1..10) {
  for $D (1..10) {
    for $O (1..10) {
      for $T (1..10) {
        print "\nlooking for W=", $W, "D=", $D, "O=", $O, "T=", $T;
        for $G (1..10) {
          for $L (1..10) {
            for $E (1..10) {
              for $C (1..10) {
                for $M (1..10) {
                  $a = 100000*$W+10000*$W+1000*$W+100*$D+10*$O+$T;
                  $b = 100000*$G+10000*$O+1000*$O+100*$G+10*$L+$E;
                  $s = $a - $b;
                  $r = 100000*$D+10000*$O+1000*$T+100*$C+10*$O+$M;
                  if ($s == $r) {
                    print "\nFOUND the solution: a=", $a, "b=", $b, "s=",
                          $s, "r=", $r;
                    print "\n W=", $W, "D=", $D, "O=", $O, "T=", $T, "G=", $G,
                          "L=", $L, "E=", $E, "C=", $C, "M=", $M;
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

print "\nProgram End."
so it wasn't that unreadable: actually a little more readable than the Python thanks to its explicit range expressions and the curly braces denoting scopes. On the other hand, the braces don't add any useful information in this particular example but can't be omitted, so maybe Python isn't so bad after all?

In the Java code I skipped unnecessary braces, a thing which... would not be possible in Perl:

public class GoogleSolver
{
  public static void main(String[] args)
  {
    System.out.println("Program Start.");
    int a=0, b=0, s=0, r=0;

    for (int W=1; W <11; W++)
      for (int D=1; D <11; D++)
        for (int O=1; O <11; O++)
          for (int T=1; T <11; T++)
          {
            System.out.println("looking for W="+ W+ " D="+ D+ " O="+ O+ " T="+ T);
            for (int G=1; G <11; G++)
              for (int L=1; L <11; L++)
                for (int E=1; E <11; E++)
                  for (int C=1; C <11; C++)
                    for (int M=1; M <11; M++)
                    {
                      a = 100000*W+10000*W+1000*W+100*D+10*O+T;
                      b = 100000*G+10000*O+1000*O+100*G+10*L+E;
                      s = a - b;
                      r = 100000*D+10000*O+1000*T+100*C+10*O+M;
                      if (s == r)
                      {
                        System.out.println("FOUND the solution: a="+ a+ "b="+
                                           b+ "s="+ s+ "r="+ r);
                        System.out.println(" W="+ W+ "D="+ D+ "O="+ O+ "T="+
                                           T+ "G="+ G+ "L="+ L+ "E="+ E+ "C="+
                                           C+ "M="+ M);
                      }
                    }
          }

    System.out.println("Program End.");
  }
}
so it wasn't overly verbose, except for System.out.println and the "public static void main" beast. Now the measurements, on my Pentium IV (ca. 3 GHz) with Red Hat Enterprise Linux 3:

Perl: Program End. 2526.790u 0.310s 42:11.34 99.8% 0+0k 0+0io 244pf+0w
Python: Program End. 4494.230u 0.330s 1:15:00.51 99.8% 0+0k 0+0io 337pf+0w
Java: Program End. 38.847u 0.150s 0:38.86 100.3% 0+0k 0+0io 1pf+0w

Pretty interesting stuff here: Perl outperformed Python by a factor of two, and Java was lightning fast! It seems Python's VM implementation isn't very sophisticated. OK, these are not scientific experiments, but the numbers are reproducible and telling: Perl is pretty fast and usable for some number crunching, Python is painfully slow :-(, and Java is just good! Both Python and Perl are installed in the standard way in the /usr/bin directory, so there's no unfair network overhead involved! And Java deserves praise, considering that I used the most inefficient way of constructing the output strings!

OK, I thought, now let's try something really fast, and I quickly recoded the algorithm in C++. Here I skipped unnecessary braces as well:

#include <iostream>
using namespace std;

int main()
{
  cout << "Program Start.\n";
  int a=0, b=0, s=0, r=0;

  for (int W=1; W <11; W++)
    for (int D=1; D <11; D++)
      for (int O=1; O <11; O++)
        for (int T=1; T <11; T++)
        {
          cout << "\nlooking for W="<< W<< "D="<< D<< "O="<< O<< "T="<< T;
          for (int G=1; G <11; G++)
            for (int L=1; L <11; L++)
              for (int E=1; E <11; E++)
                for (int C=1; C <11; C++)
                  for (int M=1; M <11; M++)
                  {
                    a = 100000*W+10000*W+1000*W+100*D+10*O+T;
                    b = 100000*G+10000*O+1000*O+100*G+10*L+E;
                    s = a - b;
                    r = 100000*D+10000*O+1000*T+100*C+10*O+M;
                    if (s == r)
                    {
                      cout << "\nFOUND the solution: a="<< a<< "b="<< b<< "s="
                           << s<< "r="<< r;
                      cout << "\n W="<< W<< "D="<< D<< "O="<< O<< "T="
                           << T<< "G="<< G << "L="<< L<< "E="<< E<< "C="
                           << C << "M="<< M;
                    }
                  }
        }

  cout << "\nProgram End.";
}
The code is a little less verbose than the Java example, but will it run better? For all we know it should! Then I ran my measurements and got a little shock:

C++: Program End. 37.420u 0.060s 0:37.92 98.8% 0+0k 0+0io 197pf+0w
Java: Program End. 38.847u 0.150s 0:38.86 100.3% 0+0k 0+0io 1pf+0w

So C++ was better, but by a very small margin!** You can see the JIT technology is working very well for this example: the JIT compiler must run only infrequently and then the machine code is invoked all the time! OK, so maybe the marketing drivel about Java's speed being the same as C++'s isn't all that wrong? Admittedly, C++ streams aren't the last word in performance, but I wanted to write standard code, without optimization tweaks...

But wait, I thought. I remembered some article on whole-program optimization, and it said that Java people think C++ cannot optimize and that only the JVM can. Think, think... Did you use the -O switch??? Of course I didn't, so the measurement is pretty unfair on C++ here. I repeated the measurement with gcc's -O2 optimization:

C++: Program End. 8.030u 0.010s 0:08.41 95.6% 0+0k 0+0io 199pf+0w

OK, that was reeeaaaaally fast, indeed.

So recapitulating: Perl was 2 times faster than Python, Java about 2 orders of magnitude faster than both of them, and optimized C++ about 5 times faster than Java. So no surprises here. Except for Python's bad performance. So programming in the dynamically typed languages isn't always so much fun!*** In C++ you know, when you are using a language construct, that it'll be rather fast, and when you are using the STL you'll have your big-Oh complexity documented. But when you're using Python you don't know anything! Or you must google for implementation details of lists, tuples, etc., and do some code tweaking. By the way, does anyone know why Perl is faster than Python? Does its VM do some instruction caching that Python's doesn't? Would Jython be faster, as it can enjoy the benefits of the massively optimized JVM? Maybe I should try it out...

----
* This is typically Google. They advertised at one time with this line: "www.{first 10-digit prime found in consecutive digits of e}.com" or similar. The idea was to find the people who can solve the problem inside the {}, i.e. use the prime distribution theorem, and hire them.

** On another machine it was even a little slower:
C++: Program End.44.552u 0.077s 0:44.64 99.9% 0+0k 0+0io 0pf+0w

*** see for example Uncle Bob's:http://www.artima.com/weblogs/viewpost.jsp?thread=4639

Thursday, 19 July 2007

C++ sucks

Ok, I admit it's rather an unusual title for the first entry in a blog that's supposed to go against the fads of the hour! Well, let me explain...

Lately I was porting some code of a major telco player from their standard hardware to a new board. Admittedly, it was embedded code, but nothing hard-realtime and no microcontroller stuff - just a totally regular PowerPC processor.

And this code sucks! It's not even that bad: it's object-oriented and has some separation of concerns, but it's all huge classes, huge methods, no effort to avoid repetitive code, lots of copy and paste: you'd think it's Java, with its lack of possibilities to invent some clever shortcuts and hide the boilerplate code! This was C++ code and it sucked: ergo, does C++ suck? I never thought I'd say that, but really, I've never been so annoyed with C++ code (or any other code, for that matter). Everything was so straightforward, so a=b+c, so "churn out the LOCs", so unimaginative! It reminded me somehow of the bloated, repetitive and vacuous Java syntax, but it was much worse than any bad Java code I've seen. I could just see the missed design opportunities.

Sorry, this shouldn't be a rant about Java (perhaps only a little bit about its syntax...) but about C++ code. The question here is: can a programming language stop an unimaginative programmer from writing ugly code? There are two canonical answers to this question: the modern one - yes it can, through clean language design - and the old-school, hardcore one - no, you can write garbage in every programming language. They both have some merit, but my private view is a different one. I think that with a language like C++ you can always hide the ugliness away, using the proverbial "additional layer of indirection", i.e. defining some higher-level abstractions instead of thinking in straightforward, procedural ways. This is the lesson that OO taught us, and a basic one! So you cannot put the blame on the programming language's design; it's rather a problem of a lack of freedom in your thinking, of getting bogged down in the details, of fear of simplicity. I'm not saying that you can write garbage in every language, as some languages are better at generating it than others. I'm saying that a better language is only one half of the deal, and we must supply the other half ourselves.

OK, but there's more to it. There's another aspect to the title of this entry: do typed languages suck? And further: is the iterator pattern a sign of weakness of typed languages? Are design patterns just workarounds (like the J2EE patterns obviously are)? But more about that in the next installments of this blog.

Wednesday, 18 July 2007

intro

Hi!

This will be my technical blog: I hope to be able to share a couple of thoughts on software and programming languages (hence the unimaginative title) in a rather irregular manner. This blog is related to my freelancing work done under the name http://www.ib-krajewski.de/ in Munich, Germany.

I'm first and foremost a C++ kind of guy: I've been doing it for quite a long time and I like it rather well. But as my interest lies in technology rather than in evangelism, I have also enjoyed (or not) programming in Java, Perl, Python and some other less known languages (you're right, Ruby is not on the list). Technologically I come from academia and research on distributed systems and networks, later enhanced with network management, client-server systems, and sometimes even web applications. I liked designing frameworks for other people and objects for my own systems. I hope I'll be able to say something new about languages, programming and aesthetics, but it may as well degenerate into rants... So be forewarned!

Marek