Monday, 30 March 2009

One cute microdesign example and some thoughts on language design.

This time I'll begin with a quotation from another blogger*:
I made the switch from C++, I'd wished that Java allowed operator overloading (i.e. custom implementations of operators such as '+', or the array brackets). However, some argue that this feature adds to the complexity of C++ with little advantage, and that there is no need to pass this complexity on to the Java language. Someone who never coded in C++ would more than likely agree with that.
OK, someone who never coded C++ won't understand that, and will take the argument about the undue complexity as received wisdom. So let's show you a little piece of design, a microdesign actually, so you can (maybe) appreciate the feature. If you're a Python/Perl/Groovy (or C++ for that matter!) programmer, I must apologise, but maybe you'll find this small example nonetheless instructive.

1. A piece of microdesign

It's a piece of code from an old project I wrote several years ago. I wanted a class for interthread and interprocess messages (you know, for my favourite pattern: share-nothing, message-passing actors). I wanted to be able to construct a message using parameters of the constructor like this:

    AbstractData notif;
    SocketMsg msg(DispInterface::crReqNotifType, &notif, father);
And then simply send the formatted message as binary data:
    m_connSocket.send(msg, msg.buffsize());
On the other side, I wanted to receive the binary data from socket directly into the message object, which would then parse it and provide high level access to the received data, just like:
    SocketMsg msg; // empty
    if(!m_connSock.receive(msg.writebuff(), msg.buffsize()))
        // if too many errors, better bail out here!
        TRACE_AND_CONT("Cannot read from socket in an FD event handler!",
                       E_ID_Q3A_SOCK_ERR); // try to escalate
    // look what message received
    if(msg.getType() == cmdOkMsgType)
Pretty high level, eh? Just hide away all that low level data parsing and formatting and use the message!

It's design at its lowest level, you cannot possibly make a smaller piece of meaningful design, but I liked that piece! Now a question for the astute reader: what pattern is that? Proxy, bridge or a facade? Or a builder maybe? Perhaps none? Well, I don't think in patterns, I think in abstractions: what I needed was an abstraction of a self-formatting message. Don't you like it too? Unfortunately, you need some operator overloading magic to do this. Look at the following class:
class SocketMsg
{
public:
    SocketMsg(int type, AbstractData* message, DsetId dsetId = nillDsetId,
              int cmdId = 0, int userCode = 0);
    ... etc

    // conversions
    operator const char*() const { return m_charRepr; };
    inline SocketMsgBufferProxy writebuff(); // see inline section

private:
    // buffer
    static const int c_buffsize = sizeof(SocketMsgStruct) + 1;
    char m_buff[c_buffsize];

    // decoded data: don't remove in case we use different encoding!
    AbstractData* m_data;
    int m_type;
    DsetId m_dsetId;
    int m_cmdId;
    int m_userCode;

    friend class SocketMsgBufferProxy;
};

You can see in the above definition that the class has a buffer for the encoded binary message and several fields for the decoded message values. In a char array context, it exports its internal message buffer for reading (through operator const char*()), so the socket's send function can read the raw data from there. In the other direction, i.e. for receiving data into the internal buffer, a different mechanism is used: we return a SocketMsgBufferProxy object (this time through the writebuff() function), which is defined as follows:

class SocketMsgBufferProxy
{
public:
    SocketMsgBufferProxy(SocketMsg* sm) : m_sm(sm) {};
    operator char*();
private:
    SocketMsg* m_sm;
};
What is that proxy object doing? It forwards the address of the internal raw data buffer and reveals it through a conversion operator again, this time operator char*(). As a matter of fact, I could have offered an even cleaner interface:
    m_connSock.receive(msg, msg.buffsize());
So why didn't I simply expose the internal buffer through a char* conversion operator, instead of introducing a SocketMsgBufferProxy object in between? Simply because of the SocketMsgBufferProxy's destructor. The proxy is a temporary that lives only until the end of the full expression containing the receive() call; when its destructor is invoked, it rescans the raw contents of the receive buffer, so once the receive() line of code is finished, the message object is parsed and ready! Of course there is another way to achieve this: lazy parsing, but I was somehow more excited by that cool proxy object automatically stepping in between and doing all the work! I'm only human too, and sometimes I look for the unusual...
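
For readers who'd like to try the trick themselves, here is a minimal, self-contained sketch of both directions. It is not the original project code: MiniMsg, WriteProxy, LoopbackSocket and roundTrip are hypothetical stand-ins made up for illustration, and the wire format is assumed to be nothing more than a single int.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class MiniMsg;  // hypothetical, much simplified stand-in for SocketMsg

// Temporary handed to receive(); its destructor re-parses the message.
class WriteProxy {
public:
    explicit WriteProxy(MiniMsg* msg) : m_msg(msg) { assert(msg != nullptr); }
    ~WriteProxy();      // parses the freshly written bytes
    operator char*();   // exposes the raw buffer for writing
private:
    MiniMsg* m_msg;
};

class MiniMsg {
public:
    explicit MiniMsg(int type = 0) : m_type(type) {
        std::memcpy(m_buff, &m_type, sizeof m_type);  // "format" the message
    }
    operator const char*() const { return m_buff; }   // read access for send()
    WriteProxy writebuff() { return WriteProxy(this); }
    std::size_t buffsize() const { return sizeof m_buff; }
    int getType() const { return m_type; }
private:
    void parse() { std::memcpy(&m_type, m_buff, sizeof m_type); }
    char m_buff[sizeof(int)];  // assumed wire format: a single int
    int m_type;
    friend class WriteProxy;
};

WriteProxy::~WriteProxy() { m_msg->parse(); }
WriteProxy::operator char*() { return m_msg->m_buff; }

// Hypothetical loopback "socket": receive() hands back what send() stored.
class LoopbackSocket {
public:
    LoopbackSocket() : m_size(0) {}
    void send(const char* data, std::size_t size) {
        m_size = size < sizeof m_wire ? size : sizeof m_wire;
        std::memcpy(m_wire, data, m_size);
    }
    bool receive(char* dest, std::size_t size) {
        if (m_size == 0 || size < m_size) return false;
        std::memcpy(dest, m_wire, m_size);
        return true;
    }
private:
    char m_wire[64];
    std::size_t m_size;
};

// Sends a message of the given type through the loopback and returns the
// type seen by the receiver, or -1 if receiving failed.
int roundTrip(int type) {
    LoopbackSocket sock;
    MiniMsg out(type);
    sock.send(out, out.buffsize());   // implicit MiniMsg -> const char*

    MiniMsg in;                       // empty
    if (!sock.receive(in.writebuff(), in.buffsize()))
        return -1;
    // The temporary proxy was destroyed at the end of the condition above,
    // so "in" has already been parsed at this point.
    return in.getType();
}
```

Calling roundTrip(42) gives back 42 without the caller ever invoking a parse step. One caveat of this sketch: before C++17 the compiler is allowed to copy the proxy when writebuff() returns, so the destructor may run more than once; that is harmless here only because parsing the same buffer twice gives the same result.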

2. And now some more general discussion

So as you can see, operator overloading is a pretty useful little feature. So why didn't the designers of Java include it in the language definition? My opinion is that they overreacted. I can still remember the times of the “object craze”, when everybody tried to be object oriented without knowing what that really was, and to use every single feature C++ offered without thinking about whether it was needed for the particular problem at hand. That was somewhat similar to the “patterns craze” I can see even today: people asking me “so what pattern could I use here?” - you don't need a pattern here, you need a solution to the programming problem, or do you just want a cool name to stick on your code? But more about that in a future entry.

You wouldn't believe how stupid people got with operator overloading; maybe it was something like the “powder rush” of skiers, an incredible infatuation with until-then unheard-of possibilities! You must consider that a freshly minted C++ programmer came from a much smaller language (namely C) and normally had no exposure to any higher level language concepts from Smalltalk, Lisp or Simula. So the Java designers simply thought “Hey folks, it's time to cool down!” and banned the feature. To my mind, a fatal case of underestimating people's capabilities, and of patronizing them. Because the result is that “you cannot do any magic in Java!” (well, actually you can with introspection and bytecode insertion, but only so much, and that's simply not enough). Look at the following quotation**:

In fact, I'd say that many of today's current hot trends in programming are a direct result of a backlash *against* everything that Java has come to represent: Lengthy code and slow development being the first and foremost on the list. In Java you generally need hundreds of lines of code to do what modern scripting languages do in dozens (or less).
So what's the solution? Well, this is the problem. People have been tinkering with Java for years now, and there's still no hope in sight. There's something about the Java culture which just seems to encourage obtuse solutions over simplicity. As a Java developer, I was always so amazed at how difficult it was to use the standard Java Class Libraries for day-to-day tasks.
And you know what? My pet idea is that the reason why Java code is so dull and repetitive (and why the Java culture is, ehm..., you've just read it...) is the lack of operator overloading! You think I'm overgeneralizing now? Well, the fact is, you simply cannot hide the gory details as conveniently as I was able to in the SocketMsg class! The best indication that people in Javaland think the same is the Groovy JVM programming language: it has operator overloading at its core, and is capable of doing things you wouldn't dream possible in Java. Well, that's no virtue in and of itself, but as a result it gives you much leaner code! And we all need that: less code!

3. Now something even more general

While I'm at it: the Java programmers I've met wouldn't understand the idea of h-files (aka include files) either. Why are they needed, for Heaven's sake? Aren't they an awful bore? I admit that the origin of h-files lies in the limitations of the early C compilers, but with C++ they gained another significance: a “table of contents” for the code proper. I give the word to another man from Java(FX)land***:

From my experience with C++ and Java, having method bodies in the class declaration clutters it with a mass of implementation details which is detrimental to getting an overview of the actual relationships and operations embodied by the class. It was for this reason that I decided to define the bodies of class operations and functions outside the class declaration.
Exactly what I think! You need an overview of the class, so you don't have to skip over every single int and bool declaration until your head gets dizzy. You'll retort that Javadoc comments take care of this. Well... The Javadocs... That's quite a topic of its own. Let me give only a small example here: if I've already written the function declaration like:
    void sendMessage(SocketMsg& msg, Receiver dest);
why should I repeat all this in a trivial Javadoc header? Is it "don't repeat yourself" or what? Don't get me wrong, I wouldn't check in any uncommented code, but I don't put comments in (C++) header files - they only clutter things up and make the definitions unreadable! For me they are well suited as implementation descriptions. And secondly, you have to run the javadoc tool first, you cannot just have a quick glance at the code! I must admit, sometimes I'm annoyed about having to edit the h-files, but when I'm reading the code afterwards, I'm happy I did. Pretty non-progressive, isn't it?
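
To make the “table of contents” idea concrete, here is a small sketch of how the split looks in practice. The Outbox class and all its members are hypothetical, invented only for this illustration; in a real project the first part would sit in an h-file and the second in the matching .cpp file, but here both are shown in one piece so it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// ---- would live in the h-file: the "table of contents" ----
class Outbox {
public:
    void sendMessage(const std::string& msg, int dest);
    std::size_t pending() const;
    const std::string& lastMessage() const;
    int lastDest() const;
private:
    std::vector<std::string> m_queue;
    int m_lastDest = -1;   // -1 means: nothing sent yet
};

// ---- would live in the .cpp file: the implementation details ----
void Outbox::sendMessage(const std::string& msg, int dest) {
    m_queue.push_back(msg);
    m_lastDest = dest;
}

std::size_t Outbox::pending() const { return m_queue.size(); }

const std::string& Outbox::lastMessage() const {
    assert(!m_queue.empty());   // calling this on an empty outbox is a bug
    return m_queue.back();
}

int Outbox::lastDest() const { return m_lastDest; }
```

The declaration part reads in a few seconds, with no method bodies in the way, which is the whole point; the price is that every signature is written twice.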

4. Summing up

Maybe you'll say that it's not ethical to lambast a slightly ageing, slightly démodé, mainstream language, and that all this has been only too well known for several years now, but hey, it's just what I always thought! And as with my other favourite subjects (Agile, XML, Python, patterns), it simply takes time till I write it down, and it's not so insightful by then, sorry...

*** Chris Oliver, Creator of JavaFX, in his presentation about JavaFX: