Thursday 11 December 2008

Do we need destructors?

This post is for my honoured, old-time, old-skool hacker colleague Stefan Z. As we were discussing the new and (then) hot Java language, we couldn't accept that it didn't have destructors. I cite Stefan:

The constructor/destructor pair is an incredibly powerful concept!
Well, you can't have everything: either we support garbage collection or destructors, right? But it's just one more point where Java sucks: the ugly try/finally block and then the explicit close() call, as in the following code:

    BufferedReader reader =
        new BufferedReader(new FileReader(aFileName));

    try {
        String line = null;
        while ( (line = reader.readLine()) != null ) {
            // process the line...
        }
    }
    finally {
        reader.close();
    }
Ugly? You bet! And what I really don't like is that you cannot hide all the required handling in a library! In C++ you'd just write:

    ifstream f(aFileName);
    string line;

    // test the result of getline() itself, so a failed read is never processed
    while (getline(f, line))
    {
        // process the line...
    }
That's all, the plumbing is hidden in the destructor* and it is there automatically! It's the reason why Stefan called this concept an "incredibly powerful" one. But that powerful concept can cause problems in a multithreading and garbage-collected environment. As a matter of fact, a recent C++ standard proposal for the multithreading execution model opted for removing destructors from the language (!!!), or at least for not executing the static destructors in a multithreading setting! Of course, it's a shortcut in order to solve a rather complicated problem, but you get the idea, right?
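To make the RAII idea from the footnote concrete, here is a minimal sketch of a hand-written guard class. The names `Resource`, `ResourceGuard` and `use_resource` are invented for this illustration and do not come from any library; the counter only exists so that the effect of the destructor can be observed.

```cpp
#include <cassert>

// Hypothetical resource that merely counts how many instances are open.
struct Resource {
    static int open_count;
    void open()  { ++open_count; }
    void close() { --open_count; }
};
int Resource::open_count = 0;

// RAII guard: acquire in the constructor, release in the destructor.
class ResourceGuard {
public:
    explicit ResourceGuard(Resource& r) : res_(r) { res_.open(); }
    ~ResourceGuard() { res_.close(); }
    ResourceGuard(const ResourceGuard&) = delete;
    ResourceGuard& operator=(const ResourceGuard&) = delete;
private:
    Resource& res_;
};

// Returns the number of open resources as seen inside the scope.
int use_resource() {
    Resource r;
    ResourceGuard guard(r);            // opened here
    assert(Resource::open_count == 1); // still open inside the scope
    return Resource::open_count;       // guard closes r on the way out
}
```

The caller never writes close(): leaving the scope, whether by return or by exception, runs the destructor.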

So maybe destructors are a little bit outdated, what do you think, Stefan? So I was all the more pleased when I recently stumbled on a Smalltalk pattern called the "Execute Around Method" pattern** (to be honest, I don't do any Smalltalk and have seen it in some Groovy example code). It's another possibility to hide the plumbing in the library: you just define a static method doing all the dirty work and accepting your "payload" code - just like in an IP packet: we have the framing and the payload, and the user only delivers the payload! Well, an example explains it best:
    def static use(closure)
    {
        def r = new Resource()
        try
        {
            r.open()
            closure(r)
        }
        finally
        {
            r.close()
        }
    }
The closure parameter stands for our "payload" code. The use() code itself is then hidden in the library. An application of this is the following Groovy code:

    new FileReader(aFileName).withReader { reader ->
        def line = reader.readLine()
        // process the line...
    }
    // no need to close()!!!
We create a new reader, call its withReader() library method, and provide a "code block" (as you'd call it in Perl) for execution. This code block (called a closure in Groovy) gets as its parameter the resource, which will be closed at the end, just like in the use() method shown above!
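For comparison, the same Execute Around idea can be sketched in C++ with a callable as the payload. This is only an illustration under assumptions: `LoggedResource`, `use` and `demo` are made-up names, and since C++ has no finally, the sketch falls back on a small local guard object (so the destructor is back, just hidden in the library).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical resource that records what happens to it.
struct LoggedResource {
    std::vector<std::string> log;
    void open()  { log.push_back("open"); }
    void close() { log.push_back("close"); }
};

// Execute Around Method: the library owns open/close, the caller
// supplies only the payload. The local guard closes the resource
// even if the payload throws.
template <class Payload>
void use(LoggedResource& r, Payload payload) {
    struct Closer {
        LoggedResource& res;
        ~Closer() { res.close(); }
    };
    r.open();
    Closer closer{r};
    payload(r);
}

// Example payload: the caller never calls close() itself.
std::vector<std::string> demo() {
    LoggedResource r;
    use(r, [](LoggedResource& res) { res.log.push_back("payload"); });
    assert(r.log.back() == "close"); // closed by the library, not by us
    return r.log;
}
```

The user code shrinks to the lambda passed to `use`; the framing stays in one place.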

A destructor for the modern times! So the "incredibly powerful" idea can be saved?

---
* this is called a RAII-pattern in C++, see: http://www.hackcraft.net/raii/
** Kent Beck: "Smalltalk Best Practice Patterns", Prentice Hall, Englewood Cliffs, NJ, 1996.

Wednesday 26 November 2008

C vs C++ and some celebrity gossiping

Every time I read a post by Linus "Linux" Torvalds I can't help thinking "what a smug, presumptuous, xxx-yyy-zzz!". Well, I don't know the man personally, but I certainly wouldn't like to have him as my boss in any project, betcha! The first quote is a couple of years old (I cite from memory as I cannot find it anymore) and was a reply to some proposal Linus didn't like:

....go and play in your little world...
It looks innocuous enough here and now, but in context it was really ugly. And now, for some time, everyone seems to feel obliged to speak about Linus' C++-hating post*, so I had a look at it myself. OK, nothing has changed, it goes on in the same vein:

*YOU* are full of bullshit. ...... is likely a programmer that I really *would* prefer to piss off, so that he doesn't come and screw up any project I'm involved with. ...... the end result is a horrible and unmaintainable mess. But I'm sure you'd like it more than git.
... etc, etc, etc. OK, maybe it's only his personal creative writing coach who's to blame, or perhaps it's the macho Linux kernel developer culture? But, aside from personal dislike, what the man says rang a bell with me. Why? Read on:

C++ leads to really really bad design choices. You invariably start using the "nice" library features of the language like STL and Boost and other total and utter crap, that may "help" you program, but causes:
  • infinite amounts of pain when they don't work (and anybody who tells me that STL and especially Boost are stable and portable is just so full of BS that it's not even funny)
  • inefficient abstracted programming models where two years down the road you notice that some abstraction wasn't very efficient, but now all your code depends on all the nice object models around it, and you cannot fix it without rewriting your app.
In other words, the only way to do good, efficient, and system-level and portable C++ ends up to limit yourself to all the things that are basically available in C.
Whoa, that man is really hardcore! What he's actually saying is: don't trust any code you didn't write yourself! And on a higher level: any abstraction we are using is a trap, lulling us into a false sense of security. And more: we can really build big, fast, complex systems without using OO abstractions!

Didn't I feel the same before? That for efficient, near-system-level code we can take C, and that all the fancy object stuff is better done in Ruby or (even) Java? So no place for C++ here? Take the Wireshark protocol analyzer as an example?

Gossiping

Well, the story doesn't end here. First, there are some entertaining comments on digg**. My favourites are:

  • Linus codes a kernel for a living, so its not that surprising that he hates C++.
  • Linus is an *****, he lives here in Beaverton and the man has a big ego for someone nobody outside of the Linux community cares about.
  • Ok... Who cares if Linus Torvalds hates C++. I don't really give a damn.
  • another episode of "I am Linus and hate everything"

  • funny that linus prefers kde when it is programmed in C++
  • Personally I've evolved thinking in OO abstractions, so working with C++ is much more natural for me than C. Does that mean I'm a crappy programmer? Only Linux Torvals knows

  • The STL is the biggest piece of crap I've seen in 40 years of programming. It's a graduate students project to prove one can write a totally orthogonal, yet totally inefficient, impossible to maintain, piece of crud.
Well, as it seems, first: people aren't taking Linus that seriously, and second: the STL is the culprit!

Living in the past?

So, to begin with something, what's the matter with the STL? As we are gossiping in this installment, it's perfectly fine for me to say that the prevalent opinion on the Web (for example ***) is that Linus is referring to a problem from the past (around 2001 or so) when he's speaking about the non-portability of C++. At that time the support for the C++ standard, and especially templates, was very inconsistent across compilers, and so the STL implementations could be non-portable between compilers! But nowadays even Visual C++ is quite up to speed here!

Then the inefficiency allegation. I don't even want to discuss it here, because it's so old (going back to 1998 or so). There's a long refutation along classical lines from that time to be found****, if not a very entertaining one, and a shorter one*****, from a practitioner's point of view - Steven Dewhurst actually wrote low-level code with C++ and templates:

Just to annoy people like Linus, I've also used typelist meta-algorithms to generate exception handlers with identical efficiency to hand-coded C. In a number of recent talks given at the Embedded Systems conferences, I've shown that commonly-criticized C++ language features can significantly outperform the C analogs.

Who's incompetent?

Next comes the critique that C++ tends to attract substandard programmers, and that:

... limiting your project to C means that people don't screw that up, and also means that you get a lot of programmers that do actually understand low-level issues and don't screw things up with any idiotic "object model" crap.
The first thought that comes to mind is Linus' "software Darwinism": in 2000 he lambasted people wanting a debugger in the Linux kernel. His argument was: I don't need any sissy that needs a debugger! I want people who understand the code as a whole! Any higher level abstraction or language (i.e. STL or C++) will make you a wimp and not careful enough!
The fact is, that is *exactly* the kinds of things that C excels at. Not just as a language, but as a required *mentality*. One of the great strengths of C is that it doesn't make you think of your program as anything high-level...
But isn't this just another management whip to keep the programmers under pressure, so that they are more obedient? A manager's trick? Linus' software management process? I'm the most hardcore of you all, so I'm the overlord ;-). In that light Linus' diatribes are only politics: he's defending the status quo.

There's also a different response to the "substandard programmers" reproach I must mention here. Steven Dewhurst brought up the point that for a C programmer C++ is so complex because there are alien idioms, methodologies, tricks, and so on*****. You would be tempted to say it's too difficult for the average C coder, but there's something else! C++ isn't just C, it only happens to be backwards compatible! When you switch from C to Common Lisp you won't be an expert instantly, but the C folks assume they can just come and start programming C++. And then they cry that it's too difficult and complex.

Summary

This time there's no summary. I was just gossiping...

---
* Linus original post (admittedly taken out of context!): http://thread.gmane.org/gmane.comp.version-control.git/57643/focus=57918, but I must admit, when he's speaking, he does make a much better impression!
** Digg gossiping: http://digg.com/linux_unix/Linus_Torvalds_hates_C
*** Hacker News discussion: http://news.ycombinator.com/item?id=51451
**** A typical reply: http://warp.povusers.org/OpenLetters/ResponseToTorvalds.html
***** Steven Dewhurst's reply: http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=411


Wednesday 19 November 2008

A letter from DLL hell: msvc60.dll and msvcr80.dll

This is only a short technical note - for those who (like me) first check the Internet for solutions to weird programming questions!

The Problem:

Locally everything worked fine: I could install my old (VC++ 6.0) Windows application and start it without a hitch. But at the client's site (1000s of miles away) Windows refused to start it, as it couldn't find the msvcr80.dll library.

What the heck, I link against msvc60.dll but Windows complains about msvcr80.dll, which I don't use, don't link, and don't need??? Is that black magic? Help!!! That was the problem I was fighting for the best part of one of the last weeks. Welcome to the manifest/DLL hell. Well, I first tried to reproduce this locally: it didn't work (i.e. it did work ;-))) on my development Windows XP machine, nor on a freshly installed Windows Server 2003 Std. Edition (SP2), nor on a running Windows Server 2003 Web Edition (SP1): it always worked! No probs at all when starting the app!

At the client there was Windows Server 2003 Web Edition (SP2), with "nothing installed" (as asserted by my colleague at the client's site) except for our product and Oracle! What's going on? There are no dependencies on msvcr80.dll in my code (I checked with the Dependency Walker) and the installation worked for years before I made that last bugfix!

In the last attempt to reproduce the error I found a running installation of Windows Server 2003 Web Edition (SP2) at my client's company local site and could at last see it with my own eyes: "Cannot start application, msvcr80.dll not found".

The Solution:

It's elementary: if there's no static dependency, there must be a dynamic one! The Process Explorer showed that the process really had tried to load msvcr80.dll! How? Through the Qt plugin mechanism:

"Qt applications automatically know which plugins are available, because plugins are stored in the standard plugin subdirectories. Because of this applications don't require any code to find and load plugins, since Qt handles them automatically.

The default directory for plugins is QTDIR/plugins*, with each type of plugin in a subdirectory for that type, e.g. styles..."

Although I used the Qt version 3, this generic mechanism pulled in all the plugins found, which were OK, except for the styles plugin qtdotnet2.dll. Because of course "nothing else installed" wasn't true: the Base package of my client was there, and it had various plugins installed, including the culprit: qtdotnet2.dll, which was written for Qt version 4 and pulled in the Visual C++ 2005 runtime support!

---

Saturday 25 October 2008

Erlang and Map-Reduce

I have been a fan of Google's Map-Reduce for quite a long time: I first did my PhD research in distributed systems and then worked in some high-availability projects, so such an interest emerged somehow naturally.

Their (i.e. Google's) achievements quite impressed me: first the whole infrastructure (lots of C++ coding - they simply wrote their own filesystem and database!!!), but equally the level of abstraction used when working with distributed data. I wrote about the superiority of that model over the SOA model before, although the SOA model is probably the best you can achieve in a heterogeneous environment (and I was probably wrong there...).

So every time I see a map-reduce implementation I can't help reading about it: Hadoop is the best-known open source implementation, but there's the QtConcurrent::mappedReduced algorithm in the new Qt 4.5* as well. You see, the idea seems to be catching on.
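For readers who haven't met the model yet, the map and reduce steps can be sketched in plain standard C++. This is only a sequential toy under stated assumptions: it uses none of the Hadoop or QtConcurrent APIs, and the names `map_line`, `reduce_counts` and `word_count` are invented for the example. A real framework would distribute the map calls and merge the partial results across machines.

```cpp
#include <cassert>
#include <map>
#include <numeric>
#include <string>
#include <vector>

using Counts = std::map<std::string, int>;

// Map step: turn one input record (a line of text) into word counts.
Counts map_line(const std::string& line) {
    Counts counts;
    std::string word;
    for (char c : line + ' ') {        // trailing blank flushes the last word
        if (c == ' ') {
            if (!word.empty()) ++counts[word];
            word.clear();
        } else {
            word += c;
        }
    }
    return counts;
}

// Reduce step: merge two partial results into one.
Counts reduce_counts(Counts total, const Counts& part) {
    for (const auto& kv : part) total[kv.first] += kv.second;
    return total;
}

// Sequential driver: map every line, then fold the partial results.
Counts word_count(const std::vector<std::string>& lines) {
    std::vector<Counts> mapped;
    for (const auto& line : lines) mapped.push_back(map_line(line));
    assert(mapped.size() == lines.size()); // one partial result per record
    return std::accumulate(mapped.begin(), mapped.end(), Counts(), reduce_counts);
}
```

The point of the model is that `map_line` sees one record at a time and `reduce_counts` only ever merges two partial results, which is what makes the work easy to spread over many nodes.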

Now to the news: there is a Map-Reduce implementation in Erlang (!!!)** which runs Python scripts (!!!) and it's called Disco***! And if you don't have a massively parallel cluster at home, you can run it in Amazon's Elastic Compute Cloud! I don't like Nokia very much, but I must admit that this one is rather cool: you simply write scripts to manipulate your data, much in the vein of UNIX shell programming, only infinitely scalable! And we know that scripting languages are much better for data manipulations than Java or C++. According to its homepage, Disco is quite a success too:

This far Disco has been succesfully used, for instance, in parsing and reformatting data, data clustering, probabilistic modelling, data mining, full-text indexing, and log analysis with hundreds of gigabytes of real-world data. ***
Wow! I like the idea of Erlang and Python working in unison!

---
* Qt 4.5 docs: http://doc.trolltech.com/main-snapshot/threads.html#qtconcurrent
** a small introduction to Erlang: http://ib-krajewski.blogspot.com/2007/08/erlangs-change-of-fortunes.html
*** Disco's homepage: http://discoproject.org/

Tuesday 7 October 2008

C++ servlets - again

It looks like it's getting to be a new hobby of mine: collecting C++ web application frameworks. After the first one* (a simple HTTP server and session classes) and the modern one (Wt aka witty)**, now the "missing link" was found: the classic Java-like Servlet container implementation in C++! This was made possible by a friendly fellow blogger Eduardo Zea.

Eduardo was kind enough to give me a link to the DDJ article describing such an implementation by Rogue Wave named Bobcat***. It's quite old (by SW-industry standards) and the link to the evaluation downloads doesn't work anymore, so I think it didn't quite catch on. But it's another one in my collection! So the current counters are C++=3, Java=googol.

PS: To be more precise, Bobcat functionality is now part of Hydra Express****, Rogue Wave's SOA publishing framework. So are we all going SOAP?

---
* see: http://ib-krajewski.blogspot.com/2007/09/servlets-in-c.html
** The Wt-framework: http://www.webtoolkit.eu/wt/
*** John Hinke, Implementing C++ Servlet Containers, April 01, 2002: http://www.ddj.com/184405023
**** http://www.roguewave.com/blog/so-what-is-it-with-rogue-wave-and-xml-soa/

Monday 29 September 2008

Beautiful code

What is beautiful code? The shortest answer (which I've read somewhere but can't remember where) is:

we all know what "ugly code" is: code that someone else wrote...
But beautiful code? Isn't it in the eye of the beholder? Well, for me, beautiful equals readable. You have to see at first sight what the overall idea of the piece of code is. On the other hand, the idea itself might be crap (!!!) but then we should ask the next question: what is a beautiful design/architecture?

I, for my part, am thus a proponent of writing aesthetically appealing code. And I'm not alone! Read this:

Whether it is a natural occurrence, a quirk of human languages, or conditioning, most people find while (x==3) significantly simpler to read than while (3==x). Although neither is going to cause confusion, the latter tends to slow people down or interrupt their train of thought. In this book, we have favored readability over safety—but our situation is somewhat different than that of normal development. You will have to decide for yourself which convention suits you and your team better. *
Here we've got it: the eternal problem with the coding guidelines forcing me to write a plug-ugly (3==x)! The question is: should we write ugly code just to be on the safe side? I admit that I have never written such an ignominious line of code in my life. You expect trouble? Nope! I've never had any problems at this point! Well, one single time I mistyped it, but found it out in an instant! In C++ you have to develop a certain sensitivity for that construct, that's all.

As for practical examples (instead of dull theorizing) - some time ago I asked myself: what is the coolest/most beautiful piece of code you ever wrote, what would you show to others in order to impress them or to show them how crystal-clear ;-) your style is? As it was even before my lambda library, and I only had plain production code at my disposal, I settled on the following piece:
    /*------------------------------------------------------------------------*/ /**
    * @brief configDelta
    * @descr
    * This function compares two configurations and returns the differences.
    *
    * @param[in] other - the other configuration
    * @param[in] scope - what delta requested: all the new entries, all the deleted
    *                    entries, or all changes altogether?
    * @param[out] delta - the calculated difference
    *
    * @note Assumption: both configurations must be sorted!!!
    *///--------------------------------------------------------------------------->

    void SimpleCfgFile::configDelta(const SimpleCfgFile& other, CfgDelta scope,
                                    vector<string*>& delta) const
    {
        TRACE_FUNC("SimpleCfgFile::configDelta");
        delta.clear();

        switch(scope)
        {
        case addedDelta:
            TRACE_DEBUG("addedDelta");
            // all in this but not in other:
            set_difference(begin(), end(),
                           other.begin(), other.end(),
                           inserter(delta, delta.begin()),
                           less_then_deref<string*>());
            break;

        case removedDelta:
            TRACE_DEBUG("removedDelta");
            // all in other but not in this:
            set_difference(other.begin(), other.end(),
                           begin(), end(),
                           inserter(delta, delta.begin()),
                           less_then_deref<string*>());
            break;

        case completeDelta:
            TRACE_DEBUG("completeDelta");
            // all in this but not in other + in other and not this:
            set_symmetric_difference(begin(), end(),
                                     other.begin(), other.end(),
                                     inserter(delta, delta.begin()),
                                     less_then_deref<string*>());
            break;

        default:
            TRACE_ERR("Unknown scope requested for configuration delta!!!");
        }

        TRACE_VALUE(delta.size());
    }
You see, it's not rocket science. What I liked in this piece of code was its conciseness, readability and (though it's of no real importance for the workings of the code) its symmetry. And moreover, I was amazed how a judicious usage of the standard library simplified a task which at first seemed to be a rather daunting one!

And lastly, it's an illustration of the fact that you don't have to use a lambda library to obtain clear code: I just wrote the following trivial functor:
    template <class T> struct less_then_deref : binary_function<T,T,bool>
    {
        // OPEN TODO ---> constraint: isPtrType(T)...
        bool operator() (const T& x, const T& y) const { return *x < *y; }
    };
instead of the lambda expression (*$1 < *$2) and it is still readable. Or even more readable by using a telling name?
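As a self-contained illustration of the technique, here is a minimal sketch of std::set_difference over sorted ranges of pointers, compared through a dereferencing functor. It is not the production code above: `LessDeref`, `added_entries` and `demo_delta` are names made up for this example, and the configuration class is replaced by plain vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Compares two pointers by the values they point to.
struct LessDeref {
    bool operator()(const std::string* x, const std::string* y) const {
        return *x < *y;
    }
};

// Entries present in `mine` but not in `other`; both must be sorted by value.
std::vector<const std::string*> added_entries(
        const std::vector<const std::string*>& mine,
        const std::vector<const std::string*>& other) {
    assert(std::is_sorted(mine.begin(), mine.end(), LessDeref()));
    assert(std::is_sorted(other.begin(), other.end(), LessDeref()));
    std::vector<const std::string*> delta;
    std::set_difference(mine.begin(), mine.end(),
                        other.begin(), other.end(),
                        std::back_inserter(delta), LessDeref());
    return delta;
}

// Small worked example: {"a","b","d"} minus {"b","c"} leaves "a" and "d".
std::vector<std::string> demo_delta() {
    const std::string a = "a", b = "b", c = "c", d = "d";
    std::vector<const std::string*> mine{&a, &b, &d};
    std::vector<const std::string*> other{&b, &c};
    std::vector<std::string> result;
    for (const std::string* p : added_entries(mine, other)) result.push_back(*p);
    return result;
}
```

The two asserts spell out the "must be sorted" assumption from the doc comment: the set algorithms silently produce garbage on unsorted input.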

---
* taken from the following book: Groovy in Action, Dierk König et al., Manning 2007, page 157-158

Wednesday 17 September 2008

Google's technology stack

Well, I said I wouldn't write any knee-jerk reaction posts on this blog, only well thought-through, thoroughly researched, and insightful entries. Certainly, I have some entries I should rather be working on, like multithreading testing or lock-free synchronization... But I must admit, that was a bit over-optimistic, as you'll see in a second...

Recently, I stumbled across this one:
Google has recently launched the Google App Engine. From an Java enterprise developers point of view it is shamelessly easy to use, deploy, etc. Well, unfortunately it only takes Python apps for now, but it is stated that there will be more languages supported in the future. But it’s Google again putting its finger into the Java EE wound (first GWT with webapps, then Android shaking the Java ME world, and now App Engine showing how runtimes should look like).*
I blogged before about the "Google phone", which came out not as a phone, but as an SDK (BTW: do you want to make your 1st million? Take part in the Android Developer Challenge, no kidding!). The local German "Java Magazin" published on this occasion (i.e. Android's release) an editorial accusing Google of attacking Sun and Java, splitting the Javaland and whatever. What's the fuss?

I cite Wikipedia**:
Dalvik is often referred to as a Java Virtual Machine, but this is not strictly accurate, as the bytecode on which it operates is not Java bytecode. Instead, a tool named dx, included in the Android SDK, transforms the Java Class files of Java classes compiled by a regular Java compiler into another class file format (the .dex format).
So you are writing Java code, but it's not running on the JVM! Is that forbidden?
"...some have related Dalvik to Microsoft's JVM and the lawsuit that Sun filed against Microsoft, wondering if a similar thing might happen with Google, while others have pointed out that Google is not claiming that Dalvik is a Java implementation, whereas Microsoft was."***
I don't know. But I think it shouldn't be!

Now for the Google App Engine. When I had a look at it some time ago, it didn't ring a bell with me. I rather thought of it as another grid computing offering, like Amazon's Elastic Compute Cloud: just write your app locally and then throw it on the grid and it will scale automatically with your needs. But when Java people see this, they see Java technology attacked. The same for GWT: it is JSF as it always should have been. But come on, you are still writing your programs in Java, the difference is that the ideas don't come from Sun! I'd rather say Google is giving a second life to Java by providing new ways of using it. I wouldn't have thought that 5 years ago, when they were essentially a C++/Python shop!

Additionally, I can't help feeling that the Java people are thinking in an "imperialistic" way: boasting about their superiority, but on the other hand always suspicious that someone may try to challenge their (self-proclaimed) supremacy. Like the late USSR...

But on the other hand, when you look at Google, you could be tempted to think that they are writing everything anew: recently they published their own C++ test framework**** and their own (C++) data transfer encoding****, just as examples. So maybe it's not an assault on Java itself, but just a manifestation of the "Not Invented Here" syndrome? Well, the employees must do something in their 20% project-free time, so they program every conceivable thing anew (and better?).

---
* http://adminsight.de/2008/05/05/springsource-announces-an-application-plattform/
** http://en.wikipedia.org/wiki/Dalvik_virtual_machine
*** http://www.infoq.com/news/2007/11/dalvik
**** Google test framework: http://code.google.com/p/googletest/wiki/GoogleTestPrimer, Google transfer encoding: http://code.google.com/apis/protocolbuffers/docs/overview.html, and now they even wrote their own browser: Google Chrome (ok, you knew that already...)

Wednesday 27 August 2008

The end of dumbing-down of programming?

Times are changing. Definitely. Think about this little story: do you remember what Java's USP was at the beginning? Yes, the garbage collection, and yes, the JVM portability. But the main thing was its philosophy of not allowing bad programmers to make bad mistakes. We don't have pointers, we don't have operator overloading, we don't have multiple inheritance, our core classes are final... As one person expressed it at that time: "...they gave me a paper hammer instead of a real one so I can't hit my fingers!" And that's the reason why I didn't like it, didn't really want to use it, and consequently missed out on a cash cow :-(.*

But yesterday, I read this on the InfoQ**:
... And it is true, my experience weaves that out too: you can create environment really restricted just to keep bad developers out of trouble, but these restricted environments harm the productivity of your best programmers. Basically what you do is: you are not speeding up your bad developers and you are slowing down your best developers and that's why our productivity stinks in software right now, but the attitude is changing around.
I've read similar complaints before, but as it seems, after the Ruby shock (RoR faster than Struts and 10x more productive) such an opinion is somehow fashionable and even almost mainstream today.

What do you say? Isn't that the old C++ philosophy we are returning to? I cite Bjarne***:
Kierkegaard was a strong proponent for the individual against "the crowd" and has some serious discussion of the importance of aesthetics and ethical behavior. I couldn't point to a specific language feature [....] but he is one of the roots of my reluctance to eliminate "expert level" features, to abolish "misuses", and to limit features to support only uses that I know to be useful.
---
* The language was just plain uninteresting to me, and I didn't see its real merits at the time. Maybe I didn't want to see them?
** http://www.infoq.com/interviews/Languages-Platforms-Neal-Ford
*** http://technologyreview.com/Infotech/17831/page3/


PS: BTW, concerning the "Java is the next Cobol" thing, there was a discussion on some German Java forum here, where I argued with 2 arguments, but as I see now, I (and everyone else) missed the most important one. Namely: Java is the new Cobol, as it's mainly used in corporate and business settings (like Cobol was). The solution is simple, isn't it?

Wednesday 9 July 2008

C++ pocket lambda library, the last

So, this will definitely be the last part! I promise! I planned this to be a three-part series, and see how it has grown. But let's get down to business: in the previous installments we achieved the following:
    // part 1: basics
    find_if(vec.begin(), vec.end(), _$1 <= 10);
    transform(vec.begin(), vec.end(), vec1.begin(), _$1*2);
    sort(vp.begin(), vp.end(), *_$1 < *_$2); // sort needs a strict ordering, so < and not <=
    // part 2: function applications
    for_each(vec.begin(), vec.end(), bind(sinus, _$1*(pi/180.0)) );
    // part 2a: member access
    find_if(vecx.begin(), vecx.end(), _$1->*(&XXX::getValue) <= 2);
    // part 3: output
    for_each(vec.begin(), vec.end(), cout << delay("---") << _$1 << delay("\n"));
You can see, we have defined a kind of custom (if not too counter-intuitive) mini-language for the lambda expressions. As each language has to be learnt, we'd like it to be kept simple! So I have only one more thing to add, namely:

1. Control structures


This is really fun, because it's not difficult at all and it lets us define very cute lambdas indeed. The entire code, as it stands in the library, looks like this:
    // if_then
    // ---

    template <class S, class T> void eval_then_expr(S& e, T& t) { e(); }

    // helper: Assgn needs the iterator for: if_then(_$1==0, _$1=44)
    // --- OPEN, TODO: make general for e.g.: if_then(_$1>=3, cout << _$1)
    // --

    template <class T> void eval_then_expr(Assgn<T>& e, T& t) { e(t); }

    template<class S, class T> struct IfThen : public lambda_expr {
        S if_expr;
        T then_expr;
        IfThen(S s, T t) : if_expr(s), then_expr(t) {}
        // OPEN, TODO: check if then_expr needs the val argument!
        // returns void: nothing is computed, and for_each ignores the result anyway
        template <class U>
        void operator()(U& val) { if(if_expr(val))
                                      eval_then_expr(then_expr, val); }
    };

    template <class S, class T>
    IfThen<S, T> if_then(S s, T t) { return IfThen<S, T>(s, t); }

    // t.b.c....
You see, there are some TODOs, which we'll discuss in due time, but the technique is simple: first we store the if_expr and the then_expr (as functors), and when the IfThen functor is evaluated, we just evaluate the sub-functors, and make the then_expr dependent on the value of the if_expr. I didn't think myself this would be so simple! As this isn't a finished library, but rather an exploration of some possibilities in code, there are a lot of loose ends here. First, we probably don't need the val parameter in the function call operator. Second, if the then_expr needs the iterator (i.e. _$1), I currently just hacked in the overload for the assignment only. This must be extended with some general forwarder mechanism. But even with this rudimentary support, we can do this now:
    // count occurrences
    for_each(vec.begin(), vec.end(), if_then(_$1 == 1, globvar(yyy_ctr)++));
    // replace values
    for_each(vec.begin(), vec.end(), if_then(_$1 == 1, _$1 = 9));
I think it's pretty useful. Other possible control structures? Well, I think they are possible, but not so useful: a loop, a switch? I don't think that it would be good design. What we would really need sometimes is rather a possibility to nest STL algorithms than to use a loop functor. But it's not difficult, maybe something along the lines of:
    template <class U>
    void operator()(U& val) {
        for(auto it = val.begin(); it != val.end(); it++) // auto: C++0x
            eval_loop_body(loop_body, it); }
would do, and then we could process a container of containers:
    for_each(cc.begin(), cc.end(), loop_all(cout << _$1));
and perhaps even to extend it like that:
    for_each(cc.begin(), cc.end(), loop_some(_$2 < 1, cout << _$1));
    for_each(cc.begin(), cc.end(), loop_counted(10, cout << _$1));
But do we need it? I think it's more a gimmick than a useful feature, because it's not orthogonal: you have a host of special loop_xxx lambdas instead of a single mechanism. What do you think?
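Since the library code above depends on pieces from the earlier installments (lambda_expr, Assgn, globvar), here is a stripped-down, self-contained sketch of the same if_then technique. Everything in it is an assumption made for illustration: `Arg1`/`arg1` stand in for the `_$1` placeholder, and `EqualsValue`, `AssignValue` and `replace_ones` are invented names, not the library's own.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Functor that assigns a fixed value to the current element.
template <class T> struct AssignValue {
    T value;
    template <class U> void operator()(U& x) const { x = value; }
};

// Placeholder standing for "the current element" (the _$1 of the series).
struct Arg1 {
    // `arg1 = v` does not assign anything, it builds a functor.
    template <class T> AssignValue<T> operator=(T value) const {
        return AssignValue<T>{value};
    }
};
const Arg1 arg1 = {};

// Functor built by `arg1 == v`.
template <class T> struct EqualsValue {
    T value;
    template <class U> bool operator()(const U& x) const { return x == value; }
};
template <class T> EqualsValue<T> operator==(Arg1, T value) {
    return EqualsValue<T>{value};
}

// Stores both sub-functors and runs `then_expr` only if `if_expr` holds.
template <class S, class T> struct IfThen {
    S if_expr;
    T then_expr;
    template <class U> void operator()(U& x) const {
        if (if_expr(x)) then_expr(x);
    }
};
template <class S, class T> IfThen<S, T> if_then(S s, T t) {
    return IfThen<S, T>{s, t};
}

// Replaces every 1 by 9, in the spirit of if_then(_$1 == 1, _$1 = 9).
std::vector<int> replace_ones(std::vector<int> v) {
    std::for_each(v.begin(), v.end(), if_then(arg1 == 1, arg1 = 9));
    return v;
}
```

In this sketch every then-expression takes the current element, which sidesteps the forwarding problem mentioned in the TODOs rather than solving it.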

2. Summing this all up


In conclusion? There are 2 conclusions:

1. lambda library is cool, all this stuff is cool, I'm cool.

2. Frankly, isn't that all just appalling? All that effort and what we got is an unnatural syntax! And it's not transparent: for each new combination of operators I've got to write new code in the library (or almost)! Makes you think of Philip Greenspun's Tenth Rule of Programming: "Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified bug-ridden slow implementation of half of Common Lisp." Please Mr. Stroustrup, why don't we have lambda expressions as a core language feature in C++???

Actually, as the things are, it seems like we are going to have lambda functions in the new C++0x standard*! For me, they look like Groovy lambdas (or are they really Ruby's ???), compare**:
Groovy: myMap.each { key, value -> println "$key => $value" }

C++0x: for_each(a.begin(), a.end(), <> (int x) -> int { return sum += x; } );
// or, equivalently, but not much Groovy-like:
for_each(a.begin(), a.end(), <>(int x) { return sum += x; } );
Should we rejoice then? The proposal paper itself lists the problems with lambda expressions:

1. lambda-libraries may render simpler code in basic cases, compare:
    // lambda lib.
os << _$1
// lambda expr.
<> (int i) extern(os) { os << i; }
You see, we need the old, ugly, annoying type specifications again! And we've just started to enjoy the typeless (ehm, generic...) programming in C++! Isn't that what the whole template thing is for? This leads immediately to the second problem:
2. there are no polymorphic lambda functions!!!

The proposal doesn't allow us to write templated, polymorphic lambda functions (i.e. ones with implicit type recognition), as that would clash with the concepts feature. More specifically, it wouldn't exactly clash and explode, but rather the concepts couldn't guarantee correct type checking in this case. So bye bye type freedom? Do we always have to use the annoying explicit typing? Imagine a lambda function working with an iterator on a list of strings, and then write this type explicitly as the input parameter. Ghastly! When you are using a lambda library solution, you'll simply write _$1, and that's it! In such a case, the whole point of the lambda expression, its conciseness, goes down the drain. I can write a standalone functor as well.
Summing up: if we have only a simple thing to do, a lambda library solution is simpler. But if there is some more work to do, and we don't have elaborate type specifications for input parameters, the lambda expression solution clearly has an edge, as we don't have to learn a new, special-purpose mini-language, but can just use the standard C++ syntax.
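For comparison, this is roughly what the "standalone functor" fallback mentioned above looks like. It is only a sketch: the names `Appender` and `join_all` are invented for the example. Because its call operator is a template, the functor is polymorphic in exactly the way the proposal's lambdas are not, so the element type never has to be spelled out.

```cpp
#include <algorithm>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// A hand-written functor with a templated call operator: it accepts any
// streamable element type, so the caller never names that type.
struct Appender {
    std::ostream* out;
    explicit Appender(std::ostream& os) : out(&os) {}

    template <class T>
    void operator()(const T& item) const { *out << item << ';'; }
};

// Works for any container whose elements can be streamed.
template <class Container>
std::string join_all(const Container& c) {
    std::ostringstream os;
    std::for_each(c.begin(), c.end(), Appender(os));
    return os.str();
}
```

The price is the usual one: the functor lives far away from the call site and needs a name, which is what a lambda was supposed to spare us.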

As it seems, none of the solutions is optimal: do we need both of them? And I can say I'm rather not pleased, although the situation here is definitely better than in Python.

3. Personal note


In the past I thought (or rather believed) that you could just do everything in C++ with libraries, operator overloading and template tricks. But I guess I was just dazzled by the template syntax ;-). After this series I think there are limits to the flexibility of C++ which can be only worked around with ugly syntax (or macros).

---
* as in the last "State of C++ Evolution" document they are on the list of the features which are definitely planned to be included, see: "State of C++ Evolution (between Portland and Oxford 2007 Meetings)" of Dec.01.2007 - http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2142.html
** C++0x lambda expressions proposal - http://www.research.att.com/~bs/N1968-lambda-expressions.pdf

Wednesday 25 June 2008

The future of C++

In my recent blog entry* I complained about not exactly knowing where C++ is heading, what features C++0x will contain when it finally appears, and if we'll need to switch to hex as in C++0a ;-). Then I read some interviews with Bjarne Stroustrup** and things became clearer.


1. The process


The first ambiguity I addressed was the very loooong time C++ needs to acquire new features, and I hinted at the lack of corporate backing. Bjarne on that**:
BS: The progress on standard libraries has not been what I hoped for. ... We will not get ... I had hoped for much more, but the committee has so few resources and absolutely no funding for library development.
and:
BS: There is no shortage of good ideas in the committee or of good libraries in the wider C++ community. There are, however, severe limits to what a group of volunteers working without funding can do. What I expect to miss most will be thread pools and the file system library. However, please note that the work will proceed beyond '09 and that many libraries are already available; for example see what boost.org has to offer.
So that's pretty clear: the industry isn't backing C++ anymore, and the ISO bureaucracy isn't helping here. Java's JCP is definitely more lightweight. Python doesn't have a committee. IBM, Sun (and Google?) are massively backing Java. Previously Google was a stronghold of C++ (MapReduce, BigTable, BigFile), but lately Java seems to be taking over, in the real "weed language" manner.


2. The features


The second of my questions was the extent of new features which will make it into the new standard, because this was a constant source of confusion: thousands of proposals, each in a different state of progress, constantly wandering between approved, considered, considered strongly, considered not so strongly, not considered, demised, etc ;-). Now the new feature scope seems to stabilize at last. Bjarne again**:
BS: I do — based on existing work and votes — expect to get:

Libraries
- Threads
- Regular expressions
- Hash tables
- Smart pointers
- Many improvements for containers
- Quite a bit support for new libraries

Language
- A memory model supporting modern machine architectures
- Thread local storage
- Atomic types
- Rvalue references
- Static assertions
- Template aliases
- Variadic templates
- Strongly typed enums
- constexpr: Generalized constant expressions
- Control of alignment
- Delegating constructors
- Inheriting constructors
- auto: Deducing variable types from initializers
- Control of defaults
- nullptr: A name for the null pointer
- initializer lists and uniform initialization syntax and semantics
- concepts (a type system for template arguments)
- a range-based for loop
- raw string literals
- UTF8 literals
- Lambda functions
The most important feature (IMHO) is**:
BS: The new memory model and a task library was voted into C++0x in Kona. That provides a firm basis for share-memory multiprocessing as is essential for multicores.
and, of course, the auto keyword and lambdas!

Maybe more important is what won't be there**:
BS: The progress on standard libraries has not been what I hoped for. .... We will not get the networking library, the date and time library, or the file system library. These will wait until a second library TR. I had hoped for much more, ...
BS: ... What I expect to miss most will be thread pools and the file system library. However, please note that the work will proceed beyond '09 ...

But an important change of working style will take place:**
... Fortunately, the committee has decided to try for more and smaller increments. For example, C++0x (whether that'll be C++09 or C++10) will have only the preparations for programmer-controlled garbage collection and lightweight concurrency, whereas we hope for the full-blown facilities in C++12 (or C++13).
and we may expect a host of new extensions in the future!

On the other hand, at least where multithreading is concerned, there are independent libraries available, and they offer some rather high level concepts! Take for example Trolltech's latest Qt 4.5 framework***, which implements futures, automatic scaling for multicore, has a MapReduce implementation plus concurrent mapping and filtering algorithms. It's just like Java 7's fork-join framework* and the ParallelArray class. Bravo! Another library with high level threading support is of course Intel's Threading Building Blocks, and it's not bad either! At this point we don't need a standard document, as it seems.


3. The prospects


At last, let's pose a more general question: what kind of language does C++ want to be? In the past Bjarne Stroustrup maintained that C++ should be a "general purpose programming language". Contrast this with the statements from the last interviews**:
JB: You are looking at making C++ better for systems programming in C++0x as I understand it, is that correct? ...

BS: Correct. The most direct answer involves features that directly support systems programming, such as thread local storage and atomic types.

JB: C++ is often used in embedded systems, including those where safety and security are top priorities. What are your favorite examples and why do you think C++ is an ideal language for embedded systems especially where safety is a concern, aside from easy low-level machine access?

BS: Yes, and I find many of those applications quite exciting.

For my taste, Bjarne clearly thinks that C++ is a system and embedded programming language: e.g. he expressed his fondness for robotics systems before. That's bad news, because I don't really like embedded programming and automotive :-(((. On the other hand, system programming is quite exciting for me, provided I don't have to fiddle with low level data structures too much.

Allow me a question in this context: for me OO programming is about hiding low level details, seeing the bigger picture. If the strengths of the language lie in the hardware access then maybe it's not so useful for OO programming? We can use C for low level programming (like Wireshark's code does, and it's not a toy system) and do OO in Java or Python? So where's the place for C++ then?

But maybe the future will be totally different? Maybe we won't be programming C++ anymore but rather the Qt platform, which only happens to be written in C++? This would be akin to the Java platform: because C++ doesn't provide a standard ABI, many companies are using Qt to assure the portability of their code between operating systems (among others my current client). Interestingly, not only in GUI applications, but also in general purpose programming! But what about the (maybe only preconceived) incompatibility with the standard library, which I bemoaned in one of my previous entries? If I can give faith to Danny Kalev's words****:
And in other news, Nokia completed its acquisition of Trolltech last week.
...
Qt is currently used in Skype, Google Earth, and Adobe Photoshop Elements.
....
After Nokia's acquisition, it seems that Qt will be modified to support Symbian and other mobile environments, or at least POSIX libraries and better support for mainstream C++.
then it seems that not only yours truly is having that impression, and that there will be a remedy soon!

--
* Language trends and waiting blues: http://ib-krajewski.blogspot.com/2008/04/language-trends-and-waiting-blues.html
** An Interview with Bjarne Stroustrup: http://www.ddj.com/cpp/207000124 (but also http://www.informit.com/articles/article.aspx?p=1192024 , although I don't quote it here)
*** http://labs.trolltech.com/page/Projects/Threads/QtConcurrent
**** C++ Reference Guide: http://www.informit.com/guides/guide.aspx?g=cplusplus (Notes on 25th of June)

Sunday 8 June 2008

C++ pocket lambda library, part III


When I first wrote my C++ lambda code about a year and a half ago, I didn't know that I was hitting such a hot topic. I just wanted to reduce the amount of code I had to write, and didn't have any high-church functional programming thingy in mind. But now lambdas and closures are all the rage: look at all those Groovy articles on developerWorks for example. Even Java 7 will have closures (or won't it?). Definitely, Ruby has made programming languages an interesting topic again!

BTW, do you know what currying and lambdas look like in Haskell, a popular functional (statically and strongly typed) language? If you don't know what it is, we've discussed currying in a previous posting of this mini-series*. In Haskell it is very natural syntactically, you can just write (well, almost, I skipped the T=>T=>T type definition):

    product a b = a*b
    double = product 2    -- curried!
    double 2
This is basic, but look at that:

    doubleEach = map (2 *)
i.e. we define a partially evaluated function (as map needs 2 arguments, a function and a list) waiting for application. It's as if you were using the bind() template of our last posting* in C++. And look at the cute lambda function shortcut: (2 *) is a function (unsurprisingly, as in Haskell everything is a function, contrary to Ruby or Groovy where everything is an object, even the functions ;-)). I like it.
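A rough C++ counterpart of `doubleEach = map (2 *)` can be sketched with the standard library's `std::bind` (C++0x/C++11) instead of the `bind()` template of this series. The function name `double_each` is just an illustrative choice.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Binds the first argument of multiplication to 2 and leaves the second
// one open, then maps the resulting unary function over the input.
std::vector<int> double_each(const std::vector<int>& in) {
    using namespace std::placeholders;
    auto twice = std::bind(std::multiplies<int>(), 2, _1);  // like (2 *)
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(), twice);
    return out;
}
```

Compared with the Haskell one-liner, the partial application has to be requested explicitly, and the list argument cannot be left open without wrapping everything in yet another function.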


1. Getting exxpressive


Admittedly the code in the 2nd part of this mini series was rather bland*: some hyper technical stuff but not really very entertaining like the 1st part (which was really fun for me to code). I wrote it only for completeness' sake, as it is part of my code. I hope this installment will be more fun again.

So let us tackle the last topic we need to implement in order to have a usable lambda mini-library: the expression templates. The what? Why do we need it exactly? I'm certainly not going to write code like: __expr template <class Expressible> express_anger(Expressible& e); - not in my life!!! Ok, ok, let's introduce the concept gently.

Do you know for example the Boost (e)Xpressive library? It expresses (expressively) regular expressions (the Boost Spirit library does the same in much greater style for grammars) by C++ code constructs. I.e. instead of:

    sregex rex = sregex::compile( "(\\w+) (\\w+)!" );
you can write

    sregex rex = (s1= +_w) >> ' ' >> (s2= +_w) >> '!';
using the domain specific language (buzzword alarm!!!) instead. The string: "\\w+" is replaced by an (Xpressive) expression +_w. You recognize perhaps the usage of placeholders, like our _$1 or _$2, operator overloading and assignment of partial matches to external variables (external to the closure, you'd say in Perl or Groovy). But in this case we don't have a single operation which should create a functor, nor a combination of two different operations. Here we have one operation applied again and again (>> concatenator), and we have to encapsulate it in a single lambda functor!

The same problem emerges in the context of our mini-library.

    for_each(vec.begin(), vec.end(), cout << "-->" << _$1 << "\n");
we have to collect all the items which have to be sent to cout, which can be infinite in number!

Here expression templates come to the rescue. First described by Todd Veldhuizen**, they let us define recursive templates with operator overloading. And recursion can go infinitely deep down, so we can accommodate our long shift operator sequences with our usual aplomb! What we need is the following tree structure:

                  expr
                    ¦
                   op >>
                 /    \
               op >>   \
             /   \      \
           op >>  \      \
         /        \      \
      s1=+_w  ' ' s2=+_w  '!'
True to the "Modern C++ design" book's ubiquitous typelists usage, we can express this runtime structure at compile time with the following monstrous type:

    Op<Op<Op<Char, Expr>, Expr>, Char> rex;
Here, all the structural information has been recorded: just read the type from left to right and compare it with the picture of the parse tree. Now we have to supply the arguments to the constructor of an object of this type and then call a method of the rex object. But we don't do it in the classical sense: the construction and the type definition will be done recursively while the compiler is parsing the expression. Hence the name of the expression templates #B-D.
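The claim that the compiler builds the tree type while parsing can be checked directly. The following sketch is not Xpressive code: `Expr`, `Word`, `Space`, `Bang` and `Seq` are invented stand-ins for the regex pieces, and `decltype`/`static_assert` from C++0x are assumed to be available.

```cpp
#include <type_traits>

struct Expr {};                 // tag: "this type is part of an expression"
struct Word  : Expr {};         // stands for +_w
struct Space : Expr {};         // stands for ' '
struct Bang  : Expr {};         // stands for '!'

// One ">>" node of the tree, holding both operands by value.
template <class Left, class Right>
struct Seq : Expr {
    Left left;
    Right right;
    Seq(Left l, Right r) : left(l), right(r) {}
};

// Only expression types take part, so ordinary uses of >> are untouched.
template <class Left, class Right>
typename std::enable_if<std::is_base_of<Expr, Left>::value &&
                        std::is_base_of<Expr, Right>::value,
                        Seq<Left, Right> >::type
operator>>(Left l, Right r) { return Seq<Left, Right>(l, r); }

// The type of the whole expression mirrors the parse tree.
typedef decltype(Word() >> Space() >> Word() >> Bang()) Rex;

static_assert(
    std::is_same<Rex, Seq<Seq<Seq<Word, Space>, Word>, Bang> >::value,
    "operator>> is left-associative, so the tree leans to the left");
```

Nothing is executed here at all: the nesting of `Seq` is a pure by-product of overload resolution on each `>>`.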


2. First real exxpressive code


To convince the compiler to do some sensible work for us we first define the following recursive template:

    // Shift represents a node in the parse tree
    // ---
    template<typename Left, typename Op, typename Right>
        struct Shift  : public lambda_expr
    {
        Left leftNode;
        Right rightNode;
        std::ostream& out;

        Shift(Left t1, Right t2, std::ostream& os)
            : leftNode(t1), rightNode(t2), out(os) { }

        template <class T>
            void operator() (T& t) { Op::print(leftNode, rightNode, t); }
    };
As you can see, the structure of the tree node is different from the monstrous type given as example above, well, it's even more complicated. Using this approach we would express the above example tree as:

    Node<Node<Node<Char, Op, Expr>, Op, Expr>, Op, Char> rex;
Ok, why not. If it's supposed to help, I couldn't care less... ;-)

But what has all this to do with our lambda library? The answer is, we can apply the same concept to the problem of printing data to cout: suppose we have the following lambda function: cout << "element:" << _$1, then we can build a type tree like:

                  expr
                    ¦
                  << op
                 /    \
             << op     \
             /   \      \
            /     \      \
         cout "element:" _$1
One interesting thing to note is the ()-operator, which prints the actual node of the parse tree using a mysterious T& t argument: it is the actual parameter of the lambda function, i.e. the underlying iterator itself!

So if we have the top level tree node (i.e. the expr in the diagram above), we'll just call its ()-operator and the whole tree is printed out, hopefully! Two problems pose themselves here:

1. how to descend to the subnodes of the tree while printing, and
2. how do we get the whole tree structure constructed in the first place?


3. Recursive printing


To answer the first question we need the representation of the recursive printing operation in code:

    struct shiftOp // Represents << operation
    {
        template <class Left, class Right, class T>
            static void print(Left& left, Right& right, T& t)
            {
              print(left.leftNode, left.rightNode, t); // walk down
              left.out << right(t);  // if right needs t? : << _$1*2 ???
            }
 
        template <class Right, class T>
            static void print(std::ostream& left, Right& right, T& t)
            {
              left << right(t); // at the beginning
            }

        template <class Left, class T>
            static void print(Left& left, placeholder<1>& right, T& t)
            {
              print(left.leftNode, left.rightNode, t);
              left.out << t; // special case placeholder
            }
    };
First we walk down the tree in the depth first, left to right mode (the first print() function). When we arrive at the lowest left node of the tree we do the first print, then go back and print the corresponding right node. Note that we don't walk a physical tree here, we walk a type expression which is organized like a tree! The print() functions will "match" a part of the type-tree, print the matched part, and match the subtype in a recursive manner. So we are treating types in compile time as we'd treat data in the runtime! That's why this is called template META-programming.

Then we need only 2 specialisations: one for the first invocation of the shift operator, where the left operand is the output stream itself, and the second one to deal with our lambda mini-library's placeholder types. The placeholder will be printed directly to the stream. Our tree expression for the simple example above will be then as follows:

    Shift<Shift<cout, shiftOp, "element"-Expr>, shiftOp, _$1> lambda;
Now just imagine how the print() function will work on it.


4. Growing the tree


To answer the second question we need 2 simple ;-) operator overloading definitions:

    // The overloaded operator which does parsing for expressions of the
    //  form "a<<b<<c<<d" => Shift<Shift<Shift<A, op, B>, op, C>, op, D>()

    // for ostream: cannot be taken by value!
    template<class Left, class Right>
        Shift<Left&, shiftOp, Right> operator<< (Left& left, Right right)
    {
        return Shift<Left&, shiftOp, Right>(left, right, left); // left IS an ostream!
    }

    // for lambda_expr: must be taken by value!
    template<class Left1, class Left2, class Right>
        Shift<Shift<Left1, shiftOp, Left2>, shiftOp, Right>
            operator<< (Shift<Left1, shiftOp, Left2> left, Right right)
    {
        return Shift<Shift<Left1, shiftOp, Left2>, shiftOp, Right>(left, right, left.out);
    }
The first one starts the recursive template definition at the "cout <<"-expression, and the second one goes one nesting level deeper and one <<-operator application to the right. It's an elaborate syntax, but conceptually it's not rocket science! Note how the stream (cout) parameter is handed down the expression tree.
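For readers who want something they can compile on its own, here is a condensed sketch of the same idea. It is deliberately not the code above: the names `arg1` (instead of `_$1`, since `$` in identifiers is a compiler extension) and `StreamLeaf` are invented, the separate `shiftOp` class is folded into the node's call operator, and the overloads are restricted with `std::enable_if` so that they don't hijack ordinary stream output.

```cpp
#include <algorithm>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

struct lambda_expr {};                       // tag base class

// Stand-in for _$1: evaluates to the current element.
struct Placeholder : lambda_expr {
    template <class T>
    const T& operator()(const T& t) const { return t; }
};
const Placeholder arg1 = Placeholder();

// Leftmost leaf: remembers the stream and hands it back when evaluated.
struct StreamLeaf : lambda_expr {
    std::ostream* out;
    explicit StreamLeaf(std::ostream& os) : out(&os) {}
    template <class T>
    std::ostream& operator()(const T&) const { return *out; }
};

// One "<<" node: evaluating the left subtree prints everything to the
// left and returns the stream; then the right operand is printed.
template <class Left, class Right>
struct Shift : lambda_expr {
    Left left;
    Right right;
    Shift(Left l, Right r) : left(l), right(r) {}
    template <class T>
    std::ostream& operator()(const T& t) const { return left(t) << right(t); }
};

// "stream << expr" starts the tree ...
template <class Right>
typename std::enable_if<std::is_base_of<lambda_expr, Right>::value,
                        Shift<StreamLeaf, Right> >::type
operator<<(std::ostream& os, Right r) {
    return Shift<StreamLeaf, Right>(StreamLeaf(os), r);
}

// ... and "tree << expr" grows it by one level.
template <class L, class R, class Right>
typename std::enable_if<std::is_base_of<lambda_expr, Right>::value,
                        Shift<Shift<L, R>, Right> >::type
operator<<(Shift<L, R> left, Right r) {
    return Shift<Shift<L, R>, Right>(left, r);
}

// Prints every element twice in a row, e.g. "11" for the element 1.
std::string print_twice(const std::vector<int>& v) {
    std::ostringstream os;
    std::for_each(v.begin(), v.end(), os << arg1 << arg1);
    return os.str();
}
```

The expression `os << arg1 << arg1` has the type `Shift<Shift<StreamLeaf, Placeholder>, Placeholder>`, and `for_each` simply calls its `operator()` once per element. Operands that are not lambda expressions, such as a string literal, are rejected by this sketch.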

Yeeee-ha! We are done now! Let us try it out:

    for_each(vec.begin(), vec.end(), cout << _$1);  // OK
    for_each(vec.begin(), vec.end(), cout << _$1 << "\n" ); // compile error???
    for_each(vec.begin(), vec.end(), cout << i++ << ":" << _$1 << " " ); // again!!!
Did you see that? We still cannot use our lambda library. What is it this time? Well, we need a last building block, and for the discussion of that we need a separate paragraph.


5. External data in lambda functions


The problem is that inside a lambda expression we are working with functors, and not with native C++ data types. This means we need a function call operator for each element of the lambda expression. What can be easier than that! We can make a trivial functor which, when evaluated, returns our literal value, i.e. "\n" or ":"!

    // constant_ref
    //  ---
    template<class T> struct Const : public lambda_expr
    {
        T& val;
        Const(T& t) : val(t) { }
        const T& operator()() const { return val; }
        const T& value() const { return val; }

        template <class S> // for different context: needed for Shift!!!
            const T& operator()(S& s) const { return val; } // ignore input
    };
 
template<class T> Const<T> delay(T& t) { return Const<T>(t); }
This can be described as delayed evaluation: the compiler first wraps the constant in a functor, and the constant's value is used at runtime instead of compile time. Now a lambda like: auto lambda = (cout << _$1 << delay("\n")); will work. BTW I wonder if that would compile...
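The delayed constant is easy to try out in isolation. The sketch below differs from the `Const` above in one assumed detail: it stores the value by copy rather than by reference, so that temporaries can be wrapped safely. The helper `blank_out` is only there to show the functor at work.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// A constant wrapped in a functor: whatever it is called with, it
// returns the stored value.
template <class T>
struct Const {
    T val;
    explicit Const(const T& v) : val(v) {}

    template <class S>
    const T& operator()(const S&) const { return val; }  // ignore input
};

template <class T>
Const<T> delay(const T& t) { return Const<T>(t); }

// Replaces every element by the delayed constant.
std::vector<std::string> blank_out(const std::vector<int>& v) {
    std::vector<std::string> out(v.size());
    std::transform(v.begin(), v.end(), out.begin(),
                   delay(std::string("n/a")));
    return out;
}
```

Seen this way, `delay("\n")` is nothing more than a function that throws its argument away, which is what lets a literal sit inside an expression tree next to the placeholders.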

Maybe you don't remember, but in part IIa of this series*** I showed the following code snippet:

    // assign to a external counter
    int aaa;
    for_each(vecx.begin(), vecx.end(), aaa += _$1->*(&XXX::value));
only to say that it wouldn't work yet. What's the problem? Basically the same one: we are using a native C++ type instead of a functor. The solution is of course the delayed evaluation from above, but now we must cater for the basic operations:

    // variable_ref
    //  ---
    template<class T> struct Var : public lambda_expr
    {
        T& val;
        Var(T& t) : val(t) { }
        T& operator()() const { return val; }
        T& operator()(T& t) const { return val; } // ignore input
        T& value() const { return val; }

        AssgnTo<T> operator=(T t) { return AssgnTo<T>(val, t); }

        template <class S>  // assign from lambda_expr
            AssgnTo<T> operator=(S s) { return AssgnTo<T>(val, s); }
    };
 
template<class T> Var<T> globvar(T& t) { return Var<T>(t); }
Here we enabled the assignment to the C++ data type. The additional operators could be implemented like this:

    // OPEN todo: be more modular!!!
    //  --- return lambda_operation<lambda_exp, oper_type>
    template<class T>
        AddAssg<T> operator+=(Var<T> v, const T& t) { return AddAssg<T>(v.value(), t); }
    template<class T>
        AddAssg<T> operator+=(Var<T> v, placeholder<1>) { return AddAssg<T>(v.value()); }
 
template<class T> Incr<T> operator++(Var<T> v, int) { return Incr<T>(v.value()); }
Now we can finally write:

    int xxx = 0;
    for_each(vecx.begin(), vecx.end(), globvar(xxx)++);
    for_each(vecx.begin(), vecx.end(), globvar(xxx) += 5);
    for_each(vecx.begin(), vecx.end(), globvar(xxx) += _$1->*(&XXX::value));
and, for example:

    for_each(vec.begin(), vec.end(), cout << _$1 << delay("\n") );
    for_each(vec.begin(), vec.end(), cout << globvar(i)++ << delay(":") << _$1 );
But not, among others: globvar(aaa) = _$1->*(&XXX::value), as it's not a complete, orthogonal, and industrial strength library. It was only a little pastime of mine...
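The `globvar(...) += _$1` case can also be reduced to a few lines that compile on their own. This is a simplified sketch, not the library code: `arg1` replaces `_$1`, the types carry no `lambda_expr` base, and `sum_of` is an invented helper.

```cpp
#include <algorithm>
#include <vector>

struct Placeholder {};
const Placeholder arg1 = Placeholder();   // stand-in for _$1

// Functor produced by "var += arg1": adds each element to the target.
template <class T>
struct AddAssg {
    T* target;
    explicit AddAssg(T& t) : target(&t) {}

    template <class U>
    void operator()(const U& element) const { *target += element; }
};

// Wrapper that turns a plain variable into a lambda operand.
template <class T>
struct Var {
    T* ref;
    explicit Var(T& t) : ref(&t) {}
};

template <class T>
Var<T> globvar(T& t) { return Var<T>(t); }

// The overloaded operator does no arithmetic; it only builds the functor.
template <class T>
AddAssg<T> operator+=(Var<T> v, Placeholder) { return AddAssg<T>(*v.ref); }

int sum_of(const std::vector<int>& v) {
    int total = 0;
    std::for_each(v.begin(), v.end(), globvar(total) += arg1);
    return total;
}
```

The variable is captured by address, so the functor copies made by `for_each` all update the same `total`.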


6. The End?


Ehm, I thought this would be the last part of the description of my simple implementation, but no, I'll need another blog entry so as not to bore you to death with this one. Meanwhile, writing this description has taken more time than the actual implementation itself!!! But I have some more points to make, and at the end of a long post they are rather unlikely to be given much attention. So long then!

---
* Part II: http://ib-krajewski.blogspot.com/2008/01/c-pocket-lambda-library-part-2.html
** Techniques for Scientific C++: http://ubiety.uwaterloo.ca/~tveldhui/papers/techniques/techniques01.html
*** Part IIa: http://ib-krajewski.blogspot.com/2008/04/pocket-c-lambda-library-part-iia.html


Friday 30 May 2008

God wrote in LISP: programming and genesis.

Did you ever ask yourself what programming language God used to implement the Universe? I mean, he had a pretty tight deadline - only 6 days - so he must have been using something rather high level. And a rather complex one to boot. And as a programmer, you don't have any doubts that God must have been using a programming language for the task: without abstraction the task is just too complex ;-).

I came across this song while listening to the OOPSLA podcasts and everything became clear: he used LISP! Hear it here: ( >> )*. I must say, I really like it a lot. It's wonderful: it's like a hymn to a great, dead language. The lyrics come from Bob Kanefsky. All I could trace about him is that he wrote some parody songs, but he must be a programmer himself judging from the quality of the lyrics.

And the lyrics are right: Object Oriented languages describe how we humans are thinking about the world, but LISP (or functional languages in general) describe the thoughts of God... so pure... ;-). Let me cite: "Don’t search the disk drive for man.c...". It's almost ontology, I like it :-).

PS: As always there is a minority opinion as well, see http://www.xkcd.com/224.

---
* or here, if embedding works:


Thursday 15 May 2008

Why is software engineering not engineering?

As I was looking through Philippe Kruchten's book "The Rational Unified Process. An Introduction" I noticed the following passage. I mean, the book is rather dry, it just describes some organizational process, but this paragraph was different:

"Software engineering has not reached the level of other engineering disciplines .... because the underlying "theories" are weak and poorly understood and the heuristics are crude."
Well, nothing new here, just like we all know it to be. Then:

"Software engineering may be misnamed. At various times it more closely resembles a branch of psychology, philosophy or art than engineering. Relatively straightforward laws of physics underlie the design of a bridge, but there's no strict equivalent in software design. Software is "soft" in this respect."
Indeed! In programming we are not working with physical materials but with mental objects (or should I say artifacts?) denoted by an array of characters on a sheet of paper. Sometimes even with just boxes and lines*. So what we really are doing, when we're trying to set up some laws in software, is looking for the "laws of thought", i.e. we are trying to find a good way to organize our ideas! Ideas which will then become flesh when executed on a complicated machine. As we are working with mental objects to a much greater extent than traditional engineering, the methodology cannot be the same. The world of human thought is not so well explored as the physical world. Or maybe the physical world is just much, much simpler?

This discussion would lead us too far**, but one thing is sure: at some level of abstraction, we are no longer thinking about the underlying machine, a thing which couldn't happen when we were designing a bridge. Because of that I maintain that in programming we are basically working with mental and not physical objects, so it cannot be counted as engineering. That might have been the case earlier on, when programmers had all the iron in their hands, setting plugs and connecting cables. It had an engineering-like look. But today? Just look at a corporate IT department - it's all about organisation, processes, abstractions of every level. For me it has more to do with "management science" and even "social science", only on a different scale.

So it should be best called computer "science" - not in the "scientific" sense, but rather in the "soft-science" sense? This would however mean, that it is somehow an art... Well, it's not what the industry wants to hear! But we don't have to tell them ;-).

---
* MDD (or should I call it MDA?) takes it to the extreme: you don't write ANY code at all, instead you are writing the "computation independent model", of course in UML, which constitutes a description of the functionality of the system. Then you are transforming this model into another model: "platform independent model", which represents the abstract computation model, then you are transforming this into a "platform specific model" at last. And all of this is done using tools, profiles, configurations, cartridges. Ideally, you should only model the required functionality, choose the target platform, and start the model transformation chain. Cool! But is it still programming? Theoretically it is.

** i.e. into philosophy. Just consider that civil engineers are working with mental models too, and, on the other side, the mathematicians are working with "mental objects", which miraculously can describe the physical world with great precision...

Sunday 27 April 2008

pocket C++ lambda library, part IIa


Are you wondering about the title of this entry? Well, it really should be part of a previous one*, but after looking at it I decided that the previous entry was pretty long already, and I wasn't willing to blow it up even more. And the theme isn't interesting enough to deserve a separate part in this mini-series. What is it we are talking about?

1. Access to the members


In the past I sometimes really wanted to be able to do the following:

    struct XXX { int value; int getValue() { return value; } };
    vector<XXX*> vec_x;
    find_if(vec_x.begin(), vec_x.end(),  _$1->value == alive);
i.e. to access the members of an object out of the lambda function. Of course I'd like a code like _$1.value the best, but C++ doesn't allow us to overload the dot operator! Why not? This would mean that we could customize the method invocation mechanism itself! As it's the case in Groovy, Perl, Python or even Java**:

// Groovy:
Object invokeMethod(String name, Object args)
{
    log("Just calling me: $name");
    def result = metaClass.invokeMethod(this, name, args);
}
If you have that, you can do things like Rails in Ruby and builders in Groovy. You just intercept calls to the nonexisting methods (i.e. override the methodMissing()/method_missing() in Groovy/Ruby or __getattr__() in Python) and install the "code block" (a closure, to be exact) passed as one of the parameters in a custom hash map with the name parameter as key... You've got the message.

This isn't possible in C++, as it would introduce the metaclass notion into the language. At least it would require a common superclass for all C++ objects, and this contradicts the design of C++ classes (AFAIK) as thin wrappers for physical memory segments. On the other hand, C++ has a more primitive notion of call interception: overloading of the -> operator! Alas, it only works with pointers, so we cannot provide a general solution for value based containers. And that is bad enough.
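To illustrate what "primitive call interception" through `->` amounts to, here is a small sketch. The class `CountingPtr` and the `Widget` type are invented for the example; the proxy sees that a member is being accessed, but, unlike `invokeMethod()` in Groovy, it never learns which one.

```cpp
#include <cstddef>

// Wraps a raw pointer and counts every member access made through "->".
template <class T>
class CountingPtr {
public:
    explicit CountingPtr(T* p) : ptr_(p), accesses_(0) {}

    // The interception point: runs before every member access.
    T* operator->() { ++accesses_; return ptr_; }

    std::size_t accesses() const { return accesses_; }

private:
    T* ptr_;
    std::size_t accesses_;
};

struct Widget {
    int value;
    int getValue() const { return value; }
};

// Two accesses through the proxy: one field read, one method call.
std::size_t demo_accesses() {
    Widget w = { 5 };
    CountingPtr<Widget> p(&w);
    int sum = p->value + p->getValue();
    (void)sum;
    return p.accesses();
}
```

The member name after the arrow is resolved by the compiler against `Widget` as usual, which is why this mechanism cannot be used to invent methods at runtime.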

So let's concentrate on the less ambitious goal: _$1->getValue()! Unfortunately, even this syntax cannot be made to work in our context! Why? Because for the C++ compiler the expression getValue() doesn't make sense! It doesn't refer to a class' method getValue(), which we could feed to the -> operator, it's just a string which we'd like to transform somehow into a function reference!

So what can we do (if anything)?

2. The Implementation


The best I could produce is the following:
    iterx = find_if(vecx.begin(), vecx.end(), _$1->*(&XXX::value) == 2);
    iterx = find_if(vecx.begin(), vecx.end(), _$1->*(&XXX::getValue) == 1);
Well, what can I say, it's passable. Now the compiler has meaningful information in the form of a member address, and it's not too ugly. How to implement it? With a standard technique from the first part***:
    // ->*
    template<class V, class O> struct ArrowStar : public lambda_expr {
        V O::* mptr;
        ArrowStar(V O::*m): mptr(m) { }
        V operator()(O* o) const { return o->*mptr; }
    };
we are overloading the pointer-to-member operator (->*) for member access. For member function calls we need 2 more overloads. First for calls with 1 argument:
    template<class R, class O, class A> struct ArrowStarF : public lambda_expr {
        R(O::*fptr)(A);
        ArrowStarF(R(O::*f)(A)): fptr(f) { }
        R operator()(O* o, A a) const { return o->*fptr(a); }
    };
and for calls without arguments:
    template<class R, class O> struct ArrowStarFv : public lambda_expr {
        R(O::*fptr)();
        ArrowStarFv(R(O::*f)()): fptr(f) { }
        R operator()(O* o) const { return (o->*fptr)(); }
    };
Nothing new here either, just the standard operator overloading technique. For the == operation to work, I extended the EqTo operator from part 1*** to do a little forwarding. I know, I should use the forwarders (like Le2_forw in part 1), but I was lazy:
    // lambda_expr ==
    template<class S, class T> struct EqTo : public lambda_expr {
        ...
        EqTo(S s, T t) : lexpr(s), val(t) { }
        template <class R>
            bool operator()(R r) { return lexpr(r) == val; }
    };
So let's do something useful at last:
    // shouldn't clash with lambda_expr: *_$1 <= *$2 !!!
    iterx = find_if(vecx.begin(), vecx.end(), _$1->*(&XXX::getVal) <= 2);
    // read field values from vecx
    vector<int> v10_1(10);
    transform(vecx.begin(), vecx.end(), v10_1.begin(), _$1->*(&XXX::getVal));
    // assign to an external counter
    int aaa = 0;
    for_each(vecx.begin(), vecx.end(), aaa += _$1->*(&XXX::value));
Ok, the last one won't work just yet ;-), you must wait for part 3 of the series! For the first one to work, we must extend the forwarding class from part 1*** a little for the case where only one side must be forwarded:
    template <class S, class T> struct Le2_forw : public lambda_expr
    {
        S e1;
        T e2;
        Le2_forw(S s, T t) : e1(s), e2(t) { }
        .....
        template <class U> // one side is bound! OPEN: assume left side!
            bool operator()(U a) const { return e1(a) <= e2; }
    };
And now everything is buzzing!
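For readers who don't have part 1 at hand, here is a self-contained sketch of how the pieces could fit together. It is only an approximation under some assumptions: the placeholder is called arg1 instead of _$1 (a $ in identifiers is a compiler extension), the lambda_expr base class is left out, the functors are renamed MemberOf and CallOf, and std::enable_if (C++11) is used to keep the two ->* overloads apart. The container holds pointers, as in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

struct XXX {
    int value;
    explicit XXX(int v) : value(v) { }
    int getValue() { return value; }
};

// Plays the role of _$1 (hypothetical name).
struct Placeholder { };
const Placeholder arg1 = Placeholder();

// Data member access: arg1->*(&XXX::value)
template <class V, class O>
struct MemberOf {
    V O::* mptr;
    explicit MemberOf(V O::* m) : mptr(m) { }
    V operator()(O* o) const { return o->*mptr; }
};

// Member function call without arguments: arg1->*(&XXX::getValue)
template <class R, class O>
struct CallOf {
    R (O::*fptr)();
    explicit CallOf(R (O::*f)()) : fptr(f) { }
    R operator()(O* o) const { return (o->*fptr)(); }
};

// Only for data members: function types are excluded via enable_if.
template <class V, class O>
typename std::enable_if<!std::is_function<V>::value, MemberOf<V, O> >::type
operator->*(const Placeholder&, V O::* m) { return MemberOf<V, O>(m); }

template <class R, class O>
CallOf<R, O> operator->*(const Placeholder&, R (O::*f)()) { return CallOf<R, O>(f); }

// expr(element) == val, evaluated lazily for each container element.
template <class E, class T>
struct EqTo {
    E expr;
    T val;
    EqTo(E e, T t) : expr(e), val(t) { }
    template <class A>
    bool operator()(A a) const { return expr(a) == val; }
};

template <class V, class O, class T>
EqTo<MemberOf<V, O>, T> operator==(const MemberOf<V, O>& e, T t) {
    return EqTo<MemberOf<V, O>, T>(e, t);
}

template <class R, class O, class T>
EqTo<CallOf<R, O>, T> operator==(const CallOf<R, O>& e, T t) {
    return EqTo<CallOf<R, O>, T>(e, t);
}

// Index of the first element whose field equals wanted, or -1.
inline int indexOfValue(const std::vector<XXX*>& v, int wanted) {
    std::vector<XXX*>::const_iterator it =
        std::find_if(v.begin(), v.end(), arg1->*(&XXX::value) == wanted);
    return it == v.end() ? -1 : static_cast<int>(it - v.begin());
}

// Same search, but going through the getter.
inline int indexOfGetValue(const std::vector<XXX*>& v, int wanted) {
    std::vector<XXX*>::const_iterator it =
        std::find_if(v.begin(), v.end(), arg1->*(&XXX::getValue) == wanted);
    return it == v.end() ? -1 : static_cast<int>(it - v.begin());
}
```

The parentheses around &XXX::value are still needed in the lambda expression, because unary & binds tighter than ->* only when the operand is written that way.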

3. Discussion


If you think the syntax of the member function call is just horrible, there is another possibility to use the member functions: bind them! This you have seen (and frowned upon) in part 2*:
    for_each(vecx.begin(), vecx.end(), cout << bind(&XXX::getVal, _$1));
it's even less readable, isn't it? Or maybe not? Look, compared with _$1->*(&XXX::value) it's not SUCH a bad sight anymore! Maybe something like call_func(_$1, &XXX::getVal) would be more readable here? The advantage of this solution is that we could accept value objects in the container, and take their address internally. I leave the decision to you, the implementation is more or less trivial.
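Just to show what "more or less trivial" could mean, here is one possible sketch of call_func. It rests on a few assumptions: the placeholder is named arg1 instead of _$1, only member functions without arguments are covered, and the functor CallFunc is a made-up name. Instead of taking the address, it applies .* directly to the value, which amounts to the same thing.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct XXX {
    int value;
    explicit XXX(int v) : value(v) { }
    int getVal() { return value; }
};

// Plays the role of _$1 (hypothetical name).
struct Placeholder { };
const Placeholder arg1 = Placeholder();

// Calls a no-argument member function on whatever element it is handed.
template <class R, class O>
struct CallFunc {
    R (O::*fptr)();
    explicit CallFunc(R (O::*f)()) : fptr(f) { }
    R operator()(O& o) const { return (o.*fptr)(); }   // container of values
    R operator()(O* o) const { return (o->*fptr)(); }  // container of pointers
};

template <class R, class O>
CallFunc<R, O> call_func(const Placeholder&, R (O::*f)()) {
    return CallFunc<R, O>(f);
}

// Reads the field values from a container of value objects.
inline std::vector<int> readValues(std::vector<XXX>& xs) {
    std::vector<int> out(xs.size());
    std::transform(xs.begin(), xs.end(), out.begin(),
                   call_func(arg1, &XXX::getVal));
    return out;
}
```

The placeholder argument does no work here; it is kept only so that the call reads like the other lambda expressions.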

Summing up, I couldn't achieve much progress here, because of inherent C++ design features. So maybe a macro solution? Or another level of indirection? What we would really need here is a hook for compiler errors (method not found), where we could install our own code snippet. Wait a minute! Something like compile-time asserts? But how can I get around the string => function coding problem?

Something primitive like this would be possible:
    ...
    ArrowStarStr(string& name): fname(name) { }
    R operator()(O* o) const {
        return (o->func_hashtable.get(fname))(); // and don't crash here!
    }
But you cannot use a POCppO anymore, you need some macro gadgetry like in Qt, for example:
    struct XXX {
        int value;
        int getValue() { return value; }

        // now decorate:
        STORE_LAMBDA_FUNC(getValue);
    };
    // or property based:
    struct XXX {
        DEF_LAMBDA_PROPERTY(value, int);
    };
Not so pretty, not elegant, much too much effort needed. But it is the solution we have to use given the C++ language design. We just don't have introspection and cannot overload the dot. Sorry. Any ideas?
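To give an impression of what such macro gadgetry might boil down to, here is a hand-written sketch of the string => function table. Everything in it is an assumption for illustration: the table funcTable(), the functor CallByName and the extra method getDouble() are invented names, and a real STORE_LAMBDA_FUNC macro would generate the registration lines instead of having them typed in by hand.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct XXX {
    typedef int (XXX::*Getter)();

    int value;
    explicit XXX(int v) : value(v) { }
    int getValue() { return value; }
    int getDouble() { return 2 * value; }

    // What the decorating macros might expand to: a table mapping
    // method names to member function pointers.
    static std::map<std::string, Getter>& funcTable() {
        static std::map<std::string, Getter> table;
        if (table.empty()) {
            table["getValue"] = &XXX::getValue;
            table["getDouble"] = &XXX::getDouble;
        }
        return table;
    }
};

// Functor resolving the method by its name at call time.
struct CallByName {
    std::string fname;
    explicit CallByName(const std::string& name) : fname(name) { }
    int operator()(XXX* o) const {
        std::map<std::string, XXX::Getter>::const_iterator it =
            XXX::funcTable().find(fname);
        if (it == XXX::funcTable().end())   // "and don't crash here!"
            throw std::out_of_range("no such method: " + fname);
        return (o->*(it->second))();
    }
};
```

A misspelled method name now surfaces as a runtime exception instead of a compiler error, which is the price of the string lookup.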

--
* http://ib-krajewski.blogspot.com/2008/01/c-pocket-lambda-library-part-2.html
** with dynamic proxies: see http://gleichmann.wordpress.com/2007/11/22/mimicry-in-action-dynamically-implement-an-interface-using-dynamic-proxy/
*** http://ib-krajewski.blogspot.com/2007/12/c-pocket-lambda-library.html

Sunday 6 April 2008

Language trends and waiting blues.

Hi everybody! I recently skimmed over an interview* about programming language trends in the DDJ. In general, there was nothing new in there, rather a re-statement of the widely known programming trends. But some phrases caught my eye nonetheless:

PJ: C and C++ are definitely losing ground. There is a simple explanation for this. Languages without automated garbage collection are getting out of fashion.
....
Another language that has had its day is Perl. It was once the standard language for every system administrator and build manager, but now everyone has been waiting on a new major release for more than seven years. That is considered far too long.
As I'm mainly a C++ programmer in my business life, the news of C++'s impending demise worries me, if only because it confirms what I see with my own eyes. But this fact alone is not what I want to talk about, but rather C++'s similarity to Perl. Why?

Look, can't you see a parallel to Perl's fate here? I cite: "everyone has been waiting on a new major release for more than seven years". And there's still no Perl 6! Recall, there were plans for Parrot**, a common VM for Perl and Python (or was it just a joke?). Everyone was excited, but nope, Python won't use Parrot, Parrot is only in its 0.x versions**, so Perl 6 won't come soon, and the situation generally is a mess.

Isn't that somehow similar to the situation of C++? The last standard (or rather a correction of it) dates back to 1998, i.e. 10 years ago! That's an even longer wait than for Perl. So maybe C++'s retreat is due to the lack of a new language standard, like in Perl's case? When I look at Java, I must admit I envy it. Just recall the evolution: while Java 4 was still a rather primitive language without many interesting features (sorry, perhaps with the exception of proxies and introspection), already Java 5 brought foreach, generics, annotations, lock-free synchronisation, lock-free data structures, autoboxing and a new memory model. Admittedly Java 6 wasn't that interesting language-wise, but Java 7 will get things like closures or fork-join support for easy multicore parallelism***! This gets you a wholly new, interesting language.

Contrast that with years-long discussion about what is to be included in the C++0x standard. And I still don't know what exactly is to come! Will a new memory model be included? And lambda functions? Garbage collection? Sometimes we all think that the new Standard won't be named C++0x, because years and years of discussion will be still needed!

So maybe the quote of Bjarne Stroustrup****:
"Java shows that a (partial) break from the past—supported by massive corporate backing — can produce something new. C++ shows that a deliberately evolutionary approach can produce something new — even without significant corporate support."
is false? Maybe only a corporate-backed language has a chance today? Look how quickly Java developed and how the new C++ standard stalls. But maybe it's the "design by committee" effect on the side of C++? I don't know.

--
* Programming Languages: Everyone Has a Favorite One: http://www.ddj.com/cpp/207401593
** Parrot Virtual Machine: http://www.parrotcode.org/
*** Java theory and practice: Stick a fork in it, Part 1: http://www.ibm.com/developerworks/java/library/j-jtp11137.html
**** his interview of 2006: http://technologyreview.com/Infotech/17868/page3/

Sunday 30 March 2008

QString conversions, bugs, surprises and design

1. A trivial bug


It started with a pretty simple piece of code:
    QDomElement e = n.toElement(); // try to convert the node to an element.

    if(!e.isNull())
    {
        TRACE(e.tagName());
        ....
where TRACE() sends an object to cout. The simplest of tasks you'd say, but it didn't compile. What? What year is it now? Are we in the early nineties? Don't library writers know about the standard library? It turned out they do, so I used slightly modified code:
    TRACE(e.tagName().toStdString());
but it always crashed with Qt 4.3, Windows XP and Visual Studio 2005! Why? No time to check. After some reading of the Qt docs, I settled on:
    TRACE(e.tagName().toAscii().data());
It worked, but the code looked extremely ugly! After copy-pasting it n times (no, n wasn't equal to 3, as it should be according to the Agile gospel, sorry!!!) I longed for something more elegant and explicit, something like:
    TRACE(qstr_cast<const char*>(e.tagName()))
The code was quickly written and worked instantly:
    // helper for QString conversions
    // -- cute and explicit syntax: qstr_cast<const char*>()!

    template <class T>
    inline T qstr_cast(const QString& s)
    {
        return T(s.toAscii().data());
    }

    inline const char* qstr_cast(const QString& s)
    {
        return s.toAscii().data();
    }
Well, it worked on my machine but not in the target environment, and only in one (different) case. Why? Of course I blamed the Qt runtime support, which already let me down with the toStdString() function. Then I saw the same effect in the debugger on my development machine, and I blamed it on multithreading: there must be something wrong with locking when accessing this particular QString instance. But at last I found time to remove this bug, and looked at the synchronisation, and it was 100% correct. The bug was hidden elsewhere. The new (correct) code is:
    // helper for QString conversions
    // -- cute and explicit syntax: qstr_cast<const char*>()!

    template <class T>
    inline QByteArray qstr_cast(const QString& s, T* t = 0)
    {
        // the QByteArray's const char* conversion op. will be applied
        // by the compiler in the const char* context!

        return s.toAscii();
    }
As you can see, the old code returned a pointer to the data of a temporary instance of a QByteArray object, and of course it pointed into the void. End of story!
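The same trap can be reproduced without Qt. The following sketch uses a hypothetical stand-in class Bytes in place of QByteArray and std::string in place of QString, so none of the names below are Qt API. It only illustrates the lifetime rule: a temporary returned by value lives until the end of the caller's full expression, while a pointer into a temporary that died inside the helper is already dangling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for QByteArray: owns its buffer and converts
// implicitly to const char*.
class Bytes {
    std::string buf;
public:
    explicit Bytes(const std::string& s) : buf(s) { }
    const char* data() const { return buf.c_str(); }
    operator const char*() const { return buf.c_str(); }
};

// Stand-in for QString::toAscii(): hands out a temporary by value.
inline Bytes toBytes(const std::string& s) { return Bytes(s); }

// BROKEN variant, shown only as a comment: the temporary Bytes is destroyed
// when the return statement finishes, so the caller receives a dangling pointer.
//
//   inline const char* cast_dangling(const std::string& s) {
//       return toBytes(s).data();
//   }

// FIXED variant: return the owning object. The temporary created at the call
// site survives until the end of the caller's full expression.
inline Bytes cast_owning(const std::string& s) { return toBytes(s); }

// A typical consumer that takes a plain C string.
inline std::size_t lengthOf(const char* p) { return std::strlen(p); }
```

A call like lengthOf(cast_owning(name)) is fine, but storing the converted pointer in a variable and using it in a later statement would bring the bug straight back.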

2. Discussion of the trivial bug


Why am I writing this? Isn't it just a banal and stupid bug? I don't think so. I think there are some points to be made.

The first one is that Qt doesn't follow the "Principle of Least Surprise"*. This code should work out of the box: cout << qtStringObj;! Why? Because C++ programmers have written code like this for centuries! Well, almost. In the "freedom languages" ;-)** you are accustomed to writing just: print someObj; and it always works. Mind you, I didn't use the words "modern languages", but in modern times we expect it to just work!

The second one is that Qt doesn't want me to use the standard library! There is no shift operator for std::ostream; instead they want me to use their QTextStream class, which I don't know and have no desire to learn. As for the QString class, the Qt partisans maintain that it is vastly superior to the std::string class, but what about cout? It is practically a C++ keyword by now. And besides, why am I not allowed to make my own choices, even if I choose to use an inferior alternative? Maybe I don't have to be that fast in my prototype implementation? You somehow feel trapped in the proprietary Qt world, and somehow cast back in time. Is that the backwards compatibility with the original 80s or 90s design? It feels somehow frumpy.

The third one is that once I decided on the syntactic appearance of the function (i.e. qstr_cast<>() and not for example cstr_ptr() or CSTR()), I subconsciously settled on an implementation: just get the internal string data and export the pointer to the char buffer. Alas, in the case of QString it doesn't work that way! Moreover, it's not only the name choice alone: this code would work with all the string classes I have known in my long C++ life. So maybe I chose the name because subconsciously I already knew how to implement it? We pride ourselves on being rational beings, but we don't know how much we depend on subconscious shortcuts, which normally work in familiar territory, but fail when trying something new. Just like me, as I'm relatively new to Qt.

And what's the moral? It's elementary, dear Watson: RTFM first! Or perhaps: don't mix Qt and STL???

---
* for example "Applying the Rule of Least Surprise" from "The Art of Unix Programming" by E.S. Raymond: http://www.faqs.org/docs/artu/ch11s01.html or "Principle of Least Astonishment" at Portland Pattern Repository: http://c2.com/cgi/wiki?PrincipleOfLeastAstonishment

** I thought it was a joke, but no, it's an essay by Kevin Barnes: http://codecraft.info/index.php/archives/20/


Tuesday 26 February 2008

Rentree plus Some Comments on Google-Solvers & Co

Hi everybody! I'm back from my travels and can write some more blog-gobbledegook again (which I'll gladly do). I was away for some serious off-piste skiing in St. Anton, descending some pretty steep, big faces, and taking my first pinwheel tumble on a 40-degree, Alaska-like, highly exposed, rock-fringed slope. Wow! And then I was back in town, and procured my next project in just 2 days, taking a head-first plunge here as well ;-).

So let me start my rentree into the blogosphere with something less exciting, i.e. some comments on my previous posts. In fact, I wanted to comment on the Google-Solvers post for quite a long time, so it suits me fine.

Commenting on Java vs. C++


As some of you may remember, in a previous post* I compared the performance of Python, Perl, Java and C++ (admittedly in a rather ad-hoc manner), and at first made the error of not using the optimization switch of my C++ compiler. Me, a die-hard C++ hacker!!! This resulted in Java and C++ being on par performance-wise. So you can imagine my amazement some time ago when I saw these statements in Uncle Bob's blog**:

You can blame this on the non-optimized gcc compiler I was using (cygwin) but again: Oh boo hoo!. If there are any C++ programmers out there who smirk at the supposed slowness of Java, I think you'd better reconsider.
...
I'll grant you that the JVM can be large. However, if you assume it's already there, then the size of the bytecodes for the application itself need not be very large at all. And the heap can be constrained to a relatively small size to prevent virtual memory thrashing. So, if _engineered_ a java app should give a C++ app a run for it's money in most cases. -UB

Well, this started me thinking. When you are looking for information concerning Java performance, you can see some typical quotes like this on the web***:

John Davies: It is just the main reason why there are still diehards that stick with C/C++ because you can guarantee performance, not necessarily faster, but it just guarantees the performance and we will run something in Java, it will run quite frequently faster on C [he means JVM] than it would have done in C/C++, but every now and then it will just pause and that pause can be extremely expensive.
If you'll allow a small comment (hypothesis?, heresy?) from yours truly: what if all those measurements are done without the -O2 switch? It's an all too easy trap to fall into (see my own sufferings*) and an excellent opportunity for the marketing people to use some techniques from the seminal book "How to Lie with Statistics" ;-). Because you really can't convince me that a language which: 1. is interpreted, 2. relies heavily on dynamic memory allocation, and 3. supports garbage collection, can be faster than one which does none of that, no matter how much optimization you throw at it! Or can you? If you've read the book "A Discipline of Programming" by E. Dijkstra, you know that the first law he establishes there is the "there-are-no-miracles law" :-). So you may guess my position on this.

BTW, maybe I (or someone else) should email Mr. Stroustrup on this one, and see what he's got to say. It would be interesting, I guess, so maybe I'll do it anyway.

And now more comments


Let's continue in the commenting vein: some time ago I was looking for web frameworks in C++, and even wanted to reimplement Java's Servlet classes in C++ (well, I didn't...). But in the last DDJ I saw an article**** about a C++ web framework at last! Fortunately, I didn't implement the C++ servlet classes, because it was a really stupid idea! The Wt framework ("witty" - http://www.webtoolkit.eu/wt/) seems to be rather like modern Java frameworks - GWT, ThinWire or Wicket - you define your application classes in C++, and the HTML rendering comes from the framework! This is in contrast to the Servlet specification, which is very low level and which nobody uses directly now! Thank God I was lazy! I'll have to try Wt out, perhaps there is a C++ alternative for web programming after all.

---
* http://ib-krajewski.blogspot.com/2007/07/google-solvers.html
** here he did basically the same but with Ruby in place of Python: http://butunclebob.com/ArticleS.UncleBob.SpeedOfJavaCppRuby
*** "Improving JVM scalability and performance": http://blog.chinaunix.net/u1/45382/showart_356477.html or http://go.techtarget.com/r/1950475/6108168
**** "Wt: A Web Toolkit": http://www.ddj.com/cpp/206401952