Sunday 22 January 2012

Static vs Dynamic Debate Decided


As the title says, it's decided (at least for me) now!

A couple of years ago it became fashionable to be ecstatic about dynamic (i.e. dynamically typed*, aka scripting) languages; this sentiment is maybe best expressed by Steve Yegge here. People started to talk about "freedom languages", and it all started getting rather emotional, like in this tweet:
"Strong types are the hobnailed boot of the Enterprise Man on the neck of the Agile Code Poet http://goo.gl/aeEcn
It is understandable as an aftershock of the J2EE meltdown brought on by Ruby and Rails, but I was always rather suspicious of it, because of the following gut feeling: why should I throw away things which can possibly help me? If you ski, you know this: you have fallen before and you will fall again, the question is only when (hopefully not in an icy couloir or over a cliff...). It's the same in programming - you will fall.

1. Theory 

A proof written as a functional program (Wikimedia)
As at that time I was interested in mathematical logic and automatic theorem proving, I happened to know about the Curry-Howard isomorphism. It roughly states that a proof in intuitionistic logic corresponds to a typed (functional) program, and vice versa - a type corresponds to a proposition, and a program of that type is a proof of it.

So if a well-typed program amounts to a (constructive!) proof of correctness, that cannot be bad. We are getting some limited guarantees from the compiler - limited because the type system doesn't describe all of the program's behaviour - and that's one thing less we have to worry about!
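To make this a bit more concrete, here is a minimal OCaml sketch (the function names are my own, invented for illustration): the mere fact that these definitions type-check is, via Curry-Howard, a constructive proof of the corresponding logical statements.

(* If A implies B and B implies C, then A implies C: a term of type
   ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c is a constructive proof of it. *)
let compose (f : 'a -> 'b) (g : 'b -> 'c) : 'a -> 'c =
  fun x -> g (f x)

(* Modus ponens: from A and (A implies B) we obtain B. *)
let modus_ponens (x : 'a) (f : 'a -> 'b) : 'b = f x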

Well, OK, maybe that's all theoretical babbling - who knows whether all that logical/foundational machinery in mathematics is even contradiction-free? And even if it is, maybe it's not that relevant in practice?

Because when confronted with such a question (i.e. about the correctness guarantees), the standard answer was: "You gotta write more tests!", or "Strong Testing, not Strong Typing"**, or "I only proved the correctness of this program, I didn't test it yet" ;). And because tests are a good thing, you cannot deny that this could well be true in practice...

2. Own Experiences

So let me share some of my own experiences with dynamic typing.


Higher-Order Perl: Transforming Programs with Programs
A couple of years ago, as I read the excellent "Higher Order Perl" book, I got through the first and middle chapters, and the code was understandable and clear.

At this point it's maybe appropriate to give the book some praise, as it's well written and shows an unknown side of Perl - its functional self. As the title reveals, the book is about functional programming in... Perl! Yes, that's possible, and it even seems to suit the language rather well. Of course the language wasn't designed for that, and you will miss all the compiler support and type inference of OCaml, but then, Perl isn't statically typed, OK? The only fly in the ointment is that Perl hasn't been fashionable for quite a few years now...

But I'm digressing here, so back to the main theme: at first it wasn't a problem that you don't know the types of the parameters in Perl. It's all convention, you have to retrieve the parameters from the argument list @_ and use them according to their type, like this:
   my ($top, $code) = @_;  # @_ is the sub's argument list - Perl's "argv", so to speak
So all the type information is, so to speak, in the code that uses the parameters later on. But in the last chapter, when the author develops a drawing system based on a linear-equation DSL, I sometimes got lost in the code. I simply didn't know what the inputs were, and what the function would do with them. How I wished I could see the parameter types in the function declarations!
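Just to show what I was missing there, here is how such an interface could look in a statically typed language - a hypothetical OCaml .mli sketch of mine, with all type and function names invented for illustration:

(* drawing.mli - a hypothetical interface, names invented for illustration *)
type point = { x : float; y : float }

type shape =
  | Line   of point * point
  | Circle of point * float   (* centre and radius *)

(* The declaration alone tells you what goes in and what comes out. *)
val render : shape list -> width:int -> height:int -> string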

Another example: if you are writing Python code using a library you aren't that accustomed to, and you don't want to read the documentation for an hour first but just quickly code the small thing you need right now, you'll very likely just get a bunch of runtime exceptions and have to guess what exactly is going on. What is the exact type the library is expecting here? I did everything right, didn't I? Nope, you didn't. Please look up the docs.
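With a static type checker the compiler answers that question before anything runs. A tiny OCaml sketch of mine to show the difference:

(* String.concat wants a separator and a list of strings - its type says so: *)
let greeting = String.concat ", " ["hello"; "world"]
(* val greeting : string = "hello, world" *)

(* Passing the wrong thing, e.g. String.concat ", " ["hello"; 42], is rejected
   at compile time with roughly: "This expression has type int but an
   expression was expected of type string" - instead of a runtime exception
   somewhere deep inside the library. *)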

So as I see it, the types are there for human understanding, even if you don't need the type-checking guarantees of the compiler. So common sense dictates using them - you know, 80% (my wild guess here) of programming is maintenance, the code is written once and read 1000 times, and so on. We all know these old programming adages, and they don't just apply to telling variable names and insightful comments. Type information is, in my eyes, another part of writing understandable (as opposed to "genius") code.

3. Other people's experiences

OK, I'm not that experienced in dynamic languages, so maybe I've got the wrong impression, or didn't understand something correctly, or maybe I'm just not clever enough?

But then, when I started reading the "Beginning Scala" book (well, I didn't finish it in the end because I got bored, but that's another story...), the following lines struck me:
As my Ruby and Rails projects grew beyond a few thousand lines of code and as I added team members to my projects, the challenges of dynamic languages became apparent. We were spending more than half our coding time writing tests, and much of the productivity gains we saw were lost in test writing. Most of the tests would have been unnecessary in Java because most of them were geared toward making sure that we’d updated the callers when we refactored code by changing method names or parameter counts.
As you see, these practical experiences rather confirm my assertions about static typing: a) why write tests which can be (kind of) generated automatically by the compiler, and b) why not let the compiler automatically document and check the interfaces?
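Here is a minimal OCaml sketch of point a) - the function and its call site are my own toy example. The "refactoring test" the Ruby team had to write by hand is exactly what the type checker gives you for free:

(* Before the refactoring: let total price qty = price *. float_of_int qty
   Afterwards a third parameter was added: *)
let total price qty discount =
  price *. float_of_int qty *. (1.0 -. discount)

(* Every stale two-argument call site now fails to compile, e.g.
     let invoice : float = total 9.99 3
   gives "This expression has type float -> float but an expression was
   expected of type float" - no hand-written test needed just to catch
   the changed signature. *)
let invoice : float = total 9.99 3 0.1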

But there's another angle on that problem which I hadn't thought of:
Also, I found that working on teams where there were mind melds between two to four team members, things went well in Ruby, but as we tried to bring new members onto the team, the mental connections were hard to transmit to new team members.
At first sight this goes more into the psychology of programming, and maybe it's a cultural thing altogether, but it relates a little to: c) why not document the interfaces as you write the code, to make obvious things obvious? I mean, if the interfaces are better described, then maybe new team members will absorb the knowledge better?

Another Ruby guy tried the Go language and wrote an interesting post about his experiences with the static (although implicit) typing he encountered in Go:
I guarantee that no matter how much I look at the Ruby code, if it runs for any length of time I'll eventually encounter a typing error because of a bug in the code…
OK, the same old story about interface documentation and checking as above. But read on, there's more to come:
I've learned that I vastly prefer the cost of slower development time to gain reliability and safety, as long as the cost isn't too high. Haskell is even safer, but it pushes the boundaries of how much pain I'm willing to endure trying to get code to pass through the type checker
Here he's making a good point, kind of reiterating my main argument: there's a trade-off between the developer time needed to write a piece of code and its reliability. In the past we opted for ease of construction, but what if we can have the reliability at a very small additional cost? With type inference we hit a sweet spot here!****
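A small OCaml sketch of that sweet spot (again a toy example of mine): not a single type annotation is written, yet everything is fully checked, and the inferred signatures document the interfaces for free.

(* No annotations anywhere - the compiler infers all the types below. *)
let average xs =
  List.fold_left (+.) 0.0 xs /. float_of_int (List.length xs)
(* inferred: val average : float list -> float *)

let describe name scores =
  Printf.sprintf "%s scored %.1f on average" name (average scores)
(* inferred: val describe : string -> float list -> string *)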

So it's decided: there are real problems with dynamic typing in real-world projects, and static typing can help out here. Why would anyone want to use a dynamic language (besides scripting some small pieces of code) then?

But wait, what about Clojure? I like this language... and I'm somewhat in two minds here - reason says ML but the heart says Lisp. So is this an emotional thing in the end? But in business you have to follow the path of reason, not of coolness, to deliver results to your clients. And there aren't that many Lisp projects out there... So the decision isn't difficult: I'll opt for statically typed languages in general, but when offered a Clojure project (which will happen very seldom, I guess) I'll jump at it! Don't you think that's a good tradeoff?

Goodie:

PS: do you remember pre-ANSI C? Probably not, but it wasn't type safe:
Within the function, the declaration of a is visible, so the compiler knows it's a char*. For a call, the compiler doesn't know what type of argument the function expects, so it's up to the caller to pass the right type. It's similar to what happens with printf-like functions.*** 
Here's an example for those who don't remember the "stone age":
int
ParseGLFunc (interp, argc, argv, nArg)
    Tcl_Interp *interp;   /* K&R style: parameter types are declared here,  */
    int argc;             /* after the parameter list - so at a call site   */
    char *argv[];         /* the compiler has no idea what types the        */
    int *nArg;            /* function expects and cannot check the caller.  */
{
    ....
}
--
* I'm sure you know it already, but statically typed means that the language checks types at compile time, whereas dynamically typed means that type correctness is checked at runtime.
** An article by Bruce Eckel: "Strong Typing vs. Strong Testing"
*** found it here: "pre-ansi declarations"
**** Here I must admit that I'm just impressed with the ML family of languages (i.e. OCaml, F#, and well, Haskell too) and the automatic type deduction the compiler is doing! The code just flows from your fingers as in Python or Ruby, but it's type-checked!

Wednesday 11 January 2012

ASSERT(AfxGetThread() == NULL) with VisualStudio 2010, or the DLL-hell, revamped

Everybody thinks DLL-hell is a thing of the past. Me included: #WTF, just start the Dependency Walker, look at which libraries got loaded, correct it, and be happy*. Not so: with the advent of Visual Studio 2010, everything got a bit more complicated!

Let me describe the problem I encountered while working on a project for one of my customers. Let's start with a...

Disclaimer:

This is only a short technical note - for those who (like me) first check the Internet for a solution to weird programming problems!

The Problem:

is maybe best described by this desperate post:
Hi all,
I have a question to solve, the situation is I use a exe to load a dll(MFC regular dll),but when I use LoadLibrary() to load the dll and in the constructor of the object whose base class is CWinApp in the dll, It have a assert error, and the code line is ASSERT (AfxGetThread() == NULL).Is there anyone have the similar situation with me and have thoughts about it can tell me ,we can talk about it.
thanks in advance.
Well, the same happened to me when I was porting a C++ application from Visual Studio 2008 to VS 2010. To be honest, the situation was a little more complicated: a mixed-mode C++/C# program using C++/CLI as glue. But back to the problem: the standard explanation for that ASSERT is that you have forgotten to include the AFX_MANAGE_STATE macro (Reason 1) in exported DLL functions, as described here.

Sadly, this wasn't my problem. It applies to non-MFC DLLs only**, which mine wasn't, so the whole module-state management business should be taken care of automatically by the DLL initialization code the linker generates. This in turn is controlled by #define macros used with the compiler (like _USRDLL, _AFXDLL and their hellish likes). So maybe this could be the reason? Nope, everything was correct :-(.

After more web-searching and finding only tons of hapless cries for help, I stumbled upon the second useful link:
Sounds like you have two or more CWinApp objects, and they're using the same AfxGetThread data. I suppose this could happen if your libraries are linking to different versions of MFC, including mixing of debug and release versions. You might use Dependency Walker (depends.exe) to determine the DLLs you link to implicitly, or the Sysinternals Process Explorer to determine the DLLs you're using at runtime
Well, that sounds interesting - maybe I've mixed debug and release DLLs? Nope (although I made that error later on, so it's definitely a good point to keep in mind (Reason 2)). What could it possibly be then???

The Solution:

The solution is more complicated, but somehow both typical and symptomatic of a real-life, nontrivial system built on Microsoft technology... As my client had a really old codebase that had grown steadily over the years, all possible Microsoft technologies could be found there: beginning with plain old DLLs, through MFC-based ones, further on to C++/CLI, and at last arriving at C# and .NET. This last one was burdened with a constraint: we had to use .NET 2.0. The problem is that Visual Studio 2010 doesn't support .NET 2.0 (at least not for C++/CLI projects)! Because of time constraints, Microsoft decided that only .NET 4.0 would be supported out of the box, and that the 2008 VS compiler would be integrated into Visual Studio 2010 to be used for older .NET versions.

Well, in itself this wouldn't pose any problems, were it not for the C++/CLI code which we used as glue between the pure C++/MFC and the newer C# parts. It has to be compiled for .NET 2.0 as well, and thus has to use the 2008 VS compiler. But it is put in a DLL which in turn is loaded by a traditional C++/MFC one (created with the 2010 VS compiler!). See it? Two different compilers, two different runtimes, two CWinApp objects as in (Reason 2) (or maybe the loader gets misled, I'm not sure about it). In any case, Visual Studio 2010 is forcing us to use two compilers in one application here (Reason 3).
 
The Moral:

I'm not sure about that one. How about this: stay away from MFC and use Qt? Or: don't migrate your code to Visual Studio 2010 (as it's rather slow to boot anyway)? Or simply: you're damned if you have to be backwards compatible?
--
* see here for the author's previous letter from DLL-hell
** more hellish details: 3 kinds of DLLs you can use on Windows: ... TODO: link (but that's just plain boring!)