Dynamic type languages versus static type languages
Well, both are very, very misunderstood, and they are also two completely different things that aren’t mutually exclusive.
Static types are a restriction of the grammar of the language. Strictly speaking, statically typed languages could be said not to be context free. The simple truth is that it becomes inconvenient to express a language sanely in a context free grammar unless it treats all its data simply as bit vectors. A static type system, if the language has one, is part of its grammar; it simply restricts the language more than a context free grammar could, so grammatical checks really happen in two passes over the source. Static types correspond to the mathematical notion of type theory, which simply restricts the legality of some expressions. For instance, I can’t say 3 + [4,7] in maths, and that is because of its type theory.
Static types are thus not a way to ‘prevent errors’ from a theoretical perspective; they are a limitation of the grammar. Indeed, provided that +, 3 and intervals have the usual set theoretical definitions, 3 + [4,7] has a pretty well defined result, namely a set, once we remove the type system. ‘Runtime type errors’ theoretically do not exist; the type system’s practical use is to prevent operations that would make no sense to human beings. Operations are still just the shifting and manipulation of bits, of course.
The catch is that a type system can’t decide whether such operations would occur if the program were allowed to run. That is, it can’t exactly partition the set of all possible programs into those that will have a ‘type error’ and those that won’t. It can do only two things:
1: prove that type errors are going to occur in a program
2: prove that they aren’t going to occur in a program
This might seem like I’m contradicting myself. But what a C or Java type checker does is reject a program as ‘ungrammatical’, or as it calls it a ‘type error’, if it can’t succeed at 2. That it can’t prove type errors won’t occur doesn’t mean they will occur; it just means it can’t prove it. It may very well be that a program which would never have a type error is rejected simply because the compiler can’t prove that. A simple example is if(1) a = 3; else a = "string"; where the condition is always true, so the else-branch will never be executed and no type error will occur. But the checker can’t prove such cases in a general way, so the program is rejected. This is the major weakness of a lot of statically typed languages: in protecting you against yourself, they necessarily also protect you in cases where you don’t need it.
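To make that concrete, here is a minimal Python sketch of the same program. Python does no static check, so it runs as written; the function name `assign` is my own and only there for illustration. A checker that insists on a single type for `a` would have to reject it, even though the string branch can never execute.

```python
def assign():
    # The condition is always true, so the else branch is dead code.
    if 1:
        a = 3
    else:
        a = "string"  # never executed
    # Integer addition: no type error can occur on this path.
    return a + 1

result = assign()
print(result)
```

The run completes without any error, which is exactly the case a ‘prove it is safe’ checker throws away.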
But, contrary to popular belief, there are also statically typed languages that work by principle 1. They simply reject all programs that they can prove will cause a type error, and pass all programs for which they can’t. So it’s possible that they allow programs which have type errors in them. A good example is Typed Racket, a hybrid between dynamic and static typing. And some would argue that you get the best of both worlds in this system.
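The difference between the two principles can be sketched with a toy checker. This is not how any real compiler is built; the function `check_add`, the policy names and the use of `None` for ‘the checker cannot tell’ are all assumptions made up for this illustration.

```python
def check_add(left, right, policy):
    """Decide whether `left + right` is accepted.

    left, right: "int", "str", or None when the checker cannot tell.
    policy "prove_error": reject only what is provably wrong (principle 1).
    policy "prove_safe":  accept only what is provably fine (principle 2).
    """
    known = left is not None and right is not None
    if known and left == right:
        return True   # proven safe: accepted under both policies
    if known and left != right:
        return False  # proven wrong: rejected under both policies
    # At least one operand type is unknown: here the policies disagree.
    return policy == "prove_error"

print(check_add("int", None, "prove_safe"))
print(check_add("int", None, "prove_error"))
```

The two policies only differ on the programs the checker can’t decide, which is where all the arguing about static versus dynamic really takes place.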
Another advantage of static typing is that types are known at compile time, and thus the compiler can use them. If in Java we write "string" + "string" or 3 + 3, the two + tokens in the text end up representing completely different operations on different data, and the compiler knows which to choose from the types alone.
Now, I’m going to make a very controversial statement here, but bear with me: ‘dynamic typing’ does not exist.
Sounds very controversial, but it’s true: dynamically typed languages are, from a theoretical perspective, untyped. They are just statically typed languages with only one type. Or simply put, they are languages that in practice are indeed generated by a context free grammar.
Why don’t they have types? Because every operation is defined and allowed on every operand. What’s a ‘runtime type error’ exactly? From a theoretical perspective it’s purely a side effect. If print("string"), which prints a string, is an operation, then so is length(3). The former has the side effect of writing string to the standard output; the latter simply has the side effect of writing error: function 'length' expects array as argument. That’s it. From a theoretical perspective there is no such thing as a dynamically typed language. They are untyped.
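Python shows the same thing if you treat the error as just another outcome. In this small sketch the helper `describe` is hypothetical; `len` and `TypeError` are the real built-ins, and `len(3)` raising `TypeError` is documented behaviour of the language, not a crash of it.

```python
def describe(operation, argument):
    """Run an operation and report what it did, error or not."""
    try:
        return ("value", operation(argument))
    except TypeError as exc:
        # The so-called type error is an ordinary, specified outcome.
        return ("raised", type(exc).__name__)

print(describe(len, "string"))
print(describe(len, 3))
```

Both calls return normally from `describe`; one operation produced a number and the other produced an exception, and the language defines both.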
All right, the obvious advantage of a ‘dynamically typed’ language is expressive power; a type system is nothing but a limitation of expressive power. And in general, languages with a type system would indeed have a defined result for all those disallowed operations if the type system were just ignored; the results would just not make sense to humans. Some languages even lose their Turing completeness once a type system is applied, the simply typed lambda calculus being the textbook case.
The obvious disadvantage is that operations can occur which produce results that are nonsensical to humans. To guard against this, dynamically typed languages typically redefine those operations: rather than producing the nonsensical result, the operation has the side effect of writing out an error, and possibly halting the program altogether. This is not an ‘error’ at all. In fact, the language specification usually prescribes it, so from a theoretical perspective it is as much behaviour of the language as printing a string. These languages thus force the programmer to reason about the flow of the code to make sure that this doesn’t happen. Reasoning so that it does happen can also be handy at some points for debugging, which shows again that it’s not an ‘error’ but a well defined property of the language.
In effect, the single remnant of ‘dynamic typing’ that most languages have is guarding against division by zero. This is what dynamic typing is: there are no types, no more than zero is a different type from all the other numbers. What people call a ‘type’ is just another property of a datum, like the length of an array or the first character of a string. And many dynamically typed languages also allow you to write out things like "error: the first character of this string should be a 'z'".
Another thing is that dynamically typed languages have the type available at runtime, and can usually check it, deal with it and decide from it. Of course, in theory it’s no different from accessing the first char of an array and seeing what it is. In fact, you can make your own dynamic C: use only one type, like long long int, use the first 8 bits of it to store your ‘type’, and write functions that check it and perform float or integer addition accordingly. You then have a statically typed language with one type, or a dynamic language.
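Here is a rough sketch of that trick, written in Python rather than C and emulating the 64-bit word with a plain integer. Everything in it is an assumption for illustration: the tag values, taking ‘the first 8 bits’ to mean the top 8 bits, storing floats as 32-bit patterns, and handling only non-negative integers.

```python
import struct

TAG_SHIFT = 56                 # top 8 bits of the 64-bit word hold the tag
PAYLOAD_MASK = (1 << 56) - 1   # remaining 56 bits hold the value
TAG_INT, TAG_FLOAT = 0, 1      # hypothetical tag values

def make_int(n):
    # Non-negative integers only, to keep the sketch short.
    return (TAG_INT << TAG_SHIFT) | (n & PAYLOAD_MASK)

def make_float(x):
    # Store the 32-bit IEEE 754 bit pattern of x in the payload.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (TAG_FLOAT << TAG_SHIFT) | bits

def tag_of(word):
    return word >> TAG_SHIFT

def to_int(word):
    return word & PAYLOAD_MASK

def to_float(word):
    bits = word & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def add(a, b):
    # Inspect the tags at run time and pick integer or float addition.
    if tag_of(a) == TAG_INT and tag_of(b) == TAG_INT:
        return make_int(to_int(a) + to_int(b))
    if tag_of(a) == TAG_FLOAT and tag_of(b) == TAG_FLOAT:
        return make_float(to_float(a) + to_float(b))
    raise ValueError("error: mismatched tags")

print(to_int(add(make_int(2), make_int(3))))
print(to_float(add(make_float(1.5), make_float(2.25))))
```

Every value has the same static type, a bare integer, and the ‘type’ is just a few bits that `add` chooses to look at, which is the whole point.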
In practice this all shows: statically typed languages are generally used for writing commercial software, whereas dynamically typed languages tend to be used for solving some problem or automating some task. Writing code in statically typed languages simply takes longer and is cumbersome, because you can’t do things which you know are going to turn out okay; the type system still protects you against errors you aren’t making. Many coders don’t even realize they do this because it’s in their system, but when you code in static languages you often work around the fact that the type system won’t let you do things that can’t go wrong, because it can’t prove they won’t go wrong.
As I noted, ‘statically typed’ in general means case 2: guilty until proven innocent. But some languages, which do not derive their type system from type theory at all, use rule 1: innocent until proven guilty, which might be the ideal hybrid. So, maybe Typed Racket is for you.
Also, well, for a more absurd and extreme example, I’m currently implementing a language where ‘types’ are truly the first character of an array. They are data, data of the ‘type’ ‘type’, which is itself a type and a datum, the only datum which has itself as a type. Types are not finite or bounded statically; new types may be generated based on runtime information.
Originally posted 2013-11-09 21:19:52.