Blog Archive

Tuesday, 30 September 2014

JS got cha

A gotcha is something that got you. You did not get it; it got you. A gotcha is something nobody tells you about, because it seems inevitable, like a pothole in the road. "Why should I tell anybody? My car crashed into it too!"

We tend not to talk about our bad experiences. We don't want to remember them; we would rather talk about our success stories. That way everybody coming behind us also crashes into that hole. Maybe we feel less alone then? One thing is sure: mankind is the dominant species on this planet because it can learn and communicate.

This is about the biggest surprises I had with JavaScript. They are normal for a JS programmer, but hard to get used to for a Java programmer.
A road full of holes!

Functions cannot be overloaded

When you read code like the following, what would you expect to be the outcome?

function foo() {
  console.log("foo() was called");
}

function foo(bar1) {
  console.log("foo(bar1) received "+bar1);
}

function foo(bar1, bar2) {
  console.log("foo(bar1, bar2) received "+bar1+", "+bar2);
}

foo();
foo("bar1");
foo("bar1", "bar2");

Output is:

foo(bar1, bar2) received undefined, undefined
foo(bar1, bar2) received bar1, undefined
foo(bar1, bar2) received bar1, bar2
Obviously all three calls go to the foo(bar1, bar2) function.
There is no function overloading in JavaScript. The last definition of foo() survived, all others were overwritten silently without a warning. A function must be unique by name in its namespace (scope). The parameter list is not significant for a function definition.

That did not prevent our function calls from working without error! You can call a JS function that declares three parameters with

  • zero
  • one
  • two
  • ... thousand ...
arguments!

Any parameter you do not provide when calling the function will be received as undefined. Not only can any parameter be of any type, it can also be absent.

Additionally, you can call a function that declares no parameters with as many arguments as you want! Such a function can read its argument values from the arguments object, an array-like local variable that is implicitly available inside every ordinary function (it is not a global variable).

function foobar() {
  for (var i = 0; i < arguments.length; i++)
    console.log("arguments["+i+"] = "+arguments[i]);
}

foobar("lots", "of", "args", "for", "foobar");

This yields:

arguments[0] = lots
arguments[1] = of
arguments[2] = args
arguments[3] = for
arguments[4] = foobar
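
Since overloading by parameter list is not available, one common workaround is a single function that branches on how many arguments it actually received. The following is only a minimal sketch; the name describeCall is made up for illustration.

```javascript
// Hypothetical example: one function emulating three "overloads"
// by inspecting how many arguments the caller actually passed.
function describeCall(bar1, bar2) {
  if (arguments.length === 0)
    return "describeCall() was called";

  if (arguments.length === 1)
    return "describeCall(bar1) received "+bar1;

  return "describeCall(bar1, bar2) received "+bar1+", "+bar2;
}

console.log(describeCall());
console.log(describeCall("bar1"));
console.log(describeCall("bar1", "bar2"));
```

Checking arguments.length rather than testing a parameter for undefined keeps an explicitly passed undefined distinguishable from a missing argument.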

Do you want more freedom?

Objects of same "class" can be different

When working with objects, we expect all instances of a class (or whatever JS defines as a class) to have the same structure. But in practice a JS Object is a map, and you can add or delete any property on it at any time. A property can also be a function. Thus objects created by the same constructor function ("class") can be altered to have totally different properties.

var Cat = function(name) {
  this.name = name;
}

var garfield = new Cat("Garfield");

var catbert = new Cat("Catbert");
delete catbert.name;
catbert.nickname = "Bert";

console.log("garfield's name is "+garfield.name);
console.log("garfield's nickname is "+garfield.nickname);
console.log("catbert's name is "+catbert.name);
console.log("catbert's nickname is "+catbert.nickname);

The output of this is:

garfield's name is Garfield
garfield's nickname is undefined
catbert's name is undefined
catbert's nickname is Bert
The two Cat instances now have nothing in common anymore. Nevertheless, catbert instanceof Cat would still yield true.
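
If you want instances to keep their shape, ECMAScript 5 offers Object.seal(), which forbids adding or deleting properties on an object. This is a sketch under the assumption of an ES5 engine; SealedCat and tom are illustrative names, not part of the example above.

```javascript
// Hypothetical constructor that seals every instance it creates
var SealedCat = function(name) {
  this.name = name;
  Object.seal(this); // no properties can be added or deleted afterwards
};

var tom = new SealedCat("Tom");

// In strict mode the two statements below throw a TypeError,
// in non-strict mode they fail silently; try/catch covers both cases.
try { delete tom.name; } catch (error) { /* strict mode only */ }
try { tom.nickname = "Tommy"; } catch (error) { /* strict mode only */ }

console.log("tom's name is "+tom.name);
console.log("tom's nickname is "+tom.nickname);
```

The values of existing properties can still be changed on a sealed object; Object.freeze() would forbid that too.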

One thing JavaScript is not missing at all: flexibility. This language is made of rubber!

Variables are "hoisted"

Hoisting sails might be a vital task on a ship. Hoisting variable declarations to the top of the function body can be fatal for a function, because it ignores the programmer's intent. Nonetheless JavaScript does exactly that: it pulls the declaration of any local variable out of its block and moves it to the top of the function body.

Consider the following code:

    var connectExpandControls = function() {
      var expandControls = document.body.getElementsByClassName("expandcontrol");

      for (var i = 0; i < expandControls.length; i++) {
        var expandControl = expandControls[i];
        var parent = expandControl.parentNode;
        var children = parent.children;
        var nextSibling, previous;

        for (var j = 0; j < children.length && ! nextSibling; j++) {
          var element = parent.children[j];
          if (previous === expandControl)
            nextSibling = element; // breaks loop
          else
            previous = element;
        }
          
        if (nextSibling)
          connect(expandControl, nextSibling);
      }
    };

This code loops over all elements with class "expandcontrol". For each of them it searches for the next sibling DOM element, and if there is one, it connects that sibling with the expandcontrol (whatever that means).

But this is wrong! See the bug? It would be interesting to know how long an experienced JS programmer needs to find it.

This code searches for the next sibling of the first expandcontrol, and then it connects all other expandcontrol instances to that first found sibling!

Other languages like C++ or Java limit the existence (and initialization) of a local variable to the { block braces } it was written in. JavaScript instead "hoists" the declarations of all local variables to the top of the function body, out of their blocks. A declaration without an initializer, like var nextSibling, does not reset the variable when the block is entered again. Thus nextSibling gets a value in the first loop pass and then keeps it, because it is not created anew each time the outer loop body is entered! The consequence is that the sibling search loop is not executed for any expandcontrol after the first, because by then nextSibling already has a value.
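
The effect can be reproduced without any DOM. The following is a minimal sketch with made-up names (hoistingDemo, seen) showing that a var declared inside a loop body keeps its value from the previous pass.

```javascript
// Hypothetical demo: "seen" is declared inside the loop body,
// but the declaration is hoisted to the top of the function.
function hoistingDemo() {
  var results = [];

  for (var i = 0; i < 3; i++) {
    var seen;            // NOT reset on each pass
    results.push(seen);  // records what "seen" holds on entering the pass
    seen = "pass "+i;
  }

  return results;
}

console.log(hoistingDemo()); // undefined, then "pass 0", then "pass 1"
```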

Here is a fixed version of the code:

    var connectExpandControls = function() {
      var expandControls = ....

      for (var i = 0; i < expandControls.length; i++) {
        ....
        var nextSibling = undefined, previous = undefined;

        for (var j = 0; j < children.length && ! nextSibling; j++) {
          var element = parent.children[j];
          if (previous === expandControl)
            nextSibling = element; // breaks loop
          else
            previous = element;
        }
          
        ....
      }
    };

The difference is that the variables nextSibling and previous are now reset to undefined every time the outer loop body is entered. Thus the inner loop is executed on each pass.

The following variant of the fix would be recommended by a lot of JS programmers. They argue:

"Do not pretend that variables are not hoisted; write them where the interpreter will put them anyway, so that you keep control of their values".
    var connectExpandControls = function() {
      var expandControls = document.body.getElementsByClassName("expandcontrol");
      var expandControl;
      var parent;
      var children;
      var nextSibling, previous;
      var element;

      for (var i = 0; i < expandControls.length; i++) {
        expandControl = expandControls[i];
        parent = expandControl.parentNode;
        children = parent.children;
        nextSibling = undefined;
        previous = undefined;

        for (var j = 0; j < children.length && ! nextSibling; j++) {
          element = parent.children[j];
          if (previous === expandControl)
            nextSibling = element; // breaks loop
          else
            previous = element;
        }
          
        if (nextSibling)
          connect(expandControl, nextSibling);
      }
    };

This is the reason why so many JS functions start with a long list of local variables. Moreover JS functions are used as modules, and these contain a lot of other functions and even sub-modules. So the scope and visibility of "local" variables becomes hard to control (just because var declarations are scoped to the function, not to the block).

This makes life hard when refactoring big functions by splitting them into smaller ones. And there are a lot of big functions out there!
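
Engines that support ECMAScript 2015, a standard that appeared after this post was written, offer let and const, which are scoped to the enclosing block. The sketch below assumes such an engine and uses made-up names; it shows that a let declared in a loop body starts as undefined on every pass.

```javascript
// Hypothetical demo, assuming an ES2015-capable engine:
// "seen" is block-scoped, so each pass gets a fresh, undefined variable.
function blockScopeDemo() {
  const results = [];

  for (let i = 0; i < 3; i++) {
    let seen;            // created anew on each pass
    results.push(seen);
    seen = "pass "+i;
  }

  return results;
}

console.log(blockScopeDemo()); // three times undefined
```

With let, a declaration like nextSibling inside the outer loop body would behave the way a Java programmer expects, without an explicit reset.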

No dependency definitions

C and C++ have include directives, Java and other more modern languages prefer import (to avoid preprocessors), but JavaScript, as it runs in browsers at the time of writing (ECMAScript 5), has nothing at all. To realize what that means, imagine the following JS code dependencies of an HTML page:

  <script type="text/javascript" src="js/folding.js"></script>
  <script type="text/javascript" src="js/ajax.js"></script>
  <script type="text/javascript" src="js/sourceDisplay.js"></script>

  <script type="text/javascript">
    "use strict";
    
    window.addEventListener("load", function() {
      var target = document.getElementById("sourceGoesHere");
      var sourceDisplayer = sourceDisplay.create();
      sourceDisplayer.displayPage(target);
    });
  </script>

This HTML page loads three scripts, although it obviously uses just one JS object: sourceDisplay. It does so because sourceDisplay won't work when folding and ajax are missing.

Here is an outline of the dependencies as they exist in the corresponding JS code:

  • sourceDisplay.js
    • folding.js
    • ajax.js

That means that not only does the JS source code contain dependencies on external variables or functions, but the HTML page that uses that code must repeat these dependencies. It is an inevitable consequence that these dependency definitions break sooner or later.

So how can you import a JavaScript file reliably into your HTML page?

Carefully read the JS code and look for variables or functions that are not defined. When you have found them, you have to look around for JS files that define the missing identifiers. Having found all definitions and having eliminated ambiguities, you can finally write script tags into the HTML page. This will hold until the next release, when you must do it all again. Releases are weekly :-)
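
One way to at least fail early is a small check at the top of a script that verifies its dependencies are present. This is only a sketch: the helper findMissingGlobals is made up, and it assumes that folding.js and ajax.js define global objects named folding and ajax, which the page above does not show.

```javascript
// Hypothetical helper: returns those names that are not defined on the
// given global object (this would be "window" in a browser).
function findMissingGlobals(globalObject, requiredNames) {
  var missing = [];

  for (var i = 0; i < requiredNames.length; i++)
    if (typeof globalObject[requiredNames[i]] === "undefined")
      missing.push(requiredNames[i]);

  return missing;
}

// Stand-in for "window" after only folding.js was loaded (illustrative)
var fakeWindow = { folding: {} };
var missing = findMissingGlobals(fakeWindow, [ "folding", "ajax" ]);

console.log("missing dependencies: "+missing.join(", "));
```

Such a check does not remove the duplicated dependency list, but it turns a late "undefined is not a function" into a clear message at load time.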

Hard and soft comparisons

In JS there are two different comparison operators. In addition to the traditional "==" there is also "===", and besides the negation "!=" you have "!==". The semantics are different: "==" is called equality (soft), "===" identity or strict equality (hard).

Expression                    Result
null == undefined             true
null === undefined            false
null == 0                     false
null === 0                    false
null == '0'                   false
null === '0'                  false
0 == undefined                false
0 === undefined               false
0 == '0'                      true
0 === '0'                     false
0 == ''                       true
0 === ''                      false
0 == new String('')           true
0 === new String('')          false
0 == '\t\r\n '                true
0 === '\t\r\n '               false
'' == undefined               false
'' === undefined              false
'' == new String('')          true
'' === new String('')         false
'' == false                   true
'' === false                  false
false == 'false'              false
false === 'false'             false
false == '0'                  true
false === '0'                 false
false == 0                    true
false === 0                   false
false == undefined            false
false === undefined           false
false == null                 false
false === null                false

This is close to a science; look at these charts on stackoverflow.
The fact is that "==" tries to coerce the types of the operands being compared, while "===" does not (and thus can also be faster).
Identity comparison is what we would mostly expect.
I use only "===" and "!==", no more "==" and "!=".

And this is how an if condition works with these expressions:

Condition                     Result
if (undefined)                false
if (null)                     false
if (0)                        false
if ('0')                      true
if ('')                       false
if (new String(''))           true
if ('\t\r\n ')                true

Due to such strange behaviors you cannot substitute

if ( ! rainy )
by
if ( rainy === false )
in JavaScript, as you can in Java at any time without even thinking.
When rainy is undefined or null, you might get quite unexpected results, because undefined !== false and null !== false ...
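
The difference between the two conditions can be made visible with a small sketch; compareChecks is a made-up helper that evaluates both expressions for the same value.

```javascript
// Hypothetical helper: evaluates both conditions for one value
function compareChecks(rainy) {
  return {
    negation: ! rainy,            // true for every "falsy" value
    strictFalse: rainy === false  // true for the boolean false only
  };
}

var candidates = [ false, undefined, null, 0, "" ];
for (var i = 0; i < candidates.length; i++) {
  var checks = compareChecks(candidates[i]);
  console.log(String(candidates[i])+": ! rainy = "+checks.negation+", rainy === false = "+checks.strictFalse);
}
```

Only for the boolean false do both conditions agree; for undefined, null, 0 and the empty string the negation is true while the strict comparison is false.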

Parameter default assignments

Another nice pitfall is the popular idiom for parameter defaults, which goes badly wrong for boolean parameters:

var calculateWidth = function(element, calculateMaximum) {
  var maximum = calculateMaximum || true; // don't do this!
  if (maximum)
    ....
}

The intent is to give the variable maximum a default value of true when the caller has not provided that parameter. But the outcome is fatal, because maximum will never be false. Consider the following cases:

  calculateWidth(theElement);
  // var maximum = undefined || true;
  // -> this works as intended

  calculateWidth(theElement, false);
  // var maximum = false || true;
  // -> false || true evaluates to true, so maximum will ALWAYS be true!

So you must always know the parameter type when using this idiom.
At least for boolean parameters you must do the following:

var calculateWidth = function(element, calculateMaximum) {
  var maximum = (calculateMaximum !== undefined) ? calculateMaximum : true; // correct!
  ....
}
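
The undefined check can be factored into a small helper so that it works for any parameter type. This is a minimal sketch; withDefault and maximumFlag are made-up names.

```javascript
// Hypothetical helper: falls back to the default only when the
// value is really absent, not when it is merely "falsy".
function withDefault(value, defaultValue) {
  return (value !== undefined) ? value : defaultValue;
}

// Illustrative use for a boolean parameter that defaults to true
var maximumFlag = function(calculateMaximum) {
  return withDefault(calculateMaximum, true);
};

console.log("no argument: "+maximumFlag());
console.log("false given: "+maximumFlag(false));
console.log("zero with default 5: "+withDefault(0, 5));
```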


Saturday, 27 September 2014

This JS new ROFLCOPTER

If you haven't yet been rolling on the floor laughing today, you could try this. Read the article, and then the comment below ...

If this didn't make you ROFL, go further and try to find out what's the meaning of Array.prototype.slice.call(arguments, 1). For that you could go to this. Try to understand the article, read the comments below ...

Still not ROFLING? You must be a JavaScript programmer :-) They have nothing left to laugh about :-(

For me, I could not stop laughing about these "excellent" comments:

  • "I found that javascript is easier to understand than english"
  • "This explanation has done more to deepen my understanding of javascript than anything I have ever read."
  • "Excellent explanation. You made so many other things clear too!"

If you need stronger stuff, go to this site ...

But you're right, it is really serious. Failed communication is the root of all evil, or was it premature optimization?
Whatever, I must find out what happens when I call new in JavaScript. For anyone who knows new as a close relative of the C-language alloc(bytes), finding this out must be like receiving the >sign of the times<!

The JavaScript keyword "new"

First I try it with an Object as argument:

var someObject = {};
var someInstance = new someObject; // [object Object] is not a function

So this fails; new requires a function as its operand.

var someFunction = function() {
  console.log("Could it be that I am a constructor?");
};
var someInstance = new someFunction();

So this worked, no error. But what does it return?

console.log("someInstance instanceof Object = "+(someInstance instanceof Object));
console.log("someInstance instanceof Function = "+(someInstance instanceof Function));

console.log("someInstance = "+someInstance);
console.log("someInstance.constructor = "+someInstance.constructor);
console.log("someInstance.prototype = "+someInstance.prototype);

This outputs:

someInstance instanceof Object = true
someInstance instanceof Function = false
someInstance = [object Object]
someInstance.constructor = function() {
  console.log("Could it be that I am a constructor?");
}
someInstance.prototype = undefined

The created thing

  • is a (new) Object,
  • has the given function as "constructor",
  • has no "prototype" property of its own (only functions have one; the instance is linked internally to the function's prototype object)
  • and the function has been executed in the context of the returned Object.

Because we cannot call new on Objects, it seems that classes are represented by functions in JavaScript.
To be able to distinguish functions that create instances from those that do not, some wise people introduced the convention of capitalizing the names of constructor functions.

var Foo = function(a, b) {
  this.A = a;
  this.B = b;
};
var foo = new Foo(-1, 0);

Now before digging into the "this" keyword, here is a summary of the >sign of the times< (semantics that have gone into the C alloc() function). The new keyword ...

  • provides a new Object,
  • sets the internal prototype link (__proto__) of that Object to the prototype property of the given function (whatever that means),
  • calls the function with that new Object as context,
  • when the function returns an Object, that Object is returned,
  • otherwise the newly created Object is returned.

So keep in mind: when your "constructor" function returns an Object, THAT will be what the new operation returns! (And you yourself are responsible for whether the instanceof operator works on it correctly.)

var Animal = function(givenName) {
  return {
    name: givenName
  };
}

var animal = new Animal("Garfield");
console.log("animal instanceof Animal: "+(animal instanceof Animal));

This yields:

animal instanceof Animal: false
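
The steps that new performs, as summarized above, can be approximated in plain ES5 code. This is a sketch for illustration only, not how an engine implements it; construct, Point and Wrapper are made-up names.

```javascript
// Hypothetical re-implementation of what "new" roughly does
function construct(constructorFunction, args) {
  // 1. create a new object linked to the function's prototype
  var instance = Object.create(constructorFunction.prototype);

  // 2. call the function with the new object as context ("this")
  var result = constructorFunction.apply(instance, args);

  // 3. if the function returned an object (or function), return that,
  //    otherwise return the newly created object
  var returnedObject = (result !== null) &&
      (typeof result === "object" || typeof result === "function");
  return returnedObject ? result : instance;
}

var Point = function(x, y) { this.x = x; this.y = y; };
var Wrapper = function(value) { return { value: value }; };

var point = construct(Point, [ 1, 2 ]);
var wrapped = construct(Wrapper, [ "w" ]);

console.log("point instanceof Point: "+(point instanceof Point));
console.log("wrapped instanceof Wrapper: "+(wrapped instanceof Wrapper));
```

As with the Animal example, the object returned by Wrapper is a plain object literal, so instanceof cannot relate it to its "constructor".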

The JavaScript keyword "this"

When looking for explanations of the this keyword, you can find sentences like

  • this is whatever is left of the dot
  • this in a function is the Object that called the function

The following code refers to the Foo constructor function above.

var forgotNew = Foo(1, 2); // "this" is the "window" object

console.log("forgotNew="+forgotNew);
console.log("A = "+A);
console.log("B = "+B);

var appliedNew = new Foo(3, 4); // "this" is the object created by "new"

console.log("appliedNew.A = "+appliedNew.A);
console.log("appliedNew.B = "+appliedNew.B);
console.log("A = "+A);
console.log("B = "+B);

This outputs:

forgotNew=undefined
A = 1
B = 2
appliedNew.A = 3
appliedNew.B = 4
A = 1
B = 2
We see that
  • the function did not return anything (void)
  • when called without the new operator,
    • it wrote A and B to the default context, which in this case (non-strict code in a browser) was the window Object
  • when called with the new operator,
    • it wrote A and B to the Object created by the new operator
Mind: by leaving out the new operator you create unintended global variables!

So the convention of capitalizing "constructor functions" makes even more sense. A constructor function should never be called without a preceding new.

The "this" is an Object; I call it the context. Functions are not bound to their objects. Sometimes we do not realize that the context changes in JavaScript. Look at the following example.
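
A constructor function can also protect itself against a forgotten new. The following is a sketch of that widely used guard; SafeFoo is a made-up name and not part of the examples above.

```javascript
// Hypothetical constructor that repairs a call without "new"
var SafeFoo = function(a, b) {
  if ( ! (this instanceof SafeFoo))
    return new SafeFoo(a, b); // called as plain function: redo it with new

  this.A = a;
  this.B = b;
};

var withoutNew = SafeFoo(1, 2);
var withNew = new SafeFoo(3, 4);

console.log("withoutNew.A = "+withoutNew.A+", withoutNew.B = "+withoutNew.B);
console.log("withNew.A = "+withNew.A+", withNew.B = "+withNew.B);
```

The guard works in strict mode too, because there "this" is undefined in a plain call, and undefined instanceof SafeFoo is false.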

var fooCapsule = {
  Foo: function(a, b) {
    this.A = a;
    this.B = b;
  }
};

fooCapsule.Foo(5, 6);
console.log("fooCapsule.A = "+fooCapsule.A);
console.log("fooCapsule.B = "+fooCapsule.B);
console.log("A = "+A);
console.log("B = "+B);

This outputs:

fooCapsule.A = 5
fooCapsule.B = 6
A = 1
B = 2
We can see that the global variables were not affected by that, still A=1 and B=2.

But imagine setting a "function pointer" to some encapsulated function and then calling it:

var myFoo = fooCapsule.Foo;
myFoo(7, 8);
console.log("fooCapsule.A = "+fooCapsule.A);
console.log("fooCapsule.B = "+fooCapsule.B);
console.log("A = "+A);
console.log("B = "+B);

This outputs:

fooCapsule.A = 5
fooCapsule.B = 6
A = 7
B = 8

We see that the global A=7 and B=8 were affected now. Why? The function was not called on the Object it was defined within. It was called as a plain function, so its context was the browser's window Object, and it worked in that context.


This is another aspect of scoping in JavaScript:

  • functions are not bound to the Objects they were defined in.
And this is the reason why some JavaScript gurus recommend NOT using "this"!
Because it can be confusing, and functions called in a wrong context could do unintended damage.
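
If you do keep using "this", ECMAScript 5 provides Function.prototype.bind() to tie a function to a fixed context before handing it around. A minimal sketch with made-up names (capsule, boundFoo):

```javascript
// Hypothetical capsule, similar in shape to the example above
var capsule = {
  Foo: function(a, b) {
    this.A = a;
    this.B = b;
  }
};

// bind() returns a new function whose "this" is always capsule
var boundFoo = capsule.Foo.bind(capsule);
boundFoo(7, 8);

console.log("capsule.A = "+capsule.A);
console.log("capsule.B = "+capsule.B);
```

Called through the bound "function pointer", the values land in capsule and no global variables are created.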

When you don't use "this", you won't need "new". Neither is needed when using what I call "factory functions" to create Objects.

var fooFactory = function(a, b, context) {
  if ( ! context )
    context = {};
  
  context.A = a;
  context.B = b;
  
  return context;
};

var fooBar1 = fooFactory(9, 10);
var fooBar2 = fooFactory(11, 12);
console.log("fooBar1.A = "+fooBar1.A+", fooBar1.B = "+fooBar1.B);
console.log("fooBar2.A = "+fooBar2.A+", fooBar2.B = "+fooBar2.B);
console.log("A = "+A);
console.log("B = "+B);

This outputs:

fooBar1.A = 9, fooBar1.B = 10
fooBar2.A = 11, fooBar2.B = 12
A = 7
B = 8
We see that the global variables A and B were not affected. And we created new Object instances without using the new operator. The only difference between using new and creating Objects by {} is the prototype: new links the new Object to the prototype property of the constructor function, while {} links it to Object.prototype. But this plays a role only when modelling inheritance hierarchies.
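
The prototype difference can be inspected with Object.getPrototypeOf() from ECMAScript 5. This is a sketch with made-up names (Pair, viaNew, viaLiteral):

```javascript
// Hypothetical constructor and two ways of creating "the same" object
var Pair = function(a, b) {
  this.A = a;
  this.B = b;
};

var viaNew = new Pair(1, 2);
var viaLiteral = { A: 1, B: 2 };

console.log(Object.getPrototypeOf(viaNew) === Pair.prototype);       // true
console.log(Object.getPrototypeOf(viaLiteral) === Object.prototype); // true
console.log(viaNew instanceof Pair);                                 // true
console.log(viaLiteral instanceof Pair);                             // false
```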




Sunday, 21 September 2014

JavaScript Best Practices

As I am tired of looking for more JavaScript gotchas (there are too many of them), I summarize my best practice collection here and now. Read my other JS posts for a discussion of them (did I discuss them all?).

The aim of these practices is to minimize the risk of using the JavaScript programming language, and to achieve maximum readability.

  1. Write "use strict"; at the start of every function.
    That way you will quickly find undefined variables in your code.

  2. Always use var when defining variables.
    This is required in strict mode.

  3. Terminate every statement with a ; (semicolon).
    Even when it does not prevent the JS interpreter from inserting a semicolon at an unintended line break, it makes your code clearer.

  4. Use === and !== instead of == and !=
    And thus prove that none of your variables changed its type on the fly.

  5. Don't use null, use undefined instead.
    One of them is dispensable, and undefined is inevitable.

  6. Avoid new and this. Avoid instanceof.
    They can lead to very hard to find bugs. Code that speculates with "classes" is error-prone.

  7. Test for undefined using a simple if ( ! x ) ...;
    Avoid (typeof x === "undefined") expressions. Free undefined variables will be detected by strict mode. Use typeof only to test the existence of external JS library objects.

  8. Use functional inheritance.
    All other inheritance variants are error-prone expert tasks.

  9. Put any non-function code into an immediately-invoked-function-expression.
    Never use the global scope for variables and private functions.

  10. For better readability, do not use $ or _ in identifiers.
    Leave these characters to shared libraries like jQuery or underscore.

  11. Pass dependencies as function parameters.
    Try to not use global objects and functions, pass any dependency as parameter.

  12. Don't use navigator to identify browser vendors, better test for browser features.
    For example if (window.stop) window.stop(); - but don't use document.all to identify IE !

  13. Don't use arrays as maps, their length would be zero then.

  14. When using a for-in loop, wrap the loop body into a hasOwnProperty() condition.
    This ensures that no inherited properties appear in the loop.

  15. Don't use eval().
    Code that generates other code is hard to maintain, and mostly unnecessary.

  16. Document every function, its parameters, its return.
    In an untyped language like JS this is the quickest way to express how to use code.
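
Several of these practices (strict mode, no globals, an immediately-invoked function expression, dependencies passed as parameters) can be combined in one small module. The following is only a sketch; counterModule and its logger parameter are made up for illustration.

```javascript
/**
 * Hypothetical module that counts calls.
 * @param log function(message) used for output, passed in as dependency.
 * @return object with an increment() function that returns the new count.
 */
var counterModule = (function(log) {
  "use strict";

  var count = 0; // private, not visible outside the module

  var increment = function() {
    count += 1;
    log("count = "+count);
    return count;
  };

  return {
    increment: increment
  };
})(function(message) { console.log(message); });

counterModule.increment();
counterModule.increment();
```

Only counterModule itself ends up in the enclosing scope; count and increment stay private to the function.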

And, if you are not sure about whether readability and documentation (comments) are important: imagine coming back after three years and having to fix and extend your code :-)

If the intent of your code is clear, it can always be fixed to do what was intended.
If the intent is not clear, your code could be misinterpreted and the author might "lose face".

Finding out what some code was expected to do at the time of its writing is hard enough. Leave us some comments about why you are doing this or that (not about what you are doing).

Can you find more ways to say how important documentation is? Leave a comment!




Saturday, 20 September 2014

JS and the Forgotten Types

JavaScript is like a laptop that has USB sockets only.
You like it: the software will detect by itself which device you plug in, but ... where to plug in the power cable? Because undoubtedly there will be no software detection without power ...

I am sure some day there will be a laptop like this, expecting power from every cable before it is switched on. Nevertheless I always liked the fact that I cannot plug the scanner cable into the printer socket because they have different plugs. Different types of plugs. Looking from outside you can be sure that the plug connects to the correct hardware interface inside.

Typing is important. Big projects are mostly implemented using strongly typed programming languages. The compiler then checks that the scanner cable is plugged into the scanner socket, and the printer cable is plugged into the printer socket. That everything is done as intended by the developer. A developer expresses herself by types. When some code passes a parameter of type "Machine" to some other code that expects "Human", the compiler will detect and report this.

As an alternative you can use duck typing: if the "Human" has a switch with the label "On/Off", it surely can be used as "Machine", so why not decide this at runtime and then use the "Human" as "Machine"?

JavaScript is on this side. In contrast to Java it provides no type checks at all.

The Types of JavaScript


It is not that JavaScript has no data types. It has

  • String
  • Number
  • Boolean
  • Array
  • Object

but it is recommended not to create values via these constructors (unnecessary, slow).
Instead you use

  • var string = "me";
  • var number = 1;
  • var boolean = true; // boolean is not a keyword!
  • var array = [];
  • var object = {};

Best practice: Do not use new with predefined data types.

Exception is

  • var now = new Date();
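
Why the constructors are better avoided becomes visible when comparing a primitive with its wrapper object. A minimal sketch (variable names are illustrative):

```javascript
var primitive = "me";
var wrapped = new String("me"); // wrapper object, not a primitive string

console.log(typeof primitive);      // string
console.log(typeof wrapped);        // object
console.log(primitive === wrapped); // false, different types
console.log(primitive == wrapped);  // true, after type coercion

// A wrapper object is always "truthy", even when it wraps false
var wrappedFalse = new Boolean(false);
console.log(wrappedFalse ? "truthy" : "falsy"); // truthy
```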

Note that the definition of a variable does not allow declaring a data type for that variable. The same goes for the parameters of a function. So you never know what data type a parameter you receive actually has.

I want to look into the concept JavaScript has of arrays, maps and objects, and whether the common understanding of such terms also holds in the JavaScript world. Besides, I want to look at the for-in loop, and why it is regarded as a hazard.
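
Where a wrong parameter type would do harm, a function can check at runtime what the language does not check for it. The following is only a sketch; requireType and repeat are made-up names.

```javascript
// Hypothetical guard: throws when the value does not have the expected
// typeof result ("string", "number", "boolean", "function", ...)
function requireType(value, expectedType, parameterName) {
  if (typeof value !== expectedType)
    throw new TypeError(parameterName+" must be of type "+expectedType+", but was "+(typeof value));
  return value;
}

// Illustrative function that validates its parameters before using them
function repeat(text, times) {
  requireType(text, "string", "text");
  requireType(times, "number", "times");

  var result = "";
  for (var i = 0; i < times; i++)
    result += text;
  return result;
}

console.log(repeat("ab", 2)); // abab
```

Such checks cost runtime and code, so they are probably best reserved for the public functions of a module.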

Arrays and Maps

var map = {
 zero: 0,
 one: 1
};

console.log("map.zero = "+map.zero);
console.log("map['one'] = "+map["one"]);
// yields:
// map.zero = 0
// map['one'] = 1

Actually I allocated an object in the code above. But as it turns out you can use an object like a map. Accessing map elements is possible via variable.property or via variable["property"]; these seem to be the same in JavaScript. Here I see no difference between an object and a map.
But wait, maybe a map is an array?

var array = [];
array["zero"] = 0;
array["one"] = 1;
array[-1] = -1;
console.log("array.zero = "+array.zero);
console.log("array['one'] = "+array["one"]);
console.log("array[-1] = "+array[-1]);

// yields:
// array.zero = 0
// array['one'] = 1
// array[-1] = -1

OK, now which is it: is an array a map, or a map an array? Obviously they are accessed in exactly the same way, and both have the capability to store key/value tuples. So why make a difference between them?

var map = {};
map["zero"] = 0;
map["one"] = 1;
map[-1] = -1;
console.log("map.zero = "+map.zero);
console.log("map['one'] = "+map["one"]);
console.log("map[-1] = "+map[-1]);

// yields:
// map.zero = 0
// map['one'] = 1
// map[-1] = -1

Moreover the array accepts a negative "index" (-1), which a language like Java would reject. Actually the key -1 is converted to the string "-1" and stored as an ordinary property, just like "zero" and "one". It turns out that a JavaScript array is a map where the key is the array index and the value is the array element. And even this is possible in JavaScript (numbers as object properties):

var arraymap = {
 0: "zero",
 1: "one"
};
console.log("arraymap[0] = "+arraymap[0]);
console.log("arraymap[1] = "+arraymap[1]);

// yields:
// arraymap[0] = zero
// arraymap[1] = one

It seems there is no difference between arrays, objects and maps.

But now let's look at loops. Maybe they explain the difference.

Loops

var array = [];
array["zero"] = 0;
array["one"] = 1;

var map = {};
map["zero"] = 0;
map["one"] = 1;

First let's try out the good old C-style for-loop (where all those i, j, k mistakes happened :-):

console.log("array.length = "+array.length);
var i;
for (i = 0; i < array.length; i++)
 console.log("array["+i+"] = "+array[i]);

console.log("map.length = "+map.length);
for (i = 0; i < map.length; i++)
 console.log("map["+i+"] = "+map[i]);

// yields:
// array.length = 0
// map.length = undefined

So now I am stunned. No length, although there are obviously elements in both the array and the map? You cannot loop over either of them?

Best practice: Do not use arrays as maps, their length would be zero then.

Mind that JavaScript throws no error when executing i < map.length, although there is no length property in map! The comparison with undefined simply yields false.
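
Both observations can be checked in isolation. In this sketch (all names are illustrative) only a real, non-negative integer index changes the length of an array, and a comparison with undefined is false, so the loop body never runs.

```javascript
var abusedArray = [];
abusedArray["zero"] = 0;  // ordinary property, not an array element
abusedArray[-1] = -1;     // key becomes the string "-1", also no element
var lengthAsMap = abusedArray.length;

abusedArray[5] = "five";  // a real array index
var lengthAsArray = abusedArray.length;

console.log("length when used as map: "+lengthAsMap);     // 0
console.log("length after index 5 was set: "+lengthAsArray); // 6

// map.length is undefined, and 0 < undefined is false: no pass, no error
var emptyMap = {};
var passes = 0;
for (var i = 0; i < emptyMap.length; i++)
  passes++;

console.log("0 < undefined = "+(0 < undefined)); // false
console.log("loop passes: "+passes);             // 0
```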

For objects without length JavaScript has another kind of loop, the for-in loop:

var key;
for (key in array)
 console.log("array["+key+"] = "+array[key]);

for (key in map)
 console.log("map["+key+"] = "+map[key]);

// yields:
// array[zero] = 0
// array[one] = 1
// map[zero] = 0
// map[one] = 1

But now I want to know how arrays can be iterated with the C-style for-loop.

var array = [
  "zero",
  "one"
];
array.push("minus one");

console.log("array.length = "+array.length);
for (var i = 0; i < array.length; i++)
 console.log("array["+i+"] = "+array[i]);

// yields:
// array.length = 3
// array[0] = zero
// array[1] = one
// array[2] = minus one

So finally that array behaves like an array in other programming languages.

Summary:

  • a JavaScript object can be used like a map, a map is an object
  • a JavaScript array is an object
  • thus an array can be (ab)used as map
  • to use an array correctly, elements should be added either at construction or by array.push()
JavaScript does not feel like a language; it is more of a language kit. You can do things in many different ways.
This in turn is not very beneficial for a common understanding among programmers. They all tend to have their own JavaScript - it divides the world.

The for-in loop hazard

When you write a for-in loop and a tool like jshint checks your source code, you will get a warning if you did not do it the right way. The warning reads:

The body of a for in should be wrapped in an if statement to filter unwanted properties from the prototype.

What does that mean?
It means that the for-in loop enumerates the properties of an object, and could find properties you do not want in that loop. These unwanted properties can come from a prototype manipulation.
The prototype is an object that another object refers to as a kind of super-class object. It was made to facilitate inheritance.
What exactly that means is another story; the fact is that any JS code can put properties into the Object prototype, so that every object, including those allocated earlier, will appear to have that property.

Object.prototype.myObjectProperty = -666;

var map = {};
map["zero"] = 0;
map["one"] = 1;

for (var key in map)
 console.log("map["+key+"] = "+map[key]);

// yields:
// map[zero] = 0
// map[one] = 1
// map[myObjectProperty] = -666

To avoid this you must wrap the loop body into an if-condition:

for (var key in map)
 if (map.hasOwnProperty(key))
  console.log("map["+key+"] = "+map[key]);

The Object function hasOwnProperty() checks whether a property belongs to the object itself rather than to its prototype. That way you can always iterate safely.

Best practice: When using a for-in loop, wrap the loop body into a hasOwnProperty() condition.
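
The guarded loop can be wrapped into a reusable function. The sketch below uses made-up names (ownEntries, base) and creates the inherited property via Object.create() instead of changing Object.prototype, so that it does not affect other code. Object.keys() from ECMAScript 5 is shown as an alternative that returns own enumerable properties only.

```javascript
// Hypothetical helper: collects "key = value" for own properties only
function ownEntries(object) {
  var entries = [];
  for (var key in object)
    if (Object.prototype.hasOwnProperty.call(object, key))
      entries.push(key+" = "+object[key]);
  return entries;
}

var base = { inheritedProperty: -666 }; // plays the role of a prototype
var map = Object.create(base);          // map inherits from base
map.zero = 0;
map.one = 1;

console.log(ownEntries(map).join(", "));  // zero = 0, one = 1
console.log(Object.keys(map).join(", ")); // zero, one
console.log("inheritedProperty" in map);  // true, visible but not own
```

Calling hasOwnProperty through Object.prototype also works for objects that define a property named hasOwnProperty themselves.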

jQuery

Surely we do not want to write such technical code. Thank god there is jQuery. The jQuery JavaScript library does for browsers what Java does for operating systems: you can write browser-independent JS code with jQuery.
In jQuery, your loop over maps, arrays or objects would always look like one of the following (the 1st form is an instance call, the 2nd a static call):

$("li").each(function(index) {
  console.log(index+": "+$(this).text());
});
$.each(map, function(key, value ) {
  console.log(key+": "+value);
});

Best practice: Use jQuery wherever you can, it abstracts the browser.

By the way: '$' is a valid identifier character in JavaScript, like it is in Java. It can even be at the start of a variable or function name, and jQuery chose $ to be an alias name of the module. So when you write $.each(...), you could also write jQuery.each(...).

Best practice: Leave the $ to jQuery, for better readability do not use it in identifiers.




Friday, 19 September 2014

Omigosh, JavaScript!

This is the Blog I shied away from many times, because JavaScript really divides the world.

The late Nineties. People talked about the end of programming languages. Big computer companies predicted that everything will be graphical, the UNIX commandline shell was said to be outdated. Some even told us that in future we will build software applications by drag and drop, not by writing source code.

What a contradiction: what was expected to be the end time of programming languages actually was the start time of a language called JavaScript. It was used in internet browsers to do things that were hard to specify using HTML tags. First it was called LiveScript and then renamed to JavaScript, although it has nothing to do with the Java language. It is one of the most error-prone languages I know.

Nowadays applications are still created using programming languages. JavaScript (JS) survived and even got standardized under the name ECMAScript (ECMA: European Computer Manufacturers Association). It is the only language that all browsers support. It is increasingly used as the programming language for client-side applications running in browsers, which also includes MVC implementations (Model-View-Controller).
Still it is believed that JavaScript has something to do with Java, but it still has not :-) - except that Java meanwhile includes a JS scripting engine, maybe because that belief was so strong?

Differences


Java

is a language that is compiled from .java files into .class files, which are then interpreted by a Java virtual machine (JVM) that is available for nearly every platform. So you can run your .class file on Linux or Windows, it will work without recompilation.

You can develop in Java by downloading the free Java developer kit (JDK), and you can run your Java application using a Java runtime environment (JRE, part of the JDK). From the beginning Java provided a platform-dependent graphical user interface (GUI) called AWT, since 1999 also a platform-independent one called Swing, which is now going to be replaced by JavaFX (running also on mobiles that support Java).
Java was also available in browsers (-> applets), but in 2013 it was banned by Apple because a binary language seems to be too much of a security hazard for an internet browser (Apple banned applets, what a coincidence :-).

Java is a very simple language, lacking lots of the cryptic constructs existing in C and C++. It is fully object-oriented, but without multiple inheritance on implementation level. It does not provide a preprocessor like C, because a preprocessor obscures source code.
This is the reason why it is so popular: you can read that source code (and sometimes even understand it :-). Java is strongly typed, which makes it suitable for implementing big projects.

JavaScript

is a language that is interpreted. The interpreter runs within the internet browser and reads sources from either an HTML page's script tags (JavaScript embedded in HTML) or its referenced .js files. JavaScript was made to manipulate an HTML document, it makes little sense without the HTML document object model (DOM). Some people use it as a templating language on the server side, this is where the JavaScript embedded in Java comes into play. But this is just "fashion", there are better templating languages than JavaScript.

You can develop JavaScript by writing some HTML, including a <script> tag containing some JavaScript, and then loading that HTML page into the browser of your choice, using a file-URL like file:///home/me/mypage.html. When your script does not work, try to find the browser's developer tools, look at the console, or set a JS debugger breakpoint into your source and reload the page.

JavaScript also lacks a lot of the cryptic constructs of C/C++, but reading JavaScript source is not so easy due to the many techniques that have evolved around this very flexible language. It does not support static typing: a JS variable can hold a string, and in the next moment a number. JavaScript pretends to be object-oriented, but it isn't (in the common sense). And it is not a functional language, even if you see the word "function" everywhere in source code.
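
A tiny sketch of what the missing static typing looks like in practice (the variable names are chosen just for this example):

```javascript
var value = "forty-two";
var firstType = typeof value;    // "string"

value = 42;                      // same variable, now a number
var secondType = typeof value;   // "number"

value = value + "1";             // number + string concatenates to "421"
var thirdType = typeof value;    // "string" again

console.log(firstType + ", " + secondType + ", " + thirdType);
```

No compiler will warn about any of these lines; the type travels with the value, not with the variable.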

The JS Problem(s)

So why do I say that this is a complicated and error-prone language?
There is no single and easy answer. I start with the beginner's gotchas here, and will add other posts where I continue to moan :-)
I will mix in my best practice proposals, and summarize and review them at the end.


Scope


Scope is the space where a variable or function is visible, callable and usable, mostly marked by curly braces. When you have
{      // scope 1
  int i = 1;
  {    // scope 2
    int k = 2;
  }
}
you expect i to die when scope 1 ends, and k to die when scope 2 ends. Variable i is known in both scopes, k only in scope 2.

This is not so sure in JavaScript.
Try the following example and look at the console output:

function foo() {
  console.log("within foo() we 1st have i = "+(typeof i !== "undefined" ? i : "[undefined!]"));
  console.log("within foo() we 1st have k = "+(typeof k !== "undefined" ? k : "[undefined!]"));
  console.log("within foo() we 1st have m = "+(typeof m !== "undefined" ? m : "[undefined!]"));
  console.log("within foo() we 1st have o = "+(typeof o !== "undefined" ? o : "[undefined!]"));
 
  i = 22; // goes to global scope due to missing "var"
  var k = 33; // function-local scope due to "var"
  var m = 44; // function-local scope, shadowing the global m
  o = 88; // uses the global-scope o due to missing "var"
 
  console.log("within foo() we 2nd have i = "+(typeof i !== "undefined" ? i : "[undefined!]"));
  console.log("within foo() we 2nd have k = "+(typeof k !== "undefined" ? k : "[undefined!]"));
  console.log("within foo() we 2nd have m = "+(typeof m !== "undefined" ? m : "[undefined!]"));
  console.log("within foo() we 2nd have o = "+(typeof o !== "undefined" ? o : "[undefined!]"));
}

var m = -4; // is known in all functions here, at any nesting depth
o = -8; // is known in all functions here, at any nesting depth

console.log("outside foo() we 1st have i = "+(typeof i !== "undefined" ? i : "[undefined!]"));
console.log("outside foo() we 1st have k = "+(typeof k !== "undefined" ? k : "[undefined!]"));
console.log("outside foo() we 1st have m = "+(typeof m !== "undefined" ? m : "[undefined!]"));
console.log("outside foo() we 1st have o = "+(typeof o !== "undefined" ? o : "[undefined!]"));

foo();

console.log("outside foo() we 2nd have i = "+(typeof i !== "undefined" ? i : "[undefined!]"));
console.log("outside foo() we 2nd have k = "+(typeof k !== "undefined" ? k : "[undefined!]"));
console.log("outside foo() we 2nd have m = "+(typeof m !== "undefined" ? m : "[undefined!]"));
console.log("outside foo() we 2nd have o = "+(typeof o !== "undefined" ? o : "[undefined!]"));

I can not even start to present my scope example without having problems: JavaScript does not always provide the console.log() function (meaning the browser vendors do not always).
So if you want to test the source code and your browser has no console.log(), you need to output somehow into the HTML page ...
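
One possible way to do that is a small logging function that falls back to the page when there is no console. This is only a sketch: the function name log is made up, and it assumes that document.body already exists when it is called (that is, the script runs after the body has been parsed):

```javascript
// Hypothetical helper: writes to the console if there is one,
// else appends a paragraph to the HTML page.
function log(message) {
  if (typeof console !== "undefined" && console.log) {
    console.log(message);
    return "console";
  }
  if (typeof document !== "undefined" && document.body) {
    var paragraph = document.createElement("p");
    paragraph.appendChild(document.createTextNode(message));
    document.body.appendChild(paragraph);
    return "page";
  }
  return "none";   // nowhere to write to
}

var target = log("outside foo() we 1st have m = -4");
```

The return value just tells where the message went, which makes the helper easier to try out.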

Here is the output:

outside foo() we 1st have i = [undefined!]
outside foo() we 1st have k = [undefined!]
outside foo() we 1st have m = -4
outside foo() we 1st have o = -8
within foo() we 1st have i = [undefined!]
within foo() we 1st have k = [undefined!]
within foo() we 1st have m = [undefined!]
within foo() we 1st have o = -8
within foo() we 2nd have i = 22
within foo() we 2nd have k = 33
within foo() we 2nd have m = 44
within foo() we 2nd have o = 88
outside foo() we 2nd have i = 22
outside foo() we 2nd have k = [undefined!]
outside foo() we 2nd have m = -4
outside foo() we 2nd have o = 88

As you can see, scoping is only optional in JavaScript, using the var keyword. The default for any variable you define is the global scope!

Now this is the very old problem of global variables. Nobody knows why they have a wrong value at a certain time. Because anybody can use them. Global variables are one of the oldest error sources in software applications.

Best practice: Always use var when defining variables.


Scoping anomalies are not over yet. The interpreter hoists every variable declared within a function (no matter at what block depth) up to the top-level block starting below the function declaration. If you have several different var i = 0; statements in that function, they all refer to the same variable.

function variableHoisting(param)  {
  console.log("BEGIN: number = "+number);
  if (param)  {
    var number = 1;
  }
  console.log("END: number = "+number);
}

variableHoisting("param");

This code yields:

BEGIN: number = undefined
END: number = 1

If the variable "number" had not been declared anywhere, JavaScript would have thrown an error on the first statement of the function. But as the var statement was hoisted to the top of the function block (only the declaration, not the assignment), no error occurred; the value was just undefined there.

This is the reason why some JavaScript gurus recommend putting all variable definitions at the top of the function. (Because this is what the interpreter will do anyway, thus it reflects reality better.)
Which in turn makes refactoring of big functions quite difficult.
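
To make the hoisting visible, here is a hand-written sketch of how the function above is effectively treated. The names variableHoistingAsInterpreted and sharedCounter are invented for this illustration, and the output is collected in an array instead of being printed directly:

```javascript
// Hand-written equivalent of variableHoisting().
function variableHoistingAsInterpreted(param) {
  var number;                // the declaration moved to the top, value is undefined
  var output = [];
  output.push("BEGIN: number = " + number);
  if (param) {
    number = 1;              // only the assignment stays where it was
  }
  output.push("END: number = " + number);
  return output;
}

// Two "var i" statements in different blocks are one and the same variable.
function sharedCounter() {
  for (var i = 0; i < 3; i++) {
    // first loop
  }
  for (var i = 10; i < 12; i++) {
    // second loop re-uses the very same i
  }
  return i;                  // still visible after both loops
}

var output = variableHoistingAsInterpreted("param");
console.log(output.join("\n"));
console.log("i after both loops = " + sharedCounter());
```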


Another thing is the scope within Objects. Imagine objects nested into each other, like the following:

var outerObject = {
  x: 1,
  outerPrint: function() {
    console.log("this.x = "+this.x+", this.innerObject.y = "+this.innerObject.y);
  },
  innerObject: {
    y: 3,
    innerPrint: function() {
      console.log("this.x = "+this.x+", this.y = "+this.y);
      console.log("outerObject.x = "+outerObject.x+", outerObject.innerObject.y = "+outerObject.innerObject.y);
    }
  }
};

outerObject.outerPrint();
outerObject.innerObject.innerPrint();

Mind that functions nested into an Object must use "this" (or the full namespace "outerObject.innerObject") to access sibling properties in the same Object.
Output is:

this.x = 1, this.innerObject.y = 3
this.x = undefined, this.y = 3
outerObject.x = 1, outerObject.innerObject.y = 3

Child properties are visible from the parent scope, but the parent properties are not visible from the child, because inside innerPrint() "this" refers to innerObject, not to outerObject. The exception is addressing them in an "absolute" way via the global namespace.
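
If the child needs its parent without going through the global name, one possible workaround is to hand it a back-reference. This is just a sketch; the property name parent is an assumption of this example, JavaScript does not provide it by itself:

```javascript
var outerObject = {
  x: 1,
  innerObject: {
    y: 3,
    innerPrint: function () {
      // "this" is innerObject here; "parent" is the property set below
      return "this.parent.x = " + this.parent.x + ", this.y = " + this.y;
    }
  }
};

// Hypothetical back-reference from child to parent, set by hand.
outerObject.innerObject.parent = outerObject;

var printed = outerObject.innerObject.innerPrint();
console.log(printed);
```

The price is a circular reference between the two objects, which for example JSON.stringify() refuses to serialize.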


Undefined and null

Most programming languages limit themselves to just a null keyword for undefined values. Not so JavaScript.
Here you have undefined and null. Whatever sense this distinction makes.

  • When a variable with name xyz was never defined by the programmer, but used (e.g. passed as function parameter), JavaScript regards xyz to be undefined, and it throws an error in this case.
  • When you call a function without parameters, and that function declares parameters, the values of these parameters will be undefined. No error is thrown in this case!
  • When you access an array and your index is out of bounds of that array, no error happens, you get back undefined.
  • When you declare a variable var x; but you don't assign it a value, its value will be undefined (not null).
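
The four cases above can be tried out with the following sketch. All names (typesOf, neverDeclaredXyz, numbers, results) are made up for this example, and neverDeclaredXyz must of course not exist anywhere:

```javascript
function typesOf(first, second) {
  return "first is " + typeof first + ", second is " + typeof second;
}

var results = [];

// 1) A variable that was never declared: using it throws an error.
try {
  results.push(String(neverDeclaredXyz));
} catch (error) {
  results.push(error.name);
}

// 2) Declared parameters that were not passed: undefined, no error.
results.push(typesOf());

// 3) Array index out of bounds: undefined, no error.
var numbers = [10, 20];
results.push(String(numbers[5]));

// 4) Declared but never assigned: undefined, and that is not null.
var x;
results.push((x === undefined) + "/" + (x === null));

console.log(results.join("\n"));
```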

Why do we need null then? I do not know.
The only answer I found is that you can assign default values to parameters only when their values are undefined, and thus let null pass as a valid value. But that does not convince me at all, I could also regard null as undefined and assign defaults then.

Care has to be taken when talking about "undefined variables". What is undefined? The variable? Or its value? In JavaScript the two seem to mingle.
For example this fails with a runtime error:

console.log(xyz);  // assume xyz was never declared as variable

while this works without error and yields object.xyz=undefined:

var object = {};
console.log("object.xyz="+object.xyz);

The only way to get around the runtime error on undefined free variables is a typeof expression:

if (typeof xyz !== "undefined")
  console.log(xyz)

For the second case, where you test the member of an object (and no runtime error is thrown), you can omit the typeof expression and replace it by a simple and easily readable condition (mind that it is also true for other "falsy" values like null, 0, false or the empty string):

if ( ! object.xyz )
  object.xyz = 3;

This also works in functions when assigning default values for undefined parameters (no typeof expression is needed here!):

function foobar(foo, bar) {
  if ( ! foo )
    foo = "foo default";
  if ( ! bar )
    bar = "bar default";
  // ...
}

Best practice: Test for undefined using a simple if ( ! x ) x = ...;
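
One thing to keep in mind with this short form: the condition is true for every "falsy" value, not just for undefined. Where 0, false or an empty string are legal arguments, an explicit comparison may be the safer choice. A small sketch, with function names invented for this example:

```javascript
// Short form: replaces every falsy value by the default.
function pageSizeFalsy(size) {
  if ( ! size )
    size = 20;
  return size;
}

// Explicit form: replaces only a missing (undefined) value.
function pageSizeExplicit(size) {
  if (size === undefined)
    size = 20;
  return size;
}

console.log(pageSizeFalsy());      // no argument: default applies
console.log(pageSizeExplicit());   // no argument: default applies
console.log(pageSizeFalsy(0));     // 0 is falsy: default applies, too
console.log(pageSizeExplicit(0));  // 0 is kept
```

For parameters like names or objects, where no falsy value makes sense anyway, the short form does the job.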

Don't care about free undefined variables. Use the typeof expression only in exceptional cases. Readability of source code is most important.
There is a way to get around the problem with free undefined variables: the "use strict"; statement on top of the .js file. It instructs the JavaScript interpreter to NOT tolerate assignments to variables that were never declared, instead of silently creating global variables. Not all browsers support strict mode, but using a strict-supporting browser during development will uncover mistakes right from the start.

"use strict";  // best on top of any .js file
// JavaScript code goes here ...

Best practice: Write "use strict"; on top of any module.

The strict mode additionally checks for a lot of other pitfalls.
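
Strict mode can also be switched on for a single function, which makes it easy to try out. The following sketch assumes an interpreter that supports strict mode; the names assignWithoutVar and forgottenVar are made up:

```javascript
function assignWithoutVar() {
  "use strict";            // applies to this function only
  try {
    forgottenVar = 1;      // "var" is missing: an error in strict mode
    return "no error";     // only reached when strict mode is not supported
  } catch (error) {
    return error.name;
  }
}

var strictResult = assignWithoutVar();
console.log("assignment without var gives: " + strictResult);
```

Without the "use strict"; line the assignment would silently create a global variable called forgottenVar.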

And for the additional complexity with null:

Best practice: Do not use null at all, use undefined instead.

Reason is: "undefined" creates enough hard-to-read code, do not mix in "null" handling additionally. There will be no use cases where you need to distinguish between them. And when you really want to use it, take care to not expose it to other modules.
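
In case null does arrive from other modules, both can be treated alike with one comparison. This is a sketch relying on the fact that the loose == operator regards null and undefined as equal to each other and to nothing else; the function name isMissing is invented:

```javascript
// Hypothetical helper: true for undefined and null, false for everything else.
function isMissing(value) {
  return value == null;    // loose comparison on purpose
}

var checks = [
  isMissing(undefined),
  isMissing(null),
  isMissing(0),
  isMissing(""),
  isMissing(false)
];

console.log(checks.join(", "));
```

This is one of the rare places where == instead of === is used deliberately.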


For clarity, a final example about "undefined".
The following causes an error:

if ( ! z )   // assume z is not a parameter and was not defined anywhere on this HTML page
  console.log("z is false");

Also this causes an error:

z;
if ( ! z )
 console.log("z is false");

But this DOES NOT cause an error:

var z;
if ( ! z )
 console.log("z is false");