Asynchronous functions,
Callback hell &
the pyramid of doom

Asynchronous functions in JavaScript are often referred to as Callback hell, the pyramid of doom, or the christmas tree from hell ...

To give a real world example, I searched for "pyramid of doom" and found the following code:

ms.async.eachSeries(arrWords, function (key, asyncCallback) {
    pg.connect(pgconn.dbserver('galaxy'), function (err, pgClient, pgCB) {
        statement = "SELECT * FROM localization_strings WHERE local_id = 10 AND string_key = '" + key[0] + "'";
        pgClient.query(statement, function (err, result) {
            if (pgconn.handleError(err, pgCB, pgClient)) return;
            // if key doesn't exist go ahead and insert it
            if (result.rows.length == 0) {
                statement = "SELECT nextval('last_resource_bundle_string_id')";
                pgClient.query(statement, function (err, result) {
                    if (pgconn.handleError(err, pgCB, pgClient)) return;
                    var insertIdOffset = parseInt(result.rows[0].nextval);
                    statement = "INSERT INTO localization_strings (resource_bundle_string_id, string_key, string_revision, string_text,modified_date,local_id, bundle_id) VALUES ";
                    statement += "  (" + insertIdOffset + ",'" + key[0] + "'," + 0 + ",'" + englishDictionary[key[0]] + "'," + 0 + ",10,20)";
                    ms.log(statement);
                    pgClient.query(statement, function (err, result) {
                        if (pgconn.handleError(err, pgCB, pgClient)) return;
                        pgCB();
                        asyncCallback();
                    });
                });
            }
            pgCB();
            asyncCallback();
        });
    });
});

Before fixing it, lets try to figure out what it does ...

To whoever wrote that code, you do not have to be embarrassed, because I and many others have written code just like that.

How to overcome the "pyramid of doom"

But it will not be solved with promises, or by indenting with two spaces instead of four spaces (yeh, some people think that is the solution) :P

  1. Avoid using anonymous functions as callbacks: They are nice, but too much of the good stuff can be bad ...
    And there are many advantages of naming things, like making it easier for others (and future self) to understand what the code does. And in this case, it will also solve the "pyramid of doom".
  2. Avoid using global variables: A popular name for use of global variables is "spaghetti code".
    The problem with global variables is that the larger your code-base grows, the more time it will take to add new features: Because you have to "search in files" to find all places where a variable is used. And think hard on how to change it without breaking stuff elsewhere.
    Please do not take this as an insult though, rather the opposite, you need to be very smart to manage such a code-base!
  3. Use subfunctions: Functions within functions, to keep the scope simple.
  4. Throw errors instead of just returning the void. That way you don't spend hours debugging. Instead, the error will be thrown right at you!
    Throwing errors at every place there might be one will (paradoxical perhaps) result in a more stable and bug free program!
  5. Name those constants: Even if you are only going to use a decimal number once, give it a name! Doing this will be less prone to bugs, and it will be easier if you later decide to change that option flag, id or whatever.
  6. Make use of database default values: It should make inserts slightly faster!
  7. Abstract SQL commands: We are coding JavaScript! Why switch context to a whole other language just to speak with a database!?
    For this I use my own database abstraction / persistence module.

If I would have written the code above, it would have looked something like this ...
I probably missed something, but you should get the idea:

function addKeyWords(localizationStrings, keywords, localId, bundleId, asyncCallback) {

	if(!keywords.length) throw new Error("Expected keywords to be an array with one or more keys:" + JSON.stringify(keywords));
	if(localId == undefined) throw new Error("Expected localId to be a number: " + localId);
	if(bundleId == undefined) throw new Error("Expected bundleId to be a number: " + bundleId);

	var keyword;

	for(var i=0; i<keywords.length; i++) {
		keyword =  keywords[i][0];
		if(!localizationStrings.has( {localId: localId, bundleId: bundleId, stringKey: keyword} )) {
			localizationStrings.add( {
				resourceBundleStringId: localizationStrings.insertIdOffset, 
				stringKey: keyword,
				string_text: englishDictionary[keyword],
				localId: localId,
				bundleId: bundleId
			} );
		}
	}
	return true;
}

In my database abstraction / persistence module, the data is pushed to the database asynchronously in the background, while keeping a local copy in memory. This allows me to code sequentially and avoid some of the headache with asynchronous JavaScript.
You might think I totally cheated on this one, but for solving problems, you should try to think outside the box! :P

Flatten those pyramids

Here's another real world asynchronous JavaScript problem that I ran into not long ago. The problem states as follows:

Read three files from the disk, then send their content in an e-mail.

Try to implement it yourself as a training exercise.
Once you know how to deal with such cases, asynchronous JavaScript will become a second nature, and no further abstractions (promises, await, etc) will be needed!

My first implementation looked something like this:

fs.readFile(file1, function(err, data) {
	text = text + data; 
	fs.readFile(file2, function(err, data) {
		text = text + data; 
		fs.readFile(file3, function(err, data) {
			text = text + data; 
			fs.readFile(file4, function(err, data) {
				text = text + data;
				mailFunction(text);
			});
		});
	});
});

Then I re-factored it to this:

function emailFiles(files, mailFunction) {
	var readFiles = 0; // Counter
	var text = "";     // Concatenate each file into this string
		
	for(var i=0, i<files.length; i++) {
		fs.readFile(files[i], readit);
	}

	function readit(err, data) {
		if (err) throw err; // or at least a console.warn
		
		text = text + data; 
		
		readFiles++;
		
		if(readFiles == files.length) {
		mailFunction(text);
	}
}
}

My re-factored solution uses a counter to keep track on how many files that have been read, then flattens the "pyramid" by giving a named function as callback.
The readit function is a sub-function (function within a function), and has access to the variables in the parent emailFiles function because of the shared scope. It's very convenient to divide your code into "documented" parts (sub-functions) with separate/shared scope in JavaScript.

Comment from the future: There is a possible bug in the refactored code! As the files are read in parallel, they might finish reading in the wrong order! If the order is important, I've made this example code:

function emailFiles(files, mailFunction) {

	var readFiles = 0; // Counter
	var text = new Array(files.length); // Text content

	for(var i=0, cb, i<files.length; i++) {
		cb = createCallback(i);
		fs.readFile(files[i], cb);
	}

	function createCallback(n) {
		function readit(err, data) {
			if (err) throw err; // or at least a console.warn

			text[n] = data;

			readFiles++;

			if(readFiles == files.length) {
				allDone();
			}
		}
	}

	function allDone() {
		var combinedText = text.join("\n");
		mailFunction(combinedText);
	}
}

The difference is that the read text is now stored in an array, and a function that returns a function is used to create the callback to fs.readFile.
Functions that returns a function is an advance concept in JavaScript that can help you make the code more simple and faster. The function readit has scoped access to parameter n.

Another example

Another code example found in an article about the "Pyramid of doom", where they compared different "solutions",

var articles = [];
get(compile_url('freep.com'), function(err, freep_response) {
if (err) throw err;
articles = articles.concat(process_response(freep_response));
get(compile_url('detroitnews.com'), function(err, det_response) {
	if (err) throw err;
	articles = articles.concat(process_response(det_response));
	get(compile_url('battlecreekenquirer.com'), function(err, battle_response) {
		if (err) throw err;
		articles = articles.concat(process_response(battle_response));
		get(compile_url('hometownlife.com'), function(err, hometown_response) {
			if (err) throw err;
			articles = articles.concat(process_response(hometown_response));
			console.log(articles);
		});
	});
});
});

First lets try to figure out what the code does:
It's pretty much like my email problem, three web-pages needs to get fetched, then concatenated.
The code isn't that bad: It's easy to understand and it throws errors. So I probably wouldn't touch it. But let's do it anyway:

function getArticles(websites, urlCompiler, pageParser, pageFetcher, callback) {
var articles = [];
var pageCounter = 0;
	
for(var i=0; i<websites.length; i++) {
	getArticlesFrom(websites[i]);
}
	
function getArticlesFrom(webpage) {
	webpage = urlCompiler(webpage);
		
	// Fetch the website
	pageFetcher(webpage, parsePage);

	function parsePage(err, pageData) {
		//if(err) throw err;
		if(err) {
			console.warn("Error when fetching " + webpage + ":\n" + err);
		}
		else {
			pageData = pageParser(pageData);
			articles.concat(pageData);
		}
			
		pageCounter++;
			
		if(pageCounter == websites.length) {
			callback(articles);
		}
			
	}
}
}

Some improvements:

Let's take one of the "solutions" from the article, and compare it to my code:

var sites = ['detroitnews.com', 'freep.com', ...];
var articles = [];
var response_counter = 0;
for (var i = 0; i < sites.length; i++) {
var site = sites[i];
get(compile_url(site), function(index, err, response) {
	if (err) throw err;
	articles = articles.concat(process_response(response));
	response_counter++;
	if (response_counter == sites.length) {
		console.log(articles);
	}
}.bind(this, i));
}

Almost the same, but it uses an anonymous function! So a .bind or closure is needed to fix the scope of i. In these situations I find it easier to just copy/paste some of the code (in the foor loop) into a named (sub) function.


Written by 6:th July 2015.


Follow me via RSS:   RSS https://zäta.com/rss_en.xml (copy to feed-reader)
or Github:   Github https://github.com/Z3TA