In Node.js Design Patterns ("Unleashing Zalgo"), why is the asynchronous path consistent?
In the great book I'm reading now, Node.js Design Patterns, I see the following example:
var fs = require('fs');
var cache = {};

function inconsistentRead(filename, callback) {
  if (cache[filename]) {
    // invoked synchronously
    callback(cache[filename]);
  } else {
    // asynchronous function
    fs.readFile(filename, 'utf8', function(err, data) {
      cache[filename] = data;
      callback(data);
    });
  }
}
then:
function createFileReader(filename) {
  var listeners = [];
  inconsistentRead(filename, function(value) {
    listeners.forEach(function(listener) {
      listener(value);
    });
  });
  return {
    onDataReady: function(listener) {
      listeners.push(listener);
    }
  };
}
and usage of it:
var reader1 = createFileReader('data.txt');
reader1.onDataReady(function(data) {
  console.log('First call data: ' + data);
});
The author says that the behaviour is synchronous if the item is in the cache and asynchronous if it is not. I'm OK with that. He then goes on to say that a function should be either always sync or always async. I'm OK with that too.
What I don't understand is the asynchronous path: when the line var reader1 = createFileReader('data.txt'); is executed, couldn't the asynchronous file read already have finished, so that the listener registered on the following line is added too late and never called?
Solution 1:[1]
JavaScript will never interrupt a function to run a different function.
The "file has been read" handler will be queued until the JavaScript event loop is free.
Solution 2:[2]
The async read operation won't call its callback or start emitting events until after the current tick of the event loop, so the sync code that registers the event listener will run first.
Solution 3:[3]
Yes, I felt the same when I read this part of the book: "inconsistentRead looks good".
The paragraphs below explain the potential bug that this kind of mixed sync/async function can produce when it is used (it may also not show up at all).
In summary, this is what happens in the usage sample:
In event cycle 1:
reader1 is created. Because "data.txt" isn't cached yet, it will respond asynchronously in some later event cycle N.
Some callbacks are subscribed for reader1's readiness. They will be called in cycle N.
In event cycle N: "data.txt" has been read, the data is cached, and the callbacks subscribed to reader1 are called.
In event cycle X (X >= 1, and X may come before or after N): reader2 is created for the same file "data.txt", for example from a timeout or some other asynchronous path.
What happens if:
X === 1: The bug shows up in a way the book doesn't mention. The file is read twice and the result is cached twice: whichever read finishes first populates the cache, and the other one simply stores its result again. But reader2 registers its callbacks before its asynchronous response is ready, so they are called.
X > 1 and X < N: The same happens as for X === 1.
X > N: The bug shows up as explained in the book:
You create reader2 and, because the data is already cached, the callback passed to inconsistentRead is invoked immediately, while no listener has been subscribed yet. Only after that do you subscribe listeners with onDataReady, and they will never be called.
X === N: This is an edge case. If the reader2 code runs first, the outcome is the same as for X === 1. If it runs after the "data.txt" readiness part of inconsistentRead, the outcome is the same as for X > N.
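The cases above can be reproduced without touching the file system. The sketch below assumes a hypothetical fakeReadFile that answers after 20 ms in place of fs.readFile. The helper start, the reads counter, the notified array, and all delays are illustrative names and values, not part of the book's code.

```javascript
const cache = {};
let reads = 0;

// Hypothetical stand-in for fs.readFile: answers after 20 ms.
function fakeReadFile(filename, callback) {
  reads += 1;
  setTimeout(function () {
    callback(null, 'contents of ' + filename);
  }, 20);
}

function inconsistentRead(filename, callback) {
  if (cache[filename]) {
    callback(cache[filename]); // synchronous path
  } else {
    fakeReadFile(filename, function (err, data) {
      cache[filename] = data;
      callback(data); // asynchronous path
    });
  }
}

function createFileReader(filename) {
  const listeners = [];
  inconsistentRead(filename, function (value) {
    listeners.forEach(function (listener) {
      listener(value);
    });
  });
  return {
    onDataReady: function (listener) {
      listeners.push(listener);
    }
  };
}

// Records which readers actually get notified.
const notified = [];
function start(name) {
  createFileReader('data.txt').onDataReady(function () {
    notified.push(name);
  });
}

start('reader1'); // cycle 1, nothing cached yet
setTimeout(function () { start('reader2 (X < N)'); }, 5);  // before the read completes
setTimeout(function () { start('reader3 (X > N)'); }, 60); // after the data is cached

setTimeout(function () {
  console.log('notified:', notified);
  console.log('reads:', reads);
}, 100);
```

With these delays, reader1 and reader2 are notified and the fake file is read twice, while reader3 never appears in notified because its listener is registered only after the cached callback has already run.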
Solution 4:[4]
This example helped me more to understand the concept:
const fs = require('fs');
const cache = {};

function inconsistentRead(filename, callback) {
  if (cache[filename]) {
    console.log("load from cache");
    callback(cache[filename]);
  } else {
    fs.readFile(filename, 'utf8', function (err, data) {
      cache[filename] = data;
      callback(data);
    });
  }
}

function createFileReader(filename) {
  const listeners = [];
  inconsistentRead(filename, function (value) {
    console.log("inconsistentRead CB");
    listeners.forEach(function (listener) {
      listener(value);
    });
  });
  return {
    onDataReady: function (listener) {
      console.log("onDataReady");
      listeners.push(listener);
    }
  };
}

const reader1 = createFileReader('./data.txt');
reader1.onDataReady(function (data) {
  console.log('First call data: ' + data);
});

setTimeout(function () {
  const reader2 = createFileReader('./data.txt');
  reader2.onDataReady(function (data) {
    console.log('Second call data: ' + data);
  });
}, 100);
output:
$ node zalgo.js
onDataReady
inconsistentRead CB
First call data: :-)
load from cache
inconsistentRead CB
onDataReady
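When the call is asynchronous, the onDataReady handler is registered before the file has been read. When the call is synchronous (the data comes from the cache), the iteration over the listeners finishes before onDataReady registers the listener, so "Second call data" is never printed.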
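One common way to avoid the lost second listener, not shown in the answers here, is to defer the cached branch with process.nextTick so that both paths are asynchronous. The following is a minimal sketch under that assumption. The name consistentRead, the received array, and the pre-filled cache entry are hypothetical and exist only to exercise the cached path without a real file.

```javascript
const fs = require('fs');
const cache = {};

function consistentRead(filename, callback) {
  if (cache[filename]) {
    // Defer the callback so the cached path is asynchronous as well.
    process.nextTick(function () {
      callback(cache[filename]);
    });
  } else {
    fs.readFile(filename, 'utf8', function (err, data) {
      cache[filename] = data;
      callback(data);
    });
  }
}

function createFileReader(filename) {
  const listeners = [];
  consistentRead(filename, function (value) {
    listeners.forEach(function (listener) {
      listener(value);
    });
  });
  return {
    onDataReady: function (listener) {
      listeners.push(listener);
    }
  };
}

// Assumption for the demo: pretend an earlier read already cached the file.
cache['./data.txt'] = ':-)';

const received = [];
const reader2 = createFileReader('./data.txt');
reader2.onDataReady(function (data) {
  received.push(data);
  console.log('Second call data: ' + data);
});
```

Because the deferred callback runs only after the current synchronous code has finished, the listener is already registered when the cached value is delivered, and the script prints "Second call data: :-)".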
Solution 5:[5]
I think the problem can also be illustrated with a simpler example:
let gvar = 0;
let add = (x, y, callback) => { callback(x + y + gvar) }
add(3,3, console.log); gvar = 3
In this case, callback is invoked immediately inside add, so the change of gvar afterwards has no effect: console.log(3+3+0)
On the other hand, if we add asynchronously:
let add2 = (x, y, callback) => { setImmediate(()=>{callback(x + y + gvar)})}
add2(3, 3, console.log); gvar = 300
Because of the order of execution, gvar = 300 runs before the callback scheduled with setImmediate, so the result becomes console.log(3 + 3 + 300).
In Haskell, the type system distinguishes pure functions from monadic actions, which is loosely similar to the split between code that runs immediately and "async" code that gets executed "later". In JavaScript this distinction is not declared explicitly, so such delayed code can be difficult to debug.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Quentin |
| Solution 2 | JMM |
| Solution 3 | pelicanorojo |
| Solution 4 | |
| Solution 5 | Zhe Hu |
