According to Wikipedia, "In computer science, a stream is a sequence of data elements made available over time. A stream can be thought of as items on a conveyor belt being processed one at a time rather than in large batches."
In this blog, I will be implementing our very own Stream in javascript.
Before understanding Stream, let's take a detour and understand what lists are? Specifically, Linked list.
Lists are finite structures which have a "head" (the first value), and a "tail" (rest of the values as another list). It's a very primitive and well known recursive datastructure.
In javascript you can describe a list as follows-
Wait... Were you expecting something else? Node with pointers - next, prev, value, etc. No!, we don't need those. In javascript (or any other programming language with closures and lambdas), the above description is enough for defining lists.
cons(hd, tl)
is a function/ closure that is waiting for another function f
to apply arbitrary logic on hd
and tl
. It's wierd for some people to conceive lists as function.
But, It's a well known fact that - "data is function" & "function is data".
You can understand this by a very simple example - You can always replace a (pure) function with a table of input and
it's corresponding output. Ofcourse this will be very inefficient, but it clearly shows this fact.
cons
help us create pairs of data. e.g-
Then, we can use car
and cdr
to access the data elements. car
access the first element, and cdr
access the second.
Once you have an idea of storing pair, you can compose these pairs to form arbitrary data structures. For lists, you represent it by repeated post composition. e.g- to represent list [1, 2, 3, 4], you can write-
undefined
is used as a special value to represent: end of list
And then you can define other useful HOFs like map
, filter
, foldl
etc. for manipulating lists.
We can use lists to represent streams, but the problem with lists is that it's eager. We want streams to be lazy - we want to process
elements one at a time. You can implement this logic using cons
, where the second element is a suspended/ delegated operation
that upon request produces the next element in the stream.
We suspend the second element of the pair by wrapping it in a thunk/ lambda with no arguments. car
inspects
the first element in the stream, and cdr
executes the delegated procedure to produce next element in the stream.
We can also define useful HOFs for manipulating streams-
Converting back and forth from javascript's arrays to streams is a good idea-
One thing to note in the above implementation of stream is that cdr
executes the suspended procedure every single time it's
called. It might be a good idea to cache/ memoize the result of first execution, so that the next time it's called we dont have
to execute the procedure again-
So cdr
changes a little.
Now we are pretty much done, we can now define streams in javascript using the above constructs-
Writing car
& cdr
is rather unwieldy, let's define a function at
that takes an index and return value stored at
that index-
Asuming I have a way of creating a file descriptor and read some bytes from it, I can create a stream of those bytes
using cons
Streams are a very powerful way of representing finite/ infinite lists. And defining in javascript is trivial. Hope you found some insights on how streams are implemented in javascript.
There is one problem with the above implementation though. Functions like, at
, foldl
, foldr
, filter
,
drop
are recursive, so it is bounded by the max stack size of the environment and hence are susceptible to
Stack-Overflow errors.
Some programming languages like Scala, Haskell do tail call optimization to avoid Stack Overflow errors in recursive programs. Unfortunately javascript does not have this built-in.
In order to fix this, we use a pattern called: Trampoline. Planning to write on it next.
And with that I will wrap this up. Thanks for reading. Keep Learning :)