Working with FlatBuffers
In the previous article we discussed the basics of the flatbuffers algorithm and how it de/serializes. Now it’s time to get practical. Let’s start constructing our own flatbuffer schema, build the data, and work with the data from a remote source.
Let’s build Netflix! All of the code I’ll be building here is found in my flatbuffers-benchmarks repo. I have only included a small portion of the code from that repo for readability’s sake. For more examples of schemas and usage, check out the test directory of Google FlatBuffers.
The Netflix domain model is a (L)ist (O)f (L)ist (O)f (M)ovie (O)bjects, or Lolomo. I know, I know, Netflix is not comprised of only movies, but this acronym, Lolomo, has been around for quite some time, long enough that this was once true. So let’s build the Lolomo structure. A Lolomo will contain an id, which is a string, and a list of rows. Rows will contain an id and a list of videos. Videos contain a set of fields such as title, rating, maturity, runningTime, yearCreated, etc. Video is where most of the code was abbreviated.
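To make the shape concrete, here is a minimal sketch of the same nesting as a plain JavaScript object. This is not the flatbuffers schema itself, only the structure it describes. Every value below is made up for illustration, and the types chosen for rating, maturity, runningTime, and yearCreated are assumptions, as is the helper countVideos.

```javascript
// Plain-object sketch of the Lolomo shape. All values are illustrative only;
// the field types (numbers vs. strings) are assumptions, not taken from the schema.
const lolomo = {
  id: 'lolomo-1',
  rows: [
    {
      id: 'row-1',
      videos: [
        { title: 'First Video', rating: 4, maturity: 'PG', runningTime: 95, yearCreated: 1999 },
        { title: 'Second Video', rating: 3, maturity: 'R', runningTime: 120, yearCreated: 2005 },
      ],
    },
    {
      id: 'row-2',
      videos: [
        { title: 'Third Video', rating: 5, maturity: 'G', runningTime: 80, yearCreated: 2012 },
      ],
    },
  ],
};

// Hypothetical helper: walks the list of lists and counts every video.
function countVideos(l) {
  return l.rows.reduce((total, row) => total + row.videos.length, 0);
}
```

The list-of-lists nesting is the important part: a Lolomo owns rows, and each row owns videos.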
Time to build
Now that we have our schema, it’s time to generate our bindings. First we need to build the flatbuffers project to get flatc, the flatbuffers compiler. This is pretty simple on OS X. In short, building should come down to executing two commands: cmake -G "Unix Makefiles" && make. There may be some environment variables to export or Xcode tools to download, but overall it is really straightforward. Once you have the executable, run it against the schema to generate the JavaScript bindings.
Now we should have a lolomo_generated.js file which contains bindings for the lolomo schema.
Building FBS Structures in JS
I am not the biggest fan of how flatbuffers are constructed. They must be constructed from the bottom up, which can be difficult for novice programmers. To construct a flatbuffer object, the following order must be followed:
- Construct all non-scalar values.
- Construct the object.
So to construct the Row object we must do the following:
- Initialize the builder.
- Create each video in the videos list, keeping track of the offset that comes from each.
- Construct the videos vector from those offsets.
- Start construction (startRow()) of the Row object and feed it the offsets created from the previous steps.
- Complete (endRow()) the Row object.
If you are like me then it’s best to see a code example. First let’s create the Video constructor function since it’s our easiest object to create. Video construction only requires a builder object (explained later) and the video data.
The first thing you may notice is that we return what comes out of the Video.endVideo function. This is the offset to the video. We need that offset to be stored in the list we will create when creating the Row object. If you remember the array example from my previous post, you can think of the value coming out of endVideo as the location in the Uint8Array where the video exists.
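To make the idea of an offset concrete, here is a toy sketch. It is not the FlatBuffers wire format and not the flatbuffers API: ToyBuilder, createOffsetVector, and readString are hypothetical names invented for this illustration. It only shows why children must be written first: the parent stores nothing but the locations its children were written to.

```javascript
// Toy illustration only: NOT the FlatBuffers encoding or API.
class ToyBuilder {
  constructor(size) {
    this.bytes = new Uint8Array(size);
    this.position = 0; // index of the next free byte
  }

  // Writes a length-prefixed ASCII string; returns the offset where it starts.
  createString(text) {
    const offset = this.position;
    this.bytes[this.position++] = text.length;
    for (let i = 0; i < text.length; i++) {
      this.bytes[this.position++] = text.charCodeAt(i);
    }
    return offset;
  }

  // Writes a count followed by previously returned offsets (one byte each,
  // so this toy only works for buffers smaller than 256 bytes).
  createOffsetVector(offsets) {
    const offset = this.position;
    this.bytes[this.position++] = offsets.length;
    for (const child of offsets) {
      this.bytes[this.position++] = child;
    }
    return offset;
  }
}

// Reads a length-prefixed ASCII string back out of the byte array.
function readString(bytes, offset) {
  const length = bytes[offset];
  let text = '';
  for (let i = 0; i < length; i++) {
    text += String.fromCharCode(bytes[offset + 1 + i]);
  }
  return text;
}

const builder = new ToyBuilder(64);
// Children first: each call returns the location of the written title.
const titleOffsets = ['Jaws', 'Alien'].map((title) => builder.createString(title));
// Parent second: the vector only stores the children's offsets.
const vectorOffset = builder.createOffsetVector(titleOffsets);
// Reading follows the stored offset instead of copying or parsing anything else.
const secondTitleOffset = builder.bytes[vectorOffset + 2];
const secondTitle = readString(builder.bytes, secondTitleOffset);
```

The vector could not have been written before the titles, because the offsets it stores did not exist yet. That is the whole reason for bottom-up construction.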
The Row constructor is a bit more complicated as it has a list of videos.
Alright! Our construction from the bottom up is almost complete. The last thing we need to do is construct the Lolomo. This will consist of a vector of Row objects and the id.
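Putting the three constructors together, a bottom-up build might look roughly like the sketch below. It is not the code from the repo. The names createVideo, createRow, createLolomo, and the fbs parameter are hypothetical; fbs stands in for whatever namespace lolomo_generated.js exports, and builder for a flatbuffers Builder. The generated method names (addTitle, addYearCreated, createVideosVector, createRowsVector, and so on) assume the schema fields are called id, rows, videos, title, and yearCreated, that both ids are strings, and that yearCreated is a scalar.

```javascript
// Sketch only. `builder` and `fbs` are passed in so the functions do not
// depend on a particular module layout; field names and types are assumptions.

function createVideo(builder, fbs, video) {
  // Non-scalar values (here a string) must be created before startVideo().
  const titleOffset = builder.createString(video.title);

  fbs.Video.startVideo(builder);
  fbs.Video.addTitle(builder, titleOffset);
  fbs.Video.addYearCreated(builder, video.yearCreated); // scalar, added directly
  return fbs.Video.endVideo(builder); // offset of the finished Video
}

function createRow(builder, fbs, row) {
  // Build everything the Row points to first, keeping the offsets.
  const idOffset = builder.createString(row.id);
  const videoOffsets = row.videos.map((video) => createVideo(builder, fbs, video));
  const videosVector = fbs.Row.createVideosVector(builder, videoOffsets);

  // Only now start the Row itself and hand it the offsets.
  fbs.Row.startRow(builder);
  fbs.Row.addId(builder, idOffset);
  fbs.Row.addVideos(builder, videosVector);
  return fbs.Row.endRow(builder);
}

function createLolomo(builder, fbs, lolomo) {
  const idOffset = builder.createString(lolomo.id);
  const rowOffsets = lolomo.rows.map((row) => createRow(builder, fbs, row));
  const rowsVector = fbs.Lolomo.createRowsVector(builder, rowOffsets);

  fbs.Lolomo.startLolomo(builder);
  fbs.Lolomo.addId(builder, idOffset);
  fbs.Lolomo.addRows(builder, rowsVector);
  const lolomoOffset = fbs.Lolomo.endLolomo(builder);

  // Mark the Lolomo as the root and return the bytes to send over the wire.
  builder.finish(lolomoOffset);
  return builder.asUint8Array();
}
```

Each function follows the same two-step order: create the strings, child objects, and vectors first, then start the object and add the collected offsets.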
We are done. This is the hardest part of working with flatbuffers: the construction code. All items must be constructed from the bottom up, which can be tricky the first time. This is why I am concerned about the usability of flatbuffers, but the concern is pretty minimal. Construction code needs to be written once, and the everyday developer will not need to touch it.
Receiving Data and working with FBS in Node
Now that we can construct our Lolomo, let’s send it over the wire. I’ll assume that getData() is a function that takes a callback, and the callback will be called with a Uint8Array. From there, it is simple to construct the flatbuffers bindings.
At this point, working with the lolomo object is simple: the schema is the API. Let’s console.log the first video’s title in the last row. Now remember, we have simply pointed to a block of memory and called it a Lolomo. We have not done any deserialization, except for the indices required to get to the first video, and the length of the rows list.
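The read path can be sketched as follows. The function name firstTitleOfLastRow and the fbs parameter are hypothetical; flatbuffers stands in for the flatbuffers runtime and fbs for the namespace exported by lolomo_generated.js. The accessor names (getRootAsLolomo, rows, rowsLength, videos, title) assume the schema fields are named rows, videos, and title.

```javascript
// Sketch only. The runtime and the generated bindings are passed in as
// parameters; accessor names assume schema fields called rows, videos, title.
function firstTitleOfLastRow(flatbuffers, fbs, bytes) {
  // Wrap the received bytes; nothing is copied or parsed here.
  const byteBuffer = new flatbuffers.ByteBuffer(bytes);

  // Point at the root object inside the buffer.
  const lolomo = fbs.Lolomo.getRootAsLolomo(byteBuffer);

  // rowsLength() reads the vector length; rows(i) resolves a single row.
  const lastRow = lolomo.rows(lolomo.rowsLength() - 1);

  // Only the first video and its title are resolved.
  return lastRow.videos(0).title();
}
```

With the getData() assumed earlier, the call would sit inside the callback, for example getData((bytes) => console.log(firstTitleOfLastRow(flatbuffers, fbs, bytes))). Only the values on the path to that one title are read from the buffer.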
What is next?
The next article will be directed towards performance, more specifically the serialization performance of these two. Hopefully, after that, we will break down the cost of memory.