In Part 1, I covered why I decided to even build this thing in the first place. When we left off, I had discovered FFmpeg and decided to build a multimedia player to apply all my newly acquired knowledge.
I was going to base the design of my multimedia player (from here on referred to as v0.2.0) on ffplay. The main difference would be that v0.2.0 would have a GUI and would run on the web. This led me to another set of decisions. Which libs would I use to build the GUI? And, even more important, what language should I build this with?
Seeing as this is an app for the web, the obvious choice of language would be JS. That would also be the wrong answer. While this is an app for the web, it needs to have desktop-level speeds. FFmpeg was written in C and I wasn't enthused about having to port it to JS. Also, the JS ports I had seen, while impressive from a purely technical perspective, were not where I needed them to be performance-wise.
I had heard about WASM and, seeing as it provided near-native speeds, writing v0.2.0 in C/C++ and compiling to WASM was the obvious option. Having settled on C/C++, I then started exploring the open source ecosystem to find what I would use to build the GUI. At this point, I had accepted that any lib I chose had to be backed by first-hand knowledge of the concepts & constructs that the lib would help me build. It became evident fairly quickly that the best (note that I didn't say only) material for learning GUI concepts would be found within video game/rendering material. I settled on learnopengl.com as it had the least fluff in terms of what I needed to learn. Again, this was a great decision because at this point I didn't even know that windowing libs could be separate from rendering libs.
I worked through the entire book even though, in hindsight, I should have stopped after the first few chapters (hard to resist a good nerd-snipe). At the end, I had a fairly good idea of what I would need for v0.2.0. I then tested out a bunch of libs and wasn't really liking what I saw given what I needed. Then I stumbled on Dear ImGui.
Just like FFmpeg, I can't say enough good things about this project and, to my surprise, it's primarily maintained by one guy. Once I saw that it supported different windowing/renderer combinations, I stopped my search. This was exactly what I needed because I didn't know ahead of time which combo would be the best and I wasn't keen on writing a lot of connector code. I now had everything I needed to build v0.2.0 but I had another insight. Rather than build for the web directly, why not build a desktop version first and then port it to the web? At this point, I had gained a huge respect for the intricacies of video internals and I was not willing to fight a complexity battle on that front and the web front at the same time. The obvious choice was to divide and conquer and build the desktop version first.
Recall that v0.2.0 was going to be based on ffplay. I had read through the ffplay code and when I saw that it still used SDL2, I decided that v0.2.0 would be based on SDL3. By changing versions, I (correctly) assumed that I would have to actually understand what was going on in the code. Sometimes it pays off to take the harder route if it leads to a better understanding of your system. SDL is another lib that I'm a huge fan of. The API design is amazing and the examples are very, very helpful in learning the concepts.
So at this point, my tech stack was SDL3, Dear ImGui and FFmpeg. I still didn't know a lot about video internals but at least I could explain the difference between encoders/decoders & muxing/demuxing. Within a few months, I had a working desktop prototype of v0.2.0. By this point, roughly a year had gone by since I started working on this.
The next step was putting it on the web and this was when things got really interesting. I say they got interesting because the performance levels I was trying to achieve were still thought to be impossible, but I just had this gut feeling that that wasn't the case. Computers are extremely malleable and the only true restrictions on them are physics and those who design the systems. I didn't know how yet but I just knew there was a way to do this, and if it turned out that there wasn't, then I was going to build it.
Click for Part 3...

Brought to you by mulVid.