Protocols, formats, and CODECs, oh my!

>we need to do something about this different codec bullshit.
>why is this such a hassle?
>is there a solution?

I think a set of conventions can minimize the hassle. You could look at different codecs as being like different languages. A lot of hassle would be eliminated if everyone in the world spoke the same language (e.g. Esperanto), but that is unlikely to happen and, of course, there would be a loss of culture, creativity, and diversity.

In some countries, all street signs are in a single language; in others they may be in 2 or 3 languages. When you use an ATM in California, you usually push a button to choose English or Spanish before proceeding (although you'd think your language preference could be encoded on your card's magnetic stripe).

In the "vblogosphere" we should develop a set of guidelines for how to handle a "babelized" world. There will be standard formats/codecs that are widely supported and some vblogs may just choose one, but a set of conventions for indicating what is available will help to make things easier for everyone.

Actually, more things than just CODECs can vary. Here is a (partial) list:

Delivery protocol: How the data is delivered from the server to the player. Examples are HTTP, RTSP, and MMS.

File format: Also known as "container format". The file format contains metadata such as length, author and copyright info, how many "tracks" are in the file, what codec each track is encoded with, etc. Examples are .mov, .wmv, .avi, and .mp4.

CODEC: Short for COmpressor/DECompressor. There are a wide variety of codecs for both video and audio. The MPEG-4 standard actually defines multiple video and audio codecs that can be used within the .mp4 file format.
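
To make the three axes above concrete, here is a minimal Python sketch of how one clip offered in several variants might be described, with a helper that picks the first variant a given player can handle. Everything in it is an assumption for illustration: the MediaVariant class, the pick function, the example.com URLs, and the codec labels are hypothetical, not an existing convention or standard.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class MediaVariant:
    """One downloadable or streamable version of a clip (hypothetical model)."""
    url: str          # where the player fetches it; the scheme is the protocol
    container: str    # file format, e.g. "mov", "mp4", "wmv"
    video_codec: str  # illustrative label only
    audio_codec: str  # illustrative label only

    @property
    def protocol(self) -> str:
        # "http://..." -> "http", "rtsp://..." -> "rtsp", "mms://..." -> "mms"
        return urlparse(self.url).scheme


# The same clip, published three ways (made-up URLs).
variants = [
    MediaVariant("http://example.com/clip.mov", "mov", "mpeg4", "aac"),
    MediaVariant("rtsp://example.com/clip.mp4", "mp4", "mpeg4", "aac"),
    MediaVariant("mms://example.com/clip.wmv", "wmv", "wmv9", "wma"),
]


def pick(variants, protocols, containers):
    """Return the first variant whose protocol and container are both supported."""
    for v in variants:
        if v.protocol in protocols and v.container in containers:
            return v
    return None  # nothing this player can handle


if __name__ == "__main__":
    # A player that only speaks HTTP but understands .mov and .mp4 files.
    choice = pick(variants, protocols={"http"}, containers={"mov", "mp4"})
    print(choice.url if choice else "no playable variant")
```

The point is only that protocol, container, and codec are separate fields: two variants can share a codec while differing in container and protocol, which is why "codec" alone does not tell a player whether it can play something.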

There are two main kinds of video compression: intraframe and interframe. Intraframe compression, also known as "spatial" compression, works much like the compression used for a JPEG image. It finds "spatial" redundancy within a single image and eliminates it. Interframe compression, also known as "temporal" compression, finds redundancy over time and eliminates it. If you have a talking head, a good interframe compressor will figure out that only the lips are moving and only send the lips part of the image in successive frames. If there is a baseball moving across the screen, a technique called "motion compensation" will encode a command that says "move this part of the previous image a little to the left" rather than resend the entire image, etc. CODEC standards and implementations get better over time as new techniques are developed and as processors get faster so that more sophisticated algorithms can be used.
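
The talking-head idea can be shown with a toy Python example. This is a deliberately simplified sketch, not how any real codec works: frames here are short lists of pixel values, and the hypothetical encode_delta and decode_delta functions just record which pixels changed. Real interframe codecs work on blocks of pixels, track motion, and are usually lossy.

```python
def encode_delta(prev, curr):
    """Temporal compression, toy version: keep only the pixels that changed.

    Returns a list of (index, new_value) pairs.
    """
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]


def decode_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(prev)
    for i, value in delta:
        frame[i] = value
    return frame


# Two 8-pixel "frames"; only one pixel (the "lips") differs between them.
frame1 = [10, 10, 10, 50, 50, 10, 10, 10]
frame2 = [10, 10, 10, 50, 60, 10, 10, 10]

delta = encode_delta(frame1, frame2)

if __name__ == "__main__":
    print(delta)                                  # the single changed pixel
    print(decode_delta(frame1, delta) == frame2)  # the frame is recovered
```

Sending one changed pixel instead of eight is the whole trick. The catch is that the player needs the previous frame to decode the next one, which is one reason seeking around in interframe-compressed video is harder than in a sequence of independent images.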

Server: Web servers can serve any file format/codec using good ol' HTTP. The latest Real Server can stream Real, QuickTime, and Windows Media formats. The QuickTime/Darwin Streaming Server can stream .mov, .mp4, and a few other formats.

Player: Most players come in both a browser plugin version and a standalone version. Most can play multiple (but not all) formats and protocols. Generally a web browser is configured to use a single player for each MIME type. Apple and Real complained that Microsoft was "stealing" MIME types and disabling their players, so they joined the "Ask, tell, help" initiative to promote good "Internet manners".
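
As a rough illustration of the one-player-per-MIME-type arrangement described above, here is a small Python sketch. The two tables and the player_for function are assumptions made up for this example. A real browser normally gets the MIME type from the server's Content-Type header rather than from the file extension, and the player assignments shown are just one possible configuration.

```python
import os

# Extension -> MIME type (a small hand-written table for the example).
MIME_BY_EXTENSION = {
    ".mov": "video/quicktime",
    ".mp4": "video/mp4",
    ".wmv": "video/x-ms-wmv",
    ".avi": "video/x-msvideo",
}

# MIME type -> the single player registered for it (hypothetical setup).
HANDLERS = {
    "video/quicktime": "QuickTime",
    "video/mp4": "QuickTime",
    "video/x-ms-wmv": "Windows Media Player",
}


def player_for(filename, handlers):
    """Return the player registered for this file's MIME type, or None."""
    ext = os.path.splitext(filename)[1].lower()
    mime = MIME_BY_EXTENSION.get(ext)
    return handlers.get(mime)


if __name__ == "__main__":
    print(player_for("clip.mov", HANDLERS))
    # Another player "taking over" a MIME type is just one table entry changing.
    taken_over = dict(HANDLERS, **{"video/mp4": "RealPlayer"})
    print(player_for("clip.mp4", taken_over))
```

Because each MIME type maps to exactly one player, installing a new player can silently change what opens a given file, which is the behavior the complaint was about.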

If you're still reading this message, you may be feeling overwhelmed. With Internet video there is a lot of complexity "under the hood". You can avoid a lot of this complexity by choosing a vendor-specific solution like QuickTime (my personal favorite), just as most Americans can get by just fine speaking only English. However, if you live in Europe or travel worldwide, being able to "use" other languages can be a big help.

I'm interested in helping develop guidelines to make life in a "babelized" world as easy as possible. As some of you know, the vBlog Central service supports multiple protocols, formats, codecs, servers, and players. As open standards for videoblogging develop, I will also be working on making sure that vBlog Central supports those standards.

Perhaps some of this discussion is too technical for this list and could be moved elsewhere, but I've been using the term "codec" in a vague way, as many people do, to include issues of format and player, and I wanted to provide at least an overview of the issues.

— Sean

M. Sean Gilligan : 831-466-9788 x11
Catalla Systems, Inc.