IP Video – Don’t believe the spec!

Since the acceptance of IP video, terms like MPEG4, H.264, MJPEG, QCIF, CIF, 4CIF, D1, DVD quality, HD and Mega-pixel have been thrown around as statements of image quality.

Unfortunately these terms have absolutely no relation to actual image quality. In fact, in most IP product specifications there is no true measure of the final image quality. We used to measure the quality of a camera in lines of resolution. In current terms, 540 lines in colour is a reasonable resolution out the back of the camera. But when we record this to a DVR or IP solution, we don’t mention the number of lines anymore. Instead we refer to the resolution as CIF, 2CIF, 4CIF, D1 or some other High-Definition or mega-pixel resolution.

Let’s look at what the different terms actually mean.

    • 4CIF is a count of the pixels in a video signal. In PAL this is nominally 704(h) x 576(v) pixels, although figures between 702 and 752 horizontal pixels are quoted depending on how it is measured.
    • 2CIF is 704 x 288 pixels. Basically we drop every second line of vertical resolution and then, on display, repeat each remaining line.
    • CIF is 352 x 288 pixels. Often it is stated as 320 x 240 but this is an NTSC spec.
    • D1 is effectively the same as 4CIF.
    • Mega-pixel is a measure of pixels beyond 1,000,000 but as there are no standards to this measure, it is not a known quantity. For example, 4CIF is a 4:3 aspect image with 704 x 576 pixels. This is a 405,504 pixel camera.
      A 1 Mega-pixel camera could be in 4:3 or 16:9 or some other non-standard aspect, such as 1736 x 576, which would be a very wide screen image.
    • HD is the free-to-air high definition TV standard we are all moving to in 2012. It is referred to as 720p for High Def or 1080p for True High Def.
      720p is 1280 x 720 = 921,600 pixels
      1080p is 1920 x 1080 = 2,073,600 pixels
    • MPEG4, H.264, JPEG2000 and MJPEG are methods of video compression. Currently H.264 is one of the most efficient compressions for CCTV in the industry and looks like being the standard for a long time.
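
The pixel counts behind these terms are simple multiplication. The short Python sketch below works them out; the PAL figures used (352 x 288, 704 x 288, 704 x 576) and the 1920 x 1080 figure for 1080p are the commonly quoted nominal sizes and are assumptions of this sketch, as are the function names.

```python
# Nominal frame sizes (width, height) in pixels.
# Assumption: commonly quoted nominal values; real products may differ slightly.
RESOLUTIONS = {
    "CIF (PAL)": (352, 288),
    "2CIF (PAL)": (704, 288),
    "4CIF (PAL)": (704, 576),
    "720p": (1280, 720),
    "1080p": (1920, 1080),
}


def pixel_count(width: int, height: int) -> int:
    """Total number of pixels in one frame."""
    return width * height


def megapixels(width: int, height: int) -> float:
    """Pixel count expressed in millions of pixels."""
    return pixel_count(width, height) / 1_000_000


if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        print(f"{name:11s} {w} x {h} = {pixel_count(w, h):>9,} pixels "
              f"({megapixels(w, h):.2f} MP)")
```

Note that the result is only a count of pixels. Nothing in the calculation says how good each pixel is.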

These statements alone mean nothing other than that we are capturing video and encoding it. There is no reference to image quality in any of them.

To better explain this, let’s first look at resolution in terms of pixel count. A CIF measure states how many pixels we have to work with. The most obvious benefit of more pixels is the ability to electronically zoom in on a part of the image post event.
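
As a rough illustration of why pixel count matters for electronic zoom, the sketch below estimates how many horizontal pixels land on a target such as a number plate. The function name, the 5 m scene width and the 0.5 m plate width are hypothetical values chosen for the example, and the calculation assumes the target is square-on to the camera with no lens distortion.

```python
def pixels_across_target(frame_width_px: int, scene_width_m: float,
                         target_width_m: float) -> float:
    """Horizontal pixels covering a target of the given width.

    Assumes pixels are spread evenly across the scene width.
    """
    return frame_width_px * target_width_m / scene_width_m


if __name__ == "__main__":
    scene_width_m = 5.0   # hypothetical width of the scene at the target
    plate_width_m = 0.5   # hypothetical number plate width
    # Nominal frame widths, assumed for illustration.
    for name, width_px in [("CIF", 352), ("4CIF", 704), ("720p", 1280)]:
        px = pixels_across_target(width_px, scene_width_m, plate_width_m)
        print(f"{name:5s} {px:6.1f} pixels across the plate")
```

Zooming in electronically cannot add pixels that were never captured, so the more pixels across the plate at capture time, the more detail there is to enlarge.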

The following image shows the same number plate electronically zoomed in CIF, 4CIF (Standard Definition TV) and HD (High Definition TV).


In this example it can be clearly seen that the detail in the image is clearer with the higher pixel counts. This means a mega-pixel camera is better than a 4CIF camera. Right?

Not quite. If the compression applied is not an efficient one, the image quality can still be poor.

Now you are possibly thinking that H.264 is the latest and therefore the best quality. Again, the compression does not directly state the quality either.

The compression and the resolution combined do affect image quality but cannot be used as a statement of quality. To explain this, compare the following two images.




Both of these images are H.264 at 4CIF. The one on the left has a poor implementation of H.264 while the one on the right has an efficient implementation.

This can be related to a DVD you purchase from a shop compared with one you download from the Internet. The quality is often very different and yet they are both D1 resolution and MPEG2 compression.

Technically, how the compression is implemented is quite involved, but in layman’s terms it can be described as follows.

In a JPEG based compression such as MJPEG, we simply compress the entire image and send it as a JPEG file. How hard we compress it determines the image quality and file size. The bigger the file size, the better the quality. This is a very inefficient compression, but it can give excellent quality if very high bandwidth and large storage arrays are available.
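
Because every MJPEG frame is a complete JPEG, the bandwidth is just the average file size multiplied by the frame rate. The sketch below shows that arithmetic. The function names are made up for illustration, and the 25 frames per second rate and the 1.5 Mbps stream are assumed example figures.

```python
def mjpeg_bitrate_bps(avg_frame_bytes: float, fps: float) -> float:
    """Bit rate of a stream made of independent JPEG frames."""
    return avg_frame_bytes * 8 * fps


def frame_budget_bytes(bitrate_bps: float, fps: float) -> float:
    """Average bytes available per frame at a given bit rate."""
    return bitrate_bps / 8 / fps


if __name__ == "__main__":
    fps = 25  # assumed PAL frame rate
    budget = frame_budget_bytes(1_500_000, fps)
    print(f"1.5 Mbps at {fps} fps leaves {budget:.0f} bytes per JPEG frame")
```

Compressing each frame harder shrinks the per-frame budget and the image quality together, which is why MJPEG trades quality directly against bandwidth.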

In a motion based compression such as MPEG4 or H.264 (also known as MPEG4 part 10), we analyse the video for changes in detail from one image to the next. The video is broken down into I-frames, which are similar to a full JPEG image, and P-frames, which are the detected differences in each image after the I-frame. Often an I-frame is sent approximately every 100 frames, or every 4 seconds at 25 frames per second. Between I-frames we are detecting and sending only the changes in the scene as P-frames.
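
The I-frame and P-frame idea can be sketched with a toy encoder. This is not how MPEG4 or H.264 actually work internally (real codecs use block-based motion estimation and transform coding); it only illustrates sending a full picture periodically and the differences in between. The frame model (a flat list of pixel values) and all names are assumptions made for the example.

```python
from typing import Dict, List, Tuple, Union

Frame = List[int]  # toy model: a frame is a flat list of pixel values
Packet = Tuple[str, Union[Frame, Dict[int, int]]]


def encode(frames: List[Frame], gop: int, threshold: int = 0) -> List[Packet]:
    """Emit ('I', full_frame) every `gop` frames, else ('P', changed_pixels).

    A P-frame holds only the pixels that differ from the previously
    reconstructed frame by more than `threshold`.
    """
    encoded: List[Packet] = []
    reference: Frame = []
    for n, frame in enumerate(frames):
        if n % gop == 0:
            encoded.append(("I", list(frame)))
            reference = list(frame)
        else:
            changes = {i: v for i, (v, r) in enumerate(zip(frame, reference))
                       if abs(v - r) > threshold}
            encoded.append(("P", changes))
            # Track what the decoder will rebuild, not the source frame.
            for i, v in changes.items():
                reference[i] = v
    return encoded


def decode(encoded: List[Packet]) -> List[Frame]:
    """Rebuild frames by applying each P-frame's changes to the last frame."""
    decoded: List[Frame] = []
    current: Frame = []
    for kind, payload in encoded:
        if kind == "I":
            current = list(payload)
        else:
            current = list(current)
            for i, v in payload.items():
                current[i] = v
        decoded.append(current)
    return decoded


if __name__ == "__main__":
    frames = [[10, 10, 10, 10], [10, 10, 50, 10], [10, 10, 50, 60], [1, 2, 3, 4]]
    for kind, payload in encode(frames, gop=3):
        print(kind, payload)
```

In a static scene the P-frames in this toy are almost empty, which is where the bandwidth saving over sending a full image every time comes from.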

The ability to detect finer details of change in the scene reflects on the end image quality and the bandwidth required to transmit it.

Some codecs actually send only I-frames, making them no better than a JPEG compression. Others send I-frames and P-frames, but their ability to detect fine detail is limited, so the images come out blocky and at higher bandwidths. Some codecs have insufficient processing speed to cope with fast-changing scenes: they cannot process the image right to the bottom of the frame within the 20 ms they have available before the next frame, so they send the entire area below the point they had reached as a P-frame. This results in high bandwidth, and the lower portions of the image become progressively blockier.

In terms of bandwidth requirements, the following graph shows how differently implemented compressions measure up. It should be noted that bandwidth directly relates to storage requirements.

The top line is an MJPEG compression, requiring 1.5 Mbps continuous.

The next is an MPEG4 compression that has no motion estimation, requiring 1.3 Mbps. It is effectively only sending I-frames.

The blue line is an IndigoVision 8000 series using MPEG4, requiring 1.1 Mbps.

The pink line is an IndigoVision 9000 series using H.264. Note that H.264 is far more efficient at processing video in no-motion periods. The resultant video, when averaged out, is 700 kbps.
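
To see how these bit rates translate into storage, the sketch below converts each average figure quoted above into gigabytes per camera per day. It assumes continuous 24 hour recording, decimal units (1 GB = 1,000,000,000 bytes) and no container or file system overhead; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def storage_gb_per_day(bitrate_mbps: float) -> float:
    """Gigabytes (10**9 bytes) needed for 24 hours at a constant average bit rate."""
    bits = bitrate_mbps * 1_000_000 * SECONDS_PER_DAY
    return bits / 8 / 1_000_000_000


if __name__ == "__main__":
    # Average bit rates as quoted for the four lines on the graph.
    streams = [
        ("MJPEG", 1.5),
        ("MPEG4, no motion estimation", 1.3),
        ("IndigoVision 8000 (MPEG4)", 1.1),
        ("IndigoVision 9000 (H.264)", 0.7),
    ]
    for name, mbps in streams:
        print(f"{name:28s} {mbps:.1f} Mbps -> "
              f"{storage_gb_per_day(mbps):5.2f} GB per day")
```

Multiply the result by the number of cameras and the retention period in days to get a first estimate of the storage array size.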

The reason IndigoVision’s compression is so much more efficient than other brands is that we have 34% of our workforce in R&D, which allows us to invest significant research into improving our compression. Further to this, we can re-flash the firmware in existing devices in the field to give them the latest enhancements, bringing them up to today’s standards.

In summary, the quality of a codec cannot be measured by a CIF rating or a compression algorithm. You need to take into account the actual image quality, bandwidth and storage based on a sample piece of footage. It is recommended that an identical test video or scene covering various conditions, including a resolution test chart, be used to compare all products: make a recording from each product and assess the resultant image quality, TV lines of resolution, file size and export format.

We regularly compare our product against other products in such a manner. Following are the test results of a comparison between the IndigoVision 9000 PoE IP Camera and 2 other well known brands.



  1. Ziad Kassis April 18, 2012 at 11:34 am #

    You’re right on! You drilled down to the details. I would add optics as a major factor in image quality.

    • Tim Norton April 18, 2012 at 1:13 pm #


      Thanks for taking the time to comment.
      You are correct that optics needs to be considered as well. I spend a lot of time advising how to choose the right lens focal length but not much on the quality of the lens.
      Since the introduction of mega pixel CCTV cameras, this is even more important.

      Unfortunately there are vendors that will supply you an analogue camera with a plastic lens to keep the price down but I have also encountered similar on mega pixel.
      For example a 3MP or 5MP CCTV camera supplied with a 1.3MP lens.

      PS you entered your website as martsight.com – I have corrected it to smartsight (you do want the back link I am guessing)

