This is a preliminary html conversion.

A better version with table of contents and better formatting should be there soon, but this document is quite popular and I didn't want to take it down.

Creating video cds.

For definitions of various terms used throughout this text, see the bottom of this file.

The toolset
-----------

The following text assumes that you are using some Unix or Linux variation with the following packages installed:

- mplayer. If you intend to encode from dvd, you will have to build it from source with the dvd options enabled. mplayer is also used for transcoding avi and unencrypted mpeg files.
- mjpegtools, contains the tools for encoding mpeg and mpeg2 files
- toolame or mp2enc, mpeg layer 2 audio compressor
- vcdimager (provides vcdxgen and vcdxbuild), used for creating video cd images
- cdrdao, used for burning image files to a cd.

It will still be useful for windows users: you can use cygwin for building and running those tools, or, if you prefer to use a windows encoder, the information about the different formats may still be of use.

Video CD formats.
-----------------

Basically there are 5 more or less accepted 'standards', and 2 variations on the 2 most popular vcd standards.

The 'official' standards:

cd-i video  : 364x240, 24, 25 or 30fps
              1000-1200kbit/sec cbr mpeg1 video,
              320kbit/sec cbr mpeg1 layer 2 audio

videocd 1.1 : 352x240, 24 or 30fps
              1200kbit/sec cbr mpeg1 video,
              224kbit/sec cbr mpeg1 layer 2 audio

videocd 2.0 : 352x240(ntsc) or 352x288(pal/secam)
              24, 25 or 30fps
              1200kbit/sec cbr mpeg1 video,
              224kbit/sec cbr mpeg1 layer 2 audio

cvd         : 352x480(ntsc) or 352x576(pal/secam)
              23.976, 24, 25, 29.97, 30 frames/sec
              or 50, 59.94 and 60 fields/sec
              1800kbit/sec vbr mpeg2 video,
              224kbit/sec vbr mpeg2 layer 2 audio

svcd        : 480x480(ntsc 4/3), 480x576(pal 4/3)
              720x480(ntsc 16/9), 720x576(pal/secam 16/9)
              23.976, 24, 25, 29.97, 30 frames/sec
              or 50, 59.94 and 60 fields/sec
              2500kbit/sec vbr mpeg2 video,
              224kbit/sec vbr mpeg2 layer 2 audio

In all cases, the audio must use a 44.1khz sampling rate.

Then there are hqvcd (or xvcd) and xsvcd. Those are variations on vcd 2.0 and on the combination of the cvd and svcd standards, respectively.

When to use which format..
--------------------------

The 3 things to consider when choosing a standard are:

- player compatibility
- quality
- playtime per cd

Compatibility
-------------

There is never a reason to use the cd-i video disc format (or vcd 1.0): it only plays on cd-i players and software players, and no standard videocd player will play it. If cd-i compatibility is what you need, use the vcd 1.1 or 2.0 standards. You will need to include a cd-i videoplayer app, which can be found on the web by searching google for cdi.

Otherwise, there are few reasons for using the videocd 1.x and 2.0 standards at all: in all cases it is possible to create a cvd disk that holds the same playlength at a better quality. The only remaining reason for using vcd 1.x and 2.0 is that you have your source material in vcd compliant mpeg1 files and don't want the loss of re-encoding. Of course, when your target is a hardware videocd player you are limited to the vcd 1.x and 2.0 standards, though hqvcd may work for you.

So, in almost all cases you should end up making mpeg2 files and creating svcd or cvd compliant cds. The good news here is that the cvd 'standard' is referred to by the svcd standard and is for most purposes simply a subset of it. This means that many hardware svcd players and svcd capable dvd players will also play the cvd format.
A player/resolution compatibility matrix

     | 352x240 | 352x288 | 352x480 | 352x576 | 480x480 | 480x576 | 720x480 | 720x576
-----+---------+---------+---------+---------+---------+---------+---------+---------
cdi  | yes     | yes     | no      | no      | no      | no      | no      | no
-----+---------+---------+---------+---------+---------+---------+---------+---------
vcd  | yes     | yes     | no      | no      | no      | no      | no      | no
-----+---------+---------+---------+---------+---------+---------+---------+---------
cvd  | yes     | yes     | yes     | yes     | no      | no      | no      | no
-----+---------+---------+---------+---------+---------+---------+---------+---------
svcd | yes     | yes     | yes (1) | yes (1) | yes     | yes     | yes (2) | yes (2)
-----+---------+---------+---------+---------+---------+---------+---------+---------
dvd  | yes     | yes     | yes (3) | yes (3) | yes     | yes     | yes (3) | yes (3)

1. svcd players should be able to play this, but quite a few seem unable to.
2. this usually only works on svcd players that support widescreen displays.
3. SVCD capable DVD players should be able to play this, since those resolutions are also valid dvd resolutions. It seems that not every player can do this though.

Many hardware players will not play files that are more than 90 minutes long, or will refuse to play either everything past the 90 minute mark or anything that is more than 90 minutes before the end of the file. On some players this limit is 99 minutes.

A cdi player has to be equipped with an mpeg decoder cartridge (standard in more recent models, not standard on the original cdi players).

A dvd player has to be vcd and svcd capable. Not all players are, and at times players are only svcd capable.

cvd players are typically found in China only, and are no longer produced. Any svcd player marketed in China will incorporate the cvd standard, and the svcd standard is supposedly a superset, but due to the different resolution used for svcd, there are players around that can read cvd but can't properly display the video.
There is a difference between pal and secam players, but there is no difference between pal and secam encoded video on a cd or dvd.

Quality and playtime
--------------------

This leads us to the 2 remaining parameters, quality and playtime. Given the same compressor with identical (or comparable) settings, more playtime always reduces quality, and the 'art' of good video compression is finding the best compromise between the 2. Having said that, there are ways to improve quality without reducing playtime, but those make the compression far more expensive cpu wise (note that this is a one-time cost, it won't matter at all to the player used for viewing).

Of course the compressor that you use is the major factor in the quality of the result. Basically, quick and dirty compressors as often used on the windows platform (e.g. xing, nero svcd plugin etc) don't do a very good job at getting the best out of a super videocd. Besides the quality of the compressor, the settings you use for compression can affect quality a lot.

Generally speaking, the factors that affect quality are:

- resolution
- detail and noise level of the source material
- bitrate
- search radius (or maximum motion vector length)
- accuracy of macro block subsampling
- quantizer
- speed of dynamic quantizer

The resolution has to be within the following boundaries:

For VCD:  hres must be 352,
          vres must be 240 for ntsc or 288 for pal/secam.
For SVCD: hres must be either 352, 480 or 720,
          vres must be either 480 or 576.

Interestingly, most pal capable dvd and svcd players will accept any hres/vres combination, while ntsc players are almost without exception limited to a vres of 480. Most vcd players will do pal->ntsc and ntsc->pal conversion, though the results are not always very good, especially for pal->ntsc conversion.
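The resolution boundaries listed above can be written down as a small check. This is only a minimal Python sketch and not part of any of the tools mentioned in this text; the function name `resolution_ok` is made up for illustration, and the allowed values are simply the ones listed above.

```python
# Hypothetical helper: checks a resolution against the VCD/SVCD
# boundaries listed in the text. It says nothing about what a
# specific hardware player will actually accept.

VCD_VRES = {"ntsc": {240}, "pal": {288}, "secam": {288}}
SVCD_HRES = {352, 480, 720}
SVCD_VRES = {480, 576}


def resolution_ok(fmt, hres, vres, norm=None):
    """Return True if hres x vres is within the listed boundaries.

    fmt  : "vcd" or "svcd"
    norm : "ntsc", "pal" or "secam"; only used for vcd. When it is
           None, either vcd vertical resolution is accepted.
    """
    if fmt == "vcd":
        if norm is None:
            allowed_vres = {240, 288}
        else:
            allowed_vres = VCD_VRES[norm]
        return hres == 352 and vres in allowed_vres
    if fmt == "svcd":
        return hres in SVCD_HRES and vres in SVCD_VRES
    raise ValueError("unknown format: %r" % (fmt,))


if __name__ == "__main__":
    for case in [("vcd", 352, 240, "ntsc"),
                 ("vcd", 352, 288, "ntsc"),
                 ("svcd", 480, 576, None),
                 ("svcd", 640, 480, None)]:
        print(case, resolution_ok(*case))
```

The svcd branch deliberately ignores the tv norm, since the text only lists the allowed values per axis.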
Also interesting is that it is very well possible to make a 16/9 aspect svcd at 480x576 or a 4/3 aspect svcd at 720x576, so it seems that decoders don't really care as long as each parameter itself is within range.

For bitrate there is a reasonable lower boundary, which is dictated not by physical limitations but by quality. When using 352x480 at 24fps, encoding from a source with low noise and low detail, you can go as low as 800kbit/sec.

The upper limit is dictated by the speed of the cd drive in your average player. The spec says to not go beyond approx 3100kbit/sec for video + audio + system layer combined (add approx 16kbit/sec to the total of your video + audio streams for the system layer), which leaves a 4 speed reader just enough time to do a retry at times (due to the nature of optical reading, this will often happen a few times during the playback of a cd, and this should not be noticeable to the viewer).

Optimizing for quality:

- Use the highest bitrate that is feasible (usually 2800kbit/sec)
- Use the most accurate subsampling possible.
- Use a search radius of at least 24, 32 is preferred for a hres of 480 or 720
- Use a low quantizer and a relatively slow dynamic quantizer

Optimizing for playtime:

- Use as low a bitrate as possible
- Use the most accurate subsampling possible.
- Use a search radius of at least 24, 32 is preferred for a hres of 480 or 720
- Use a higher quantizer and a faster dynamic quantizer

As you can see, you should always try to use accurate subsampling and a high search radius. Those 2 will help to get the best out of the available bitrate more than anything else, except at times for noise reduction on the source.

Noise has the nasty habit of being rather random and, as a result, being very difficult to compress (not that we usually care about correct reproduction of noise, but that is something the compressor cannot know). Because of this, it is a very good idea to ensure the source is as free of noise as possible.
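The upper bitrate limit described in this section comes down to a simple sum. The following Python sketch is only an illustration: the function names are made up, and the 16kbit/sec system layer overhead and the 3100kbit/sec ceiling are the approximate figures quoted above, not exact values from the spec.

```python
# Approximate figures quoted in the text, in kbit/sec.
SYSTEM_LAYER_KBIT = 16
MAX_TOTAL_KBIT = 3100


def total_mux_rate(video_kbit, audio_kbit, system_kbit=SYSTEM_LAYER_KBIT):
    """Video + audio + system layer, all in kbit/sec."""
    return video_kbit + audio_kbit + system_kbit


def fits_budget(video_kbit, audio_kbit, limit_kbit=MAX_TOTAL_KBIT):
    """True if the combined stream stays at or below the quoted ceiling."""
    return total_mux_rate(video_kbit, audio_kbit) <= limit_kbit


if __name__ == "__main__":
    audio = 224  # standard svcd audio bitrate from the formats list
    for video in (2000, 2500, 2800, 3000):
        total = total_mux_rate(video, audio)
        verdict = "ok" if fits_budget(video, audio) else "too high"
        print("video %4d + audio %d -> total %4d kbit/sec: %s"
              % (video, audio, total, verdict))
```

With 224kbit/sec audio, 2800kbit/sec video still fits under the quoted ceiling while 3000kbit/sec does not.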
An encoding example
-------------------

- source: dvd with animated movie in ntsc 3-2 pulldown format, in 720x480 with a widescreen (16/9) aspect ratio

First of all, create a fifo with the name stream.yuv by typing

    mkfifo stream.yuv

Use mplayer to play the movie with the following extra commandline arguments:

    -ao pcm -af resample=44100 -vo yuv4mpeg

and do this in the directory where the stream.yuv fifo was created.

Open a second prompt, cd to the same directory and type:

    cat stream.yuv | mpeg2enc -f 5 -F 2 -a 3 -n n -I 0 -p -b 2000 \
        -B 240 -S 779 -V 230 -q 6 -Q 0.3 -r 32 -4 2 -2 1 -N -o outfile.m2v

What all those parameters mean:

-f 5    : we want a user-rate svcd mpeg file
-F 2    : we have a 24fps source (it is a movie)
-a 3    : 16/9 aspect ratio
-n n    : ntsc mode (important for color and gamma correction)
-I 0    : the source is 'progressive mode' video (non interlaced)
-p      : include a 3-2 pulldown header. This will tell players to convert to 30fps and to add interlacing for proper display on a tv.
-b 2000 : 2000kbit/sec video
-B 240  : reserve 240kbit/sec for non video components (audio, system)
-S 779  : split at 779MB (approx max size of an 80 minutes cd)
-V 230  : video decoder buffer (230kbyte for a standard svcd player)
-q 6    : quantizer factor of 6
-Q 0.3  : dynamic quantizer speed, 0.1 is very low, 0.9 is way too high
-r 32   : search radius of 32
-4 2    : accuracy for 4x4 subsamples
-2 1    : accuracy for 2x2 subsamples
-N      : mild noise reduction filter

This will get you a fairly high quality, widescreen svcd compatible mpeg file that will play on most svcd and dvd players. Note however that widescreen playback of svcds is handled differently by different players, and at times is not handled correctly at all. The same applies to using 720 as hres.

Rescaling to fit a 'normal' tv..

Because not all players work well with widescreen, you may want to convert to a 4/3 display before encoding the mpeg file.
This however means that you are going to waste a lot more bandwidth on encoding black bands instead of interesting video. How much this affects the quality depends a lot on the compressor and settings used.

Basically, you should be able to use the mpeg2enc commandline with only 1 small change: the -a 3 should become -a 2. You may also consider using -r 24 instead of -r 32 if your hardware is not really fast.

The mplayer commandline will change to do some scaling and expanding to correctly create a picture that fits a traditional 4/3 aspect ratio TV set. To do so you should add something like

    -vop lavcdeint,expand=-1:480,scale=480:352

to the mplayer commandline. Note that this will also reduce the resolution to effectively 480x352, which will expand to approx 640x352 on your TV screen. You could consider scaling to 720x352 instead, especially if you are going to play it back on a dvd player that uses a scart or component video connection to your TV.

You will notice that in both cases I use a 2000kbit/sec bitrate, which is lower than the 2500kbit/sec that is set by the official standard. The reason is the combination of an animated movie, the dynamic quantizer, the mild noise reduction filter, and the fact that we have a 24fps source.

Generally speaking, the detail level of animated movies is a lot lower than that of 'real world' video or movie material, so it is easier to compress without quality loss. Then, the combination of noise reduction and dynamic quantizer ensures that high motion or the few high detail parts in it will be handled gracefully (you will get some mpeg artifacts when having a lot of motion in very detailed parts of the movie, but such things are highly exceptional, and virtually impossible to spot unless playing in slow motion and taking a very careful look). Finally, 24fps vs 30fps saves 20%, so obviously we need less bandwidth.
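The 20% figure can be checked with a bit of arithmetic. The sketch below is only an illustration in Python with made-up function names; it treats the bitrate as if it were spread evenly over all frames, which is a simplification of what a real mpeg encoder does.

```python
from fractions import Fraction


def framerate_saving(low_fps, high_fps):
    """Fraction of frames saved by encoding at low_fps instead of high_fps."""
    return 1 - Fraction(low_fps, high_fps)


def kbit_per_frame(video_kbit_per_sec, frames_per_sec):
    """Average bit budget per frame, assuming an even spread over frames."""
    return Fraction(video_kbit_per_sec, frames_per_sec)


if __name__ == "__main__":
    saving = framerate_saving(24, 30)
    print("24fps vs 30fps saves %d%%" % (saving * 100))
    print("2000kbit/sec at 24fps: %.1f kbit per frame"
          % float(kbit_per_frame(2000, 24)))
    print("2500kbit/sec at 30fps: %.1f kbit per frame"
          % float(kbit_per_frame(2500, 30)))
```

Under that simplification, 2000kbit/sec at 24fps gives each frame the same average budget (about 83.3 kbit) as 2500kbit/sec at 30fps.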
Looking at the 2 samples above, the widescreen version will have a much better resolution, but the motion quality of the 'normal' version will be a lot better.

When done with the video, you will have to encode the audio and multiplex the streams to an mpeg program stream.

To encode the audio, type

    toolame -b 224 audiodump.wav outfile.mp2

or when using mp2enc, type

    mp2enc -V -o outfile.mp2 < audiodump.wav

When the audio encoder is finished, you can multiplex the streams using mplex. Note that you will have to tell mplex that you are creating a user rate svcd. The mplex commandline should look as follows:

    mplex -t 5 -V -r 2240 -b 230 -o outfile-part-%d.mpg outfile.m2v outfile.mp2

You should now have a set of super video cd compliant mpeg files, each no larger than some 780MB.

You now need to use the vcdxgen tool to create an xml description for the video cd, which will be used by the image creation tools.

    vcdxgen -t svcd outfile-part-1.mpg

will create a file videocd.xml that can be handed to the vcdxbuild tool. You can optionally edit this file first to add your own menus and such, but that is beyond the scope of this text.

Now create an image by typing

    vcdxbuild videocd.xml

and after a few minutes you should have 2 new files: videocd.cue and videocd.bin. Those 2 files make up the cd image that you can burn to a cd with cdrdao.

Encoding video from a tv source
-------------------------------

Analog TV sources have a set of specific problems which make them rather nasty for mpeg compression. First of all, analog TV is noisy. Second, it is an interlaced source that does not give you a way to figure out where the top and bottom fields are, or whether it is in fact doing a 3-2 pulldown (telecine) when dealing with ntsc video. pal has the additional issue that many pal converted movies are actually simply a 24fps movie played at a 4% higher speed, which results in having to do a slowdown of the video and resampling of the audio.
The result is that you will always need deinterlacing filters, and that you may need to do a reverse telecine manually (reverse telecine goes beyond this text, refer to the mjpegtools documentation) or will have to do speed conversion.

Assuming you have a regular ntsc or pal video source, all you should have to care about is deinterlacing and denoising. Generally speaking, the cvd resolution and bitrate are ideally suited for encoding analog tv, provided you also use a noise reduction filter. Using 352x480, I have been able to make cvd disks that contain almost 90 minutes of super-vhs quality video.

Even though it is possible to record at resolutions of up to 640x480 (ntsc) or 768x576 (pal/secam), in practice there is so much signal loss and noise that 352x480 or 352x576 is close to the actual resolution as you receive it.

Recording from digital TV? You can basically consider this as identical to encoding from a medium quality digital source (it won't get close to DVD quality, but definitely goes as far as a high quality divx file).

Encoding from digital sources
-----------------------------

The main differences between DVD and lower quality digital sources have to do with the target resolution and with needing extra filters in mplayer.

First of all, scaling is ugly. If you can at all avoid having to scale video, you should do so. Then, some things scale better than others, but generally speaking you should never magnify by more than a factor of 2 in any direction, and you should use a gaussian filter whenever you increase resolution.

Reduction can be done with a simple bicubic scaler, and is usually less problematic, but there are a few pitfalls. The most important one is that if your source is in a yuv format (which effectively means all mpeg, mpeg2 and mpeg4 variations), you must first convert back to rgb using a high quality color conversion, preferably with shape following interpolation, or use a very good yuv scaler.
Failure to do so will result in diagonal lines that are shaded to the edges being distorted, and similar scaling artifacts.

Then, you often have to deal with artifacts from a previous compression, since uncompressed digital video sources are rare and require a kind of storage space that few people have. In many cases you can tell mplayer to compensate for such artifacts, for example by adding pp=hb/vb to the -vop argument of mplayer to enable horizontal and vertical 'deblocking'.

When encoding from a dvd that is not using 3-2 pulldown, you will often have to de-interlace the video. For most other digital media this is not needed (and should be skipped when not needed because it reduces quality a bit).

Resolution/bitrate matrix
-------------------------

    | 352x240   | 352x288   | 352x480   | 352x576   | 480x480   | 480x576   | 720x480   | 720x576
----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------
1200| mpeg1     | mpeg1     | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v
    | good      | decent    | low       | bad       | bad       | unusable  | unusable  | unusable
    | 60        | 60        | 90+       | 90+       | 90+       | 90+       | 90+       | 90+
----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------
1500| mpeg1v    | mpeg1v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v
    | good      | good      | decent    | decent    | bad       | unusable  | unusable  | unusable
    | 90+       | 90+       | 90+       | 90+       | 90+       | 90+       | 90+       | 90+
----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------
1800| mpeg1v    | mpeg1v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v
    | good      | good      | good      | good      | decent    | decent    | bad       | bad
    | 70-90     | 70-90     | 70-90     | 70-90     | 60-90     | 60-90     | 60-70     | 60-70
----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------
2000| --        | --        | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v
    |           |           | good      | good      | good      | decent    | decent    | decent
    |           |           | 60-80     | 60-80     | 60-80     | 60-80     | 50-70     | 50-70
----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------
2500| --        | --        | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v
    |           |           | excellent | excellent | excellent | excellent | good      | good
    |           |           | 50-60     | 50-60     | 40-60     | 40-60     | 40-50     | 40-50
----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------
2800| --        | --        | --        | --        | mpeg2v    | mpeg2v    | mpeg2v    | mpeg2v
    |           |           |           |           | excellent | excellent | excellent | excellent
    |           |           |           |           | 35-50     | 35-50     | 30-40     | 30-40

The first column is the video bitrate in kbit/sec. Each cell lists compression format, quality and playtime in minutes. The reported playtimes and quality also depend quite a bit on the noise and detail level of the source as well as the quantizer, search radius and subsampling settings.

Whenever there is a -- in the table, the bitrate no longer helps quality in any noticeable way, so you should look for the highest bitrate that still has info for the resolution you are using.

mpeg1 means constant bitrate mpeg1 video, and is compatible with videocd 1.1 (ntsc only) or videocd 2.0.

mpeg1v means variable bitrate mpeg1 video and can be used on hqvcd or xvcd. Few hardware video cd or super video cd players can play this, but dvd players often can when you burn it to a regular v2.0 video cd.

mpeg2v is variable bitrate mpeg2, usable on super video cds (and the cvd and xsvcd variations).

For all variable bitrate encodings, the bitrate is the peak bitrate, not the average. For video cd, the bitrate should be no more than 1200kbit/sec. For hqvcd, the peak bitrate measured over any 48kbytes of data should not exceed 1800kbit/sec, and for svcd variations, the peak bitrate measured over any 230kbytes of data should not exceed 2800kbit/sec (in all cases for video; you may squeeze out a little bit more by using a lower bitrate audio stream, but there are players that will have trouble with this, and you are no longer vcd/svcd compliant).

As mentioned earlier, the official spec is 2500kbit/sec for video and 224kbit/sec for audio on super video cd.
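To get a feeling for where the playtimes in the matrix come from, the relation between bitrate and playtime can be sketched in a few lines of Python. This is only a rough estimate with made-up names. It assumes that 1 MB is 1024*1024 bytes and 1 kbit is 1000 bits, reuses the 779MB split size from the encoding example and the approximate 16kbit/sec system layer overhead, and treats the stream as if it ran at its peak bitrate all the time.

```python
def playtime_minutes(size_mb, video_kbit, audio_kbit=224, system_kbit=16,
                     bytes_per_mb=1024 * 1024):
    """Estimated playtime in minutes for a file of size_mb megabytes.

    Assumes a constant total bitrate; a variable bitrate stream that
    only peaks at video_kbit will play longer than this estimate.
    """
    total_bits = size_mb * bytes_per_mb * 8
    bits_per_second = (video_kbit + audio_kbit + system_kbit) * 1000
    return total_bits / bits_per_second / 60


if __name__ == "__main__":
    for video in (1200, 1800, 2000, 2500, 2800):
        print("%4d kbit/sec video: about %.0f minutes on 779MB"
              % (video, playtime_minutes(779, video)))
```

Because the matrix lists peak bitrates for the variable bitrate formats, this estimate is best read as the lower end of the playtime range; the average bitrate of a real encode is usually lower, which gives a longer playtime.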
The order of qualifications for quality is:

unusable  : Too many artifacts to be any good.
bad       : Very bad quality, but video is recognizable.
low       : high motion/high detail scenes will look bad, but video is 'watchable'.
decent    : some noticeable artifacts in high motion/high detail still.
good      : slight blurring of high motion, no visible artifacts.
excellent : no visible artifacts, no noticeable blurring.

Rule of thumb table for search radius
-------------------------------------

352x240  16
352x288  16
352x480  24
352x576  24
480x480  24
480x576  24
720x480  32
720x576  32

Reasonable settings for q (quantizer) and Q (dynamic quantizer 'speed')
-----------------------------------------------------------------------

Basically you should keep q between 4 and 8. 4 gives slightly better quality at the cost of more artifacts in 'active' areas (see below), while 8 gives some more artifacts overall, but a more constant quality.

A too low quantizer will result in a drop of quality towards the bottom of a frame. This effect depends a lot on the compressor that you use, but mpeg2enc is fairly unintelligent when it comes to distributing the available bandwidth over a frame, so when it starts running out of bandwidth it will bump up the quantizer as far as needed to still fit in the available bandwidth. When this happens you will see artifacts (blocking) in the lower part of the picture. In this case you will have to use a slightly higher quantizer, or use a higher value for the dynamic quantizer (see below).

Using the Q parameter you can tell mpeg2enc to change the quantizer in highly active areas of a frame. Basically, mpeg2enc tries to detect areas with high motion and lots of detail, and increases the quantizer for those specific parts of a frame. This will help to reduce bandwidth usage while having a very mild impact on visual quality.
This results in a noticeable increase in visual quality (while in fact the quality of reproduction is about as good as without this feature, the way that motion is perceived by the human eye makes it look better still). Reasonable values are between approx 0.1 and 0.5, and usually 0.3 will give a very decent result. You will have to experiment with this parameter for every source and have to look out for 'shadowing' (look for ghost images or 'shadows' of detailed parts of a frame).

Subsampling
-----------

If you run on either a cpu that has 3dnow support or on anything faster than 1ghz, you simply want to use -4 2 -2 1 as quality settings for the subsample search. This is still a bit of a speed/quality tradeoff, but the difference with -4 1 is not noticeable, while the difference in encoding performance is.

On other hardware you most likely want to use the default settings due to the high performance penalty. If you don't care how long it takes, then still go for the advised settings.

Ideal combinations
------------------

Ideal format for 'real' movies:
If you can get a movie in ntsc format with the typical 3-2 pulldown mpeg header, that is by far the preferred source. A pal format movie will give a slightly better resolution, but the additional quality may get lost due to de-interlacing, which is not needed when dealing with ntsc resolution material in 24fps + pulldown. Compress to 2500kbit/sec mpeg2 video in 480x480 or 720x480 with a q of 4 and a Q between 0.2 and 0.4.

Ideal format for animated or low motion movies:
Again, get it in ntsc pulldown format if possible. Compress to 2000kbit/sec mpeg2 video in 480x480 or 720x480 with a q of 4 and a Q between 0.2 and 0.5.

Ideal format for TV:
Get the source in pal format if possible. Compress to 1800-2200kbit/sec 352x576 or 480x576 with a q of 6 and a Q between 0.2 and 0.5. Make sure to use -N or another form of noise reduction.
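The three combinations above can be kept as a small table of presets and turned into an mpeg2enc argument list. The Python below is only a sketch: the preset names and the function `mpeg2enc_args` are made up, the Q value of 0.3 and the 2000kbit/sec for TV are arbitrary picks from within the ranges given above, and only mpeg2enc options that already appear in this text are used. Source dependent options such as -F, -a, -n, -I and -p are left out on purpose and still have to be added by hand.

```python
# Hypothetical presets based on the 'ideal combinations' in the text.
PRESETS = {
    "movie":    {"bitrate": 2500, "q": 4, "Q": 0.3, "denoise": False},
    "animated": {"bitrate": 2000, "q": 4, "Q": 0.3, "denoise": False},
    "tv":       {"bitrate": 2000, "q": 6, "Q": 0.3, "denoise": True},
}

# Rule of thumb search radius per horizontal resolution, for the
# 480 and 576 line resolutions used by these presets.
SEARCH_RADIUS = {352: 24, 480: 24, 720: 32}


def mpeg2enc_args(preset, hres, outfile="outfile.m2v"):
    """Build an argument list for mpeg2enc from a preset name and hres."""
    p = PRESETS[preset]
    args = ["mpeg2enc", "-f", "5",
            "-b", str(p["bitrate"]),
            "-q", str(p["q"]),
            "-Q", str(p["Q"]),
            "-r", str(SEARCH_RADIUS[hres]),
            "-4", "2", "-2", "1"]
    if p["denoise"]:
        args.append("-N")  # mild noise reduction, advised for TV sources
    args += ["-o", outfile]
    return args


if __name__ == "__main__":
    for name, hres in (("movie", 720), ("animated", 480), ("tv", 352)):
        print(" ".join(mpeg2enc_args(name, hres)))
```

Keeping the values in one table makes it easier to experiment with q and Q per source, which the text recommends anyway.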
Do not try to reduce bitrate requirements by using inverse telecine to reduce ntsc video from 30 to 24fps unless the original was once 24fps and you are able to detect and undo the telecine process that was used to convert to 30fps. Using it on regular video often hurts video quality more than the little bit of extra bandwidth or the quantisation artifacts that 30fps will cause. Also, de-interlacing by doing inverse telecine can cause severe picture tearing artifacts when used on original 30fps video.

When encoding a TV recording to svcd format, you may consider not de-interlacing the material at all if the source was made with a video camera or any other device that creates 60 individual fields/sec instead of 30 'field separated' frames/sec. The resulting motion quality could be a lot better, but you must inform mpeg2enc that the source is interlaced by giving it a -I 1 argument, and you must tell mplayer to generate interlaced output by changing the -vo parameter to

    -vo yuv4mpeg:interlaced

Also, this works very well on traditional (CRT based) TV sets, but usually has no real effect on a modern 'digital' TV set.

In all other cases, there is little point in using interlaced material on super video cds, and video cd 1.1 and 2.0 do not support it at all. Almost any svcd and dvd player will properly interlace the output when needed, and by using non-interlaced video for compression, you save both on bandwidth and compression time for a given quality.

Definitions
-----------

aspect ratio      : width/height for a display.
bitrate           : number of bits/second.
elementary stream : audio or video encoded in the mpeg format. 1 or more elementary streams and a system stream can be multiplexed into a program stream.
measured quality  : The quality of reproduction, usually expressed as the percentage of the original information that gets reproduced correctly.
mpeg              : motion picture experts group, used here to denote standards created by this group.
program stream    : multiplexed mpeg audio/video/system streams. This is often the only type of stream recognized by hardware video players.
visual quality    : The quality of video as perceived by a viewer. Due to imperfections in how the human eye sees moving pictures, a lot of tricks are possible that will reduce the measured quality without affecting visual quality. Most notable are that in low light situations color gets lost anyway, and that small details that move at a high speed are not very visible, so hinting at them is usually enough.

references:
www.vcdhelp.com has excellent information about the video cd standard and its variations.
www.icdia.org is the homepage of 'The NEW International CD-I association'.
Old by now, but a good reference for CD-I related information, and it has pointers to the cdi player app for cdi compatibility.


(C) Copyright 2003 Bart van Leeuwen. All rights reserved. Publication and distribution allowed only with inclusion of this copyright notice and a reference to the original location (http://www.bartsplace.net/)
