HD lip sync issues require more attention

By Ken Kerschbaumer

Lip sync has been a vexing problem for years, but will the move to HD make it even more of a challenge? That was the focus of a discussion at the AES convention this past weekend. While the vast amount of encoding and decoding that an HD signal undergoes makes the lip sync problem worse, there’s a more fundamental issue: do the sharper images of HD mean that viewers can more easily tell when lips are out of sync?

Andrew Mason, of the BBC research department, says the BBC has been conducting tests in which viewers are shown identical material in both SD and HD with the audio out of sync. Viewers are then asked to adjust the audio until it is in sync. The premise is that if viewers can dial in sync more accurately on the HD images, then there is evidence that yes, more resolution does require tighter lip sync.

“SD viewers had a 50 ms wider swing of sync than HD so, if the numbers are to be believed, HD may require a change in lip sync parameters,” he says. Higher frame rates could also increase the need for tighter specs. Mason, however, made it clear that further testing needs to be completed before any solid conclusions can be drawn.

Ken Hunold, Dolby Laboratories broadcast applications engineer, says that delays are usually a result of video processing occurring more slowly than the audio processing. Sports technologies like the First-and-Ten lines and even CCD cameras all introduce video delay.

He recommends, as a first step, drawing a flow chart of all the equipment in the broadcast chain and writing down the delay each device introduces. That can help identify the weak links in the chain and show where problems can best be addressed.
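The bookkeeping Hunold describes can be sketched in a few lines of Python. This is only an illustration: the device names and delay figures below are invented placeholders, not measurements from any real facility, and the `Device`, `av_offset_ms` and `worst_offender` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """One box in the broadcast chain, with the delay it adds to each path."""
    name: str
    video_delay_ms: float
    audio_delay_ms: float


def av_offset_ms(chain):
    """Net offset of the whole chain.

    A positive result means video lags audio (audio arrives early);
    a negative result means audio lags video.
    """
    return sum(d.video_delay_ms - d.audio_delay_ms for d in chain)


def worst_offender(chain):
    """Return the device with the largest audio/video mismatch."""
    return max(chain, key=lambda d: abs(d.video_delay_ms - d.audio_delay_ms))


# Placeholder figures for illustration only (assumed, not measured).
chain = [
    Device("camera", 33.0, 0.0),
    Device("graphics inserter", 66.0, 0.0),
    Device("encoder", 120.0, 100.0),
]

print(f"Net offset: {av_offset_ms(chain)} ms")
print(f"Largest mismatch: {worst_offender(chain).name}")
```

Listing the delays this way makes it easier to see which single device contributes the most mismatch, and therefore where a compensating delay would do the most good.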

The trick, adds Randy Conrod, Harris product manager, is to make sure that if there is a delay, the video arrives before the audio and not the audio before the video. Nearly everyone has been preconditioned to respond reasonably favorably to delayed audio thanks to experiences like watching scoreboard video or even presentations at conferences where the viewer is far from the screen. But when the audio is ahead of the video it appears “unnatural.”

“The compromise route is to have the studio output provide enough delay to guarantee that the early audio noticeable threshold isn’t reached,” he says.
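As a rough sketch of that compromise, the audio delay added at the studio output can be sized from the worst-case audio lead seen across the downstream paths. The function name, the sample leads and the 45 ms threshold below are all assumptions made for illustration; none of them come from Conrod or from a cited specification.

```python
def required_audio_delay_ms(audio_leads_ms, early_audio_threshold_ms):
    """Smallest audio delay that keeps every path at or under the threshold.

    audio_leads_ms: how far audio runs ahead of video on each path, in ms
        (negative values mean audio is already late on that path).
    early_audio_threshold_ms: the largest audio lead considered unnoticeable.
    """
    worst_lead = max(audio_leads_ms)
    return max(0.0, worst_lead - early_audio_threshold_ms)


# Illustrative numbers only (assumed): three downstream paths and a
# placeholder threshold of 45 ms for noticeable early audio.
leads = [119.0, 80.0, -10.0]
delay = required_audio_delay_ms(leads, 45.0)
print(f"Add {delay} ms of audio delay at the studio output")
```

The tradeoff is that the same delay is applied to every path, so a path where the audio was already late becomes later still. That fits the point above that viewers tolerate late audio far better than early audio.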

Ironically, many of the issues result from simply having too much processing gear in the chain. In the 1970s, when processing gear cost upwards of $100,000, lip sync issues were few and far between. “Now that equipment is like paper clips,” he adds. “You think you don’t have any issues until you have too many of them.”

Video servers, for example, will happily capture and protect an incoming misaligned audio/video signal. “But coming out of the server the issue can be fixed,” says Conrod.

And then there is the home front. The growing processing power of set-top boxes and the complexity of HDTV sets with built-in scan converters can both negatively impact the viewer experience. Hunold says it’s imperative that the consumer electronics industry become involved in addressing delay issues that occur in the home.

Adds Conrod: “All the conversion processes in the home add more time to the video process. Even if the signal is tightly timed to the set-top box, issues can arise on the output.”