

Subtitling for web videos

Ask any broadcaster or internet video user how to apply subtitles or captions to web-distributed video and chances are the reply will be that it is relatively easy: all you need is a simple text file with some timing and perhaps style information included (e.g. an SRT, DFXP, WebVTT, SMIL or SAMI file).
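To show just how simple such a text file is, here is a minimal sketch that reads two SRT-style cues into start time, end time and text. The sample cues, the `Cue` class and the `parse_srt` function are made up for illustration, and the parser ignores styling and malformed input.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(data: str) -> List[Cue]:
    """Illustrative parser: one cue per blank-line-separated block."""
    cues = []
    for block in re.split(r"\n\s*\n", data.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the subtitle text.
                text = "\n".join(lines[i + 1:])
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues


# Hypothetical sample file content.
SAMPLE = """1
00:00:01,000 --> 00:00:04,000
Welcome to the programme.

2
00:00:04,500 --> 00:00:07,250
Subtitles are switched on.
"""

cues = parse_srt(SAMPLE)
```

Note that the file says nothing about font, position or rendering, which is where the difficulties described in the rest of this article begin.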

That answer is right if they are not too concerned about how the text looks, or whether it is presented with the degree of timing accuracy that is the norm for conventional television broadcasts. Ask them whether that would work for live web video (simulcasts or web-only live broadcasts) and uncertainty starts to kick in. There is some software available that can provide in-vision captions or subtitles, but it is important to keep to the ‘safe’ area for on-screen graphics, or else you will lose some of your words when they are displayed on a TV screen. Also allow enough reading time on screen: the usual guideline is 3 seconds per line and no more than 6 words per line, depending on the content.
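The reading-time guideline is easy to check automatically. The sketch below is only an illustration: the function name `check_cue` is invented here, and the two constants simply encode the "3 seconds per line, 6 words per line" rule of thumb, which in practice varies with the content.

```python
from typing import List

SECONDS_PER_LINE = 3.0    # guideline reading time per line
MAX_WORDS_PER_LINE = 6    # guideline maximum words per line


def check_cue(start: float, end: float, text: str) -> List[str]:
    """Return a list of guideline problems for one subtitle (empty if none)."""
    problems = []
    lines = [line for line in text.splitlines() if line.strip()]

    # Check each displayed line against the word limit.
    for number, line in enumerate(lines, start=1):
        words = len(line.split())
        if words > MAX_WORDS_PER_LINE:
            problems.append(
                f"line {number} has {words} words (max {MAX_WORDS_PER_LINE})"
            )

    # Check that the subtitle stays on screen long enough to be read.
    needed = SECONDS_PER_LINE * len(lines)
    duration = end - start
    if duration < needed:
        problems.append(
            f"on screen for {duration:.1f}s, guideline is {needed:.1f}s"
        )
    return problems


ok = check_cue(0.0, 6.0, "Hello and welcome\nto the programme")
too_fast = check_cue(0.0, 2.0, "One two three four five six seven")
```

A check like this only flags candidates for review; an experienced subtitler still decides whether a particular title reads comfortably.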

“Captioning” is included with the word “subtitling” in this blog article. Caption users have different priorities to subtitle users, and because of this the technologies developed to support captions in web video distribution do not provide all the features needed for translation subtitles of the highest quality. Indeed, some automatic translations, much like automatic transcription tools, can be very wide of the mark!

In reality, subtitling on the internet is still in its infancy. The tools exist, but you still need an experienced human to really make it work. The competing video distribution standards are wide-ranging and proprietary, and they offer different mechanisms and levels of provision for subtitling. Unlike traditional broadcast, there are no clear standards for subtitle files and no consistent strategies for the presentation of subtitles. There are generally agreed conventions for things like number of words, typeface, duration of each title and so on, but these are guides based on experience, in our case from subtitling for the BBC over many years and for corporates whose videos need to be accessible to those who cannot use audio.

This presents many problems for broadcasters of video online. Online video has a wider demographic audience, with a larger range of translation requirements. Creating a wide range of translations for a web video (plus at least one caption service) causes enough problems without having to create all the translations and captions in different formats to match every distribution mechanism the video may be delivered through. Add the issues of presentation inconsistency and the different user interfaces for selecting and enabling captions or subtitles, and web subtitling becomes very difficult.

A system developed by Screen Systems addresses this issue with technology that adds high-quality image-based subtitles to web video. The images are not burned in and can be switched on and off by the user, using the controls in the player. The text is delivered as a bitmap image, which eliminates the issues with text rendering and is key to a consistent, high-quality service across different video player platforms. The images are sent via a separate HTTP download instead of within the video. One small drawback is that more bandwidth is needed than for text-based subtitles, but it is still small compared to the video itself, and only subtitles for the selected language are sent to the player. If the subtitles are switched off, there is no overhead.

Video on Demand services only require a player that supports scripting (or plugins) and transparent overlays, plus a mechanism that reports the current position of the video being played. The video stream is not altered to accommodate the subtitles, and other languages can be added to the subtitle server without any changes to the video distribution.
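As a rough illustration of the Video on Demand case, the player script only has to turn the reported playback position into "which subtitle image, if any, should be overlaid now". The sketch below is not Screen Systems' implementation: the cue list, the image paths and the function `image_for_position` are hypothetical, and it assumes cues are sorted by start time and do not overlap.

```python
from bisect import bisect_right
from typing import Optional

# Hypothetical cue list for one language: (start seconds, end seconds, image path).
# Assumed to be sorted by start time with no overlapping cues.
CUES = [
    (1.0, 4.0, "/subs/en/0001.png"),
    (4.5, 7.25, "/subs/en/0002.png"),
    (10.0, 12.0, "/subs/en/0003.png"),
]
STARTS = [cue[0] for cue in CUES]


def image_for_position(position: float) -> Optional[str]:
    """Return the image to overlay at this playback position, or None."""
    # Find the last cue that starts at or before the reported position.
    i = bisect_right(STARTS, position) - 1
    if i < 0:
        return None
    start, end, url = CUES[i]
    # The cue only applies while the position is inside its time window.
    return url if position < end else None
```

Because the lookup depends only on the reported position, seeking or pausing the video needs no special handling: the script simply asks again with the new position.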
The main problem follows on from this: live web video subtitling.
In a live web broadcast the position of the video cannot be reported reliably, because it differs for each viewer depending on when they “switched the live broadcast on”. Each viewer may also be able to pause or rewind the live stream (if the feature is provided).

Screen Systems had to develop a different synchronisation mechanism for live web broadcasts, while still using image-based subtitles held on a separate server. The main difference is in the player “script” or “plugin”. In this solution, low-bandwidth timing information is added to the video stream and used to synchronise the subtitles to the frames of video the viewer is watching. The “script” on the viewer’s device obtains the current time value of the frame from the stream and uses it to determine which subtitle, if any, should be fetched from the subtitle server and displayed.
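The idea can be sketched in a few lines. This is an assumption-laden illustration rather than the actual Screen Systems design: the class `LiveSubtitleIndex`, the image paths and the choice of "seconds since the broadcast started" as the embedded time value are all invented here. The point it shows is that the lookup key is the time value carried with the frame, not how long a particular viewer has been watching.

```python
from typing import List, Optional, Tuple


class LiveSubtitleIndex:
    """Subtitle images keyed by the time value embedded in the stream."""

    def __init__(self) -> None:
        self._cues: List[Tuple[float, float, str]] = []

    def publish(self, start: float, end: float, image_url: str) -> None:
        # Cues arrive in broadcast order as the subtitler produces them.
        self._cues.append((start, end, image_url))

    def lookup(self, stream_time: float) -> Optional[str]:
        # Search newest first, since most viewers are near the live edge.
        for start, end, url in reversed(self._cues):
            if start <= stream_time < end:
                return url
        return None


index = LiveSubtitleIndex()
index.publish(60.0, 63.0, "/live/en/0042.png")
index.publish(63.5, 66.0, "/live/en/0043.png")

# Two viewers are looking at the same frame, stamped 62.0 in the stream.
frame_stream_time = 62.0
# Viewer A has watched from the start; viewer B joined a minute in,
# so their local "seconds watched" values differ.
seconds_watched_a = 62.0
seconds_watched_b = 2.0

# Keyed on the embedded time, both viewers get the same subtitle.
same_for_both = index.lookup(frame_stream_time)
# Keyed on viewer B's local playback time, the lookup misses the subtitle.
wrong_for_b = index.lookup(seconds_watched_b)
```

The same lookup also covers pausing and rewinding: whatever frame the viewer ends up on carries its own time value, so the script never needs to know how the viewer got there.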

Having to develop player components (plugins or scripts) may be seen as a limitation of this solution compared to using “off the shelf” methods provided by the player (if any). However, any web video service has elements of custom functionality added when it is put online (e.g. DRM). The advantage of the Screen Systems solution is that the same mechanism, image-based subtitles held on a separate server, can be used for consistent broadcast-quality subtitling on both Video on Demand and live web broadcasts.

For smaller businesses hosting their videos on YouTube, the captioning tool within YouTube is very flexible but can be time consuming if it is not something you use frequently, so engaging subtitling services from experienced producers can actually save a lot of time and, eventually, money.



