Optimise your video content for voice search with these 3 top tips

I recently went to The Technology in Marketing Show at The Olympia in London and one of the questions I was asked was “How do I optimise my videos for voice?” No surprise really.

Google’s most recent stat on voice reports that 1 in every 5 mobile searches is carried out by voice. That means about 12% of all Google searches (a whopping 420 million searches) are mobile voice searches on Voice Assistants such as Siri. And as the technology improves to deal better with voice recognition and voice commands, that share is predicted to rise. I’m sure Siri will be a top baby name in 2020!

In truth though, voice is still in its infancy, and optimising video for voice search broadly replicates what you should already be doing to optimise your video rankings. There are, however, 3 little nuances that could help your new, and existing, video content achieve a ranking when Siri (or others) comes a-searching.

1) A video script (and transcript) that is conversational in tone

Users of voice search tend to speak to their Voice Assistant in much the same way as they talk to a friend. Their search is more conversational in tone. More specific. More localised. It resembles a two-way conversation, with Voice Assistants being asked follow-up questions to the original search.

Their searches also demonstrate a much clearer intent, e.g. “How do I”. In fact, “How to” questions were the questions most commonly asked by users of Voice Assistants in 2017.

When developing any video content, you should always know what your customers are searching for, as it’s crucial to develop content that responds. However, if you factor in the slight nuances of voice search from the outset, you can work more cleverly on the video’s content and script development.

You can have a more conversational script where the content is more “how-to/FAQ” in nature. Thinking about how a person would actually ask for your content when speaking allows you to write a script that answers their questions in this tone. It also allows you to pre-empt their follow-up questions, as well as to include their keywords of intent, e.g. What, How. Being more conversational in tone means you cover off all the long keyword phrases and questions that people will casually say in their voice search.

This, in turn, means the corresponding video transcript will be more targeted to voice and, as it’s your video transcript that will act as your video’s “page copy,” it will mean you can become more relevant to the voice search, as well as text searches, helping your video to rank.
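If you want a quick sense-check of how “how-to/FAQ” a draft script really is, a rough sketch like the one below can help. It simply lists the sentences in a transcript that are phrased as questions and open with a word of intent. The function name, the list of opener words and the sample transcript are all my own assumptions for illustration, not part of any SEO tool.

```python
import re

# Hypothetical list of intent words; swap in whatever your customers actually say.
QUESTION_OPENERS = ("how", "what", "where", "when", "why", "who", "which", "can")


def question_sentences(transcript: str) -> list[str]:
    """Return sentences that end in '?' and start with one of the opener words."""
    # Split into sentences, keeping the closing punctuation mark.
    sentences = re.findall(r"[^.!?]+[.!?]", transcript)
    found = []
    for sentence in sentences:
        sentence = sentence.strip()
        words = sentence.split()
        first_word = words[0].lower() if words else ""
        if sentence.endswith("?") and first_word in QUESTION_OPENERS:
            found.append(sentence)
    return found


# Made-up sample transcript, for illustration only.
sample = (
    "How do I clean suede shoes? Start with a soft brush. "
    "What if they get wet? Let them dry naturally."
)

for question in question_sentences(sample):
    print(question)
```

A script with few or no matches is not necessarily a bad one, but it may be worth asking whether it speaks the way your customers ask.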

2) A more local meta description

It’s not just the script that can be developed in a more conversational way. There are little refinements you can make to the meta description too with voice search in mind.

Longer descriptions (between 100 and 200 words) work well for video optimisation as they explain what your video is about. Users of voice are much more likely to use regional words or phrasing when speaking to Voice Assistants, and the meta description provides an ideal place to reflect these.

For example, a person, based in Gateshead, may ask their Voice Assistant “Where can I get new school shoes near me for the Bairn today?” If you were selling children’s shoes in and around Gateshead, you would want to include the word “Bairn” in your description as well as Gateshead and the surrounding localities too.

Having voice in mind means you can tweak and refine to optimise your meta description better for all search.
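As a rough illustration, the two checks described above (a description of 100 to 200 words that includes your local terms) could be sketched like this. The function name, the default limits as parameters and the example terms are assumptions of mine, and the matching is a simple case-insensitive substring check rather than anything a search engine actually does.

```python
def check_description(
    description: str,
    local_terms: list[str],
    min_words: int = 100,
    max_words: int = 200,
) -> dict:
    """Report the word count and any local terms missing from a meta description."""
    word_count = len(description.split())
    lowered = description.lower()
    # Case-insensitive substring match, so "bairn" also matches "bairns".
    missing = [term for term in local_terms if term.lower() not in lowered]
    return {
        "word_count": word_count,
        "length_ok": min_words <= word_count <= max_words,
        "missing_terms": missing,
    }


# Made-up draft description and term list, for illustration only.
draft = "New school shoes for the bairn in Gateshead"
report = check_description(draft, ["Bairn", "Gateshead", "Low Fell"])
print(report)
```

Here the draft would be flagged as too short and as missing one of the surrounding localities, which is the prompt to go back and flesh it out.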

3) Interactive video to deliver immediate results

Strictly speaking, this is more about optimising your video format than video optimisation per se, but it is definitely worth a mention here.

Interactive video example

Ted Baker’s interactive video had clickable hotspots embedded in it. Viewers could open a pop-up, read more information about the item and either hit the “shop now” button or continue watching the video.

Users of voice search are predominantly using their Voice Assistant on their smartphones, and 51% use it to find out about a product they wish to buy. There is more immediacy to these searches than to those made on a desktop, with people often wanting to know about or buy a product NOW!

Interactive video allows people to buy products from right within the video player window, by letting them add products to their basket. The interactive video Ted Baker launched to support their Christmas shopping campaign is a perfect example.

A video format that responds to what people want to know about your product, and then gives them an immediate way to buy it, capitalises on the immediacy of voice vs text search.


While it is still early days for voice search, and it tends to be more popular in consumer than in business markets, it is steadily gaining ground as a way for people to search for information. Considering voice within your overall SEO content plans will mean that your videos, and other content, do well when Siri comes a-calling.



