In the future, will journalists be put out of their jobs by robots? It’s a distinct possibility, starting with live coverage of sporting events. In a research paper published late last month, Rahul Anand Sharma and CV Jawahar, scientists at IIIT Hyderabad, along with Pramod Sankar K of Xerox Research Center India, explain how they used machine learning techniques to generate text-based cricket commentary with an accuracy rate of 90 percent.
“In the first stage, the video is segmented into ‘scenes’, by utilising the scene category information extracted from text commentary. The second stage consists of classifying videoshots as well as the phrases in the textual description into various categories. The relevant phrases are then suitably mapped to the video-shots,” says an excerpt from the paper titled “Fine-Grain Annotation of Cricket Videos”.
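The second stage described in the excerpt, matching labelled phrases to labelled video shots, can be illustrated with a minimal Python sketch. This is not the researchers' implementation: the `Shot` and `Phrase` types, the `map_phrases_to_shots` function, the category names, and the frame numbers are all hypothetical, and the category labels are assumed to have already been predicted by some classifier.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    start_frame: int
    end_frame: int
    category: str  # assumed classifier output for this video shot


@dataclass
class Phrase:
    text: str
    category: str  # assumed classifier output for this commentary phrase


def map_phrases_to_shots(shots, phrases):
    """Attach each phrase to the first unused shot with the same category."""
    used = set()
    mapping = []
    for phrase in phrases:
        for i, shot in enumerate(shots):
            if i not in used and shot.category == phrase.category:
                used.add(i)
                mapping.append((phrase.text, shot.start_frame, shot.end_frame))
                break
    return mapping


# One "scene" (a single delivery). All labels and frame numbers are invented.
scene_shots = [
    Shot(0, 60, "bowler_runup"),
    Shot(61, 95, "batsman_stroke"),
    Shot(96, 180, "ball_to_boundary"),
]
scene_phrases = [
    Phrase("good length delivery outside off", "bowler_runup"),
    Phrase("driven through the covers", "batsman_stroke"),
]

annotations = map_phrases_to_shots(scene_shots, scene_phrases)
for text, start, end in annotations:
    print(f"frames {start}-{end}: {text}")
```

In this toy version the matching is a simple first-fit on category names; the paper's actual mapping step may work quite differently.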
Gadgets 360 spoke to Professor CV Jawahar about the paper, and he explained that sports websites could use the solution to automate real-time cricket commentary, or to assist reporters in writing it.
“We use readily available data like broadcast videos and Cricinfo commentary as examples for these machine learning methods,” says Jawahar. “To learn such a representation, several examples are needed. A computer program then learns from these examples using machine learning algorithms, and tags parts of the video with these labels.”
The video dataset was collected from the YouTube channel for the Indian Premier League (IPL) tournament, while snippets of commentary were gathered by crawling the Cricinfo commentary for about 300 matches. The scientists’ computer algorithms were able to accurately label a batsman’s cricketing shot by using visual-recognition techniques on an action that lasts a mere 35 frames, or 1.2 seconds.
Exploring the potential innovations or applications that could result from the research, Jawahar said that beyond automated commentary generation, their work also enables enthusiasts and experts to study the game deeply, and search for a specific aspect of the game. According to the paper, the annotation of the videos allows the researchers to build a retrieval system that can search across hundreds of hours of content for specific actions that last only a few seconds. Cricket teams could use the annotation technology to analyse a particular player’s strengths and weaknesses, such as the batting strokes that are effective against a particular team or bowler, or the kinds of deliveries a batsman is weak against.
“For example, one could learn how Rahul Dravid modified his straight drive over years, from automatically annotated video databases,” says Jawahar. “It can also help in game strategy planning for the team management. This can also help in training or coaching emerging players.”
The machine learning techniques can be applied to other sports as well. The team has also attempted commentary generation for tennis, in a paper presented at the British Machine Vision Conference 2015. According to the team, commentary generation for games like cricket and soccer is harder than for tennis, since the events that take place in these sports are more diverse.
Robot journalists are already playing a significant role in mainstream journalism on many news beats, by writing the bulk of the copy, and assisting the desk editor with the copy for the first draft.
For example, the Associated Press tied up with US-based language generation software provider Automated Insights to produce machine-generated stories about college baseball and earnings reports. Another US-based player in this space, Narrative Science, uses its patented artificial intelligence platform, Quill, to mine data for meanings and insights, and automatically generate reports. The automated reports usually lack the colour, quotes, and asides that machines can’t gather, at least for now. But who knows, even that may change in the future.