Robot Soccer Commentary |
Deepayan Chakrabarti,
IInd yr. CSE,
Roll no: 96089,
deepay@iitk.ac.in
Motivation |
The aim of this project is the automatic generation of Natural-Language commentary for a soccer match. In its entirety, this would involve a combination of Graphics, Object Recognition and Natural-Language Processing. In this project, however, I assume that the Image Processing and Object Recognition parts have already been carried out and that their output is available to my program. The heart of the project is therefore the detection of events occurring in the soccer game and the reporting of these events in Natural Language. Given the positions of the players and the ball, the program has to detect events such as passes, intercepts, goals, etc. In addition, the program must be able to infer the intentions of the players to some extent. By intentions, I am referring to events such as tackles, where the player who comes up to tackle may not actually come into direct contact with the ball. The program has to detect these intentions from how a player moves over a period of time. This forms the EVENT RECOGNITION part of the project.
The next part, the COMMENTARY GENERATION part, is to communicate these events to the user. This communication has to be carried out in Natural Language. In fact, the output should be in a sublanguage specific to soccer. This sublanguage must incorporate sentences and phrases which are normally used by soccer commentators, to give the output a real flavour. So the aim of this part is to generate a Natural Language representation of the events which have been detected by the previous part. This representation is context-specific, i.e., the commentary depends on the present circumstances. As an example, if the frequency of a particular event (say, a pass) is high, then different styles and sentences will be used to communicate the event (X passes to Y; X gives it to Y; The ball is with Y now; etc.). This will hopefully prevent the commentary from becoming predictable and boring.
The following is a sample output incorporating many of the important features of the program:
3> Y comes up to tackle B.
4> A runs along the flanks.
6> B gives it to A.
8> A going forward with the ball.
10> Intercepted by X.
11> X going forward with the ball.
13> A coming up from behind.
16> Now, Z.
16> C comes up to tackle Z.
......
Methodology |
The first step in any scene-recognition program is the conversion of low-level vision data to high-level scene information. In the present case, this means that, from an input file containing the positions of the players and the ball, the program has to figure out which player is currently in possession of the ball. The program also has to maintain some history of previous positions; this is useful in the event-recognition part. Given this high-level data, several event-detection modules are run independently. Inside each module is a set of conditions which define that particular event. Whenever an event is detected, the commentary generation part is called. A message string reporting that event is formed and placed in a queue of messages, its position in the queue depending upon its priority. This message queue is eventually printed on the screen, giving the commentary.
As an example:
The two players A and B belong to the same team. In fig (i), the program
sees that A is closest to the ball. In fig (ii), the ball velocity changes
from 0 to some finite value, so someone must have hit the ball. Since A
is closest, the program assumes that A must have hit it. The ball velocity
changes again in fig (iv), implying that B must have touched the ball.
Now, the pass-detection module steps in. Since both players who touched
the ball are of the same team, it must have been a pass. By similar reasoning,
intercepts, tackles, etc. can also be detected.
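The reasoning in this example can be sketched in C roughly as follows. This is only an illustration and not the code from com1.c: the names Vec2, ball_was_touched and classify_touch, the tolerance eps, and the team_of array are all assumptions made for the sketch. Treating a touch by a player of the other team as an intercept is likewise an assumption.

```c
/* Hypothetical types for illustration; com1.c may organise its data differently. */
typedef struct { double x, y; } Vec2;

enum touch_event { EV_NONE, EV_PASS, EV_INTERCEPT };

/* True when two values differ by more than the tolerance eps. */
static int differs(double a, double b, double eps)
{
    double d = a - b;
    return d > eps || d < -eps;
}

/* A change in ball velocity between two frames is taken to mean that
   somebody touched the ball. eps is an assumed noise tolerance. */
int ball_was_touched(Vec2 v_prev, Vec2 v_now, double eps)
{
    return differs(v_prev.x, v_now.x, eps) || differs(v_prev.y, v_now.y, eps);
}

/* Compare the team of the previous toucher with that of the new one.
   team_of[] maps a player index to a team id (assumed encoding);
   last_toucher < 0 means nobody has touched the ball yet. */
enum touch_event classify_touch(const int team_of[], int last_toucher,
                                int new_toucher)
{
    if (last_toucher < 0 || last_toucher == new_toucher)
        return EV_NONE;
    return team_of[last_toucher] == team_of[new_toucher] ? EV_PASS
                                                         : EV_INTERCEPT;
}
```

In the example above, the player closest to the ball at the moment of the velocity change would be passed in as new_toucher.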
The case is somewhat different for detecting events like "overtake",
which may span a long period of time. Here, the event is divided into several
subparts:
a) The ball moves away from the player who last had it.
b) A new player comes closer to the ball than the one who was closest previously.
Each of these subparts is recognised in turn, so that finally the whole
event is recognised.
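One way to recognise such a multi-part event is a small state machine that advances by one stage each time a subpart is seen. The sketch below is a hypothetical illustration of that idea, not the program's actual code: OvertakeTracker, overtake_step and the two flags are assumed names, and the per-frame tests for subparts (a) and (b) are assumed to be computed elsewhere. Resetting the tracker when the original player regains the ball is left out for brevity.

```c
/* Stages of the hypothetical overtake recogniser. */
enum overtake_stage { OT_IDLE, OT_BALL_LEFT_OWNER };

typedef struct {
    enum overtake_stage stage;
} OvertakeTracker;

/* Feed one time frame to the tracker.
   ball_moving_away: subpart (a) holds in this frame.
   closest_changed:  subpart (b) holds in this frame.
   Returns 1 in the frame where the whole event is recognised, else 0. */
int overtake_step(OvertakeTracker *t, int ball_moving_away, int closest_changed)
{
    switch (t->stage) {
    case OT_IDLE:
        if (ball_moving_away)
            t->stage = OT_BALL_LEFT_OWNER;   /* subpart (a) seen */
        return 0;
    case OT_BALL_LEFT_OWNER:
        if (closest_changed) {
            t->stage = OT_IDLE;              /* ready for the next event */
            return 1;                        /* subpart (b) completes it */
        }
        return 0;
    }
    return 0;
}
```

Because subpart (b) is only checked after subpart (a) has been seen, a new closest player on its own does not trigger the event.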
Continuing the pass example, the detected pass triggers a pass-template.
A string such as "A passes to B" is generated and put in a message queue. This
message finally gets communicated to the user.
Results |
The aim of the project is to give a realistic commentary. To this end,
the program recognises several events which are commonly commented upon
in real soccer commentary. The following is a list of these events :
These events form the bulk of the major events in any soccer match.
Hence reporting these events as they occur might be expected to give a
reasonable level of commentary. Also, the program has multiple ways to
report the same sort of event. For example, the program might say "Intercepted
by X" or "X steals the ball" or "A loses the ball to
X" to comment upon the same event. So the commentary doesn't use the
same Natural Language template too often. This prevents the commentary
from becoming repetitive.
Also there is a provision for the user to influence the type of commentary.
Each Natural Language template has some priority attached to it. If we
want the commentary to be of some special quality, we can increase priorities
of the templates which have that quality. This will lead to increased usage
of those templates. Thus we can get a commentary specially flavoured for
us.
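The report does not spell out how priorities translate into usage, so the following is only one plausible reading: a template is chosen at random with probability proportional to its priority. The names Template and pick_template are hypothetical, and the random number is passed in by the caller (for example from rand()) so that the function stays deterministic.

```c
#include <stddef.h>

/* Hypothetical template record for illustration. */
typedef struct {
    const char *text;   /* e.g. "%s steals the ball" */
    int priority;       /* larger value: chosen more often */
} Template;

/* Pick a template index with probability proportional to its priority.
   r is any non-negative random number supplied by the caller.
   Returns -1 if no template has a positive priority. */
int pick_template(const Template *t, size_t n, unsigned r)
{
    unsigned total = 0;

    for (size_t i = 0; i < n; i++)
        if (t[i].priority > 0)
            total += (unsigned)t[i].priority;
    if (total == 0)
        return -1;

    r %= total;                          /* map r into [0, total) */
    for (size_t i = 0; i < n; i++) {
        if (t[i].priority <= 0)
            continue;
        if (r < (unsigned)t[i].priority)
            return (int)i;
        r -= (unsigned)t[i].priority;    /* skip past this template's share */
    }
    return -1;                           /* not reached when total > 0 */
}
```

With priorities 1 and 3, the second template would be picked about three times as often as the first, which matches the idea of raising a priority to get more of a particular flavour.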
An input file begins with a comment line which specifies the type of output
the file produces. The player and ball positions follow, one row for every
time frame, in time order. Each row contains 14 numbers: the (x,y) values
of the 6 players and the (x,y) values of the ball. The field is assumed to be
a 40*20 grid, with the origin at the centre of the field. A sample input file
and help on making input files accompany this report.
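Reading one such row can be sketched as below. This is a minimal illustration rather than the program's get_coords function: Frame, parse_frame and on_field are hypothetical names, the ball is assumed to be the last (x,y) pair in the row, and the 40-unit side of the grid is assumed to lie along x.

```c
#include <stdio.h>

#define NUM_PLAYERS 6
#define VALUES_PER_ROW (2 * NUM_PLAYERS + 2)   /* 14 numbers per time frame */

typedef struct { double x, y; } Point;

typedef struct {
    Point player[NUM_PLAYERS];
    Point ball;                 /* assumed to be the last pair in the row */
} Frame;

/* Parse one row of 14 whitespace-separated numbers into *f.
   Returns 1 on success, or 0 (leaving *f untouched) if fewer than
   14 numbers could be read. */
int parse_frame(const char *line, Frame *f)
{
    double v[VALUES_PER_ROW];
    int pos = 0;

    for (int i = 0; i < VALUES_PER_ROW; i++) {
        int used = 0;
        /* %n records how many characters this conversion consumed */
        if (sscanf(line + pos, "%lf%n", &v[i], &used) != 1)
            return 0;
        pos += used;
    }
    for (int i = 0; i < NUM_PLAYERS; i++) {
        f->player[i].x = v[2 * i];
        f->player[i].y = v[2 * i + 1];
    }
    f->ball.x = v[2 * NUM_PLAYERS];
    f->ball.y = v[2 * NUM_PLAYERS + 1];
    return 1;
}

/* With the origin at the centre of a 40*20 field, x is assumed to run
   from -20 to 20 and y from -10 to 10. */
int on_field(Point p)
{
    return p.x >= -20.0 && p.x <= 20.0 && p.y >= -10.0 && p.y <= 10.0;
}
```

The comment line at the top of the file would have to be skipped by the caller before the first row is parsed.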
The output obtained for the sample input file discussed above is also
available. The number preceding each line is the time stamp. The first
row in the input file represents time=-1 (because it is used in initialization),
so actual commentary can begin from time=0.
A second sample input file demonstrates the detection of overtakes, and is
accompanied by the output it produces.
Thus, given an input file of positions, the program can detect a set
of events and communicate them to the user. The program is able
to read from a file which is concurrently being written to. This allows
the program to be easily linked to a simulator running side by side, thus
allowing the generation of live commentary of the match. As the file is
being written, the program reads from the End-Of-File and gets
the latest positions. So live commentary can be obtained.
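The report does not describe the exact mechanism, but reading from a file that is still growing can be sketched with standard C I/O as below. The function name read_new_line is hypothetical, and the polling approach (try again later when no complete line is available) is an assumption of this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Try to read the next complete line from a file that another process
   may still be appending to.
   Returns 1 when a full line (ending in '\n') was read into buf, or 0
   when no new complete line is available yet. */
int read_new_line(FILE *fp, char *buf, int size)
{
    long start = ftell(fp);         /* remember where this attempt began */

    clearerr(fp);                   /* forget an earlier end-of-file state */
    if (fgets(buf, size, fp) == NULL)
        return 0;                   /* nothing new has been written */

    size_t len = strlen(buf);
    if (len == 0 || buf[len - 1] != '\n') {
        /* The writer is part-way through a row: go back and retry later. */
        fseek(fp, start, SEEK_SET);
        return 0;
    }
    return 1;
}
```

A caller would invoke this once per loop iteration and simply carry on when it returns 0. A row longer than the buffer is never accepted by this sketch, so buf should be sized generously.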
Conclusion |
The program is able to detect and comment on some of the more frequent events occurring in a normal soccer match. Still, the set of detectable events is small compared to the range of different events that take place during the 90 minutes of a soccer game. An obvious (and perhaps never-ending) improvement would be the addition of more event-detection capability. A general approach to such event detection has been discussed in the "Methodology" part. This approach may be extended and used to achieve this aim.
A bigger aim might be the addition of LEARNING. That is, given a model commentary of a soccer match, the program should be able to learn the various events that can occur, and how to detect them. Such a program could continuously improve upon its repertoire of detectable events and, in the long run, may achieve a very high level of commentary generation.
Bibliography |
The following is a list of reference papers on the subject:
Implementation Details |
The program has been written in C. There is only one file containing
the whole program. This file is called "com1.c" and is nearly 700 lines long.
The program has been compiled with GNU C. Along with the standard C header
files such as stdio.h, stdarg.h, etc., the program also requires math.h,
so it must be linked against the math library. Hence the program has to be
compiled by the command:
gcc com1.c -lm
A functional approach has been adopted in programming. The program has
been divided into several functions, each performing a separate subtask.
get_coords: This is the first step. This
function gets the player and ball coordinates from the input file. This
information is stored in a temporary array.
to_GSD: Associated with each
player is a record containing all relevant information about the player (his
x,y position, his present velocity, etc.). The function combines this information
with the incoming data to get the new values for the player records. Some
preliminary processing is also carried out. For example, the player closest
to the ball is detected. Ball positions and other related information are also
updated.
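The kind of processing described for to_GSD might look like the sketch below. It is illustrative only: PlayerRecord, update_record and closest_player are hypothetical names, and velocity is assumed to be the displacement between two consecutive time frames.

```c
/* Hypothetical player record; the real one may hold more fields. */
typedef struct {
    double x, y;     /* current position on the grid */
    double vx, vy;   /* velocity, in grid units per time frame (assumed) */
} PlayerRecord;

/* Combine a record with the incoming position: the velocity becomes the
   displacement since the previous frame, then the position is replaced. */
void update_record(PlayerRecord *p, double new_x, double new_y)
{
    p->vx = new_x - p->x;
    p->vy = new_y - p->y;
    p->x = new_x;
    p->y = new_y;
}

/* Index of the player nearest to the ball, or -1 if n is 0.
   Squared distances are compared, which gives the same ordering as
   true distances without needing sqrt(). Ties go to the lower index. */
int closest_player(const PlayerRecord p[], int n, double ball_x, double ball_y)
{
    int best = -1;
    double best_d2 = 0.0;

    for (int i = 0; i < n; i++) {
        double dx = p[i].x - ball_x;
        double dy = p[i].y - ball_y;
        double d2 = dx * dx + dy * dy;
        if (best < 0 || d2 < best_d2) {
            best = i;
            best_d2 = d2;
        }
    }
    return best;
}
```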
agents:
This is the heart of the program. It contains all the event
detection modules. Each module functions independently. The data generated
by the previous parts is used to recognise events. Each module contains
rules defining a particular event. Rules are basically single conditions
on the position, velocity, etc. of the players and the ball, linked together
by "AND" or "OR" clauses. When an event is detected,
a call is made to the next function:
select:
This function interfaces the event recognition and commentary
generation parts. It decides which Natural Language template is
actually used, and also sees to it that the same event does not get reported
more than once over a period of time.
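Suppressing repeated reports of the same event could be done with a simple cooldown, as in the sketch below. This is an assumption about how such a check might work, not a description of select itself: ReportLog, report_log_init, should_report, NUM_EVENT_TYPES and the cooldown length are all hypothetical, and events are tracked per type only.

```c
#define NUM_EVENT_TYPES 8   /* assumed number of event kinds */

typedef struct {
    int seen[NUM_EVENT_TYPES];        /* has this event ever been reported? */
    int last_time[NUM_EVENT_TYPES];   /* time stamp of the last report */
} ReportLog;

void report_log_init(ReportLog *log)
{
    for (int i = 0; i < NUM_EVENT_TYPES; i++) {
        log->seen[i] = 0;
        log->last_time[i] = 0;
    }
}

/* Returns 1 if the event should be reported at time `now`, i.e. it has
   not been reported within the last `cooldown` time frames, and records
   the report. Returns 0 otherwise, or for an unknown event id. */
int should_report(ReportLog *log, int event, int now, int cooldown)
{
    if (event < 0 || event >= NUM_EVENT_TYPES)
        return 0;
    if (log->seen[event] && now - log->last_time[event] < cooldown)
        return 0;
    log->seen[event] = 1;
    log->last_time[event] = now;
    return 1;
}
```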
add_msg:
This unit actually inserts the required information into the NL templates
to get the text which is to be output as commentary. The message is then
placed in a message queue depending upon its priority.
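Filling a template and placing the result in a priority-ordered queue can be sketched as follows. The names Message, MsgQueue and add_message, the fixed sizes, and the use of %s placeholders are assumptions for the illustration. The templates are assumed to be trusted constants inside the program, since they are used as format strings.

```c
#include <stdio.h>

#define MAX_MSGS 16
#define MSG_LEN  80

typedef struct {
    char text[MSG_LEN];
    int  priority;      /* larger value: reported earlier */
    int  time;          /* time frame in which the event was detected */
} Message;

typedef struct {
    Message item[MAX_MSGS];
    int count;
} MsgQueue;

/* Fill a template containing at most two %s slots with the names a and b,
   and insert the result so that the queue stays sorted by descending
   priority. Messages of equal priority keep their arrival order.
   Returns 1 on success, 0 if the queue is full. */
int add_message(MsgQueue *q, const char *tmpl, const char *a, const char *b,
                int priority, int time)
{
    if (q->count >= MAX_MSGS)
        return 0;

    int pos = q->count;
    while (pos > 0 && q->item[pos - 1].priority < priority) {
        q->item[pos] = q->item[pos - 1];   /* shift less important ones back */
        pos--;
    }
    snprintf(q->item[pos].text, MSG_LEN, tmpl, a, b);
    q->item[pos].priority = priority;
    q->item[pos].time = time;
    q->count++;
    return 1;
}
```

A template with a single slot, such as "Intercepted by %s", can be filled by passing an empty string for b; the unused argument is ignored.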
prune:
If too many messages are generated, this part removes the less important
ones. This unit is also responsible for seeing to it that messages that
do not get communicated (say, for lack of time) die out and are removed from
the queue.
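A pruning step along these lines is sketched below, under stated assumptions: the queue is an array already sorted by descending priority, and the limits max_age and max_keep are hypothetical parameters, as are the names Message and prune_queue. The message text is omitted from the record to keep the sketch short.

```c
/* Reduced message record for illustration (text field omitted). */
typedef struct {
    int priority;   /* larger value: more important */
    int time;       /* time frame in which the event was detected */
} Message;

/* Drop messages older than max_age frames, then, if more than max_keep
   remain, drop the least important ones from the tail.
   q[] is assumed to be sorted by descending priority.
   Returns the new number of messages in q[]. */
int prune_queue(Message q[], int count, int now, int max_age, int max_keep)
{
    int kept = 0;

    for (int i = 0; i < count; i++)
        if (now - q[i].time <= max_age)
            q[kept++] = q[i];     /* compaction preserves the ordering */

    if (kept > max_keep)
        kept = max_keep;          /* the tail holds the lowest priorities */
    return kept;
}
```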
give_msg: This
unit delivers the message to the user. After outputting a message, it is
removed from the queue.
The whole process is again repeated. Thus, all these functions are run continuously in a loop.
Extensions:
I have mentioned earlier that the major area of extension would be the addition of learning to the system. I would recommend an approach that has been used in some traffic description systems. Here, the program is initially given a large set of variables related to the system. Over time, the program tries to find the variables which remain invariant, or change in a well-defined manner, during the course of an event. Once such variables are detected, they can be used to define the event. Later on, when the program sees that a particular set of variables has not changed for some time, the corresponding event can be inferred.
Another obvious area of improvement is to increase the event-detection
capability of the program. More event-recognition modules may be added,
using the same approach as used in the program. Multiple templates describing
the same event may also be developed, leading to an increase in commentary quality.