Background Theory

Visual servoing closes the perception-action loop of a robot, so the background theory it requires spans both computer vision and robotics. A short description of each area involved is presented below. The emphasis is on how each topic fits into visual servoing, not on a detailed and exhaustive treatment.

COMPUTER VISION

I. Camera Calibration

A camera can be thought of as transforming the 3D “real world” into a 2D “image”. Obtaining the transformation that takes us from 3D to 2D is called camera calibration. The simplest and most widely used model for this is the pinhole camera model, given by:

    s [u, v, 1]^T = K [X, Y, Z]^T,   K = [fx 0 cx; 0 fy cy; 0 0 1]

This model shows how to obtain the point in the image (u,v) given the point in the 3D world (X,Y,Z), expressed in the camera frame. The relation between the two is made through the 3×3 intrinsic camera matrix K, which holds the focal lengths (fx, fy) and the principal point (cx, cy). This formulation is important in deriving the most basic feature used in image-based visual servoing – a point in the image.
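A minimal sketch of this projection in Python, assuming NumPy; the intrinsic values below are made-up numbers, not those of any particular camera:

    import numpy as np

    # Intrinsic camera matrix K (focal lengths and principal point are made-up)
    K = np.array([[800.0,   0.0, 320.0],
                  [  0.0, 800.0, 240.0],
                  [  0.0,   0.0,   1.0]])

    # A 3D point expressed in the camera frame, in metres
    P = np.array([0.1, -0.05, 1.0])

    # Pinhole projection: s * [u, v, 1]^T = K * [X, Y, Z]^T
    uvw = K @ P
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
    print(u, v)  # pixel coordinates of the projected point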

II. Pose Estimation

Another important area in computer vision is pose estimation. A pose is composed of the translation and rotation of an object with respect to a reference frame (for example the camera frame). Pose estimation algorithms take the input image (2D) and extract the pose (3D) of a known object. Examples of well-known and “classical” pose-estimation algorithms are:

Model-based object pose in 25 lines of code
Dementhon, D. and Davis, L.
International Journal of Computer Vision, 1995, Vol. 15, Issue 1

Three-dimensional object recognition from single two-dimensional images
Lowe, D.
Artificial Intelligence, 1987

Many newer pose-estimation methods have since been developed, but the main idea remains the same: extracting 3D pose information from the 2D image. The 3D pose estimate of the object is the basis of the features used in the control law of position-based visual servoing.
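As an illustration of the idea (not of the two papers above), OpenCV’s solvePnP recovers an object’s pose from known 3D model points and their 2D image detections; every number below is made up:

    import numpy as np
    import cv2

    # 3D model points of a known object, in the object frame (a made-up 10 cm square)
    object_points = np.array([[-0.05, -0.05, 0.0],
                              [ 0.05, -0.05, 0.0],
                              [ 0.05,  0.05, 0.0],
                              [-0.05,  0.05, 0.0]])

    # Their detected 2D projections in the image (made-up pixel coordinates)
    image_points = np.array([[300.0, 220.0],
                             [340.0, 222.0],
                             [338.0, 260.0],
                             [298.0, 258.0]])

    # Intrinsic camera matrix from calibration (made-up values)
    K = np.array([[800.0,   0.0, 320.0],
                  [  0.0, 800.0, 240.0],
                  [  0.0,   0.0,   1.0]])

    # rvec/tvec are the rotation (axis-angle) and translation of the object
    # in the camera frame
    ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
    R, _ = cv2.Rodrigues(rvec)  # 3x3 rotation matrix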

III. Feature Detection and Tracking

Another important part of computer vision needed in visual servoing is the ability to detect the features in the image that will be used. After detection, tracking methods follow the feature’s location throughout the servo. Feature detection and tracking can be as simple as detecting a single interest point and tracking it throughout the servo; it may also require detecting and tracking an arbitrary object (currently an active area of research in computer vision). Below are references to some “classic” and well-known detection and tracking algorithms, followed by a short code sketch that combines two of them:

An Iterative Image Registration Technique with an Application to Stereo Vision
Lucas, B. and Kanade, T.
International Joint Conference on Artificial Intelligence, 1981, pages 674-679

Detection and Tracking of Point Features
Tomasi C. and Kanade T.
Carnegie Mellon University Technical Report CMU-CS-91-132, April 1991.

Good Features to Track
Shi J. and Tomasi, C.
IEEE Conference on Computer Vision and Pattern Recognition, 1994, pages 593-600
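The sketch below chains the last two references through their OpenCV implementations: Shi–Tomasi corners from goodFeaturesToTrack are followed into the next frame with pyramidal Lucas–Kanade optical flow. The frame file names are placeholders:

    import cv2

    # Two consecutive grayscale frames (file names are placeholders)
    prev = cv2.imread('frame0.png', cv2.IMREAD_GRAYSCALE)
    curr = cv2.imread('frame1.png', cv2.IMREAD_GRAYSCALE)

    # Detect “good features to track” (Shi-Tomasi corners)
    pts = cv2.goodFeaturesToTrack(prev, maxCorners=100, qualityLevel=0.01,
                                  minDistance=10)

    # Track them into the next frame with pyramidal Lucas-Kanade optical flow
    new_pts, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None)

    # Keep only the points that were tracked successfully
    tracked = new_pts[status.ravel() == 1]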

ROBOTICS

I. Forward Kinematics

Forward kinematics takes as input the joint positions and outputs the pose of the end-effector. This is done systematically using the Denavit-Hartenberg convention: each link gets its own coordinate frame, and the homogeneous transforms between consecutive frames are chained together. Forward kinematics isn’t directly used in visual servoing itself; however, when the end-effector is not in the field of view of the vision system, it can still provide the end-effector’s location (see the sketch below).
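A minimal sketch of that chaining, for a hypothetical planar 2-link arm; the link lengths and joint angles are made-up values:

    import numpy as np

    def dh_transform(theta, d, a, alpha):
        """Homogeneous transform of one link from its Denavit-Hartenberg parameters."""
        ct, st = np.cos(theta), np.sin(theta)
        ca, sa = np.cos(alpha), np.sin(alpha)
        return np.array([[ct, -st * ca,  st * sa, a * ct],
                         [st,  ct * ca, -ct * sa, a * st],
                         [ 0,       sa,       ca,      d],
                         [ 0,        0,        0,      1]])

    # Hypothetical planar 2-link arm; DH rows are (theta, d, a, alpha)
    q = [0.3, 0.5]                    # joint angles in radians (made-up)
    links = [(q[0], 0.0, 0.4, 0.0),   # link lengths 0.4 m and 0.3 m (made-up)
             (q[1], 0.0, 0.3, 0.0)]

    # Chain the link transforms to get the end-effector pose in the base frame
    T = np.eye(4)
    for params in links:
        T = T @ dh_transform(*params)
    print(T[:3, 3])                   # end-effector position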

II. Inverse Kinematics

Opposite to forward kinematics, inverse kinematics takes as input the end-effector pose and outputs the joint positions that achieve this pose. This is the more interesting formulation, since most problems in robotics specify where we want the end-effector to be (e.g. for grasping), while robots are ultimately controlled joint by joint (although commercial models abstract away this layer of control). The same inverse-kinematics methods that transform a given 3D pose into joint space can be used in visual servoing.
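For the same hypothetical planar 2-link arm used above, the inverse problem even has a closed-form solution. This is only a sketch of the idea; an arm like REEM’s, with 7 joints, needs numerical or redundancy-aware methods:

    import numpy as np

    def ik_2link(x, y, l1=0.4, l2=0.3):
        """Closed-form IK of a planar 2-link arm (returns one of the two solutions)."""
        c2 = (x**2 + y**2 - l1**2 - l2**2) / (2 * l1 * l2)
        if abs(c2) > 1.0:
            raise ValueError("target out of reach")
        q2 = np.arccos(c2)
        q1 = np.arctan2(y, x) - np.arctan2(l2 * np.sin(q2), l1 + l2 * np.cos(q2))
        return q1, q2

    # Joint angles that place the end-effector at the made-up target (0.5, 0.2)
    q1, q2 = ik_2link(0.5, 0.2)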

III. Robot Jacobian and its Inverse

The robot Jacobian is a matrix obtained by taking the time derivative of the forward kinematics equation. It therefore relates the joint velocities to the 3D velocity of the end-effector:

v = J(q) qdot

In the equation above, v is an n-vector, with n being the number of degrees of freedom of the robot. For example, for a 6-DOF robot, v contains the velocity of each translation (X,Y,Z) and rotation (roll, pitch, yaw) component. qdot is an m-vector, with m being the number of joints of the robot; it represents the joint velocities (rad/sec for a revolute joint and m/sec for a prismatic joint). For example, REEM’s arm has 7 revolute joints, so qdot is a 7-vector containing each joint’s rotational velocity. Finally, the robot Jacobian J(q) is an n×m matrix. It should be noted that the Jacobian is a function of q (the current joint position), as the notation J(q) denotes: its values depend on the configuration of the robot.
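For the hypothetical planar 2-link arm sketched earlier, J(q) can be written out by differentiating the forward kinematics x = l1 cos(q1) + l2 cos(q1+q2), y = l1 sin(q1) + l2 sin(q1+q2) with respect to the joints (here n = m = 2; all numbers are made up):

    import numpy as np

    def jacobian_2link(q1, q2, l1=0.4, l2=0.3):
        """Planar 2-link Jacobian: maps (q1dot, q2dot) to the end-effector (vx, vy)."""
        s1, c1 = np.sin(q1), np.cos(q1)
        s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
        return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                         [ l1 * c1 + l2 * c12,  l2 * c12]])

    qdot = np.array([0.1, -0.2])          # joint velocities in rad/sec (made-up)
    v = jacobian_2link(0.3, 0.5) @ qdot   # resulting end-effector velocity

Evaluating jacobian_2link at a different q gives different entries, which is exactly the configuration dependence noted above.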

In practice, however, it is usually the 3D velocity that is known, and we want the joint velocities necessary to achieve it. A popular way to obtain them is the Moore–Penrose pseudo-inverse. Computing it through the SVD is particularly convenient, since the singular values also reveal whether a singularity is near. A singularity is a specific configuration in which the robot loses the ability to move in a particular degree of freedom (problematic if not taken into account).
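A minimal sketch of this inversion, assuming NumPy; the Jacobian, the desired velocity, and the singularity threshold below are all made-up numbers:

    import numpy as np

    def joint_velocities(J, v, sigma_min=1e-3):
        """Solve v = J qdot for qdot with the Moore-Penrose pseudo-inverse.
        The singular values from the SVD also reveal proximity to a singularity."""
        s = np.linalg.svd(J, compute_uv=False)
        if s.min() < sigma_min:
            print("warning: near a singularity (smallest singular value = %g)" % s.min())
        return np.linalg.pinv(J) @ v

    # Made-up 2x2 Jacobian and desired end-effector velocity
    J = np.array([[-0.5, -0.2],
                  [ 0.6,  0.3]])
    qdot = joint_velocities(J, np.array([0.05, 0.0]))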

The robot Jacobian and its inverse are used extensively in the formulation of visual servoing. Furthermore, another type of Jacobian is introduced, the image Jacobian, which works together with the “classical” robot Jacobian. For information on this and the theory involved in the different visual servoing methods, go to the page Visual Servoing Methods.
