Evaluation of lens distortion


Published on Tuesday, July 9, 1996 by Gideon Ariel

Evaluation of Lens Distortion Errors in Video-Based Motion Analysis

ABSTRACT

Video-based motion analysis systems are widely used to study human movement. These systems use computers to aid in the capturing, storing, processing, and analyzing of video data. One of the errors inherent in such systems is that caused by distortions introduced by the camera and lens. Wide-angle lenses are often used in environments where there is little room to position cameras to record an activity of interest. Wide-angle lenses distort images in a somewhat predictable manner. Even “standard” lenses tend to have some degree of distortion associated with them. These lens distortions will introduce errors into any analysis performed with video-based motion analysis systems.

The purposes of this project were:

  • 1. Develop the methodology to evaluate errors introduced by lens distortion.
  • 2. Quantify and compare errors introduced by use of both a “standard” and a wide-angle lens.
  • 3. Investigate techniques to minimize lens-induced errors.
  • 4. Determine the most effective use of calibration points when using a wide-angle lens with a significant amount of distortion.

A grid of points of known dimensions was constructed and videotaped using two common lenses (a standard and a wide-angle lens). Recorded images were played back on a VCR. A personal computer was equipped to grab and store the images on disk. Using these stored images, two experiments were conducted. For the first experiment, three operators (subjects) each digitized all points in the grid twice. For the second experiment, the digitized grid of points from one of the three operators was re-processed using six sets of calibration points, each at a different location in the image. Errors were calculated as the distance between the known coordinates of each point and its calculated coordinates. It was seen that when using a wide-angle lens, errors from lens distortion could be as high as 10% of the size of the entire field of view. Even with a standard lens, there was a small amount of lens distortion. It was also found that the choice of calibration points influenced the lens distortion error. By properly selecting the calibration points and avoiding the outermost regions of a wide-angle lens, the error from lens distortion can be kept below approximately 0.5% with a standard lens and 1.5% with a wide-angle lens.

INTRODUCTION

Video-based motion analysis systems are widely used to study human movement. These systems use computers to aid in the capturing, processing, and analyzing of video data. The process of analyzing video data includes performing a calibration by identifying several points of known coordinates on the recording media. The analysis algorithm uses these points to create a linear mapping from the video images to actual coordinates.

As with any data acquisition system, it is of interest to scientists and engineers to determine the accuracy and reliability of a particular system. Motion analysis systems have many possible sources of error inherent in the hardware, such as the resolution of recording, viewing and digitizing equipment, and lens imperfections and distortions. Software errors include those caused by rounding and interpolation. In addition, there are errors which are introduced by the use of the system, such as inaccurate or incomplete calibration information, placement of cameras relative to the motions being investigated, and placement of markers at points of interest. Other errors include obscured points of interest and limited video sampling rates.

Because of space limitations during certain applications of motion analysis, wide-angle lenses are often used. The central region of a wide-angle lens is similar to that of a standard lens; however, the periphery of a wide-angle lens is shaped to allow a larger field of view. The end result is that the image from a wide-angle lens is distorted, especially in the periphery. See Figure 1 for a demonstration of the distortion associated with a wide-angle lens. This type of distortion is referred to as “barrel” distortion. Image distortions will introduce errors into any analysis performed with video-based motion analysis systems. It is therefore of interest to determine how great this error may be and in what region of a lens it is sufficiently small.

Results from analysis of video data are highly dependent on the accuracy of the calibration procedure. Hence, when using a wide-angle lens, the location of the calibration (control) points in the image is important. Typically, the points are chosen to enclose the entire region in which there will be data to analyze. However, with wide-angle lenses, it is expected that errors are greater further away from the center of the image. Perhaps points near the center should then be used as the control points.

PURPOSE

The specific purposes of this project were:

  • 1. Develop a methodology to evaluate errors introduced by lens distortion.
  • 2. Quantify and compare errors introduced by use of both a “standard” and a wide-angle lens.
  • 3. Investigate techniques to minimize errors induced by lens distortion.
  • 4. Determine the most effective use of calibration points when using a lens with a significant amount of distortion.

METHODS

Data Collection

A grid was constructed with thin, black, vertical and horizontal lines spaced 3.8 cm (1.5 in.) apart on a white background. The total grid size was 53.3 x 38.1 cm (21.0 x 15.0 in.). The intersections of the eleven horizontal and fifteen vertical lines defined a total of 165 points (Figure 2). The grid was mounted on a sheet of foam core and attached to a wall. The center point of the grid was marked for easy reference. A Quasar camcorder (model VM-37) was placed on a camera stand perpendicular to the grid, with the center of the lens aligned with the center of the grid.
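The grid geometry described above can be sketched in a few lines. This is only an illustration: the function name `grid_points`, the use of millimeters, and the choice of the center point as the origin are assumptions, not details taken from the original processing.

```python
SPACING_MM = 38.1  # 1.5 in. between adjacent grid lines

def grid_points(n_cols=15, n_rows=11, spacing=SPACING_MM):
    """Return (x, y) coordinates of every grid intersection, origin at the center."""
    half_cols = (n_cols - 1) // 2  # 7 grid units to the left and right
    half_rows = (n_rows - 1) // 2  # 5 grid units above and below
    return [
        (i * spacing, j * spacing)
        for j in range(-half_rows, half_rows + 1)
        for i in range(-half_cols, half_cols + 1)
    ]

points = grid_points()
print(len(points))                 # number of intersections
print(max(x for x, _ in points))   # largest X coordinate in mm
print(max(y for _, y in points))   # largest Y coordinate in mm
```

With fifteen vertical and eleven horizontal lines this yields 165 points, extending 7 grid units (about 266.7 mm) horizontally and 5 grid units (about 190.5 mm) vertically from the center.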

Two lenses were used to record video. The first was a standard 1:1.4 lens. The other was a 0.5X wide-angle lens. The camcorder’s zoom feature was not utilized, thus allowing for the greatest possible viewing area. For each lens, the camera was positioned at a distance from the grid such that the grid almost completely filled the field of view, paying special attention to the left and right borders. For the standard lens this distance was 88.3 cm (34.8 in.); for the wide-angle, 50.2 cm (19.8 in.).

Data collection consisted of videotaping the grid with each of the two lenses. The lens type, distance from the grid to the camera and camcorder settings were identified by voice on the tape. Approximately fifteen seconds of video were recorded with each lens.

Data Analysis

An Ariel Performance Analysis System (APAS) was used to process the video data. Recorded images were played back on a VCR. A personal computer was equipped to grab and store the images on disk. Several frames were chosen from the recording and saved, as per APAS requirements. From these, analyses were performed on a single frame for each lens. Using these stored images, two experiments were conducted.

Experiment I

For the first experiment, three operators (subjects) each digitized all points in the grid twice. Note that here “digitizing” refers to the process of the operator identifying the location of points of interest in the image with the use of a mouse-driven cursor. Elsewhere, “digitizing” is often used to refer to the process of grabbing an image from video format and saving it in digital format on the computer. The subjects for this study had varying degrees of expertise in the digitizing process. Digitizing and subsequent processing resulted in X and Y coordinates for the points. Because of the large number of points (165) being digitized, the grid was subdivided into separate regions for purposes of digitizing and analysis. Figure 3 illustrates this subdivision.

For this experiment, the four points nearest the center of the grid were used as the control points (points marked “1” in Figure 2). These were chosen because it was anticipated that errors would be smallest near the center of the image. Using control points which were in the distorted region of the image would further complicate the results. The control points were digitized and their known coordinates were used to determine the scaling from screen units to actual coordinates. With the origin at the center of the grid, these coordinates ranged from approximately -266.7 to +266.7 mm in the X direction and from approximately -190.5 to +190.5 mm in the Y direction. To remove the dependence of the data on the size of the grid, normalized coordinates were calculated by dividing the calculated X coordinates by 266.7 mm and the Y coordinates by 190.5 mm. Thus, coordinates in both the X and Y directions ranged approximately from -1 to +1, and were dimensionless.
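The scaling and normalization steps can be illustrated with a minimal sketch. The text does not describe the algorithm APAS actually uses, so the per-axis least-squares scale and offset below are an assumption, as are the function names and the screen-unit values of the control points.

```python
def fit_axis_scaling(screen, known):
    """Least-squares scale and offset mapping one screen axis to real units."""
    n = len(screen)
    mean_s = sum(screen) / n
    mean_k = sum(known) / n
    sxx = sum((s - mean_s) ** 2 for s in screen)
    sxy = sum((s - mean_s) * (k - mean_k) for s, k in zip(screen, known))
    scale = sxy / sxx
    return scale, mean_k - scale * mean_s

# Hypothetical digitized control points (screen units) at +/-1 grid unit (38.1 mm).
screen_x = [340, 340, 300, 300]
screen_y = [260, 220, 260, 220]
known_x = [38.1, 38.1, -38.1, -38.1]
known_y = [38.1, -38.1, 38.1, -38.1]

scale_x, offset_x = fit_axis_scaling(screen_x, known_x)
scale_y, offset_y = fit_axis_scaling(screen_y, known_y)

def to_normalized(sx, sy):
    """Convert a digitized screen location to dimensionless coordinates."""
    x_mm = scale_x * sx + offset_x
    y_mm = scale_y * sy + offset_y
    return x_mm / 266.7, y_mm / 190.5

print(to_normalized(460, 340))  # a point at the upper-right corner of the grid
```

Because the scale is derived from four points close to the center, any small error in digitizing them is carried, and amplified, into every point farther out.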

Experiment II

For the second experiment, the digitized grid of points from one of the three operators was reprocessed using six sets of control points. For the first condition, the control points were at +/- 1 grid units in the X and Y directions from the center of the grid (i.e., {1,1}, {1,-1}, {-1,1}, and {-1,-1}). For the other conditions, the control points were at 2, 3, 4, and 5 grid units. A final condition was with the control points furthest from the center (7 grid units in X, 5 in Y). A graphical display of these locations is shown in Figure 2.
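The six calibration conditions can be enumerated as follows. This is a sketch only: the names `control_points` and `CONDITIONS` are illustrative, and the coverage figures are simply the control point distance divided by the 7-unit half-width and 5-unit half-height of the grid.

```python
def control_points(x_units, y_units, spacing_mm=38.1):
    """Four control points placed symmetrically about the grid center."""
    return [(sx * x_units * spacing_mm, sy * y_units * spacing_mm)
            for sx in (1, -1) for sy in (1, -1)]

# (X units, Y units) from the center for each of the six conditions.
CONDITIONS = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (7, 5)]

for x_units, y_units in CONDITIONS:
    # Fraction of the half-width (7 units) and half-height (5 units) covered.
    print(f"{x_units}x{y_units}: {x_units / 7:.0%} of half-width, "
          f"{y_units / 5:.0%} of half-height")
```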

For all trials, the error for each digitized point was calculated as the distance between the known coordinates of the point and its calculated coordinates.

RESULTS

Experiment I

The raw data from the standard and wide-angle lenses from the first experiment are shown in Appendix A in Figures A-1 and A-2, respectively.

The data are presented as graphs of the calculated normalized coordinates of points. Grid lines on the graphs correspond to the grid lines which were videotaped. Each graph contains the data from the two trials from one of the subjects. Note the barrel distortion evident in the wide-angle lens. Even the standard lens exhibits noticeable errors.

For each lens/subject combination, the calculated X and Y coordinates (unnormalized) of each point from the two trials were averaged. The error of each point was calculated as the distance between the calculated average location and the known location of that point. These error values were then normalized by calculating them as a percent of the maximum coordinate in the horizontal direction (26.67 cm). This dimension was chosen arbitrarily to be representative of the size of image. Figures A-3 and A-4 are contour plots of the error as a function of the normalized X-Y location in the image for each of the subjects. Each graph presents the data for the average of the two trials for one of the three subjects. See Appendix B for a description of how to interpret contour plots. Note that with both lenses it was clear that errors were small near the center of the image and became progressively greater further away from the center. Also, an apparent discontinuity existed along the lines of the grid subdivision (Figure 3). This was most likely a result of the control points being redigitized for each individual section; a small error in the control point digitization would be multiplied for points further away from the center.
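The error calculation described above amounts to averaging the two trials, taking the distance to the known location, and expressing it as a percent of 26.67 cm. A minimal sketch follows; the function name and the sample coordinates are made up for illustration and are not data from the study.

```python
import math

REFERENCE_MM = 266.7  # maximum horizontal coordinate (26.67 cm)

def percent_error(known, trial_1, trial_2, reference=REFERENCE_MM):
    """Average two digitizing trials, then express the position error as a percent."""
    avg_x = (trial_1[0] + trial_2[0]) / 2
    avg_y = (trial_1[1] + trial_2[1]) / 2
    distance = math.hypot(avg_x - known[0], avg_y - known[1])
    return 100 * distance / reference

# Hypothetical point at 3 grid units in X and 2 in Y, digitized twice.
known = (114.3, 76.2)
error = percent_error(known, (116.3, 81.2), (118.3, 79.2))
print(f"{error:.2f}%")
```

In this made-up case the averaged location is 5 mm from the known location, which is a little under 2% of the reference dimension.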

Another quantitative way of viewing this data was to examine how the error varied as a function of the radial distance from the center of the image. This distance was normalized by dividing by the maximum coordinate in the horizontal direction (26.67 cm). Figures A-5 and A-6 present this data for the average of the two trials from each subject for the standard and wide-angle lenses, respectively. In addition, coordinates and errors for all three subjects were averaged for each lens. Graphs of these average errors as a function of radial distance from the center of the screen are shown in Figures A-7 and A-8.
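As an illustration of the normalized radial distance, the sketch below works in grid units, which is equivalent to dividing by 26.67 cm because the half-width of the grid is 7 units of 3.81 cm. The function name and the grid-unit indexing are assumptions made for this example.

```python
import math

HALF_WIDTH_UNITS = 7  # grid units from the center to the left/right edge

def normalized_radius(i, j, half_width=HALF_WIDTH_UNITS):
    """Radial distance of grid point (i, j) from the center, as a fraction of the half-width."""
    return math.hypot(i, j) / half_width

print(normalized_radius(0, 0))   # center of the image
print(normalized_radius(7, 0))   # middle of the right edge
print(normalized_radius(7, 5))   # corner of the grid
```

Note that the normalized distance exceeds 1 toward the corners of the grid, reaching about 1.23 at the corner points.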

Linear and binomial regressions were then fit to the averaged data for each subject. The linear fit was of the form:

Error = A_0 + A_1 R

where R was the radial distance from the center of the image (normalized), and A_0 and A_1 were the coefficients of the least-squares fit. The binomial fit was of the form:

Error = B_0 + B_1 R + B_2 R^2

where B_0, B_1, and B_2 were the coefficients of the fit. The results of these least-squares fits are presented in Table 1 below. The columns labelled “RC” are the squares of the statistical regression coefficients (r^2). Note that the rows labelled “avg” represent the regressions from the data averaged across all subjects and are not the average of the individual coefficients.
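The two regressions can be reproduced with any least-squares routine. The sketch below is a plain-Python version using the normal equations; it is not the software used in the study, the helper names `polyfit` and `r_squared` are illustrative, and the data are synthetic values generated from a known second-order curve rather than values from Table 1.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] via the normal equations."""
    n = degree + 1
    # Build the augmented normal-equation matrix [A^T A | A^T y].
    m = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [a - factor * b for a, b in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(xs, ys, coeffs):
    """Coefficient of determination for a fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
                 for x, y in zip(xs, ys))
    return 1 - ss_res / ss_tot

# Synthetic data: error = 0.2 + 0.1 R + 1.5 R^2 (not measurements from the study).
radius = [0.0, 0.25, 0.5, 0.75, 1.0]
error = [0.2 + 0.1 * r + 1.5 * r ** 2 for r in radius]

linear = polyfit(radius, error, 1)      # [A_0, A_1]
quadratic = polyfit(radius, error, 2)   # [B_0, B_1, B_2]
print(linear, r_squared(radius, error, linear))
print(quadratic, r_squared(radius, error, quadratic))
```

On curved data such as this, the second-order fit recovers the generating coefficients and gives a higher r^2 than the straight line, which is the kind of comparison the “RC” columns support.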

Experiment II

The raw data for the second experiment, in which the control points were varied, are shown in Figures A-9 and A-10 for the standard and wide-angle lenses, respectively. The data are presented as graphs of the calculated coordinates of points. Grid lines on the graphs correspond to the grid lines which were videotaped. Each graph contains the data averaged from the two trials. Locations of the calibration points are indicated on the graphs and are identified by a pair of numbers describing the control point locations (i.e., 1×1, 2×2, 3×3, 4×4, 5×5, and 7×5).

Figures A-11 and A-12 are contour plots of the percent error as a function of the normalized actual X-Y location in the image for each of the calibration conditions. Figures A-13 and A-14 display how the error varied as a function of the normalized radial distance from the center of the image for the two lenses. Third-order polynomial regressions were fit to the averaged data for each calibration condition. The cubic fit was of the form:

Error = C_0 + C_1 R + C_2 R^2 + C_3 R^3

where R was the normalized radial distance from the center of the image, and C_0, C_1, C_2, and C_3 were the coefficients of the least-squares fit. The results of these least-squares fits are presented in Table 2 below. The values in the column labelled RC are squares of the regression coefficients. Finally, Figures A-15 and A-16 present these regression curves combined into single graphs for the standard and wide-angle lenses.

DISCUSSION

When reviewing these results, several points need to be noted. First, this study utilized a two-dimensional analysis algorithm. Only four calibration points were used to define the scaling from screen coordinates to actual coordinates. The use of more than four points would likely result in smaller errors. Second, all coordinates and calculated errors were normalized to dimensions of the image. Although there were many possibilities for the choice of dimension (e.g., horizontal, vertical or diagonal image size, maximum horizontal/vertical/diagonal coordinate; average of horizontal and vertical image size or maximum coordinate; etc.), the dimension used to normalize was felt to best represent the image size.

It is clear from these data that errors do exist when analyzing video data. It is also evident that these errors arise from a number of sources. There seemed to be a large amount of “random noise” introduced by the act of digitizing. The same point digitized by different people, or by the same person a number of times, exhibited results that varied non-systematically (Figures A-1 and A-2). This error can most likely be attributed to the act of digitizing. There are factors which limit the ability to correctly digitize the location of a point, such as: the point being more than 1 pixel in either or both dimensions, irregularly shaped points, a blurred image, shadows, etc. Because of these factors, it was often a subjective decision as to where to position the cursor when digitizing. There appeared to be more consistency within a single subject digitizing multiple times than between subjects. Since this error was expected to be essentially random, there was justification for using the averaged values for each subject for other analyses.

In Experiment I of this study, two types of regressions were fit to the data: linear and binomial (Table 1). The interpretation of the coefficients of the linear regression can provide insight into the data. A_1, the slope of the error-distance relation, represents the sensitivity of the error to the distance from the origin. Thus it is a measure of the lens distortion. A_0, the intercept of the linear relation, can be interpreted as the error at a distance of zero. If the relation being modeled were truly linear, this would be related to the random error not accounted for by lens distortion. However, in this case, it is not known if the error-distance relation is linear. The RC values give an indication of how good the fit was. By this measure, the wide-angle lens had a better fit than the standard lens (0.77 vs. 0.41). This further suggests that the errors with the standard lens were more “random” than with the wide-angle lens.

The binomial curve fits seemed to more correctly represent the data; however, the interpretation of these coefficients is not very straightforward. Similarly, in Experiment II of the study, the data seemed to be best represented by a cubic relation.

From Experiment I, it was seen that error from both lenses was directly related to the distance from the center of the image (Figures A-7, A-8; Table 1). This result in the standard lens was somewhat surprising. A more uniform and random error was expected from this lens. It was believed that these errors were more a consequence of the choice of control points than of lens distortion. Since the control points define the scaling factor between the image on the screen and real units, even a small error in the digitization of these points will be magnified when used to transform points further away from the center of the image. From the results of Experiment II, it is apparent that there is some systematic error that is a function of distance from the center of the image. Hence, even the standard lens has some degree of distortion. From Figures A-15 and A-16, it is clear that the choice of calibration points strongly influenced the magnitude of the error as well as the distribution of errors over the screen for both lenses. Some trends that can be noted from these graphs are:

  • 1. Errors at the center of the image were relatively small. In most cases, as the distance from the center increased, the error magnitude increased to a peak, followed by a minimum near the control points, and then increased again leading to the furthest points from the center.
  • 2. Errors furthest away from the center tended to decrease as the control points were moved outwards.
  • 3. The location and the magnitude of the first maximum increased as a function of the control point distance from the center.

The absolute maximum error was the smallest for both lenses when the control points were at 5 by 5 grid units. With this setup, the maximum errors (as estimated from the cubic regression) were approximately 0.75% and 1.5% for standard and wide-angle lenses, respectively.

Recall that the total grid size was 7 by 5 grid units. Thus, the control points in this case were located at the top and bottom of the image or 100% of the vertical field of view, and approximately 71% of the horizontal field of view. With improper choice of calibration points, error introduced by lens distortion may be as great as 3% in a standard lens, and 6% in a wide-angle lens.

If the region of the image greater than half the horizontal image distance (normalized R greater than 1) from the center were not considered, errors could be kept even smaller. Figure 4 below displays the shape of this region.

With this, the absolute maximum error was the smallest when the control points were at four grid units from the center of the screen for both lenses. In this case, the maximum errors as estimated from the cubic regressions were approximately 0.5% for the standard lens and 1.5% for the wide-angle lens. Thus, the control points in this case were located at approximately 80% of the distance from the center to the top and bottom of the image, and 57% of the distance from the center to the left and right sides.
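To illustrate how a maximum error can be read off a cubic regression, and how excluding the region beyond normalized R = 1 can lower it, the sketch below scans a cubic over two ranges of R. The coefficients are hypothetical, chosen only to reproduce the rise, dip, and rise shape described above; they are not values from Table 2.

```python
def cubic(r, c0, c1, c2, c3):
    """Evaluate Error = C_0 + C_1 R + C_2 R^2 + C_3 R^3."""
    return c0 + c1 * r + c2 * r ** 2 + c3 * r ** 3

def max_error(coeffs, r_max, steps=1000):
    """Scan R from 0 to r_max and return (R at the maximum, maximum error)."""
    best_r, best_e = 0.0, cubic(0.0, *coeffs)
    for k in range(1, steps + 1):
        r = r_max * k / steps
        e = cubic(r, *coeffs)
        if e > best_e:
            best_r, best_e = r, e
    return best_r, best_e

# Hypothetical coefficients (percent error), NOT taken from the study's tables.
coeffs = (0.1, 3.0, -5.0, 2.5)

r_full, e_full = max_error(coeffs, 1.23)   # whole grid, out to the corners
r_inner, e_inner = max_error(coeffs, 1.0)  # region with normalized R <= 1 only
print(r_full, e_full)
print(r_inner, e_inner)
```

For these made-up coefficients the largest error over the whole grid occurs at the corners, while within R of 1 the largest error is the smaller interior peak.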

Applications

This information would be useful in research environments where motion analyses are actually performed. For any given application, there are several factors which must be considered in choosing a camera and lens arrangement, including:

  • size of volume in which activity will take place
  • location within the volume in which a majority of the action will take place
  • distance available from location of activity to camera
  • maximum acceptable error.

In general, one will want to have the camera positioned in such a way that the volume of space in which the activity will take place fills the total lens image as much as possible. This position provides the highest degree of resolution. Recall that in the arrangement used in this study, the grid of points was 53.3 X 41.9 cm. With the standard lens, the grid almost completely filled the screen when the camera was 88.3 cm away from the grid. Similar results will be obtained any time the ratio of the camera-to-object distance to the horizontal image size is approximately 1.66, or the ratio of the camera-to-object distance to the vertical image size is approximately 2.11. When a wide-angle lens is used, these values would be 0.94 and 1.20. For example, if the camera is constrained to be within 2 m of the area being videotaped, the video would record an area 1.21 X 0.95 m if a standard lens was used; 2.12 X 1.67 m if a wide-angle lens was used.
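The ratios and the 2 m example can be checked with a short calculation. The sketch assumes, as the paragraph above does, that the recorded area scales linearly with camera distance; the names `RATIOS` and `recorded_area` are illustrative.

```python
# Camera-to-object distance divided by image size, from the setup in this study (cm).
RATIOS = {
    "standard":   {"horizontal": 88.3 / 53.3, "vertical": 88.3 / 41.9},
    "wide-angle": {"horizontal": 50.2 / 53.3, "vertical": 50.2 / 41.9},
}

def recorded_area(distance_m, lens):
    """Approximate width and height (m) recorded at a given camera distance."""
    r = RATIOS[lens]
    return distance_m / r["horizontal"], distance_m / r["vertical"]

for lens in RATIOS:
    width, height = recorded_area(2.0, lens)
    print(f"{lens}: {width:.2f} x {height:.2f} m")
```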

CONCLUSIONS

This study has taken a thorough look at one of the sources of error in video-based motion analysis. A methodology was developed to evaluate lens distortion. Using this methodology, it was seen that with a wide-angle lens, errors from lens distortion could be as high as 6%. Even with a standard lens, there was a small amount of lens distortion. The choice of calibration points influenced the lens distortion error. By properly selecting the calibration points and avoiding the outermost regions of a wide-angle lens, the error could be kept below approximately 0.5% with a standard lens and 1.5% with a wide-angle lens.

Reference: /main/adw-10e.html