Most existing 3D capture technologies can’t be used to record objects in motion, and they usually require a stationary platform for operation. Some technologies can record 3D data at video rates (e.g., flash LADAR), but their spatial and range resolution are severely limited.
TetraVue, a provider of 3D and optics solutions, is developing an alternative approach to flash laser detection and ranging (LADAR) that can record simultaneous intensity and range maps using standard charge-coupled device (CCD) and complementary metal-oxide-semiconductor (CMOS) sensor arrays. This enables 3D imaging with megapixel-class, high-frame-rate sensors. The resulting 3D camera can be used in the same fashion as 2D camcorders, but it records the geometry and texture of all objects in the scene at frame rates of 1,000 fps and beyond. Testing has verified millimeter-class range resolution using a 2 megapixel (Mpx) sensor for objects moving at speeds up to 20 m/s (such as an operating floor fan). This testing illustrates the new kinds of applications that can benefit from 3D video capability, such as monitoring and measuring operating equipment and machinery, or detailed measurement of all aspects of high-speed events such as impacts.
Introduction
Significant progress has been made during the past several decades in the ability to capture and record the geometry of objects in all three spatial dimensions. A variety of technologies, from structured-light cameras to single-point 3D scanners and trackers, allow recording geometry and coordinate information with varying degrees of precision and working distances. This 3D data can then be processed and manipulated in software to recreate the object digitally. Each technology has ranges of parameter sets where its capabilities are most effective.
One area that continues to present significant challenges to effective 3D measurement is moving objects and operating equipment. Nearly all existing 3D measurement techniques build up the geometry representation over a series of images and points, requiring seconds to minutes to complete. This severely restricts their use for objects that move with significant speed. Flash LADAR techniques acquire 3D images and depth maps instantly, but are limited in optical resolution (~100 px × 200 px) and depth resolution (>1 cm). Motion capture (MoCap) techniques that track a limited number of reflectors or emitters through a video sequence are also commonly used for 3D recording of basic motion, but such systems require large, dedicated studios and are limited to a few hundred points.
TetraVue is developing a 3D camera based on an alternative approach to flash LADAR imaging that doesn’t require high bandwidth temporal measurements. This approach permits the use of standard CCD and CMOS sensor arrays and electronics to capture 3D information at the same time as the traditional 2D imagery and video, using the same sensor and a single aperture. Instead of high bandwidth kilopixel sensors that are traditionally associated with flash LADAR, TetraVue has used 2 and 6 Mpx sensors to acquire 3D images and video. No special processing circuitry is required beyond the digital image processing circuits found in any 2D camcorder. The end result is an image map and a range map (i.e., an intensity value and coordinate value for each pixel) that are acquired simultaneously and thus self-registered at the frame rate of the sensor (e.g., 30 fps). The single frame capture and large format video capability combined with a nanosecond-class illumination pulse make recording high-resolution 3D information of moving macroscopic objects possible, freezing all motion during the frame.
First, the technology approach that TetraVue is pursuing will be described in more detail, along with how it compares with existing 3D technologies that can be used with moving targets and platforms. A prototype system has been built to demonstrate the high-resolution capability of this approach using a 2 Mpx high-definition (HD) sensor, and it has been tested on a variety of targets.
In particular, results will be presented for tests on moving targets ranging from a scaled helicopter with spinning rotor, a rotating can and box, and an operating floor fan, with object speeds from 0.1 m/s to 20 m/s. Finally, the potential application of this 3D video technology to 3D capture of operating machinery and high-speed events will be discussed.
Technology
As described, there are several existing technologies that build up 3D point clouds of surfaces over an extended series of scans or measurements. Because there is a time delay, such technologies are suitable only for static objects. Some have employed optical flow techniques to track points through the scanning interval, but this is effective for limited types of motion and low speeds. To record motion of extended objects in three dimensions requires the simultaneous capture of all points on the object.
Currently, the choice of technologies for moving objects is either some version of flash LADAR or MoCap, but both have limitations that prevent wider utility. For present purposes, flash LADAR refers to any 3D capture device that acquires the 3D information simultaneously using an array of detectors. There are 3D camera devices that use arrays of linear-mode avalanche photodiodes combined with a short-pulse illumination laser. These typically can be used at long ranges (>1 km) and use avalanche photodiode detector arrays of 128 px × 128 px or smaller. Each pixel requires its own high-bandwidth analog-to-digital converter (ADC) circuit to capture the timing information of the reflected laser pulse to determine the range.
There are also devices that use amplitude modulation in the transmit and receive systems, using specialized pixels in custom detector arrays that are approximately 120 px × 160 px or smaller. These devices are designed for short ranges (<5 m) and often can only be used indoors. For either approach, larger pixel formats require significant investment in new chip and processing designs and fabrication, preventing further scale-up.
In addition, the bandwidth limitations in current implementations also limit the range resolution achievable to centimeters or larger. MoCap solutions typically require large, dedicated studios with large camera arrays to track hundreds of markers over a few subjects. Photogrammetry techniques using calibrated camera sets are beginning to prove effective for small, fixed volumes of interest and short ranges, albeit with significant computational costs.
As noted, the limitations of the current 3D technologies compatible with moving objects restrict the transverse and range resolutions. For flash LADAR, the limitations are associated with the need for a custom chip set and the high cost of larger pixel formats, in addition to the significantly higher bandwidths required to achieve improved range resolution.
TetraVue has taken a different approach to time-of-flight flash LADAR by adding elements, called the light slicer, in front of the detector array to permit the use of standard CCD or CMOS integrating arrays. This enables 3D cameras that can acquire megapixels of 3D coordinate information in each frame. A nanosecond-class illumination pulse fills the field of view (FOV), and the camera records the time of flight through a single lens, such as a digital single-lens reflex (dSLR) lens, which can be chosen to set the FOV as with any current camera or camcorder. The process to determine range is simple, so the range information can be displayed in real time. The equivalent of THz bandwidths is achieved, making millimeter-class range resolution possible at ranges of tens of meters.
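The time-of-flight arithmetic behind these numbers is straightforward. The following is a generic back-of-the-envelope sketch (it does not describe TetraVue's proprietary light-slicer encoding, which the article does not detail): range follows from the round-trip pulse delay, and range resolution from the effective timing precision.

```python
# Generic time-of-flight range arithmetic (illustrative only; the
# light-slicer implementation itself is not described in the source).
C = 3.0e8  # speed of light, m/s

def range_from_delay(delay_s: float) -> float:
    """Convert a round-trip time of flight to a one-way range."""
    return C * delay_s / 2.0

def range_resolution(timing_precision_s: float) -> float:
    """Range resolution implied by a given effective timing precision."""
    return C * timing_precision_s / 2.0

# A 40 ns round trip corresponds to the 6 m lab test range.
print(range_from_delay(40e-9))    # ~6.0 m
# Millimeter-class resolution implies ~20 ps effective timing precision,
# i.e., the equivalent of extremely high measurement bandwidth.
print(range_resolution(20e-12))   # ~0.003 m
```

This makes concrete why conventional per-pixel ADC approaches struggle: millimeter resolution demands picosecond-class timing, which is impractical to replicate across megapixels of independent high-bandwidth circuits.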
Using this approach, TetraVue has been able to acquire 3D images with a 3024 px × 2048 px dSLR sensor as well as an HD (1920 px × 1080 px, 1080p) video sensor. Most of the testing with the demonstration systems has been done in the laboratory, which limits the test range to 6 m, with range resolutions of 3–5 mm. This was verified with a series of test targets of known dimensions, such as LEGO blocks and rectangular blocks. Figure 1a shows a 3D image of three stacks of LEGO blocks with 8-mm, 16-mm, and 32-mm step heights. The image was recorded face on and then rotated and colored by height for display. Figure 1b shows a false-color range image of a resolution target, with blocks of 12-mm, 25-mm, 38-mm, and 50-mm heights in three groups of 12-mm, 25-mm, and 50-mm lateral dimensions. In this instance, the color map covers approximately 75 mm in depth.
Figure 1: (a) 3D image of stacked LEGO blocks, rotated and colored by height; (b) false-color range image of the resolution target.
TetraVue has used its demonstration 3D camera to validate the performance of this new approach using a variety of stationary and moving objects. In addition to the testing inside the laboratory, the 3D camera performance has been tested to 50 m, albeit with lower-range resolution. The camera has been used outdoors in direct sunlight, and 3D measurements have been made of a series of moving targets as is described in more detail below.
Table 1 summarizes the parameter space tested to date and illustrates the performance possible for a 3D camcorder based on this technology. The majority of the testing used a catalog HD format CCD, making 2 million simultaneous range measurements at up to 30 fps coupled with the traditional grayscale HD video.
Laboratory system
Range | 2–50 m
Range resolution/precision | <3 mm @ 6 m
Array size/image size | 3024 × 2016
Frame rate | 30 fps
Data rate | 60 Mpts/s
Field of view | Up to 60° × 30°
Object speed | 20 m/s (72 kph)
Post-processing | None, in real time
Table 1: TetraVue has used its laboratory demonstration 3D camera to verify high-resolution performance over a wide range of operational parameters.
The combination of single-pulse flash, large-pixel-count sensor, and single-frame/single-aperture capture means that TetraVue’s technique can be used to obtain 3D imagery (geometry plus texture) in real time in a video format. This is especially suited for 3D acquisition of moving targets as well as acquiring 3D imagery while the camera is moving, similar to how 2D camcorders and digital cameras are used today. In addition, the use of CCD or CMOS sensor arrays means that, with high-frame-rate sensors (e.g., 1,000 fps), high-resolution 3D video can be acquired for high-speed events.
Results
Although TetraVue’s 3D camera technology can be used to quickly acquire 3D images of stationary objects and scenes, it is especially suited to capturing high-resolution 3D imagery of fast-moving objects at a wide range of speeds. The nanosecond illumination pulse and simultaneous capture of geometry and texture mean that even speeds up to 8,000 m/s produce minimal blurring. The large pixel count means that high spatial resolution can be achieved, even with wide FOVs. It is also possible to use megapixel sensors with 1,000 fps frame rates (or even higher speeds with burst-mode framing cameras) to record 3D imagery of high-speed events.
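The claim that a nanosecond-class strobe freezes even hypersonic motion can be checked with simple arithmetic. A minimal sketch, assuming a representative 1 ns pulse duration (the article only states "nanosecond-class"):

```python
def motion_blur_m(speed_m_s: float, pulse_s: float) -> float:
    """Distance an object travels during the illumination pulse,
    i.e., the worst-case motion blur over one exposure."""
    return speed_m_s * pulse_s

# Fan blade tip at 20 m/s during a 1 ns strobe: about 20 nm of travel.
print(motion_blur_m(20.0, 1e-9))
# Even at 8,000 m/s the object moves only ~8 micrometers during the
# pulse, far below a pixel's footprint on the target, so blur is
# negligible for macroscopic imaging.
print(motion_blur_m(8000.0, 1e-9))
```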
To evaluate this capability for 3D movies, several test objects were recorded with the laboratory demonstration camera, including a model helicopter, a spinning box and can, and an operating floor fan as described below. Representing 3D video on static 2D paper is a challenge, but representative frames from the video illustrate the capability.
Figure 2 shows two frames from a video feed capturing the motion of a toy helicopter. Figure 2a shows the grayscale image map and the corresponding false-color range map for one frame of the 3D video at an instant in time, and figure 2b shows the image and range maps a few frames later. The blade of the helicopter was spun by hand at 2–3 m/s at the tips, and the two frames show the blade at different locations. Because of the nanosecond-class laser strobe, there is no motion blur so that both image and range maps are in sharp focus.
The false color scale has been compressed to cover just the helicopter size in each frame, from blue (far) to red (near). The timing was shifted between the two frames, which also shifted the center of the range measurement and range map slightly. The range maps are displayed in real time during the data acquisition, and because the same sensor acquires the intensity and range information, the pixels between the two maps are identically registered.
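The false-color display described above can be sketched in a few lines. This is an illustrative reimplementation, not TetraVue's display code; the linear blue-to-red ramp and the per-frame compression of the color scale to the object's range span are assumptions modeled on the description in the text.

```python
import numpy as np

def range_to_false_color(range_map: np.ndarray) -> np.ndarray:
    """Map a range image to RGB false color, red = near, blue = far,
    with the color scale compressed to the span of ranges present in
    the frame (as described for the helicopter frames)."""
    near, far = float(range_map.min()), float(range_map.max())
    t = (range_map - near) / max(far - near, 1e-9)  # 0 = near, 1 = far
    rgb = np.zeros(range_map.shape + (3,))
    rgb[..., 0] = 1.0 - t   # red channel peaks for the nearest pixels
    rgb[..., 2] = t         # blue channel peaks for the farthest pixels
    return rgb

# Toy 2 x 2 range map spanning 5.9-6.1 m:
demo = np.array([[5.9, 6.0], [6.0, 6.1]])
colors = range_to_false_color(demo)
print(colors[0, 0])  # nearest pixel -> pure red [1, 0, 0]
```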
Figure 2: (a, b) Grayscale image maps and corresponding false-color range maps of the toy helicopter at two instants in the 3D video.
The relative range accuracy and resolution of moving objects were evaluated using geometric shapes that were placed on a turntable, located 6 m from the camera and spinning at 33 RPM. In one case, a 6-in. cube cardboard box was placed at the center of the turntable, and in the second case, a 6.6-cm diameter aluminum can was placed at the edge of the turntable. Figure 3a shows the line-outs from the range measurement for a single frame of the rotating box, and the moving can is shown in figure 3b.
The measured angle of the corner of the box was 90.6°, which is within the error of the cardboard construction. The root-mean-square (RMS) range error of the flat surface was 3 mm, even at oblique angles to the camera line of sight, as shown. Figure 3b shows the line-out of the range map from the aluminum can (red curve), the line-out from the vertical average (black curve), and the actual shape (hatched circle). The shape determined by the range map matches the shape of the can within the 3 mm RMS range error until the edges of the can are reached. This discrepancy at the edge is due in part to the reduced signal returned from the sides of the can at a high angle of incidence, and in part to some mixing with the background distance behind the can. An approach has been identified to improve the 3D camera performance in similar situations.
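The RMS flatness metric quoted for the box face amounts to a least-squares plane fit followed by an RMS of the residuals. A minimal sketch with synthetic data (the actual line-out data is not available, so the point cloud below is simulated with 3 mm of range noise):

```python
import numpy as np

def rms_plane_error(x, y, z):
    """Fit a plane z = a*x + b*y + c by least squares and return the
    RMS of the residuals -- the flatness metric quoted in the text."""
    A = np.column_stack([x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    residuals = z - A @ coeffs
    return float(np.sqrt(np.mean(residuals ** 2)))

# Synthetic example: an oblique flat surface 6 m away with 3 mm noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 0.15, 500)          # 15 cm box face, meters
y = rng.uniform(0.0, 0.15, 500)
z = 6.0 + 0.5 * x - 0.2 * y + rng.normal(0.0, 0.003, 500)
print(rms_plane_error(x, y, z))          # close to 0.003 m
```

Fitting the plane before computing the RMS is what makes the metric insensitive to the surface's tilt, which is why the 3 mm figure holds even at oblique viewing angles.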
Figure 3: (a) Range line-outs for a single frame of the rotating box; (b) range line-out of the rotating can.
Finally, testing was performed using an operating floor fan with an outside housing 10 inches in diameter. The fan was placed 6 m from the camera and operated at full speed (the blade tips moving at ~20 m/s). The blades themselves are approximately 4 cm deep with some curvature. The 3D videos were captured with the fan facing the camera (i.e., the fan rotation axis approximately parallel to the optic axis). The video clip runs for 10 seconds. The two frames shown in figure 4 illustrate the 3D video collected; in the case of the fan data, the recorded 3D imagery was processed for enhanced visualization and playback, and screen captures of that playback are shown in figure 4.
A simple automatic segmentation routine was used to separate surfaces and to identify moving and static surfaces. The greyscale image for each frame can be textured over the geometry from the coordinate data, and then the result is rendered using basic lighting. The intensity can be blended with a false-color height map to enhance the visualization of the depth. A 3D video playback module allows the camera position to be changed using the computer mouse during playback.
The module also allows the texture and color blend to be turned on and off. Static pixels can be assigned infinite persistence during playback, which fills in the surfaces behind the blade so no shadow is present. The video was recorded from the perspective shown in figure 4a, and figure 4b shows a different frame with the render camera pose set to a different position.
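The static/moving separation and the shadow fill can both be sketched with simple per-pixel range statistics. This is an illustrative stand-in (the article does not specify its segmentation routine); the variation threshold and median background estimate are assumptions.

```python
import numpy as np

def static_mask_and_background(frames: np.ndarray, thresh: float = 0.01):
    """Given a stack of range frames of shape (T, H, W), label pixels
    static where the range varies by less than `thresh` meters over the
    clip, and estimate a background range map from the per-pixel median.
    During playback the background can be shown wherever the moving
    object (e.g., a fan blade) is absent, removing its shadow."""
    variation = frames.max(axis=0) - frames.min(axis=0)
    static = variation < thresh          # per-pixel static/moving mask
    background = np.median(frames, axis=0)
    return static, background

# Toy clip: a "blade" at 5.5 m sweeps across a flat wall at 6.0 m.
frames = np.full((3, 4, 4), 6.0)
frames[0, :, 0] = 5.5   # blade covers column 0 in frame 0
frames[1, :, 1] = 5.5   # ... column 1 in frame 1
frames[2, :, 2] = 5.5   # ... column 2 in frame 2
static, background = static_mask_and_background(frames)
print(static[:, 3].all())   # column 3 never sees the blade -> static
```

Because the blade occupies any given pixel in only a minority of frames, the per-pixel median recovers the wall behind it, which is the effect the infinite-persistence playback produces.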
Figure 4: (a, b) Two frames from the processed 3D video of the operating floor fan, rendered from two different camera poses.
The RMS range error on the blades shown in figure 4 is 2.5 mm, and qualitatively, the measured blade shape matches the curvature of the blades, although a baseline 3D measurement of the blades for comparison using a different technique is still lacking. Again, because of the nanosecond-class illumination pulse, there is no blurring from the motion.
The enhanced visualization module illustrates how the impact of the raw 3D imagery (shown in figure 2) can be improved with simple visualization techniques. These examples all have moving objects that are confined to the static field of view of the 3D camera, because the current laboratory demonstration system is bulky and difficult to move quickly. Work is underway to package the components in a more portable system.
Applications
The previous examples illustrate how high-resolution 3D video technology can be used to record the complete surface information of an object or set of objects in their natural environment, particularly while moving. This makes it possible to monitor machinery and equipment in 3D while they are operating. Examples include monitoring the position and motion of industrial equipment for quality assurance and safety purposes without interrupting the workflow. High-resolution 3D video also provides the opportunity to record in 3D how components of complex machinery such as jet engines or helicopters move during operation, making it possible to validate engineering designs with a higher degree of confidence.
In cases where a complete 360° record is desired, cameras can be placed to minimize occluded areas and the individual 3D frames synchronized and registered together. Alternatively, if simultaneous 3D images are not required, a single camera can be manually or automatically moved around the objects of interest as if shooting a traditional video. As described above concerning eliminating the shadows of the fan blades, static areas that overlap between frames can be combined into single surfaces, filling in voids and shadows that normally would be present. Because of the high-speed strobe and the availability of wide FOVs with high transverse resolution, high-speed events can be recorded. Examples of such events are impacts, ruptures, and explosions that cannot normally be recorded in 3D.
As described above, for these high-speed events, TetraVue’s 3D camera can make use of high-speed framing sensors to rapidly capture 3D frames, limited by the sensor and the data storage rate of the camera design. The limitations for the 3D camera are the same as today’s 2D high-speed cameras. For these applications, trajectories of small objects or large numbers of objects can be recorded in 3D and tracked over many frames because of the high-spatial resolution available. The registered imagery makes it possible to readily identify individual objects in the FOV and how each relates to other objects in the scene. This capability has the potential for improving the ability to validate engineering and safety designs with higher confidence.
Conclusion
TetraVue has demonstrated a new approach to flash LADAR that makes it possible to acquire high-resolution 3D imagery and video with standard CCD or CMOS imaging sensors. The result is the real-time production of self-registered intensity maps and range maps, using megapixel-class sensors. For the range maps, mm-class range resolution has been demonstrated at ranges up to 6 m from the camera, with lower resolution at ranges up to 50 m. The rich datasets can contain texture and geometry information for many objects, both static and moving at any speed, which makes possible new ways of visualizing the 3D video.
The testing done to date with moving targets has shown that there is no difference in 3D performance between stationary scenes and objects moving at speeds up to 20 m/s (and there should be no difference for speeds up to 8,000 m/s). This new capability will permit high-resolution 3D measurement of complex machinery and components while operating, as well as 3D measurement of high-speed events.