1. Introduction
TVs and monitors are among the most widely used display devices in industry and at home. These display devices have evolved toward larger screens and thinner bodies [1], but they are still fundamentally built-in displays with rectangular screens. Such conventional display devices have several fundamental constraints. First, they require as much flat space as the device size. Second, the space in which the display is installed must remain fixed. Third, the screen size is limited by the size of the device, which reduces flexibility. Fourth, they work only in the installed space [2].
If there is a new display that can produce a TV or monitor screen on a dynamic non-flat surface that is flexible in terms of both screen location and size, it can overcome the aforementioned constraints of conventional display devices. We describe such a new device as “Display in the Wild” (DIW), and this device will be applicable and valuable in various industries. For example, to display a large screen at outdoor performances or events, we typically need to install a large flat screen or connect numerous smaller TVs. With DIW, on the other hand, we can use Unmanned Aerial Vehicles (UAVs) and a lightweight canvas to quickly configure a large screen at a low cost and display a TV-like screen on a dynamic non-planar surface. DIW can also create a TV-like screen on curtains that are not flat while being easily found at home, as well as on tents that are commonly used in outdoor camping.
In this paper, we propose both hardware and software configurations to implement DIW. For the hardware, we choose a projector as a suitable display device to overcome the limitations of fixed positions and sizes, as projectors can generate a variable-sized screen anywhere. However, geometric distortions can occur when the projector displays on a surface that is dynamic and non-planar. To solve this problem, a geometric distortion correction method is required to display an undistorted screen (e.g., the rectangular screen of a TV). Here, a capture device that can predict the distortion is needed, since the projector itself cannot sense the screen geometry. Given that a depth camera can capture 3D surfaces in real time, we decide to use one as a capture device to predict the distortion on dynamic non-planar surfaces. In summary, we utilize a projector and a depth camera as our hardware configuration to implement DIW.
In order to complete the DIW system with the proposed hardware configuration, we developed the essential software configurations for DIW: (i) a quick prediction method for geometric distortion on dynamic non-planar surfaces, and (ii) a fast correction method for geometric distortions. Although there are many distortion-correction methods for projection, the existing methods can hardly correct the distortions present on dynamic non-planar surfaces. In fact, they mainly project structured patterns onto surfaces to extract and match features using RGB cameras and find distortions in a reconstructed 3D space [3,4,5,6,7,8,9]. However, it is difficult to cope with dynamic distortion in this way because the projected patterns may interfere with viewing of the screen contents. Some methods [10,11,12,13] can deal with dynamic distortion by projecting non-visible patterns. However, such methods also have drawbacks, in that they often extract and match features unreliably in certain lighting environments, thus failing to correct the distortions. Moreover, the existing feature extraction and matching processes usually require heavy computation, which makes them inappropriate for predicting dynamic distortions at high speed. In addition, to fill in missing information on the surface, they fit 3D mesh shapes to the distorted surface in the correction step [7,9], which also incurs high computational complexity. Therefore, conventional projector correction methods are hardly applicable to DIW systems, which require fast and reliable correction on dynamic non-planar surfaces.
Additionally, most correction methods perform calibration between the projector and the capture device. Some existing manual and/or automatic calibration methods have shown good results with RGB or depth cameras [14,15,16,17,18,19,20,21,22]. However, existing calibration methods are not suitable for DIW, since they focus only on the accuracy of the calibration between the projector and capture device rather than on fast detection of the geometric distortion of the projection. Furthermore, some calibration methods may need to be re-run when intrinsic parameters of the projector change (e.g., the display resolution), as these methods involve the intrinsic parameters in the calibration process.
Therefore, to overcome the aforementioned problems of the existing methods and complete the DIW system with the proposed hardware, we propose: (i) a quick estimation method for geometric distortions of the projection; (ii) a fast distortion correction method; and (iii) a DIW-specific calibration between the projector and depth camera. Using the depth camera, our method can quickly predict and correct distortions on the projection surface, because we specialize our calibration method for fast detection of the projection surface in the depth image. To efficiently find the correction area from the user’s viewpoint, only the boundary pixels of the detected projection surface in the depth image are used. For warping and correction, the proposed method quickly recovers a high-resolution depth map of the projection surface by adopting a Gaussian weighted interpolation process that fills in the missing information of the original depth map at a very low computational cost. This provides an alternative to traditional mesh fitting methods, which incur high computational costs. In addition, the proposed calibration method focuses on the extrinsic parameters (i.e., the transformation matrix describing the positional relationship between the projector and depth camera), making it easy to respond to display resolution changes.
Figure 1 illustrates the overall process of our proposed DIW system. The main contributions of our work are summarized as follows:
We first devise the concept of “Display in the Wild” as a new display device that overcomes the fixed display position and fixed surface conditions of existing display devices.
We propose a new hardware configuration specialized for DIW that couples a projector with a depth camera for fast and reliable corrections to the distortions that can occur due to dynamic non-planar surfaces.
To complete this DIW system, we also propose software configurations that allow the DIW display to be deployed in real-time applications. Our software configurations include fast geometric distortion estimation, fast correction, and a DIW-specific calibration. These allow us to perform robust projector correction without using markers.
We prove the usefulness of our DIW system by performing comprehensive quantitative and qualitative tests under dynamic non-planar surface conditions.
This paper is organized as follows: In Section 2, we discuss the existing methods for distortion correction of the projector and for calibration. Then, in Section 3, we describe our fast correction and calibration methods for the DIW system. We thoroughly conducted quantitative and qualitative experiments to demonstrate the performance of the proposed method, as described in Section 4. Lastly, we discuss the limitations and conclusions of our proposed methods in Section 5 and Section 6, respectively.
3. Proposed Method
In this study, we developed our DIW system based on the assumptions that: (i) the projector and depth camera are attached and fixed together; (ii) the target surface onto which the projector projects is non-planar and dynamic; and (iii) the depth camera is placed parallel to the longitudinal axis. Our proposed DIW system first calculates the relationship between the projector and depth camera (the extrinsic parameters) using our proposed calibration method. After that, the projector distortion is corrected in real time through the following iterative process:
A target surface area that is projected by the projector is predicted quickly via the depth image with the estimated calibration parameters.
The most effective correction area (i.e., the maximum rectangular region inside the projection surface) is calculated for the undistortion processing on the display.
Finally, the original projection image is warped into the effective correction area, and the projector outputs the warped image in order to remove the geometric distortion on the target surface.
3.1. Hardware Configuration for the DIW System
We use a projector and a depth camera as the hardware configuration to implement our DIW system. In this paper, we choose the Kinect v2 and a portable laser projector (Model: Celluon PicoBit B06X93WFBP), where the Kinect v2 captures depth images at 30 frames per second (fps) and the projector displays a screen with a 60 Hz refresh rate. In choosing this hardware configuration, we considered the following important conditions.
First, Kinect v2 has already demonstrated its ability to sense precise depth images in many fields and is easy to develop with. Even though it is somewhat larger than the recent Intel RealSense Depth Camera D435, we determined that the size of Kinect v2 is not large enough to interfere with mobility, and Kinect v2 further offers fast implementation and testing advantages. Second, we use a laser projector instead of an existing LCD (Liquid-Crystal Display) or DLP (Digital Light Processing) projector. These conventional projectors have the disadvantage that one must adjust the focus according to the projection distance from the lens. Because a DIW system should be able to generate the screen on a dynamic non-planar surface, adjusting the focus can be a complex problem for correcting the distortion in a DIW display. Moreover, LCD or DLP projectors are generally large, which limits the mobility needed to easily create a screen at a desired position. Unlike conventional projectors, the laser projector does not use a lens, so we do not need to adjust the focus. Furthermore, the laser projector is very small, so we decided it was suitable for the DIW system.
It is worth noting that the physically allowable rotation angle and distance between the projector and the screen are limited by the hardware configuration. The allowable rotation angle depends on the vertical and horizontal FOVs (fields of view) of the projector: the allowable angular range in the vertical direction is bounded by the vertical FOV, and the range in the horizontal direction by the horizontal FOV. Also, the allowable distance of the DIW system is determined by the projection limit of the projector and the sensing limit of the depth camera. The Kinect v2 has an allowable distance of eight meters, and the projector has a longer allowable distance, so eight meters can be taken as the allowable distance of our DIW system.
3.2. Projector to Depth Camera Calibration
We first calibrate the projector and depth camera in closed form. For this, we denote the 3D depth camera coordinates centered at the depth-camera position with points $\mathbf{P}_D = (X_D, Y_D, Z_D)^{\top}$, and the 3D projector coordinates centered at the projector position with points $\mathbf{P}_P = (X_P, Y_P, Z_P)^{\top}$, as shown in Figure 2. The affine transform matrix $T_{D \to P}$ that transforms $\mathbf{P}_D$ into $\mathbf{P}_P$ is defined as

$$\mathbf{P}_P = T_{D \to P} \begin{bmatrix} \mathbf{P}_D \\ 1 \end{bmatrix} = \begin{bmatrix} R & \mathbf{t} \end{bmatrix} \begin{bmatrix} \mathbf{P}_D \\ 1 \end{bmatrix}, \tag{1}$$

where $R$ and $\mathbf{t}$ represent the rotation (consisting of $r_{11}, \ldots, r_{33}$) and translation (consisting of $t_x, t_y, t_z$) matrices in the affine transform $T_{D \to P}$, respectively.
We can easily obtain $\mathbf{P}_D$ using the depth image and its intrinsic parameters. On the other hand, $\mathbf{P}_P$ is not directly available, since the projector does not capture any geometric information. Instead of $\mathbf{P}_P$, we use 2D projection image coordinates for the images projected onto the target surface. We denote the 2D projection image coordinates as $\mathbf{p} = (u, v)^{\top}$, which corresponds to $\mathbf{P}_P$. The mapping of $\mathbf{P}_P$ to $\mathbf{p}$ is represented as:

$$u = f_x \frac{X_P}{Z_P}, \qquad v = f_y \frac{Y_P}{Z_P}, \tag{2}$$

where $f_x$ and $f_y$ are the width and height focal lengths of the projector, respectively. The above transformation $T_{D \to P}$ can be expressed using the 2D projection image coordinate $\mathbf{p}$ as:

$$u = f_x \frac{r_{11} X_D + r_{12} Y_D + r_{13} Z_D + t_x}{r_{31} X_D + r_{32} Y_D + r_{33} Z_D + t_z}, \qquad v = f_y \frac{r_{21} X_D + r_{22} Y_D + r_{23} Z_D + t_y}{r_{31} X_D + r_{32} Y_D + r_{33} Z_D + t_z}. \tag{3}$$
To calculate the elements of the transformation $T_{D \to P}$ in Equation (3) from more than one pair of $\mathbf{P}_D$ and $\mathbf{p}$, we form a linear system $A\mathbf{x} = \mathbf{0}$, the details of which are described in Appendix A. Then, by gathering the pairs, we solve the linear system using an SVD solver in our calibration process. It is worth noting that the SVD solver is known to yield a stable solution for a linear equation $A\mathbf{x} = \mathbf{0}$.
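For illustration, the following sketch shows how such a homogeneous linear system can be assembled and solved with an SVD in NumPy under the notation above. It is a minimal example, not the exact formulation of Appendix A; in particular, the row layout, the scale normalization, and the sign handling are assumptions here.

```python
import numpy as np

def calibrate_depth_to_projector(P_D, p, fx, fy):
    """Estimate the 3x4 affine transform T = [R | t] that, combined with the
    pinhole model u = fx * Xp / Zp and v = fy * Yp / Zp (Equations (2)-(3)),
    maps 3D depth-camera points onto 2D projector image coordinates.

    P_D : (N, 3) 3D points in depth-camera coordinates.
    p   : (N, 2) corresponding projector image coordinates (u, v).
    fx, fy : known projector focal lengths in pixels.
    """
    P_D = np.asarray(P_D, dtype=float)
    p = np.asarray(p, dtype=float)

    A = []
    for (X, Y, Z), (u, v) in zip(P_D, p):
        # u * (r31 X + r32 Y + r33 Z + tz) - fx * (r11 X + r12 Y + r13 Z + tx) = 0
        A.append([-fx * X, -fx * Y, -fx * Z, -fx,
                  0.0, 0.0, 0.0, 0.0,
                  u * X, u * Y, u * Z, u])
        # v * (r31 X + r32 Y + r33 Z + tz) - fy * (r21 X + r22 Y + r23 Z + ty) = 0
        A.append([0.0, 0.0, 0.0, 0.0,
                  -fy * X, -fy * Y, -fy * Z, -fy,
                  v * X, v * Y, v * Z, v])
    A = np.asarray(A)

    # The right singular vector of the smallest singular value spans the null
    # space of A; the SVD provides this solution in a numerically stable way.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    T = Vt[-1].reshape(3, 4)

    # The solution is defined only up to scale (and sign); normalize so that the
    # third rotation row has unit norm and points project with positive depth.
    T /= np.linalg.norm(T[2, :3])
    if np.median(P_D @ T[2, :3] + T[2, 3]) < 0:
        T = -T
    return T
```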
The proposed DIW system projects a chessboard pattern onto a surface using the projector and captures the projected chessboard pattern using the RGB camera embedded in the depth camera to gather these pairs. Note that most depth cameras contain built-in RGB cameras for sensing color information. Our system then detects the corners of the projected chessboard pattern in the captured image using a corner detection method from the OpenCV library [44]. We map the chessboard corners detected by the RGB camera to the corresponding positions in the depth image using the calibration values of the depth camera, which are provided by the manufacturer. Figure 3 shows the overall process of gathering 3D points in the depth camera coordinates and the corresponding 2D coordinates in the projector image.
To collect a large number of sample points, our method generates the chessboard pattern at 25 different positions, shifting the pattern 5 times by 20 pixels in both the horizontal and vertical directions. The chessboard used in this study yields 54 corners per chessboard image, so a total of 1350 corners can be gathered from the 25 images, as shown in Figure 4. This calibration with a large number of sample points yields accurate performance, as shown in Section 4.
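A sketch of this sample-gathering loop is given below using OpenCV's chessboard detector. The pattern layout (9 × 6 inner corners), the I/O callbacks, and the map_color_to_depth helper that applies the manufacturer-provided color-to-depth registration are illustrative assumptions, not the exact interfaces of our implementation.

```python
import cv2
import numpy as np

PATTERN_SIZE = (9, 6)   # inner corners per row/column; 9 * 6 = 54 (assumed layout)
SHIFT_PIXELS = 20       # pattern shift between projections
NUM_SHIFTS = 5          # 5 x 5 = 25 pattern positions

def gather_calibration_pairs(project_pattern, grab_rgb, grab_depth, map_color_to_depth):
    """Collect (3D depth-camera point, 2D projector pixel) pairs.

    project_pattern(dx, dy) -> (N, 2) projector pixel coords of the displayed corners
    grab_rgb()              -> RGB frame from the camera built into the depth sensor
    grab_depth()            -> depth frame registered to the RGB frame
    map_color_to_depth(pts, rgb, depth) -> (N, 3) 3D points in depth-camera coordinates
    """
    pairs_3d, pairs_2d = [], []
    for iy in range(NUM_SHIFTS):
        for ix in range(NUM_SHIFTS):
            # Display the chessboard shifted by (ix, iy) * SHIFT_PIXELS on the projector.
            proj_corners = project_pattern(ix * SHIFT_PIXELS, iy * SHIFT_PIXELS)
            rgb, depth = grab_rgb(), grab_depth()

            gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
            found, corners = cv2.findChessboardCorners(gray, PATTERN_SIZE)
            if not found:
                continue
            # Refine the detected corners to sub-pixel accuracy.
            corners = cv2.cornerSubPix(
                gray, corners, (11, 11), (-1, -1),
                (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.01))

            # Map the RGB corners into 3D depth-camera coordinates.
            pts_3d = map_color_to_depth(corners.reshape(-1, 2), rgb, depth)
            pairs_3d.append(pts_3d)
            pairs_2d.append(proj_corners)
    return np.concatenate(pairs_3d), np.concatenate(pairs_2d)
```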
The proposed calibration method can be used directly to detect the projection surface of the projector, since it calculates the relationship between $\mathbf{P}_D$ and $\mathbf{p}$ via Equation (3). In addition, since the intrinsic parameters of both the depth camera and the projector are not included as unknown parameters in our calibration, it is possible to easily cope with a change in the projector resolution.
3.3. Projection Surface Prediction
Once we obtain the transformation matrix $T_{D \to P}$ through the proposed calibration, we can calculate the $\mathbf{p}$ corresponding to each $\mathbf{P}_D$.
We convert the depth image $D(i, j)$, where $(i, j)$ represents the pixel position, into 3D point clouds $\mathbf{P}_D(i, j)$ in the 3D depth camera coordinates using the predefined intrinsic parameters. Substituting each 3D point of $\mathbf{P}_D(i, j)$ into Equation (3), we can calculate the corresponding 2D projection image coordinates $\mathbf{p}(i, j) = (u, v)$. We then propose building the projection surface mask $M(i, j)$ for the depth image using the simple and cost-effective equation:

$$M(i, j) = \begin{cases} 255, & \text{if } 0 \le u < w \ \text{and} \ 0 \le v < h \\ 0, & \text{otherwise,} \end{cases} \tag{4}$$

where $w$ and $h$ are the width and height of the 2D projection image, respectively, and $(u, v)$ is the pixel position in the 2D projection image. If the projection surface mask $M(i, j)$ is 255, the pixel of the depth image at $(i, j)$ belongs to the projection surface, as shown in Figure 5. In this way, we can simply detect projection regions based on the depth image in real time.
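A vectorized sketch of this mask computation is shown below, assuming the per-pixel 3D points have already been computed from the depth image and using the transform and pinhole model of Section 3.2; the extra positive-depth check is an implementation detail not stated in Equation (4).

```python
import numpy as np

def projection_surface_mask(points_d, T, fx, fy, proj_w, proj_h):
    """Compute the projection-surface mask M for one depth frame.

    points_d : (H, W, 3) 3D points in depth-camera coordinates (one per depth pixel).
    T        : (3, 4) affine transform from Section 3.2 (depth camera -> projector).
    fx, fy   : projector focal lengths; proj_w, proj_h: projection image size.
    Returns an (H, W) uint8 mask: 255 where the depth pixel is hit by the projection.
    """
    H, W, _ = points_d.shape
    pts = points_d.reshape(-1, 3)
    pts_h = np.hstack([pts, np.ones((pts.shape[0], 1))])   # homogeneous coordinates

    P_p = pts_h @ T.T                                       # points in projector coordinates
    with np.errstate(divide="ignore", invalid="ignore"):
        u = fx * P_p[:, 0] / P_p[:, 2]                      # pinhole projection (Equation (2))
        v = fy * P_p[:, 1] / P_p[:, 2]

    # Equation (4): the pixel belongs to the projection surface when (u, v) falls
    # inside the projection image; points behind the projector are also rejected.
    inside = (P_p[:, 2] > 0) & (u >= 0) & (u < proj_w) & (v >= 0) & (v < proj_h)
    return np.where(inside, 255, 0).astype(np.uint8).reshape(H, W)
```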
3.4. User Position
In this paper, we correct the distortion of the projection based on the assumptions that: (i) the user looks at the center of the screen (projection surface); and (ii) the user’s position is already known. In order to consider the user’s viewpoint in the correction process, we define 3D user coordinates with the user’s center as the origin, the viewing direction as the z-axis, left-to-right as the x-axis, and bottom-to-top as the y-axis. The user coordinates are defined relative to the projector coordinates for easy adaptation of the warping-based correction, which is fully described in Section 3.6. We denote a point in the user coordinates as $\mathbf{P}_U = (X_U, Y_U, Z_U)^{\top}$. Finally, the affine transform $T_{P \to U}$ that transforms $\mathbf{P}_P$ into $\mathbf{P}_U$ is defined as

$$\mathbf{P}_U = T_{P \to U} \begin{bmatrix} \mathbf{P}_P \\ 1 \end{bmatrix} = \begin{bmatrix} R_U & \mathbf{t}_U \end{bmatrix} \begin{bmatrix} \mathbf{P}_P \\ 1 \end{bmatrix}, \tag{5}$$

where $R_U$ and $\mathbf{t}_U$ represent the rotation and translation matrices in the affine transform $T_{P \to U}$, respectively.
From our experiments, we determined that slight variations in the user position and viewpoint produce perceptually unnoticeable geometric distortions on the screen. This is because of the auto-compensation mechanism of the human visual system (HVS) [45]; that is, the distortions produced by different viewpoints are not noticeable even if the user’s position is only roughly approximated.
Note that this paper focuses on flexible and fast correction with a projector and a depth camera for the DIW system. Estimating a user’s position under a variety of poses and occlusions requires advanced user-detection algorithms with very high computation costs [46,47], and most such methods detect the user’s position only in standing conditions; this is outside the scope of our paper. Therefore, we manually set the user’s position so that our DIW system runs in real time.
3.5. Correction Area Calculation
To correct the geometric distortion on the projection surface, it is necessary to determine the correction area that shows the maximal rectangular image region to the user. In this study, we argue that it is inefficient to calculate the correction area using all of the 3D points that belong to the projection area. Instead, we propose calculating the correction area using only the boundary pixels of the projection area, which is simple and efficient, as shown in Figure 6.
First, we find the boundary pixels of the projection area on the projection surface mask $M$ using the fast contour detection method [48]. When several contours are detected due to discontinuities of the projection surface, we choose the largest contour as the boundary to find the correction area. Figure 6a shows an example of detecting the boundary of the projection surface mask $M$. Second, we create a distortion shape image at the user’s viewpoint in order to calculate the correction area from the detected boundary pixels efficiently. To do this, we perform a process similar to a computer graphics pipeline. The boundary pixels on the depth image are transformed into 3D points in the 3D user coordinates according to Equation (5), as shown in Figure 6b. These 3D points are projected onto a plane on the viewing volume of the user’s perspective. Then, we rasterize the projected points into a fixed-size 2D image, as shown in Figure 6c. In the rasterization step, a polygon filled with white is drawn using the projected 2D points in the order provided by the contour detection method [48]. Finally, the largest rectangle inside the white filled region of the rasterized image is calculated by Vandevoorde’s method [49].
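The sketch below illustrates this step, assuming the boundary polygon has already been rasterized into a binary image at the user’s viewpoint. The maximal-rectangle search uses the classic row-histogram dynamic-programming formulation as a stand-in for Vandevoorde’s method [49], and the OpenCV calls assume the OpenCV 4.x API.

```python
import cv2
import numpy as np

def largest_boundary_contour(mask):
    """Return the largest contour of the projection-surface mask M (Figure 6a)."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    return max(contours, key=cv2.contourArea)

def largest_inscribed_rectangle(binary):
    """Largest axis-aligned rectangle of non-zero pixels, returned as (x, y, w, h).

    Row-by-row histogram of consecutive white pixels plus a monotonic stack,
    i.e. the 'maximal rectangle' dynamic-programming formulation.
    """
    h, w = binary.shape
    heights = np.zeros(w, dtype=int)
    best, best_area = (0, 0, 0, 0), 0
    for y in range(h):
        row = binary[y] > 0
        heights = np.where(row, heights + 1, 0)
        # Largest rectangle under the histogram 'heights' via a stack of column indices.
        stack = []
        for x in range(w + 1):
            cur = heights[x] if x < w else 0
            while stack and heights[stack[-1]] >= cur:
                top = stack.pop()
                height = heights[top]
                left = stack[-1] + 1 if stack else 0
                width = x - left
                if height * width > best_area:
                    best_area = height * width
                    best = (left, y - height + 1, width, height)
            stack.append(x)
    return best
```

The returned rectangle, expressed in rasterized-image pixels, is then converted into the fields of view and transformations of Equation (6), as described next.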
After the largest rectangle (the correction area) is found on the 2D distortion shape image, as shown in Figure 6, we define it in the 3D user coordinates for use in the correction process. Therefore, we define the largest rectangle as the correction area parametrically through fields of view and transformations, as given in Equation (6), where $fov_x$, $fov_y$, $\theta_x$, and $\theta_y$ are the fields of view and transformations that define the largest rectangle as the correction area in the 3D user coordinates, and $(x_r, y_r, w_r, h_r)$ represents the largest rectangle on the 2D distortion shape image in terms of its starting point, width, and height. $w_p$ and $h_p$ in Equation (6) are the width and height of the projection plane, respectively, and $w_s$ and $h_s$ are the width and height of the rasterized image, respectively. In this way, we can parametrically define the rectangular correction area in the 3D user coordinates through the FOVs ($fov_x$ and $fov_y$) and transformations ($\theta_x$ and $\theta_y$).
3.6. Warping and Correction
In our calibration, we can transform the 3D depth image coordinates into the 2D projection image coordinates using Equation (3), but it is impossible to convert them in the opposite direction, as this is an ill-posed problem. However, the depth information (3D information) of the projection image is essential for our undistortion process. Fortunately, we can partially calculate the depth of the projection image from the sparse 3D information given by our calibration and the depth image. Therefore, we must estimate the unknown parts of the depth information in the projection image.
Existing correction methods [7,9] reconstruct 3D parametric or meshed surfaces by fitting known 3D points to estimate the unknown information. However, these approaches have high computation costs, resulting in slow correction of the geometric distortion of the projection. Instead, we generate a sparse projection depth image from the known depth information in the projection image. Then, we quickly convert the sparse projection depth image into a dense projection depth image using Gaussian weight interpolation with fixed-size masks. Figure 7 shows the whole process of generating the dense projection depth image.
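A minimal sketch of this densification step is given below. It implements Gaussian weight interpolation as a normalized convolution; the mask size and sigma are illustrative assumptions, since their exact values are not restated here.

```python
import cv2
import numpy as np

def densify_projection_depth(sparse_depth, ksize=15, sigma=3.0):
    """Fill holes in a sparse projection depth image by Gaussian-weighted interpolation.

    sparse_depth : (H, W) float32 depth in projector image coordinates, 0 where unknown.
    ksize, sigma : size and sigma of the Gaussian mask (illustrative values).
    """
    known = (sparse_depth > 0).astype(np.float32)

    # Numerator: Gaussian-weighted sum of the known depth values.
    num = cv2.GaussianBlur(sparse_depth, (ksize, ksize), sigma)
    # Denominator: Gaussian-weighted sum of the weights themselves.
    den = cv2.GaussianBlur(known, (ksize, ksize), sigma)

    dense = np.divide(num, den, out=np.zeros_like(num), where=den > 1e-6)
    # Keep the original measurements where they exist.
    dense[known > 0] = sparse_depth[known > 0]
    return dense
```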
To generate a warping table that corrects the projection distortion, we transform the dense projection depth image into 3D point clouds $\mathbf{P}_U$ in the user coordinates via the intrinsic parameters of the projector and $T_{P \to U}$. We then calculate the angles that parametrically indicate whether each 3D point of $\mathbf{P}_U$ is included in the correction area, as given in Equation (7), which uses the z-axis value for the perspective projection of the distortion at the user’s viewpoint and yields $\alpha_x$, the angle relative to the x-axis, and $\alpha_y$, the angle relative to the y-axis. Then, we check whether these angles lie inside the correction area using Equation (8). When Equation (8) is satisfied, $\mathbf{P}_U$ is inside the correction area (the largest rectangle). Using the fields of view and transformations that define the correction area, we calculate the image positions to be displayed for correction with Equation (9), where $w$ and $h$ represent the width and height of the projection image, respectively, and $(u', v')$ is the position of the pixel to be displayed at $(u, v)$. Finally, we generate a warped projection image that corrects the distortion as

$$g(u, v) = f(u', v'), \tag{10}$$

where $g$ is the warped image and $f$ represents the original projection image. This process warps the projection image through backward mapping, thus constructing the corrected image quickly.
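The backward-mapping step can be sketched as follows. Because Equations (7)–(9) depend on the parametric correction-area definition of Equation (6), they are abstracted here into a hypothetical angles_to_source callback; the remap call itself realizes the backward lookup g(u, v) = f(u', v') of Equation (10).

```python
import cv2
import numpy as np

def warp_projection_image(original, P_u, correction_area, angles_to_source):
    """Backward-map the original projection image f into its pre-warped form g.

    original        : (h, w, 3) original projection image f.
    P_u             : (h, w, 3) 3D points of the dense projection depth image in user coords.
    correction_area : fields of view and transformations of the largest rectangle (Eq. 6).
    angles_to_source(P_u, correction_area) -> (map_u, map_v, inside)
        map_u, map_v : (h, w) float32 source positions in the original image (Eq. 9)
        inside       : (h, w) bool, True where the correction-area test (Eq. 8) holds.
    """
    map_u, map_v, inside = angles_to_source(P_u, correction_area)

    # Pixels outside the correction area are sent to an out-of-range position,
    # so they stay black and are not projected.
    map_u = np.where(inside, map_u, -1).astype(np.float32)
    map_v = np.where(inside, map_v, -1).astype(np.float32)

    # g(u, v) = f(u', v'): backward mapping implemented with a remap lookup.
    return cv2.remap(original, map_u, map_v, cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT, borderValue=0)
```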
4. Result and Analysis
To investigate the effectiveness of the proposed calibration and correction methods in our DIW system, we conducted both quantitative and qualitative experiments under various conditions. The depth and RGB cameras are built into the Kinect v2 with 512 × 424 and 1920 × 1080 resolutions, respectively. For testing, the projector and Kinect v2 were connected to a laptop (early-2015 MacBook Pro with an i5 2.7 GHz CPU and 8 GB RAM) on which we ran our DIW system with the proposed calibration and correction methods. Since the depth image produced by the Kinect v2 has lens distortion, we corrected the lens distortion in a preprocessing step using Brown’s distortion model [50] with the five correction parameters $k_1$, $k_2$, $k_3$, $p_1$, and $p_2$, which are supplied by the manufacturer, as shown in Table 1. As our laser projector does not use a lens for projection [51], we did not consider its lens distortion in our tests. For the user position, we set $T_{P \to U}$ by assuming the user is located one meter behind the projector in all experiments.
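The depth preprocessing can be sketched with OpenCV, whose distortion vector (k1, k2, p1, p2, k3) follows the same Brown model; the intrinsic matrix and distortion coefficients below are placeholders standing in for the manufacturer-supplied values of Table 1, not the actual values.

```python
import cv2
import numpy as np

# Depth-camera intrinsics (illustrative Kinect v2-like values) and Brown distortion
# coefficients; replace with the manufacturer-supplied values from Table 1.
fx_d, fy_d, cx_d, cy_d = 365.0, 365.0, 256.0, 212.0
k1 = k2 = k3 = p1 = p2 = 0.0

K = np.array([[fx_d, 0.0, cx_d],
              [0.0, fy_d, cy_d],
              [0.0, 0.0, 1.0]])
dist = np.array([k1, k2, p1, p2, k3])   # OpenCV ordering of the 5-parameter Brown model

def undistort_depth(depth_image):
    """Remove lens distortion from a depth frame before any further processing."""
    h, w = depth_image.shape
    # Explicit maps let us use nearest-neighbor sampling, which avoids blending
    # valid and invalid depth values at object borders.
    map1, map2 = cv2.initUndistortRectifyMap(K, dist, None, K, (w, h), cv2.CV_32FC1)
    return cv2.remap(depth_image, map1, map2, cv2.INTER_NEAREST)
```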
4.1. Quantitative Evaluations
We first performed a comparative evaluation with the methods in [14,21] to determine the reprojection accuracy of the proposed calibration method. We also tested the correction accuracy by projecting and correcting the horizontal/vertical line images in Figure 8 and measuring the straightness of the lines in the corrected images. Finally, to verify the speed of the distortion correction method, which is essential for our DIW system, we performed a comparative evaluation of the correction time per image against the methods in [7,9].
4.1.1. Reprojection Errors of Calibration
Our calibration method has the advantages that it can be applied directly to detect the projection surface and can cope with variable intrinsic parameters such as the screen resolution of the projector. Thus, we tested the calibration accuracy of our method while it keeps these advantages. We calculated the reprojection errors of our method and then compared it with the existing calibration methods in [14,21], which use an RGB camera and a depth camera for calibration, respectively. For comparison, all methods under test use the same number of samples, obtained by projecting a chessboard pattern onto 25 different surfaces (i.e., a total of 1350 samples under test).
Table 2 shows the calibration results of our proposed method and the existing ones. In addition, we calculated the reprojection errors of the three methods, as shown in Figure 9. These results show that all three methods achieve similar accuracy. However, our calibration is specialized to detect the projection surface directly and quickly for DIW, and it can easily cope with projector resolution changes.
4.1.2. Line Straightness Error Ratio Test
In order to quantitatively test the precision of the proposed correction method, we calculated the straightness of vertical and horizontal line images, which are captured at the user position using a camera. To evaluate the simple gradients of the lines in the line images, we propose using the Line Straightness Error Ratio (LSER), where $LSER_v$ denotes the line straightness error ratio for vertical straight lines, $LSER_h$ the ratio for horizontal lines, $k$ the angle with respect to the x-axis of the image, and $\theta$ the tolerance degree allowed by the error ratio. In this study, we set $\theta$ to 10°, which means that the LSER is 1 when a line shows no angular distortion and 0 when its deviation reaches $\theta$. If the deviation of the angle $k$ exceeds the tolerance degree $\theta$, the LSER has a negative value. To measure the LSER, the line pattern images shown in Figure 10 are projected and corrected by the proposed method. We then captured them using the camera, and $k$ is calculated from two edge points at the ends of each line. We repeated the LSER experiments five times on horizontal and vertical images, respectively, for accuracy.
Table 3 and Figure 8 show our LSER experimental results. Based on these results, our correction method achieves high LSER scores at the given tolerance, meaning that the remaining angular deviation of the straight lines is only a small fraction of the tolerance degree. This indicates that the proposed method can accurately correct distortion on non-planar surfaces.
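The measurement can be sketched as follows. Since the LSER definition is not reproduced above, the formula used here, LSER = 1 − |Δk| / θ with Δk the deviation of the measured line angle from its ideal direction, is our reading of the description (1 at zero deviation, 0 at the tolerance, negative beyond it) and should be treated as an assumption.

```python
import numpy as np

def line_angle(p0, p1):
    """Angle (degrees) of the line through two edge points w.r.t. the image x-axis."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    return np.degrees(np.arctan2(dy, dx))

def lser(measured_deg, ideal_deg, tol_deg=10.0):
    """Line Straightness Error Ratio: 1 = perfectly straight, 0 = at the tolerance,
    negative when the deviation exceeds the tolerance (assumed form of the definition)."""
    deviation = abs(((measured_deg - ideal_deg) + 90.0) % 180.0 - 90.0)  # wrap to [0, 90]
    return 1.0 - deviation / tol_deg

# Example: a nominally horizontal line measured from its two end points (illustrative values).
k = line_angle((12, 240), (620, 243))
print(lser(k, ideal_deg=0.0))   # close to 1 when the corrected line stays straight
```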
4.1.3. Performance Test for Running Speed
In the proposed DIW, fast correction is an important requirement, since the distortion of a dynamic non-planar surface needs to be estimated and corrected in real time. Therefore, we compared the correction speed of the proposed method with that of the conventional RGB camera-based methods. To test the correction speed, we measured the time needed to correct an image at the maximum resolution of our projector.
Table 4 shows the correction times of the proposed and existing methods, indicating that the proposed method corrects the distortion of the image very quickly compared to the conventional methods. The main reason for the fast correction of our method is that we use the depth camera to predict the distorted surface directly and then fill in the missing information using simple interpolation. This experiment shows that the fast correction of our method is appropriate for dynamic surfaces, which is essential for DIW.
4.2. Qualitative Evaluations
In this paper, we carried out not only a quantitative evaluation but also a qualitative evaluation of the proposed DIW. For the qualitative test, we set up two scenarios in which our DIW system can be used in practice: (i) we built a large screen using UAVs and a lightweight canvas to evaluate the practical use of our DIW system in space-constrained environments such as celebrations; and (ii) we manually created dynamic non-planar surfaces to evaluate extreme environments.
Figure 11 and Figure 12 show the qualitative results of our method, where the left column in each figure depicts the result obtained with the proposed method and the right column shows the output of a conventional projector. Our results show that the proposed DIW system effectively corrects distortion, resulting in a TV-like rectangular screen. Furthermore, our method shows good undistortion results on the dynamic non-planar surfaces made from UAVs and lightweight canvas. These results demonstrate that our correction method and DIW system can be useful in real and extreme environments.
To show the effectiveness of our DIW system, we uploaded a video clip (Video 1) at “https://youtu.be/RIpj0pED6NA”. The first part of Video 1 demonstrates the performance of our DIW system with UAVs and a lightweight canvas. The second part of Video 1 shows the performance of our DIW system in an extreme case where the screen geometry changes severely over time. In Video 1, the left part of the screen shows projection images generated by our DIW system, and the right part shows projection images displayed by a projector only.
As shown in Video 1, the screen built with UAVs and canvas has irregular local and global movements due to the wind generated by the UAVs. In this case, our DIW system stably produces undistorted rectangular screens in real time, whereas the projection-only system yields clearly noticeable distortion. In the second scenario, the projector-only system causes significant geometric distortions on the extremely varying screen surface, while the proposed DIW system consistently provides rectangular-shaped images on that surface.
6. Conclusions
In this paper, we propose a new display system, namely “Display in the Wild”, to overcome the limitations of conventional displays. To implement DIW, we suggest both hardware and software configurations. For the hardware of DIW, we select a projector and a depth camera in order to generate a screen of variable size anywhere and to correct the geometric distortion occurring on dynamic non-planar surfaces in real time. For the software configuration, we propose a DIW-specific calibration between the projector and depth camera. In addition, we propose a fast correction method that corrects the geometric distortions on dynamic non-planar surfaces. Our experimental results show that the proposed DIW system generates a nearly undistorted screen on a dynamic non-planar surface in real time, demonstrating its usefulness in practical scenarios (e.g., a lightweight screen carried by UAVs) and under extremely varying surface conditions. Since the proposed method can be decomposed into per-pixel processing, as future work we plan to extend our DIW system to ultra-high-definition (UHD) projector applications by implementing parallel processing on GPUs.