J.-B. Chun et al.: Suppressing Rolling-Shutter Distortion of CMOS Image Sensors by Motion Vector Detection
Contributed Paper
Manuscript received August 22, 2008 0098 3063/08/$20.00 © 2008 IEEE
Suppressing Rolling-Shutter Distortion of CMOS Image Sensors
by Motion Vector Detection
Jung-Bum Chun, Hunjoon Jung, and Chong-Min Kyung, Senior Member, IEEE
Abstract - This paper addresses the rolling-shutter
distortion of CMOS image sensors, which arises from their
row-sequential readout mechanism and is a main cause of
image degradation when objects move quickly. We propose a
post-processing scheme based on motion vector detection to
suppress the rolling-shutter distortion. Motion vectors are
detected with an optical flow method at a reasonable
computational complexity. A practical implementation
scheme is also described.
Index Terms - CMOS image sensor, rolling-shutter distortion,
post-processing technique
I. INTRODUCTION
Since solid-state image sensors replaced film in the consumer
electronics market, more and more digital gadgets such as
cellular phones and PDAs have been equipped with a digital
camera function. An image sensor, which converts photons
into electrons via the photoelectric effect, consists of
photodiodes or phototransistors as the light-sensing area and
peripheral circuitry to control the signal readout. Image
sensors are classified into CCD image sensors and CMOS
image sensors (CIS). While a CCD can store charges like a
memory and transfer them by controlling gate voltages [1],
the MOSFETs in a CIS can only transfer charges from the
photodiodes to the readout circuitry without storing them.
A CCD is fabricated in a dedicated process and is generally
known to deliver better image quality than a CIS. A major
drawback of the CCD, however, is its process incompatibility
with CMOS, which makes it difficult to implement peripherals
such as the timing generator and analog-to-digital converter
on the same die as the image sensor. A CIS, on the other hand,
can be built on the same chip as its peripherals thanks to the
process compatibility, which gives the CIS an economical
advantage over the CCD.
Another important difference between CMOS and CCD lies
in the signal readout mechanism. While all photodiodes of
CCD are exposed to a scene simultaneously to obtain signals
corresponding to an image frame, each CIS row, being
sequentially accessed, is given a different exposure time
window as shown in Fig. 1 (a). We call the readout
mechanism of CIS rolling shutter (RS) mechanism and that of
CCD synchronous shutter (SS) mechanism.
Jung-Bum Chun is with the Electrical Engineering Department, Korea
Advanced Institute of Science and Technology, Daejeon, 305-701, Korea (e-
mail: jbchun@vslab.kaist.ac.kr).
Hunjoon Jung is with Clairpixel Co., Ltd., Seoul, 153-803, Korea (e-mail:
henry@clairpixel.com).
Chong-Min Kyung is with the Electrical Engineering Department, Korea
Advanced Institute of Science and Technology, Daejeon, 305-701, Korea (e-
mail: kyung@ee.kaist.ac.kr).
The RS mechanism does not cause any problem as long as
the object and the camera are stationary with respect to each
other. If either one moves relative to the other, the RS
mechanism produces distorted images as shown in Fig. 1 (b).
The three images on the left were taken while the panel was
stationary, whereas the images on the right were taken while
the panel was rotating clockwise. The distortion patterns
differ according to the direction of motion of the objects.
Fig.1 Rolling shutter mechanism. (a) The integration time (ΔI) of each row,
shown as a shaded region, slides downwards as each row is sequentially
accessed to form an image. D and b denote the timing delay between the
first and last rows and the blank time between two consecutive frames,
respectively. ΔI_M denotes the integration time when a mechanical shutter
is employed. (b) The three images on the left are taken when the panel is
stationary, while the others, showing different distortion patterns according
to the direction of motion, are taken when the panel rotates.
This paper proposes a post-processing scheme to reduce the
image distortion caused by the RS mechanism. Previous
works on the RS mechanism and known alternatives are
described in section II. A mathematical analysis of the RS
mechanism is given in section III. In section IV, the
implementation scheme for the proposed algorithm is
described. Experimental results are given in section V.
IEEE Transactions on Consumer Electronics, Vol. 54, No. 4, NOVEMBER 2008
II. PREVIOUS WORKS AND OTHER ALTERNATIVES
The RS distortion can be removed completely by using a
mechanical shutter. With a mechanical shutter, all the
photodiodes of the CIS are exposed to light during the
same time interval, denoted as ΔI_M in Fig. 1 (a),
regardless of the readout mechanism. The downsides of a
mechanical shutter are the reduced integration time
(ΔI → ΔI_M) and the additional size and cost of the
mechanical parts. Since the primary applications of CIS
are portable devices such as cellular phones and PDAs, in
which imaging is a subordinate function, such overheads
are not always justified.
On the other hand, the RS distortion can be reduced by
raising the readout speed. In Fig. 1 (a), ΔI denotes the
exposure time and D denotes the maximum access-time
difference between rows. D can be reduced by speeding up
the readout while ΔI remains unchanged. However, raising
the readout frequency becomes more difficult and consumes
more power as the number of pixels increases.
El Gamal et al. [2] described a novel CIS architecture in
which each pixel incorporates an analog-to-digital converter
to digitize the signal and a latch to store the digitized value.
This architecture enables a CIS to operate in the same way
as a CCD does. However, it is economically impractical due
to its poor fill factor and poor sensitivity.
Geyer et al. [3] proposed a new camera projection model
for cameras with the RS mechanism to mitigate the
reduction in accuracy, and described a framework for
analyzing structure-from-motion problems in RS cameras.
By parameterizing the velocity of the camera coordinate
frame, the RS effect is applied to the traditional pinhole
camera model [4] to derive the projection matrix.
Ait-Aider et al. [5] proposed a technique for pose
recovery and 3-D velocity computation by taking the RS
effect into account. They took advantage of image
deformation induced by the RS mechanism and computed
3-D poses and velocity based on rigid sets of 3-D points.
Liang et al. [6] made the first attempt to correct the RS
distortion based on motion vector detection. They found the
global motion vector by the block matching and voting
method [7]. Their block matching technique is similar to that
of MPEG-4, in which motion vector accuracy must be
sacrificed to reduce computation time. The smoothing
operation then needed to compensate for the inaccurate
motion vectors imposes an excessive computational load,
which is not acceptable in mobile devices, the major
applications of CIS.
In this paper, we utilized an optical flow method, which
usually produces more accurate motion vectors than block
matching. To reduce computation, we set center-oriented
subwindows, applied the optical flow algorithm to each, and
produced one motion vector from those outputs. For low-
power operation, motion vector detection is performed only
when an image capture takes place.
III. ROLLING SHUTTER ANALYSIS
An image sensor array with the RS mechanism is defined
in Fig. 2. The sensor has a W x H pixel array and can generate
up to F_M frames per second. We define Cartesian coordinates
whose origin is located at the top-left corner of the sensor
array. In addition, we refer to an M x N array (in units of
pixels) located at the center as the effective area, whose top-
left corner is located at (x1, y1). The effective area
corresponds to the actual image output of the sensor, while
the rest, called the margin area, is utilized by auxiliary
routines such as black-level compensation, color interpolation
and so on.
Now consider a situation where an image is taken by the
image sensor when there is a rectilinearly moving object with
velocity v. We assume that the integration time ΔI is so small
that we can ignore any motion blur in the image, and that the
motion occurs globally throughout the sensing area. We also
assume that the camera is fixed and the scene moves with a
relative velocity. It is then possible to find the distortion
patterns of the effective area with respect to the individual
components of the velocity vector v.
A. Horizontal (x-axis) Motion
A CIS is controlled row by row, and all the pixels belonging
to a row have the same exposure timing. Consider only a
horizontal motion with motion vector (vx, 0), where the
moving object is the effective area itself. If the sensor starts
reading out the pixel array from the first row after an
integration time ΔI, then it takes τH to read out the whole
array, where H is the number of rows in the array and τ is
the time spent reading out a single row.
Fig.2 Definitions of related parameters and axes; an image sensor array
consists of effective area for actual image output and margin area for
other image processing purposes
τ is either a known parameter or can be approximated
from other given parameters. When the clock frequency f is
given, τ = W/f, since reading a row requires W clock cycles.
If the frame rate F_M is given instead of f, the period of a
single image frame is 1/F_M - b, where b is the blank time
between image frames as shown in Fig. 1 (a). By equating
τH with 1/F_M - b, τ can be obtained as

\tau = \frac{W}{f} = \frac{1 - b F_M}{H F_M} \quad (1)
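As a sketch, Eq. (1) can be evaluated as follows (a hypothetical helper with our own parameter names; either the clock frequency f or the frame rate F_M may be supplied):

```python
def line_readout_time(W=None, f=None, H=None, F_M=None, b=0.0):
    """Line readout time tau, per Eq. (1): tau = W/f = (1 - b*F_M)/(H*F_M)."""
    if W is not None and f is not None:
        return W / f                         # reading one row takes W clock cycles
    if H is not None and F_M is not None:
        return (1.0 - b * F_M) / (H * F_M)   # from tau * H = 1/F_M - b
    raise ValueError("need either (W, f) or (H, F_M)")
```

For example, a sensor with H = 480 rows at F_M = 30 frames/s and zero blanking gives tau = 1/14400 s per row.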
In Fig. 3, the y-coordinate of the k-th row of the effective
area is y_1 + k - 1, and the time spent by the rolling shutter
until it reaches the row is (y_1 + k - 1)\tau. Since the row also
moves with velocity v_x, multiplying (y_1 + k - 1)\tau by v_x
yields d_{x,k}, the displacement of the k-th row in the x
direction:

d_{x,k} = v_x \tau (y_1 + k - 1) \quad (2)
Fig.3 Distortion in a horizontal motion; a rectangular object is distorted
into a parallelogram due to the RS distortion
Since d_{x,k} is the sum of a constant and a component
proportional to k, the rectangular area is distorted into a
parallelogram, where the so-called skew angle \theta formed
by the y-axis and a side of the parallelogram is a good
measure of the degree of distortion. The maximum horizontal
skew, d_{x,\max}, is defined as the difference between d_{x,k}
of the first (k = 1) and the last (k = N) rows:

d_{x,\max} = d_{x,N} - d_{x,1}
           = v_x \tau (y_1 + N - 1) - v_x \tau y_1
           = v_x \tau (N - 1) \quad (3)

\theta is thus given by

\theta = \tan^{-1} \frac{d_{x,\max}}{N - 1} = \tan^{-1}(v_x \tau) \quad (4)
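Eqs. (2)-(4) can be sketched as follows (a hypothetical helper with our own names; v_x is in pixels per second, tau in seconds per row):

```python
import math

def horizontal_distortion(v_x, tau, N, y1=1):
    """Horizontal RS distortion: per-row displacement d_{x,k} (Eq. 2),
    maximum skew d_{x,max} (Eq. 3) and skew angle theta (Eq. 4)."""
    d_x = [v_x * tau * (y1 + k - 1) for k in range(1, N + 1)]  # Eq. (2)
    d_x_max = d_x[-1] - d_x[0]           # Eq. (3): v_x * tau * (N - 1)
    theta = math.atan2(d_x_max, N - 1)   # Eq. (4): atan(v_x * tau)
    return d_x, d_x_max, theta
```

For v_x = 1000 px/s, tau = 100 us and N = 480 rows, the maximum skew is 0.1 x 479 = 47.9 pixels and theta = atan(0.1).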
B. Vertical (y-axis) Motion
Let us consider a vertical motion of the effective area with
velocity (0, v_y). To understand the vertical distortion, we
define the scan velocity v_{scan}, the number of rows read
out per second, as the inverse of \tau:

v_{scan} = \frac{1}{\tau} \quad (5)
The displacement of the k-th row of the effective area in
the vertical direction is denoted by d_{y,k}. When a capture
starts, the rolling shutter starts its readout at velocity
v_{scan}, and the k-th row of the object starts its downward
motion at velocity v_y, the number of rows traversed by the
moving object per second. The time elapsed until the rolling
shutter meets the k-th row of the effective area can be written
either as (y_1 + k - 1 + d_{y,k}) / v_{scan} or as
d_{y,k} / v_y. By equating the two expressions, d_{y,k} can
be obtained as

d_{y,k} = \frac{v_y (y_1 + k - 1)}{v_{scan} - v_y}
        = \frac{v_y \tau (y_1 + k - 1)}{1 - v_y \tau} \quad (6)
The maximum vertical stretch, denoted by d_{y,\max}, is
defined as the difference between d_{y,1} and d_{y,N}:

d_{y,\max} = d_{y,N} - d_{y,1}
           = \frac{v_y \tau (N - 1)}{1 - v_y \tau} \quad (7)
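The vertical displacement and stretch of Eqs. (6) and (7) can be sketched in the same style (our own function and parameter names; v_y is in rows per second and must satisfy v_y * tau < 1, i.e. the object moves fewer rows per second than the scan):

```python
def vertical_distortion(v_y, tau, N, y1=1):
    """Vertical RS distortion: per-row displacement d_{y,k} (Eq. 6)
    and maximum stretch d_{y,max} (Eq. 7)."""
    if v_y * tau >= 1.0:
        raise ValueError("object outruns the rolling shutter")
    d_y = [v_y * tau * (y1 + k - 1) / (1.0 - v_y * tau)
           for k in range(1, N + 1)]                 # Eq. (6)
    d_y_max = d_y[-1] - d_y[0]  # Eq. (7): v_y*tau*(N-1) / (1 - v_y*tau)
    return d_y, d_y_max
```

As the text notes, the sign of d_y_max follows the sign of v_y: positive v_y stretches the object, negative v_y shrinks it.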
Fig.4 Distortion in a vertical motion. The dotted rectangle denotes the
effective area, while the shaded rectangles denote how it appears when
v_y ≠ 0. (a) The original image appears undistorted when v_y = 0.
(b) Images in the RS system are vertically shrunk (top) when v_y < 0,
and vertically expanded (bottom) when v_y > 0.
If 1 - v_y\tau > 0, the sign of d_{y,\max} is given by that of
v_y. Vertical motion in the RS mechanism causes vertical
distortion as shown in Fig. 4: when v_y is positive, the object
is stretched by d_{y,\max}, and when v_y is negative, it is
shrunk by |d_{y,\max}|.
C. Motion Vector Composition
For general motions with nonzero values of v_x and v_y, the
distortion follows from the results of the previous sections;
Eqs. (2), (3), (6) and (7) remain valid under the same
definitions. However, the skew angle needs to be rewritten as

\theta = \tan^{-1} \frac{d_{x,\max}}{N + d_{y,\max}} \quad (8)
The analysis up to now can explain the distortion patterns in
Fig. 1. Because a vertical motion is dominant in the top and
the middle case, the object is stretched or shrunk. In the
bottom case where a horizontal motion is dominant, the
object is skewed.
When we consider only rectilinear and global motions, the
RS distortion can be represented by an affine transformation
from non-distortion space (x, y, 1) to distortion space (x’, y’,
1) as shown in Fig.5. The image in the non-distortion space
can be regarded as the result of the SS system. We can
represent the transformation as
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{12} & a_{13} \\
                  a_{21} & a_{22} & a_{23} \\
                  0 & 0 & 1 \end{pmatrix}
  \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
= A \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \quad (9)
Fig.5 Distortion by rolling-shutter imager can be represented by an
affine transformation
The same image as would be obtained by the SS system
can also be obtained in the RS system by ‘undoing’ the
distortion through the inverse transformation if the
transformation matrix A can be found.
To find matrix A in Eq. (9), instead of substituting
matching points between two spaces and solving
simultaneous equations with respect to aij, we took
advantage of well-known properties of the affine
transformation. In Eq. (9), a11 and a22 reflect scaling
factors with respect to x- and y-axis, respectively. Since
no scaling takes place in the direction of the x-axis, a11 is
unity. The vertical scaling factor a22 is given by
(N + d_{y,max}) / N, since the height of the sample is changed
from N to N + d_{y,max}. On the other hand, a12 reflects a
shearing factor with respect to the x-axis, given by
tan θ = d_{x,max} / (N + d_{y,max}), whereas a21 is zero
since there is no shearing effect in the direction of the y-axis.
Thus A is given by

A = \begin{pmatrix}
      1 & \dfrac{d_{x,\max}}{N + d_{y,\max}} & d_{x,1} \\
      0 & \dfrac{N + d_{y,\max}}{N} & d_{y,1} \\
      0 & 0 & 1
    \end{pmatrix} \quad (10)
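As a sketch of how Eq. (10) might be applied, the following hypothetical helpers (our own names; the explicit inverse is ours) build A from the distortion parameters, with unit x-scaling, vertical scaling (N + d_{y,max})/N, shear d_{x,max}/(N + d_{y,max}), and the row offsets d_{x,1}, d_{y,1} as translation entries, then undo the distortion for one point by inverting the affine map directly:

```python
def build_affine(d_x_max, d_y_max, d_x_1, d_y_1, N):
    """Affine matrix A mapping undistorted (x, y, 1) to distorted (x', y', 1)."""
    a12 = d_x_max / (N + d_y_max)    # shear along x
    a22 = (N + d_y_max) / N          # vertical scaling
    return [[1.0, a12, d_x_1],
            [0.0, a22, d_y_1],
            [0.0, 0.0, 1.0]]

def undistort_point(A, xp, yp):
    """Recover the undistorted (x, y) from a distorted (x', y')."""
    y = (yp - A[1][2]) / A[1][1]     # invert y' = a22*y + a23
    x = xp - A[0][1] * y - A[0][2]   # invert x' = x + a12*y + a13
    return x, y
```

Because A is affine with a zero a21, the inverse can be written in closed form instead of calling a general matrix-inversion routine, which suits a low-power implementation.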
The variables can be evaluated by Eq. (3) and (7) along with
known sensor parameters such as N and τ if motion vector
(vx, vy) is acquired.
IV. IMPLEMENTATION SCHEME TO REDUCE ROLLING-
SHUTTER DISTORTIONS
To reduce the complexity of the implementation, three
assumptions were made, shown below along with the
rationale for each.
- Motions are rectilinear: any general motion can be
approximated by a combination of rectilinear ones over a
sufficiently short period of time.
- Motion blur is below a certain level: motion blur, which
accompanies any motion, is affected by exposure time and
motion velocity. For a short exposure time, motion blur can
be assumed to be negligible.
- There is only a global motion in an image: distortions due
to partial motions are less apparent than those due to a
global motion, so partial motions are ignored.
Under these assumptions, we propose a post-processing
routine as shown in Fig. 6. The routine receives an RGB
image as input and generates an RGB output image. As a
whole, the routine consists of a motion detection stage and a
transformation stage, and operates in either preview mode or
capture mode. In the preview mode, the sensor generates
low-resolution video streams before a capture takes place. In
the capture mode, high-resolution target image data are
generated by the sensor after a capture signal is issued.
Motion vectors, extracted from consecutive images by the
motion detection stage in the preview mode, are used in the
transformation stage to adjust the image. To reduce power
consumption, only two image frames after each capture
signal are used in the motion detection, which is
accomplished by delaying the mode switching until two
preview images are saved after the capture.
A. Motion Vector Detection Stage
Motion vector detection is an important topic in image
processing and computer vision. Applications such as video
compression, video mosaicing and video surveillance are
among its major uses, and they employ block matching
techniques or optical flow methods for motion vector
detection. On the other hand, image stabilization techniques
[8], [9] are used in video cameras to compensate for image
deterioration due to unintended shaking of the user's hands.
Their motion vector detection and image stabilization steps
correspond to the first and second stages of our approach
shown in Fig. 6. To find global motion vectors, Oshima et al.
[8] utilize a specialized gyro sensor, while Kinugasa et al. [9]
derive motion vectors from consecutive images by projecting
the images onto the x- and y-axes and comparing the current
projection with the previous one. [9] is similar to our
approach in utilizing consecutive images, but its one-
dimensional motion vector detection is less accurate than
typical two-dimensional techniques.
Most known methods for finding motion vectors are based
on consecutive images. The most straightforward way to find
a motion vector from two consecutive images is to perform
an exhaustive search over all possible relative motions
between the two frames, choosing the one for which the sum
of errors between the two images is minimal. The
computational complexity becomes O(M²N²) when the size
of the image is M x N.
Fig.6 Block diagram of the proposed system
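The exhaustive search described above can be illustrated with a toy pure-Python sketch (names are our own; it scores every candidate integer shift, which is what makes the cost O(M²N²)):

```python
def exhaustive_motion(F, G, max_shift):
    """Brute-force shift estimate: return the integer shift (hx, hy)
    minimizing the mean squared error between F shifted by h and G,
    evaluated over the overlapping region."""
    rows, cols = len(F), len(F[0])
    best, best_h = float("inf"), (0, 0)
    for hy in range(-max_shift, max_shift + 1):
        for hx in range(-max_shift, max_shift + 1):
            err = n = 0
            for y in range(rows):
                for x in range(cols):
                    ys, xs = y + hy, x + hx
                    if 0 <= ys < rows and 0 <= xs < cols:
                        err += (F[ys][xs] - G[y][x]) ** 2
                        n += 1
            if n and err / n < best:
                best, best_h = err / n, (hx, hy)
    return best_h
```

Normalizing by the overlap size n (rather than taking the raw sum of the text) keeps large shifts with small overlaps from being trivially favored; that choice is ours.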
Various methods to reduce the complexity have been
published [10]-[13]. The most popular one is the Lucas-
Kanade (LK) algorithm [12], on which our motion detection
method is based. In the LK algorithm, for two images F(X)
and G(X) with X = (x, y), the error function E(h) is defined as

E(h) = \sum_X [F(X + h) - G(X)]^2 \quad (11)

where h is the 2-D shift.
Motion vector detection between F and G amounts to finding
the h that minimizes the error function. Using the linear
approximation F(X + h) \approx F(X) + h F'(X), E(h) is
differentiated with respect to h and set to zero:

\frac{\partial E}{\partial h}
\approx \frac{\partial}{\partial h}
        \sum_X [F(X) + h F'(X) - G(X)]^2
= \sum_X 2 F'(X) [F(X) + h F'(X) - G(X)] = 0 \quad (12)
Eq. (12) can be solved with respect to h. F(X) is then
shifted by h and the same procedure is iterated until the
update to h falls below a target value. In this way, the
complexity is reduced to O(MN log MN). Because the
proposed routine is expected to run on mobile devices,
whose computing power is lower than that of stationary
systems, the complexity at the LK stage of the proposed
routine needs to be reduced further.
In Fig. 6, the RGB2Gray block, the first stage of motion
vector detection