Postprint of: Kłosowski M., Jendernalik W., Jakusz J., Blakiewicz G., Szczepański S., A CMOS Pixel With Embedded ADC, Digital CDS and Gain Correction Capability for Massively Parallel Imaging Array, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, Vol. 64, Iss. 1 (2017), pp. 38-49, DOI: 10.1109/TCSI.2016.2610524

© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

# A CMOS Pixel with Embedded ADC, Digital CDS and Gain Correction Capability for Massively Parallel Imaging Array

M. Kłosowski, W. Jendernalik, J. Jakusz, G. Blakiewicz, and S. Szczepański

Abstract-In this paper, a CMOS pixel is proposed for imaging arrays with massively parallel image acquisition and simultaneous compensation of dark signal nonuniformity (DSNU) and photoresponse nonuniformity (PRNU). The pixel contains all necessary functional blocks, a photosensor and an analog-to-digital converter (ADC) with built-in correlated double sampling (CDS), integrated together. It is implemented in a standard 0.18 $\mu$m CMOS technology. The size of the pixel with 9-bit resolution is $21~\mu\text{m} \times 21~\mu\text{m}$. Measurements of the 128 $\times$ 128 pixel array confirm the functionality of the proposed solution. CDS reduces dark FPN from 12 LSB (3%) to 0.8 LSB (0.2%) and light FPN from 14 LSB (3.7%) to 7 LSB (1.8%). Further reduction of the light FPN (to ~1 LSB) was achieved by compensating PRNU using an innovative, massively parallel digital multiplication that features good resolution (1/511), does not disturb the simultaneously executed CDS, and can be implemented within a small pixel area.
Index Terms—CMOS pixel, digital pixel, digital image sensor, digital pixel sensor (DPS), correlated double sampling (CDS), fixed pattern noise (FPN), DSNU, PRNU.

# I. INTRODUCTION

Massively parallel CMOS image sensors (CISs) with analog-to-digital converters (ADCs) implemented at the pixel level are extensions of conventional CISs with column-wise ADCs. Such CISs capture the image frame by means of a global shutter and convert the video signal into digital form in a massively parallel manner. This approach exhibits several advantages over other types of sensors. First, owing to parallel analog-to-digital (A/D) conversion, it is easier to obtain faster acquisition with lower power consumption (compared to column-parallel ADCs [1], [2]). Second, implementing the ADC in the pixels permits maximization of the signal-to-noise ratio (SNR) of the image sensor [3], [4]. Another benefit of image sensors with parallel image acquisition is the possibility of constructing more advanced sensors in 3D-IC technology [5]–[7].

This work was supported in part by the Polish National Science Centre under grant no. 2011/03/B/ST7/03547. Authors are with the Department of Microelectronic Systems, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Poland (e-mail: waljende@pg.gda.pl).

Despite their merits, CISs with massively parallel A/D conversion are characterized by relatively large pixel sizes and a small fill factor. Both result from the need to implement an ADC that contains components such as an analog comparator, digital memory, counters, and combinatorial logic. As a consequence, it is difficult to implement the ADC within a small area of less than $20~\mu\text{m} \times 20~\mu\text{m}$. Although the size of the digital circuitry can be reduced considerably in small-scale CMOS technologies, such technologies are not suitable for implementing high-quality light-to-voltage converters (LVCs) due to excessive leakage currents.
The majority of the solutions proposed in the literature were implemented in technologies not smaller than 130 nm [2], [8]–[15].

Important components of a CIS are circuits for reducing fixed pattern noise (FPN) in the image. The noise is a result of the imaging array nonuniformity, which can be of two types: dark signal nonuniformity (DSNU) and photo-response nonuniformity (PRNU). Because, from the point of view of pixel performance parameters, both nonuniformity types originate from the spread of the offsets and slopes of the photo-electric characteristics of individual pixels [16], they can be compensated using correlated double sampling (CDS) and gain correction (GC). In a conventional column-parallel CIS, CDS circuits are situated outside the pixel array, typically together with the column-parallel ADC [1], [17]. If completely parallel processing is required, the ADC, CDS, and GC have to be implemented in the pixels, which conflicts with keeping the pixels small. The literature reports analog realizations of CDS in the pixels [18]–[21]; however, implementing a small-size GC is difficult because PRNU compensation requires high GC precision.

Fig. 1. Digital pixel with CDS. (a) Circuit diagram. (b) Portion of a 9-bit reversible LFSR counter.

In the majority of prototype CIS realizations with massively parallel A/D conversion presented in the literature, the CDS and GC functionalities were not implemented in the pixels [2], [8]–[11]. Instead, only non-correlated double sampling (DS) was implemented in the pixels [8], [9], whereas CDS was realized outside the imaging array [2]. Such an approach was used merely for the sake of verifying novel signal processing concepts [8], testing new architectures [11], or demonstrating particular capabilities of digital CISs [2], [9]. On the other hand, DSNU and PRNU compensation is not always necessary to ensure proper sensor operation [10]; in some cases, a simple compensation using DS turns out to be sufficient [8].
In this paper, a new CMOS pixel with a built-in ADC and digital CDS with GC functionality is presented. A high-precision GC is realized by an innovative digital multiplication technique that permits effective in-pixel PRNU compensation. The digital CDS is completely integrated in the pixel. Although a similar CDS was used previously [17] in column-parallel processing, our CDS solution is area-efficient enough to be integrated in pixel-parallel processing. The pixel with a 9-bit resolution ADC and CDS has a size of $21~\mu\text{m} \times 21~\mu\text{m}$ in a standard 0.18 $\mu\text{m}$ technology. The performance of DSNU compensation by the in-pixel CDS has been validated by measuring a 128 × 128 pixel imaging array. Implementing GC does not require a significant increase of the pixel area and does not interfere with the simultaneously executed CDS. The pixel with CDS and the additional GC functionality (with a resolution of 1/511) has a size of $21~\mu\text{m} \times 36~\mu\text{m}$. Because of the cost, operation of this version of the pixel has been verified using a $1 \times 128$ pixel array, which is sufficient to demonstrate the efficiency of the proposed PRNU compensation method.

# II. CMOS PIXEL WITH ADC & CDS

## A. Principle of Operation

The circuit diagram of the pixel is shown in Fig. 1(a). The pixel contains an LVC with a MOS photogate (PG) photosensor and a single-slope ADC that also realizes CDS. The ADC contains an analog comparator (CMP), a 9-bit reversible counter (Fig. 1(b) and Fig. 2), and the control logic. The LVC sequentially generates two signal samples on the sense node, a reset-signal and a photo-signal, that have to be subtracted according to the CDS operation. Digital subtraction is realized in the ADC by changing the counting direction [17]. Fig. 3 shows the timing of the pixel operation. The process can be described as follows.
Initially, A/D conversion of the reset-signal is executed by counting (downwards) $\Phi_{\text{down}}$ clock pulses until the RAMP signal becomes equal to the reset sample value. The photo-signal conversion is realized by upward counting of $\Phi_{\text{up}}$ pulses until a similar equality is reached (here, between the RAMP signal and the photo sample value). When the counting process is finished, the counter state corresponds to the difference between both samples and it can be read at the $PIX_{out}$ output.

Fig. 2. Dynamic register applied in the reversible counter.

Subtracting two consecutive samples is equivalent to first-order high-pass filtering, which suppresses signals of frequencies lower than $(1/\Delta T_{\text{sample}})/6$, where $\Delta T_{\text{sample}}$ is the period of sample acquisition. In the considered pixel, $\Delta T_{\text{sample}}$ is strictly related to the image acquisition rate, as shown in Fig. 3. For acquisition speeds up to 700 fps, $\Delta T_{\text{sample}}$ is typically 700 $\mu$s, which permits, in the ideal case, reduction of interference coming from the LVC and the comparator in the frequency band from 0 Hz to about 238 Hz. For higher rates, $\Delta T_{\text{sample}}$ is reduced, e.g., for 3500 fps it is 140 $\mu$s. In practice, the filtering efficiency of CDS is limited by how strongly the interference is correlated between consecutive signal samples. This correlation depends on many factors such as the LVC type and how it is controlled, operating conditions (e.g., temperature), technology, and parasitic effects. For example, an increase of the temperature or of the leakage current reduces the correlation between the samples, thus degrading CDS efficiency.

Fig. 3. Timing diagram for A/D conversion with CDS.

## B. Light-to-Voltage Converter (LVC)

In prototype digital CIS realizations, a photodiode in an integrating mode is often utilized as a photosensor [8]-[11], [15].
This is due to its simple construction and the fact that a relatively good trade-off between the area, sensitivity, and image quality can be achieved even in a cheap standard CMOS technology. However, in such a sensor, the integration phase is not separated from the reset phase, which considerably complicates realization of "true" CDS. In the proposed pixel (Fig. 1), an LVC with a MOS photogate has been utilized, which has a more complex construction yet is suitable for realizing true CDS. Based on initial experiments, an optimum way of controlling the LVC has been determined that gives the best correlation between the samples. The resulting parameters of the control signals of the photogate (PG), transfer gate (TG), and reset transistor (RSTA) are reported in Section IV.

## C. Reversible Counter

In the proposed pixel, A/D conversion and CDS operate independently of the counter coding. It can be standard binary coding, Gray coding, or pseudo-random coding. Because the counter occupies the largest portion of the pixel area, the main criterion when selecting the coding was the possibility of minimizing the counter area. In the prototype pixel, a 9-bit synchronous counter with pseudo-random LFSR coding [15] has been applied because such a counter has a small number of components (9 registers and two XOR gates), allows easy reversal of the counting direction, and has a simple reset operation. A partial schematic diagram of the counter is shown in Fig. 1(b). Although pseudo-random code is inconvenient in image processing, it is not a problem when only image acquisition is of concern. The counter registers have been realized as dynamic circuits controlled by a two-phase clock (Fig. 2). In order to further reduce the counter area, only the first register (denoted as LATCH2PR in Fig. 1(b)) has been equipped with the reset input.
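The counting scheme can be illustrated with a short Python model of a reversible 9-bit LFSR counter performing CDS by direction reversal. This is only a sketch under stated assumptions: the feedback taps (bits 8 and 4, i.e., the recurrence of the primitive polynomial $x^9 + x^4 + 1$) are an assumption, since the text does not specify them, and the pulse counts 55 and 420 are hypothetical values chosen for illustration. A maximal-length 9-bit LFSR visits $2^9 - 1 = 511$ nonzero states, so the raw state has to be decoded outside the pixel with a lookup table.

```python
MASK = 0x1FF      # 9-bit register
PERIOD = 511      # 2**9 - 1 states of a maximal-length LFSR


def step_up(state: int) -> int:
    """Advance the LFSR by one state (count up)."""
    fb = ((state >> 8) ^ (state >> 4)) & 1      # assumed taps: bits 8 and 4
    return ((state << 1) | fb) & MASK


def step_down(state: int) -> int:
    """Exact inverse of step_up (count down)."""
    msb = (state ^ (state >> 5)) & 1            # recover the bit shifted out by step_up
    return (state >> 1) | (msb << 8)


def build_decoder(start: int = MASK) -> dict:
    """Map every LFSR state to its distance (in up-steps) from the start state."""
    table = {}
    state = start
    for index in range(PERIOD):
        table[state] = index
        state = step_up(state)
    return table


def cds_convert(n_reset: int, n_photo: int) -> int:
    """Count down during the reset-signal ramp, then up during the photo-signal ramp."""
    state = MASK                                # counter reset to all ones
    for _ in range(n_reset):
        state = step_down(state)
    for _ in range(n_photo):
        state = step_up(state)
    return state


DECODE = build_decoder()
raw = cds_convert(n_reset=55, n_photo=420)      # hypothetical pulse counts
print(DECODE[raw])                              # 365 = 420 - 55
```

The subtraction falls out of the cyclic structure of the state sequence alone, which is why the same scheme would work for binary or Gray coding as well.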
Resetting of the entire counter to the initial state of 111111111 is realized algorithmically by propagating the logical '1' along the counter. The counter has 511 states because the state 000000000 is not present. The registers use a 1.2 V supply, whereas all control signals ($\Phi_2$, $\Phi_{up}$, $\Phi_{down}$, *STAT*, and *RST*) are produced in the 1.8 V domain, which prevents the pass transistors from degrading the logical '1'.

## D. Detailed Description of Timing Diagram

According to Fig. 3, conversion of the first sample (reset-signal) from the LVC begins with the global *RSTA* impulse, which resets the capacitances of the sense nodes in all pixels. Subsequently, a D-latch at the comparator output is switched to the 'count' state and downward counting begins according to the global two-phase clock $\Phi_{1\text{down}}$ and $\Phi_2$. When the counting is finished, the counter state corresponds to the value of the first CDS sample. Immediately after conversion of the reset-signal sample is complete, a TG pulse appears in order to transfer the charge accumulated in the photogate capacitance to the capacitor in the sense node. A/D conversion of the second sample (photo-signal) then starts. Upward impulse counting is governed by the two-phase clock $\Phi_{1\text{up}}$ and $\Phi_2$. Upon completing the second sample conversion, the counter state corresponds to the final value of the pixel ($PIX_{\text{out}}$) that takes CDS into account. An additional global signal *CNT\_EN* is deactivated upon generating the last counting impulse (independently of the counting direction) and stops those counters that are still in the 'count' state (e.g., due to pixel overdriving). The same signal allows for blocking the counter independently of the state signal *CMP* from the comparator. Additional feedback activated by the *STAT* signal permits static operation of the LATCH2P registers after stopping the counter.
This, in turn, allows for reading out the value of each pixel after A/D conversion. After the readout, all counters are reset by setting the first bit to '1' and applying the $\Phi_{1\text{up}}$ and $\Phi_2$ phases nine times in order to introduce '1' to all counter registers. This is a global operation. During this operation, the *CNT\_FORCE* signal is activated, which allows for changing the register states independently of other signals. A D-latch at the comparator output also permits synchronization of the *CMP* signal with the $\Phi_2$ clock, which allows for stopping the counter without risk of metastability.

## E. Analog Comparator

The schematic diagram of the comparator is shown in Fig. 4. The comparator uses over 90 percent of the overall power consumed by the pixel; therefore, power-down is realized using the already existing signal buses BIAS or RAMP. To switch off the comparator, the high state of 1.8 V is set on RAMP. Because the LVC voltage is always lower than 1.6 V, this cuts off transistors M1, M4 and M5, leading to $I_{bias} = 0$. At the same time, the high state at the M5 drain cuts off transistors M6 and M7. The comparator speed depends on $I_{bias}$. For $I_{bias}$ = 70 nA, over 3000 A/D conversions (with CDS) per second have been obtained. Besides the power consumption and speed, features such as offset and noise are also important. These comparator nonidealities degrade ADC performance. An example of the relevant analysis can be found in [25].

Fig. 4. Analog comparator.

## F. Pixel Layout

The pixel layout (Fig. 5) has been designed in the 0.18 $\mu$m austriamicrosystems 1P6M technology. In the LVC, transistors with thin oxide with reduced leakage have been used. The remaining circuits have been implemented using transistors with standard leakage because such devices permit a denser layout arrangement. The photogate region size is $5~\mu\text{m} \times 5~\mu\text{m}$, which (due to the lack of microlenses) results in a small fill factor of 5.5%.
The signal paths (16 digital and 5 analog) as well as the supply paths pass through the pixel.

Fig. 5. Layout of the pixel with digital CDS. The pixel size is $21~\mu\text{m} \times 21~\mu\text{m}$.

# III. GAIN CORRECTION

## A. Method

PRNU of the imaging array can be modeled by means of a distribution of the LVC gain in the pixel array [16]. Assuming that the LVC photoconversion characteristic is linear, gain correction (GC) of the LVC can be realized by multiplying the pixel value $PIX_{out}$ by a constant between 0 and 1. The GC implementation has to satisfy two fundamental conditions: (i) it has to work independently of the counter coding; (ii) it must not interfere with CDS. The GC method selected here satisfies both conditions and works by blocking the clock impulses $\Phi_{1up}$ and $\Phi_{1down}$ counted by the reversible counter. The concept of the GC approach is illustrated in Fig. 6. In the pixel array without GC (Fig. 6(a)), the signal that blocks the counting process (CNT\_EN\_GLOBAL) is common to all pixels, so it is not possible to independently block the counters in individual pixels. In the pixel array with GC (Fig. 6(b)), the blocking signal (CNT\_EN\_LOCAL) is generated independently for each pixel. The multiplication coefficients (COEFF) are stored in the GC circuit memories using binary coding (not LFSR). These coefficients are written once, before the pixel array is started. The coefficients are written in series using a 1-bit line passing through all GC circuits. The value of the multiplication coefficient is relative to the capacity of the pixel's counter. For example, if the counter capacity is 511, then $COEFF = 101010101_B = 341_D$ corresponds to multiplying by $341/511 \approx 0.667$. In the course of A/D conversion, column data buses are utilized for sending the reference impulse sequence REF, which is shown in Fig. 7.
This sequence has been defined so as to block the selected impulses of the $\Phi_{1\text{up}}$ and $\Phi_{1\text{down}}$ clocks as uniformly as possible. The counter is stopped for a single clock period if an impulse appears on at least one REF line that is turned on in a given pixel. It is the state of the memory cell COEFF of the pixel that decides whether a given line is on. The following function controls blocking of the counter in the pixel:

$$
\begin{aligned}
CNT\_EN\_LOCAL = CNT\_EN\_GLOBAL \land \neg \big[ &(REF(0) \land \neg COEFF(0)) \lor {} \\
&(REF(1) \land \neg COEFF(1)) \lor \dots \lor {} \\
&(REF(8) \land \neg COEFF(8)) \big]
\end{aligned}
\tag{1}
$$

A schematic diagram of the GC circuit generating the CNT\_EN\_LOCAL signal according to (1) is shown in Fig. 8. A particular waveform of this signal, corresponding to multiplication by $341/511 \approx 0.667$, is shown in Fig. 7. It can be observed on a zoomed portion of the waveform that for 26 impulses of the $\Phi_2$ clock, 17 impulses are not blocked, which corresponds to a multiplication coefficient of 0.654. For a sufficiently large number of $\Phi_2$ impulses, the irregularity of the clock blocking is less pronounced and the multiplication coefficient is close to its nominal value of 0.667.

Fig. 7. Timing diagram for multiplication by a coefficient of 341/511.

Fig. 8. Simplified diagram of the 9-bit GC (gain correction) circuit.

## B. Resolution of Multiplication and Coefficient Range

The resolution of the proposed multiplication method is 1/511 for the considered counter circuit. It is possible to reduce the number of bits of the GC circuit by removing the most significant ones. This narrows down the range of multiplication coefficients but does not reduce the resolution of the coefficient changes. For example, if a 9-bit counter is coupled to a 9-bit GC circuit, the range of the multiplication factor is $\langle 0,1\rangle$ with a 1/511 step.
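The gating function (1) can be exercised with a short Python simulation. This is only a sketch under stated assumptions: the actual REF sequence of Fig. 7 is not reproduced in the text, so the pattern generated by `ref_lines` (exactly one REF line high per clock period, with line $k$ pulsing $2^k$ times in 511 periods, as in a binary rate multiplier) is an assumption, and all function names are hypothetical.

```python
N_BITS = 9
PERIOD = (1 << N_BITS) - 1          # 511 clock periods per full conversion


def ref_lines(t: int) -> list:
    """Assumed REF pattern for clock period t (1..511): exactly one line is high.

    Line k is high when t has (8 - k) trailing zero bits, so line 8 pulses
    every second period (256 times) and line 0 pulses only once.
    """
    trailing_zeros = (t & -t).bit_length() - 1
    return [1 if k == N_BITS - 1 - trailing_zeros else 0 for k in range(N_BITS)]


def cnt_en_local(cnt_en_global: int, ref: list, coeff: int) -> int:
    """Equation (1): the pulse is blocked if any active REF line has COEFF(k) = 0."""
    blocked = any(ref[k] and not (coeff >> k) & 1 for k in range(N_BITS))
    return int(bool(cnt_en_global) and not blocked)


def passed_pulses(coeff: int, n_periods: int = PERIOD) -> int:
    """Number of clock pulses that reach the counter during n_periods."""
    return sum(cnt_en_local(1, ref_lines(t), coeff) for t in range(1, n_periods + 1))


coeff = 0b101010101                  # 341, the example coefficient from the text
print(passed_pulses(coeff))          # 341 of 511 pulses pass
print(passed_pulses(coeff, 26))      # 17 of the first 26 pulses pass
```

Under this assumed pattern, a full ramp of 511 periods passes exactly COEFF pulses, so the count is scaled by COEFF/511 regardless of the counting direction or the counter coding. The first 26 periods pass 17 pulses, the same ratio as quoted for the zoomed waveform, although the real REF sequence may order the pulses differently.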
If the number of bits of the GC circuit is reduced, the coefficient range is also reduced, e.g., to $\langle 0.5,1\rangle$ for 8 bits and to $\langle 0.875,1\rangle$ for 6 bits. On the other hand, the minimum coefficient step remains the same and equal to 1/511. This feature is an advantage from the PRNU compensation standpoint because, in practice, the lower range of the correction coefficient is not needed, yet it is important to maintain its good resolution. Removing the more significant GC bits simplifies the pixel architecture, and the related removal of the highest-frequency bits of the *REF* bus reduces power consumption and interference.

## C. Nonlinearity of Multiplication

The proposed multiplication operation introduces nonlinear distortion to A/D conversion that results from irregular counter blocking. Such distortion can be estimated by simulation. For the considered circuits, the integral and differential nonlinearities (INL, DNL) vary within $\pm 1.4$ LSB and $\pm 0.6$ LSB, respectively. Multiplying by 1 or 0 does not introduce any nonlinearity. The example shown in Fig. 7, concerning multiplication by 341/511, is the worst case, for which INL is 1.4 LSB. Multiplication nonlinearity does not depend on the counter coding.

Fig. 9. Simplified architecture of a tested pixel array.

## D. Circuit Implementation

The GC circuit of Fig. 8 has been implemented using dynamic logic. The memory of multiplication coefficients COEFF has been implemented as a shift register consisting of dynamic latches controlled by the two-phase clock. One needs $7 \cdot 9 = 63$ transistors to store one 9-bit coefficient. The number of transistors can be further reduced using classical RAM memory cells. The entire 9-bit GC circuit consists of 85 transistors. The GC circuit layout has been designed as a separate $15~\mu\text{m} \times 21~\mu\text{m}$ block fitting into the previously designed pixel with CDS (cf. Fig. 5).
The shape of the pixel obtained by connecting both layouts is not optimal (it is a $21~\mu\text{m} \times 36~\mu\text{m}$ rectangle), yet this is not important from the point of view of demonstrating the proposed PRNU compensation method. The number of bits of the GC circuit and its size can be reduced because the measurements indicated that 6-bit GC circuits are sufficient to correct the sensitivity of the LVCs utilized here (the required range of multiplication coefficients is from 0.88 to 1.00). It can be estimated that the pixel layout with 9-bit ADC&CDS and 6-bit GC would fit into a $26~\mu\text{m} \times 26~\mu\text{m}$ square, assuming implementation in 0.18 $\mu$m technology.

# IV. EXPERIMENTAL RESULTS

## A. Test Pixel Array

The pixels have been tested in an imaging array implemented as an integrated circuit. Its architecture is shown in Fig. 9. Image acquisition in the 128×127 pixel array can be realized without nonuniformity compensation or with DSNU compensation using CDS. In the last column, where each pixel is connected with a GC circuit, it is also possible to execute PRNU compensation. Image data is output from the integrated circuit through a 9-bit port whose throughput is 50 fps for the 128×127 array and 5000 fps for the 128×1 one. The analog *RAMP* signal was generated by a 12-bit D/A converter TLV5633 buffered by a low-noise amplifier LT6202 featuring a small offset voltage.

## B. Optimal Control of Light-to-Voltage Converter

Fig. 10 shows the photoconversion characteristic of the pixel, obtained by averaging the characteristics of all (16256) pixels in the imaging array. It can be observed that the pixel values only reach ~420 DN despite the fact that the number of counter states is 511. This results from the necessity of maintaining a certain margin because the counters do not have any mechanism to stop the counting process after the maximum value has been reached. For high light intensity, counter overflow leads to dark spots in the image.
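The effect of overflow can be illustrated with a few lines of Python. This is only a sketch, assuming that the counter simply wraps around its 511-state cycle (consistent with the LFSR counter of Section II-C); the function name and the example counts are hypothetical.

```python
N_STATES = 511   # states of the 9-bit LFSR counter (the all-zeros state is excluded)


def read_out(net_count: int) -> int:
    """Decoded value of a cyclic counter after a net count (up pulses minus down pulses)."""
    return net_count % N_STATES


for net in (420, 510, 515):
    print(net, "->", read_out(net))   # 420 -> 420, 510 -> 510, 515 -> 4
```

A strongly illuminated pixel whose net count exceeds 510 therefore reads as a small value, i.e., as a dark spot. Limiting the nominal full-scale value to about 420 DN leaves a margin of roughly 90 counts before this happens.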
An appropriate selection of the *PG*, *TG* and *RSTA* signal parameters affects LVC sample correlation and CDS efficiency. Based on FPN and INL measurements, optimum parameters of these signals have been determined as follows: on voltages for the photogate, transfer gate, and reset transistor PG-on = TG-on = 0.8 V, RSTA-on = 1.44 V; off voltages PG-off = TG-off = RSTA-off = 0 V. Increasing TG-on beyond 0.8 V did not affect image quality; however, increasing PG-on beyond 1 V led to a noticeable increase of the dark current in some pixels. RSTA-on can be larger than 1.44 V, but it is limited by the ICMR of the comparator. It is important that the voltage of 0 V on PG is kept for as short a time as possible (i.e., only when the charge is being transferred to the sense node); otherwise FPN increases dramatically. Reducing the PG-off voltage below 0 V increases FPN. Reducing TG-off below 0 V does not affect FPN but it improves INL. Fig. 11 shows pixel linearity versus negative TG-off (supplied during photon integration). Selecting this voltage to be between -75 mV and -50 mV gives optimum linearity. The positive effect of a small negative transfer gate voltage during photon integration can be explained by the reduction of the leakage current of the TG transistor. Improvement of the sensor operation under such conditions has also been noticed and reported in [2].

Fig. 10. Photoconversion characteristic of a pixel, obtained after averaging the characteristics of all pixels in the 128×127 imaging array.

Fig. 11. Integral nonlinearity of a pixel for different off-voltages on TG. On-voltage during transfer is 0.8 V. The plots are an average of all pixels.

Fig. 12. Distribution of dark pixel value: (a) without CDS (average pixel value = 55 LSB, $\sigma$ = 12.04 LSB), (b) with CDS (average = 2.92, $\sigma$ = 0.79), (c) with CDS for sensor cooled to about -20 °C (average = 0.11, $\sigma$ = 0.40). Temporal noise is not included. Frame rate 50 fps.

Fig. 13. Distribution of the pixel value under illumination: (a) without CDS (average pixel value = 361.7, $\sigma = 13.5$), (b) with CDS (average = 365.4, $\sigma = 6.14$). Temporal noise is not included. Frame rate 50 fps.

## C. Array Nonuniformity without Compensation

Fig. 12(a) shows a histogram of the dark pixel values in the 128×127 array without CDS (i.e., in a single-sampling mode). The standard deviation of the pixel values resulting from DSNU is 12 LSB. With illumination at about 85% of saturation, the standard deviation is only slightly higher, equal to 13.5 LSB (Fig. 13(a)). This indicates that the nonuniformity of the pixels is mostly related to the spread of the offsets of their photoconversion characteristics (i.e., DSNU) rather than to the spread of the slopes of these characteristics (i.e., PRNU). Some deviations from a typical Gaussian curve can be observed in the histograms of Figs. 12(a) and 13(a): the plots have a "comb-like" character. This is caused by interference from the digital part of the pixels (specifically, the counters) and its effect on the analog comparators. This interference makes some of the encoded values more likely to be obtained than others.

## D. Nonuniformity Compensation by CDS

Upon using CDS, the standard deviation of the dark pixel values has been reduced from 12 LSB to 0.8 LSB (Fig. 12(b)). The mean of the dark pixel values should be zero. However, a small, 2.9 LSB offset of the pixel values remains, which is a result of the dark current and the additional n+ diffusion between the photogate and the sense node. The dark current and, consequently, this offset can be reduced to 0.1 LSB by cooling the sensor, as shown in Fig. 12(c). Cooling also reduced the residual DSNU to $\sigma = 0.4$ LSB. CDS reduces the spread of the dark pixel values and, as a result, reduces the spread of the illuminated pixel values.
After application of CDS, the standard deviation of the illuminated pixels has been reduced from 13.5 LSB to 6.14 LSB (Fig. 13(b)). Further reduction of the spread required correction of the slopes of the characteristics (i.e., PRNU compensation) by means of GC. It follows from the histogram of Fig. 13(b) that the required correction coefficients are in the range 0.88 to 1.

It should be mentioned that CDS also reduces the "comb-like" interference that can be observed in both histograms (Figs. 12(a) and 13(a)). This is due to the fact that the counters contain the same values (they count in parallel) in the first stage of CDS, which increases the interference generated by the digital part of the pixel. In the second stage of CDS, the counters start counting from different initial values, which reduces the level of interference.

## E. SNR and DR of Imaging Array with CDS

Fig. 14 demonstrates the operation of CDS as a high-pass filter. It can be observed that the dark temporal noise level at low frequencies is noticeably lower than in the single-sampling mode (considerable reduction of 1/f noise). At the same time, CDS leads to an approximately two-fold increase of the noise power at higher frequencies, which is a generic property of CDS circuits [22].

Fig. 14. Spectrum of a dark temporal noise of the 128×127 imaging array for different sampling modes. DC not included. Frame rate 50 fps.

Fig. 15 shows SNR plots for the imaging array with CDS turned on and off. The plots marked TN and FPN show the SNR with CDS turned on when only the temporal noise or only the FPN, respectively, is taken into account. Because FPN becomes dominant (due to PRNU) over the temporal noise for high illumination intensity, the total SNR saturates [23]. In the considered illumination range, CDS increases the SNR of the imaging array by 8–16 dB. As a result, the dynamic range (DR) improves from 32 dB to 48 dB.

## F. PRNU Compensation by Gain Correction (GC)

Fig.
16 shows the measured photoconversion characteristics of the pixel (from the 128×1 array) for different GC coefficients. The measurements have been restricted to the range of 0.6 to 1.0 because it is sufficient for PRNU compensation in typical imaging arrays. Fig. 17 shows the INL of the pixel characteristics caused by the multiplication algorithm. The plots have been prepared for one pixel upon averaging over 500 frames because this allows for better visualization of the INL fluctuations; because of PRNU, they are not visible when averaging characteristics over multiple pixels. In the case of multiplying by 341/511, the nonlinearity does not exceed the range of –1.4 to 1.4 LSB predicted by simulation. For multiplication by 1 (i.e., 511/511), INL is not much different from the pixel nonlinearity in the 128×127 array (Fig. 11), which does not have GC circuits.

The GC circuits and the multiplication control signals *REF* distributed over the entire imaging array are sources of additional interference. In order to investigate this effect, the SNR of the imaging array has been measured during distribution of the complete 9-bit *REF*. Furthermore, all pixels have been set to the same multiplication factor: such a setup resulted in the maximum interference level. The measurements indicated that the GC circuits and distribution of the *REF* signals have a minor effect on the noise level of the imaging array. It should be emphasized that in the case of multiplication by 1 (i.e., 511/511) the *REF* signals are not required; however, they were distributed in order to increase the interference.

Fig. 15. SNR of the 128×127 array with CDS; Total – fixed and temporal noise are included; TN – only temporal noise is included; FPN – only fixed pattern noise is included. Total with Single Sampling – SNR without CDS.

## G. SNR, DR and FPN after CDS & GC

PRNU compensation reduces FPN and increases the SNR of the imaging array, as shown in Fig. 18.
However, this effect is not well pronounced because the SNR plot also takes into account the temporal noise and is presented on a logarithmic scale. The benefits of PRNU compensation are clearly seen when only FPN is considered, which is typically plotted on a linear scale as in Fig. 19. The GC coefficients have been determined for an illumination of 20 lux. Therefore, this is where FPN reaches its minimum value (~0.5 LSB) and SNR has a peak (Fig. 18). For low illumination levels (below 3 lux), PRNU compensation only slightly increases FPN due to the nonlinearity of the GC multipliers. PRNU compensation does not increase DR, which is bounded from above by the number of bits of the ADC counter in the pixel.

## H. Other Parameters

All measured parameters of the imaging arrays are gathered in Table I. It can be noticed that SNR and DR for the 128×1 array are slightly better than for the 128×127 array, which is due to better operating conditions of the pixels located at the edge of the structure. Power consumption of the imaging arrays is dominated by the analog comparators of the pixels. Future investigations could be oriented towards reduction of the static and dynamic power consumption of the comparators. The parameters in Table I have been measured for 50 fps (A/D conversion time with CDS was 1.4 ms). Measurement data for higher speeds are provided in the next section.

Fig. 16. Photoconversion characteristics of the pixel with embedded GC.

Fig. 17. Integral nonlinearity of a single pixel with embedded GC. Single randomly chosen pixel. Average of 500 frames.

## I. Nonuniformity Compensation at Higher Speed

The tests for speeds higher than 50 fps have been carried out only for the 128×1 array, which is due to the limited throughput of the output port of the integrated circuit. Still, the results obtained for a smaller number of pixels are meaningful. The
measurements have been performed for speeds up to 3500 fps because higher speeds would require increasing the illumination beyond 450 lux, which resulted in overexposure (i.e., degradation of the pixel parameters due to horizontal propagation of the light in the integrated circuit). When increasing the speed from 50 fps to 3500 fps, the *RAMP* slope has been increased appropriately. Also, the time difference between CDS samples has been reduced from $\Delta T_{\text{sample}} = 720 \ \mu \text{s}$ to $\Delta T_{\text{sample}} = 140 \ \mu \text{s}$.

Fig. 20 shows the DSNU plot (solid line) versus image acquisition speed. It can be observed that increasing the speed up to about 1000 fps reduces DSNU because shortening $\Delta T_{\text{sample}}$ limits the influence of the leakage currents in the LVC. A slow increase of DSNU is observed for speeds higher than 1000 fps, which is due to the speed limitations of the analog comparator. For comparison, Fig. 20 also shows DSNU (dashed line) for a constant slope of *RAMP* and constant $\Delta T_{\text{sample}} = 140 \ \mu \text{s}$ over the entire range of image acquisition speeds. One can notice that DSNU is then higher due to the insufficient speed of the comparator. The comb-like characteristic of the plot results from the applied measurement method (the limited step of the *RAMP* slope changes).

CDS reduced the dark and light FPN by about one order of magnitude and increased SNR by 10–20 dB over the entire considered range of image acquisition speeds (Figs. 21 and 22). Gain correction leads to a further reduction of the light FPN and improvement of SNR; however, for the sake of brevity these results are not shown in the paper. The tests demonstrated correct GC operation for clock frequencies up to 100 MHz.

Fig. 18. SNR of the 128×1 imaging array. Frame rate 50 fps.

Fig. 19. FPN of the 128×1 imaging array. Frame rate 50 fps.

#### J. Example of Image

Fig.
23 shows the actual images obtained from the 128×127 imaging array for different sampling modes. For demonstration purposes, the same object (a penguin on a wooden background) has been chosen as in [24] (the work concerning "analog" pixels). In [24] the image was taken under 500 lux halogen lamp lighting, whereas in this work it was taken under 50 lux natural lighting. In both works, exactly the same lens was used. CDS allowed us to obtain an image (right panel of Fig. 23) of satisfactory quality despite the 10-fold lower light intensity than in [24].

Fig. 20. DSNU (dark FPN) of the 128×1 imaging array versus frame rate for variable and constant slope of *RAMP*.

Fig. 21. FPN of the 128×1 imaging array versus frame rate for different sampling modes.

Fig. 22. SNR of the $128\times1$ imaging array for different frame rates and sampling modes (SS – single sampling mode).

Fig. 23. Real image obtained from the 128×127 imaging array: left image – single sampling mode, right image – CDS mode.

TABLE I. MEASUREMENT RESULTS OF THE IMAGING ARRAYS

| Parameter | Condition <sup>a</sup> | Value |
|---|---|---|
| Sensitivity | 615 nm | 918 DN/lux·s (2.39 V/lux·s) |
| Dark FPN (DSNU) | single sampling<br>with CDS<br>with CDS <sup>c</sup> | 3% sat. <sup>b</sup> (31.2 mV rms)<br>0.2% sat. <sup>b</sup> (2.08 mV rms)<br>0.15% sat. <sup>b</sup> (1.56 mV rms) |
| Temporal noise | in darkness<br>max. noise in light | 0.17% sat. <sup>b</sup> (1.77 mV rms)<br>0.48% sat. <sup>b</sup> (5 mV rms) |
| Light FPN (DSNU+PRNU) at 400 DN output | single sampling<br>with CDS<br>with CDS <sup>c</sup><br>with CDS & GC <sup>c</sup> | 3.7% sat. <sup>b</sup> (38.5 mV rms)<br>1.85% sat. <sup>b</sup> (19.3 mV rms)<br>0.82% sat. <sup>b</sup> (8.54 mV rms)<br>0.25% sat. <sup>b</sup> (2.6 mV rms) |
| INL | $TG_{off} = -75 \text{ mV}$ | ±0.38% sat. <sup>b</sup> (±1.5 LSB) |
| SNR <sup>b,d</sup> | single sampling<br>with CDS<br>with CDS & GC <sup>c</sup> | 27 dB (29 dB <sup>c</sup>)<br>35 dB (40 dB <sup>c</sup>)<br>46 dB <sup>c</sup> |
| Dynamic range (DR) <sup>d</sup> | single sampling<br>with CDS<br>with CDS & GC <sup>c</sup> | 32 dB (34 dB <sup>c</sup>)<br>48 dB (49.5 dB <sup>c</sup>)<br>49.5 dB <sup>c</sup> |
| Image lag | 400 DN pulsed output | 11% (rising)<br>21% (falling) |
| Power consumption | idle | 131 μW (analog part)<br>6 μW (digital part) |
| | conversion & readout | 4.7 mW (analog part)<br>132 μW (digital part) |
| Conversion energy per pixel | in darkness | 429 pJ (analog part)<br>4.1 pJ (digital part) |
| | under light | 469 pJ (analog part)<br>11.8 pJ (digital part) |

<sup>a</sup> General conditions: conversion time 1.4 ms with CDS, 50 fps, $I_{\text{bias}} = 70$ nA.
<sup>b</sup> sat. – near saturation level (400 DN output).
<sup>c</sup> For the 128×1 pixel array only.
<sup>d</sup> Temporal and fixed pattern noises included.

# V. COMPARISON

CMOS digital pixel solutions vary widely. In some of them, only the initial stages of A/D conversion (e.g., $\Sigma\Delta$ modulation), analog-to-time conversion, or analog-to-frequency conversion are implemented at the pixel level, while the final stage of conversion to a binary word is realized outside the pixel array. Other pixel solutions contain a complete built-in ADC and multi-bit memory. Given this variety, a fair comparison of their parameters is difficult. However, a meaningful comparison is possible when only one type of pixel is considered.

Table II shows the parameters of selected pixels with a built-in complete ADC, digital memory, and photosensor. It should be emphasized that the papers [2], [10], [11], [15] and this work consider "standard" digital pixels for image acquisition arrays, whereas [8] and [9] concern special pixels that can also perform image processing. The pixel from this work outperforms other solutions with respect to the number of bits and the physical size. A better resolution (10 bits) was presented only in [15], and a smaller size (9.4 $\mu$m $\times$ 9.4 $\mu$m) was achieved only in [2], but those pixels do not contain CDS and GC. Our pixel features one of the smallest fill factors, but has a higher sensitivity (about 20 times higher than in [2]).

TABLE II. COMPARISON OF CMOS PIXELS WITH EMBEDDED PHOTOSENSOR, ADC AND DIGITAL MEMORY

| Parameter | S. Kleinfelder et al. [2] 2001 | A. Kitchen et al. [10] 2005 | X. Wang et al. [15] 2006 | K. Ito et al. [11] 2009 | A. Lopich et al. [8] 2011 | S. J. Carey et al. [9] 2013 | This work |
|---|---|---|---|---|---|---|---|
| Purpose | image acquisition | image acquisition | image acquisition | image acquisition | image processing | image processing | image acquisition |
| CMOS technology | 0.18 μm | 0.35 μm | 0.18 μm | 0.35 μm | 0.35 μm | 0.18 μm | 0.18 μm |
| Pixel size | 9.4 μm × 9.4 μm | 45 μm × 45 μm | 23 μm × 23 μm | 53 μm × 53 μm | 100 μm × 117 μm | 32 μm × 32 μm | 21 μm × 21 μm<br>(21 μm × 36 μm with GC) |
| Photosensor type | MOS photogate | pn photodiode | pn photodiode | pn photodiode | pn photodiode | pn photodiode | MOS photogate |
| Fill factor | 15% | ~15% | 25% | 14.9% | 2% | 6.2% | 5.5% |
| Sensitivity | 0.117 V/lux·s | n/a | 6.9 kHz/mW/cm<sup>2</sup> | n/a | n/a | n/a | 2.39 V/lux·s |
| ADC word length | 8 bit | 8 bit | 10 bit | 8 bit | 8 bit | 8 bit | 9 bit |
| Nonuniformity compensation | External digital CDS | n/a | n/a | n/a | In-pixel<br>digital DS | In-pixel<br>digital DS | In-pixel<br>digital CDS, GC |
| Size of pixel array | 352 × 288 | 64 × 64 | 28 × 28 | 64 × 48 | 19 × 22 | 256 × 256 | 128 × 127<br>128 × 1 with GC |
| Dark FPN | 0.79%<br>0.027% with CDS | 0.8% | 5% | 9.7% | n/a | 1.6% | 3%<br>0.2% with CDS |
| Light FPN | ~0.55% | n/a | n/a | n/a | n/a | n/a | 3.7%<br>1.85% with CDS |
| Dark temporal noise | 0.15% with CDS | n/a | n/a | n/a | n/a | n/a | 0.17% with CDS |
| Light temporal noise | ~0.65% | n/a | n/a | n/a | n/a | n/a | 0.48% |
| SNR | n/a | n/a | n/a | n/a | n/a | n/a | 40 dB with CDS<br>45 dB with CDS & GC |
| DR | n/a | 100 dB | 115 dB | n/a | n/a | n/a | 48 dB |
| Power consumption per pixel | 50 mW per chip<br>at 10<sup>4</sup> fps | Average current 1.6 μA | 0.25 μW | n/a | 63 μW | ~18 μW<br>at 10<sup>5</sup> fps | 1.7 μW<br>at 3500 fps with CDS |

Data on the DR of the imaging arrays have been found only in [10], [15], and *this work*. The DR values given there (over 100 dB) are much higher than what was obtained in this work (48 dB). Such a high DR of image sensors in standard CMOS technology was achieved in [10] and [15] using special techniques. However, due to the lack of detailed information concerning nonuniformity and SNR, it is not possible to meaningfully compare these sensors with other solutions.
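
The gap between the DR values discussed above can be put in perspective with a quantization-only estimate. The following sketch is an illustration and not part of the reported measurements: it assumes a linear ADC code whose smallest resolvable step is 1 LSB, so that DR cannot exceed $20\log_{10}(2^N - 1)$ for an $N$-bit word. The function names are hypothetical.

```python
import math


def dr_upper_bound_db(n_bits: int) -> float:
    """Quantization-only DR bound of a linear n-bit code, in dB.

    Assumption: the largest code is 2**n_bits - 1 and the noise floor
    cannot be lower than 1 LSB.
    """
    return 20.0 * math.log10(2 ** n_bits - 1)


def bits_for_linear_dr(dr_db: float) -> int:
    """Smallest linear word length whose bound reaches dr_db."""
    n = 1
    while dr_upper_bound_db(n) < dr_db:
        n += 1
    return n


if __name__ == "__main__":
    # Word lengths that appear in the comparison table (8, 9 and 10 bits).
    for n in (8, 9, 10):
        print(f"{n}-bit linear code: DR <= {dr_upper_bound_db(n):.1f} dB")
    print("linear bits needed for 100 dB:", bits_for_linear_dr(100.0))
```

Under these assumptions a 9-bit word caps DR at about 54 dB, so the 48 dB to 49.5 dB measured in this work lies below that cap. A DR above 100 dB would need at least 17 linear bits, which suggests that the values reported in [10] and [15] stem from the special techniques mentioned above rather than from word length alone.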
In terms of speed, the best results were obtained in [9] and [12], which report very high frame rates of $10^5$ and $10^4$ fps, respectively. The result obtained in this work, 3500 fps, can be further improved by increasing the speed of the analog comparators in the pixels.

As regards the mechanisms of nonuniformity compensation, this work can only be compared to [2], where digital CDS was utilized as well. CDS efficiency in [2] and in this work is similar: in both cases FPN was reduced by about one order of magnitude. However, in [2], digital CDS was implemented outside the imaging array. Our work is the only one where both CDS and GC were implemented within the pixels and their efficiency has been validated through SNR measurements.

# VI. CONCLUSION

In this paper, a CMOS pixel with a built-in photosensor, ADC, digital CDS, and an innovative precise correction of the photosensor sensitivity has been proposed. Using the pixel, one can implement an imaging array with massively parallel A/D conversion and simultaneous DSNU and PRNU compensation. The proposed circuit solutions for nonuniformity compensation can be utilized in imaging arrays not only for visible light but also for other ranges such as X-ray or infrared. Furthermore, the proposed innovative digital multiplication, here utilized for photosensor sensitivity correction, can be used to adjust the gain of each single-slope ADC.

# REFERENCES

- [1] S. Okura *et al.*, "A 3.7 M-Pixel 1300-fps CMOS Image Sensor With 5.0 G-Pixel/s High-Speed Readout Circuit," *IEEE J. Solid-State Circuits*, vol. 50, no. 4, pp. 1016–1024, Apr. 2015.
- [2] S. Kleinfelder, S. H. Lim, X. Q. Liu, and A. El Gamal, "A 10 000 Frames/s CMOS Digital Pixel Sensor," *IEEE J. Solid-State Circuits*, vol. 36, no. 12, pp. 2049–2058, Dec. 2001.
- [3] D. Yang, B. Fowler, and A. El Gamal, "A Nyquist-Rate Pixel-Level ADC for CMOS Image Sensors," *IEEE J. Solid-State Circuits*, vol. 34, no. 3, pp. 348–356, March 1999.
- [4] B. Pain and E. Fossum, "Approaches and analysis for on-focal-plane analog-to-digital conversion," in *Proc. of SPIE*, vol. 2226, 1994, pp. 208–218.
- [5] A. Lopich, P. Dudek, "Architecture and Design of a Programmable 3D-Integrated Cellular Processor Array for Image Processing," presented at the IFIP/IEEE Int. Conference on Very Large Scale Integration, VLSI-SoC 2011, Hong Kong, pp. 349–353.
- [6] G. W. Deptuch *et al.*, "Design and Tests of the Vertically Integrated Photon Imaging Chip," *IEEE Trans. Nuclear Science*, vol. 61, no. 1, pp. 663–674, Feb. 2014.
- [7] M. Goto *et al.*, "Pixel-Parallel 3-D Integrated CMOS Image Sensor With Pulse Frequency Modulation A/D Converters Developed by Direct Bonding of SOI Layers," *IEEE Trans. Electron Devices*, vol. 62, no. 11, pp. 3530–3535, Nov. 2015.
- [8] A. Lopich and P. Dudek, "A SIMD Cellular Processor Array Vision Chip With Asynchronous Processing Capabilities," *IEEE Trans. Circuits Syst. I*, vol. 58, no. 10, pp. 2420–2431, Oct. 2011.
- [9] S. J. Carey *et al.*, "A 100,000 fps Vision Sensor with Embedded 535GOPS/W 256x256 SIMD Processor Array," presented at VLSI Circuits Symposium 2013, Kyoto, pp. C182–C183.
- [10] A. Kitchen, A. Bermak, A. Bouzerdoum, "A Digital Pixel Sensor Array With Programmable Dynamic Range," *IEEE Trans. Electron Devices*, vol. 52, no. 12, pp. 2591–2601, Dec. 2005.
- [11] K. Ito, B. Tongprasit, T. Shibata, "A Computational Digital Pixel Sensor Featuring Block-Readout Architecture for On-Chip Image Processing," *IEEE Trans. Circuits Syst. I*, vol. 56, no. 1, pp. 114–123, Jan. 2009.
- [12] D. X. D. Yang, A. El Gamal, B. Fowler, H. Tian, "A 640×512 CMOS Image Sensor with Ultrawide Dynamic Range Floating-Point Pixel-Level ADC," *IEEE J. Solid-State Circuits*, vol. 34, no. 12, pp. 1821–1834, Dec. 1999.
- [13] Z. Ignjatovic, D. Maricic, M. F. Bocko, "Low Power, High Dynamic Range CMOS Image Sensor Employing Pixel-Level Oversampling ΣΔ Analog-to-Digital Conversion," *IEEE Sensors Journal*, vol. 12, no. 4, pp. 737–746, Apr. 2012.
- [14] W. Bidermann *et al.*, "A 0.18μm High Dynamic Range NTSC/PAL Imaging System-on-Chip with Embedded DRAM Frame Buffer," in *Proc. IEEE Int. Solid-State Circuits Conf.*, 2003, pp. 212–213.
- [15] X. Wang, W. Wong, R. Hornsey, "A High Dynamic Range CMOS Image Sensor With Inpixel Light-to-Frequency Conversion," *IEEE Trans. Electron Devices*, vol. 53, no. 12, pp. 2988–2992, Dec. 2006.
- [16] S. Lim, A. El Gamal, "Gain Fixed Pattern Noise Correction via Optical Flow," *IEEE Trans. Circuits Syst. I*, vol. 51, no. 4, pp. 779–786, Apr. 2004.
- [17] Y. Nitta *et al.*, "High-Speed Digital Double Sampling with Analog CDS on Column Parallel ADC Architecture for Low-Noise Active Pixel Sensor," in *Proc. IEEE Int. Solid-State Circuits Conf.*, San Francisco, 2006, pp. 2024–2031.
- [18] T. Geurts *et al.*, "A 25 Mpixel, 80fps, CMOS Imager with an In-Pixel-CDS Global Shutter Pixel," presented at 2015 International Image Sensor Workshop (IISW), Vaals, The Netherlands.
- [19] Y. De Wit, T. Geurts, "A Low Noise Low Power Global Shutter CMOS Pixel Having Single Readout Capability And Good Shutter Efficiency," presented at 2011 International Image Sensor Workshop (IISW), Hokkaido, Japan.
- [20] T. Inoue, S. Takeuchi, S. Kawahito, "A CMOS Active Pixel Image Sensor with In-pixel CDS for High-Speed Cameras," in *Proc. of SPIE*, vol. 5580, 2005, pp. 293–300.
- [21] M. Perenzoni *et al.*, "A 160×120-Pixels Range Camera With In-Pixel Correlated Double Sampling and Fixed-Pattern Noise Correction," *IEEE J. Solid-State Circuits*, vol. 46, no. 7, pp. 1672–1681, July 2011.
- [22] H. M. Wey, W. Guggenbuhl, "An Improved Correlated Double Sampling Circuit for Low Noise Charge-Coupled Devices," *IEEE Trans. Circuits Syst.*, vol. 37, no. 12, pp. 1559–1565, Dec. 1990.
- [23] A. El Gamal, H. Eltoukhy, "CMOS image sensors," *IEEE Circuits & Devices Magazine*, pp. 6–20, May/June 2005.
- [24] W. Jendernalik, G. Blakiewicz, J. Jakusz, S. Szczepanski, R. Piotrowski, "An Analog Sub-Miliwatt CMOS Image Sensor With Pixel-Level Convolution Processing," *IEEE Trans. Circuits Syst. I*, vol. 60, no. 2, pp. 279–289, Feb. 2013.
- [25] W. Jendernalik, "On analog comparators for CMOS digital pixel applications. A comparative study," *Bulletin of the Polish Academy of Sciences Technical Sciences*, vol. 64, no. 2, pp. 271–278, Jun. 2016.

**M. Kłosowski** received the M.Sc. and Ph.D. degrees in electrical engineering from Gdańsk University of Technology, Poland, in 1994 and 2001, respectively. Since 2001, he has been with the Department of Microelectronic Systems, Gdańsk University of Technology, Poland. His research interests lie in the area of vision sensors, video processing, and applications of FPGAs to high-performance computing.

**W. Jendernalik** received the M.Sc. and Ph.D. degrees in electrical engineering from Gdańsk University of Technology, Poland, in 1997 and 2006, respectively. From 2001 to 2006, he was a Research Assistant with the Department of Electronic Circuits, Gdańsk University of Technology, where he worked in the field of continuous-time analog filters. In 2007 he joined the Department of Microelectronic Systems as an Assistant Professor. His research interests include analog integrated circuits with emphasis on low-power vision processors and high-frequency analog filters.

**J. Jakusz** received the M.Sc. and Ph.D. degrees in electrical engineering from Gdańsk University of Technology, Poland, in 1990 and 2000, respectively. From 1990 to 1997, he was a Teaching and Research Assistant with the Department of Electronic Circuits, Gdańsk University of Technology, where he worked in the field of analog integrated filter design. Since 2000 he has been an Assistant Professor with the Department of Microelectronic Systems. His main research interests are in analog integrated circuits, vision processors, and analog filters.

**G. Blakiewicz** received the M.Sc., Ph.D.
and D.Sc. degrees in electrical engineering from Gdańsk University of Technology, Poland, in 1990, 1997, and 2013, respectively. From 1990 to 1997, he was a Research Assistant with the Department of Electronic Circuits, Gdańsk University of Technology, where he worked in the field of discrete-time analog filters. Since 1997 he has been an Assistant Professor with the Department of Microelectronic Systems. From 2003 to 2004, he was a Visiting Assistant Professor in the Electrical Engineering Department, Portland State University. His main research interests include analog design with emphasis on low-voltage, low-power circuits and reduction of sensitivity to substrate noise.

**S. Szczepański** received the M.Sc. and Ph.D. (hons) degrees in electronic engineering from Gdańsk University of Technology, Poland, in 1975 and 1986, respectively. In 1986, he was a Visiting Research Associate with the Institut National Polytechnique de Toulouse (INPT), Toulouse, France. From 1990 to 1991, he was with the Department of Electrical Engineering, Portland State University, Portland, OR, on a Kosciuszko Foundation Fellowship. From August to September 1998, he was a Visiting Professor with the Faculty of Engineering and Information Sciences at the University of Hertfordshire, Hatfield, U.K. He is currently a Professor and Head of the Department of Microelectronic Systems, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology. He has published more than 150 papers and holds three patents. His teaching and research interests include circuit theory, fully integrated analog filters, high-frequency transconductance amplifiers, analog integrated circuit design in bipolar and CMOS technology, and current-mode analog signal processing.