Topic: Rate-Distortion Analysis and Traffic Modelling of Scalable Video Coders

Doctoral dissertation: Rate-Distortion Analysis and Traffic Modelling of Scalable Video Coders

CHAPTER

I INTRODUCTION
  A. Problem Statement
  B. Objective and Approach
  C. Main Contributions
  D. Dissertation Overview

II SCALABLE VIDEO CODING
  A. Video Compression Standards
  B. Basics in Video Coding
    1. Compression
    2. Quantization and Binary Coding
  C. Motion Compensation
  D. Scalable Video Coding
    1. Coarse Granular Scalability
      a. Spatial Scalability
      b. Temporal Scalability
      c. SNR/Quality Scalability
    2. Fine Granular Scalability

III RATE-DISTORTION ANALYSIS FOR SCALABLE CODERS
  A. Motivation
  B. Preliminaries
    1. Brief R-D Analysis for MCP Coders
    2. Brief R-D Analysis for Scalable Coders
  C. Source Analysis and Modeling
    1. Related Work on Source Statistics
    2. Proposed Model for Source Distribution
  D. Related Work on Rate-Distortion Modeling
    1. R-D Functions of MCP Coders
    2. Related Work on R-D Modeling
    3. Current Problems
  E. Distortion Analysis and Modeling
    1. Distortion Model Based on Approximation Theory
      a. Approximation Theory
      b. The Derivation of Distortion Function
    2. Distortion Modeling Based on Coding Process
  F. Rate Analysis and Modeling
    1. Preliminaries
    2. Markov Model
  G. A Novel Operational R-D Model
    1. Experimental Results
  H. Square-Root R-D Model
    1. Simple Quality (PSNR) Model
    2. Simple Bitrate Model
    3. SQRT Model

IV QUALITY CONTROL FOR VIDEO STREAMING
  A. Related Work
    1. Congestion Control
      a. End-to-End vs. Router-Supported
      b. Window-Based vs. Rate-Based
    2. Error Control
      a. Forward Error Correction (FEC)
      b. Retransmission
      c. Error Resilient Coding
      d. Error Concealment
  B. Quality Control in Internet Streaming
    1. Motivation
    2. Kelly Controls
    3. Quality Control in CBR Channel
    4. Quality Control in VBR Networks
    5. Related Error Control Mechanism

V TRAFFIC MODELING
  A. Related Work on VBR Traffic Modeling
    1. Single Layer Video Traffic
      a. Autoregressive (AR) Models
      b. Markov-modulated Models
      c. Models Based on Self-similar Process
      d. Other Models
    2. Scalable Video Traffic
  B. Modeling I-Frame Sizes in Single-Layer Traffic
    1. Wavelet Models and Preliminaries
    2. Generating Synthetic I-Frame Sizes
  C. Modeling P/B-Frame Sizes in Single-layer Traffic
    1. Intra-GOP Correlation
    2. Modeling P and B-Frame Sizes
  D. Modeling the Enhancement Layer
    1. Analysis of the Enhancement Layer
    2. Modeling I-Frame Sizes
    3. Modeling P and B-Frame Sizes
  E. Model Accuracy Evaluation
    1. Single-layer and the Base Layer Traffic
    2. The Enhancement Layer Traffic

VI CONCLUSION AND FUTURE WORK
  A. Conclusion
  B. Future Work
    1. Supplying Peers Cooperation System
    2. Scalable Rate Control System

REFERENCES

VITA

Fig. 58. Histograms of $\{v(n)\}$ for $\{\phi_i^P(n)\}$ with $i = 1, 2, 3$ in (a) Star Wars IV and (b) Jurassic Park I. Both sequences are coded at Q = 14. (x-axis: bytes; legend: v_1, v_2, v_3.)

To understand how to generate $\{\tilde{v}(n)\}$, we next examine the actual residual process $v(n) = \phi_i^P(n) - a\tilde{\phi}^I(n)$ for each $i$. We show the histograms of $\{v(n)\}$ for P-frame sequences $i = 1, 2, 3$ in the single-layer Star Wars IV and Jurassic Park I in Fig. 58. The figure shows that the residual process $\{v(n)\}$ does not change much as a function of $i$.

In Fig. 59 (a), we show the histograms of $\{v(n)\}$ for sequences coded at different Q. The figure shows that the histogram becomes more Gaussian-like as Q increases. Because the shape of the histogram of $\{v(n)\}$ varies this much, we use a generalized Gamma distribution Gamma($\gamma$, $\alpha$, $\beta$) to estimate $\{v(n)\}$. Fig. 59 (b) shows that the smaller the quantization step Q, the larger the value of parameter $a$ in (5.17), which helps in modeling sequences coded from the same video content at different quantization steps.

From Fig. 55 (b), we observe that the correlation between $\{\phi_i^B(n)\}$ and $\{\phi^I(n)\}$ could be as small as 0.1 (e.g., in Star Wars IV coded at Q = 18) or as large as 0.9 (e.g., in The Silence of the Lambs coded at Q = 4). Thus, we can generate the synthetic B-frame traffic simply with an i.i.d. lognormal random number generator when the correlation between $\{\phi_i^B(n)\}$ and $\{\phi^I(n)\}$ is small, or with a linear model similar to (5.16) when the correlation is large. The linear model has the following form:

$$\phi_i^B(n) = a\tilde{\phi}^I(n) + \tilde{v}_B(n), \qquad (5.23)$$

where $a = r(0)\sigma_B/\sigma_I$, $r(0)$ is the lag-0 correlation between $\{\phi^I(n)\}$ and $\{\phi_i^B(n)\}$, and $\sigma_B$ and $\sigma_I$ are the standard deviations of $\{\phi_i^B(n)\}$ and $\{\phi^I(n)\}$, respectively. Process $\tilde{v}_B(n)$ is independent of $\tilde{\phi}^I(n)$. We illustrate the difference between our model and a typical i.i.d. method of prior work (e.g., [68], [95]) in Fig. 60.
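As an illustration of the linear model (5.23), the following is a minimal Python sketch, not the implementation used in this work. The function names are hypothetical, and the residual $\tilde{v}_B(n)$ is drawn by resampling the empirical residuals, which is an assumption made here for simplicity because the text does not fix its distribution.

```python
import random
from statistics import fmean, pstdev

def lag0_correlation(x, y):
    """Pearson (lag-0) correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    cov = fmean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))

def linear_coefficient(i_sizes, b_sizes):
    """a = r(0) * sigma_B / sigma_I, as defined after (5.23)."""
    return lag0_correlation(i_sizes, b_sizes) * pstdev(b_sizes) / pstdev(i_sizes)

def synth_b_frames(i_synth, i_orig, b_orig, rng):
    """Synthetic B-frame sizes a * I~(n) + v~_B(n).

    Assumption: v~_B(n) is resampled from the empirical residuals
    b_orig(n) - a * i_orig(n), independently of the I-frame sizes.
    """
    a = linear_coefficient(i_orig, b_orig)
    residuals = [b - a * i for i, b in zip(i_orig, b_orig)]
    return [a * i + rng.choice(residuals) for i in i_synth]
```

Because the residual is drawn independently of the I-frame sizes, the synthetic B-frame sequence keeps approximately the same lag-0 correlation with the I-frame sequence as the original, which an i.i.d. generator does not.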
Fig. 60 shows that our model indeed preserves the intra-GOP correlation of the original traffic, while the previous methods produce white (uncorrelated) noise. The statistical parameters ($r(0)$, $\sigma_P$, $\sigma_I$, $\gamma$, $\alpha$, $\beta$) needed for this model are easily estimated from the original sequences.

D. Modeling the Enhancement Layer

In this section, we provide brief background on multi-layer video, investigate methods to capture cross-layer dependency, and model the enhancement-layer traffic.

Due to its flexibility and high bandwidth utilization, layered video coding is common in video applications. Layered coding is often referred to as “scalable coding,” which can be further classified as coarse-granular (e.g., spatial scalability) or fine-granular (e.g., fine granular scalability (FGS)) [107]. The major difference between coarse granularity and fine granularity is that the former provides quality improvements only when a complete enhancement layer has been received, while the latter continuously improves video quality with every additionally received codeword of the enhancement layer bitstream.

Fig. 59. (a) Histograms of $\{v(n)\}$ for $\{\phi_1^P(n)\}$ in Jurassic Park I coded at Q = 4, 10, 14. (b) Linear parameter $a$ for modeling $\{\phi_i^P(n)\}$ in various sequences coded at different Q. (Panel (a) x-axis: bytes; panel (b) x-axis: quantization step; legend: StarWars, Jurassic, Troopers, Silence, StarTrek.)

In both coarse granular and fine granular coding methods, an enhancement layer is coded with the residual between the original image and the reconstructed image from the base layer. Therefore, the enhancement layer has a strong dependency on the base layer. Zhao et al. [115] also indicate that there exists a cross-correlation between the base layer and the enhancement layer; however, this correlation has not been fully addressed in previous studies.
In the next subsection, we investigate the cross-correlation between the enhancement layer and the base layer using a spatially scalable The Silence of the Lambs sequence and an FGS-coded Star Wars IV sequence as examples. For brevity, we only show the analysis of two-layer sequences; similar results hold for video streams with more than two layers.

Fig. 60. (a) The correlation between $\{\phi_1^P(n)\}$ and $\{\phi^I(n)\}$ in Star Wars IV. (b) The correlation between $\{\phi_1^B(n)\}$ and $\{\phi^I(n)\}$ in Jurassic Park I. (Axes: correlation versus lag; legend: actual, our model, i.i.d. methods.)

1. Analysis of the Enhancement Layer

Note that we do not consider temporally scalable coded sequences, in which the base layer and the enhancement layer are approximately equivalent to extracting I/P-frames and B-frames out of a single-layer sequence, respectively [87].

For convenience of discussion, we define the enhancement layer frame sizes as follows. Similar to the definition in the base layer, we define $\varepsilon^I(n)$ to be the I-frame size of the $n$-th GOP, $\varepsilon_i^P(n)$ to be the size of the $i$-th P-frame in GOP $n$, and $\varepsilon_i^B(n)$ to be the size of the $i$-th B-frame in GOP $n$.

Since each frame in the enhancement layer is predicted from the corresponding frame in the base layer, we examine the cross-correlation between the enhancement layer frame sizes and the corresponding base layer frame sizes in various sequences. In Fig. 61 (a), we display the correlation between $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ in The Silence of the Lambs coded at different Q. As observed from the figure, the correlation between $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ is stronger when the quantization step Q is smaller. However, the difference among these cross-correlation curves is not as obvious as that in intra-GOP correlation. We also observe that the cross-correlation is still strong even at large lags, which indicates that $\{\varepsilon^I(n)\}$ exhibits LRD properties and that we should preserve these properties in the synthetic enhancement layer I-frame sizes.

Fig. 61. (a) The correlation between $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ in The Silence of the Lambs coded at Q = 4, 24, 30. (b) The correlation between $\{\varepsilon_i^P(n)\}$ and $\{\phi_i^P(n)\}$ in The Silence of the Lambs coded at Q = 30, for $i = 1, 2, 3$. (Axes: correlation versus lag.)

In Fig. 61 (b), we show the cross-correlation between processes $\{\varepsilon_i^P(n)\}$ and $\{\phi_i^P(n)\}$ for $i = 1, 2, 3$. The figure demonstrates that the correlation between the enhancement layer and the base layer is quite strong, and that the correlation structures between each $\{\varepsilon_i^P(n)\}$ and $\{\phi_i^P(n)\}$ are very similar to each other. To avoid repetitive description, we do not show the correlation between $\{\varepsilon_i^B(n)\}$ and $\{\phi_i^B(n)\}$, which is similar to that between $\{\varepsilon_i^P(n)\}$ and $\{\phi_i^P(n)\}$.

Fig. 62. (a) The ACF of $\{\varepsilon^I(n)\}$ and that of $\{\phi^I(n)\}$ in Star Wars IV. (b) The ACF of $\{\varepsilon_1^P(n)\}$ and that of $\{\phi_1^P(n)\}$ in The Silence of the Lambs. (Axes: correlation versus lag.)

Aside from cross-correlation, we also examine the autocorrelation of each frame sequence in the enhancement layer and that of the corresponding sequence in the base layer. We show the ACF of $\{\varepsilon^I(n)\}$ and that of $\{\phi^I(n)\}$ (labeled as “EL_I_cov” and “BL_I_cov”, respectively) in Fig. 62 (a), and display the ACF of $\{\varepsilon_1^P(n)\}$ and that of $\{\phi_1^P(n)\}$ in Fig. 62 (b). The figure shows that although the ACF structure of $\{\varepsilon^I(n)\}$ has some oscillation, its trend closely follows that of $\{\phi^I(n)\}$. One also observes from the figures that the ACF structures of processes $\{\varepsilon_i^P(n)\}$ and $\{\phi_i^P(n)\}$ are similar to each other.
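The cross-correlation and ACF curves discussed in this subsection can be estimated from two frame-size sequences as in the following Python sketch. The function names are hypothetical, and the choice of the biased estimator (normalizing by the full sequence length) and of the lag direction are assumptions, since the text does not state which estimator was used.

```python
from statistics import fmean, pstdev

def cross_correlation(x, y, lag):
    """Normalized cross-correlation between x(n) and y(n + lag), lag >= 0.

    Assumes len(x) == len(y) and lag < len(x); uses the biased estimator,
    which divides by the full length rather than by the overlap.
    """
    mx, my = fmean(x), fmean(y)
    overlap = len(x) - lag
    cov = sum((x[t] - mx) * (y[t + lag] - my) for t in range(overlap)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

def autocorrelation(x, lag):
    """ACF of a single sequence, as the cross-correlation with itself."""
    return cross_correlation(x, x, lag)
```

For example, passing the base-layer I-frame sizes as `x` and the enhancement-layer I-frame sizes as `y` for lags 0 to 150 would give a curve of the kind plotted in Fig. 61 (a).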
Fig. 63. The ACF of $\{A_3(\varepsilon)\}$ and $\{A_3(\phi)\}$ in The Silence of the Lambs coded at (a) Q = 30 and (b) Q = 4. (Axes: correlation versus lag; legend: ca_BL_cov, ca_EL_cov.)

2. Modeling I-Frame Sizes

Although cross-layer correlation is obvious in multi-layer traffic, previous work neither considered it during modeling [9], nor explicitly addressed the issue of its modeling [115]. In this section, we first describe how we model the enhancement layer I-frame sizes and then evaluate the performance of our model in capturing the cross-layer correlation.

Recalling that $\{\varepsilon^I(n)\}$ also possesses both SRD and LRD properties, we model it in the wavelet domain as we modeled $\{\phi^I(n)\}$. We define $\{A_j(\varepsilon)\}$ and $\{A_j(\phi)\}$ to be the approximation coefficients of $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ at wavelet decomposition level $j$, respectively. To better understand the relationship between $\{A_j(\varepsilon)\}$ and $\{A_j(\phi)\}$, we show the ACF of $\{A_3(\varepsilon)\}$ and $\{A_3(\phi)\}$ using Haar wavelets (labeled as “ca_EL_cov” and “ca_BL_cov”, respectively) in Fig. 63.

Fig. 64. The cross-correlation between $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ in The Silence of the Lambs and that in the synthetic traffic generated from (a) our model and (b) model [115]. (Axes: cross-correlation versus lag.)

As shown in Fig. 63, $\{A_j(\varepsilon)\}$ and $\{A_j(\phi)\}$ exhibit similar ACF structure. Thus, we generate $\{A_J(\varepsilon)\}$ by borrowing the ACF structure of $\{A_J(\phi)\}$, which is known from our base-layer model. Using the ACF of $\{A_J(\phi)\}$ in modeling $\{\varepsilon^I(n)\}$ not only saves computational cost, but also preserves the cross-layer correlation. In Fig. 64, we compare the actual cross-correlation between $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ to that between the synthetic $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ generated from our model and from Zhao’s model [115]. The figure shows that our model significantly outperforms Zhao’s model in preserving the cross-layer correlation.

3. Modeling P and B-Frame Sizes

Recall that the cross-correlation between $\{\varepsilon_i^P(n)\}$ and $\{\phi_i^P(n)\}$ and that between $\{\varepsilon_i^B(n)\}$ and $\{\phi_i^B(n)\}$ are also strong, as shown in Fig. 61. We use the linear model from Section 2 to estimate the sizes of the $i$-th P and B-frames in the $n$-th GOP:

$$\varepsilon_i^P(n) = a\phi_i^P(n) + \tilde{w}_1(n), \qquad (5.24)$$

$$\varepsilon_i^B(n) = a\phi_i^B(n) + \tilde{w}_2(n), \qquad (5.25)$$

where $a = r(0)\sigma_\varepsilon/\sigma_\phi$, $r(0)$ is the lag-0 cross-correlation coefficient, $\sigma_\varepsilon$ is the standard deviation of the enhancement-layer sequence, and $\sigma_\phi$ is the standard deviation of the corresponding base-layer sequence. Processes $\{\tilde{w}_1(n)\}$ and $\{\tilde{w}_2(n)\}$ are independent of $\{\phi_i^P(n)\}$ and $\{\phi_i^B(n)\}$.

Fig. 65. Histograms of $\{w_1(n)\}$ in (a) Star Wars IV and (b) The Silence of the Lambs (Q = 24), with $i = 1, 2, 3$. (x-axis: bytes; legend: w_P1, w_P2, w_P3.)

We examine $\{w_1(n)\}$ and $\{w_2(n)\}$ and find that they exhibit similar properties. We show two examples of $\{w_1(n)\}$ in Fig. 65. As observed from Fig. 65, the histogram of $\{w_1(n)\}$ is asymmetric and decays fast on both sides. Therefore, we use two exponential distributions to estimate its PDF.

We first left-shift $\{w_1(n)\}$ by an offset $\delta$ to make the mode (i.e., the peak) appear at zero. We then model the right side using one exponential distribution exp($\lambda_1$) and the absolute value of the left side using another exponential distribution exp($\lambda_2$). Afterwards, we generate synthetic data $\{\tilde{w}_1(n)\}$ based on these two exponential distributions and right-shift the result by $\delta$. As shown in Fig. 66, the histograms of $\{\tilde{w}_1(n)\}$ are close to those of the actual data in both Star Wars IV and The Silence of the Lambs. We generate $\{\tilde{w}_2(n)\}$ in the same way and find that its histogram is also close to that of $\{w_2(n)\}$.

Fig. 66. Histograms of $\{w_1(n)\}$ and $\{\tilde{w}_1(n)\}$ for $\{\varepsilon_1^P(n)\}$ in (a) Star Wars IV and (b) The Silence of the Lambs (Q = 30). (x-axis: bytes; legend: actual, estimate.)

E. Model Accuracy Evaluation

As we stated earlier, a good traffic model should capture the statistical properties of the original traffic and be able to accurately predict network performance. There are three popular tests for verifying the accuracy of a video traffic model [95]: quantile-quantile (QQ) plots, the variance of traffic during various time intervals, and buffer overflow loss evaluation. While the first two measures evaluate how well the distribution of the synthetic traffic matches that of the original one, the overflow loss simulation examines how effectively a traffic model captures the temporal burstiness of the original traffic.

Fig. 67. QQ plots for the synthetic (a) single-layer Star Wars IV traffic and (b) The Silence of the Lambs base-layer traffic. (Axes: synthetic frame size versus original frame size.)

The QQ plot is a graphical technique to verify the distribution similarity between two test data sets. If the two data sets have the same distribution, the points should fall along the 45-degree reference line. The greater the departure from this reference line, the greater the difference between the two test data sets.
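The QQ comparison described above can be sketched in Python as follows. This is an illustrative sketch only: the function names, the number of quantiles, and the use of linear interpolation between order statistics are assumptions, not details taken from this work.

```python
def qq_points(original, synthetic, num_quantiles=100):
    """Return (original_quantile, synthetic_quantile) pairs for a QQ plot."""
    def quantile(sorted_data, q):
        # Linear interpolation between neighbouring order statistics.
        pos = q * (len(sorted_data) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(sorted_data) - 1)
        frac = pos - lo
        return sorted_data[lo] * (1 - frac) + sorted_data[hi] * frac

    a, b = sorted(original), sorted(synthetic)
    probs = [(k + 0.5) / num_quantiles for k in range(num_quantiles)]
    return [(quantile(a, q), quantile(b, q)) for q in probs]

def max_relative_departure(points):
    """Largest relative distance of any point from the line y = x."""
    return max(abs(y - x) / abs(x) for x, y in points if x != 0)
```

Plotting the returned pairs against the 45-degree line gives the QQ plot; `max_relative_departure` is one hypothetical way to turn the visual check into a single number, with 0 meaning that all sampled quantiles coincide.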
Unlike the QQ plot, the variance of traffic during various time intervals shows whether the second-order moment of the synthetic traffic fits that of the original one. This second-order descriptor is used to capture burstiness properties of arrival processes [9]. This measure operates as follows. Assume that the length of a video sequence is $l$ and that there are $m$ frames in a given time interval. We segment the one-dimensional data into an $m \times n$ matrix, where $n = l/m$. After summing all the data in each column, we obtain a sequence of length $n$ and then calculate its variance. Thus, we can obtain a set of variances given a set of time intervals.

Besides the distribution, we also examine how well our approach preserves the temporal information of the original traffic. A common test for this is to pass the synthetic traffic through a generic router buffer with capacity $c$ and drain rate $d$ [95]. The drain rate is the number of bytes drained per second and is simulated as different multiples of the average traffic rate $\bar{r}$.

In the following two sections, we evaluate the accuracy of our model in both single-layer and multi-layer traffic using the above three measures. We should note that simulations with additional video sequences have demonstrated results similar to those shown throughout this section.

Fig. 68. Comparison of variance between synthetic and original traffic in (a) single-layer Star Wars IV and (b) The Silence of the Lambs base layer. (Axes: variance (bytes, logarithmic scale) versus time interval (s); legend: actual, our model, GBAR, Gamma_B, Nested_AR.)

1. Single-layer and the Base Layer Traffic

We first show QQ plots of the synthetic single-layer Star Wars IV and the synthetic base layer of The Silence of the Lambs that are generated by our model in Fig. 67 (a) and (b), respectively.
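The variance-over-time-intervals measure described before this subsection, which underlies Fig. 68, can be sketched in Python as follows. The function names and the `frame_rate` parameter are hypothetical; dropping trailing frames that do not fill a complete window and using the population variance are assumptions made for this sketch.

```python
from statistics import pvariance

def interval_variance(frame_sizes, m):
    """Variance of the traffic volume aggregated over windows of m frames.

    The sequence of length l is cut into n = l // m consecutive windows
    (the columns of the m x n matrix); each window is summed and the
    variance of the n sums is returned.
    """
    n = len(frame_sizes) // m  # incomplete trailing window is discarded
    sums = [sum(frame_sizes[k * m:(k + 1) * m]) for k in range(n)]
    return pvariance(sums)

def variance_curve(frame_sizes, frame_rate, intervals_s):
    """Map each time interval (seconds) to its aggregated variance."""
    return {t: interval_variance(frame_sizes, max(1, round(t * frame_rate)))
            for t in intervals_s}
```

Evaluating `variance_curve` on the original and on the synthetic frame sizes for the same set of intervals gives the two curves that are compared in a plot such as Fig. 68.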
As shown in Fig. 67, the generated frame sizes and the original traffic are almost identical. In Fig. 68, we compare the variance of the original traffic with that of the synthetic traffic generated from different models at various time intervals. The figure shows that the second-order moment of our synthetic traffic is in good agreement with that of the original one.

We also compare the accuracy of several models using a leaky-bucket simulation. To understand the performance differences between various models, we define the relative error $e$ as the normalized difference between the actual packet loss $p$ observed in the buffer fed with the original traffic and the loss $p_{\text{model}}$ observed using the synthetic traffic generated by each of the models:

$$e = \frac{|p - p_{\text{model}}|}{p}. \qquad (5.26)$$

In Table V, we list the values of $e$ for various buffer capacities and drain rates $d$. As shown in the table, the synthetic traffic generated by our model provides a very accurate estimate of the actual data loss probability $p$ and significantly outperforms the other methods. In addition, our synthetic traffic is approximately 30% more accurate than the i.i.d. models of prior work in estimating the loss ratio of P-frames.

Table V. Relative Data Loss Error e in Star Wars IV.

Buffer capacity   Traffic type     Drain rate
                                   $2\bar{r}$   $4\bar{r}$   $5\bar{r}$
10 ms             Our Model        1.80%    0.93%    0.50%
                  GOP-GBAR [31]    2.44%    2.51%    4.01%
                  Nested AR [68]   4.02%    2.05%    5.63%
                  Gamma A [95]     5.54%    1.04%    0.99%
                  Gamma B [95]     5.76%    1.81%    1.15%
20 ms             Our Model        0.93%    0.61%    1.13%
                  GOP-GBAR [31]    3.84%    2.16%    3.77%
                  Nested AR [68]   5.81%    2.77%    8.46%
                  Gamma A [95]     5.20%    0.61%    2.57%
                  Gamma B [95]     4.89%    1.93%    2.05%
30 ms             Our Model        0.25%    0.33%    0.95%
                  GOP-GBAR [31]    4.94%    3.33%    5.68%
                  Nested AR [68]   6.94%    4.14%    9.92%
                  Gamma A [95]     4.88%    1.10%    4.48%
                  Gamma B [95]     4.67%    2.17%    4.03%

In Fig. 69, we show the relative error $e$ of synthetic traffic generated from different models in H.26L Starship Troopers coded at Q = 1, 31, given $d = \bar{r}$. Since the GOP-GBAR model [31] is specifically developed for MPEG traffic, we do not apply it to H.26L sequences. The figure shows that our model outperforms the other three models in Starship Troopers coded at small Q and performs as well as model Gamma A [95] in the large Q case (the relative error $e$ of both models is less than 1% in Fig. 69 (b)).

Fig. 69. Given $d = \bar{r}$, the error $e$ of various synthetic traffic in H.26L Starship Troopers coded at (a) Q = 1 and (b) Q = 31. (Axes: relative error versus buffer capacity (ms); legend: our model, Nested AR, Gamma_A, Gamma_B.)

2. The Enhancement Layer Traffic

We evaluate the accuracy of the synthetic enhancement layer by using QQ plots and show two examples in Fig. 70, which displays two QQ plots for the synthetic The Silence of the Lambs and Star Wars IV enhancement-layer traffic. The figure shows that the synthetic frame sizes in both sequences have the same distribution as those in the original traffic.

We also compare the variance of the original traffic and that of the synthetic traffic in Fig. 71. Due to the computational complexity of model [115] in calculating long sequences, we only take the first 5000 frames of Star Wars IV and The Silence of the Lambs. As observed from the figure, our model preserves the second-order moment of the original traffic well. We next examine the data loss ratio predicted by our synthetic traffic passed through a generic buffer as shown in the previous section.
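The buffer test and the relative error of (5.26) can be sketched in Python as follows. This is a simplified illustration rather than the simulator used for the reported results: the function names are hypothetical, frames are assumed to arrive as instantaneous bursts at a fixed frame rate, and a capacity given in time units is assumed to hold the number of bytes drained in that time.

```python
def overflow_loss_ratio(frame_sizes, frame_rate, drain_multiple, capacity_s):
    """Fraction of bytes lost when the frames feed a finite buffer.

    Assumptions: one frame arrives every 1/frame_rate seconds, the buffer
    drains at drain_multiple times the average traffic rate, and a
    capacity of capacity_s seconds holds capacity_s * drain bytes.
    """
    mean_rate = sum(frame_sizes) * frame_rate / len(frame_sizes)  # bytes/s
    drain = drain_multiple * mean_rate                            # bytes/s
    capacity = capacity_s * drain                                 # bytes
    drained_per_frame = drain / frame_rate
    backlog = 0.0
    lost = 0.0
    for size in frame_sizes:
        # Drain what was served since the previous frame arrived.
        backlog = max(0.0, backlog - drained_per_frame)
        accepted = min(size, capacity - backlog)
        lost += size - accepted
        backlog += accepted
    return lost / sum(frame_sizes)

def relative_error(p_actual, p_model):
    """Relative error e = |p - p_model| / p, following (5.26)."""
    return abs(p_actual - p_model) / p_actual
```

Running `overflow_loss_ratio` on the original and on the synthetic frame sizes with the same capacity and drain multiple, and passing the two results to `relative_error`, yields one entry of a comparison like Table V under the assumptions above.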
Recall that the model in [9] is only applicable to sequences with a CBR base layer and that the one in [115] is suitable only for short sequences. Therefore, we are not able to show results using leaky-bucket simulations for these multi-layer models given the nature of our sample sequences.

In Fig. 72 and Fig. 73, we show the overflow data loss ratio of the enhancement layers in both The Silence of the Lambs (54,000 frames) and Star Wars IV (108,000 frames) with different drain rates $d$ for buffer capacity $c$ = 10 ms and $c$ = 30 ms, respectively. The x-axis in the figures represents the ratio of the drain rate to the average traffic rate $\bar{r}$. The figures show that the synthetic enhancement layer preserves the temporal information of the original traffic very well.

Fig. 70. QQ plots for the synthetic enhancement-layer traffic: (a) Star Wars IV and (b) The Silence of the Lambs. (Axes: synthetic frame size versus original frame size.)

Fig. 71. Comparison of variance between the synthetic and original enhancement layer traffic in (a) Star Wars IV and (b) The Silence of the Lambs. (Axes: variance (bytes) versus time interval (s); legend: actual, our model, Zhao et al.)

Fig. 72. Overflow data loss ratio of the original and synthetic enhancement layer traffic for $c$ = 10 ms for (a) The Silence of the Lambs and (b) Star Wars IV. (Axes: data loss ratio versus drain rate; legend: actual, synthetic.)
Fig. 73. Overflow data loss ratio of the original and synthetic enhancement layer traffic for $c$ = 30 ms for (a) The Silence of the Lambs and (b) Star Wars IV. (Axes: data loss ratio versus drain rate; legend: actual, synthetic.)

CHAPTER VI

CONCLUSION AND FUTURE WORK

The ideas presented in this document have been expressed in terms of an R-D modeling framework and a traffic model for scalable video coders, with the final goal of providing high quality video to end users. In this chapter, we summarize our major results and indicate some directions for extending this work.

A. Conclusion

Rate-distortion analysis has attracted great research interest since Shannon’s work was published [97]. Previous work has focused to a large extent on deriving ideal bounds, which give insight into achievable and non-achievable regions but are not directly applicable in practice. Instead, one goal of this work is to provide a practically useful R-D function for scalable coders.

In Chapter III, we first modeled the statistical properties of the input to scalable coders and then presented a detailed analysis of rate and distortion for scalable coders. We also reviewed the performance bound for a generic hybrid coder using motion-compensated prediction. Based on the understanding of scalable coding processes and approximation theory, we derived a distortion model and an operational R-D model. Although this R-D model is accurate, its complex form limits its usage in video streaming applications.

Therefore, we proposed another operational R-D model for streaming applications. We expressed it in the PSNR domain for the convenience of quality control.
Interestingly, we found that in the PSNR domain, both our R-D model and the the- oretical upper bound in [81] have a similar concave shape in the working range of scalable coders, which also matches the trend of actual R-PSNR curves. 138 R-D model Sender Router Receiver packet loss p(t) compressed video ( ) ( ( 1 ) , ( 1 ) )r t f r t p t= − − decide sending rate constant quality video Congestion control Fig. 74. R-D based quality control. In view of the inherent lack of stable quality associated with the base layer, we provided a quality control algorithm to provide constant quality video to end users in both CBR and VBR channels. In CBR channel, the algorithm proposed in Chapter IV performs better than most existing constant quality algorithms, in regard to both computational cost and performance. Furthermore, we studied modified Kelly control and showed that it can provide a stable environment for video transmission. Thus, we coupled our R-D model with this controller to achieve constant quality even under varying network conditions. The whole work in Chapter III and IV can be depicted in Fig. 74. In Chapter V, we presented a framework for modeling H.26L and MPEG-4 multi- layer full-length VBR video traffic. This work precisely captured the inter- and intra- GOP correlation in compressed VBR sequences, by incorporating wavelet-domain analysis into time-domain modeling. Whereas many previous traffic models are devel- oped at slice-level or even block-level [95], our framework uses frame-size level, which allows us to examine the loss ratio for each type of frames and apply other methods to improve the video quality at the receiver. We also proposed novel methods to model cross-layer correlation in multi-layer sequences and successfully described the inter-layer correlation. 139 B. 
Future Work In future work, we are interested in designing peer-to-peer streaming systems, where scalable video coders will play an important role and our traffic model will be helpful in its design. A peer-to-peer streaming system differs from a general peer-to-peer system in three aspects: (1) Peer-to-peer video streaming uses streaming mode and has high user requirements on video quality; (2) In a peer-to-peer video streaming system, a requesting peer can also play the role of a supplying peer as long as a certain amount of media data has been stored; (3) A requesting peer in a peer-to-peer streaming system can receive video data from multiple supplying peers simultaneously, while a requesting peer in a general peer-to-peer system usually only has one supplying peer at one time instant. There are two challenges in designing a peer-to-peer streaming system. One is to cooperate multiple supplying peers with high bandwidth utilization, and the other is to ensure a continuous playback with graceful quality adaptation. To address these two issues, we plan to design a scalable peer-to-peer video streaming system. Although a fine granularly scalable coded bitstream is preferred, general layered coded bitstreams are also applicable. In the proposed scheme, we will abide by a differentiated admission policy, which means that if a supplying peer has enough resource to provide service to several re- questing peers, we admit the requesting peer with the highest outgoing bandwidth. Intuitively, this policy has two benefits: (1) It will quickly increase the system ca- pacity. If a requesting peer with the highest outgoing bandwidth has been admitted, sometime later it will become another supplying peer and is able to contribute more to the system than those peers with less outgoing bandwidth; (2) It will encourage 140 the requesting peers to offer more outgoing bandwidth. In what follows, we discuss how to cooperate supplying peers in this scheme. 1. 
Supplying Peers Cooperation System Assume that for each requesting peer Pr, there is a supplying peer set Ps, which includes M supplying peers P 1s , P 2 s , . . . , P M s at time t and these supplying peers are selected via existing peer-to-peer lookup mechanisms (e.g., [101]). We also define the incoming bandwidth of Pr is Ir and the outgoing bandwidth of Pr is Or. It is obvious that if a supplying peer P is has the higher layers of the data stream, it must also have the lower layers. Since the base layer bandwidth is guaranteed, we know that the outgoing bandwidth Or is always larger than or equal to the base layer bandwidth Wb. We describe the cooperation scheme as follows: • To maximize the outgoing bandwidth of supplying peers, we select the first supplying peer as the lower layer supplying peer. Each packet is labeled with a layer number and a packet number. • After transmitting the base layer (which is CBR coded in FGS coders), the incoming bandwidth of requesting peer Pr is updated to Ir − Wb. Although supplying peer PMs has the highest outgoing bandwidth, its sending rate might be slow due to various reasons (e.g., requests from other peers). If the enhance- ment layer can be finely divided, the requesting peer will be able to allocate different portion of the enhancement layer to different supplying peers to achieve fast transmission and better video quality. • If a supplying peer P is fails, the buffer at the requesting peer side will allow a quick supplying-peer switch without quick quality degradation. If no other supplying peers can take over the data that P is used to transmit, the sending 141 portion of other supplying peers will be adjusted and the video quality at the receiver might be degraded. In addition, a quality control scheme is often in demand for continuous playback. 2. 
Scalable Rate Control System

Since the current best-effort Internet does not provide any QoS guarantees to video applications, end users often suffer from quality fluctuations and playout starvation (i.e., receiver-buffer underflow). The former results mainly from varying bandwidth, while the latter happens when the receiver buffer is empty and the playout rate is faster than the incoming frame rate. Many studies have been conducted to provide good video quality to end users. Steinbach et al. [100] propose a client-controlled method that flexibly scales the playout rate to prevent playout starvation. However, end users often prefer a constant playout rate. As an alternative, adaptive rate control mechanisms have been proposed that adjust the sending rate according to the available bandwidth and the feedback from receiver buffers [69], [88], [94]. The fundamental idea of these mechanisms is to allocate bandwidth dynamically. When the total bandwidth of all available supplying peers is insufficient to support the bitstream requested by a requesting peer $P_r$, $P_r$ can request either more frames covering fewer layers or fewer frames covering more layers. The switch threshold TH is determined by the buffer condition, the playout rate, and the available incoming bandwidth $I_r$.

REFERENCES

[1] P. Abry and D. Veitch, “Wavelet analysis of long-range-dependent traffic,” IEEE Trans. Inform. Theory, vol. 44, Jan. 1998.
[2] J. G. Apostolopoulos and S. J. Wee, “Video compression standards,” available at Apr. 2002.
[3] D. Bansal and H. Balakrishnan, “Binomial congestion control algorithms,” in Proc. IEEE INFOCOM, Anchorage, Alaska, Apr. 2001, pp. 631–640.
[4] W. R. Bennett, “Spectra of quantized signals,” Bell Syst. Tech. Journal, vol. 27, pp. 446–472, July 1948.
[5] T. Berger, Rate Distortion Theory, Englewood Cliffs, NJ: Prentice Hall, 1971.
[6] J. A.
Bilmes, “A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models,” International Computer Science Institute, Berkeley, California, Apr. 1998.
[7] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, “An architecture for differentiated services,” IETF RFC 2475, 1998.
[8] R. Braden, D. Clark, and S. Shenker, “Integrated services in the Internet architecture: An overview,” IETF RFC 1633, 1994.
[9] K. Chandra and A. R. Reibman, “Modeling one- and two-layer variable bit rate video,” IEEE/ACM Trans. on Networking, vol. 7, pp. 398–413, June 1999.
[10] D. Clark and W. Fang, “Explicit allocation of best effort packet delivery service,” IEEE/ACM Trans. on Networking, vol. 6, pp. 362–373, Aug. 1998.
[11] J.-J. Chen and D. W. Lin, “Optimal bit allocation for coding of video signals over ATM networks,” IEEE J. on Sel. Areas in Comm., vol. 15, pp. 1002–1015, Aug. 1997.
[12] T. P.-C. Chen and T. Chen, “Markov modulated punctured auto-regressive processes for video traffic and wireless channel modeling,” in Packet Video, Apr. 2002.
[13] T. Chiang and Y. Q. Zhang, “A new rate control scheme using quadratic distortion model,” IEEE Trans. on CSVT, vol. 7, pp. 246–250, Feb. 1997.
[14] A. Cohen and J.-P. D’ales, “Nonlinear approximation of random functions,” SIAM Journal on Appl. Math, vol. 57, pp. 518–540, Apr. 1997.
[15] A. Cohen, I. Daubechies, O. G. Guleryuz, and M. T. Orchard, “On the importance of combining wavelet-based nonlinear approximation with coding strategies,” IEEE Trans. on Information Theory, vol. 48, pp. 1895–1921, July 2002.
[16] A. L. Corte, A. Lombardo, S. Palazzo, and S. Zinna, “Modeling activity in VBR video sources,” Signal Processing: Image Communication, vol. 3, pp. 167–178, June 1991.
[17] T. M. Cover and J. A. Thomas, Elements of Information Theory, New York: John Wiley, 1991.
[18] M. Dai, D. Loguinov, and H.
Radha, “Statistical analysis and distortion modeling of MPEG-4 FGS,” in Proc. IEEE ICIP, Barcelona, Spain, Sept. 2003, pp. 301–304.
[19] A. Demers, S. Keshav, and S. Shenker, “Analysis and simulation of a fair queuing algorithm,” in ACM SIGCOMM, vol. 1, pp. 3–26, 1990.
[20] R. A. Devore, B. Jawerth, and B. J. Lucier, “Image compression through wavelet transform coding,” IEEE Trans. on Information Theory, vol. 38, pp. 719–746, Mar. 1992.
[21] R. A. Devore, “Nonlinear approximation,” in Acta Numerica, New York: Cambridge Univ. Press, Cambridge, 1998.
[22] W. Ding and B. Liu, “Rate control of MPEG video coding and recording by rate-quantization modeling,” IEEE Trans. on CSVT, vol. 6, pp. 12–20, Feb. 1996.
[23] P. Embrechts, F. Lindskog, and A. McNeil, “Correlation and dependence in risk management: Properties and pitfalls,” available at Aug. 1999.
[24] A. Erramilli, O. Narayan, and W. Willinger, “Experimental queueing analysis with long-range dependent packet traffic,” IEEE/ACM Trans. Networking, vol. 4, pp. 209–223, Apr. 1996.
[25] T. Eude, R. Grisel, H. Cherifi, and R. Debrie, “On the distribution of the DCT coefficients,” in Proc. IEEE Conf. Acoustics, Speech, Signal Processing, vol. 5, Apr. 1994, pp. 365–368.
[26] V. Firoiu and M. Borden, “A study of active queue management for congestion control,” in Proc. IEEE INFOCOM, Tel-Aviv, Israel, Mar. 2000, pp. 1435–1444.
[27] F. H. P. Fitzek and M. Reisslein, “MPEG-4 and H.263 video traces for network performance evaluation (extended version),” available at berlin.de, Oct. 2000.
[28] S. Floyd and V. Jacobson, “Random early detection gateways for congestion avoidance,” IEEE/ACM Trans. on Networking, vol. 1, pp. 397–413, Aug. 1993.
[29] S. Floyd, “TCP and explicit congestion notification,” ACM Computer Communication Review, vol. 24, pp. 8–23, Oct. 1994.
[30] S. Floyd, M. Handley, and J. Padhye, “Equation-based congestion control for unicast applications,” in Proc. ACM SIGCOMM, Stockholm, Sweden, Sept.
2000, pp. 43–56.
[31] M. Frey and S. Nguyen-Quang, “A Gamma-based framework for modeling variable-rate MPEG video sources: The GOP GBAR model,” IEEE/ACM Trans. on Networking, vol. 8, pp. 710–719, Dec. 2000.
[32] M. W. Garrett and W. Willinger, “Analysis, modeling and generation of self-similar VBR video traffic,” in Proc. ACM SIGCOMM, London, UK, Aug. 1994, pp. 269–280.
[33] A. Gersho, “Asymptotically optimal block quantization,” IEEE Trans. on Information Theory, vol. 25, pp. 373–380, July 1979.
[34] A. Gersho and R. Gray, Vector Quantization and Signal Compression, Boston, MA: Kluwer Academic Publishers, 1992.
[35] B. Girod, “The efficiency of motion-compensating prediction for hybrid coding of video sequences,” IEEE Journal on Selected Areas in Communications, vol. SAC-5, pp. 1140–1154, Aug. 1987.
[36] H. Gish and J. N. Pierce, “Asymptotically efficient quantizing,” IEEE Trans. on Information Theory, vol. IT-14, pp. 676–683, Sept. 1968.
[37] R. M. Gray, Source Coding Theory, Boston, MA: Kluwer Academic Publishers, 1990.
[38] H.-M. Hang and J.-J. Chen, “Source model for transform video coder and its application – Part I: fundamental theory,” IEEE Trans. on CSVT, vol. 7, pp. 287–298, Apr. 1997.
[39] B. G. Haskell, A. Puri, and A. N. Netravali, Digital Video: An Introduction to MPEG-2, Boston, MA: Kluwer Academic Publishers, 2002.
[40] Z. He and S. K. Mitra, “A unified rate-distortion analysis framework for transform coding,” IEEE Trans. on CSVT, vol. 11, pp. 1221–1236, Dec. 2001.
[41] D. P. Heyman, A. Tabatabai, and T. V. Lakshman, “Statistical analysis and simulation study of video teleconference traffic in ATM networks,” IEEE Trans. on CSVT, vol. 2, pp. 49–59, Mar. 1992.
[42] D. P. Heyman, “The GBAR source model for VBR video conferences,” IEEE/ACM Trans. on Networking, vol. 5, pp. 554–560, Aug. 1997.
[43] C.-Y. Hsu, A. Ortega, and A.
Reibman, “Joint selection of source and channel rate for VBR video transmission under ATM policing constraints,” IEEE Journal on Selected Areas in Communications, vol. 15, pp. 1016–1028, Aug. 1997.
[44] C. Huang, M. Devetsikiotis, I. Lambadaris, and A. R. Kaye, “Modeling and simulation of self-similar variable bit rate compressed video: A unified approach,” in Proc. ACM SIGCOMM, Cambridge, MA, Aug. 1995, pp. 114–125.
[45] D. Huffman, “A method for the construction of minimal redundancy codes,” in Proc. IRE, Sept. 1952, pp. 1098–1101.
[46] H. E. Hurst, “Long-term storage capacity of reservoirs,” Trans. on American Society of Civil Engineers, vol. 116, pp. 770–799, 1951.
[47] T. Y. Hwang and P. H. Huang, “On new moment estimation of parameters of the Gamma distribution using its characterization,” Annals of the Institute of Statistical Mathematics, vol. 54, no. 4, 2002.
[48] ISO/IEC JTC1, “Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mb/s – Part 2: video,” ISO/IEC 11172-2 (MPEG-1), 1993.
[49] ISO/IEC JTC1, “Information technology – coding of audio-visual objects – Part 2: video,” ISO/IEC 14496-2 (MPEG-4), 1999.
[50] ITU-T, “Codec for videoconferencing using primary digital group transmission,” ITU-T Recommendation H.120; version 1, 1984; version 2, 1988.
[51] ITU-T, “Video codec for audiovisual services at p × 64 kbits/s,” ITU-T Recommendation H.261; version 1, 1990; version 2, 1993.
[52] ITU-T and ISO/IEC JTC1, “Generic coding of moving pictures and associated audio information – Part 2: video,” ISO/IEC 13818-2 (MPEG-2), 1994.
[53] ITU-T, “Video coding for low bitrate communication,” ITU-T Recommendation H.263; version 1, 1995; version 2, 1998.
[54] N. Jayant and P. Noll, Digital Coding of Waveforms, Englewood Cliffs, NJ: Prentice Hall, 1984.
[55] JPEG, “JPEG2000 part I final committee draft version 1.0,” ISO/IEC JTC1/SC29 WG1, Mar. 2000.
[56] S.-R. Kang, Y. Zhang, M. Dai, and D.
Loguinov, “Multi-layer active queue management and congestion control for scalable video streaming,” in Proc. IEEE ICDCS, Tokyo, Japan, Mar. 2004, pp. 768–777.
[57] K. Kar, S. Sarkar, and L. Tassiulas, “Simple rate control algorithm for max total user utility,” in Proc. IEEE INFOCOM, Anchorage, Alaska, Apr. 2001, pp. 133–141.
[58] F. P. Kelly, A. Maulloo, and D. Tan, “Rate control in communication networks: Shadow prices, proportional fairness and stability,” Journal of the Operational Research Society, vol. 49, pp. 237–252, 1998.
[59] M. Krunz and S. K. Tripathi, “On the characterization of VBR MPEG streams,” in Proc. ACM SIGMETRICS, Seattle, WA, June 1997, pp. 192–202.
[60] S. Kunniyur and R. Srikant, “End-to-end congestion control schemes: Utility functions, random losses and ECN marks,” in Proc. IEEE INFOCOM, Tel-Aviv, Israel, Mar. 2000, pp. 1323–1332.
[61] D. Leviatan and I. A. Shevchuk, “Coconvex approximation,” Journal of Approx. Theory, vol. 118, pp. 20–65, 2002.
[62] A. Lombardo, G. Morabito, and G. Schembra, “An accurate and treatable Markov model of MPEG-Video traffic,” in Proc. IEEE INFOCOM, San Francisco, CA, Mar. 1998, pp. 217–224.
[63] A. Lombardo, G. Morabito, S. Palazzo, and G. Schembra, “A Markov-based algorithm for the generation of MPEG sequences matching intra- and inter-GOP correlation,” European Trans. on Telecommunications, vol. 12, pp. 127–142, Mar./Apr. 2001.
[64] S. H. Low and D. E. Lapsley, “Optimization flow control I: Basic algorithm and convergence,” IEEE/ACM Trans. on Networking, vol. 7, pp. 861–874, Dec. 1999.
[65] W. Li, “Overview of fine granularity scalability in MPEG-4 video standard,” IEEE Trans. on CSVT, pp. 301–317, Mar. 2001.
[66] J. Lin and A. Ortega, “Bit-rate control using piecewise approximation rate-distortion characteristics,” IEEE Trans. on CSVT, vol. 8, pp. 446–459, Aug. 1998.
[67] F. Ling, W. Li, and H. Sun, “Bitplane coding of DCT coefficients for image and video compression,” in Proc.
SPIE Visual Communications and Image Processing, San Jose, CA, Jan. 1999, pp. 500–508.
[68] D. Liu, E. I. Sára, and W. Sun, “Nested auto-regressive processes for MPEG-encoded video traffic modeling,” IEEE Trans. on CSVT, vol. 11, pp. 169–183, Feb. 2001.
[69] T. Liu, H.-J. Zhang, W. Qi, and F. Qi, “A systematic rate controller for MPEG-4 FGS video streaming,” Multimedia Systems, vol. 8, Dec. 2002.
[70] D. Loguinov and H. Radha, “Increase-decrease congestion control for real-time streaming: Scalability,” in Proc. IEEE INFOCOM, New York, June 2002, pp. 525–534.
[71] S. Ma and C. Ji, “Modeling video traffic using wavelets,” IEEE Communication Letters, vol. 2, no. 4, pp. 100–103, Apr. 1998.
[72] S. Ma and C. Ji, “Modeling heterogeneous network traffic in wavelet domain,” IEEE/ACM Trans. on Networking, vol. 9, pp. 634–649, Oct. 2001.
[73] B. Maglaris, D. Anastassiou, P. Sen, G. Karlsson, and J. Robbins, “Performance models of statistical multiplexing in packet video communications,” IEEE Trans. on Comm., vol. 36, pp. 834–844, July 1988.
[74] S. Mallat and F. Falzon, “Analysis of low bit rate image transform coding,” IEEE Trans. on Signal Processing, vol. 46, pp. 1027–1042, Apr. 1998.
[75] L. Massoulié, “Stability of distributed congestion control with heterogeneous feedback delays,” IEEE Trans. on Automatic Control, vol. 47, pp. 895–902, June 2002.
[76] B. Melamed and D. E. Pendarakis, “Modeling full-length VBR video using Markov-renewal-modulated TES models,” IEEE Journal on Selected Areas in Communications, vol. 16, pp. 600–611, June 1998.
[77] J. L. Mitchell, MPEG Video: Compression Standard, Boston, MA: Kluwer Academic Publishers, 2002.
[78] MPEG, “Coding of moving pictures and audio,” ISO/IEC JTC1/SC29/WG11 N3908, Jan. 2001.
[79] F. Muller, “Distribution shape of two-dimensional DCT coefficients of natural images,” Electronics Letters, vol. 29, Oct. 1993.
[80] A. N. Netravali and B. G. Haskell, Digital Pictures Presentation, Compression, and Standards.
New York: Plenum, 1988.
[81] J. O’Neal and T. Natarajan, “Coding isotropic images,” IEEE Trans. on Information Theory, vol. 23, pp. 697–707, Nov. 1977.
[82] A. Ortega and K. Ramchandran, “Rate-distortion methods for image and video compression,” IEEE Signal Processing Magazine, vol. 15, pp. 23–50, Nov. 1998.
[83] J. Padhye, V. Firoiu, D. F. Towsley, and J. F. Kurose, “Modeling TCP Reno performance: A simple model and its empirical validation,” IEEE/ACM Trans. on Networking, vol. 8, pp. 133–145, Apr. 2000.
[84] J. Postel, “User datagram protocol,” RFC 768, IETF standard, Aug. 1980.
[85] J. Postel, “Transmission control protocol: DARPA Internet program protocol specification,” RFC 793, IETF standard, Sept. 1981.
[86] H. Radha, M. V. Schaar, and Y. Chen, “The MPEG-4 fine-grained scalable video coding method for multimedia streaming over IP,” IEEE Trans. on Multimedia, vol. 3, pp. 53–68, Mar. 2001.
[87] M. Reisslein, J. Lassetter, S. Ratnam, O. Lotfallah, F. H. P. Fitzek, and S. Panchanathan, “Video traces for network performance evaluation,” available at 2004.
[88] R. Rejaie and M. Handley, “Quality adaptation for congestion controlled video playback over the Internet,” in Proc. ACM SIGCOMM, Cambridge, MA, Sept. 1999, pp. 189–200.
[89] R. Rejaie, M. Handley, and D. Estrin, “RAP: An end-to-end rate-based congestion control mechanism for real-time streams in the Internet,” in Proc. IEEE INFOCOM, New York, USA, Mar. 1999, pp. 1337–1345.
[90] V. J. Ribeiro, R. H. Riedi, M. S. Crouse, and R. G. Baraniuk, “Multiscale queuing analysis of long-range-dependent network traffic,” in Proc. IEEE INFOCOM, Tel-Aviv, Israel, Mar. 2000, pp. 1026–1035.
[91] J. Rissanen and G. Langdon, “Arithmetic coding,” IBM Journal of Research and Development, vol. 23, pp. 149–162, Mar. 1979.
[92] O. Rose, “Statistical properties of MPEG video traffic and their impact on traffic modeling in ATM systems,” in Proc.
of the 20th Annual Conference on Local Computer Networks, Minneapolis, MN, Oct. 1995, pp. 397–406.
[93] O. Rose, “Simple and efficient models for variable bit rate MPEG video traffic,” Performance Evaluation, vol. 30, pp. 69–85, 1997.
[94] D. Saparilla and K. Ross, “Optimal streaming of layered video,” in Proc. IEEE INFOCOM, Tel-Aviv, Israel, Mar. 2000, pp. 737–746.
[95] U. K. Sarkar, S. Ramakrishnan, and D. Sarkar, “Modeling full-length video using Markov-modulated Gamma-based framework,” IEEE/ACM Trans. on Networking, vol. 11, pp. 638–649, Aug. 2003.
[96] M. van der Schaar, “System and network-constrained video compression,” Ph.D. dissertation, Eindhoven University of Technology and Delft University of Technology, Netherlands, 2001.
[97] C. E. Shannon, “A mathematical theory of communication,” Bell Syst. Tech. Journal, vol. 27, pp. 379–423, 1948.
[98] M. Shreedhar and G. Varghese, “Efficient fair queuing using deficit round-robin,” IEEE/ACM Trans. on Networking, vol. 4, pp. 375–385, June 1996.
[99] S. R. Smoot and L. A. Rowe, “Study of DCT coefficient distributions,” in Proc. SPIE Symposium on Electr. Imaging, San Jose, CA, vol. 2657, Jan. 1996.
[100] E. Steinbach, N. Farber, and B. Girod, “Adaptive playout for low latency video streaming,” in Proc. IEEE ICIP, Thessaloniki, Greece, Oct. 2001, pp. 962–965.
[101] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, “Chord: A scalable peer-to-peer lookup service for Internet applications,” in Proc. ACM SIGCOMM, San Diego, CA, Aug. 2001, pp. 149–160.
[102] G. J. Sullivan and T. Wiegand, “Rate-distortion optimization for video compression,” IEEE Signal Processing Magazine, vol. 15, pp. 74–90, Nov. 1998.
[103] D. S. Taubman, “Directionality and scalability in image and video compression,” Ph.D. dissertation, University of California at Berkeley, Berkeley, CA, 1994.
[104] A. J. Viterbi and J. K. Omura, Principles of Digital Communication and Coding, New York: McGraw-Hill, 1979.
[105] Q.
Wang, Z. Xiong, F. Wu, and S. Li, “Optimal rate allocation for progressive fine granularity scalable video coding,” IEEE Signal Processing Letters, vol. 9, pp. 33–39, Feb. 2002.
[106] Y. Wang, M. T. Orchard, and A. R. Reibman, “Multiple description image coding for noisy channels by pairing transform coefficients,” in Proc. IEEE Workshop on Multimedia Signal Processing, Princeton, NJ, June 1997, pp. 419–424.
[107] Y. Wang, J. Ostermann, and Y.-Q. Zhang, Video Processing and Communications, NJ: Prentice Hall, 2001.
[108] D. Wu, Y. T. Hou, B. Li, W. Zhu, Y.-Q. Zhang, and H. J. Chao, “An end-to-end approach for optimal mode selection in Internet video communication: Theory and application,” IEEE Journal on Selected Areas in Communications, vol. 18, pp. 1–20, June 2000.
[109] D. Wu, Y. T. Hou, W. Zhu, H.-J. Lee, T. Chiang, Y.-Q. Zhang, and H. J. Chao, “On end-to-end architecture for transporting MPEG-4 video over the Internet,” IEEE Trans. on CSVT, vol. 10, pp. 923–941, Sept. 2000.
[110] D. Wu, Y. T. Hou, W. Zhu, Y.-Q. Zhang, and J. M. Peha, “Streaming video over the Internet: approaches and directions,” IEEE Trans. on CSVT, vol. 11, pp. 282–300, Mar. 2001.
[111] G. S. Yovanof and S. Liu, “Statistical analysis of the DCT coefficients and their quantization error,” in Conf. Rec. 30th Asilomar Conf. Signals, Systems, Computers, Pacific Grove, CA, Nov. 1996, pp. 601–605.
[112] Y. Zhang, S.-R. Kang, and D. Loguinov, “Delayed stability and performance of distributed congestion control,” in Proc. ACM SIGCOMM, Portland, OR, Aug. 2004, pp. 307–318.
[113] L. Zhao, J. W. Kim, and C.-C. Kuo, “MPEG-4 FGS video streaming with constant-quality rate control and differentiated forwarding,” in Proc. SPIE Visual Communications and Image Processing, San Jose, CA, Jan. 2002, pp. 230–241.
[114] X. J. Zhao, Y. W. He, S. Q. Yang, and Y. Z. Zhong, “Rate allocation of equal image quality for MPEG-4 FGS video streaming,” in Packet Video, Pittsburgh, PA, Apr. 2002.
[115] J.-A. Zhao, B.
Li, and I. Ahmad, “Traffic model for layered video: An approach on Markovian arrival process,” in Packet Video, Nantes, France, Apr. 2003.

VITA

Min Dai received her B.S. and M.S. degrees in precise instruments from Shanghai Jiao Tong University, China, in 1996 and 1998, respectively. She has been pursuing her Ph.D. degree in electrical engineering at Texas A&M University since 1999. She was a research intern with LSI Logic Company, San Jose, CA, from January 2002 to August 2002. Afterwards, she joined the Internet Research Lab, Department of Computer Science, Texas A&M University. Her research interests include scalable video streaming, video traffic modeling, and image denoising. She may be contacted at:

Min Dai
C/O Shanren Dai
11 Shucheng Road, the 8th Floor
Hefei, Anhui, 230001
P. R. China
