Doctoral dissertation: Rate-Distortion Analysis and Traffic Modelling of Scalable Video Coders
CHAPTER
I INTRODUCTION
A. Problem Statement
B. Objective and Approach
C. Main Contributions
D. Dissertation Overview
II SCALABLE VIDEO CODING
A. Video Compression Standards
B. Basics in Video Coding
1. Compression
2. Quantization and Binary Coding
C. Motion Compensation
D. Scalable Video Coding
1. Coarse Granular Scalability
a. Spatial Scalability
b. Temporal Scalability
c. SNR/Quality Scalability
2. Fine Granular Scalability
III RATE-DISTORTION ANALYSIS FOR SCALABLE CODERS
A. Motivation
B. Preliminaries
1. Brief R-D Analysis for MCP Coders
2. Brief R-D Analysis for Scalable Coders
C. Source Analysis and Modeling
1. Related Work on Source Statistics
2. Proposed Model for Source Distribution
D. Related Work on Rate-Distortion Modeling
1. R-D Functions of MCP Coders
2. Related Work on R-D Modeling
3. Current Problems
E. Distortion Analysis and Modeling
1. Distortion Model Based on Approximation Theory
a. Approximation Theory
b. The Derivation of Distortion Function
2. Distortion Modeling Based on Coding Process
F. Rate Analysis and Modeling
1. Preliminaries
2. Markov Model
G. A Novel Operational R-D Model
1. Experimental Results
H. Square-Root R-D Model
1. Simple Quality (PSNR) Model
2. Simple Bitrate Model
3. SQRT Model
IV QUALITY CONTROL FOR VIDEO STREAMING
A. Related Work
1. Congestion Control
a. End-to-End vs. Router-Supported
b. Window-Based vs. Rate-Based
2. Error Control
a. Forward Error Correction (FEC)
b. Retransmission
c. Error Resilient Coding
d. Error Concealment
B. Quality Control in Internet Streaming
1. Motivation
2. Kelly Controls
3. Quality Control in CBR Channel
4. Quality Control in VBR Networks
5. Related Error Control Mechanism
V TRAFFIC MODELING
A. Related Work on VBR Traffic Modeling
1. Single Layer Video Traffic
a. Autoregressive (AR) Models
b. Markov-modulated Models
c. Models Based on Self-similar Process
d. Other Models
2. Scalable Video Traffic
B. Modeling I-Frame Sizes in Single-Layer Traffic
1. Wavelet Models and Preliminaries
2. Generating Synthetic I-Frame Sizes
C. Modeling P/B-Frame Sizes in Single-layer Traffic
1. Intra-GOP Correlation
2. Modeling P and B-Frame Sizes
D. Modeling the Enhancement Layer
1. Analysis of the Enhancement Layer
2. Modeling I-Frame Sizes
3. Modeling P and B-Frame Sizes
E. Model Accuracy Evaluation
1. Single-layer and the Base Layer Traffic
2. The Enhancement Layer Traffic
VI CONCLUSION AND FUTURE WORK
A. Conclusion
B. Future Work
1. Supplying Peers Cooperation System
2. Scalable Rate Control System
REFERENCES
VITA
Fig. 58. Histograms of $\{v(n)\}$ for $\{\phi^P_i(n)\}$ with $i = 1, 2, 3$ in (a) Star Wars IV and
(b) Jurassic Park I. Both sequences are coded at $Q = 14$. (Horizontal axis: bytes;
curves: v_1, v_2, v_3.)
To understand how to generate $\{\tilde{v}(n)\}$, we next examine the actual residual
process $v(n) = \phi^P_i(n) - a\tilde{\phi}^I(n)$ for each $i$. Fig. 58 shows the histograms of
$\{v(n)\}$ for P-frame sequences $i = 1, 2, 3$ in the single-layer Star Wars IV and
Jurassic Park I. The histograms show that the residual process $\{v(n)\}$ does not
change much as a function of $i$.

In Fig. 59 (a), we show the histograms of $\{v(n)\}$ for sequences coded at different
$Q$. The figure shows that the histogram becomes more Gaussian-like when $Q$
increases. Due to the diversity of the histogram of $\{v(n)\}$, we use a generalized
Gamma distribution $\mathrm{Gamma}(\gamma, \alpha, \beta)$ to estimate $\{v(n)\}$. Fig. 59 (b) shows that
the smaller the quantization step $Q$, the larger the value of parameter $a$ in (5.17),
which is helpful for further modeling sequences coded from the same video content
but at different quantization steps.
From Fig. 55 (b), we observe that the correlation between $\{\phi^B_i(n)\}$ and $\{\phi^I(n)\}$
could be as small as 0.1 (e.g., in Star Wars IV coded at $Q = 18$) or as large as
0.9 (e.g., in The Silence of the Lambs coded at $Q = 4$). Thus, we can generate
the synthetic B-frame traffic simply by an i.i.d. lognormal random number generator
when the correlation between $\{\phi^B_i(n)\}$ and $\{\phi^I(n)\}$ is small, or by a linear model
similar to (5.16) when the correlation is large. The linear model has the following
form:

$$\phi^B_i(n) = a\,\tilde{\phi}^I(n) + \tilde{v}_B(n), \qquad (5.23)$$

where $a = r(0)\sigma_B/\sigma_I$, $r(0)$ is the lag-0 correlation between $\{\phi^I(n)\}$ and
$\{\phi^B_i(n)\}$, and $\sigma_B$ and $\sigma_I$ are the standard deviations of $\{\phi^B_i(n)\}$ and $\{\phi^I(n)\}$,
respectively. Process $\tilde{v}_B(n)$ is independent of $\tilde{\phi}^I(n)$.

We illustrate the difference between our model and a typical i.i.d. method of prior
work (e.g., [68], [95]) in Fig. 60. The figure shows that our model indeed preserves
the intra-GOP correlation of the original traffic, while the previous methods produce
white (uncorrelated) noise. Statistical parameters $(r(0), \sigma_P, \sigma_I, \gamma, \alpha, \beta)$ needed for
this model are easily estimated from the original sequences.
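The linear model in (5.23) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, the residual is resampled from the empirical residuals as a stand-in for a fitted residual distribution, and the input sequences are assumed to be non-constant and of equal length.

```python
import math
import random


def lag0_correlation(x, y):
    """Sample correlation coefficient r(0) between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def std(x):
    """Population standard deviation."""
    m = sum(x) / len(x)
    return math.sqrt(sum((a - m) ** 2 for a in x) / len(x))


def fit_linear_coefficient(i_sizes, b_sizes):
    """a = r(0) * sigma_B / sigma_I, as defined below (5.23)."""
    return lag0_correlation(i_sizes, b_sizes) * std(b_sizes) / std(i_sizes)


def synthesize_b_frames(i_sizes, b_sizes, synthetic_i_sizes, rng):
    """Generate one synthetic B-frame size per synthetic I-frame size."""
    a = fit_linear_coefficient(i_sizes, b_sizes)
    # Assumption: resample empirical residuals instead of fitting a distribution.
    residuals = [b - a * i for i, b in zip(i_sizes, b_sizes)]
    return [a * s + rng.choice(residuals) for s in synthetic_i_sizes]
```

For example, `synthesize_b_frames(i_sizes, b_sizes, synthetic_i_sizes, random.Random(1))` returns one synthetic B-frame size for each entry of `synthetic_i_sizes`.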
D. Modeling the Enhancement Layer
In this section, we provide brief background on multi-layer video, investigate
methods to capture cross-layer dependency, and model the enhancement-layer
traffic.

Due to its flexibility and high bandwidth utilization, layered video coding is common
in video applications. Layered coding is often referred to as “scalable coding,”
which can be further classified as coarse-granular (e.g., spatial scalability) or
fine-granular (e.g., fine granular scalability (FGS)) [107]. The major difference between
coarse granularity and fine granularity is that the former provides quality improvements
only when a complete enhancement layer has been received, while the latter
continuously improves video quality with every additionally received codeword of the
enhancement layer bitstream.

Fig. 59. (a) Histograms of $\{v(n)\}$ for $\{\phi^P_1(n)\}$ in Jurassic Park I coded at
$Q = 4, 10, 14$ (horizontal axis: bytes). (b) Linear parameter $a$ for modeling
$\{\phi^P_i(n)\}$ in various sequences coded at different $Q$ (axes: quant. step vs.
correlation; curves: StarWars, Jurassic, Troopers, Silence, StarTrek).

In both coarse granular and fine granular coding methods, an enhancement layer
is coded with the residual between the original image and the reconstructed image
from the base layer. Therefore, the enhancement layer has a strong dependency on the
base layer. Zhao et al. [115] also indicate that there exists a cross-correlation between
the base layer and the enhancement layer; however, this correlation has not been
fully addressed in previous studies. In the next subsection, we investigate the
cross-correlation between the enhancement layer and the base layer using a spatially
scalable The Silence of the Lambs sequence and an FGS-coded Star Wars IV sequence as
examples. We only show the analysis of two-layer sequences for brevity; similar
results hold for video streams with more than two layers.

Fig. 60. (a) The correlation between $\{\phi^P_1(n)\}$ and $\{\phi^I(n)\}$ in Star Wars IV. (b) The
correlation between $\{\phi^B_1(n)\}$ and $\{\phi^I(n)\}$ in Jurassic Park I. (Axes: lag vs.
correlation; curves: actual, our model, i.i.d. methods.)
1. Analysis of the Enhancement Layer
Note that we do not consider temporally scalable coded sequences, in which the
base layer and the enhancement layer are approximately equivalent to extracting
I/P-frames and B-frames out of a single-layer sequence, respectively [87].

For convenience of discussion, we define the enhancement-layer frame sizes as
follows. Similar to the definition in the base layer, we define $\varepsilon^I(n)$ to be the I-frame
size of the $n$-th GOP, $\varepsilon^P_i(n)$ to be the size of the $i$-th P-frame in GOP $n$, and
$\varepsilon^B_i(n)$ to be the size of the $i$-th B-frame in GOP $n$.

Fig. 61. (a) The correlation between $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ in The Silence of the Lambs
coded at $Q = 4, 24, 30$. (b) The correlation between $\{\varepsilon^P_i(n)\}$ and $\{\phi^P_i(n)\}$ in
The Silence of the Lambs coded at $Q = 30$, for $i = 1, 2, 3$. (Axes: lag vs. correlation.)

Since each frame in the enhancement layer is predicted from the corresponding
frame in the base layer, we examine the cross-correlation between the enhancement
layer frame sizes and the corresponding base layer frame sizes in various sequences. In
Fig. 61 (a), we display the correlation between $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ in The Silence of
the Lambs coded at different $Q$. As observed from the figure, the correlation between
$\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ is stronger when the quantization step $Q$ is smaller. However, the
difference among these cross-correlation curves is not as obvious as that in intra-GOP
correlation. We also observe that the cross-correlation is still strong even at large lags,
which indicates that $\{\varepsilon^I(n)\}$ exhibits LRD properties and we should preserve these
properties in the synthetic enhancement layer I-frame sizes.
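The cross-correlation curves discussed above can be computed along the following lines. This is a minimal Python sketch, assuming the common biased estimator (normalizing by the full sequence length) and equal-length sequences; the text does not state which estimator was used, and the function name is hypothetical.

```python
import math


def cross_correlation(x, y, lag):
    """Normalized cross-correlation between x(n) and y(n + lag), lag >= 0.

    Assumption: biased estimator, i.e. the lagged covariance is divided by
    the full length n rather than by n - lag.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    cov = sum((x[i] - mx) * (y[i + lag] - my) for i in range(n - lag)) / n
    return cov / (sx * sy)
```

Calling `cross_correlation(x, x, lag)` gives the autocorrelation function (ACF) of a single sequence under the same estimator.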
In Fig. 61 (b), we show the cross-correlation between processes $\{\varepsilon^P_i(n)\}$ and
$\{\phi^P_i(n)\}$ for $i = 1, 2, 3$. The figure demonstrates that the correlation between the
enhancement layer and the base layer is quite strong, and the correlation structures
between each $\{\varepsilon^P_i(n)\}$ and $\{\phi^P_i(n)\}$ are very similar to each other. To avoid repetitive
description, we do not show the correlation between $\{\varepsilon^B_i(n)\}$ and $\{\phi^B_i(n)\}$, which is
similar to that between $\{\varepsilon^P_i(n)\}$ and $\{\phi^P_i(n)\}$.

Fig. 62. (a) The ACF of $\{\varepsilon^I(n)\}$ and that of $\{\phi^I(n)\}$ in Star Wars IV. (b) The ACF
of $\{\varepsilon^P_1(n)\}$ and that of $\{\phi^P_1(n)\}$ in The Silence of the Lambs. (Axes: lag vs.
correlation; curves: BL_I_cov and EL_I_cov in (a), BL_P_cov and EL_P_cov in (b).)

Aside from cross-correlation, we also examine the autocorrelation of each frame
sequence in the enhancement layer and that of the corresponding sequence in the base
layer. We show the ACF of $\{\varepsilon^I(n)\}$ and that of $\{\phi^I(n)\}$ (labeled as “EL_I_cov” and
“BL_I_cov”, respectively) in Fig. 62 (a), and display the ACF of $\{\varepsilon^P_1(n)\}$ and that of
$\{\phi^P_1(n)\}$ in Fig. 62 (b). The figure shows that although the ACF structure of $\{\varepsilon^I(n)\}$
has some oscillation, its trend closely follows that of $\{\phi^I(n)\}$. One also observes from
the figures that the ACF structures of processes $\{\varepsilon^P_i(n)\}$ and $\{\phi^P_i(n)\}$ are similar to
each other.
Fig. 63. The ACF of $\{A_3(\varepsilon)\}$ and $\{A_3(\phi)\}$ in The Silence of the Lambs coded at (a)
$Q = 30$ and (b) $Q = 4$. (Axes: lag vs. correlation; curves: ca_BL_cov, ca_EL_cov.)
2. Modeling I-Frame Sizes
Although cross-layer correlation is obvious in multi-layer traffic, previous work neither
considered it during modeling [9], nor explicitly addressed the issue of its modeling
[115]. In this section, we first describe how we model the enhancement layer I-frame
sizes and then evaluate the performance of our model in capturing the cross-layer
correlation.

Recalling that $\{\varepsilon^I(n)\}$ also possesses both SRD and LRD properties, we model it
in the wavelet domain as we modeled $\{\phi^I(n)\}$. We define $\{A_j(\varepsilon)\}$ and $\{A_j(\phi)\}$ to be
the approximation coefficients of $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ at the wavelet decomposition
level $j$, respectively. To better understand the relationship between $\{A_j(\varepsilon)\}$ and
$\{A_j(\phi)\}$, we show the ACF of $\{A_3(\varepsilon)\}$ and $\{A_3(\phi)\}$ using Haar wavelets (labeled as
“ca_EL_cov” and “ca_BL_cov”, respectively) in Fig. 63.
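As an illustration of the approximation coefficients $A_j(\cdot)$, the sketch below computes Haar approximation coefficients at level $j$ by repeatedly combining adjacent pairs with the orthonormal scaling $1/\sqrt{2}$. It is a sketch under assumptions: the function name is hypothetical, and an unpaired trailing sample is simply dropped, whereas wavelet libraries typically pad the signal instead.

```python
import math


def haar_approximation(signal, level):
    """Haar approximation coefficients of a 1-D signal at the given level.

    Each level replaces adjacent pairs (a, b) by (a + b) / sqrt(2).
    Assumption: an unpaired trailing sample is dropped at each level.
    """
    coeffs = list(signal)
    for _ in range(level):
        coeffs = [(coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
                  for i in range(0, len(coeffs) - 1, 2)]
    return coeffs
```

With a sequence of I-frame sizes as input, `haar_approximation(sizes, 3)` yields a level-3 sequence whose ACF could then be compared across layers.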
Fig. 64. The cross-correlation between $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ in The Silence of the
Lambs and that in the synthetic traffic generated from (a) our model and (b)
model [115]. (Axes: lag vs. cross-correlation.)
As shown in Fig. 63, $\{A_j(\varepsilon)\}$ and $\{A_j(\phi)\}$ exhibit similar ACF structure. Thus,
we generate $\{A_J(\varepsilon)\}$ by borrowing the ACF structure of $\{A_J(\phi)\}$, which is known
from our base-layer model. Using the ACF of $\{A_J(\phi)\}$ in modeling $\{\varepsilon^I(n)\}$ not only
saves computational cost, but also preserves the cross-layer correlation. In Fig. 64, we
compare the actual cross-correlation between $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ to that between
the synthetic $\{\varepsilon^I(n)\}$ and $\{\phi^I(n)\}$ generated from our model and Zhao’s model [115].
The figure shows that our model significantly outperforms Zhao’s model in preserving
the cross-layer correlation.
3. Modeling P and B-Frame Sizes
Recall that the cross-correlation between $\{\varepsilon^P_i(n)\}$ and $\{\phi^P_i(n)\}$ and that between
$\{\varepsilon^B_i(n)\}$ and $\{\phi^B_i(n)\}$ are also strong, as shown in Fig. 61. We use the linear model
from Section 2 to estimate the sizes of the $i$-th P and B-frames in the $n$-th GOP:

$$\varepsilon^P_i(n) = a\,\phi^P_i(n) + \tilde{w}_1(n), \qquad (5.24)$$

$$\varepsilon^B_i(n) = a\,\phi^B_i(n) + \tilde{w}_2(n), \qquad (5.25)$$

where $a = r(0)\sigma_\varepsilon/\sigma_\phi$, $r(0)$ is the lag-0 cross-correlation coefficient, $\sigma_\varepsilon$ is the standard
deviation of the enhancement-layer sequence, and $\sigma_\phi$ is the standard deviation of
the corresponding base-layer sequence. Processes $\{\tilde{w}_1(n)\}$ and $\{\tilde{w}_2(n)\}$ are independent
of $\{\phi^P_i(n)\}$ and $\{\phi^B_i(n)\}$. We examine $\{w_1(n)\}$ and $\{w_2(n)\}$ and find they exhibit
similar properties. We show two examples of $\{w_1(n)\}$ in Fig. 65.

Fig. 65. Histograms of $\{w_1(n)\}$ in (a) Star Wars IV and (b) The Silence of the
Lambs ($Q = 24$), with $i = 1, 2, 3$. (Horizontal axis: bytes; curves: w_P1, w_P2, w_P3.)

Fig. 66. Histograms of $\{w_1(n)\}$ and $\{\tilde{w}_1(n)\}$ for $\{\varepsilon^P_1(n)\}$ in (a) Star Wars IV and
(b) The Silence of the Lambs ($Q = 30$). (Horizontal axis: bytes; curves: actual, estimate.)

As observed from Fig. 65, the histogram of $\{w_1(n)\}$ is asymmetric and decays
fast on both sides. Therefore, we use two exponential distributions to estimate its
PDF. We first left-shift $\{w_1(n)\}$ by an offset $\delta$ to make the mode (i.e., the peak)
appear at zero. We then model the right side using one exponential distribution $\exp(\lambda_1)$
and the absolute value of the left side using another exponential distribution $\exp(\lambda_2)$.
Afterwards, we generate synthetic data $\{\tilde{w}_1(n)\}$ based on these two exponential
distributions and right-shift the result by $\delta$. As shown in Fig. 66, the histograms of
$\{\tilde{w}_1(n)\}$ are close to those of the actual data in both Star Wars IV and The Silence
of the Lambs. We generate $\{\tilde{w}_2(n)\}$ in the same way and find its histogram is also
close to that of $\{w_2(n)\}$.
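The shift, fit, and generate procedure described above can be sketched as follows. This is a minimal Python illustration under assumptions that the text does not spell out: the mode offset `delta` is supplied by the caller, each rate is estimated as the reciprocal of the sample mean on its side, and the side of each synthetic sample is chosen with the empirical fraction of residuals to the right of the mode. All function names are hypothetical.

```python
import random


def fit_two_sided_exponential(residuals, delta):
    """Fit exp(lambda_1) to the right side and exp(lambda_2) to the left side.

    Assumes both sides are non-empty and contain non-zero values after shifting.
    """
    shifted = [w - delta for w in residuals]      # left-shift so the mode is at zero
    right = [s for s in shifted if s >= 0]
    left = [-s for s in shifted if s < 0]         # absolute values of the left side
    lam_right = len(right) / sum(right)           # rate = 1 / sample mean
    lam_left = len(left) / sum(left)
    p_right = len(right) / len(shifted)           # assumption: empirical side weight
    return lam_right, lam_left, p_right


def sample_residual(lam_right, lam_left, p_right, delta, rng):
    """Draw one synthetic residual and right-shift it back by delta."""
    if rng.random() < p_right:
        return delta + rng.expovariate(lam_right)
    return delta - rng.expovariate(lam_left)
```

A synthetic residual sequence of length `n` is then `[sample_residual(l1, l2, p, delta, rng) for _ in range(n)]` with `rng = random.Random(seed)`.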
E. Model Accuracy Evaluation
As we stated earlier, a good traffic model should capture the statistical properties of
the original traffic and be able to accurately predict network performance. Three
tests are commonly used to verify the accuracy of a video traffic model [95]:
quantile-quantile (QQ) plots, the variance of traffic during various time intervals, and buffer
overflow loss evaluation. While the first two measures visually evaluate how well the
distribution of the synthetic traffic matches that of the original one, the overflow
loss simulation examines how effectively a traffic model captures the temporal
burstiness of the original traffic.

Fig. 67. QQ plots for the synthetic (a) single-layer Star Wars IV traffic and (b) The
Silence of the Lambs base-layer traffic. (Axes: original frame size vs. synthetic
frame size.)
The QQ plot is a graphical technique to verify the distribution similarity between
two test data sets. If the two data sets have the same distribution, the points should
fall along the 45-degree reference line. The greater the departure from this reference
line, the greater the difference between the two test data sets.
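The points of a QQ plot can be obtained by pairing matching quantiles of the two data sets. The sketch below is a minimal Python illustration; the function names, the linear-interpolation quantile rule, and the `max_departure` summary are assumptions made for this example rather than part of the evaluation described in the text.

```python
def quantile(sorted_data, q):
    """Linear-interpolation quantile of pre-sorted data, with 0 <= q <= 1."""
    pos = q * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    frac = pos - lo
    return sorted_data[lo] * (1 - frac) + sorted_data[hi] * frac


def qq_points(original, synthetic, num_quantiles=100):
    """Pairs (original quantile, synthetic quantile) to plot against the 45-degree line."""
    a, b = sorted(original), sorted(synthetic)
    qs = [k / (num_quantiles - 1) for k in range(num_quantiles)]
    return [(quantile(a, q), quantile(b, q)) for q in qs]


def max_departure(points):
    """Largest vertical distance of any QQ point from the line y = x."""
    return max(abs(x - y) for x, y in points)
```

If the two samples share the same distribution, the returned points lie close to the line y = x and `max_departure` is small relative to the frame sizes.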
Unlike the QQ plot, the variance of traffic during various time intervals shows
whether the second-order moment of the synthetic traffic fits that of the original
one. This second-order descriptor is used to capture burstiness properties of arrival
processes [9]. This measure operates as follows. Assume that the length of a video
sequence is $l$ and there are $m$ frames in a given time interval. We segment the
one-dimensional data into an $m \times n$ matrix, where $n = l/m$. After summing all the
data in each column, we obtain a sequence of length $n$ and then calculate its variance.
Thus, we can obtain a set of variances given a set of time intervals.
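The variance measure described above can be sketched as follows. This minimal Python example assumes the population variance (division by $n$) and discards trailing frames that do not fill a complete interval; the function name is hypothetical.

```python
def aggregated_variance(frame_sizes, m):
    """Variance of the traffic volume summed over consecutive blocks of m frames.

    Equivalent to reshaping the sequence into an m x n matrix (n = l // m),
    summing each column, and taking the variance of the n column sums.
    """
    n = len(frame_sizes) // m
    sums = [sum(frame_sizes[k * m:(k + 1) * m]) for k in range(n)]
    mean = sum(sums) / n
    return sum((s - mean) ** 2 for s in sums) / n
```

Evaluating `aggregated_variance(sizes, m)` for several values of `m` gives one variance per time interval, which can be plotted for the original and the synthetic traffic.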
Besides the distribution, we also examine how well our approach preserves the
temporal information of the original traffic. A common test for this is to pass the
synthetic traffic through a generic router buffer with capacity $c$ and drain rate $d$ [95].
The drain rate is the number of bytes drained per second and is simulated as different
multiples of the average traffic rate $\bar{r}$.
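A buffer test of this kind can be sketched with a simple fluid-style simulation. The example below is a minimal Python illustration under assumptions: the capacity is given in bytes and the drain in bytes per frame interval (the conversion from a capacity stated in milliseconds is not shown here), the buffer is drained before each arrival, and bytes that do not fit are counted as lost. The function name is hypothetical.

```python
def overflow_loss_ratio(frame_sizes, capacity_bytes, drain_bytes_per_frame):
    """Fraction of bytes lost to overflow when frames pass through a finite buffer."""
    backlog = 0.0
    lost = 0.0
    for size in frame_sizes:
        # Drain the buffer for one frame interval, then admit what fits.
        backlog = max(0.0, backlog - drain_bytes_per_frame)
        admitted = min(size, capacity_bytes - backlog)
        lost += size - admitted
        backlog += admitted
    return lost / sum(frame_sizes)
```

To mimic a drain rate of, say, twice the average traffic rate, one could pass `drain_bytes_per_frame = 2 * sum(frame_sizes) / len(frame_sizes)` and compare the resulting loss ratios for the original and synthetic sequences.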
In the following two sections, we evaluate the accuracy of our model in both
single-layer and multi-layer traffic using the above three measures. We should note
that simulations with additional video sequences have demonstrated results similar
to those shown throughout this section.
Fig. 68. Comparison of variance between synthetic and original traffic in (a)
single-layer Star Wars IV and (b) The Silence of the Lambs base layer. (Axes: time
interval (s) vs. bytes; curves: actual, our model, GBAR, Gamma_B, Nested_AR.)
1. Single-layer and the Base Layer Traffic
We first show QQ plots of the synthetic single-layer Star Wars IV and the synthetic
base layer of The Silence of the Lambs that are generated by our model in Fig. 67
(a) and (b), respectively. As shown in the figure, the generated frame sizes and the
original traffic are almost identical.

In Fig. 68, we give a comparison between the variance of the original traffic and
that of the synthetic traffic generated from different models at various time intervals.
The figure shows that the second-order moment of our synthetic traffic is in good
agreement with that of the original one.

We also compare the accuracy of several models using a leaky-bucket simulation.
To understand the performance differences between various models, we define the
relative error $e$ as the normalized difference between the actual packet loss $p$ observed
in the buffer fed with the original traffic and the loss $p_{\text{model}}$ observed using the
synthetic traffic generated by each of the models:

$$e = \frac{|p - p_{\text{model}}|}{p}. \qquad (5.26)$$

In Table V, we illustrate the values of $e$ for various buffer capacities and drain
rates $d$. As shown in the table, the synthetic traffic generated by our model provides
a very accurate estimate of the actual data loss probability $p$ and significantly
outperforms the other methods. In addition, our synthetic traffic is approximately
30% more accurate than the i.i.d. models of prior work in estimating the loss ratio of
P-frames.

Table V. Relative Data Loss Error e in Star Wars IV.

Buffer     Traffic type       Drain rate
capacity                      $2\bar{r}$    $4\bar{r}$    $5\bar{r}$
10 ms      Our Model          1.80%         0.93%         0.50%
           GOP-GBAR [31]      2.44%         2.51%         4.01%
           Nested AR [68]     4.02%         2.05%         5.63%
           Gamma A [95]       5.54%         1.04%         0.99%
           Gamma B [95]       5.76%         1.81%         1.15%
20 ms      Our Model          0.93%         0.61%         1.13%
           GOP-GBAR [31]      3.84%         2.16%         3.77%
           Nested AR [68]     5.81%         2.77%         8.46%
           Gamma A [95]       5.20%         0.61%         2.57%
           Gamma B [95]       4.89%         1.93%         2.05%
30 ms      Our Model          0.25%         0.33%         0.95%
           GOP-GBAR [31]      4.94%         3.33%         5.68%
           Nested AR [68]     6.94%         4.14%         9.92%
           Gamma A [95]       4.88%         1.10%         4.48%
           Gamma B [95]       4.67%         2.17%         4.03%

In Fig. 69, we show the relative error $e$ of synthetic traffic generated from different
models in H.26L Starship Troopers coded at $Q = 1, 31$, given $d = \bar{r}$. Since the
GOP-GBAR model [31] is specifically developed for MPEG traffic, we do not apply it to
H.26L sequences. The figure shows that our model outperforms the other three models
in Starship Troopers coded at small $Q$ and performs as well as model Gamma A
[95] in the large $Q$ case (the relative error $e$ of both models is less than 1% in Fig. 69
(b)).
2. The Enhancement Layer Traffic
We evaluate the accuracy of the synthetic enhancement layer by using QQ plots and
show two examples in Fig. 70, which displays QQ plots for the synthetic The
Silence of the Lambs and Star Wars IV enhancement-layer traffic. The figure shows
that the synthetic frame sizes in both sequences have the same distribution as those
in the original traffic.

We also compare the variance of the original traffic and that of the synthetic
traffic in Fig. 71. Due to the computational complexity of model [115] in calculating
long sequences, we only take the first 5000 frames of Star Wars IV and The Silence
of the Lambs. As observed from the figure, our model preserves the second-order
moment of the original traffic well.

Fig. 69. Given $d = \bar{r}$, the error $e$ of various synthetic traffic in H.26L Starship Troopers
coded at (a) $Q = 1$ and (b) $Q = 31$. (Axes: buffer capacity (ms) vs. relative error;
curves: our model, Nested AR, Gamma_A, Gamma_B.)
We next examine the data loss ratio predicted by our synthetic traffic passed
through a generic buffer as shown in the previous section. Recall that the model in [9]
is only applicable to sequences with a CBR base layer and the one in [115] is suitable
only for short sequences. Therefore, we are not able to show results using leaky-bucket
simulations for these multi-layer models given the nature of our sample sequences. In
Fig. 72 and Fig. 73, we show the overflow data loss ratio of the enhancement layers in
both The Silence of the Lambs (54,000 frames) and Star Wars IV (108,000 frames)
with different drain rates $d$ for buffer capacity $c = 10$ ms and $c = 30$ ms, respectively.
The x-axis in each figure represents the ratio of the drain rate to the average traffic
rate $\bar{r}$. The figures show that the synthetic enhancement layer preserves the temporal
information of the original traffic very well.
Fig. 70. QQ plots for the synthetic enhancement-layer traffic: (a) Star Wars IV and
(b) The Silence of the Lambs. (Axes: original frame size vs. synthetic frame size.)

Fig. 71. Comparison of variance between the synthetic and original enhancement layer
traffic in (a) Star Wars IV and (b) The Silence of the Lambs. (Axes: time interval (s)
vs. bytes; curves: actual, our model, Zhao et al.)

Fig. 72. Overflow data loss ratio of the original and synthetic enhancement layer traffic
for $c = 10$ ms for (a) The Silence of the Lambs and (b) Star Wars IV. (Axes: drain
rate vs. data loss ratio; curves: actual, synthetic.)

Fig. 73. Overflow data loss ratio of the original and synthetic enhancement layer traffic
for $c = 30$ ms for (a) The Silence of the Lambs and (b) Star Wars IV. (Axes: drain
rate vs. data loss ratio; curves: actual, synthetic.)
CHAPTER VI
CONCLUSION AND FUTURE WORK
The ideas presented in this document have been expressed in terms of an R-D modeling
framework and a traffic model for scalable video coders, with the final goal of
providing high quality video to end users. In this chapter, we summarize our major
contributions and indicate some directions for future work.
A. Conclusion
Rate-distortion analysis has attracted great research interest since Shannon’s work
was published [97]. Previous work has focused to a large extent on deriving
ideal bounds, which give insight into achievable and non-achievable
regions but are not directly applicable in practice. Instead, one goal of this work is
to provide a practically useful R-D function for scalable coders.
In Chapter III, we first modeled the statistical properties of the input to scalable
coders and then presented a detailed analysis of rate and distortion for scalable coders.
We also reviewed the performance bound for a generic hybrid coder using motion-
compensated prediction. Based on the understanding of scalable coding processes
and approximation theory, we derived a distortion model and an operational R-D
model. Although this R-D model is accurate, its complex format limits its usage in
video streaming applications.
Therefore, we proposed another operational R-D model for streaming applications.
We expressed it in the PSNR domain for the convenience of quality control.
Interestingly, we found that in the PSNR domain, both our R-D model and the
theoretical upper bound in [81] have a similar concave shape in the working range of
scalable coders, which also matches the trend of actual R-PSNR curves.
Fig. 74. R-D based quality control. (Diagram labels: Sender, Router, Receiver, R-D
model, congestion control, compressed video, constant quality video, packet loss
$p(t)$, and the sending-rate decision $r(t) = f(r(t-1), p(t-1))$.)
In view of the inherent lack of stable quality associated with the base layer, we
provided a quality control algorithm to provide constant quality video to end users in
both CBR and VBR channels. In CBR channels, the algorithm proposed in Chapter
IV performs better than most existing constant quality algorithms with regard to both
computational cost and performance. Furthermore, we studied a modified Kelly control
and showed that it can provide a stable environment for video transmission. Thus,
we coupled our R-D model with this controller to achieve constant quality even under
varying network conditions. The work in Chapters III and IV is depicted
in Fig. 74.
In Chapter V, we presented a framework for modeling H.26L and MPEG-4 multi-layer
full-length VBR video traffic. This work precisely captured the inter- and intra-GOP
correlation in compressed VBR sequences by incorporating wavelet-domain
analysis into time-domain modeling. Whereas many previous traffic models are
developed at the slice level or even the block level [95], our framework operates at the
frame-size level, which allows us to examine the loss ratio for each frame type and apply
other methods to improve the video quality at the receiver. We also proposed novel
methods to model cross-layer correlation in multi-layer sequences and successfully
described the inter-layer correlation.
B. Future Work
In future work, we are interested in designing peer-to-peer streaming systems, where
scalable video coders will play an important role and our traffic model will be helpful
in their design.
A peer-to-peer streaming system differs from a general peer-to-peer system in
three aspects: (1) Peer-to-peer video streaming uses streaming mode and has high
user requirements on video quality; (2) In a peer-to-peer video streaming system, a
requesting peer can also play the role of a supplying peer as long as a certain amount
of media data has been stored; (3) A requesting peer in a peer-to-peer streaming
system can receive video data from multiple supplying peers simultaneously, while a
requesting peer in a general peer-to-peer system usually only has one supplying peer
at one time instant.
There are two challenges in designing a peer-to-peer streaming system. One
is to coordinate multiple supplying peers with high bandwidth utilization, and the
other is to ensure continuous playback with graceful quality adaptation. To address
these two issues, we plan to design a scalable peer-to-peer video streaming system.
Although a fine-granular scalable coded bitstream is preferred, general layered coded
bitstreams are also applicable.
In the proposed scheme, we will abide by a differentiated admission policy, which
means that if a supplying peer has enough resources to provide service to several
requesting peers, we admit the requesting peer with the highest outgoing bandwidth.
Intuitively, this policy has two benefits: (1) It will quickly increase the system
capacity. If a requesting peer with the highest outgoing bandwidth has been admitted,
it will later become another supplying peer and be able to contribute more
to the system than those peers with less outgoing bandwidth; (2) It will encourage
the requesting peers to offer more outgoing bandwidth.
In what follows, we discuss how the supplying peers cooperate in this scheme.
1. Supplying Peers Cooperation System
Assume that for each requesting peer $P_r$, there is a supplying peer set $P_s$, which
includes $M$ supplying peers $P_s^1, P_s^2, \ldots, P_s^M$ at time $t$, and these supplying peers are
selected via existing peer-to-peer lookup mechanisms (e.g., [101]). We also denote the
incoming bandwidth of $P_r$ by $I_r$ and the outgoing bandwidth of $P_r$ by $O_r$.
It is obvious that if a supplying peer $P_s^i$ has the higher layers of the data stream,
it must also have the lower layers. Since the base layer bandwidth is guaranteed, we
know that the outgoing bandwidth $O_r$ is always larger than or equal to the base layer
bandwidth $W_b$. We describe the cooperation scheme as follows:
• To maximize the outgoing bandwidth of supplying peers, we select the first
supplying peer as the lower layer supplying peer. Each packet is labeled with a
layer number and a packet number.
• After transmitting the base layer (which is CBR coded in FGS coders), the
incoming bandwidth of requesting peer $P_r$ is updated to $I_r - W_b$. Although
supplying peer $P_s^M$ has the highest outgoing bandwidth, its sending rate might
be slow due to various reasons (e.g., requests from other peers). If the enhancement
layer can be finely divided, the requesting peer will be able to allocate
different portions of the enhancement layer to different supplying peers to achieve
fast transmission and better video quality.
• If a supplying peer $P_s^i$ fails, the buffer at the requesting peer side will allow
a quick supplying-peer switch without quick quality degradation. If no other
supplying peers can take over the data that $P_s^i$ used to transmit, the sending
portion of other supplying peers will be adjusted and the video quality at the
receiver might be degraded.
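One way to picture the allocation of enhancement-layer portions is sketched below in Python. The proportional-to-rate rule, the function name, and the peer identifiers are all assumptions made purely for illustration; the text only states that different portions can be assigned to different supplying peers and does not specify an allocation rule.

```python
def allocate_enhancement_layer(total_bytes, peer_rates):
    """Split an enhancement-layer segment among supplying peers.

    Assumption: shares are proportional to each peer's current sending rate,
    and any rounding remainder goes to the fastest peer.
    """
    total_rate = sum(peer_rates.values())
    shares = {peer: int(total_bytes * rate / total_rate)
              for peer, rate in peer_rates.items()}
    fastest = max(peer_rates, key=peer_rates.get)
    shares[fastest] += total_bytes - sum(shares.values())
    return shares
```

If a supplying peer fails, calling the function again without that peer redistributes its portion among the remaining peers, which loosely mirrors the adjustment described in the last bullet.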
In addition, a quality control scheme is often in demand for continuous playback.
2. Scalable Rate Control System
Since the current best-effort Internet does not provide any QoS guarantees to video
applications, end users often suffer from quality fluctuations and playout starvation
(i.e., receiver-buffer underflow). While the former mainly results from varying band-
width, the latter happens when the receiver buffer is empty and the playout rate is
faster than the incoming frame rate. Many studies have been conducted to provide
good video quality to end users. Steinbach et al. [100] propose a client-controlled
method to flexibly scale the playout rate to prevent playout starvation. However, end
users often prefer a constant playout rate.
Thus, as an alternative, adaptive rate control mechanisms are proposed to adjust
the sending rate according to the available bandwidth and the feedback from receiver
buffers [69], [88], [94]. The fundamental idea of these mechanisms is to dynamically
allocate bandwidth. When the total bandwidth of all available supplying peers is
insufficient to support the bitstream requested by a requesting peer $P_r$, $P_r$ can
either request more frames covering fewer layers or fewer frames covering
more layers. The switch threshold TH is decided by the buffer condition, the playout
rate, and the available incoming bandwidth $I_r$.
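The text gives no formula for TH, so the following is only a minimal sketch of how such a switch might look. It assumes that TH is expressed as a buffered playout time in seconds and is supplied by the caller, and that a buffer below TH favours more frames with fewer layers to avoid starvation. Both the rule and the name `choose_request` are assumptions made for illustration.

```python
def choose_request(buffered_frames: int, playout_fps: float,
                   threshold_s: float) -> str:
    """Pick a request mode when the supplying peers' bandwidth is insufficient.

    Assumed rule (not from the source): compare the playout time held in the
    receiver buffer with the threshold TH given in seconds.
    """
    if playout_fps <= 0:
        raise ValueError("playout rate must be positive")
    buffered_time = buffered_frames / playout_fps  # seconds of video in buffer
    if buffered_time < threshold_s:
        # Buffer is close to underflow: favour frame continuity over quality.
        return "more_frames_fewer_layers"
    # Buffer is comfortable: spend the bandwidth on more layers per frame.
    return "fewer_frames_more_layers"
```

For example, with a playout rate of 30 frames/s and TH = 2 s, a buffer of 30 frames selects the first mode and a buffer of 90 frames selects the second.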
REFERENCES
[1] P. Abry and D. Veitch, “Wavelet analysis of long-range-dependent traffic,”
IEEE Trans. Inform. Theory, vol. 44, Jan. 1998.
[2] J. G. Apostolopoulos and S. J. Wee, “Video compression standards”, available
at Apr. 2002.
[3] D. Bansal and H. Balakrishnan, “Binomial congestion control algorithms,” in
Proc. IEEE INFOCOM, Anchorage, Alaska, Apr. 2001, pp. 631–640.
[4] W. R. Bennett, “Spectra of quantized signals,” Bell Sys. Tech. Journal, vol. 27,
pp. 446–472, July 1948.
[5] T. Berger, Rate Distortion Theory, Englewood Cliffs, NJ: Prentice-Hall, 1971.
[6] J. A. Bilmes, “A gentle tutorial of the EM algorithm and its application to
parameter estimation for Gaussian mixture and hidden Markov models,”
International Computer Science Institute, Berkeley, California, Apr. 1998.
[7] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, “An archi-
tecture for differentiated services,” IETF RFC 2475, 1998.
[8] R. Braden, D. Clark, and S. Shenker, “Integrated services in the internet archi-
tecture: An overview,” IETF RFC 1633, 1994.
[9] K. Chandra and A. R. Reibman, “Modeling one- and two-layer variable bit rate
video,” IEEE/ACM Trans. on Networking, vol. 7, pp. 398–413, June 1999.
[10] D. Clark and W. Fang, “Explicit allocation of best effort packet delivery ser-
vice,” IEEE/ACM Trans. on Networking, vol. 6, pp. 362–373, Aug. 1998.
[11] J.-J. Chen and D. W. Lin, “Optimal bit allocation for coding of video signals
over ATM networks,” IEEE J. on Sel. Areas in Comm., vol. 15, pp. 1002–1015,
Aug. 1997.
[12] T. P.-C. Chen and T. Chen, “Markov modulated punctured auto-regressive
processes for video traffic and wireless channel modeling,” in Packet Video,
Apr. 2002.
[13] T. Chiang and Y. Q. Zhang, “A new rate control scheme using quadratic dis-
tortion model,” IEEE Trans. on CSVT, vol. 7, pp. 246–250, Feb. 1997.
[14] A. Cohen and J.-P. D’ales, “Nonlinear approximation of random functions,”
SIAM Journal on Appl. Math, vol.57, pp. 518–540, Apr. 1997.
[15] A. Cohen, I. Daubechies, O. G. Guleryuz, and M.T. Orchard, “On the impor-
tance of combining wavelet-based nonlinear approximation with coding strate-
gies,” IEEE Trans. on Information Theory, vol. 48, pp. 1895 - 1921, July 2002.
[16] A. L. Corte, A. Lombardo, S. Palazzo, and S. Zinna, “Modeling activity in VBR
video sources,” Signal Processing: Image Communication, vol. 3, pp. 167–178,
June 1991.
[17] T. M. Cover and J. A. Thomas, Elements of Information Theory, New York:
John Wiley, 1991.
[18] M. Dai, D. Loguinov, and H. Radha, “Statistical analysis and distortion mod-
eling of MPEG-4 FGS,” in Proc. IEEE ICIP, Barcelona, Spain, Sept. 2003, pp.
301–304.
[19] A. Demers, S. Keshav, and S. Shenker, “Analysis and simulation of a fair queu-
ing algorithm,” in ACM SIGCOMM, vol.1, pp.3–26, 1990.
[20] R. A. Devore, B. Jawerth, and B. J. Lucier, “Image compression through wavelet
transform coding,” IEEE Trans. on Information Theory, vol. 38, pp. 719 - 746,
Mar. 1992.
[21] R. A. Devore, “Nonlinear approximation,” in Acta Numerica, New York:
Cambridge Univ. Press, Cambridge, 1998.
[22] W. Ding and B. Liu, “Rate control of MPEG video coding and recording by
rate-quantization modeling,” IEEE Trans. on CSVT, vol.6, pp. 12–20, Feb.
1996.
[23] P. Embrechts, F. Lindskog, and A. McNeil, “Correlation and dependence in
risk management: Properties and pitfalls,” available at
Aug. 1999.
[24] A. Erramilli, O. Narayan, and W. Willinger, “Experimental queueing analysis
with long-range dependent packet traffic,” IEEE/ACM Trans. Networking, vol.
4, pp. 209–223, Apr. 1996.
[25] T. Eude, R. Grisel, H. Cherifi, and R. Debrie, “On the distribution of the DCT
coefficients,” in Proc. IEEE Conf. Acoustics, Speech, Signal Processing, vol. 5,
Apr. 1994, pp. 365–368.
[26] V. Firoiu and M. Borden, “A study of active queue management for congestion
control,” in Proc. IEEE INFOCOM, Tel-Aviv, Israel, Mar. 2000, pp. 1435–1444.
[27] F. H. P. Fitzek and M. Reisslein, “MPEG-4 and H.263 video traces for network
performance evaluation (extended version),” available at
berlin.de, Oct. 2000.
[28] S. Floyd and V. Jacobson, “Random Early Detection Gateways for Congestion
Avoidance,” IEEE/ACM Trans. on Networking, vol. 1, pp. 397–413, Aug. 1993.
[29] S. Floyd, “TCP and explicit congestion notification,” ACM Computer Commu-
nication Review, vol. 24, pp. 8–23, Oct. 1994.
[30] S. Floyd, M. Handley, and J. Padhye, “Equation-based congestion control for
unicast applications,” in Proc. ACM SIGCOMM, Stockholm, Sweden, Sept.
2000, pp. 43–56.
[31] M. Frey and S. Nguyen-Quang, “A Gamma-based framework for modeling
variable-rate MPEG video sources: The GOP GBAR model,” IEEE/ACM
Trans. on Networking, vol. 8, pp. 710–719, Dec. 2000.
[32] M. W. Garrett and W. Willinger, “Analysis, modeling and generation of self-
similar VBR video traffic,” in Proc. ACM SIGCOMM, London, UK, Aug. 1994,
pp. 269–280.
[33] A. Gersho, “Asymptotically optimal block quantization,” IEEE Trans. on In-
formation Theory, vol. 25, pp. 373–380, July 1979.
[34] A. Gersho and R. Gray, Vector Quantization and Signal Compression, Boston,
MA: Kluwer Academic Publishers, 1992.
[35] B. Girod, “The efficiency of motion-compensating prediction for hybrid coding
of video sequences,” IEEE Journal on Selected Areas in Communications, vol.
SAC-5, pp. 1140–1154, Aug. 1987.
[36] H. Gish and J. N. Pierce, “Asymptotically efficient quantizing,” IEEE Trans.
on Information Theory, vol. IT-14, pp. 676–683, Sept. 1968.
[37] R. M. Gray, Source Coding Theory, Boston, MA: Kluwer Academic Publishers,
1990.
[38] H.-M. Hang and J.-J. Chen, “Source model for transform video coder and its ap-
plication —Part I: fundamental theory,” IEEE Trans. on CSVT, vol. 7, pp. 287–
298, Apr. 1997.
[39] B. G. Haskell, A. Puri, A. N. Netravali, Digital Video: An Introduction to
MPEG-2, Boston, MA: Kluwer Academic Publishers, 2002.
[40] Z. He and S. K. Mitra, “A unified rate-distortion analysis framework for trans-
form coding,” IEEE Trans. on CSVT, vol. 11, pp. 1221–1236, Dec. 2001.
[41] D. P. Heyman, A. Tabatabai, T. V. Lakshman, “Statistical analysis and simu-
lation study of video teleconference traffic in ATM networks,” IEEE Trans. on
CSVT, vol. 2, pp. 49–59, Mar. 1992.
[42] D. P. Heyman, “The GBAR source model for VBR video conferences,”
IEEE/ACM Trans. on Networking, vol. 5, pp. 554–560, Aug. 1997.
[43] C.-Y. Hsu, A. Ortega, and A. Reibman, “Joint selection of source and channel
rate for VBR video transmission under ATM policing constraints,” IEEE Jour-
nal on Selected Areas in Communication, vol. 15, pp. 1016–1028, Aug. 1997.
[44] C. Huang, M. Devetsikiotis, I. Lambadaris, and A. R. Kaye, “Modeling and sim-
ulation of self-similar variable bit rate compressed video: A unified approach,”
in Proc. ACM SIGCOMM, Cambridge, MA, Aug. 1995, pp. 114–125.
[45] D. Huffman, “A method for the construction of minimal redundancy codes,” in
Proc. IRE, Sept. 1952, pp. 1098–1101.
[46] H. E. Hurst, “Long-term storage capacity of reservoirs,” Trans. on American
Society of Civil Engineers, vol. 116, pp. 770-799, 1951.
[47] T. Y. Hwang and P. H. Huang, “On new moment estimation of parameters of
the Gamma distribution using its characterization,” Annals of the Institute of
Statistical Mathematics, vol. 54, Issue 4, 2002.
[48] ISO/IEC JTC1, “Coding of moving pictures and associated audio for digital
storage media at up to about 1.5 Mb/s, Part 2: video,” ISO/IEC 11172-2
(MPEG-1), 1993.
[49] ISO/IEC JTC1, “Information technology-coding of audio-visual objects- Part2:
video,” ISO/IEC 14496-2 (MPEG-4), 1999.
[50] ITU-T, “Codec for videoconferencing using primary digital group transmis-
sion,” ITU-T Recommendation H.120; version 1, 1984; version 2, 1988.
[51] ITU-T, “Video codec for audiovisual services at p × 64 kbits/s,” ITU-T
Recommendation H.261; version 1, 1990; version 2, 1993.
[52] ITU-T and ISO/IEC JTC1, “Generic coding of moving pictures and associated
audio information, Part 2: video,” ISO/IEC 13818-2 (MPEG-2), 1994.
[53] ITU-T, “Video coding for low bitrate communication,” ITU-T Recommendation
H.263; version 1, 1995; version 2, 1998.
[54] N. Jayant and P. Noll, Digital Coding of Waveforms, Englewood Cliffs, NJ:
Prentice Hall, 1984.
[55] JPEG, “JPEG2000 part I final committee draft version 1.0,” ISO/IEC
JTC1/SC29 WG1, Mar. 2000.
[56] S.-R. Kang, Y. Zhang, M. Dai, and D. Loguinov, “Multi-layer active queue man-
agement and congestion control for scalable video streaming,” in Proc. IEEE
ICDCS, Tokyo, Japan, Mar. 2004, pp. 768–777.
[57] K. Kar, S. Sarkar, and L. Tassiulas, “Simple rate control algorithm for max
total user utility,” in Proc. IEEE INFOCOM, Anchorage, Alaska, Apr. 2001,
pp. 133–141.
[58] F. P. Kelly, A. Maulloo, and D. Tan, “Rate control in communication networks:
Shadow prices, proportional fairness and stability,” Journal of the Operational
Research Society, vol. 49, pp. 237–252, 1998.
[59] M. Krunz and S. K. Tripathi, “On the characterization of VBR MPEG streams,”
in Proc. of ACM SIGMETRICS, Seattle, WA, June 1997, pp. 192–202.
[60] S. Kunniyur and R. Srikant, “End-to-end congestion control schemes: Utility
functions, random losses and ECN marks,” in Proc. IEEE INFOCOM, Tel-Aviv,
Israel, Mar. 2000, pp. 1323–1332.
[61] D. Leviatan and I. A. Shevchuk, “Coconvex approximation,” Journal of Approx.
Theory, vol. 118, pp. 20–65, 2002.
[62] A. Lombardo, G. Morabito, and G. Schembra, “An accurate and treatable
Markov model of MPEG-Video traffic,” in Proc. IEEE INFOCOM, San Fran-
cisco, CA, Mar. 1998, pp. 217–224.
[63] A. Lombardo, G. Morabito, S. Palazzo, and G. Schembra, “A Markov-based
algorithm for the generation of MPEG sequences matching intra- and inter-
GOP correlation,” European Trans. on Telecommunications, vol. 12, pp. 127–
142, Mar./Apr. 2001.
[64] S. H. Low and D. E. Lapsley, “Optimization flow control I: Basic algorithm
and convergence,” IEEE/ACM Trans. on Networking, vol. 7, pp. 861–874, Dec.
1999.
[65] W. Li, “Overview of fine granularity scalability in MPEG-4 video standard,”
IEEE Trans. on CSVT, pp. 301–317, Mar. 2001.
[66] J. Lin and A. Ortega, “Bit-rate control using piecewise approximation rate-
distortion characteristics,” IEEE Trans. on CSVT, vol. 8, pp. 446–459, Aug.
1998.
[67] F. Ling, W. Li, and H. Sun, “Bitplane coding of DCT coefficients for image and
video compression,” Proc. SPIE Visual Communications and Image Processing,
San Jose, CA, Jan. 1999, pp. 500–508.
[68] D. Liu, E. I. Sára, and W. Sun, “Nested auto-regressive processes for
MPEG-encoded video traffic modeling,” IEEE Trans. on CSVT, vol. 11, pp. 169–183,
Feb. 2001.
[69] T. Liu, H.-J. Zhang, W. Qi, and F. Qi, “A systematic rate controller for MPEG-
4 FGS video streaming,” Multimedia Systems, vol. 8, Dec. 2002.
[70] D. Loguinov and H. Radha, “Increase-decrease congestion control for real-time
streaming: Scalability,” in Proc. IEEE INFOCOM, New York, June 2002,
pp. 525–534.
[71] S. Ma and C. Ji, “Modeling video traffic using wavelets,” IEEE Communication
Letters, vol. 2, no. 4, pp. 100–103, Apr. 1998.
[72] S. Ma and C. Ji, “Modeling heterogeneous network traffic in wavelet domain,”
IEEE/ACM Trans. on Networking, vol. 9, pp. 634–649, Oct. 2001.
[73] B. Maglaris, D. Anastassiou, P. Sen, G. Karlsson, and J. Robbins, “Perfor-
mance models of statistical multiplexing in packet video communications,”
IEEE Trans. on Comm., vol. 36, pp. 834–844, July 1988.
[74] S. Mallat and F. Falzon, “Analysis of low bit rate image transform coding,”
IEEE Trans. on Signal Processing, vol.46, pp. 1027–1042, Apr. 1998.
[75] L. Massoulié, “Stability of distributed congestion control with heterogeneous
feedback delays,” IEEE Trans. on Automatic Control, vol. 47, pp. 895–902,
June 2002.
[76] B. Melamed and D. E. Pendarakis, “Modeling full-length VBR video using
Markov-renewal-modulated TES models,” IEEE Journal on Selected Areas in
Communications, vol. 16, pp. 600–611, June 1998.
[77] J. L. Mitchell, MPEG Video: Compression Standard, Boston, MA: Kluwer
Academic Publishers, 2002.
[78] MPEG, “Coding of moving pictures and audio,” ISO/IEC JTC1/SC29/WG11
N3908, Jan. 2001.
[79] F. Muller, “Distribution shape of two-dimensional DCT coefficients of natural
images,” Electronics Letters, vol. 29, Oct. 1993.
[80] A. N. Netravali and B. G. Haskell, Digital Pictures Presentation, Compression,
and Standards. New York: Plenum, 1988.
[81] J. O’Neal, T. Natarajan, “Coding isotropic images,” IEEE Trans. on Informa-
tion Theory, vol. 23, pp. 697–707, Nov 1977.
[82] A. Ortega and K. Ramchandran, “Rate-distortion methods for image and video
compression,” IEEE Signal Processing Magazine, vol. 15, pp. 23–50, Nov. 1998.
[83] J. Padhye, V. Firoiu, D. F. Towsley, and J. F. Kurose, “Modeling TCP reno
performance: A simple model and its empirical validation,” IEEE/ACM Trans.
on Networking, vol. 8, pp. 133–145, Apr. 2000.
[84] J. Postel, “User datagram protocol,” RFC 768, IETF standard, Aug. 1980.
[85] J. Postel, “Transmission control protocol - DARPA Internet program protocol
specification,” RFC 793, IETF standard, Sept. 1981.
[86] H. Radha, M. V. Schaar, and Y. Chen, “The MPEG-4 fine-grained scalable
video coding method for multimedia streaming over IP,” IEEE Trans. on Mul-
timedia, vol. 3, pp. 53–68, Mar. 2001.
[87] M. Reisslein, J. Lassetter, S. Ratnam, O. Lotfallah, F. H. P. Fitzek, and S.
Panchanathan, “Video traces for network performance evaluation,” available at
2004.
[88] R. Rejaie, M. Handley, “Quality adaptation for congestion controlled video
playback over the Internet,” in Proc. of ACM SIGCOMM, Cambridge, MA,
Sep. 1999, pp. 189–200.
[89] R. Rejaie, M. Handley, and D. Estrin, “RAP: An end-to-end rate-based conges-
tion control mechanism for real-time streams in the Internet,” in Proc. IEEE
INFOCOM, New York, USA, Mar. 1999, pp. 1337–1345.
[90] V. J. Ribeiro, R. H. Riedi, M. S. Crouse, and R. G. Baraniuk, “Multiscale
queuing analysis of long-range-dependent network traffic,” in Proc. IEEE IN-
FOCOM, Tel-Aviv, Israel, Mar. 2000, pp. 1026–1035.
[91] J. Rissanen and G. Langdon, “Arithmetic coding,” IBM Journal of Research
and Development, vol. 23, pp. 149–162, Mar. 1979.
[92] O. Rose, “Statistical properties of MPEG video traffic and their impact on
traffic modeling in ATM systems,” in Proc. of the 20th Annual Conference on
Local Computer Networks, Minneapolis, MN, Oct. 1995, pp. 397–406.
[93] O. Rose, “Simple and efficient models for variable bit rate MPEG video traffic,”
in Performance Evaluation, vol. 30, pp. 69–85, 1997.
[94] D. Saparilla and K. Ross, “Optimal streaming of layered video,” in Proc. IEEE
INFOCOM, Tel-Aviv, Israel, Mar. 2000, pp. 737–746.
[95] U. K. Sarkar, S. Ramakrishnan, and D. Sarkar, “Modeling full-length video us-
ing Markov-modulated Gamma-based framework,” IEEE/ACM Trans. on Net-
working, vol. 11, pp. 638–649, Aug. 2003.
[96] M. van der Schaar, “System and network-constrained video compression,” Ph.D.
dissertation, Eindhoven University of Technology and Delft University of Tech-
nology, Netherlands, 2001.
[97] C. E. Shannon, “A mathematical theory of communication,” Bell Syst. Tech.
Journal, vol. 27, pp. 379–423, 1948.
[98] M. Shreedhar and G. Varghese, “Efficient fair queuing using deficit round-
robin,” IEEE/ACM Trans. on Networking, vol. 4, pp. 375–385, June 1996.
[99] S. R. Smoot and L. A. Rowe, “Study of DCT coefficient distributions,” in Proc.
SPIE Symposium on Electr. Imaging, San Jose, CA, vol. 2657, Jan. 1996.
[100] E. Steinbach, N. Farber, and B. Girod, “Adaptive playout for low latency video
streaming,” in Proc. IEEE ICIP, Thessaloniki, Greece, Oct. 2001, pp. 962–965.
[101] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, “Chord: A
scalable peer-to-peer lookup service for Internet applications,” in Proc. ACM
SIGCOMM, San Diego, CA, Aug. 2001, pp. 149–160.
[102] G. J. Sullivan and T. Wiegand, “Rate-Distortion optimization for video com-
pression,” IEEE Signal Processing Magazine, vol. 15, pp. 74–90, Nov. 1998.
[103] D. S. Taubman, “Directionality and scalability in image and video
compression,” Ph.D. dissertation, University of California at Berkeley, Berkeley, CA,
1994.
[104] A. J. Viterbi and J. K. Omura, Principles of Digital Communication and Cod-
ing, New York: McGraw-Hill, 1979.
[105] Q. Wang, Z. Xiong, F. Wu, and S. Li, “Optimal rate allocation for progressive
fine granularity scalable video coding,” IEEE Signal Processing Letters, vol. 9,
pp. 33–39, Feb. 2002.
[106] Y. Wang, M. T. Orchard, and A. R. Reibman, “Multiple description image
coding for noisy channels by pairing transform coefficients,” in Proc. IEEE
Workshop on Multimedia Signal Processing, Princeton, NJ, June 1997, pp. 419–424.
[107] Y. Wang, J. Ostermann, and Y.-Q. Zhang, Video Processing and Communica-
tions, NJ: Prentice Hall, 2001.
[108] D. Wu, Y. T. Hou, B. Li, W. Zhu, Y.-Q. Zhang, and H. J. Chao, “An end-to-end
approach for optimal mode selection in Internet video communication: Theory
and application,” IEEE Journal on Selected Areas in Communications, vol. 18,
pp. 1–20, June 2000.
[109] D. Wu, Y. T. Hou, W. Zhu, H.-J. Lee, T. Chiang, Y.-Q. Zhang, and H. J. Chao,
“On end-to-end architecture for transporting MPEG-4 video over the Internet,”
IEEE Trans. on CSVT, vol. 10, pp. 923–941, Sept. 2000.
[110] D. Wu, Y.T. Hou, W. Zhu, Y.-Q. Zhang, J.M. Peha, “Streaming video over the
Internet: approaches and directions,” IEEE Trans. on CSVT, vol. 11, pp. 282–
300, Mar. 2001.
[111] G. S. Yovanof and S. Liu, “Statistical analysis of the DCT coefficients and
their quantization error,” in Conf. Rec. 30th Asilomar Conf. Signals, Systems,
Computers, Pacific Grove, CA, Nov. 1996, pp. 601–605.
[112] Y. Zhang, S.-R. Kang, and D. Loguinov, “Delayed stability and performance of
distributed congestion control,” in Proc. ACM SIGCOMM, Portland, OR, Aug.
2004, pp. 307–318.
[113] L. Zhao, J. W. Kim, and C.-C. Kuo, “MPEG-4 FGS video streaming with
constant-quality rate control and differentiated forwarding,” in Proc. SPIE Vi-
sual Communications and Image Processing, San Jose, CA, Jan. 2002, pp. 230–
241.
[114] X. J. Zhao, Y. W. He, S. Q. Yang, and Y. Z. Zhong, “Rate allocation of equal
image quality for MPEG-4 FGS video streaming,” in Packet Video, Pittsburgh,
PA, Apr. 2002.
[115] J.-A. Zhao, B. Li, and I. Ahmad, “Traffic model for layered video: An approach
on Markovian arrival process,” in Packet Video, Nantes, France, Apr. 2003.
VITA
Min Dai received her B.S. and M.S. degrees in precision instruments from Shanghai
Jiao Tong University, China, in 1996 and 1998, respectively. She has been pursuing
her Ph.D. degree in electrical engineering at Texas A&M University since 1999.
She was a research intern with LSI Logic Company, San Jose, CA, from January
2002 to August 2002. Afterwards, she joined the Internet Research Lab, Department
of Computer Science, Texas A&M University.
Her research interests include scalable video streaming, video traffic modeling,
and image denoising. She may be contacted at:
Min Dai C/O Shanren Dai
11 Shucheng Road, the 8th Floor
Hefei, Anhui, 230001
P. R. China