Contents

CHAPTER 1
1. DIGITAL IMAGE
1.1 BLACK AND WHITE
1.1.1 IMAGE CAPTURE
1.1.2 IMAGE FORMAT
1.2 IMAGE IN COLOR
1.2.1 RGB
1.2.2 Y, CR, CB
1.3 IMAGE FILES
1.3.1 TIFF
1.3.2 GIF
1.3.3 PNG
1.3.4 JPEG
1.3.5 JFIF
1.3.6 BMP
1.3.7 PMBP
1.3.8 JPEG2000
1.4 QUALITY OF IMAGE
1.4.1 OBJECTIVE MEASUREMENT
1.4.2 SUBJECTIVE MEASUREMENT

CHAPTER 2
2. IMAGE COMPRESSION
2.1 COMPRESSION BASICS
2.1.1 SPATIAL REDUNDANCY
2.1.2 SYMBOL PROBABILITY
2.1.3 TEMPORAL REDUNDANCY
2.1.4 HUMAN EYE CHARACTERISTICS
2.2 MPEG1
2.2.1 VIDEO STRUCTURE
2.2.2 CODIFICATION
2.3 MPEG 2
2.4 MPEG4
2.4.1 MULTIMEDIA RESTRICTIONS
2.4.2 THE STANDARD
2.4.3 BITSTREAM
2.5 DV
2.5.1 ENCODING
2.6 JPEG 2000
2.6.1 IMAGE PROCESSING
2.6.2 JP2 FILE
2.6.3 CONTIGUOUS CODE STREAM BOX
2.7 PRODUCTION FORMATS
2.8 UNCOMPRESSED VIDEO

CHAPTER 3
3. STORAGE
3.1 PRODUCTION
3.1.1 SPECIFICATIONS
3.1.2 RAID SYSTEMS
3.2 FILE

CHAPTER 4
4. NETWORKS
4.1 GENERAL CONCEPTS
4.1.1 TRANSPARENCY
4.1.2 CONNECTIVITY
4.1.3 SCOPE
4.2 THE LAYER MODEL
4.2.1 PROTOCOLS
4.2.2 LEVEL 1 PHYSICAL
4.2.3 LEVEL 2 LINK
4.2.4 LEVEL 3 NETWORK
4.2.5 LEVEL 4 TRANSPORT
4.3 VIDEO TRANSFER ORIENTED PROTOCOLS
4.3.1 PHYSICAL AND LINK LEVELS
4.3.2 STREAMING AND FILE COPY
4.3.3 FTP (FILE TRANSFER PROTOCOL)
4.3.4 FTP ALTERNATIVES
4.3.5 UNICAST / MULTICAST TRANSMISSION
4.3.6 INTERNET GROUP MANAGEMENT PROTOCOL (IGMP)
4.3.7 MPLS
4.3.8 RTP/RTCP
4.3.9 RTSP (REAL-TIME STREAMING PROTOCOL)
4.4 QUALITY OF SERVICE
4.4.1 INTSERV AND RSVP
4.4.2 DIFFSERV
4.5 FEC (FORWARD ERROR CORRECTION)
4.5.1 SMPTE 2022-2007
4.5.2 FEC 2D

CHAPTER 5
5. 3D IMAGES
5.1 KINDS OF 3D VIDEO
5.1.1 3D VIDEO BASED ON STEREO TV
5.1.2 STEREOSCOPIC 3D VIDEO
5.1.3 3D VIDEO BASED ON AUTOSTEREO TV
5.1.4 MULTICAMERA
5.2 PRODUCTION AND DISTRIBUTION
5.2.1 THE AVATAR CASE
5.2.2 TDT 3D, TV3 CASE
5.3 ISSUANCE AND DISPLAY
5.3.1 ANAGLYPH SYSTEM
5.3.2 COLORCODE 3D
5.3.3 POLARIZATION SYSTEM
5.3.4 FIELD SEQUENTIAL SYSTEM
5.3.5 WOWVX SYSTEM
5.3.6 OTHER SYSTEMS



CHAPTER 1


1. DIGITAL IMAGE


Digital video has become a very important element in different areas of our society: business, education, entertainment, etc. Its diffusion and distribution covers a wide range of technologies, from the broadcasting of Digital Video Broadcasting (DVB-T) for Digital Terrestrial Television (DTT) to the publication of content based on web technology. The increase in video consumption is due to the combination of two factors: lower cost and increased capacity, both in the field of communications and in storage.



The increase in network speed, together with its low cost, has led to the massive use of these networks both in domestic settings and in professional production environments.


Progress in storage systems and processing has allowed domestic users to keep on their computers content libraries and powerful content-creation tools that until recently were available only in costly production environments. This lower cost has allowed production companies to update their equipment in less time and with greater benefit.


This migration of processes away from a traditional office environment has required the creation and continuous revision of working standards in order to satisfy the strong demands imposed by this type of content.


This chapter reviews the basic concepts of image capture and creation for professional production, establishing a baseline image quality. This baseline image gives us a starting point from which to optimize resources in each process.


1.1 BLACK AND WHITE


1.1.1 IMAGE CAPTURE


Our cameras capture a two-dimensional projection of the real three-dimensional scene; this projection is a distribution of the light energy reflected by the scene. In order to capture this image as digital data, it is necessary to convert this light energy into a digital image. This requires three steps:




• Definition of rows and columns (axes u and v) at which the luminance values of the image are taken. Through this process, discrete luminance values are obtained from the continuous ones; at each point (u,v) one value is obtained.

• Quantification of the values obtained in the previous step using a certain number of bits, so that they can be handled by the computer.

• Sequencing of the values at the points (x,y), so that the sets of values define the image.

As shown in Illustration 1-1, the image is inverted by the effect of the converging lens.






ILLUSTRATION 1-1. IMAGE BEHAVIOR THROUGH THE LENS


In the image plane we have a two-dimensional representation of reality. It is in this plane that the image sensor is placed. The sensor is divided into rows and columns, with a cell at each pair (x,y); each cell reports the amount of light striking it (Illustration 1-2), yielding a numeric representation of the distribution of the light that reaches the sensor.





ILLUSTRATION 1-2. SENSOR DIVIDED INTO CELLS


The data generated by the sensor from the incident light energy has to be sampled at regular intervals so that the information can be read out sequentially, pixel by pixel (Illustration 1-3).





ILLUSTRATION 1-3. DATA SERIALIZATION


Whether the data is to be stored or processed mathematically, the resulting values must be encoded as a bit sequence; thus each pixel is defined by a number of bits (k) within the data stream. The range of values the image can take runs from a minimum value (0) to a maximum value (2^k - 1). Using seven bits (k = 7), the values from black to white are referenced from 0 to 2^7 - 1, giving a range of values going from 0 to 127 (128 values); with 8 bits the range is from 0 to 255. As a general rule, the more bits used to encode an image, the closer the quality to the original image. In the case where k has a value of 1, a single bit is used for the codification, giving a black-and-white image without any type of grays.
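As a minimal sketch of the quantification just described (the function name and the use of Python are illustrative, not part of the text), a continuous luminance value can be mapped to one of the 2^k discrete levels like this:

    # Quantify a continuous luminance value (0.0 .. 1.0) into k bits,
    # giving an integer between 0 and 2^k - 1 (e.g. 0..127 for k = 7).
    def quantify(luminance, k):
        levels = 2 ** k                           # number of discrete values
        q = int(luminance * (levels - 1) + 0.5)   # round to the nearest level
        return max(0, min(levels - 1, q))         # clamp to the valid range

    print(quantify(0.0, 7))   # 0   (black)
    print(quantify(1.0, 7))   # 127 (white)
    print(quantify(0.5, 8))   # 128 (mid gray with 8 bits)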


Several methods exist for data serialization; the simplest reads the values pixel by pixel, outputting the information of each pixel in turn.


Illustration 1-4 shows the well-known Frame Transfer method (CCD-FT). With this method, after exposure the image values pass to storage registers, and the values of the different lines are then transferred to the output registers; when the output registers have been read out to memory, the values of the next line are transferred.




ILLUSTRATION 1-4. FRAME TRANSFER


1.1.2 IMAGE FORMAT


The image format, or definition, is given by the number of columns M and the number of rows N, so that the number of pixels is M x N. If the image is encoded with k bits per pixel, the storage of that image uncompressed will have a size (S) of:

S = M x N x k


The image format is usually specified by the ratio between the number of columns and the number of rows, that is, between the number of pixels in a horizontal line and in a vertical one. An image thus has an aspect ratio M/N, meaning M horizontal pixels (number of columns) for N vertical ones (number of rows). Thus 16/9 indicates that for every 16 pixels in a horizontal line there are 9 in a vertical one.


For a 16/9 format with 1,080 horizontal lines, the number of pixels in a horizontal line is 1,920 = 1,080 x 16/9, so the image has 1,920 x 1,080 = 2,073,600 pixels. If it is encoded with 7 bits per pixel, the image takes 1,920 x 1,080 x 7 = 14,515,200 bits (1,814,400 bytes).
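These figures can be reproduced with a one-line calculation; the sketch below (the helper name is mine) applies S = M x N x k to the 1080-line example:

    # Uncompressed image size: S = M x N x k (columns x rows x bits per pixel).
    def image_size_bits(columns, rows, k):
        return columns * rows * k

    bits = image_size_bits(1920, 1080, 7)
    print(bits)        # 14515200 bits
    print(bits // 8)   # 1814400 bytes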


1.2 IMAGE IN COLOR


A monochromatic image is produced by sampling the intensity of the light incident on the scene. To capture color images, the beam is filtered into three primary colors and captured by three different CCDs, one per color: red, green and blue (RGB).


1.2.1 RGB


Each pixel is represented by three numbers that indicate the intensity of each of the colors. Using 8 bits per color requires 24 bits per pixel, so the size of a color image with k bits per pixel and color is:

S = M x N x 3 x k


With these three colors, all additive colors can be represented through their combination. For the specific case of a white pixel, each of the components takes its maximum value; the luminance of a pixel is obtained from its components as:

Y = 0,299 x R + 0,587 x G + 0,114 x B


Illustration 1-5 shows how the original image is decomposed into its three RGB components. The part of the lips that is redder appears with more presence in the R channel, while it is darker in the other channels. The sky, which in the original image is white, has a maximum in each of the components.




ILLUSTRATION 1-5. DECOMPOSITION OF A NATURAL IMAGE INTO ITS THREE COMPONENTS


Illustration 1-6 shows an electronically generated image of color bars, one of the most widely used signals for adjusting electronic equipment because it contains the possible combinations of the three RGB components.



ILLUSTRATION 1-6. ELECTRONIC IMAGE OF BARS AND THEIR SEPARATION INTO THE THREE PRIMARY COLORS


1.2.2 Y, CR, CB


One of the most common ways to represent images is by separating the luminance and chrominance components. A key aspect of image compression is that the human eye is more sensitive to variations in light than to variations in color levels, so if we separate luminance from chrominance, each can be treated separately.


Considering that:

Y = 0,299 x R + 0,587 x G + 0,114 x B

where Y is the luminance component, we can define two chrominance components, Cr and Cb, as the difference of the red component with the luminance and of the blue component with the luminance, respectively:

Cr = R - Y
Cb = B - Y

From the YCrCb values, the RGB values may be recovered:

R = Y + Cr
B = Y + Cb
G = (Y - 0,299 x R - 0,114 x B) / 0,587
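As a quick check of these relations, the following sketch converts an RGB triple to Y, Cr, Cb and back, using exactly the difference definitions given above (a minimal illustration; broadcast standards add scaling factors and offsets to these components):

    # Separation into luminance (Y) and chrominance (Cr = R - Y, Cb = B - Y).
    def rgb_to_ycrcb(r, g, b):
        y = 0.299 * r + 0.587 * g + 0.114 * b
        return y, r - y, b - y

    # Inverse: recover R and B directly, then solve the Y equation for G.
    def ycrcb_to_rgb(y, cr, cb):
        r = y + cr
        b = y + cb
        g = (y - 0.299 * r - 0.114 * b) / 0.587
        return r, g, b

    print(rgb_to_ycrcb(255, 255, 255))               # white: Y = 255, Cr = Cb = 0
    print(ycrcb_to_rgb(*rgb_to_ycrcb(200, 30, 64)))  # round trip -> (200, 30, 64)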







By separating the components, luminance and chrominance may be encoded in different ways. This differentiated encoding is usually represented by three figures that identify the number of pixels encoded in a block of 2x2 pixels. In the case of 4:4:4, the three figures mean that four pixels of Y are encoded, the same four of Cr and the same four of Cb. In the case of 4:2:2, for every four pixels of Y, two of Cr and two of Cb are encoded, because the color components are encoded at half the horizontal pixels. In the case of 4:2:0, half of the horizontal and vertical pixels are encoded, as shown in Illustration 1-7.






ILLUSTRATION 1-7. ENCODING EXAMPLES: 4:4:4, 4:2:2 AND 4:2:0


For each of the cases, the sizes will be:

S (4:4:4) = 3 x M x N x k
S (4:2:2) = 2 x M x N x k
S (4:2:0) = 1,5 x M x N x k





In the case of 4:1:1, a 2x2 square of pixels is not taken but four consecutive pixels, so it indicates that the color components of one pixel out of every four (horizontally) are encoded. 3:1:1 indicates that for every four pixels, three luminance samples and one chrominance sample are encoded horizontally. If one more number is added, it indicates that there is an alpha channel and how that channel is encoded (see Illustration 1-8).




ILLUSTRATION 1-8. ENCODING FOR 4:1:1 AND 3:1:1


The sizes will be:

S (4:1:1) = 1,5 x M x N x k
S (3:1:1) = 1,25 x M x N x k







1.3 IMAGE FILES

Today there are many standardized formats for storing and processing images. Developers have to select the type of image and the ideal compression for each type of process: managing an image that needs retouching for production is not the same as preparing a file to publish on a web page. In some cases it will be necessary to store uncompressed material for an image process, and in other cases a compression as high as possible will be needed, even at the cost of changes in characteristics such as image size.

















We will look at the formats that use arrays to store pixel values, not the referenced vector formats.


1.3.1 TIFF

The Tagged Image File Format (TIFF) was originally developed by Aldus and Microsoft, but it now belongs to Adobe Systems. The last version was released in 1992 (Adobe Developers Association, final, June 3, 1992). A TIFF file can contain one or several images, ranging from black-and-white to true-color images, with diverse compression schemes.

One of the possibilities offered by this format, which makes it very attractive in professional production environments, is the use of the alpha channel, which allows the creation of the masks and transparency effects needed in professional photography and video production.

The architecture is based on the use of tags that define the characteristics of the image, such as palette, resolution, dimensions, data location, etc. These tags are located at the beginning of the file and provide flexibility for creating new data and tag types defined by the user.


1.3.2 GIF

The Graphics Interchange Format (GIF) was created by CompuServe in 1987. It has been one of the most widely used formats in web environments for several reasons. One is the possibility of housing several images inside the same file, so that it can store sequences that give motion to the stored images. Another is the small size of the files that can be generated: the format uses palettes of 2 to 256 colors chosen from among 16.8 million possible colors, which allows few bits to be used for color encoding, since each palette is restricted to a maximum of 256 colors.


For example, a drawing that uses 8 different colors requires only 3 bits per pixel to store the color information instead of one byte, which means a saving of more than fifty percent in storage space and in the transmission of the image.
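The saving can be computed directly (a tiny sketch; the helper name is mine):

    import math

    # Bits needed per pixel when indexing a palette of n colors.
    def palette_bits(n_colors):
        return max(1, math.ceil(math.log2(n_colors)))

    print(palette_bits(8))          # 3 bits per pixel instead of 8
    print(1 - palette_bits(8) / 8)  # 0.625 -> more than fifty percent saved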


The format is mainly used for creating animated icons which, with few bits of color encoding, compress losslessly with great performance. This, together with the possibility of creating animations, makes it an ideal format for these environments.


1.3.3 PNG


The Portable Network Graphics (PNG) format was initially developed as a free format to replace GIF files, largely in order to avoid license payments. It was designed for the representation of images on the Net, with support for three types of image: grayscale (up to 16 bits per pixel), true color (up to 3x16 bits per pixel) and indexed color (up to 256 colors).

It also incorporates an alpha channel, for work in the different production environments of photography and motion video.


1.3.4 JPEG


It was created by the Joint Photographic Experts Group (JPEG) and established as a standard by ISO (CCITT, 1992). Today it is the most widely used compression format for still images. The first JPEG encoder, which emerged from the collaboration between ITU and ISO, achieved a compression of 15:1 without loss of subjective quality; with higher compression factors the loss of quality becomes perceptible.

Although it is a standard created for still images, it has also been used to encode motion images as image sequences, and it was for a long time the system used by professional-quality video servers. The ability to edit at any frame of the sequence without increased computational complexity made it widely used in production.

The standard specifies two kinds of encoding: lossless and lossy.


Lossless: no loss of quality in the compression. It is based mainly on two principles: the similarity between nearby points, and the higher probability of appearance of some values over others. These two techniques complement each other.

Lossy: based on transforming the data, sequential in time, into a frequency-domain representation by means of a transform.

The compression is performed in three stages:

• Conversion of the color base: the RGB values are converted to Y, Cr and Cb signals. Because the human eye is less sensitive to color variations than to light variations, each component can be treated independently, enabling compression whose degradation is barely noticeable.

• Cosine transform: the image is divided into blocks of 8x8 pixels and the cosine transform is applied to each block, obtaining an array of 64 frequency coefficients. These coefficients are encoded at the scale determined by the quantification matrix; this matrix determines the degree of compression of the image, because it assigns the number of bits used to quantify each coefficient (see the sketch after this list). In general, the high-frequency coefficients are encoded with fewer bits than the low-frequency ones, because human sight is less sensitive to variations at high frequencies.

• Lossless compression: the coefficients from the previous step are coded with variable-length codes such as Huffman codes, so this step is lossless.
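A minimal sketch of the quantification at the heart of the second stage follows; numpy is assumed, and the coefficient block and flat matrix are placeholders of my own, not values from the standard:

    import numpy as np

    # Quantify an 8x8 block of DCT coefficients: divide each coefficient by its
    # step in the quantification matrix and round. Larger steps (used at high
    # frequencies) discard more information -- this is the lossy step.
    def quantify_block(dct_block, q_matrix):
        return np.round(dct_block / q_matrix).astype(int)

    def dequantify_block(q_block, q_matrix):
        return q_block * q_matrix   # reconstruction error <= half a step

    rng = np.random.default_rng(0)
    block = rng.normal(0, 50, (8, 8))   # placeholder DCT coefficients
    q = np.full((8, 8), 16.0)           # placeholder flat quantification matrix
    error = dequantify_block(quantify_block(block, q), q) - block
    print(np.max(np.abs(error)))        # never exceeds half a step (8.0)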

1.3.5 JFIF

JPEG is not in itself a file format but rather a compression format; what is known as a JPEG file is actually the JPEG File Interchange Format (JFIF). The JPEG standard only defines the compression method and does not specify the packaging of the data, i.e. the file format. The JFIF specification defines a container file for JPEG-compressed images.

1.3.6 BMP

The bitmap (Bit Mapped Picture, BMP) is perhaps the simplest image file in existence, and it is commonly used for uncompressed image management, so its weight is usually very large compared to other formats.

The file begins with a header that defines the image: size, color palette, number of colors, etc., followed by the image values. The image information is organized beginning with the last row, traversing that row's values column by column from left to right; when the last row is finished, it continues with the penultimate row, starting again at column one.


1.3.7 PMBP

Uncompressed images used to preview images of larger size.


1.3.8 JPEG2000

The characteristics of different transforms for compressing images have long been studied and exploited. The JPEG2000 standard was developed by the Joint Photographic Experts Group, which had created the JPEG standard based on the discrete cosine transform, this time using a different transform. The JPEG2000 algorithm is based on the discrete wavelet transform (DWT), which naturally integrates lossy and lossless technologies within the same platform and enables different types of progressive encoding and decoding.

The file is a sequence of boxes containing different kinds of information about the images; the encoded bitstream is located in the Contiguous Codestream Box. The box begins with a series of headers that specify the type of encoding (number of layers, quantification, etc.).



1.4 QUALITY OF IMAGE

Image quality depends on different factors, so control is necessary to avoid significant degradation in the different processes of creation. Images can lose quality in different kinds of process:

• Compression

• Production

• Streaming transmission


In the compression process there are two types of degradation: in the quantification itself, due to the conversion of analog images, and in the compression of those values. Quantification means the conversion from analog to digital imaging, which involves converting an image with an infinite number of points, each with an infinite number of possible values, into a finite number of pixels with a finite number of values per pixel. The compression of the image will influence its quality in the case of lossy compression.

In production systems there may be a loss of quality due to transcoding, moving between uncompressed images and compressions with different technologies.

In streaming broadcast, the images are reproduced in real time, so if a transmission error or packet loss occurs, the receiver tries to hide those errors through prediction techniques which, depending on the type of image, may have more or less success.

To evaluate the quality of both image and sound, there are objective methods related to signal degradation; these objective methods express the distortion of the image through discrete values. These distortions can affect viewers differently: depending on their influence on the viewer, it can happen that the viewer has the sensation of seeing a higher-quality image when in fact it is of lower quality. As a consequence, various subjective measurement techniques have arisen, since the main purpose is that the viewer perceives the image as having high quality.

1.4.1 OBJECTIVE MEASUREMENT


There are different methods to test the objective quality of the image or the sound. The main ones are:

• Mean absolute error (MAE): indicates the mean of the difference, in absolute value, between the received pixels and the reference.

• Peak absolute error (PAE): indicates the maximum absolute value that a pixel difference has taken.

• Mean squared error (MSE): the mean of the squared difference of each pixel.

• Root mean squared error (RMSE).

• Signal-to-noise ratio.


The MAE measures the mean error in a data set without taking the sign into account; it is therefore the expected absolute error in each of the values, and in this type of measurement all differences are weighted equally. The PAE tells us the maximum error we can expect. In contrast, the mean squared error (MSE) weights the differences differently: by squaring them, it gives more importance to the larger differences, so the MSE is useful when big errors are worse than small ones.

The MAE and RMSE can be used together to study the variation of the errors. The RMSE will always be greater than or equal to the MAE; if the two values are similar, the errors are of similar magnitude, and if there is a large difference between them, the variation of the individual errors is greater. They range from 0 to ∞, with values close to 0 being better.
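A direct numpy transcription of these measures might look as follows (a sketch; the function and array names are mine):

    import numpy as np

    def mae(ref, img):   # mean absolute error
        return np.mean(np.abs(ref - img))

    def pae(ref, img):   # peak absolute error
        return np.max(np.abs(ref - img))

    def mse(ref, img):   # mean squared error
        return np.mean((ref - img) ** 2)

    def rmse(ref, img):  # root mean squared error
        return np.sqrt(mse(ref, img))

    ref = np.array([60.0, 64.0, 59.0, 61.0])
    img = np.array([58.0, 64.0, 60.0, 57.0])
    # MAE = 1.75, PAE = 4, MSE = 5.25, RMSE ~ 2.29 (note RMSE >= MAE)
    print(mae(ref, img), pae(ref, img), mse(ref, img), rmse(ref, img))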


1.4.2 SUBJECTIVE MEASUREMENT


These are systems based on the viewing of images by different users. The main procedures are:

• Double stimulus (ITU-R BT.500)

• Single stimulus


In the first system (DSCQS), the user scores a signal from 0 to 5 in quality with respect to a reference signal (0 worst, 5 best). On many occasions no reference signal is available, because the image comes from a distant source; in these cases the second method is used, in which users rate the image without a reference source. The first system is used to measure the degradation of the image in editing, postproduction and transcoding processes inside an installation where both the original and the processed images are available; the second method is used to measure the quality of the links at reception.




CHAPTER 2

2. IMAGE COMPRESSION


This chapter begins with a brief reflection on the need to compress images, especially moving images, and then reviews the most significant aspects of the different standards for production, distribution and video broadcast.

2.1 COMPRESSION BASICS


Video can be considered as a sequence of consecutive images that provides a sensation of motion. To specify the image format, the notation Lines-E-Frames is used, where Lines specifies the number of horizontal lines, E(1) defines the type of interlacing between the images, and Frames specifies the number of images per second. Thus 720i50 means that the image has 720 lines per frame with interlacing I (half of the 720 lines go in one field and the other half in the next), and 50 indicates that 50 fields are encoded per second; the format 1080p50 means a resolution of 1,080 lines with 50 images per second, progressive (the frames are not interlaced; each frame has 1,080 lines). In the latter case, if the format is 16/9, the number of pixels per image will be 1,920 x 1,080 = 2,073,600 pixels. If each component is encoded with 10 bits in 4:4:4 YCrCb, the result is 82,944,000 bits per image, which at fifty images per second gives 4.1 Gbps.

The Consultative Committee for International Telegraphy and Telephony (CCITT) specified in the Serial Digital Interface (SDI) the 4:2:2 type of encoding for digital video signals; at this rate, a file of 311 MBytes is needed to store one second. Such rates are nowadays unmanageable in certain production, storage and visualization environments, so several compression standards have been defined in order to reduce these sizes. These methods are based on several characteristics, compatible among themselves, that are exploited to carry out the compression.


2.1.1 SPATIAL REDUNDANCY

Based on the similarity between adjacent pixels, the difference between the real value and the estimated or expected one is encoded. Each of the values taken as a sample is multiplied by a constant, so that for a value X there is an estimate X' such that:

X' = a1·X1 + a2·X2 + ... + an·Xn

where ai is the value of the constant associated with the pixel value Xi. In the image (Illustration 2-1 a), the points a, b and c can be used to encode the point d. For example, if the values of a, b and c are 60, 64 and 59 respectively, and 1/3 is used as the constant for each of the pixels, the estimate for d will be 61, so the value 61 is expected at d. If the value at d is 59, the number 2 will be encoded and that value sent to the receiver.




(1) The value of this field can be I (Interlaced) if the fields are interlaced or P (Progressive) if they are not.



ILLUSTRATION 2-1. PREDICTOR
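The worked example above (60, 64 and 59 predicting 61, with a real value of 59 encoded as the difference 2) can be written out directly; a sketch under the stated 1/3 weights:

    # Spatial prediction as in Illustration 2-1: estimate pixel d from its
    # neighbours a, b and c with constant weights of 1/3 each, and encode
    # only the prediction error (the difference sent to the receiver).
    def predict(a, b, c):
        return round((a + b + c) / 3)

    a, b, c, d = 60, 64, 59, 59
    estimate = predict(a, b, c)   # 61
    error = estimate - d          # 2 -> the value actually encoded
    print(estimate, error)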

The encoder is based on the errors obtained by the predictor between the predictions and the real values, encoding the error of the prediction. The histogram obtained differs greatly from that of the real image. If we look at Illustration 2-2, the first part of the figure corresponds to the probability of occurrence of the range of values in a given image for a Pulse Code Modulation (PCM) encoding, in which the analog signal is sampled and each value encoded independently of the rest; the data correspond to an image encoded with 8 bits (256 values) and show that many values have a similar probability of appearing in the image. The second part of the figure shows the histogram of the same image with the Differential Pulse Code Modulation (DPCM) technique; since it deals with differences, the values may be negative, with the average centered on zero. The zero and the values near it have a much higher probability of occurrence: in the figure, the zero value has a probability of occurring of 0,17, against the most probable value in PCM, which has a probability of 0,012. While in PCM the range of values is 0 to 2^k - 1, in DPCM the range is -(2^k - 1) to 2^k - 1, since it deals with differences.





ILLUSTRATION 2-2. HISTOGRAM OF AN IMAGE IN PCM AND IN DPCM


In PCM encoding, the values are encoded with the same step between one value and the next, so that if the image is encoded with 8 bits, each step from one value to the next is 1/255 of the difference between the maximum luminance value (white) and the minimum (black). In the DPCM case, variable-length coding techniques based on the probability of occurrence of the values can be exploited.


2.1.2 SYMBOL PROBABILITY

As seen in the histogram corresponding to DPCM encoding, not all symbols have the same probability of occurrence in the final bitstream.

For sending and storing information, lossless compression systems are often used, with encodings based on statistical studies of the occurrence of each symbol.


Huffman coding, created by David A. Huffman (Huffman D.A., 1952), is one of the most widely used encodings today due to its simplicity and good performance. Like Shannon-Fano coding (Fano R.M., 1961), it is based on the idea of using variable-length codes so that the most frequent symbols are represented by the shortest codes. The encoder creates the codes through a binary tree in which the symbols are the leaves and the branches are labeled with the binary digits 0 and 1. The distance of a symbol from the root of the tree defines the code length, or number of bits, assigned to that symbol, which ultimately depends on the symbol's probability.


To build the tree, the two symbols with the lowest probability are joined at a node, assigning a 0 to one of them and a 1 to the other. The new node takes as its probability the sum of the probabilities of its leaf symbols, and is incorporated as another symbol. The process is repeated until no symbols remain.


Illustration 2-3 shows an example with five symbols; as can be seen, the initial set of symbols is {A,B,C,D,E}. In a first iteration the two symbols with the lowest probability are D and E, which form the node N1 with probability 0,25 (0,15 + 0,10). At the next iteration we have the set {A,B,C,N1}, in which N1 and C form N2, passing to the set {A,B,N2}. To obtain the code of each symbol, the tree is traversed from the root node to the leaves.





ILLUSTRATION 2-3. EXAMPLE OF HUFFMAN ENCODING
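The construction just described fits in a few lines with a priority queue. In the sketch below, the probabilities of D and E are those of the example (0,15 and 0,10); those of A, B and C are assumptions chosen so that the merges happen in the order the text describes:

    import heapq

    # Build a Huffman code: repeatedly join the two least probable nodes,
    # labelling one branch 0 and the other 1.
    def huffman(probs):
        heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        count = len(heap)
        while len(heap) > 1:
            p1, _, codes1 = heapq.heappop(heap)   # least probable node
            p2, _, codes2 = heapq.heappop(heap)   # second least probable
            merged = {s: '0' + c for s, c in codes1.items()}
            merged.update({s: '1' + c for s, c in codes2.items()})
            heapq.heappush(heap, (p1 + p2, count, merged))
            count += 1
        return heap[0][2]

    probs = {'A': 0.30, 'B': 0.28, 'C': 0.17, 'D': 0.15, 'E': 0.10}
    print(huffman(probs))   # D and E get the longest codes, A and B the shortest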


One limitation of this kind of encoding arises with codes having many symbols, where many bits may be needed to encode a single symbol. The technique used to avoid this situation is to modify the algorithm so that a specific set of symbols is reached through a prefix symbol called "escape".


As an example of this kind of encoding, consider the code of Illustration 2-4: a set of symbols is defined with their probabilities, the symbols are divided into two subsets, and the least frequent subset is reached through the escape symbol.

ILLUSTRATION 2-4. MODIFIED HUFFMAN CODE


2.1.3 TEMPORAL REDUNDANCY

Taking advantage of the redundancy between two consecutive images is very useful, except when there is a change of scene. The most important difference between two consecutive images is determined by the movement of their components; this movement can be produced by the motion of an object within the image or by the motion of the camera. There are many motion-prediction algorithms, so the encoder and the decoder must agree on the algorithm being used, and the values the decoder needs to reconstruct the original image must be transmitted. On the encoding side, as a general rule, the better the motion algorithm, the greater the processing required and the less difference information needs to be encoded and sent with each frame.

Illustration 2-5 shows three frames and the motion vectors between frames N and N+1 and between N+1 and N+2. Motion estimation aims to minimize the number of encoding bits; each method has its advantages and drawbacks depending on the nature of the images to encode: whether there is more or less movement, whether the moving objects are big or small, whether the movement is panning or zooming, etc. There are many motion-estimation techniques, depending on the type of movement produced, as can be seen in Illustration 2-6.


ILLUSTRATION 2-5. MOTION ESTIMATION VECTORS

In per-pixel estimation, the movement of each pixel is represented by a set of parameters (displacement, speed and acceleration). It is one of the heaviest estimations because it requires a huge calculation capacity. Recursive techniques are used for the estimation; the best known is the pel-recursive technique, which associates a motion vector with each pixel (the term "pel" has the same meaning as pixel). Different recursive techniques have been developed to estimate the movement; the first algorithm is known as Displaced Frame Difference (DFD) (Netravali A. N., 1979), which turns the motion-estimation problem into a minimization problem, iterating from one pixel to another to find the best match.


ILLUSTRATION 2-6. TYPES OF MOTION ESTIMATION

In region-based estimation, the partition is not into rectangular spaces; instead, pixels are grouped into "surfaces" (MPEG4). Objects sharing the same movement are grouped to perform a joint estimation, obtaining better estimates at a lower bitrate, at the cost of greater complexity.


Block Matching

It is the most widely used technique today; it was proposed at the beginning of the eighties (Jain J. R., Dec. 1981). The image is divided into rectangular blocks of fixed size, assuming that the movement is uniform within each block; the motion vector associated with a block is estimated by locating its best match in the previous image. The block size must be known by both the encoder and the decoder; this value is usually 16x16 (the macroblock).

To optimize the match, functions such as maximum correlation (Anuta P.E., 1969), minimum difference, mean squared error (MSE) and mean absolute difference (MAD) are used. The number of points used in the estimation is a balance between processing and information to broadcast: the more points used for the motion estimation of a block, the greater the processing load and the greater the precision of the movement, the smaller the estimation error, and the less information needs to be sent for the same quality.

As for the block search strategy, there are also many algorithms. The one usually called full search uses a search window that moves over the image, applying one of the functions listed above at each position to locate the rectangle that best fits; it is what might be called a brute-force approach. One variation, called logarithmic search (Jain J. R., Dec. 1981), reduces the search to five points instead of the full 16x16: around the winner of the five points another small rectangle of five points is selected, and so on until the minimum is located within a 3x3 window. Many papers and publications have addressed how to locate the best-fitting block.
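A full search in the brute-force sense described above can be sketched as follows (numpy, MAD criterion; the array names and the small search range are mine):

    import numpy as np

    # Full-search block matching: slide the block over a search window in the
    # previous frame and keep the displacement with minimum mean absolute
    # difference (MAD).
    def best_match(prev, block, top, left, search=7):
        h, w = block.shape
        best, best_mad = (0, 0), np.inf
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = top + dy, left + dx
                if y < 0 or x < 0 or y + h > prev.shape[0] or x + w > prev.shape[1]:
                    continue                      # candidate falls outside the frame
                mad = np.mean(np.abs(prev[y:y+h, x:x+w] - block))
                if mad < best_mad:
                    best_mad, best = mad, (dy, dx)
        return best, best_mad

    rng = np.random.default_rng(1)
    prev = rng.integers(0, 256, (64, 64)).astype(float)
    cur_block = prev[20:36, 12:28]                       # 16x16 macroblock, displaced
    print(best_match(prev, cur_block, top=24, left=9))   # ((-4, 3), 0.0)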

2.1.4 HUMAN EYE CHARACTERISTICS

One of the most useful facts for image compression is the difference in sensitivity the human eye has to luminance variation at high and low frequencies. It has been proved experimentally that the eye is less sensitive to intensity variations at high frequencies than at low frequencies. This does not mean that the high frequencies should be eliminated from the image, but rather that a luminance error at these frequencies is less noticeable than at low frequencies.

To take advantage of these characteristics in the encoding process, it is necessary to distinguish the amplitude at each frequency, in order to encode the high frequencies with fewer bits than the low ones. The way to separate the components of these frequencies is through a transform.

Fourier Transform:

It is perhaps the most widely used transform. It is used to map a group of values sequenced in time into a sequence of frequencies, giving an idea of the frequencies that make up the input signal. The basis of the transformation is that the values of the signal in the time domain can be represented as sums of sine and cosine waves with their corresponding weights:

x(n) = a0 + Σk [ ak · cos(2·π·k·n/N) + bk · sin(2·π·k·n/N) ]

The sine and cosine signals are pure-frequency signals, or basis signals, which, summed with their coefficients, form the signal. The coefficients of a function with N encoding points are calculated as:

ak = (2/N) · Σn=0..N-1 x(n) · cos(2·π·k·n/N)
bk = (2/N) · Σn=0..N-1 x(n) · sin(2·π·k·n/N)


Discrete Cosine Transform (DCT):

This transform reduces the calculation requirements of the Fourier Transform and is better adapted to the efficiency needs of video signal encoding.

The basis vectors of the transform for N points give:

X(k) = c(k) · Σn=0..N-1 x(n) · cos[ (2n+1)·k·π / 2N ]

with an inverse transform defined by:

x(n) = Σk=0..N-1 c(k) · X(k) · cos[ (2n+1)·k·π / 2N ]

in which:

c(0) = √(1/N)    c(k) = √(2/N) for k > 0

When the image is defined in two dimensions, the transform must be applied in two dimensions:

X(u,v) = c(u)·c(v) · Σm=0..N-1 Σn=0..N-1 x(m,n) · cos[ (2m+1)·u·π / 2N ] · cos[ (2n+1)·v·π / 2N ]

An 8x8 pixel image matrix results in an 8x8 matrix of values, each of which is a coefficient of a frequency component. Applying the inverse transform converts the matrix of coefficients in the frequency domain back into the matrix in the spatial domain (the original signal).
ILLUSTRATION 2-7. 8X8 PIXEL MATRIX AND ITS TRANSFORM INTO AN 8X8 MATRIX OF FREQUENCY VALUES

Illustration 2-7 shows an 8x8 pixel matrix; the cosine transform converts the individual values of each of the points into coefficients of the frequency components.

Values with higher subscripts correspond to higher frequencies; the coefficient X(0,0) corresponds to the mean luminance value. An error in values with low subscripts is more detectable by the human eye, while the same error at higher subscripts is less detectable, so the higher-subscript coefficients can be encoded with fewer bits, or with larger quantification steps, than those with lower subscripts; in this way images can be compressed with little influence on the final subjective quality.
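The 8x8 transform can be implemented directly from the formulas above with an orthonormal DCT matrix (a numpy sketch):

    import numpy as np

    # Orthonormal 8x8 DCT-II basis: C[k, n] = c(k) * cos((2n+1) k pi / 16),
    # with c(0) = sqrt(1/8) and c(k) = sqrt(2/8) for k > 0.
    N = 8
    k = np.arange(N).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    C = np.sqrt(2 / N) * np.cos((2 * n + 1) * k * np.pi / (2 * N))
    C[0, :] = np.sqrt(1 / N)

    def dct2(block):     # forward 2-D transform: spatial -> frequency
        return C @ block @ C.T

    def idct2(coeffs):   # inverse 2-D transform: frequency -> spatial
        return C.T @ coeffs @ C

    block = np.full((8, 8), 128.0)          # a flat gray block
    F = dct2(block)
    print(round(F[0, 0]))                   # all energy in X(0,0): 1024
    print(np.allclose(idct2(F), block))     # True: perfect reconstruction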



2.2 MPEG1

The Moving Picture Experts Group (MPEG) committee was created in 1988 from members of ISO/IEC JTC1 SC29 WG11. MPEG1 was the first generation of encoders proposed by the group as a video storage standard; it was intended for low resolutions (352x288) and for content viewing up to a certain quality, but it is not appropriate for production or broadcast. The standard was conceived as an alternative to videocassettes.

Although it is not used nowadays, we will study its theoretical bases, which are common to the whole MPEG family.

Synchronization is based on presentation time stamps (PTS) defined by ISO 11172. The PTS are recorded at encoding and sent to the player for correctly scheduled playback. When the material is encoded, a clock reference (STC) is added to facilitate random access to the information at playback time.

2.2.1 VIDEO STRUCTURE

Motion as a sequence of images requires transmitting the sequence, but in MPEG the images are not all of the same type, and temporal redundancy is used in order to send less information.

Frame types

Three types of frames are defined for the images:

• Type I (intraframe): encoded independently of previous images. They provide direct access points into the stream and are used for fast forward, backward motion and sequential access to the images.

• Type P (predictive): referenced to previous P or I fields.

• Type B (bidirectional): can be referenced to previous images, later ones, or both. B fields are not used as predictors.

Since type B fields are not used as predictions for other fields, they can be encoded at a lower rate, because their errors do not carry over to the following fields; if a transmission error occurs in this kind of image, it does not propagate to the remaining fields.

GOP

A GOP is a group of fields that can be accessed directly. The first image is of type I, followed by P and B fields, as shown in Illustration 2-8; each Group of Pictures (GOP) has to contain one I field.

The structure is defined with two parameters:

• N is the GOP size. It can be defined as the distance between the two I images at the beginning of consecutive GOPs.

• M is the distance between I/P fields and the next P field.

The letters M and N thus specify the GOP structure. Depending on the types of change, the encoding order will not be identical to the order in which the images are generated, which produces a certain latency in encoding and decoding. Illustration 2-8 shows the prediction relations between the different GOP fields.


ILLUSTRATION 2-8. GOP STRUCTURE FOR N=9 AND M=3, SHOWING FORWARD AND BIDIRECTIONAL PREDICTION
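As a small illustration of these definitions, the display order and the transmission (coding) order for N=9 and M=3 can be generated mechanically (a sketch; a real encoder also handles the trailing B frames against the next GOP's I frame):

    # GOP pattern for N = 9 (GOP size) and M = 3 (anchor distance):
    # display order I B B P B B P B B; each anchor (I or P) must be
    # transmitted before the B frames that reference it.
    def gop_display(N, M):
        return ['I' if i == 0 else ('P' if i % M == 0 else 'B') for i in range(N)]

    def gop_transmission(N, M):
        order, pending = [], []
        for f in gop_display(N, M):
            if f == 'B':
                pending.append(f)         # hold Bs until their anchor is sent
            else:
                order.append(f)
                order.extend(pending)
                pending = []
        return order + pending            # final Bs wait for the next GOP's I

    print(gop_display(9, 3))       # ['I','B','B','P','B','B','P','B','B']
    print(gop_transmission(9, 3))  # ['I','P','B','B','P','B','B','B','B']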

Frame Structure

The basic unit of information sent is the block, which contains the information of 8x8 pixels; it is on this block that the DCT transform operates. Blocks are grouped into macroblocks of 16x16 pixels. A variable number of macroblocks form what is called a slice, and a variable number of slices form the frame, or complete image. Illustration 2-10 shows how the different elements are structured, from a block of 8x8 pixels up to a video sequence: the video sequence is grouped into GOPs, each image of a GOP is a frame, each frame is divided into slices, which in turn are divided into macroblocks, each containing four blocks of 8x8 pixels.

2.2.2 CODIFICATION

The encoding unit is the macroblock; the scanning order is from left to right and from top to bottom. To define the type of encoding, the notation X:Y:Z is used, where X represents the number of luminance (grayscale) blocks encoded inside the macroblock, and Y and Z are the corresponding values for the chrominance (color) components. Maximum quality is obtained with 4:4:4 encoding, in which all the blocks of the macroblock are encoded, both chrominance and luminance; with 4:2:0 encoding, 6 blocks are generated per macroblock (four of luminance and two of chrominance, one per component).

For a macroblock, the encoding depends on:

• The field type: the effect that the prediction produces in that region.

• Depending on the encoding, the motion-compensation effect will depend on the previous or future fields. This prediction is subtracted from the actual data to form the error signal.

• The error signal is divided into 8x8 blocks and the DCT is applied to each block. The result is a two-dimensional 8x8 block of DCT coefficients, which are quantified and scanned in zigzag (see the sketch after this list) to convert them into a one-dimensional coefficient stream.

• It remains to add the side information of the macroblock, including the type, block patterns and encoded motion vectors.
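One way to generate the zigzag order used in the third step (a sketch of the idea, not the table from the standard):

    # Zigzag scan of an 8x8 coefficient block: walk the anti-diagonals in
    # alternating direction, so low-frequency coefficients come first.
    def zigzag_order(n=8):
        order = []
        for s in range(2 * n - 1):        # s = u + v indexes one anti-diagonal
            diag = [(u, s - u) for u in range(n) if 0 <= s - u < n]
            order.extend(diag if s % 2 else diag[::-1])
        return order

    print(zigzag_order()[:6])   # [(0,0), (0,1), (1,0), (2,0), (1,1), (0,2)]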

For more effective compression, the data is encoded with statistical variable-length codes.

As a result of the different field types and the variability of the encoding, the resulting stream has a very variable rate, so on a fixed-rate network it must be buffered in a FIFO that outputs the data according to the time code. The standard provides variable-encoding mechanisms to adapt the rate to the content and to the characteristics of the broadcast line.

Quantification Matrix

The human eye's sensitivity to high frequencies is lower than to low ones, so the lower-frequency coefficients can be quantified with smaller quantification steps and the higher frequencies with bigger ones, optimizing the compression without producing negative effects.

The default weight matrix for the macroblocks of I (intraframe) fields is shown in the chart (see Illustration 2-9, quantification matrices for intra and inter macroblocks).

Intra:

 8 16 19 22 26 27 29 34
16 16 22 24 27 29 34 37
19 22 26 27 29 34 34 38
22 22 26 27 29 34 37 40
22 26 27 29 32 35 40 48
26 27 29 32 35 40 48 58
26 27 29 34 38 46 56 69
27 29 35 38 46 56 69 83

Inter:

16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16

ILLUSTRATION 2-9. QUANTIFICATION MATRICES FOR INTRA AND INTER MACROBLOCKS

This matrix is designed for particular characteristics of distance and size of the objects, so each image may have a matrix that maximizes quality for the same compression. During transmission, the transfer of this table must be taken into account, since the decoder will need to know the matrix for a correct reproduction of the images.

For interframe macroblocks a flat matrix is used, because the high-frequency components of the interframe differences do not correspond to spatial frequencies of the image.



ILLUSTRATION 2-10. DATA DIVISION FROM A VIDEO SEQUENCE DOWN TO A BLOCK

2.3 MPEG 2

The standard defines two types of stream: program and transport (PS and TS) (ISO/IEC, 2000). The program stream is similar to the MPEG1 stream, but uses a modified syntax with more features; it is compatible with MPEG1 files and also uses variable-length packets.

The TS was designed for transmission, so it is equipped for error handling and hardware processing, and may contain several programs in the same stream. To achieve this robustness to errors and allow synchronization, the packet size must be fixed, and was established at 188 bytes. For ATM diffusion, it is segmented into 47-byte units to fit the cells.

The data structure common to TS and PS is the Packetised Elementary Stream (PES) packet. PES packets are generated from compressed video and audio streams, and PES packets with data can also be incorporated.

The TS consists of fixed-length packets that begin with a four-byte header followed by 184 bytes of data, the data being obtained by slicing the PES packets.
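Those four header bytes carry a sync byte (0x47), flags, a 13-bit packet identifier (PID) and a continuity counter. A minimal parsing sketch (the field layout follows the MPEG2 TS header; the example packet is made up):

    # Parse the 4-byte header of a 188-byte MPEG2 transport stream packet.
    def parse_ts_header(packet):
        assert len(packet) == 188 and packet[0] == 0x47   # sync byte
        return {
            'error':   bool(packet[1] & 0x80),            # transport error indicator
            'start':   bool(packet[1] & 0x40),            # payload unit start
            'pid':     ((packet[1] & 0x1F) << 8) | packet[2],  # 13-bit PID
            'counter': packet[3] & 0x0F,                  # continuity counter
            'payload': packet[4:],                        # 184 bytes of sliced PES data
        }

    pkt = bytes([0x47, 0x41, 0x00, 0x17]) + bytes(184)
    print(parse_ts_header(pkt)['pid'])   # 256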

Illustration 2-11 shows how PES packets are grouped to form the transport stream. Each stream starts with a header for synchronization functions, followed by the system header indicating the types of packets and their numbers; then come the PES packets, which can carry different types of content (audio, video or data).


ILLUSTRATION 2-11 MPEG2 TRANSPORT STREAM

The standard also defines the data structure used for managing information. As discussed before, there are two types of format in MPEG2, TS and PS; the first is designed primarily for diffusion and the second for storage. Although files can be created in TS, we will study the structure of PS, as it is the one used for storage and production.

The MPEG-2 Program Stream combines one or more PES packets, of variable and relatively large length, into a single stream. The PES packets that make up the PS are encoded using a reference clock, the System Time Clock (STC). This stream may contain a video stream and its associated audio stream, or a multiplex of many audio streams.

The MPEG2 PS contains up to 32 streams of audio, 16 video streams, 16 streams of data and several streams with a variety of information for internal management. Each stream has its own header. The variable rate of the PS conflicts with the error correction scheme, which requires a constant rate (constant-length PES packets), so it is useful primarily for distribution in an environment free of error and noise, such as a studio.

2.4 MPEG4

The MPEG expert group started the studies to define MPEG4 in 1993; in 1996 the first draft of the standard was introduced, and the work was finished in 1998. The first standard, MPEG4 version 1, was completed in 1999.

The mission of the new standard was to provide the technology to broadcast, store and manipulate multimedia elements of various kinds (graphics, audio, video, etc.) and to create a common method to work with different multimedia formats. The format takes advantage of new coding systems to increase compression efficiency, gives greater robustness for transmission and also provides interactivity for the user, optimizing the transfer of images at lower broadcasting speed rates.

MPEG4 is the first standard whose encoding is based on audiovisual objects. These media objects (audiovisual objects, AVO) can be scenes, sounds recorded by cameras or computer graphics; the combination of these objects builds the scenes, and among these objects data objects can be inserted, multiplexed into the transmission channel or into the storage file itself.

2.4.1 MULTIMEDIA RESTRICTIONS

As seen, MPEG 4 is a standard meant to provide a structure for multimedia work; to perform those functions, the standard has to meet certain requirements:

Interactivity: viewed as a series of features so that the user can:

• Change the image content, that is, edit the sequence. For this function it is very important that the user has direct access to the tables, so that any position in the file can be reached.

• Combine natural images with synthetic images.

Efficient compression: one of the goals of this standard is the ability to transmit video at extremely low rates, down to 7.5 frames per second, covering from video formats at low speeds up to professional applications (10 Kbps to 5 Mbps). As a standard that works with objects, it requires the possibility of transferring the objects in parallel, which means multiplexing the data from those objects into a single bitstream.


Universal access: the standard must be sufficiently robust for users to be able to:

• Receive images on different transmission systems (ADSL, wireless, etc.).

• Receive the desired picture quality, so the standard has to provide scalable coding.

2.4.2 THE STANDARD

The main characteristic of this format is that it works with audiovisual objects: the standard encodes the audiovisual objects separately from each other. In the same way that a human looking at an image detects objects as they are seen, the standard has to detect these objects in the image delivered by the CCD and encode the data of each object so that, when joined together, they recreate the original scene.

Each object is described with information on its texture, shape and motion vector. To obtain these characteristics, the standard provides a range of tools for both natural and synthetic environments: motion estimation, texture and shape coding, error concealment, etc.

Motion estimation is based on the standard block matching algorithm, seen in previous sections. From this prediction there will be a prediction error, from which a motion-compensated block (motion compensation, MC) is generated; the motion vector and the compensation data are broadcast to rebuild the original signal at the destination.
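
A minimal sketch of that block matching step (a brute-force full search over a small window, purely illustrative; real encoders use much faster search strategies):

    import numpy as np

    def best_motion_vector(ref, cur, bx, by, block=16, search=8):
        """Return the (dx, dy) displacement inside a +/-search window that
        minimises the sum of absolute differences (SAD) between a block of
        the current frame and the reference frame."""
        target = cur[by:by + block, bx:bx + block].astype(int)
        best, best_sad = (0, 0), None
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = by + dy, bx + dx
                if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                    continue  # candidate block falls outside the reference frame
                sad = np.abs(target - ref[y:y + block, x:x + block].astype(int)).sum()
                if best_sad is None or sad < best_sad:
                    best, best_sad = (dx, dy), sad
        return best, best_sad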

In the same way that in MPEG2 there are different types of pictures, including intra-type images, in MPEG4 there are intra-type objects. These objects are encoded through the discrete cosine transform (DCT) on blocks of 8x8 pixels (as in MPEG2 and MPEG1); this transform is applied to luminance and chrominance, while motion estimation is applied only to the luminance (as in MPEG2 and MPEG1). The difference with the previous standards lies in the nature of the units on which prediction and motion compensation operate: the objects.

2.4.3 BITSTREAM

The standard defines the bitstream syntax that must be generated by this type of compression for each of the objects and for the description of the scenes. The hierarchy is composed of different layers:

• Video Session (VS).

• Video Object (VO).

• Video Object Layer (VOL) or Texture Object Layer (TOL).

• Group of Video Object Planes (GOV).

• Video Object Plane (VOP).

A typical hierarchy is composed of a video session with one or more visual objects. Each object can have one or more layers, and each of the layers can carry video or textures. The group of video object planes is the same concept as the GOP in MPEG2, and a VOP is one frame of video.

A sequence of visual objects must have one or more objects encoded concurrently. The header of each object must carry information about the type of object it contains and its main features.
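
To make the nesting concrete, here is the hierarchy sketched as plain data classes (the field names are illustrative assumptions, not the standard's syntax elements):

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class VOP:                  # Video Object Plane: one frame of one object
        vop_type: str           # "I", "P" or "B", by analogy with MPEG2 pictures

    @dataclass
    class GOV:                  # Group of VOPs: same concept as a GOP in MPEG2
        vops: List[VOP] = field(default_factory=list)

    @dataclass
    class VOL:                  # Video Object Layer (or TOL for textures)
        kind: str               # "video" or "texture"
        groups: List[GOV] = field(default_factory=list)

    @dataclass
    class VO:                   # Video Object: one audiovisual object of the scene
        layers: List[VOL] = field(default_factory=list)

    @dataclass
    class VS:                   # Video Session: top of the hierarchy
        objects: List[VO] = field(default_factory=list)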

2.5 DV

Originally developed for the industrial environment, it was adopted for professional and domestic applications. The standards that define each of the formats specify not only the way to encode but also the characteristics of the tapes.

2.5.1 ENCODING

Encoding is based on the discrete cosine transform; the fields undergo intraframe compression with a compression ratio of 5:1. The blocks are of 8x8 pixels, requiring 278 blocks per image.

2.6 JPEG 2000

For a long time the characteristics of different transforms have been studied and exploited to compress images. The JPEG2000 standard was created by the Joint Photographic Experts Group committee, the same committee that created the JPEG norm based on the discrete cosine transform. The JPEG2000 algorithm was developed on the basis of the discrete wavelet transform (DWT), whose nature helps to integrate lossy and lossless technologies inside the same platform and to provide different types of progressive encoding and decoding. For certain types of applications the standard has obtained better results than others, as seen in (Boxin S., 2008).

2.6.1 IMAGE PROCESSING

The standard provides for partitioning the image into rectangles, or uniform tiling, for big images; the dimensions of these partitions are usually 256 x 256 or 512 x 512. Making smaller partitions usually produces artifacts or geometric shapes in the image and obtains little efficiency in the compression; on the other hand, when the dimensions are bigger, a longer process is necessary.

The pixel values are entirely positive magnitudes; for a better mathematical process the values are shifted so that their middle value is located at zero.

The compression is done in three parts: discrete wavelet transform (DWT), quantization and entropy encoding.

The DWT divides each component into a number of sub-bands at different levels of resolution. Each sub-band is quantized by a quantization parameter in the case of lossy compression. The quantized sub-bands are divided into a small number of code blocks of identical size, usually 32 x 32 or 64 x 64, and each block is entropy-encoded to produce a compressed stream.
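
A minimal sketch of one decomposition level, using the simple Haar filter pair instead of the standard's 5/3 or 9/7 wavelets, together with the level shift described above (tile dimensions are assumed even):

    import numpy as np

    def level_shift(tile, bits=8):
        """Centre the unsigned sample values on zero (subtract 2**(bits-1))."""
        return tile.astype(float) - 2 ** (bits - 1)

    def haar_pair(x):
        """Filter and subsample the last axis: averages (low-pass) and
        differences (high-pass)."""
        return (x[..., ::2] + x[..., 1::2]) / 2, (x[..., ::2] - x[..., 1::2]) / 2

    def dwt2_one_level(tile):
        """Filter the rows, then the columns, producing the four sub-bands
        LL, HL, LH and HH of one resolution level."""
        lo, hi = haar_pair(tile)                              # horizontal step
        ll, lh = (b.swapaxes(0, 1) for b in haar_pair(lo.swapaxes(0, 1)))
        hl, hh = (b.swapaxes(0, 1) for b in haar_pair(hi.swapaxes(0, 1)))
        return ll, hl, lh, hh

    # Repeating the step on the LL band yields the next resolution level.
    tile = level_shift(np.random.randint(0, 256, (256, 256)))
    ll, hl, lh, hh = dwt2_one_level(tile)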

In Illustration 2-12 it can be seen how the image is decomposed into tiles; after removing the continuous component, so that the middle value is approximately zero, the transformation is applied, in this case in two layers.


ILLUSTRATION 2-12 TILING DECOMPOSITION

The transform process is performed through two filters applied in the vertical direction, as shown in Illustration 2-13.


ILLUSTRATION 2-13 A FILTERING STEP

Through another filtering step, this time in the horizontal direction, a decomposition is obtained as it appears in Illustration 2-14 (second image). Repeated steps of the process end in the decomposition into different levels. In Illustration 2-14 the results of the decomposition into one and two levels can be seen.


ILLUSTRATION 2-14 STEP TO TWO DECOMPOSITION LEVELS

The advantages that the appearance of this standard brought over the previous one are:



• Scalability in compressed data streams.

• Improvement in compression efficiency (40-60% more compression than JPEG at the same quality).

• From a single data stream the image can be extracted with loss or without loss.

• Possibility of cutting zones of the image without adding recompression noise.

• Possibility of improving the associated quality for image regions.

• Improvement of the compression result at low bit rates.

• Possibility of defining regions of interest inside the image (ROI).

• Robustness against bit errors produced by the communications.

2.6.2 JP2 FILE

Annex I of the standard (ISO/IEC, 2000) defines the JP2 file completely.

The file can accommodate up to 2000 different streams. The fundamental part of the JP2 file is the so-called “box”, used to encapsulate the different possible image data; the other parts of the information, such as the properties of the image, rights, etc., are organized as a sequence of boxes. Some are necessary and others not, depending on the compressor. Illustration 2-15 shows the layout of the file, which must have at least four parts or boxes: signature, profile, header and Contiguous Code-Stream.


ILLUSTRATION 2-15 BASIC STRUCTURE OF A JP2 FILE

Some boxes can harbor one or more boxes, as is the case of the header box.

The mandatory boxes are:



• JP2 Signature box: its only mission is to identify the file as a JP2 file; it always goes first and has a length of 12 bytes of fixed content. The type of this box is “jp<space><space>”, the four bytes that indicate the type of file.

• Profile Box: always goes after the signature box. Its TBox identifies it, and it holds the brand mark and a list of compatibilities.

• JP2 Header Box (superbox): contains several boxes; it has an Image Header Box and at least a Color Specification Box.





• Image Header Box: has a fixed length (24 bytes) and contains fixed information about the image: height, width and other components.

• Color Specification Box: specifies the color space (RGB) of the decompressed image.

There can be more types of boxes, such as:



• Bpcc: BitsPerComponent box. Specifies the bit depth of each component of the codestream after decompression.

• Pclr: Palette box. Defines the color palette used to create multiple components from a single one.

• Cdef: Component Definition box. Defines the components of the codestream.

• Res: Resolution box. Specifies the image resolution.

• Contiguous Code-stream box: contains a valid and complete JPEG2000 data stream, a complete image as defined in Annex A of the norm. The four bytes of the box type are “jp2c” and the length is variable.

It is in the Contiguous Code-stream Box where the data relative to the image are kept, together with the values of the different pixels defined previously. The marks and their types are defined in Annex A of the norm (ISO/IEC, 2000). The order in which a box is built is shown in Illustration 2-16; the fields that appear with a dashed line are optional.


The first two fields of the box have the same functionality in every box (see Illustration 2-16).

LBox: identifies the box length. It can carry other types of values apart from the length: if the value is zero, this is the last box of the file; if the value is 1, the length is specified in the optional field XLBox.

TBox: identifies the type of data stored in the box.


ILLUSTRATION 2-16. BOX STRUCTURE

Beyond these two fields, the number and type of the fields depend on each type of box.
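
A minimal sketch of walking that box sequence in memory (big-endian fields; the special LBox values follow the description above):

    import struct

    def iter_boxes(data: bytes, offset: int = 0):
        """Yield (box type, payload) pairs from a JP2 file held in memory."""
        while offset < len(data):
            lbox, tbox = struct.unpack_from(">I4s", data, offset)
            header = 8
            if lbox == 1:        # real length lives in the optional XLBox field
                (length,) = struct.unpack_from(">Q", data, offset + 8)
                header = 16
            elif lbox == 0:      # box extends to the end of the file
                length = len(data) - offset
            else:
                length = lbox
            yield tbox.decode("ascii"), data[offset + header:offset + length]
            offset += length

    # Superboxes such as the JP2 Header Box can be explored by calling
    # iter_boxes again on their payload.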

2.6.3 CONTIGUOUS CODE STREAM BOX

The data are structured by marks and headers, which define different aspects such as the image size, the encoding type, the quantization scale of the image, etc. These data are necessary to obtain a correct reproduction from the data packets.

There are two kinds of headers in the specification: “main”, which is set at the beginning of the box, and “tile-part header”, which is located at the start of each part of the image. According to the standard, each image can be divided into tiles so that each part of the image receives a differentiated treatment. Thus, each Contiguous Code-Stream Box contains the complete image and internally organizes the data in tiles and in the different decomposition layers of the image.

There are marks that can appear in only one of the headers and others that can appear in both. They always start with 0xFF and are followed by a value (up to 0xFE) that indicates the kind of mark; after that come two bytes that indicate the length in bytes of the parameters of the mark.
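
A sketch of scanning those marks (the mark codes SOC, SOT, SOD and EOC are taken from Annex A of the norm; only the framing logic is shown):

    import struct

    # Delimiting marks that carry no parameter segment.
    SOC, SOT, SOD, EOC = 0xFF4F, 0xFF90, 0xFF93, 0xFFD9

    def iter_marks(stream: bytes, offset: int = 0):
        """Yield (mark, parameters) pairs from a codestream header,
        stopping at SOD, where the entropy-coded data begin."""
        while offset < len(stream):
            (mark,) = struct.unpack_from(">H", stream, offset)
            assert mark >> 8 == 0xFF           # every mark starts with 0xFF
            if mark in (SOC, SOD, EOC):        # delimiters: no length field
                yield mark, b""
                offset += 2
                if mark == SOD:
                    return
            else:
                (length,) = struct.unpack_from(">H", stream, offset + 2)
                # the length counts its own two bytes plus the parameters
                yield mark, stream[offset + 4:offset + 2 + length]
                offset += 2 + length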

There are six types of marks:

• Anchors, used to frame headers and data.

• Fixed information about the image.

• Functional, describing the encoding used.

• Bit stream, used for error recovery.

• Pointer, specifying the bit stream position.

• Informative, supplying auxiliary data.


ILLUSTRATION 2-17. CODE STREAM CONSTRUCTION

A detailed relation of each one of the marks can be found in the normative (ISO/IEC, 2000). In Illustration 2-17 it can be seen how the code stream is built: it starts with an SOC header followed by the main header, which contains a series of marks, some mandatory and others optional, that describe the complete image information.

(In the figure, the main header carries the marks SOC, SIZ, COD, COC, QCD, QCC, RGN, POD, PPM, TLM, PLM and CME; each tile part carries SOT, COD, COC, QCD, QCC, RGN, POD, PPT, PLT and CME, followed by SOD and the tile data; the stream closes with EOC.)

After the main header comes the description of each one of the tiles, with the inclusion of the image data.

Inside the stream of the image itself, the standard defines four different ways to order the data. The most used is LRCP (Layer, Resolution, Component, Position): inside each tile the data corresponding to the lowest layer are placed first, and inside that layer the data are ordered by resolution, component and position.
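
The acronym describes a nesting of loops; purely as an illustration:

    def lrcp_order(layers, resolutions, components, positions):
        """Yield packet coordinates in LRCP progression order: for each
        quality layer, every resolution; within each resolution, every
        component and precinct position."""
        for l in range(layers):
            for r in range(resolutions):
                for c in range(components):
                    for p in range(positions):
                        yield (l, r, c, p)

    # The first packets emitted belong entirely to the lowest layer, which is
    # why a decoder can stop early and still reconstruct a coarse image.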


In Illustration 2-18 it can be seen how the stream starts with a main header followed by the header of the first tile and then the tile 1 data; when the tile 1 data finish, the header of tile 2 starts, in case there is more than one tile per frame.

JPEG2000 decoding can generate the image at lower speeds just by trimming the higher layers, without the need for a transcoding process on the image.


ILLUSTRATION 2-18 STREAM OF A TILE

2.7 PRODUCTION FORMATS

The production systems are based on two main compression standards, MPEG2 and DV:

• DVCPRO. In 4:1:1 for NTSC and 4:2:0 for PAL, with DV compression at speeds of 25 Mb/s.

• DVCAM. In 4:2:0 and DV encoding at 25 Mb/s.

• Betacam SX. In 4:2:2 with MPEG2 encoding with an IB frame scheme at 21 Mb/s.

• DVCPRO50. In 4:2:2, based on DV at 50 Mb/s.

• D-9 (Digital S). In 4:2:2, based on DV at 50 Mb/s.

• IMX. In 4:2:2 with MPEG2 at an encoding speed of 50 Mb/s.


These formats have different compression speeds, so when images need to be recorded with higher quality, formats with higher speeds are used.

Each production process tends to use a different compression speed; the two extremes are the images dedicated to viewing tasks and the actual video content to be aired, which may be uncompressed.

2.8 UNCOMPRESSED VIDEO

The Transport and FEC Committee of the High Bit Rate Audio Video over IP group (HBRAV-IP) has specified the speed values for the contribution of audiovisual material over high-speed networks. The specifications concern specifically links of up to 3 Gbps over IP networks over Ethernet. Table 2-1 shows a relation of the contribution speeds. As can be seen, it is divided into two big groups, with quality loss and without loss, with the corresponding values for each screen format.


TABLE 2-1 RELATION OF SPEEDS FOR CONTRIBUTION AT HIGH SPEED (IN Mbps)

                                       With loss                    Without loss
    Format                      Minimum  Typical  Maximum    Minimum  Typical  Maximum
    576i/50, 480i/59.94,
    480i/60                         20       40       80         55      110      165
    720p/50, 720p/59.94,
    720p/60                         90      180      360        280      560      840
    1080i/50, 1080i/59.94,
    1080i/60                       100      200      400        300      600      900
    1080p/50, 1080p/59.94,
    1080p/60                       200      400      800        600     1200     1800
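
For orientation, the uncompressed rates these contribution figures start from can be estimated from the active picture alone (a rough sketch that ignores blanking and audio, and assumes 4:2:2 sampling at 10 bits):

    def raw_rate_mbps(width, height, frames_per_second, bits=10):
        """Active-picture bit rate in Mbps for 4:2:2 video, where each pixel
        carries one luma sample plus half of each chroma component, i.e. two
        samples in total."""
        samples_per_pixel = 2
        return width * height * frames_per_second * samples_per_pixel * bits / 1e6

    # 1080i/50 delivers 25 full frames per second:
    print(raw_rate_mbps(1920, 1080, 25))   # about 1037 Mbps before compression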



CHAPTER 3

3. STORAGE

As seen in previous sections, there are different formats for production, distribution and diffusion, and each of these processes imposes its own restrictions on storage in terms of volume, reliability, response time and bandwidth available at each moment. It is not the same for a user to want to view a one-minute report at 1 Mbps quality as to play back a two-hour movie stored at 50 Mbps for broadcast on TDT.

The production process can be seen as a set of services, as they appear in Illustration 3-1, in which storage appears as a platform common to the whole production process.


ILLUSTRATION 3-1 PRODUCTION PROCESS

This common platform can be decomposed into three storage environments with respect to the production process: acquisition, production and archiving.


3.1 PRODUCTION

The recorded material must be accessible to all users and will contain the material that is currently being worked on.

3.1.1 SPECIFICATIONS

When designing a storage system, it is necessary to take into account several factors to establish the architecture required to support a range of users.

The most basic way to group the needs of a user group is to gather the specifications and the processes in a table that reflects the volume of hours, the qualities, and the number of users who have access at any time.

Table 3-1 shows a configuration example listing the working speeds on storage. One of the processes is the capture or ingest of images into the system; according to this configuration, it is required to