D7.4 (D7F): The ALICANTE Adaptation Framework – Final




D7.4 (D7F): The ALICANTE Adaptation Framework – Final
ALICANTE Consortium 2010-2013


Information and Communication Technologies

MediA Ecosystem Deployment through Ubiquitous
Content-Aware Network Environments

Contract No. 248652
D7.4 (D7F): The ALICANTE Adaptation Framework – Final

Deliverable Identifier: D7.4 (D7F)
Work-package, Task:
WP7, T7.1-T7.4
Status – Version:
Contractual Date:
Submission Date:
Distribution Security*:
Deliverable Type **:
Markus Waltl, Michael Grafl, Christian Timmerer (UNI-KLU)
Markus Waltl, Michael Grafl, Christian Timmerer (UNI-KLU), Stefano
Battista (BSOFT), Yiping Chen, Soraya Ait-Chellouche, Daniel Négru
(CNRS-LaBRI), Alex Chernilov (OPTEC), George Xilouris, Nikolaos
Vorniotakis (DEM), Anne-Lore Mevel, Hervé Durand (TH-VN),
Daniele Renzi, Claudio Alberti (EPFL), António Pinto, Tânia Calçada
(INESC), Jordi Mongay Batalla, Stanisław Janikowski (NIT), Mamadou
Sidibé, Wael Cherif, Enrique Arizon (VIOTECH), Serban Obreja,
Eugen Borcoci (UPB).

Abstract: This deliverable represents the final deliverable of WP-7 and describes the ALICANTE
adaptation framework. The Executive Summary provides the motivation for this framework whereas
Section 1 – Introduction – guides the reader through the remainder of this deliverable.
*PU – Public, PP – Restricted to other programme participants (including the Commission Services), RE – Restricted to a group specified by
the consortium (including the Commission Services), CO – Confidential, only for members of the consortium (including the Commission
Services).
**R – Report, P – Prototype, D – Demonstrator, O – Other.

Keywords: Adaptation, adaptation decision-taking, content-awareness to the network environment, quality of
service, quality of experience, implementation, scalable video coding, dynamic adaptive streaming over
HTTP, evaluation, validation, SVC Tunnelling.
Revision History
Revision | Date | Description | Author (Organisation)
V0.1 | 2012-11-12 | Initial TOC | Markus Waltl, Michael Grafl (UNI-KLU)
V0.2 | 2013-03-25 | Integrated first inputs from NIT, | Michael Grafl (UNI-KLU)
V0.3 | 2013-04-23 | Integrated second inputs from EPFL, | Markus Waltl (UNI-KLU)
V0.4 | 2013-05-02 | Integrated third inputs from EPFL, | Markus Waltl (UNI-KLU)
V0.5 | 2013-05-14 | Integrated fourth inputs from OPTEC | Markus Waltl (UNI-KLU)
V0.6 | 2013-05-27 | Integrated fifth inputs from INESC, | Markus Waltl (UNI-KLU)
V0.8 | 2013-05-28 | Pre-final version | Markus Waltl (UNI-KLU)
V0.9 | 2013-06-10 | Pre-final version with minor | Markus Waltl, Christian Timmerer (UNI-KLU)
V1.0 | 2013-06-30 | Final version | Markus Waltl, Michael Grafl (UNI-KLU)

List and Schedule of milestones
Milestones n° | Milestones name | WPs N°s | Lead beneficiary | Comments
 | The Adaptation Decision-Taking Framework Proof-of-Concept | WP7 | OPTEC | Deliverable D7.4
 | The Adaptation Engine@Home Box Proof-of-Concept | WP7 | BSOFT | Deliverable D7.4
 | The Adaptation Engine@CAN Concept implemented | WP7 | TH-VN | Deliverable D7.4
 | The Quality of Service/Experience evaluations | WP7 | UNI-KLU | Deliverable D7.4
 | The Service/Content Adaptation Subsystem integrated and | WP7 | UNI-KLU | Deliverable D7.4

Executive Summary
The heterogeneity of devices, platforms, and networks is, and most likely will remain, a constant
companion within (future) media (Internet) ecosystems. Thus, we need to provide tools to cope with that
heterogeneity in order to support a maximum of use cases while optimizing (network) resource
utilization and Quality of Experience (QoE). Adapting multimedia content and services to the
context in which they are consumed is an important tool for this purpose, and the adoption of
scalable media coding formats would provide a general solution to the interoperability problem.
In this context, the ALICANTE project proposes a Scalable Video Coding (SVC) Tunnelling and
distributed adaptation framework including edge and in-network media adaptation in order to address
the heterogeneity introduced above. To this end, the video content is encoded in or transformed to SVC,
which enables in-network adaptation during delivery, and finally, if required, transformed to a
suitable target format before being consumed by the user. However, multiple adaptation steps within the
media delivery network raise research questions such as where to adapt, when to adapt, how often to
adapt, and how to adapt. Additionally, the performance of this approach needs to be evaluated,
including aspects related to QoE.
The aim of this deliverable is to provide qualitative and quantitative answers to the questions
highlighted above which are supported by high-quality contributions to scientific conferences and
journals. The major findings of this deliverable can be summarized as follows:
• In general, the adaptation shall always be performed as early as possible in the delivery
network to avoid superfluous transmission of content;
• The adaptation to terminal capabilities (e.g., resolution) and user preferences (e.g., modalities)
shall be done within the user and service environment, respectively. Typically, these
adaptations have higher computational requirements and, thus, shall be done at the edges of
the network. The exact location depends on the actual deployment scenario (e.g., HTTP- vs.
RTP-based streaming);
• Bitrate adaptation may (also) be done within the network environment which allows for in-
network adaptation, thanks to the quality scalability provided by SVC;
• The transformation (mainly transcoding) to/from the scalable media formats is feasible using
existing tools (e.g., for SVC). Re-writing techniques shall be adopted when the source/target
format is compatible with the scalable media format. For example, the base layer of SVC is
compatible with AVC and, thus, re-writing from/to AVC allows for a lossless transformation
when only quality scalability has been used for SVC;
• A large number of scalability layers significantly impacts the quality of the content for
existing codec implementations. Thus, we propose a hybrid SVC approach – specifically for
HTTP-based streaming solutions – which comprises multiple independent SVC-based
adaptation sets (e.g., per device class) where each provides quality scalability only. Our hybrid
SVC approach is aligned with state-of-the-art industry deployments, provides support for re-
writing, allows for in-network adaptation, and still enables the switching across adaptation sets
(e.g., for session mobility between mobile and stationary devices with different resolution
• The actual service planning needs to determine the number of scalability layers, for which
encoding guidelines aligned with state-of-the-art industry best practices are provided;
• The quality of the multimedia content and services can be estimated based on objectively
measurable metrics (identified and monitored within the ALICANTE project). The results
indicate an acceptable relationship to subjective scores validated through subjective quality
assessments. The objectively measured QoE serves as an input to the adaptation framework
enabling the optimization thereof.
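As an illustration of the hybrid SVC approach summarized above, the following minimal Python sketch first selects an adaptation set by device class (here, by display resolution) and then the number of quality layers by available bandwidth. All names, resolutions, and bitrates are hypothetical; they are not taken from the ALICANTE implementation or its encoding guidelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdaptationSet:
    """One independent SVC stream offering quality scalability only."""
    device_class: str
    width: int
    height: int
    # Cumulative bitrate (kbps) needed for layers 0..i, base layer first.
    cumulative_kbps: tuple


# Hypothetical example values, for illustration only.
ADAPTATION_SETS = (
    AdaptationSet("mobile", 640, 360, (400, 700, 1000)),
    AdaptationSet("desktop", 1280, 720, (1500, 2500, 4000)),
)


def pick_adaptation_set(sets, display_width, display_height):
    """Pick the largest set that fits the display, else the smallest set."""
    fitting = [s for s in sets
               if s.width <= display_width and s.height <= display_height]
    if fitting:
        return max(fitting, key=lambda s: s.width * s.height)
    return min(sets, key=lambda s: s.width * s.height)


def pick_layer_count(adaptation_set, available_kbps):
    """Number of quality layers that fit the bandwidth (base layer always kept)."""
    count = 1
    for index, rate in enumerate(adaptation_set.cumulative_kbps):
        if rate <= available_kbps:
            count = index + 1
    return count
```

In this sketch the first decision corresponds to an adaptation at the edge (terminal capabilities), whereas the second one could equally be taken in-network, since it only removes quality layers.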
Table of Contents
Executive Summary
Table of Contents
List of Figures
List of Tables
1 Introduction
2 Context and Related Works
  2.1 Current Limitations, Objectives & Research Challenges
  2.2 Related Works
3 Innovation Area
4 Subsystem Architecture
  4.1 Home-Box Adaptation
    4.1.1 Relation of ADTF with other Home-Box Components
    4.1.2 Processing Engine at the Home-Box
    4.1.3 SVC Encoding and Transcoding
    4.1.4 Integration of SVC with DASH
    4.1.5 Monitoring and QoE Estimation
  4.2 CAN Layer Adaptation
    4.2.1 Unicast SVC Adaptation
    4.2.2 Multicast SVC Adaptation
  4.3 Distributed Adaptation Decision-Taking
5 Implementations (Software Modules)
  5.1 Adaptation Modules at Home-Box
    5.1.1 Adaptation Decision-Taking at Home-Box
    5.1.2 RTP Streaming and Adaptation at Home-Box
    5.1.3 Dynamic Adaptive Streaming over HTTP at Home-Box
    5.1.4 QoS/QoE Monitoring Tool
  5.2 Adaptation Modules at MANE
    5.2.1 Adaptation Decision-Taking at MANE
    5.2.2 Deep-Packet Inspection for SVC
    5.2.3 Unicast Adaptation
    5.2.4 Multicast Adaptation
6 Evaluations and Validation
  6.1 SVC Encoding Guidelines
    6.1.1 Resolution and Bitrate Recommendations
    6.1.2 Evaluation
  6.2 SVC Tunnelling
    6.2.1 Introduction
    6.2.2 Concept
    6.2.3 Evaluation
    6.2.4 Conclusions
  6.3 Evaluations on the In-Network Adaptation
    6.3.1 Evaluation of the In-Network Adaptation for H.264/SVC Streams using MPLS/DiffServ
    6.3.2 Validation of Unicast SVC Adaptation
  6.4 Evaluations on the Quality of Experience
    6.4.1 ALICANTE Pseudo-Subjective Quality Assessment
    6.4.2 Evaluations using MPEG-DASH
  6.5 Demonstrators
    6.5.1 Multicast Demonstrator
    6.5.2 Unicast SVC Adaptation Demonstrator
    6.5.3 MPEG-DASH Demonstrator
7 Conclusion
References
Annexes
  Annex 1: List of Acronyms
  Annex 2: ALICANTE Research Papers and Publications related to WP7 Work

List of Figures
Figure 1. ADTF Subsystem inside ALICANTE Architecture.
Figure 2. Modules of the Adaptation Framework at the Home-Box, based on D7.1.1 [2].
Figure 3. Architecture of the Adaptation Framework at the Home-Box.
Figure 4. Excerpt of a Simplified SVC MPD.
Figure 5. Modules of the Adaptation Framework at the MANE, based on D7.1.1 [2].
Figure 6. ADTF Controller at CAN Manager and Intra-NRM.
Figure 7. Adaptation Decision for VCANs Flows.
Figure 8. Example of VCAN Pipes Distribution along the Traffic Trunks.
Figure 9. The Adaptation Process at the CAN Level.
Figure 10. Adaptation Tool Chain at the Home-Box.
Figure 11. Adaptation Decision-Taking Framework at the Home-Box.
Figure 12. IPTV-to-DASH Subsystem Architecture.
Figure 13. Multi-channel HW IPTV Encoder.
Figure 14. Synchronized Multichannel Streaming.
Figure 15. Representation Segmentation.
Figure 16. Media Parameters Detection Module.
Figure 17. Media Parameters Detection Module in ADTF Architecture.
Figure 18. IPTV-to-DASH Module GUI.
Figure 19. IPTV-to-DASH Module Architecture.
Figure 20. QoE Monitoring Tool Block Diagram.
Figure 21. Queue Structure at the MANE for DiffServ.
Figure 22. GRED VQ Configuration Arguments.
Figure 23. SVC NAL Unit Header with Additional Octets (NALU types: 14, 15, 20).
Figure 24. PACSI Header (NALU types: 30).
Figure 25. NAL Unit Parsing from MANE Software.
Figure 26. SVC Adaptation Module.
Figure 27. VQM Results of Rate Control Modes for Different Encoders for (a) PedestrianArea, (b) Dinner, (c) DucksTakeOff, and (d) CrowdRun Sequences [ALI7].
Figure 28. VQM Results of Rate Control Modes for Different Encoders for PedestrianArea Sequence at (a) 1280x720, (b) 704x576, (c) 960x540, (d) 640x360, (e) 352x288, and (f) 176x144 Resolutions [ALI7].
Figure 29. VQM Results of Spatial Scalability for the VSS Encoder. The Lines Labeled VSS CBR 2 res Represent Single Bitstreams Ranging over both Resolutions (a) 640x360 and (b) 1280x720 [ALI6].
Figure 30. Illustration of SVC Tunneling [ALI3].
Figure 31. Trade-off between Bandwidth Requirements and Quality Loss of SVC Tunneling for (a) foreman, (b) container, (c) hall_monitor, and (d) stefan Sequences.
Figure 32. In-Network Adaptation Experimental Test-Bed.
Figure 33. Unicast SVC Adaptation Test-Bed.
Figure 34. A_PSQA Evaluation (Output) vs. VQM Scores (Target).
Figure 35. MOS Estimation using A_PSQA.
Figure 36. Subjective and Automated MOS Comparison for Sequences at 7 and 30 frames/sec (fps) and for 500 kbps and 8 Mbps [ALI10].
Figure 37. Calculation of Representation Quality Switch (RQS) Events on the MOS.
Figure 38. DASH-Metrics Client Reference Model.
Figure 39. MPEG-DASH Buffer Model for AVC.
Figure 40. Demonstrator using Multicast.
Figure 41. SVC Unicast Adaptation.
Figure 42. Demonstrator using Dynamic Adaptive Streaming over HTTP.

List of Tables
Table 1. SVC Adaptation Table.
Table 2. DiffServ Configuration for 100Mbps Link.
Table 3. AFx Virtual Queues Setup and Drop Probabilities.
Table 4. Bitrate Recommendations for SVC Streaming [ALI7].
Table 5. DiffServ Per-Hop Forwarding Behaviour Setup.
Table 6. Adaptation Thresholds.

1 Introduction

The ALICANTE adaptation framework is responsible for adapting multimedia content and services
to a heterogeneous and dynamically changing context. To this end, multiple
adaptation modules have been deployed within the delivery network to ensure service continuity and
optimize the quality as perceived by the user. This deliverable provides a summary of the efforts and
results achieved by WP-7; it is organized as follows. Section 2 summarizes the work related to WP-7.
Section 3 reviews innovation area 6, entitled “Distributed Framework for Edge and In-Network Media
Adaptation”, with respect to the target research outcomes and scientific results addressed within WP-7.
The architecture of the ALICANTE adaptation framework is described in Section 4. Section 5 provides
the implementation details.
Section 4 particularly highlights the relationship of the adaptation decision-taking framework with
other Home-Box (HB) components. It further includes the following aspects:
• SVC encoding and transcoding aspects within ALICANTE that are addressed by WP-7;
• The Home-Box (HB) adaptation tool chain is introduced comprising the processing engine,
including the General Purpose Transcoder, SVC-to-AVC transcoding, and SVC decoder.
Furthermore, the integration of SVC with DASH is described. Finally, the monitoring
aspects of the HB and the QoE estimation are highlighted;
• The CAN layer adaptation utilizing SVC is defined for both unicast and multicast streams;
• Finally, the distributed adaptation decision-taking is defined.
Section 5 provides implementation details for the software modules identified within the architecture
of the ALICANTE adaptation framework. In particular, it describes the modules for adaptation at
both HB and MANE. The adaptation at the HB includes the following modules:
• Adaptation decision-taking at the HB and its interaction with other HB modules, developed
within other WPs;
• RTP streaming and adaptation at the HB, with implementation details about the actual
SVC chain comprising de-/encoder and de-/packetizer. Furthermore, SVC-to-AVC and
AVC-to-SVC transcoding are provided;
• Dynamic Adaptive Streaming over HTTP (DASH) tools developed in the course of
ALICANTE, including various general DASH modules providing the basis for the IPTV-to-
DASH component and the actual integrated DASH component within the HB.
The adaptation at the MANE includes the following modules:
• Adaptation decision-taking at the MANE, including the adaptation logic and packet
dropping characteristics;
• The deep packet inspection module determines the scalability layer of the SVC packet and
provides this as an input for the adaptation decision-taking module;
• Finally, the modules performing the actual unicast/multicast in-network adaptation of SVC
are described.
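The layer identification performed by the deep packet inspection module can be illustrated with a minimal Python sketch. It assumes the three-octet SVC NAL unit header extension layout of H.264 Annex G (as also reproduced in RFC 6190) for NAL unit types 14 and 20, and a NAL unit that has already been extracted from its RTP payload; aggregation and fragmentation units are not handled. The names SvcLayerId, parse_svc_nal_header, and within_operation_point are hypothetical and do not reflect the actual MANE software.

```python
from typing import NamedTuple, Optional


class SvcLayerId(NamedTuple):
    nal_unit_type: int
    priority_id: int    # PRID
    dependency_id: int  # DID (spatial / coarse-grain layer)
    quality_id: int     # QID (quality layer)
    temporal_id: int    # TID (temporal layer)
    discardable: bool   # D flag


# NAL unit types carrying the SVC header extension (prefix NALU, SVC slice).
SVC_EXTENSION_TYPES = (14, 20)


def parse_svc_nal_header(nalu: bytes) -> Optional[SvcLayerId]:
    """Return the layer identifiers, or None for non-SVC or too-short NAL units."""
    if len(nalu) < 4:
        return None
    nal_unit_type = nalu[0] & 0x1F  # low 5 bits of the first octet
    if nal_unit_type not in SVC_EXTENSION_TYPES:
        return None
    b1, b2, b3 = nalu[1], nalu[2], nalu[3]
    return SvcLayerId(
        nal_unit_type=nal_unit_type,
        priority_id=b1 & 0x3F,           # R(1) I(1) PRID(6)
        dependency_id=(b2 >> 4) & 0x07,  # N(1) DID(3) QID(4)
        quality_id=b2 & 0x0F,
        temporal_id=(b3 >> 5) & 0x07,    # TID(3) U(1) D(1) O(1) RR(2)
        discardable=bool((b3 >> 3) & 0x01),
    )


def within_operation_point(layer: SvcLayerId, max_did: int,
                           max_qid: int, max_tid: int) -> bool:
    """True if the NAL unit belongs to the targeted operation point."""
    return (layer.dependency_id <= max_did
            and layer.quality_id <= max_qid
            and layer.temporal_id <= max_tid)
```

An adaptation decision-taking module could then forward a packet only if within_operation_point returns True for the currently targeted layer combination, while NAL units without the extension (e.g., AVC base layer slices) are passed through unchanged.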
The evaluation and validation of the ALICANTE adaptation framework architecture and its
implementation are described in Section 6. It provides the following evaluation and validation results:
• SVC encoding guidelines aligned with major industry solutions/recommendations, together with a
comparison of our own implementation with existing state-of-the-art implementations (i.e.,
MainConcept, VSS, and JSVM);
• SVC Tunnelling focusing on the trade-off between the quality loss of transcoding and the
bandwidth savings of using SVC;
• The in-network adaptation is evaluated for SVC streams using MPLS/DiffServ and the HW
MANE, respectively;
• The evaluations related to QoE include the ALICANTE Pseudo-Subjective Quality
Assessment (A_PSQA). Additionally, the monitoring for MPEG-DASH compares the
estimated MOS with the subjective MOS and shows how the metrics are signalled and
integrated with the adaptation decision-taking framework;
• Finally, demonstrators for multicast/unicast adaptation and HTTP-based streaming services
are provided.
Lastly, Section 7 concludes the deliverable by summarizing the impact on the industry and the
contributions to the scientific community, and by pointing out future work items.


2 Context and Related Works

2.1 Current Limitations, Objectives & Research Challenges




Deliverables D2.1 [1], D7.1.1 [2], and D7.3 [3] identified several architectural and technical limitations
of content/network media delivery that are relevant to adaptation:
• Service/content creation, publication and dissemination limitations;
• Service/content management, deployment and context-aware adaptation;
• Limitations linked to the User Environment capabilities to provide media service/content
ubiquitous access and context-adapted consumption;
• Architectural limitations due to the best-effort, content-agnostic nature of the Internet and
the lack of cooperation between the transport and higher layers.
Handling these limitations poses a number of research challenges, including end-to-end and
cross-layer service monitoring and context-aware service management. A further important
challenge is to enhance the user experience when using streaming technologies such as those
developed in ALICANTE. This comprises Quality of Experience (QoE) evaluation and context-aware
adaptation of the content based on the location or preferences of the user. To achieve these goals,
three adaptation options have been defined according to where the adaptation takes place:
server-side, in-network, and client-side adaptation [3].
To resolve the mentioned issues and research challenges w.r.t. adaptation, the ALICANTE project
specified in D2.1 [1] a number of objectives such as the design and implementation of an advanced
functional architecture comprising two novel virtual layers (i.e., Content-Aware Network (CAN) layer
and Home-Box layer). This allows the coordination of adaptation decisions over the network.
To this end, a Distributed Adaptation Decision-Taking Framework (ADTF) was developed in this
project, which uses network information and information provided by the two virtual layers to
derive a suitable adaptation decision. Using this information, the content can be adapted in-network
to reduce the network load and, additionally, at the server and client to provide the most suitable
representation (i.e., resolution, bitrate) for the end-user terminal. This information also makes it
possible to provide the highest possible QoE to the end-user.


2.2 Related Works

A vast number of research and development projects address issues similar to ALICANTE.
Deliverables D2.2 [4] and D7.3 [3] provide an overview of these projects. For example, there is the
ENTHRONE project [5] that provides an end-to-end QoS architecture for improving the media
quality but does not take QoE into account. Another is the ENVISION project [6] that provides a
cross-layer adaptation solution which is media aware, topology aware, locality aware and QoS aware
but, similar to ENTHRONE, does not provide QoE-awareness at the User Environment or deeper
network-awareness at the Service Environment. For additional projects and a more detailed
description of the ENVISION, ENTHRONE, and other projects the reader is referred to D2.2 [4] and
D7.3 [3].
As the adaptation decision-taking process is a very complex task, there are a number of available
technologies that can help in achieving good results w.r.t. streaming and video quality. For example,
there is MPEG-21 Digital Item Adaptation (DIA) [7] that provides tools for describing the User
Environment (i.e., preferences, terminal capabilities, or network capabilities), limitation constraints
(e.g., minimum resolution or maximum packet loss), and the relationship between constraints. In
Deliverable D2.2 [4], a number of different tools beside MPEG-21 DIA are described in detail.
In the following, further related work beyond that described above is presented. We briefly highlight
related work in the following areas: Scalable Video Coding (SVC) performance, SVC transcoding and
adaptation, as well as SVC-based Dynamic Adaptive Streaming over HTTP (DASH).
On the topic of SVC performance, Wien et al. [8] and Schwarz et al. [9] provide performance
evaluations of several encoding configurations, targeting spatial and quality scalability (coarse grain
scalability (CGS) and medium grain scalability (MGS)). Their evaluations indicate a 10% bitrate
overhead of SVC compared to H.264/AVC. Guidelines for testing conditions of the Joint Scalable
Video Model (JSVM) for the development of SVC by the Joint Video Team (JVT) are documented in
[10]. Based on the reported 10% bitrate overhead of SVC, a subjective performance evaluation [11]
investigated the subjective quality of SVC. The subjective quality ratings confirm the 10% bitrate
overhead of SVC. A broader range of SVC settings is assessed in [12] (via subjective and objective
tests), including a test for the best extraction path. Both subjective and objective results show that
the lowest resolution (quarter common intermediate format (QCIF)) and the lowest frame rate (7.5 fps)
of the test should be avoided. Lambert et al. [13] discuss the deployment of SVC with spatial scalability
for IPTV. Among others, they point out the options for dealing with different aspect ratios at
individual resolutions. Most performance evaluations use different test sequences and do not provide
exact encoding configurations. These circumstances make it very hard to compare results between
studies. Although [11] tries to use realistic bitrates, none of the studies performed evaluations based on
bitrates used in actual industry solutions.
Transcoding to SVC can be accomplished either by generic pixel domain transcoding (PDT) [14][15],
i.e., full decoding and then re-encoding to SVC, or by advanced transcoding in the transform domain.
Transform domain transcoding (TDT) uses coded video data from the source format and translates it
to the target format (i.e., SVC) without the need for full decoding [14][16]. This requires a special
transcoder between any pair of source and target formats [17][18]. A special case of TDT is bitstream
rewriting, which converts the video from one format to another without any quality loss. Bitstream
rewriting is only possible if both video formats use the same bitstream syntax and coding techniques,
which is the case for AVC and SVC. De Cock et al. have developed a technique for AVC-to-SVC
bitstream rewriting in [19], [20], and [21]. While a variety of transform domain transcoders from
different formats to AVC exist (e.g., from MPEG-2 [16][18][22]), for SVC only transcoding and
rewriting techniques from AVC as the source format have been researched so far.
Similarly, transcoding from SVC is possible through PDT or TDT (or even bitstream rewriting). The
SVC base layer is backward-compatible with AVC. The full SVC bitstream can be converted to AVC
through lossless bitstream rewriting [23][24][25]. Note that SVC-to-AVC bitstream rewriting is only
applicable to SNR scalability [25]. Different techniques for SVC-to-AVC transcoding supporting
spatial scalability were proposed in [26] and [27]. The SVC-to-AVC rewriter (or transcoder) can be
followed by another TDT from AVC to the target format (e.g., AVC-to-MPEG-2 TDT [28][29]).
Technical challenges of SVC adaptation are discussed in [30]. The application of SVC for IPTV and
corresponding adaptation are discussed in [31]. Design options for SVC in-network adaptation are
discussed in [32]. Kofler et al. [33] have demonstrated SVC adaptation on off-the-shelf routers. Nur et
al. [34][35] have proposed an SVC adaptation technique based on a utility function of SVC layers. The
utility function ranks SVC extraction points by weighting spatial, temporal, and quality layers. The
weights are based on a model devised from subjective evaluations for videos classified by motion
intensity and structural features. During RTP/UDP-based streaming, packet loss significantly
influences the resulting video quality [36]. Even with good error concealment at the SVC decoder
[37], it is often better to switch to a lower SVC layer to improve the QoE [38]. Unequal error
protection can be deployed for improving the robustness of SVC streaming [39].
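To illustrate the idea of ranking SVC extraction points by a weighted utility, the following minimal Python sketch scores each extraction point by its normalized spatial, temporal, and quality levels and selects the best one that fits the available bandwidth. The linear form, the weights, and all names are assumptions for illustration only; they are not the model of Nur et al. [34][35], which is derived from subjective evaluations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionPoint:
    spatial: int         # spatial layer index
    temporal: int        # temporal layer index
    quality: int         # quality layer index
    bitrate_kbps: float  # bitrate of the extracted sub-stream


def utility(point, weights, max_levels):
    """Weighted sum of the normalized layer indices (hypothetical model)."""
    levels = (point.spatial, point.temporal, point.quality)
    return sum(w * (level / top if top else 0.0)
               for w, level, top in zip(weights, levels, max_levels))


def best_extraction_point(points, weights, available_kbps):
    """Highest-utility extraction point within the bandwidth, or None."""
    feasible = [p for p in points if p.bitrate_kbps <= available_kbps]
    if not feasible:
        return None
    max_levels = (max(p.spatial for p in points),
                  max(p.temporal for p in points),
                  max(p.quality for p in points))
    return max(feasible, key=lambda p: utility(p, weights, max_levels))
```

With weights that favour the temporal dimension (as one might assume for high-motion content), the same set of extraction points yields a different choice than with weights that favour the spatial dimension, which is the effect a content-dependent weighting aims at.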
One way to reduce the number of subjective ratings for a QoE model is the Pseudo-Subjective Quality
Assessment (PSQA) [40]. The deployment of PSQA for SVC is evaluated in [41]. Based on the PSQA
model for SVC, Ksentini and Hadjadj-Aoul [42] have developed an adaptation technique. Variations
of the mobile channel condition cause QoE fluctuations and an unpleasant user experience. To
tackle this issue, the authors introduced a QoE-based adaptive decoding algorithm, which selects,
based on the measured packet loss, the SVC layers to be decoded and displayed to the end-users in
order to increase the QoE.
While packet loss is a major adaptation concern of RTP/UDP-based streaming, TCP-based
transmission ensures the arrival of packets for HTTP streaming. Thus, the adaptation in HTTP
streaming has to avoid playback stalls due to late arrival of video segments [ALI1]. Hoßfeld et al. [43]
have researched the trade-off in terms of QoE between initial delay of HTTP streaming services and
stalling during playback. Viewers clearly prefer initial delay over stalling. In adaptive HTTP
streaming another impact factor comes into play: flickering due to switches between representations.
Ni et al. [44] have evaluated the impact of flickering on the video acceptance by the viewer on mobile
devices. Their results show that frequent noise flickering between two SNR representations with a
period below 2 seconds impairs the viewing quality down to a point where viewers would prefer the
lower video representation altogether. Sieber et al. [ALI2] have proposed an SVC adaptation logic that
reduces the number of quality switches by striving for a stable buffer level before increasing the
number of consumed SVC layers.
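The interplay of the findings above can be illustrated with a minimal Python sketch of a client-side layer selector: it switches down immediately when the throughput no longer sustains the current layers, and switches up by one layer only when the buffer is considered stable and at least 2 seconds have passed since the last switch. This is a simplified illustration and not the adaptation logic of Sieber et al. [ALI2]; the buffer threshold, the class name, and the method names are hypothetical.

```python
from dataclasses import dataclass

MIN_SWITCH_INTERVAL_S = 2.0  # flickering below ~2 s impairs quality [44]
STABLE_BUFFER_S = 10.0       # hypothetical threshold for a "stable" buffer


@dataclass
class LayerSelector:
    # Cumulative bitrate (kbps) needed for layers 0..i, base layer first.
    cumulative_kbps: tuple
    layers: int = 1
    last_switch_s: float = float("-inf")

    def _affordable(self, throughput_kbps):
        """Number of layers the measured throughput can sustain (at least 1)."""
        count = 1
        for index, rate in enumerate(self.cumulative_kbps):
            if rate <= throughput_kbps:
                count = index + 1
        return count

    def update(self, now_s, throughput_kbps, buffer_s):
        """Return the number of SVC layers to request for the next segment."""
        affordable = self._affordable(throughput_kbps)
        if affordable < self.layers:
            # Down-switch immediately to avoid playback stalls.
            self.layers = affordable
            self.last_switch_s = now_s
        elif (affordable > self.layers
              and buffer_s >= STABLE_BUFFER_S
              and now_s - self.last_switch_s >= MIN_SWITCH_INTERVAL_S):
            # Up-switch one layer at a time, and only with a stable buffer.
            self.layers += 1
            self.last_switch_s = now_s
        return self.layers
```

Separating the immediate down-switch from the damped up-switch keeps stalling unlikely while limiting the number of visible quality switches.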

3 Innovation Area

In the context of WP-7, we have identified an innovation area referred to as “Distributed Framework
for Edge and In-Network Media Adaptation” [ALI3], [ALI18]. The focus of this work is the
development of an SVC Tunnelling approach featuring edge and in-network media adaptation for
which research challenges are highlighted in the following. Additionally, answers to the research
challenges including relevant publications and references to the appropriate sections within this
deliverable are given.
Distributed adaptation decision-taking framework:
• Where to adapt? At the content source, within the network (with multiple options), the
receiving device, and combinations thereof:
o In the media ecosystem architecture proposed by ALICANTE, distributed adaptation
of SVC streams is realized by adaptation at network edges and in-network adaptation
at MANEs [ALI4]. Adaptation shall always be performed as early as possible in the
delivery path to avoid superfluous transmission of content. On the other hand,
terminal capabilities or user preferences should not be propagated to the
content-aware network environment but should instead be handled within the user and
service environments, respectively (cf. Sections 5.1 and 5.2);
• When to adapt? At request and during the delivery enabling dynamic, adaptive streaming
based on the users’ context:
o At the content request phase, the combination of SVC layers has to be decided based
on terminal capabilities and user preferences. Whether the decision is performed at the
client or server depends on the intended infrastructure scalability, the business model,
and deployment scenario rather than on the supported adaptation operations. During
streaming, dynamic bitrate adaptation towards network conditions is best performed
within the network [ALI20]. At the client side, support of heterogeneous terminals is
achieved through SVC Tunnelling [ALI3] (cf. Section 6.2), relying on a smart home
gateway such as the Home-Box;
• How often to adapt? Too often may increase the risk of flickering whereas too seldom may
result in stalling, both having a considerable impact on the QoE:
o Based on available literature, we suggest that the interval between two representation
switches should be at least 2 seconds [44]. Nevertheless, viewers prefer multiple small
quality changes over a single, large switch [45]. In case of network congestion, the
adaptation should always be performed immediately; only up-switching to a higher
representation should be scheduled accordingly to avoid flickering. We have proposed
a new concept, called representation switch smoothing, for further reducing the
annoyance of quality switches [ALI5]. For RTP streaming, in-network adaptation
avoids packet loss – a lost RTP packet can cause one or more SVC NALUs to be
discarded, resulting in distortion and error propagation (cf. Section 6.3.1). For HTTP
streaming, the goal of adaptation is to prevent playback stalling (cf. Section 6.4.2) –
initial delay to fill the client's buffer is generally better tolerated by viewers than any
stalling event, no matter how small [43];
• How to adapt? The optimization towards bitrate, resolution, framerate, signal-to-noise ratio
(SNR), modality, accessibility, region-of-interest (ROI), etc., results in (too) many
possibilities and often depends on the actual content, genre, and application:
o Within the network, bitrate-based adaptation shall be deployed. While this is a simple
and efficient strategy, some studies also have proposed in-network adaptation based
on an on-the-fly QoE estimation [42]. However, this will require a careful
configuration of input parameters to the QoE estimation algorithm. Client-side
adaptation best focuses on resolution and video coding format of the terminal's media
player (cf. Sections 5.1 and 5.2). As no industry streaming solution documents frame
rate adaptation [ALI6], we are sceptical about its acceptance in real-life streaming
systems. In general, we propose a hybrid approach which provides multiple SVC
versions at different spatial resolutions and consider quality scalability the most
suitable scalability dimension of SVC in adaptive streaming scenarios (cf. [ALI7],
Section 6.1.2).
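The switching guidance given under "How often to adapt?" can be sketched as follows. This is a minimal illustration, not the ALICANTE adaptation logic: the class name SwitchScheduler and its fields are hypothetical, the 2-second constant is the lower bound suggested above based on [44], and stepping up one representation at a time is an assumption motivated by the preference for several small changes over one large switch [45].

```python
from dataclasses import dataclass

# Lower bound between two up-switches, as suggested in the text based on [44].
MIN_UP_SWITCH_INTERVAL_S = 2.0


@dataclass
class SwitchScheduler:
    """Hypothetical helper deciding when a representation switch takes effect."""
    current: int = 0                        # index of the representation in use
    last_switch_time: float = float("-inf")  # time of the last switch, in seconds

    def request(self, target: int, now: float) -> int:
        """Return the representation index to use at time `now` (seconds)."""
        if target < self.current:
            # Congestion case: switch down immediately.
            self.current = target
            self.last_switch_time = now
        elif target > self.current:
            # Up-switch only after the minimum interval, one step at a time
            # (assumption: small steps are less annoying than one large jump).
            if now - self.last_switch_time >= MIN_UP_SWITCH_INTERVAL_S:
                self.current += 1
                self.last_switch_time = now
        return self.current
```

With this sketch, a down-switch requested at any time is applied at once, while a request to move up two representations is spread over at least two intervals.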
Efficient, scalable SVC Tunnelling and signalling thereof:
• Minimum quality degradation and scalability w.r.t. the number of parallel sessions and
acceptable (end-to-end) delay:
o SVC Tunnelling provides device-independent media access through transcoding. We
have evaluated the trade-off of SVC Tunnelling in terms of bandwidth efficiency and
video quality. Our findings enable advanced control of the quality impact of SVC
Tunnelling for transcoding from (and back to) MPEG-2 as detailed in [ALI3], [ALI8],
[ALI9], and Section 6.2. For example, around 2.5 dB PSNR loss has to be taken into
account for full SVC Tunnelling with the bSoft encoder in order to be more
bandwidth efficient than MPEG-2 simulcast of the source material;
o Assuming full pixel-domain transcoding from SVC to MPEG-2 at the HB, the SVC
decoder typically requires an entire GOP in order to start the decoding process.
Similarly, the MPEG-2 encoder (or any other encoder towards the target video coding
format) needs at least one GOP for encoding. Evaluations on (implementation-
specific) end-to-end delays, as well as results on transcoding speed (and thus the
number of supported parallel sessions), will be reported in the upcoming Deliverable
D8.3 [46];
o Using a relatively large number of SVC layers would lead to significant quality
degradations [ALI1] and, thus, we propose a hybrid SVC approach for DASH-based
streaming services with multiple adaptation sets (e.g., per device class) and quality
scalability within those adaptation sets [ALI7] (cf. Section 6.1).
The impact on the QoS/QoE:
• The QoS/QoE trade-off for the use cases and applications in question:
o The impact of adaptation on the Quality of Service/Experience has been studied in
ALICANTE in order to find the best trade-off between QoS- and QoE-driven
adaptation for the examined use cases (cf. Section 4.3 of D2.2 [4]). The concept of
QoE has recently gained increasing attention, especially in the domain of Future
Internet, where innovative applications and services need to gain wide acceptance and
QoE objectives have to be met. We believe that basing the adaptation on QoE
information gives added value over simply using objective parameters, such as bitrate,
client resolution, etc., as discussed above. However, the preparation and execution of
subjective tests is costly and time consuming. For this reason, the subjective quality
perceived by the user has to be linked to the objective, measurable quality, which is
expressed in application and network performance parameters resulting in QoE. For
this purpose, the ALICANTE project has developed new QoE estimation algorithms
(cf. Sections 7.2.1 and 7.2.2 of D3.2 [47] and Section 3.1.2 of D3F [48]). Such
algorithms are based on the combination of the meaningful QoS parameters and the
introduction of explicit dependencies with user expectations and satisfaction, in order
to give a good approximation of the QoE in an automated and inexpensive way (cf.
Section and;
• Possible mappings of QoS to QoE:
o The above mentioned algorithms provide a mapping between objective QoS
parameters and the QoE. A quantitative description of the subjective impact of QoS
variations on QoE is reported in [49] (Figure 2), where the QoE, as a function of QoS
disturbance, is split into three areas:
▪ Area 1: constant optimal QoE, where for a vanishing QoS disturbance, the
user considers the QoE equivalent to that of the reference;
▪ Area 2: sinking QoE, after a first threshold, the quasi-optimal QoE level
cannot be maintained anymore;
▪ Area 3: unacceptable QoE, after a second threshold the outcome of the
transmission might become unacceptably bad in quality, or the service might
stop working because of technical constraints such as timeouts.
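As a minimal sketch of the three-area split described above, the following function classifies a QoS disturbance value. The function name and both thresholds are hypothetical placeholders; in practice the thresholds would have to be derived from measurements for the service in question.

```python
def qoe_area(disturbance: float, first_threshold: float, second_threshold: float) -> int:
    """Return 1, 2 or 3 for the area a QoS disturbance falls into.

    Area 1: constant optimal QoE (disturbance up to the first threshold).
    Area 2: sinking QoE (between the first and the second threshold).
    Area 3: unacceptable QoE (beyond the second threshold).
    """
    if not 0 <= first_threshold <= second_threshold:
        raise ValueError("thresholds must satisfy 0 <= first <= second")
    if disturbance <= first_threshold:
        return 1
    if disturbance <= second_threshold:
        return 2
    return 3
```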
For the elaboration of the automated mapping algorithms, previous publications, such as [50] and
[51], have been taken into consideration and elaborated further. The main purpose was to cover
scenarios and frameworks which were only partially covered in existing works, such as
the MPEG-DASH framework (cf. Section D3F [48]). The idea behind these algorithms is to measure
quantifiable parameters (QoS), in order to estimate the impact on the user (QoE). Based on the
service classes’ definitions given in D5F [52], different weights are assigned to the different
quantifiable parameters, and then QoE is calculated using a series of equations, such as the
exponential model described in [ALI10]. Another interesting approach is the one of the PSQA
model [41][42], which is based on random neural networks and self-learning processes. In
ALICANTE, an evolution of such a model, named A_PSQA, has been developed [ALI11][ALI12]
(cf. Section 6.4.1).
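To illustrate the weighting step, the sketch below feeds weighted QoS disturbances into a generic exponential mapping. This is an assumed form for illustration only: it is not the set of equations from [ALI10], and the function name, the weights, and the 1 to 5 score range are hypothetical.

```python
import math


def estimate_qoe(disturbances, weights, qoe_min=1.0, qoe_max=5.0):
    """Map weighted, non-negative QoS disturbances to a QoE score.

    Assumed generic form: QoE = qoe_min + (qoe_max - qoe_min) * exp(-sum(w_i * x_i)).
    A zero disturbance yields qoe_max; growing disturbance decays towards qoe_min.
    """
    if len(disturbances) != len(weights):
        raise ValueError("one weight per disturbance is required")
    if any(x < 0 for x in disturbances) or any(w < 0 for w in weights):
        raise ValueError("disturbances and weights must be non-negative")
    load = sum(w * x for w, x in zip(weights, disturbances))
    return qoe_min + (qoe_max - qoe_min) * math.exp(-load)
```

Different service classes would simply use different weight vectors for the same measured parameters.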



4 S

Figure 1 depicts the architecture of the Adaptation Framework (AF). Adaptation-specific locations are
highlighted. In the ALICANTE system, there are four dedicated locations where adaptation
components are located. These locations are: at the Service Provider/Content Provider (SP/CP), at the
HB, at the MANE, and between the content-aware network manager (CANMgr) and the intra-domain
network resource manager (Intra-NRM). In the following sections, each key component and its
functionality is described.

Figure 1. ADTF Subsystem inside ALICANTE Architecture.




Adaptation at the Home-Box comprises a number of components which together form the Adaptation
Framework (AF) and the Adaptation Decision-Taking Framework (ADTF) at the Home-Box. Figure 2
illustrates the modules of the complete AF at the Home-Box. The Adaptation Framework at the Home-
Box (AF@HB) has the goal to perform adaptation in order to customize a Media Service to the
capabilities of the terminal and the context and preferences of the End-User. Depending on the type of
Media Service and the delivery/consumption scenario, several adaptation steps may have been already
applied to the content.

Figure 2. Modules of the Adaptation Framework at the Home-Box, based on D7.1.1 [2].
In the following sections, the key building blocks of the AF @ HB are briefly described. Detailed
implementation information of the different components can be found in Section 5.

4.1.1 Relation of ADTF with other Home-Box Components
The ALICANTE Home-Box layer is part of the Service Environment, which takes into account
information delivered upward by the CAN layer and enforces Network-Aware Applications
procedures. The Home-Box layer also takes into account the user dynamic context delivered by the
terminal QoE Monitoring Manager in order to support the user Context-Aware Applications. To
support the functionalities of the Home-Box, a Middleware layer, a Service layer and a User
Management layer are introduced.
The Home-Box Middleware layer consists of a set of software modules that provide necessary
functionalities to the Home-Box Service, such as resources, connectivity, adaptation, monitoring and
security management.
The functional architecture of the Adaptation Framework at the Home-Box, as well as its internal and
external interfaces are outlined in Figure 3. Note that the figure only shows modules relevant to the
adaptation decision-taking and adaptation. Further information on the functional architecture of the
Home-Box and the User Profile Management can be found in Section 7.4 (Home-Box Subsystem)
and Section 7.2.3 (User Profile Management) of Deliverable D2.1 [1], respectively.

Figure 3. Architecture of the Adaptation Framework at the Home-Box.

4.1.2 Processing Engine at the Home-Box
The Processing Engine at the Home-Box has the goal to adapt content to a suitable format for the end-
user terminal. The main objectives of the Processing Engine are [54]:
• To inject SVC content into the CAN layer; this also assists users in User Generated Content
(UGC) scenarios;
• To guarantee device-independent access to the SVC content distributed over the CAN layer,
by providing SVC-to-X transcoding capabilities at the User Environment side;
• To enhance the adaptation granularity at the User Environment side, by adapting the content
previously adapted by the MANEs to the specific user context (e.g., either adding new
scalable sub-streams or dropping useless layers, when the terminals have unsuitable
capabilities in terms of screen resolution, access network, etc.);
• To perform adaptation at SP side, especially in scenarios such as UGC and VoD, where the
unicast approach makes the adaptation in SP/CP servers more advantageous than in MANEs.
These objectives are reached using the SVC-to-AVC transcoder, the General Purpose Transcoder and
the Adaptation Decision-Taking Engine (ADTE) which are described in Section 5.1.2. When
available, the transcoder may use hardware acceleration capabilities of the Home-Box platform in
order to deal with the intense computing requirements of video transcoding algorithms.

4.1.3 SVC Encoding and Transcoding
Scalable Video Coding (SVC) is the scalable extension of Advanced Video Coding (AVC) known as
ITU-T H.264 and ISO/IEC MPEG-4 Part 10 [53]. SVC has been selected as the video content format
for multimedia delivery in the ALICANTE system. The reason for this is that it provides an intrinsic
mechanism for having multiple representations of the video content at different quality levels with
minimal bandwidth overhead, especially when compared to simulcast.
SVC uses the same bitstream structure as AVC, with individual pictures encoded as Access Units
(AU), which are in turn organized in Network Abstraction Layer Units (NALU).
The SVC representation levels differ in one of the three scalability dimensions supported by SVC:
• Temporal Scalability: where increasing layers correspond to an increase in temporal
resolution, i.e., higher frame rate, e.g., with layers corresponding to 15, 30, and 60 frames per second;
• Spatial Scalability: where increasing layers correspond to an increase in spatial resolution, i.e.,
higher number of horizontal and vertical pixels, e.g., with layers corresponding to 176x144
(QCIF), 352x288 (CIF), and 704x576 (Standard Resolution TV);
• SNR (Quality) Scalability: where increasing layers correspond to an increase in quality level,
i.e., lower coding distortion for the same temporal and spatial resolution; in Coarse Grain
Scalability (CGS), quality is increased by lowering the Quantization Parameter between layers,
while in Medium Grain Scalability (MGS), quality is increased by encoding transform
coefficients of higher frequency partitions.
While ALICANTE uses SVC as the coding format in the core network, several coding formats
need to be supported at the server and client side for interoperability with existing content and services.
Thus, at the Service Provider/Content Provider (SP/CP) side there is a need to transcode from several
coding formats including AVC into the chosen format for delivery (i.e., SVC); this operation is
referred to as X-to-SVC transcoding (from a generic coding format into SVC).
Conversely, at the User side, more specifically at the Home-Box which is the main gateway to the User
environment, there is a requirement to transcode from the delivery format SVC into the more
convenient coding format for the specific terminal device; this is referred to as SVC-to-X transcoding
(from SVC to a specific format).
Given the current deployment of AVC as coding format in the market (in IP streaming, in Digital TV
broadcasting, in video-communication, etc.), the most significant transcoding cases, on which
ALICANTE concentrated its efforts, are AVC-to-SVC (Server side) and SVC-to-AVC (Client side).
Since the base layer of SVC coding is essentially an AVC-conformant bitstream in any of the
scalable modes (Temporal, Spatial, Quality), it is possible to encode and decode the multimedia
content while minimizing the bitrate overhead required to support scalability, with respect to the
simple non-scalable AVC bitrate.

4.1.4 Integration of SVC with DASH
One of the main objectives of Dynamic Adaptive Streaming over HTTP (DASH) is to address the
problem of bandwidth variations through a dynamically adaptive framework. The adopted approach is
simple but effective: instead of having one single media encoded as a unique representation in terms of
bitrate, spatial resolution, framerate, etc., the same media is encoded at several representations. Such
multiple representations are then split into segments that can be individually requested by the client
through HTTP requests. This enables the client to switch between different qualities, resolutions, etc.
during the streaming session, according to both bandwidth variations and user environment
requirements, e.g., terminal screen size. Moreover, the clients can be served through ordinary
Web servers, which lets the system scale very well.
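The client-side choice between representations can be sketched with a common throughput-based heuristic. This is an illustrative assumption and not the adaptation logic evaluated in ALICANTE; the Representation fields and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    rep_id: str
    bandwidth_bps: int
    width: int
    height: int


def select_representation(reps, throughput_bps, screen_width, screen_height):
    """Pick the highest-bitrate representation that fits screen and throughput.

    Falls back to the lowest bitrate if nothing is affordable, and ignores the
    screen constraint if no representation fits the screen at all.
    """
    fitting = [r for r in reps if r.width <= screen_width and r.height <= screen_height]
    candidates = fitting or list(reps)
    affordable = [r for r in candidates if r.bandwidth_bps <= throughput_bps]
    if affordable:
        return max(affordable, key=lambda r: r.bandwidth_bps)
    return min(candidates, key=lambda r: r.bandwidth_bps)
```

The client would call this before each segment request, using its latest throughput estimate.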
Typically, AVC [53] has been used to generate multiple qualities of the media for DASH. However,
its scalable extension [9], namely Scalable Video Coding, seems to be very promising in these aspects
and can potentially bring some major advantages due to its layered architecture, which enhances the
flexibility of the segment selection. This means that, in comparison to AVC, it is possible to cancel a
segment request at the layer boundaries. That advantage can simplify the adaptation process, as the
client would always be able to download the highest quality and, in case of insufficient bandwidth,
could cancel the request at the segment boundaries. This is not possible with AVC because when the
client cancels a segment, it cannot use the video data of that segment anymore.
Müller et al. [ALI1] have evaluated an SVC-based solution for an MPEG-DASH framework. In
particular, a pre-existing solution based on AVC has been compared to the ALICANTE solution
based on SVC, and both solutions have then been compared with the major industry solutions (i.e.,
Microsoft Smooth Streaming, Adobe HTTP Streaming, and Apple HTTP Live Streaming). Based on
conclusions of Kofler et al. [55] which demonstrated that the ISO Base Media File Format
(ISOBMFF) in combination with SVC is ineffective for adaptive HTTP streaming with streams lower
than 1 Mbps, only elementary streams have been evaluated. The layer dependency in the Media
Presentation Description (MPD) has been described as depicted in Figure 4, which shows a simplified
MPD containing one base layer with a bandwidth of 451231 bps and an enhancement layer with a
cumulative bandwidth of 550737 bps that depends on the base layer. The segments in Figure 4
correspond to 2 seconds of video data, which can be obtained through byte range requests; e.g., the
base layer segment starts from byte 141 and ends at byte 74012 and comprises multiple base layer
Network Abstraction Layer (NAL) units. When the client selects the Representation (SVC layer) with
550737 bps, it would initially download the corresponding segment of the Representation which this
Representation depends on, that is the 451231 bps Representation. This download scheme would
produce a bitstream that is not valid for the decoder, because the decoder would receive a run
of base layer NAL units followed by a run of enhancement layer NAL units and would therefore
lose all dependencies (cf. Section 3.5.4 in D7.3 [3]). As a consequence, the NAL units have to be
reordered. In ALICANTE, such a reordering is performed in the Home-Box by the Processing
Engine. The Processing Engine parses the downloaded bitstream, identifies the layer information of
each NAL unit, and relocates the enhancement layer NAL units of each frame to appear directly after
the corresponding NAL units of the base layer of that frame. The tools for NAL unit reordering at
server and client sides have been made available at https://sourceforge.net/p/svc-demux-mux/.
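The reordering step can be sketched as below. This is not the code of the published tools: it assumes the bitstream has already been parsed into NAL units tagged with a frame index and a layer number, and the names NalUnit and reorder_nal_units are hypothetical.

```python
from typing import NamedTuple


class NalUnit(NamedTuple):
    frame: int      # index of the frame (access unit) the NAL unit belongs to
    layer: int      # 0 = base layer, 1.. = enhancement layers
    payload: bytes  # raw NAL unit bytes


def reorder_nal_units(downloaded):
    """Interleave layers per frame.

    Input order is all base layer NAL units followed by all enhancement layer
    NAL units (as obtained by downloading the segments one after the other).
    Output order places, for each frame, the enhancement layer NAL units
    directly after the base layer NAL units of that frame. Python's sort is
    stable, so the relative order inside one (frame, layer) group is kept.
    """
    return sorted(downloaded, key=lambda nal: (nal.frame, nal.layer))
```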

Figure 4. Excerpt of a Simplified SVC MPD.

The layered architecture of SVC allows using a more aggressive buffer model compared to AVC. In
comparison to the AVC experiment, SVC achieves better bandwidth utilization with a quite stable
buffer. Additionally, it also reacts very accurately to bandwidth variations and it recovers very quickly
from low quality levels when the available bandwidth increases.

4.1.5 Monitoring and QoE Estimation
ALICANTE has developed and integrated tools for monitoring the QoS parameters of the multimedia
sessions traversing the content distribution network. The tools derive from a single architecture and
are adapted according to the specific usage scenario. The main differences in the monitoring part of
the system are related to the type of protocols used in the content delivery. In this context, we use the
term "monitoring" strictly for the data analysis functions. Conversely, the main commonalities
are in the monitoring of the intrinsic media parameters, i.e., in terms of the specific data that can be
retrieved from the analysis of the audio and video streams.
The QoS Monitoring tools are thus differentiated for different scenarios, such as RTP streaming over UDP
in case of both unicast and multicast operation, and DASH streaming over HTTP, as introduced in the
previous Section 2.2. More specifically, in the case of RTP streaming the QoS Monitoring can
be performed either by emulating the behaviour of a fully fledged multimedia client or, in a less invasive
way, by extracting the multimedia sessions from the analysis of the packets flowing through a specific
network node or interface. In the case of DASH streaming, only the former setup is feasible, since the
session is initiated and driven by the data requests coming from the specific multimedia client.
The following diagram depicts the two possible setups:
[Diagram labels: QoS Monitor (RTP Sniffer), QoS Player (RTP Analyser), QoS Player (DASH Analyser)]

Concerning the data analysis implemented in ALICANTE, the three main sets of data extracted for
the estimation of the QoS can be classified as follows.
Network Parameters:
In case of RTP streaming:
• Session bitrate;
• Rebuffering events;
• Packet loss;
• Packet jitter.
In case of DASH streaming (for more details, see Deliverable D3F [48]):
• Re-buffering event frequency;
• Re-buffering event average duration;
• Representation quality switching rate.

Video Parameters:
• Video bitrate;
• Video format;
• Video frame loss (differentiated in Intra, Predicted, Bidirectional pictures);
• Video noise;
• Coding parameter (notably quantization parameter).

Audio Parameters:
• Audio bitrate;
• Audio format;
• Audio frame loss;
• Audio noise;
• Coding parameters (depending on the coding format).
The overall QoS evaluation of the video and audio quality is thus performed as a non-linear
combination of the network and media parameters:

(1)
where $P_v$ is the video parameter, $k_v$ is a scaling constant, and $e_v$ is a weighting exponent;

(2)
where $P_a$ is the audio parameter, $k_a$ is a scaling constant, and $e_a$ is a weighting exponent.



Similar to the Home-Box, the AF at the CAN layer consists of a number of tools. These tools are
briefly described in the following, and a detailed description of each module is given in Section 5.
Figure 5 illustrates the adaptation chain at the CAN Layer. The SVC adaptation in the CAN layer is
done by the MANE.

Figure 5. Modules of the Adaptation Framework at the MANE, based on D7.1.1 [2].

4.2.1 Unicast SVC Adaptation
The unicast adaptation is usually performed at the AF@SP/CP, so that the content is adapted at its
source and does not overload the network with packets that would later be dropped in the MANE.
However, in case of network congestion, some unicast SVC Adaptation needs to be applied to remove
some of the enhancement layers of the SVC stream; for this purpose, the Adaptation Engine inside the
MANE filters the RTP packets belonging to the enhancement layers and updates the RTP sequence
numbers.
For Adaptation Decision, information is exchanged between Adaptation Manager and Adaptation
Engine. This is done through shared memory (via read/write operations) with modifications of the
SVC Adaptation Table.
The Adaptation Manager will modify the SVC Adaptation Table if network congestion occurs. In such
a case, the fields “SVC Adaptation Spatial Layer”, “SVC Adaptation Temporal Layer” and “SVC
Adaptation Quality Layer” will be updated for all flows that require adaptation. The SVC Adaptation
Table (depicted in Table 1) is stored in shared memory and can be accessed by the Adaptation
Manager and the Adaptation Engine.
Table 1. SVC Adaptation Table.

To perform an adaptation, the Adaptation Engine:
1. Extracts the NAL units from the SVC stream and gets three identifiers from the NAL unit
header extension: TID (temporal_id, which indicates the hierarchy between temporal layers
for temporal scalability), DID (dependency_id, which denotes the inter-layer coding
dependency hierarchy between higher/lower scalable enhancement layers for spatial
scalability) and QID (quality_id, which designates the quality level hierarchy);
2. Compares the three identifiers to the values provided by the Adaptation Manager;
3. Filters out the RTP packets whose three identifiers match these values;
4. Updates the RTP sequence number for the next unfiltered RTP packets of the same flow.
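The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the MANE implementation: each RTP packet is assumed to carry exactly one NAL unit (no aggregation or fragmentation), the adaptation table is reduced to a set of (DID, QID, TID) triples to filter, and the names RtpPacket, parse_svc_ids and adapt_flow are hypothetical. The bit positions follow the SVC NAL unit header extension of H.264 Annex G (NAL unit types 14 and 20).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RtpPacket:
    seq: int    # 16-bit RTP sequence number
    nal: bytes  # payload: one NAL unit, starting with its header


def parse_svc_ids(nal: bytes):
    """Step 1: return (DID, QID, TID), or None if there is no SVC extension."""
    if not nal:
        raise ValueError("empty NAL unit")
    if (nal[0] & 0x1F) not in (14, 20):  # only these types carry the extension
        return None
    if len(nal) < 4:
        raise ValueError("truncated SVC NAL unit header")
    did = (nal[2] >> 4) & 0x07  # dependency_id
    qid = nal[2] & 0x0F         # quality_id
    tid = (nal[3] >> 5) & 0x07  # temporal_id
    return did, qid, tid


def adapt_flow(packets, filtered_layers):
    """Steps 2 to 4: drop matching packets and renumber the remaining ones."""
    out = []
    removed = 0
    for pkt in packets:
        ids = parse_svc_ids(pkt.nal)
        if ids is not None and ids in filtered_layers:
            removed += 1  # packet is filtered out
            continue
        # Close the gap left by filtered packets (sequence numbers wrap at 2^16).
        out.append(replace(pkt, seq=(pkt.seq - removed) & 0xFFFF))
    return out
```

Renumbering keeps the sequence numbers of the forwarded flow contiguous, so the receiver does not interpret the filtered packets as losses.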

4.2.2 Multicast SVC Adaptation
In order to leave/join higher layer multicast sessions, the MANE exposes a SOAP interface. This
interface, represented in D7.2.1 [54], Section (Leave/Join Higher Layer Multicast Session) has
primitives to allow multicast and QoS resources to be allocated in a MANE. Multicast trees are
characterized by a set of global parameters, and a set of “bridge ports”.
The global parameters include the CATI and Flow Identification parameters with additional SVC
Information that allows the MANE to match packets also by their SVC properties: the NAL unit
header (including its SVC extension) is a 32-bit field that contains all layer information, i.e., base
layer and all types of enhancement layers can be identified by this single value. Thus, it becomes
possible for multicast trees to include certain layers and drop other layers, thereby realizing one
form of in-network adaptation.
Each BridgePort element indicates an action to be applied for each incoming packet, such as
transmitting the packet over an outgoing intra-domain interface as IP multicast, transmitting it to a
MANE in another domain, or transmitting it as unicast to a set of HBs to seed the P2P swarm.
The multicast primitives supported by the MANE allow installation and removal of a multicast tree
(for layered SVC, we need one tree for each layer), as well as adding, removing, or modifying bridge
ports at runtime. For more information on this interface, see D6.4.1 [56], Section (Intra-NRM/Multicast
Bridge Interface).




The distributed adaptation decision-taking is performed with the help of the CAN Manager (CANMgr)
and the Intra-domain network resource manager (Intra-NRM) (see [56]).
The ADTF Controller (ADTF-Ctrl) @ CANMgr has the following functionalities:
• To create long-term adaptation policies based on aggregated monitoring and Service Level
Specifications (SLS) from the CANMgr;
• To signal the adaptation policies to the ADTFs @ MANEs to configure their adaptation logics;
• To adjust the adaptation policies if necessary.
Thus, the ADTF-Ctrl @ CANMgr realizes the concept of distributed adaptation decision-taking. Note
that the ADTF-Ctrl @ CANMgr is not directly involved in per-flow adaptation decisions. These
individual per-flow adaptation decisions are taken by the respective ADTFs @ MANEs, which react to
immediate network congestion. Through the adaptation policies, the ADTF-Ctrl @ CANMgr
provides a long-term coordination of all adaptation decisions in a VCAN. This ensures that the
network resource utilization stays within specified thresholds.
At the CANMgr, information about the current network is stored in the CAN DB. This information is
provided to the ADTF-Ctrl at the CAN manager via the CAN Operation and Maintenance module.
The provided information is sent via SOAP messages to the ADTF-Ctrl at the Intra-NRM, which
provides all necessary parameters to the ADTE @ MANE. At the Intra-NRM, no additional
processing is performed. The ADTF @ Intra-NRM mainly forwards messages to the lower
layer (i.e., MANE) or upper layer (i.e., CANMgr). Note that the CANMgr stores information about all
registered MANEs in a domain. Thus, the CANMgr provides a list of registered MANEs to the
ADTF-Ctrl, which informs all registered ADTEs @ MANEs that there was a change, e.g., in available
bandwidth, and that these updated parameters should be taken into account for the next adaptation
decision. As ALICANTE provides distributed adaptation decision-taking, the ADTE @ MANE
forwards decisions and monitored parameters (i.e., how many layers are forwarded for which
flow/aggregation, how many packet losses occurred, etc.) to the Intra-NRM and CANMgr. This
information is afterwards distributed to all registered MANEs and their ADTEs for performing suitable
adaptation decisions and the adaptation itself. Figure 6 illustrates the communication between the
CANMgr, Intra-NRM and the ADTE @ MANE.
The architecture also foresees an inter-domain interface between ADTF-Ctrls @ CANMgrs to enable
communication for coordination between different domains. Our findings suggest that most of such
inter-domain coordination can be alleviated by proper negotiation of Service Level Agreements
(SLA). That is, properly configured distributed adaptation decision-taking within a VCAN is typically
sufficient to ensure that the agreed thresholds for that VCAN are met. However, further improvements of
inter-domain adaptation coordination are an interesting topic of future work.

Figure 6. ADTF Controller at CAN Manager and Intra-NRM.
The CAN Operation and Maintenance (OM) module has a process which catches the trigger events sent
by the DB when the Monitoring process updates the measurement data. It analyses the measurement data
and decides whether it is necessary to inform the ADTF-Ctrl module that it has to start the adaptation
process. The decision to trigger the adaptation process is taken based on the following parameters: the
bandwidth threshold for the VCAN and the available bandwidth threshold for the Class of Service
(CoS) to which the VCAN belongs. For both VCAN bandwidth and available bandwidth thresholds,
there are two values: a minimum value and a maximum value. The decision to start the adaptation
process to reduce layers is taken when the maximum value thresholds are crossed, and the decision to
increase layers is taken when the parameters fall below the minimum value thresholds.
These parameters are stored in the database and can be configured based on domain policy. How the
adaptation decision is taken is illustrated in Figure 7.
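The two-threshold trigger described above can be sketched as a simple hysteresis check. This is a minimal illustration expressed in terms of occupied bandwidth; the names Action and adaptation_trigger are hypothetical and do not come from the ALICANTE implementation.

```python
from enum import Enum


class Action(Enum):
    REDUCE_LAYERS = "reduce"
    INCREASE_LAYERS = "increase"
    NONE = "none"


def adaptation_trigger(occupied_bw: float, min_thresh: float, max_thresh: float) -> Action:
    """Decide whether the adaptation process has to be started.

    Above the maximum threshold layers are reduced, below the minimum
    threshold layers may be added again, and in between nothing happens.
    The gap between the two thresholds is what damps oscillation.
    """
    if min_thresh > max_thresh:
        raise ValueError("min_thresh must not exceed max_thresh")
    if occupied_bw > max_thresh:
        return Action.REDUCE_LAYERS
    if occupied_bw < min_thresh:
        return Action.INCREASE_LAYERS
    return Action.NONE
```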

Figure 7. Adaptation Decision for VCANs Flows.
In Figure 7, the bandwidth and the occupancy for a given CoS are illustrated. All VCANs illustrated in
Figure 7 belong to that class of service. The amount of bandwidth allocated for each Class of Service
is known at the CAN level. This is a value which is specific to each traffic trunk in the virtual
topology. Because a VCAN can have multiple pipes, with different bandwidth values, as depicted in
Figure 7, the VCAN bandwidth refers at the bandwidth occupied by the VCAN's pipes which are
using the traffic trunk under consideration. For the VCANs which allow adaptation, two values are
specified for each VCAN pipe: committed information rate (CIR) and peak information rate (PIR).
The bandwidth specified by CIR is guaranteed by the network, while the PIR bandwidth is offered by
the network based on the resource availability. To choose the threshold values for a VCAN on a
particular traffic trunk, the total CIR and total PIR bandwidth values are computed from the CIR and
PIR values of the VCAN's pipes that use the considered traffic trunk. The minimum and
maximum bandwidth thresholds for the VCAN are then selected based on these totals.
The minimum bandwidth threshold for the VCAN is equal to the total CIR
bandwidth, and the maximum bandwidth threshold lies between the total CIR and total PIR values;
its exact value is determined by the number of layers that are allowed to be dropped for the pipes'
flows. In Figure 7, VCANs highlighted in red occupy more bandwidth than the
maximum VCAN bandwidth threshold, while VCANs highlighted in green
occupy less bandwidth than the minimum bandwidth threshold. In Figure 7 a), VCANs 2 and 4 are
allowed to cross their thresholds because the total occupied bandwidth for the CoS-A Class of Service
is under the max_thresh threshold. In Figure 7 b), the occupied bandwidth rises above max_thresh,
so it is decided to adapt the flows belonging to VCANs 2 and 4. Figure 7 c)
illustrates the bandwidth occupancy for CoS-A after the adaptation has been applied to the selected
VCANs. If the CoS-A occupied bandwidth drops below the min_thresh value, the adaptation
decision may check whether layers can be added again for all the VCANs' flows. The decision to add
layers to one VCAN's flows is also taken if the VCAN bandwidth drops below the
min_vcan_bw_thresh value. Dropping below this threshold means that some VCAN flows have ended
and a portion of the VCAN's bandwidth has been released. The adaptation decision-taking for these
flows can then be stopped because, if the min/max thresholds are properly designed, oscillation of the
adaptation process is avoided.
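The two-level threshold check described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the names `Thresholds` and `decide`, the unit (Mbit/s), and all numeric values are assumptions made for the example. It only encodes the rule stated in the text: layers are reduced when both the CoS and the VCAN exceed their maximum thresholds, and layers may be added again when either falls below its minimum threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Minimum/maximum threshold pair in Mbit/s (illustrative unit)."""
    minimum: float
    maximum: float


def decide(cos_bw: float, cos_thr: Thresholds,
           vcan_bw: float, vcan_thr: Thresholds) -> str:
    """Return 'reduce', 'increase', or 'hold' for the flows of one VCAN."""
    # A VCAN may exceed its own maximum as long as the CoS has headroom
    # (Figure 7 a); layers are dropped only once the CoS maximum is also
    # crossed (Figure 7 b).
    if cos_bw > cos_thr.maximum and vcan_bw > vcan_thr.maximum:
        return "reduce"
    # Layers may be added again when the CoS or the VCAN itself falls
    # below its minimum threshold.
    if cos_bw < cos_thr.minimum or vcan_bw < vcan_thr.minimum:
        return "increase"
    # Between the thresholds nothing changes; this gap is what damps
    # oscillation of the adaptation process.
    return "hold"


cos = Thresholds(minimum=40.0, maximum=80.0)    # hypothetical CoS-A values
vcan = Thresholds(minimum=5.0, maximum=10.0)    # hypothetical VCAN values
print(decide(70.0, cos, 12.0, vcan))  # hold
print(decide(85.0, cos, 12.0, vcan))  # reduce
print(decide(30.0, cos, 8.0, vcan))   # increase
```

The gap between the minimum and maximum values acts as a hysteresis band: the wider it is, the less likely a flow is to alternate between dropping and adding layers.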






[Figure 7 panel labels: "A bandwidth"; "Adaptation triggered for VCAN4"; "VCAN2 adapted"; "VCAN4 adapted"; "VCAN2 & VCAN4 adapted"; "Vcan flows"]
In Figure 8, the distribution of VCAN pipes along the traffic trunks between MANEs is illustrated. It
is assumed that the VCANs in the figure belong to the same Class of Service, so they logically
belong to the same traffic trunk (a virtual link between two MANEs that carries traffic of the
same Class of Service). Thus, for the virtual traffic trunk between MANE1 and MANE2, the
VCAN1 bandwidth referred to in Figure 7 is:
VCAN1_bw = VCAN1:pipe2_bw + VCAN1:pipe3_bw
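The per-trunk aggregation above can be illustrated with a short sketch. The pipe table, the trunk identifiers, and the bandwidth values below are invented for the example; only the rule itself (sum the bandwidth of those pipes of a VCAN that use the trunk under consideration) comes from the text.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Trunk = Tuple[str, str]

# (vcan, pipe, trunk, occupied bandwidth in Mbit/s); all values are made up.
PIPES: List[Tuple[str, str, Trunk, float]] = [
    ("VCAN1", "pipe1", ("MANE1", "MANE3"), 4.0),
    ("VCAN1", "pipe2", ("MANE1", "MANE2"), 6.0),
    ("VCAN1", "pipe3", ("MANE1", "MANE2"), 3.0),
    ("VCAN2", "pipe1", ("MANE1", "MANE2"), 5.0),
]


def vcan_bw_on_trunk(pipes: List[Tuple[str, str, Trunk, float]],
                     trunk: Trunk) -> Dict[str, float]:
    """Sum, per VCAN, the bandwidth of the pipes that use the given trunk."""
    totals: Dict[str, float] = defaultdict(float)
    for vcan, _pipe, pipe_trunk, bw in pipes:
        if pipe_trunk == trunk:
            totals[vcan] += bw
    return dict(totals)


print(vcan_bw_on_trunk(PIPES, ("MANE1", "MANE2")))
# {'VCAN1': 9.0, 'VCAN2': 5.0}
```

Here VCAN1's pipe1 is ignored for the MANE1-MANE2 trunk because it runs over a different trunk, which mirrors the VCAN1_bw expression above.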

Figure 8. Example of VCAN Pipes Distribution along the Traffic Trunks.
The following operations are defined between the CAN Operation and Maintenance module and the ADTF-Ctrl:
ListMANEs(vcan_id, list_of_manes)

• Method used to inform the ADTF-Ctrl about the list of MANE nodes associated with a VCAN;
AdaptVCAN(vcan_id, VCAN's parameters needed for the adaptation process)

• Method used to ask the ADTF-Ctrl to adapt the VCAN's flows;
• The VCAN parameters are: source IP, source port, destination IP, destination port, min_bitrate,
min_aggr_bandwidth, max_aggr_bandwidth.
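To make the shape of these two operations concrete, the following sketch models them as a Python interface. It is only an assumption about how the operations could be typed: the transport, the parameter types, the units, and the class names `VcanAdaptationParams`, `AdtfCtrl`, and `RecordingAdtfCtrl` are hypothetical and are not taken from the ALICANTE software.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class VcanAdaptationParams:
    """Parameters listed for AdaptVCAN; types and units are assumed."""
    source_ip: str
    source_port: int
    destination_ip: str
    destination_port: int
    min_bitrate: int          # assumed unit: bit/s
    min_aggr_bandwidth: int   # assumed unit: bit/s
    max_aggr_bandwidth: int   # assumed unit: bit/s


class AdtfCtrl(Protocol):
    """Operations the CAN Operation and Maintenance module may invoke."""

    def list_manes(self, vcan_id: str, list_of_manes: list[str]) -> None: ...

    def adapt_vcan(self, vcan_id: str, params: VcanAdaptationParams) -> None: ...


class RecordingAdtfCtrl:
    """In-memory stand-in that only records the calls it receives."""

    def __init__(self) -> None:
        self.manes: dict[str, list[str]] = {}
        self.requests: list[tuple[str, VcanAdaptationParams]] = []

    def list_manes(self, vcan_id: str, list_of_manes: list[str]) -> None:
        self.manes[vcan_id] = list(list_of_manes)

    def adapt_vcan(self, vcan_id: str, params: VcanAdaptationParams) -> None:
        self.requests.append((vcan_id, params))


ctrl: AdtfCtrl = RecordingAdtfCtrl()
ctrl.list_manes("vcan-1", ["MANE1", "MANE2"])
ctrl.adapt_vcan("vcan-1", VcanAdaptationParams(
    "10.0.0.1", 5004, "10.0.1.7", 5004, 500_000, 2_000_000, 6_000_000))
```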














Figure 9. The Adaptation Process at the CAN Level.
In Figure 9, the adaptation process and the modules involved are shown. The adaptation process is
triggered based on the analysis of monitoring data. The adaptation commands are carried by the
ADTF-Ctrl modules at CAN and IntraNRM towards the ADTE modules located at the MANE nodes. The
loop is closed by the Monitoring Module (MON), which gathers measurement data from the
MANE nodes.

More details on the CAN layer and network environment can be found in Deliverable D6F [87].


5 Implementation
In this section, descriptions of the implementation of the different software modules are given.


5.1.1 Adaptation Decision-Taking at Home-Box
For performing an adaptation at the Home-Box, a number of different tools are available. The
adaptation tools comprise the Adaptation Decision-Taking Framework (ADTF), the SVC-to-AVC
transcoder, and the General Purpose Transcoder (GPT). Together, these tools form the complete tool
chain presented in Figure 10.
The ADTF consists of the Adaptation Manager (Adaptation-Mgr) and the Adaptation Decision-Taking
Engine (ADTE). The Adaptation-Mgr is responsible for retrieving information from the User Profile
so that the adaptation decision suits the user preferences. Furthermore, the User
Profile contains information from the Home-Box monitoring, which is used for detecting network
issues. Additionally, the Adaptation-Mgr itself performs some monitoring, such as measuring the
current bandwidth and packet loss. All these pieces of information are used to update the descriptions
for the ADTE (i.e., Adaptation QoS (AQoS), Usage Environment Description (UED), and Universal
Constraints Description (UCD)). The ADTE then performs the decision-taking. The results from the
ADTE are used to configure the SVC-to-AVC transcoder (e.g., drop enhancement layers) and the
GPT (e.g., transcode to a different format and codec).
In case of MPEG-DASH, the receiving module configures the HTTP server (i.e., streamer) for
providing the transcoded content as DASH content. For further information, see Section 5.1.3.3.

Figure 10. Adaptation Tool Chain at the Home-Box.
Figure 11 gives a detailed overview of the decision on the content type by the tool chain. First, the
Home-Box receives through a number of receivers (e.g., DASHProxy, RTPClient) the video stream
which needs to be presented to the user. Due to the variety of different end-user devices, the ADTF
checks if the end-user device can handle SVC by consulting the User Profile (see D3.1.1 [57]). If the
device supports SVC, the stream is directly sent to the streamer module which connects the Home-Box
with the end-user device. The figure also shows two different servers, which are used for content
server adaptation (CSA) and seamless handover (SH) between servers, as detailed in Deliverable
D5F [52].
If the client is not able to render SVC, the stream is processed by the SVC-to-AVC
transcoder, which outputs an AVC stream. For this transcoding, the ADTF provides
decisions on the resolution, bitrate, and fps that can be achieved by dropping enhancement layers.
Afterwards, the AVC stream is directed to the GPT. Depending on the formats supported by the
end-user device, the GPT either simply redirects the AVC stream to the streamer, transcodes it again to a
suitable format (i.e., resolution, bitrate, fps) not achievable by dropping enhancement layers, or even
transcodes the AVC stream to another format (e.g., MPEG-2). The resulting output is provided to the
end-user device in the same way as the SVC stream.
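The routing through the tool chain described above can be summarized in a small sketch. The capability fields, the step names, and the rule used to decide whether dropping enhancement layers is sufficient (here: at least one SVC layer fits the display) are assumptions chosen for illustration; the actual decision is taken by the ADTE from the AQoS, UED, and UCD descriptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DeviceCapabilities:
    """Hypothetical subset of the User Profile used for routing."""
    supports_svc: bool
    codecs: Tuple[str, ...]   # e.g. ("avc",) or ("mpeg2",)
    max_width: int
    max_height: int


def plan_tool_chain(device: DeviceCapabilities,
                    layer_resolutions: List[Tuple[int, int]]) -> List[str]:
    """Return the ordered processing steps for one stream."""
    if device.supports_svc:
        # SVC-capable devices receive the stream directly.
        return ["streamer"]
    steps = ["svc_to_avc"]
    # Can a suitable resolution be reached just by dropping layers?
    fits = any(w <= device.max_width and h <= device.max_height
               for w, h in layer_resolutions)
    if "avc" not in device.codecs:
        steps.append("gpt:transcode_codec")    # e.g. to MPEG-2
    elif not fits:
        steps.append("gpt:transcode_format")   # resolution/bitrate/fps
    else:
        steps.append("gpt:passthrough")        # redirect AVC unchanged
    steps.append("streamer")
    return steps


layers = [(640, 360), (1280, 720), (1920, 1080)]
tablet = DeviceCapabilities(False, ("avc",), 1280, 720)
print(plan_tool_chain(tablet, layers))
# ['svc_to_avc', 'gpt:passthrough', 'streamer']
```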

Figure 11. Adaptation Decision-Taking Framework at the Home-Box.
Interfaces with the User Profile Manager
The User Profile Manager (UPM) located at the Home-Box is in charge of gathering context
information from different user terminals and of maintaining up-to-date data useful for content adaptation.
The UPM offers Web Service interfaces which are accessible internally by other HB
modules (e.g., Adaptation Manager @ HB) and by external modules. The complete list of available
web services, such as GetUserContext and UpdateUserContext, is detailed in Deliverable D3F [48]. For the
communication between the UPM and the Adaptation Manager, the GetUserContext method is used by the
Adaptation Manager to retrieve information such as AQoS, UED, and UCD from the UPM database.
In practice, the GetUserContext method has two parameters: SessionRef, the ID reference of a given
session, and Name, the name of the context parameter (e.g., “VideoQoE”, “CPUload”,
“VideoRtpPacketDelay”). The UPM then returns the value of the requested context parameter.
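The lookup semantics of GetUserContext can be illustrated with an in-memory stand-in. This is not the UPM Web Service: the class `InMemoryUserProfile`, the Python method names, the parameter list of the update method, and the stored values are assumptions for the example; the real interfaces are specified in Deliverable D3F [48].

```python
from typing import Any, Dict, Tuple


class InMemoryUserProfile:
    """Hypothetical stand-in for the UPM, keyed by (SessionRef, Name)."""

    def __init__(self) -> None:
        self._context: Dict[Tuple[str, str], Any] = {}

    def update_user_context(self, session_ref: str, name: str, value: Any) -> None:
        # Assumed signature; the text names UpdateUserContext only.
        self._context[(session_ref, name)] = value

    def get_user_context(self, session_ref: str, name: str) -> Any:
        # Mirrors GetUserContext(SessionRef, Name): returns the value of
        # the requested context parameter for the given session.
        return self._context[(session_ref, name)]


upm = InMemoryUserProfile()
upm.update_user_context("session-42", "CPUload", 0.35)
upm.update_user_context("session-42", "VideoRtpPacketDelay", 18)
print(upm.get_user_context("session-42", "CPUload"))  # 0.35
```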
The validation and evaluation of the UPM consists of two parts:
1. The functional analysis which ensures that all adaptation scenarios can be supported by the
proposed context model and context management systems;
2. The performance measurement that evaluates the subsystem in terms of query response time.

For the functional analysis, the list of key parameters for adaptation is included in the current context
model. Moreover, the model is flexible and extensible, i.e., new parameters can easily be added if
they are needed for some adaptation scenarios.
In terms of performance, we need to ensure that the use of a complex model (XML or ontology) does
not degrade the query response time. According to the benchmark study in [58], the XML database Sedna
offers the best performance among the XML databases compared. Therefore, we use this database to
store the User Profile. The average response time is between 10 and 100 ms, which is close to that of a
traditional SQL database and acceptable for the targeted adaptation scenarios. More details on the User Profile
Management can be found in Deliverable D3F [48].
Adaptation Logic
The Adaptation Decision-Taking Engine (ADTE) at the Home-Box uses context and monitoring
information to dynamically determine the best content representation. The adaptation logic applies
several (limitation and optimization) constraints on the parameter space to find the best decision
candidate [59]. The limitation constraints related to the spatial resolution are given in Equations ( 3 )
and ( 4 ):
$hres_{min} \le hres_{sel\,layer} \le hres_{display}$    ( 3 )
$vres_{min} \le vres_{sel\,layer} \le vres_{display}$    ( 4 )
The variables $hres_{sel\,layer}$ and $vres_{sel\,layer}$ represent the horizontal and vertical resolution of the selected
SVC layer, respectively. The horizontal display resolution $hres_{display}$ and the vertical display resolution
$vres_{display}$ form the upper bounds for the content resolution. The minimum guarantees for the
horizontal resolution $hres_{min}$ and the vertical resolution $vres_{min}$ specified in the Service Level
Agreement (SLA) form the lower bounds for the content resolution.
The limitation constraint related to the media bitrate is given in Equation ( 5 ):
$bw_{min} \le br_{sel\,layer} \le s \cdot C$    ( 5 )
The variable $br_{sel\,layer}$ represents the bitrate of the selected SVC layer. The link capacity is denoted
$C$. Multiplied with the maximum bandwidth share $s$ of the stream, it forms the upper bound for
the content bitrate. The minimum guaranteed bandwidth $bw_{min}$ specified in the SLA forms the lower
bound for the content bitrate.
The bandwidth shares $s$ are divided equally between all streams currently being handled by the HB
(including cross traffic). The weighting of bandwidth shares can be refined by traffic classification,
with each class having a different priority. Within the ALICANTE framework, the Service Priority
can be signalled in the Content-Aware Transport Information (CATI) as discussed in Deliverable
D5.1.1 [60]; the distribution of bandwidth among those classes is described in the SLA. The
calculation of the bandwidth share $s$ for a stream in traffic class $i$ is given in Equation ( 6 ):
$s = \frac{w_i}{n_i}$    ( 6 )
Let $w_i$ be the weight assigned to traffic class $i$ based on its Service Priority, such that
$\sum_i w_i = 1$. Let $n_i$ be the number of streams in traffic class $i$.
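As a worked illustration of the bandwidth share, the sketch below assumes that the weight of a traffic class is split evenly among the streams of that class and that the result scales the link capacity to give the upper bound on the content bitrate. The class names, weights, stream counts, and link capacity are invented; the exact form of the share computation is an assumption and should be checked against [59].

```python
from typing import Dict


def bandwidth_share(weights: Dict[str, float],
                    stream_counts: Dict[str, int],
                    traffic_class: str) -> float:
    """Share of the link for one stream of the given traffic class.

    Assumes the class weights sum to 1 and that each class weight is
    divided equally among the streams currently in that class.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("class weights must sum to 1")
    n = stream_counts.get(traffic_class, 0)
    if n <= 0:
        raise ValueError("traffic class has no active streams")
    return weights[traffic_class] / n


weights = {"premium": 0.5, "standard": 0.3, "best_effort": 0.2}  # made up
counts = {"premium": 2, "standard": 3, "best_effort": 5}         # made up
link_capacity = 100.0  # Mbit/s, hypothetical

share = bandwidth_share(weights, counts, "premium")
print(share)                  # 0.25
print(share * link_capacity)  # 25.0 (upper bound on the stream bitrate)
```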
In order to enable adaptation decisions based on the measured packet loss, the adaptation logic
performs the following estimation. The monitored bitrate and packet loss are used to estimate the
upper bound for the bitrate of the stream according to the packet loss requirements of the SLA. Let
$f: BW \to PL$ be the packet dropping probability as a function of the bandwidth utilization. This
mapping can be specified by the configuration of the congestion avoidance algorithm, e.g., generic
random early drop (GRED). A more accurate mapping is obtained by monitoring the packet dropping
characteristics of the device. The monitored bitrate $br_{mon}$ of the stream is used to estimate the packet
loss as $pl_{est} = f(br_{mon})$. This estimated packet loss is adjusted by the actual monitored packet loss
$pl_{mon}$, and the adjusted mapping $\frac{pl_{mon}}{pl_{est}} \cdot f$ is used to determine the highest bitrate that will not violate
the maximum packet loss $pl_{max}$ stated in the SLA. The resulting limitation constraint, where
$br_{sel\,layer}$ is the bitrate of the selected SVC layer, is shown in Equation ( 7 ):
$br_{sel\,layer} \le \max \{\, br \mid \frac{pl_{mon}}{pl_{est}} \cdot f(br) \le pl_{max} \,\}$    ( 7 )
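The estimation described above can be sketched numerically. The drop curve `toy_drop_curve` is a made-up, RED-like ramp and is not a GRED configuration; the candidate bitrates, the fallback scale of 1.0 when the estimated loss is zero, and all numbers are assumptions for illustration. The sketch scales the mapping by the ratio of monitored to estimated packet loss and keeps the highest candidate bitrate whose adjusted loss stays within the SLA limit.

```python
from typing import Callable, Iterable, Optional


def toy_drop_curve(br: float, start: float = 6.0, full: float = 10.0,
                   max_p: float = 0.1) -> float:
    """Made-up drop probability: 0 up to `start`, linear up to `max_p`."""
    if br <= start:
        return 0.0
    if br >= full:
        return max_p
    return max_p * (br - start) / (full - start)


def max_bitrate_for_loss(f: Callable[[float], float], br_mon: float,
                         pl_mon: float, pl_max: float,
                         candidates: Iterable[float]) -> Optional[float]:
    """Highest candidate bitrate whose adjusted loss estimate <= pl_max."""
    pl_est = f(br_mon)
    # Adjust the mapping by the observed/estimated ratio; when nothing is
    # estimated to be dropped, keep the mapping unscaled (assumption).
    scale = pl_mon / pl_est if pl_est > 0 else 1.0
    feasible = [br for br in candidates if scale * f(br) <= pl_max]
    return max(feasible) if feasible else None


layer_bitrates = [2.0, 4.0, 7.0, 9.0]  # Mbit/s, hypothetical SVC layers
print(max_bitrate_for_loss(toy_drop_curve, br_mon=8.0, pl_mon=0.1,
                           pl_max=0.06, candidates=layer_bitrates))  # 7.0
```

In this example the device drops twice as many packets as the curve predicts at 8 Mbit/s, so the 9 Mbit/s layer is ruled out and the 7 Mbit/s layer becomes the upper bound.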
The goal of this limitation constraint is to prevent higher packet dropping rates from the congestion
avoidance algorithm by proactively switching to a lower layer.
The adaptation logic could be further refined by taking the quality degradation introduced by packet
loss into account. Note that such an approach also requires the initial quality of each layer to be
signalled, typically in the media stream itself. For SVC, this information could be signalled in the
supplemental enhancement information (SEI) message NALU by using a user data unregistered SEI
message. With the initial quality information and the quality degradation characteristics, the layer with
the highest estimated QoE could be selected based on a Pseudo-Subjective Quality Assessment
(PSQA) model [41][42].
The optimization constraints of the adaptation logic are given in Equations ( 8 ), ( 9 ), and ( 10 ):
$\max\; hres_{sel\,layer}$    ( 8 )
$\max\; vres_{sel\,layer}$    ( 9 )
$\max\; num_{sel\,layer}$    ( 10 )
The variable $num_{sel\,layer}$ represents the layer number of the selected SVC layer, and
$hres_{sel\,layer}$ and $vres_{sel\,layer}$ are its horizontal and vertical resolution. The optimization
constraints are subject to the limitation constraints of Equations ( 3 ), ( 4 ), ( 5 ), and ( 7 ). The
deployed implementation of the MPEG-21 ADTE uses a simple generate & test approach with priority
sorting of optimization constraints as discussed in [59]. Due to the priority sorting, the maximization
of the horizontal resolution has precedence over the vertical resolution and the SVC layer number.
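A generate & test selection with priority sorting can be sketched as follows. This is an illustrative reading of the approach, not the deployed MPEG-21 ADTE: the `Layer` and `Limits` types, the layer table, and the limit values are assumptions. Every candidate layer is tested against the limitation constraints, and the surviving candidates are ranked by horizontal resolution first, then vertical resolution, then layer number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Layer:
    number: int
    width: int
    height: int
    bitrate: float  # Mbit/s


@dataclass(frozen=True)
class Limits:
    """Bounds from display, SLA, bandwidth share, and loss estimate."""
    min_width: int
    max_width: int
    min_height: int
    max_height: int
    min_bitrate: float
    max_bitrate: float


def select_layer(layers: List[Layer], limits: Limits) -> Optional[Layer]:
    # Generate: every layer is a candidate. Test: keep those that satisfy
    # all limitation constraints.
    feasible = [
        l for l in layers
        if limits.min_width <= l.width <= limits.max_width
        and limits.min_height <= l.height <= limits.max_height
        and limits.min_bitrate <= l.bitrate <= limits.max_bitrate
    ]
    if not feasible:
        return None
    # Priority sorting: width first, then height, then layer number.
    return max(feasible, key=lambda l: (l.width, l.height, l.number))


layers = [Layer(0, 640, 360, 1.0), Layer(1, 1280, 720, 2.5),
          Layer(2, 1280, 720, 4.0), Layer(3, 1920, 1080, 8.0)]
limits = Limits(320, 1280, 180, 720, 0.5, 5.0)
print(select_layer(layers, limits).number)  # 2
```

In this sketch `max_bitrate` stands for the tighter of the bandwidth-share bound and the packet-loss bound; tightening it to 3.0 Mbit/s makes layer 1 the selected layer.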

5.1.2 RTP Streaming and Adaptation at Home-Box
As described in Section 4.1.3, the technology selected by ALICANTE for delivering
content across the multimedia ecosystem is the ISO standard solution for H.264 | MPEG-4 Scalable
Video Coding (SVC). This technology makes it possible to adapt the
content efficiently at both the CAN layer and the HB layer.
The basic software tools developed and tested within the context of the ALICANTE project are:
• SVC encoder, developed as an extension of an existing software AVC encoder;
• SVC decoder, developed as an extension of an existing software AVC decoder;
• RTP library to support sending and receiving of the SVC content over IP.
The following diagram gives a graphical representation of the media coding and streaming tools
developed and their usage in the content delivery chain.

The most significant tool for the ALICANTE project is the SVC-to-AVC Rewriter, since this is an
essential part of the content management. The SVC-to-AVC Rewriter allows delivering to the
end-user a multimedia representation usable with a variety of terminals, including legacy terminals not
capable of decoding SVC.
There are two transcoding tools built by combining these components:
• SVC-to-AVC Rewriter, developed as a combination and optimization of the stand-alone SVC
decoder plus AVC encoder; where applicable, no processing is performed to rewrite the SVC
content into AVC format, as in the case of extraction of the SVC base layer, which is by
definition a compliant AVC bitstream;
• AVC-to-SVC transcoder, developed as a combination of the stand-alone AVC decoder plus
SVC encoder.
The following diagram gives a graphical representation of the media transcoding tools and their
corresponding decoder/encoder combination.
For several years, the Real-time Transport Protocol (RTP) [61] has been the most widespread
protocol for delivering audio/video content over IP/UDP. Its success and diffusion build on the
features of packet sequence ordering, delay management, and jitter compensation. RTP
supports both unicast and multicast delivery, i.e., streaming to a single destination IP address or to a
"special" (multicast group) IP address that several receivers can use to retrieve the same audio/video streaming session.
The Internet Engineering Task Force (IETF) has defined a series of specifications for adapting the
generic RTP approach to the specific audio/video coding standards defined by other organizations, such
as ITU-T and ISO/IEC.
For the specific activity of ALICANTE, as described in the section dealing with SVC, AVC, and
transcoding, the most relevant application of RTP is the delivery of SVC content in the core network, either in
unicast or multicast mode. This has been implemented in conformance with IETF RFC
6184 [62] and RFC 6190 [63].
As described in previous deliverables (see D7.1.1 [2] and D7.2.1 [54]), the Adaptation Decision-Taking
Engine (ADTE) is used for determining the most suitable parameters for the current network
and user environment. To this end, the ADTE uses a Usage Environment Description (UED), a
Universal Constraints Description (UCD), and an Adaptation QoS Description (AQoS) to reach
a decision. The output of this decision is used for steering the adaptation process. At the HB, the
adaptation is performed in two steps, as presented in Section 5.1.1. First, using information from the User
Profile, the Adaptation Manager (Adapt-Mgr) detects whether SVC is supported. If so, the adaptation
process sends the content directly to the streamer. If SVC is not supported, the SVC-to-AVC Rewriter
receives the content and transcodes it from SVC to AVC. The resulting AVC file is then either sent
directly to the streamer or fed to the General Purpose Transcoder (GPT) for further processing.
The GPT uses FFmpeg to transcode AVC content to a more suitable format or codec.
It thus allows changing the codec (e.g., to MPEG-2), the bitrate, or the spatial resolution.
This allows the ADTF at the HB to offer content for a variety of different end-user terminals (e.g.,
tablets, smartphones, PCs, TVs). As the transcoding with GPT is performed from AVC content to less