Knowing Android's strengths and weaknesses

Dec 10, 2013


EE Times-India

| Copyright © 2011 eMedia Asia Ltd. Page 1 of 5

Here's a look at the techniques for exploiting Android's strengths and managing its
limitations, especially in hard real-time, mission-critical systems.

By Juan Gonzales, Darren Etheridge and Niclas Anderberg
Texas Instruments

The circumstances surrounding the phenomenal rise of Android in the smartphone market are well-documented.
However, another revolution is taking place in other applications where Android provides distinct advantages over
a "standard" Linux distribution. Android provides a tightly coupled environment for application development
where the frameworks and middleware components are selected by Google. Traditional Linux distributions are
typically "mix and match" (for example, some people prefer X11/KDE rather than Qt/embedded for graphics
development), which burdens the software designer with the need to invest time in understanding the very
complex options and making difficult choices that typically have an impact throughout the product's lifecycle.
For these and other reasons, many refer to Android as "Linux made easy." Today, even Windows Embedded
Compact (WinCE) developers who once shied away from Linux due to its complexity are taking a second look at
well-integrated Android solutions. Add into the mix a platform licensing approach free from "copyleft" burdens
(copyleft being a licensing method that requires modified versions of a free program to remain free as well), along
with a cost you cannot beat (free), and you have Android's recipe for success.
Google's vision of connecting Android devices to a cloud and sharing movies, music, books, and more via a
single user account is sure to fuel adoption even further. Android implementations today can be found in a large number
of applications, from tablets, e-readers, Internet TVs, portable media players, netbooks, GPS devices, and digital cameras
to personal accessories, exercise equipment, and more. Android is free and anybody can download the sources and use
it for whatever purpose they wish. For example, a digital still camera could be shipped with a GPS receiver, WiFi,
and Android, along with the apps for Flickr, Picasa, and Shutterfly and other applications that allow photos to be
uploaded directly to the cloud from the camera. In the past, this would have taken months of software development
and testing for each model of camera and each photo-sharing website. With Android, the camera maker can rely on
the cloud vendor, such as Flickr, to maintain and develop the app for Android. All the camera vendor has to do is
port Android to the device.
Android is seeing adoption in areas where it's not inherently strong because it adds so much value in the other areas
discussed above. With clever system-on-chip (SoC) silicon and software architectures, these Android
limitations can be mitigated. Here are some tools, tips and architectures that help do that.

OpenMAX Integration Layer
Silicon vendors can use several optimisation techniques with the OpenMAX Integration Layer (IL) to add value to their
silicon hardware offerings. (OpenMAX is a royalty-free application programming interface from Khronos Group, a
nonprofit consortium.) Android's multimedia frameworks--including PacketVideo OpenCORE and Google
Stagefright--are built on OpenMAX IL-based codec components. OpenMAX IL defines the integration layer; that is, it
provides application developers a consistent abstract interface to codecs whether they are implemented in
hardware or software. It also goes one step further with the ability to "tunnel" the communication between two
components so the application using the component does not get involved in every data transfer.
Heterogeneous systems, or systems with more than one type of processing core (such as a digital signal processor [DSP],
general purpose processors, hardware accelerators, or field-programmable gate arrays), can be further refined by
distributing the OpenMAX IL components on other processing cores, tunnelling data transfers between them, and
eliminating the costly involvement of the host processor in moving buffers between components. Of course, this
implies that the heterogeneous system needs to support shared memory across the processing elements. The goal
in this approach is two-fold:
• Minimise or eliminate memory copies of large video buffers.
• Offload CPU-intensive work to dedicated hardware, relieving the host CPU of this burden.
Tunnelling has the potential to reduce not just the host CPU utilisation but also the latency, which is of paramount
importance in applications such as enterprise video conferencing. Figure 1 shows OpenMAX used in tunnel mode.
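
As a rough illustration of why tunnelling relieves the host, the toy model below counts how often the host has to
hand a buffer on in each mode. It is only a sketch: the TunnelSketch class, its Component interface and the
three pass-through stages are hypothetical stand-ins and are not the OpenMAX IL API.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: these types are hypothetical and are not the OpenMAX IL API. */
class TunnelSketch {
    /** A processing stage that consumes one buffer and produces another. */
    interface Component {
        byte[] process(byte[] in);
    }

    /** Non-tunnelled: the client (host CPU) forwards every buffer between stages. */
    static int runNonTunnelled(List<Component> chain, byte[] frame, int frames) {
        int hostTransfers = 0;
        for (int f = 0; f < frames; f++) {
            byte[] buf = frame;
            for (Component c : chain) {
                buf = c.process(buf);   // host hands the buffer to this stage
                hostTransfers++;        // one host hand-off per stage per frame
            }
        }
        return hostTransfers;
    }

    /** Tunnelled: the host feeds only the first stage; stages pass buffers on directly. */
    static int runTunnelled(List<Component> chain, byte[] frame, int frames) {
        // Composing the chain once stands in for setting up tunnels between stages.
        Component pipeline = in -> {
            byte[] buf = in;
            for (Component c : chain) {
                buf = c.process(buf);
            }
            return buf;
        };
        int hostTransfers = 0;
        for (int f = 0; f < frames; f++) {
            pipeline.process(frame);
            hostTransfers++;            // one host hand-off per frame, at the head of the chain
        }
        return hostTransfers;
    }

    public static void main(String[] args) {
        List<Component> chain = new ArrayList<>();
        chain.add(in -> in);  // stand-in for a decoder
        chain.add(in -> in);  // stand-in for a scaler
        chain.add(in -> in);  // stand-in for a renderer
        byte[] frame = new byte[16];
        System.out.println("non-tunnelled host hand-offs: " + runNonTunnelled(chain, frame, 30));
        System.out.println("tunnelled host hand-offs:     " + runTunnelled(chain, frame, 30));
    }
}
```

In this model the host's work grows with the number of stages in the non-tunnelled case but stays at one hand-off
per frame when the stages are tunnelled.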

Figure 1: OpenMAX being used in a tunnelled mode.

The Android frameworks today work with the "non-tunnelled" method of communicating between components, but
this does not preclude apps written using the Android native development kit from taking advantage of the
OpenMAX IL-tunnelled approach for use cases that can really benefit from it. Silicon vendors can take advantage of
their silicon features by implementing their own tunnelling approach as a means to add differentiating performance
value to their end-customers. To the end Android Java programmer, this would ideally be hidden, though
developers also get free access to the necessary source code and can make these enhancements themselves in an effort
to squeeze out performance.

Figure 2: Using YCbCr color space in video compression codecs.

Even in the "non-tunnelled" OpenMAX IL approach, some techniques can be applied to reduce CPU consumption.
For instance, Stagefright's default display method converts the output of the codec from YCbCr, the more common
pixel format used for video-compression algorithms, to RGB colour space (shown in Figure 2). A more efficient
method takes the output from the codec and displays it directly (with no memory copies or colour space
conversion) as YCbCr.
Of course, the SoC must have native YCbCr display support in the form of an overlay, and the SurfaceFlinger
(graphics composition engine in Android) has to be modified to take advantage of the YCbCr overlay. Once done, the
overlay reduces memory bandwidth and CPU utilisation.
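
To give a feel for the work a native YCbCr overlay avoids, here is a minimal sketch of the per-pixel conversion.
It assumes full-range BT.601 coefficients (as used in JFIF); a real codec's output may instead be limited-range or
BT.709, and the YCbCrToRgb class is purely illustrative, not Stagefright code.

```java
/** Sketch: full-range BT.601 (JFIF-style) YCbCr to RGB conversion for a single pixel. */
class YCbCrToRgb {
    /** Clamp a value into the 0..255 range of an 8-bit channel. */
    static int clamp(int v) {
        return Math.max(0, Math.min(255, v));
    }

    /** Converts one pixel (inputs 0..255) and returns it packed as 0xRRGGBB. */
    static int toRgb(int y, int cb, int cr) {
        double d = cb - 128;  // blue-difference chroma, centred on zero
        double e = cr - 128;  // red-difference chroma, centred on zero
        int r = clamp((int) Math.round(y + 1.402 * e));
        int g = clamp((int) Math.round(y - 0.344136 * d - 0.714136 * e));
        int b = clamp((int) Math.round(y + 1.772 * d));
        return (r << 16) | (g << 8) | b;
    }

    public static void main(String[] args) {
        System.out.printf("mid grey -> %06X%n", toRgb(128, 128, 128));
        System.out.printf("white    -> %06X%n", toRgb(255, 128, 128));
    }
}
```

Without an overlay, arithmetic of this kind has to be repeated for every pixel of every decoded frame, on top of
the memory traffic needed to read the YCbCr data and write the RGB result.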

Digital signal processing
Some SoCs have a powerful embedded DSP, in addition to an ARM core or video accelerator, that can add some
serious processing power to the host CPU, especially for such tasks as complex math or intensive signal-processing
algorithms. The challenge here is how to expose that processing power to applications written in Java without
requiring their developers to know anything about DSP programming.
We classify these SoCs as heterogeneous, as the processing cores have different architectures and instruction sets.
The DSP typically runs an RTOS, so the Linux kernel is not controlling or even aware of the DSP. Some form of
inter-processor communication (IPC) is used to communicate between the cores, typically providing a master-slave
relationship between the general purpose processor (GPP) and the DSP. The GPP can load code and data for the DSP
in memory, pull the DSP out of reset and put it back in reset. Also, some form of basic messaging service is available
for low-level communication.
To abstract the locality of a DSP function, a framework for remote procedure calls (RPCs) can be used. The
processing cores on the SoC may not have the same C type sizes or endianness, so the RPC has to prepare the
arguments of a DSP function for the other processing core using a process called marshalling (Figure 3). On the
remote core, the function parameters need to be unmarshalled to native types before being passed to the actual
DSP function. The return value of the function is treated in a similar way, but this time coming back from
the DSP.

Figure 3: An implementation of RPC between two processors.
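
The marshalling step can be sketched as follows. This is an illustration in Java rather than the C or C++ such
layers are normally written in, and the wire format (a 32-bit function ID, a 32-bit argument count, then 32-bit
arguments, all little-endian) and the RpcMarshalSketch class are assumptions made for the example, not a real
RPC framework.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Sketch of marshalling RPC arguments into a fixed-width, fixed-endianness message. */
class RpcMarshalSketch {
    // Assumed wire format: both cores agree on little-endian 32-bit fields.
    static final ByteOrder WIRE_ORDER = ByteOrder.LITTLE_ENDIAN;

    /** Packs a function ID and its 32-bit arguments into a byte message. */
    static byte[] marshal(int functionId, int... args) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + 4 * args.length).order(WIRE_ORDER);
        buf.putInt(functionId);
        buf.putInt(args.length);
        for (int a : args) {
            buf.putInt(a);
        }
        return buf.array();
    }

    /** Reads the function ID from the start of a message. */
    static int functionId(byte[] msg) {
        return ByteBuffer.wrap(msg).order(WIRE_ORDER).getInt();
    }

    /** Unpacks the arguments back into native ints on the receiving side. */
    static int[] unmarshalArgs(byte[] msg) {
        ByteBuffer buf = ByteBuffer.wrap(msg).order(WIRE_ORDER);
        buf.getInt();               // skip the function ID
        int count = buf.getInt();
        int[] args = new int[count];
        for (int i = 0; i < count; i++) {
            args[i] = buf.getInt();
        }
        return args;
    }

    public static void main(String[] args) {
        byte[] msg = marshal(7, 1, -2);
        int[] decoded = unmarshalArgs(msg);
        System.out.println("function " + functionId(msg) + " args " + decoded[0] + ", " + decoded[1]);
    }
}
```

Fixing the width and byte order of every field in the message is what lets two cores with different native type
sizes or endianness agree on the arguments.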

The RPC also needs to manage the cache for any buffers passed between the cores. If the cache is dirty for a buffer
about to be sent to the DSP (or back from the DSP), it needs to be written back before being passed to the other core
or the other core will read stale data. Similarly, when a buffer is received from another core, the cache for this buffer needs to
be invalidated before the buffer is accessed.
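
The two cache rules can be illustrated with a toy model. The sketch below simulates a write-back cache in front of
memory that a DSP reads and writes directly; it performs no real cache maintenance (which is done in native,
architecture-specific code), and every name in it is hypothetical.

```java
/** Toy model of a host-side write-back cache in front of memory shared with a DSP. */
class CacheCoherencySketch {
    final int[] sharedMemory = new int[4];   // what the DSP actually sees
    final int[] cacheData = new int[4];      // the host CPU's cached copy
    final boolean[] valid = new boolean[4];
    final boolean[] dirty = new boolean[4];

    /** Host write: lands in the cache only and marks the line dirty. */
    void cpuWrite(int addr, int value) {
        cacheData[addr] = value;
        valid[addr] = true;
        dirty[addr] = true;
    }

    /** Host read: served from the cache if the line is valid, else filled from memory. */
    int cpuRead(int addr) {
        if (!valid[addr]) {
            cacheData[addr] = sharedMemory[addr];
            valid[addr] = true;
            dirty[addr] = false;
        }
        return cacheData[addr];
    }

    /** Write a dirty line back so the other core can see the data. */
    void writeBack(int addr) {
        if (valid[addr] && dirty[addr]) {
            sharedMemory[addr] = cacheData[addr];
            dirty[addr] = false;
        }
    }

    /** Drop the cached copy so the next host read fetches from memory. */
    void invalidate(int addr) {
        valid[addr] = false;
        dirty[addr] = false;
    }

    /** The DSP bypasses the host cache and touches shared memory directly. */
    int dspRead(int addr) { return sharedMemory[addr]; }
    void dspWrite(int addr, int value) { sharedMemory[addr] = value; }

    public static void main(String[] args) {
        CacheCoherencySketch c = new CacheCoherencySketch();
        c.cpuWrite(0, 42);
        System.out.println("DSP before write-back: " + c.dspRead(0));
        c.writeBack(0);
        System.out.println("DSP after write-back:  " + c.dspRead(0));
        c.dspWrite(0, 99);
        System.out.println("CPU before invalidate: " + c.cpuRead(0));
        c.invalidate(0);
        System.out.println("CPU after invalidate:  " + c.cpuRead(0));
    }
}
```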
Both the RPC and IPC are typically written in C or C++ code, since the operations are machine- and
architecture-specific and complete control of memory and types is required. The DSP's RPC functions can be wrapped in the
Java Native Interface (JNI), allowing an Android Java application to call DSP functions remotely and transparently.

In addition, some DSPs on SoCs have a flat memory model, meaning the DSP core goes straight to the memory bus, as
opposed to through a memory management unit. Android is built on a Linux kernel, which fragments memory into
4,096-byte "pages" on an ARM processor; this prevents normal "malloc" memory from being accessed by the DSP
because these pages are scattered throughout the physical memory map.
In this case, design teams must use a custom memory allocator that allocates physically contiguous buffers from the
Linux kernel for the DSP to operate on. Java doesn't have this type of granular memory control, but it does have
"direct byte buffers" as part of its java.nio.ByteBuffer class. The memory behind these buffers is not managed by the
VM's garbage collector. It's possible to wrap buffers allocated using the custom, contiguous Linux memory allocator in
such a direct byte buffer using the JNI, after which it can be used like any other java.nio.ByteBuffer in an Android
Java application while calling the DSP RPC functions.
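
From the Java side, working with such a buffer might look like the sketch below. Note the assumption:
ByteBuffer.allocateDirect is used here only as a stand-in and gives no guarantee of physical contiguity. In a real
system the buffer would come from native code, for example by wrapping the custom allocator's memory with the
JNI function NewDirectByteBuffer. The DirectBufferSketch class and its helpers are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Sketch: using a direct byte buffer as the Java-side view of a DSP-visible buffer. */
class DirectBufferSketch {
    static final int PAGE_SIZE = 4096;  // page size cited in the article for ARM Linux

    /**
     * Stand-in for a JNI call that would wrap physically contiguous memory.
     * allocateDirect does NOT guarantee contiguity; it only shows the Java-side usage.
     */
    static ByteBuffer allocateShared(int bytes) {
        return ByteBuffer.allocateDirect(bytes).order(ByteOrder.nativeOrder());
    }

    /** Number of pages a buffer of the given size occupies, rounded up. */
    static int pagesSpanned(int bytes) {
        return (bytes + PAGE_SIZE - 1) / PAGE_SIZE;
    }

    public static void main(String[] args) {
        ByteBuffer samples = allocateShared(2 * PAGE_SIZE);
        samples.putShort(0, (short) 1234);   // write a sample as any ByteBuffer user would
        System.out.println("direct: " + samples.isDirect());
        System.out.println("pages spanned: " + pagesSpanned(samples.capacity()));
        System.out.println("first sample: " + samples.getShort(0));
    }
}
```

The page count hints at the underlying problem: an ordinary allocation of even a few kilobytes may be spread over
several scattered pages, which a DSP without a memory management unit cannot follow.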

Real-time capabilities
Real time can be broken down into subcategories of hard, firm, and soft, depending on the application's tolerance to
missing a deadline. In hard real time, this tolerance is zero and missing a deadline is considered a system failure.
This section will focus on hard real time as this is often encountered in mission-critical embedded processing.
Let's consider the example of a car's antilock braking system. Data must be guaranteed to be processed in a specific
time period no matter the overall system processing load. Suppose a driver is scrolling
through MP3s on an Android-based in-car computer system, looks up to see the car in front braking rapidly,
and slams on the brakes. If the car's system load is high from all the graphics operations of scrolling
through the MP3 list, the antilock brake system may not get serviced in time and may fail to operate. This is an extreme
example, but it illustrates how important real-time processing is to embedded systems. As Android
finds its way into more and more end equipment, real-time capabilities will further increase in importance.
Android certainly has some challenges meeting these real-time requirements. Because Android is based on the
Linux kernel, Android, just like Linux, can't be considered a real-time operating system (RTOS). This is even
more true when you add the extensive use of the Java virtual machine (VM) for the middleware and application
development. Java's requirement for asynchronous garbage collection makes the challenge of
meeting real-time scheduling deadlines even more difficult.
One way of overcoming these limitations and obtaining true real-time performance is to partition your software in
such a way that the user interface (UI) or main app runs on the host CPU and the real-time functions run on a
separate processing core that's running an RTOS. Data could be captured in real-time on the separate processing
core, processed, and sent back to the Android application for display on the UI or saved to a file or network

Boot times
A key area of concern when deploying Android in a safety-critical environment is boot time. Referring to the
antilock braking system example, it wouldn't be much use if antilock brakes only became available two minutes
after a driver turned the key of a car, because it might take this long for Android to start up and to finish cataloguing
new MP3 files. One approach is to run all critical systems on a separate processing core that can be booted in seconds,
before the main processor even begins to boot Android. Once these system-critical functions are operational,
Android can boot up in its own time and begin communicating with the rest of the non–mission-critical systems to
provide a UI for interacting with them (for example, engine monitoring system showing gas mileage and engine

Power savings
Power consumption is another big concern, especially on portable devices, where any technique that helps extend
battery life is increasingly important. Even devices that are
permanently connected to a power source experience phantom load--electric power consumed by electronic
appliances while they're switched off or in a standby mode. This is a hot topic, and ways to minimise standby power
consumption matter just as much.

In an Android-based system, the main processor, display, and graphics can be put into a very deep sleep, leaving
critical systems running on a very low-power CPU, such as an ARM Cortex-M3. This deep sleep allows critical
communications with head-end equipment or safety-critical utilities to still run, but the device can appear off to the
user and phantom load can be minimised.

Little green muscle man
Android is a very powerful operating environment in which to build feature-rich applications and also to leverage
an ever-growing catalogue of applications from a diverse set of authors. However, for low-latency media,
mission-critical, hard real-time, heavy signal processing or algorithmic-type applications, native Android does not
necessarily provide the best fit. By employing some of the techniques mentioned in this article, the full range of
Android benefits can be coupled with advanced embedded features to offer optimised solutions for end
equipment that uses all of Android's highly desirable characteristics.

About the authors
Juan Gonzales is a product marketing manager for the DaVinci digital media processors at Texas Instruments (TI).
Juan has a master's degree in computer engineering from the University of Central Florida and is completing his
MBA from the University of Texas.

Darren Etheridge has 15 years of experience in embedded systems. He is currently leading the team that provides
accelerated multimedia frameworks on a variety of TI's video-centric devices with a recent emphasis on Android.
Darren graduated from the University of Plymouth, UK in 1996 with a B.Sc. in computing informatics.

Niclas Anderberg has 10 years of experience in embedded systems, mainly focusing on Linux and DSP software.
Niclas is currently leading the effort of enabling TI's DSPs in Android on TI's SoC devices. Niclas graduated
from the University of Lund, Sweden in 2001 with an M.Sc. in computer science and engineering.