




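Academic Year Heisei 19 (2007)

University of Tsukuba, Third Cluster of Colleges, College of Information Sciences

Graduation Thesis

An Internet File System for Random Accessing Protected Data

Major: Computer Systems

Author: AHMAD SYAHIR BIN CHE ABDULLAH

Advisors: 板野肯三, 新城, 佐藤聡, 中井央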




Abstract

As the Internet has become ubiquitous, video collaboration over the Internet has become very
common. For example, in sports coaching, video data is shared among scattered organizations.
Most of this video data is confidential and needs to be protected.

Digital Rights Management (DRM) is typically used to distribute protected video data.
Representative DRM products include Windows Media DRM, Helix DRM, FairPlay DRM, and DReaM.

Existing DRM mechanisms for video have a problem with random access, which is essential for
video seeking. The problem is inherent in these mechanisms because they are optimized for the
majority of users, who watch video sequentially.

In this paper, we propose a new data protection mechanism for accessing protected data over
the Internet that allows random access.




Table of Contents

Chapter 1 Introduction
Chapter 2 Related Works
  2.1 Digital Rights Management
    2.1.1 Windows Media DRM
    2.1.2 Apple’s FairPlay
  2.3 DirectShow API
  2.4 Shell Namespace Extension
Chapter 3 Overview of our protected data distribution
Chapter 4 VFS Module
  4.1 Dokan Library
    4.1.1 Dokan’s component
  4.2 User-level Module for Dokan
    4.2.1 Global variables
    4.2.1 Items class
    4.2.2 CreateFile() function
    4.2.3 OpenDirectory() function
    4.2.4 GetFileInformation() function
    4.2.5 ReadFile() function
  4.3 Secure Channels
    4.3.1 Named Pipe
    4.3.2 Filename Extension
  4.4 Performance Issues
    4.4.1 Pooled Connections
    4.4.2 Cache and Prefetch
Chapter 5 Server-side Programs
  5.1 Persistent Connections
  5.2 Byte serving
Chapter 6 Encryption and Decryption
  6.1 Advanced Encryption Standard
  6.2 Counter mode
    6.1.1 Padding
  6.3 Token
  6.4 Encryption software
    6.3.1 Multithreads
  6.5 Decryption
Chapter 7 Experiments
  7.1 Experiment environment
  7.2 Encryption Rate
  7.3 Transfer rate
Chapter 9 Integration with Movie Database System of Japan Institute of Sports Sciences
  9.1 Current data distribution mechanism in JISS movie database system
  9.2 New data distribution mechanism proposed to JISS movie database system
  9.3 The advantage compared to current mechanism
Chapter 10 Conclusions
Acknowledgements
References




Table of Figures

Figure 1: Data flow and token distribution
Figure 2: Dokan library working in Windows kernel
Figure 3: Global variables
Figure 4: Items class source code
Figure 5: CreateFile() function code
Figure 6: GetFileInformation() code
Figure 7: ReadFile() function code
Figure 8: Partial of GetNamedPipe() code
Figure 9: CreateFile() function code when using filename extension
Figure 10: Counter mode encryption
Figure 11: Portion of Encrypt() function
Figure 12: Graph showing encryption rate using different thread count
Figure 13: Graph of transfer rate using different methods
Figure 14: Graph of CPU usage of different method
Figure 15: Current data distribution mechanism in movie database system of JISS
Figure 16: New data distribution mechanism proposed to movie database system of JISS




Table of Tables

Table 1: Multithreads calculation


Chapter 1 Introduction


The Internet is ubiquitous these days and has become essential for collaboration. Sharing data
on the Internet for easy access is therefore common. For example, in sports coaching, video
data is shared with annotations among scattered organizations [1].

Most of this video data is confidential and needs to be protected. Digital Rights Management
(DRM) is typically used to distribute protected video data. Representative DRM products
include Windows Media DRM, Helix DRM, FairPlay DRM, and DReaM.

Existing DRM mechanisms for video have a problem with random access, which is essential for
video seeking. The problem is inherent in these mechanisms because they are optimized for the
majority of users, who watch video sequentially. Concretely, these mechanisms use a large
buffer and often download the entire video in advance. This helps to mitigate jitter, high
latency, and low throughput, but it is not suitable for applications that need random access.

In this paper, we propose a new data protection mechanism for accessing protected data over
the Internet. Instead of creating a whole new player or video format, this mechanism reuses
existing media players for the Windows operating system (OS) without large modifications to
the applications and with no modification to the OS. Rather, we extend the OS at the Virtual
File System (VFS) layer. To simplify the implementation of the VFS module, we use a framework
for user-level file systems called Dokan [2]. Upon execution, the VFS module mounts a web
server as a new logical drive. The logical drive acts as a pipe to the protected files on the
web server. Some media players do not support opening a file on a remote server, and even
those that do offer no random access to the remote file. An application that accesses files
on the logical drive sees them as local files, which enables almost any application to open
them with random access.

However, providing random access is not the only goal of this research. We also want to
control access to the files in case they are protected or classified. In other words, we do
not want any malicious program to access the files on the mounted logical drive. This can be
done by hiding the logical drive, or by reporting that the drive contains no files, to every
application except the legitimate one. To make the security comparable with that of other
DRMs, the files placed on the web server are encrypted. Encryption makes a file useless to a
malicious user who downloads it directly from the web server: even with the file URL, he or
she cannot open the file without decrypting it first. All the necessary work (fetching the
file from the web server, decrypting it, and passing it to the legitimate application) is
done seamlessly by the VFS module that we introduce in this paper.

To provide encryption while maintaining random access, this research uses the Advanced
Encryption Standard (AES) in a special mode called Counter Mode. Counter Mode is fast, and it
lets any block be decrypted independently of the others while retaining the strength of AES.
The VFS module obtains the legitimate key from the legitimate application over a secure
channel and decrypts the file block by block.
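
The random access property of Counter Mode can be illustrated with a minimal sketch. It is
not the implementation used in this thesis: the counter block layout (an 8-byte nonce
followed by an 8-byte big-endian block index), the 128-bit key size, and the names CtrSketch
and Transform are assumptions made for illustration only. The point is that the keystream
for any byte offset can be computed directly from the offset, so a range in the middle of a
file can be decrypted without touching the bytes before it.

```csharp
using System;
using System.Security.Cryptography;

static class CtrSketch
{
    // Encrypts or decrypts `data`, which starts at byte `offset` of the file.
    // Counter Mode is symmetric, so the same routine works in both directions.
    // Assumed layout (hypothetical): 16/24/32-byte key, 8-byte nonce.
    public static byte[] Transform(byte[] key, byte[] nonce, long offset, byte[] data)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Mode = CipherMode.ECB;        // ECB is used only to encrypt counter blocks
            aes.Padding = PaddingMode.None;
            aes.Key = key;
            using (ICryptoTransform ecb = aes.CreateEncryptor())
            {
                byte[] output = new byte[data.Length];
                byte[] counterBlock = new byte[16];
                byte[] keystream = new byte[16];
                long blockIndex = offset / 16;   // AES block that contains `offset`
                int skip = (int)(offset % 16);   // position of `offset` inside that block
                int pos = 0;
                while (pos < data.Length)
                {
                    // counter block = nonce (8 bytes) || block index (8 bytes, big-endian)
                    Buffer.BlockCopy(nonce, 0, counterBlock, 0, 8);
                    for (int i = 0; i < 8; i++)
                        counterBlock[15 - i] = (byte)(blockIndex >> (8 * i));
                    ecb.TransformBlock(counterBlock, 0, 16, keystream, 0);
                    for (int i = skip; i < 16 && pos < data.Length; i++, pos++)
                        output[pos] = (byte)(data[pos] ^ keystream[i]);
                    skip = 0;
                    blockIndex++;
                }
                return output;
            }
        }
    }
}
```

Under these assumptions, calling Transform on a whole file with offset 0 and then calling it
again on any slice of the result, with the slice's starting offset, recovers the matching
slice of the original data.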

Our protected data distribution mechanism is also flexible. When the protected data is a
media file, it is no longer tied to a proprietary file format that depends on a proprietary
media player. For example, with Microsoft DRM, the user must use files in the WMV or WMA
format, which can only be played in Windows Media Player. Other media players are supported
only if their developers obtain proper authorization from Microsoft. The same applies to
Apple FairPlay, which requires a media player based on QuickTime. In other words, most of
these DRM mechanisms are closed source, limited, and not customizable.

The rest of the thesis is organized as follows. Chapter 2 discusses related work, including
Digital Rights Management, the Microsoft DirectShow API, Shell Namespace Extension, and File
system in User Space (FUSE). Chapter 3 gives an overview of our protected data distribution
mechanism in more detail. Chapter 4 elaborates on the VFS module of our mechanism and its
performance issues. Chapter 5 describes the server-side software and its requirements.
Chapter 6 elaborates on the encryption and decryption used in this mechanism. Chapter 7
examines the system's functionality and performance. In Chapter 8, we propose the integration
of this mechanism into the SmartSystem of the Japan Institute of Sports Sciences. Chapter 9
concludes this thesis and discusses future work.








Chapter 2 Related Works


2.1 Digital Rights Management


Digital Rights Management (DRM) refers to access control technologies used by publishers and
copyright holders to limit the usage of digital media or devices. It may also refer to
restrictions associated with specific instances of digital works or devices. To some extent,
DRM overlaps with copy protection, but DRM is usually applied to creative media (music, films,
etc.), whereas copy protection typically refers to software. In DRM, data files (usually audio
and video) are encrypted or wrapped in encrypted containers. The authentic player receives the
key to decrypt the data.

However, this kind of DRM has several problems. Most of these DRMs need a proprietary file
format and special software or an API. For example, Microsoft DRM, also known as Windows Media
DRM, requires a proprietary codec such as WMV or WMA, and the protected media can only be
played in Microsoft's proprietary media player. Other media players may support the DRM if
their developers obtain authorization from Microsoft. The same is true of Apple’s FairPlay,
which limits the file format to MP4 with the AAC codec and only allows playback with
QuickTime. These mechanisms are also limited to audio and video files and do not cover other
formats such as document files.

There are also problems with the encryption. Most of these DRM mechanisms target downloaded
data. Some of them also support direct streaming from the server, but they usually use a large
buffer to ensure smooth playback. Each time the user seeks in the video, the media player
buffers the data before it starts playing. Random access is also supported only for a limited
set of file formats, which is a disadvantage because the user needs to convert files to a
specific format. Some DRMs also use only one key, or a limited set of keys, for all files,
which is insecure.

A typical DRM is also built to restrict how the user plays the media rather than to protect
it. For example, it prohibits the user from playing the media on an unregistered machine, or
limits the play count. The most well-known DRMs are Microsoft’s Windows Media DRM and Apple’s
FairPlay.


2.1.1 Windows Media DRM

Windows Media DRM is a Digital Rights Management service for the Windows Media platform. It is
designed to provide secure delivery of audio and/or video content over an IP network to a PC
or other playback device in such a way that the distributor can control how that content is
used. It uses a combination of elliptic curve cryptography key exchange, the DES block cipher,
a custom block cipher dubbed MultiSwap, the RC4 stream cipher, and the SHA-1 hashing function.

Windows Media DRM is designed to be renewable, that is, it is designed on the assumption that
it will be cracked and must be constantly updated by Microsoft. The result is that while the
scheme has been cracked several times, it has usually not remained cracked for long.


2.1.2 Apple’s FairPlay

FairPlay is a DRM technology created by Apple Inc., based on technology created by the company
Veridisc. FairPlay is built into the QuickTime multimedia software and used by the iPhone,
iPod, iTunes, and iTunes Store. Any protected song purchased from the iTunes Store with iTunes
is encoded with FairPlay. FairPlay digitally encrypts AAC audio files and prevents users from
playing these files on unauthorized computers.

FairPlay-protected files are regular MP4 container files with an encrypted AAC audio stream.
The audio stream is encrypted using the AES algorithm in combination with MD5 hashes. The
master key required to decrypt the encrypted audio stream is also stored in encrypted form in
the MP4 container file. The key required to decrypt the master key is called the "user key."

Each time a customer uses iTunes to buy a track, a new random user key is generated and used
to encrypt the master key. The random user key is stored, together with the account
information, on Apple’s servers, and also sent to iTunes. iTunes stores these keys in its own
encrypted key repository. Using this key repository, iTunes is able to retrieve the user key
required to decrypt the master key. Using the master key, iTunes is able to decrypt the AAC
audio stream and play it.

When a user authorizes a new computer, iTunes sends a unique machine identifier to Apple’s
servers. In return it receives all the user keys that are stored with the account information.
This ensures that Apple is able to limit the number of computers that are authorized and makes
sure that each authorized computer has all the user keys that are needed to play the tracks
that it bought.


2.3 DirectShow API

DirectShow is a multimedia framework and API provided by Microsoft for software developers to
perform various operations with media files [3]. Based on the Microsoft Windows Component
Object Model (COM) framework, DirectShow provides a common interface for media across many of
Microsoft's programming languages, and is an extensible filter-based framework that can render
media files on demand by applications.

DirectShow divides the processing of multimedia tasks such as video playback into a set of
steps known as filters. Each filter represents a stage in the processing of the data. Filters
have a number of input and output pins which connect them together. The generic design of the
connection mechanism means that filters can be connected in many different ways for different
tasks to build a filter graph, and developers can add custom effects or other filters at any
stage in the graph and then render the file, URL, or camera.

Most video-related applications on Windows, not only Microsoft's Windows Media Player but also
most third-party applications, use DirectShow to manage multimedia data. However, DirectShow
has a problem with random access. Applications that use the DirectShow API can perform random
access on local files but not on files opened over HTTP. In other words, video seeking does
not work when applications open files over HTTP.


2.4 Shell Namespace Extension

In the Windows environment, there are several ways to implement a new file system. The easiest
way is through user-land Shell namespace extensions [4]. With a namespace extension, a
software developer can take any data and have Windows Explorer present it to the user as a
virtual folder. When a user browses into this folder, the data is presented as a
tree-structured hierarchy of folders and files, much like the rest of the Shell namespace.
Users and applications are able to interact with the contents of this virtual folder in much
the same way as with any other namespace object.

Behind the scenes, every folder that Windows Explorer displays is represented by a Component
Object Model (COM) object called a folder object. Each time the user interacts with a folder
or its contents, the Shell communicates with the associated folder object through one of a
number of standard interfaces. The folder object then does whatever is necessary to respond to
the user's action, and the Shell updates the Windows Explorer display. The majority of the
files and folders that users interact with are part of the file system or a system virtual
folder such as the Recycle Bin.

To implement a namespace extension, the information must be organized as a tree-structured
namespace. The namespace root is presented as a virtual folder in the Shell namespace. The
root folder, and all its subfolders and data items, become part of the Shell namespace, and
Windows Explorer becomes the user interface. Developers can thus present their information to
the user in a familiar and readily accessible way with much less UI programming than would be
required for a custom application.

The availability of shell namespace extension file system toolkits [5] eases the
implementation of a file system using a namespace extension. The most notable file system
using a namespace extension is GMail Drive [6], a namespace extension that creates a virtual
file system around a Google Mail account, allowing the user to use Gmail as a storage medium.

However, despite this ease of implementation, a file system implemented with this method is
not accessible through the lowest-level file system access APIs in Windows, nor through the
DirectShow API. The file system cannot be mapped to a drive letter, and it is inaccessible
from command-line tools. It is accessible through Windows Explorer. Consequently, not all
applications are able to access file systems that are implemented as namespace extensions.


2.5 File system in User Space

File system in User space (FUSE) [ ] is a free Unix kernel module, released under the GPL and
the LGPL, that allows non-privileged users to create their own file systems without editing
the kernel code. This is achieved by running the file system code in user space, while the
FUSE module only provides a "bridge" to the actual kernel interfaces. FUSE was officially
merged into the mainstream Linux kernel tree in kernel version 2.6.14.

FUSE is particularly useful for writing virtual file systems. Unlike traditional file systems,
which essentially save data to and retrieve data from disk, virtual file systems do not
actually store data themselves. They act as a view or translation of an existing file system
or storage device. In principle, any resource available to a FUSE implementation can be
exported as a file system. FUSE is available for Linux, FreeBSD, NetBSD (as PUFFS),
OpenSolaris and Mac OS X.

The Dokan library used in this mechanism is similar to FUSE, except that it runs in the
Windows environment.





Chapter 3 Overview of our protected data distribution





In this mechanism, we use an existing application or media player with small modifications
and make no modification to the OS. Instead, we extend the OS at the virtual file system (VFS)
layer. We try to overcome most of the problems of existing DRM mechanisms by providing the
protection at the file system level. We also work around the random access problem that
DirectShow has with remote files: instead of using the URL access API in DirectShow, we
present the remote file as a virtual local file. All the processing needed to access the
remote file is done by the VFS module.

The VFS module introduced in this paper does not use the namespace extension approach either.
In its place, we use the Installable File System (IFS), a file system API in Microsoft Windows
that enables the OS to recognize and load drivers for file systems.

Our mechanism consists of several components: encryption software, an authentication server,
a web server, and a virtual file system module, as shown in Figure 1. When the administrator
wants to upload a file to the web server, he or she needs to use the encryption software. The
encryption software generates a random key and nonce and encrypts the given file. After the
upload finishes, the software registers the filename, original file, the file’s key, and the
file’s nonce with the authentication server. All this information is packed as a token and
stored in the authentication server. We elaborate on the details of the token in Chapter 6,
Encryption and Decryption. When a user runs the legitimate application, the application logs
in to the authentication server to receive the filename/token pair. The application also
triggers the execution of the VFS module, which mounts a virtual logical drive. Upon opening
the file, the application passes the token to the VFS module through a secure channel. The VFS
module fetches the data from the web server, decrypts it using the received token, and sends
it back to the application through the file system API.
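
The key and nonce generation step performed by the encryption software could look roughly
like the sketch below. The FileToken class, its field names, and the key and nonce sizes are
hypothetical; the actual token layout is the one described in Chapter 6. The sketch only
shows that each file gets its own fresh key and nonce from a cryptographic random number
generator.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical token layout, for illustration only.
class FileToken
{
    public string FileName;
    public byte[] Key;    // 128-bit AES key (size is an assumption)
    public byte[] Nonce;  // 64-bit nonce (size is an assumption)
}

static class TokenSketch
{
    // Creates a token with a fresh random key and nonce for one file.
    public static FileToken CreateToken(string fileName)
    {
        FileToken token = new FileToken();
        token.FileName = fileName;
        token.Key = new byte[16];
        token.Nonce = new byte[8];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(token.Key);    // cryptographically strong random bytes
            rng.GetBytes(token.Nonce);
        }
        return token;
    }
}
```

Generating a separate key per file avoids the weakness noted in Chapter 2, where some DRMs
share one key or a small set of keys across all files.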


[Figure 1 diagram. Labels: Internet; Admin PC (original file, encryption software, encrypted
file); web server (encrypted files); authentication server (filename, token); User PC
(application, secure channel, user-level VFS module, kernel-level VFS module in the Windows
kernel).]

Figure 1: Data flow and token distribution


This paper focuses on the following programs: the VFS module, the encryption software, and the
applications. We do not discuss the authentication server, because we assume that an existing
token distribution mechanism such as Smart System [1] can be used.


Chapter 4 VFS Module


The virtual file system (VFS) layer is an abstraction layer on top of more concrete file
systems. The purpose of VFS is to allow applications to access different types of concrete
file systems in a uniform way. VFS specifies an interface (or a "contract") between the kernel
and a concrete file system. Therefore, new file system types can be added to the kernel by
fulfilling the contract.

For this mechanism, instead of using the limited shell namespace extension, we use a more
reliable approach, the Installable File System (IFS). IFS is a file system API in Microsoft
Windows that enables the OS to recognize and load kernel modules for file systems.
Implementing an IFS in Windows is hard work because it involves kernel programming. To
simplify this, we use the Dokan library. The IFS used in this research differs from a normal
IFS in that it is separated into two components: a kernel-level module and a user-level
module.


4.1 Dokan Library

In this research, we use the Dokan library [2] to simplify kernel-level programming. The Dokan
library contains a user-mode dynamic-link library (DLL), dokan.dll, and a kernel-mode file
system driver, dokan.sys. Once the Dokan file system driver is installed, users can create
file systems that are seen as normal file systems in Windows. In this paper, we refer to the
application that creates file systems using the Dokan library as the user-level module for
Dokan. File operation requests from user programs (e.g., CreateFile, ReadFile, WriteFile, …)
are sent to the Windows I/O subsystem (which runs in kernel mode), which subsequently forwards
the requests to the Dokan file system driver (dokan.sys). By using functions provided by the
Dokan user-mode library (dokan.dll), file system applications are able to register callback
functions with the file system driver. The file system driver invokes these callback routines
in order to respond to the requests it receives. The results of the callback routines are sent
back to the user program. For example, when a Windows application requests to open a
directory, the OpenDirectory request is sent to the Dokan file system driver, and the driver
invokes the OpenDirectory callback provided by the user-level module. The results of this
routine are sent back to the Windows application as the response to the OpenDirectory request.
Therefore, the Dokan file system driver acts as a proxy between user programs and file system
applications. Dokan is written in C, and the user-level module can be written in C, or in Ruby
or C# using the provided language bindings.
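
The proxy role described above can be pictured with a small stand-alone sketch. It does not
use the Dokan API: ProxySketch, Register, and Dispatch are hypothetical names, and the real
driver runs in kernel mode and passes richer arguments. The sketch only shows the pattern in
which the driver knows nothing about files and simply forwards each request to the callback
that the user-level module registered.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the driver side of the proxy pattern (not the Dokan API).
class ProxySketch
{
    private readonly Dictionary<string, Func<string, int>> callbacks =
        new Dictionary<string, Func<string, int>>();

    // The user-level module registers one callback per operation name.
    public void Register(string operation, Func<string, int> callback)
    {
        callbacks[operation] = callback;
    }

    // Forwards a request to its callback and returns the callback's status
    // code, or -1 when no callback is registered for the operation.
    public int Dispatch(string operation, string path)
    {
        Func<string, int> callback;
        if (!callbacks.TryGetValue(operation, out callback))
            return -1;
        return callback(path);
    }
}
```

In this picture, an OpenDirectory request for the root path would reach whatever function the
user-level module registered under that name, and an unregistered operation would fail with
an error code.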





4.1.1 Dokan’s components

Dokan itself consists of several main components:

dokan.dll
Dokan user-mode library. It provides functions to the user-level module.

dokan.sys
Dokan file system driver. It stays at kernel level and invokes the callback functions
provided by the user-level module.

mounter.exe
Dokan mounter service. It runs as a service and mounts a virtual drive when the mount
function is invoked.

dokanctl.exe
Dokan control program. The user may use this program to dismount the mounted drive if the
user-level module ends unexpectedly.

dokan.lib
Dokan import library.

dokan.h
Dokan library header.

DokanNet.dll
Library for the .NET binding. This is required to write the user-level module in C#.


4.1.2 Callback functions in Dokan library

The Dokan library provides the callback functions necessary to create a full-featured file
system; however, in this mechanism we use only several of them.

[Figure 2 diagram. Labels: Application; User-level module; Dokan file system driver; Windows
kernel.]

Figure 2: Dokan library working in Windows kernel


Function name        Parameters
CreateFile           string filename, FileAccess access, FileShare share, FileMode mode,
                     FileOptions options, DokanFileInfo info
OpenDirectory        string filename, DokanFileInfo info
Cleanup              string filename, DokanFileInfo info
CloseFile            string filename, DokanFileInfo info
ReadFile             string filename, byte[] buffer, ref uint readBytes, long offset,
                     DokanFileInfo info
WriteFile            string filename, byte[] buffer, ref uint writtenBytes, long offset,
                     DokanFileInfo info
FlushFileBuffers     string filename, DokanFileInfo info
GetFileInformation   string filename, FileInformation fileinfo, DokanFileInfo info
FindFiles            string filename, ArrayList files, DokanFileInfo info
Unmount              DokanFileInfo info






4.2 User-level Module for Dokan

We implement the user-level module for Dokan in the C# language. As this file system focuses
on read-only capability, we use only the callback functions that are needed. Functions related
to creating folders and writing files return -1 (an error). Typically, when an application
opens a file from the file system, CreateFile(), OpenDirectory(), GetFileInformation(),
ReadFile(), Cleanup(), and CloseFile() are called in sequence, so we focus only on these
callback functions and some other specific functions.


4.2.1 Global variables

    private int count_;
    private string host_ = "http://server";
    public static Hashtable filetable = new Hashtable();
    public static TcpClient c = null;
    public static Stream s = null;
    public static SimpleEncoding se = null;
    public static StreamReader r = null;
    public static StreamWriter w = null;

Figure 3: Global variables


As shown in Figure 3, the count_ variable is a counter for the Dokan file handles; it is incremented each time CreateFile() is invoked. The host_ variable is the web server hostname. The filetable variable is a hashtable that stores the meta information of each file, using the filename as the key and an Items object as the value. The remaining variables (c, s, se, r and w) are used to create the pooled connection. This is elaborated further in the description of the ReadFile() function and in Section 4.4.1.


4.2.1 Items class

The Items class stores the meta information of a file, such as the filename, the file attribute (whether it is a file or a directory), the file size, the file key, the file nonce and the process ID (PID) of the application that opened the file. Details about the key and nonce are given in Chapter 6.

    class Items
    {
        public string name;
        public int size;
        public byte[] key;
        public byte[] nonce;
        public int PID;
    }

Figure 4: Items class source code

4.2.2 CreateFile() function

    public int CreateFile(String filename, FileAccess access, FileShare share,
        FileMode mode, FileOptions options, DokanFileInfo info)
    {
        // URLs use "/" as the path delimiter, whereas Windows uses "\".
        string path = HttpUtility.UrlPathEncode(filename.Replace("\\", "/"));
        info.Context = count_++;
        if (path.Equals("/"))
        {
            info.IsDirectory = true;
            return 0;
        }
        if (!filetable.ContainsKey(path))
        {
            // Ask the opening application for its token over the named pipe.
            ulong PID = GetNamedPipe(path);
            if (PID == 0 || PID != info.ProcessID)
                return -DokanNet.ERROR_FILE_NOT_FOUND;
        }
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(host_ + path);
            // Request headers must be set before the request is sent.
            request.Accept = "text/plain";
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                if (response.StatusCode == HttpStatusCode.OK) return 0;
                else return -DokanNet.ERROR_FILE_NOT_FOUND;
            }
        }
        catch
        {
            return -DokanNet.ERROR_FILE_NOT_FOUND;
        }
    }

Figure 5: CreateFile() function code

When the application creates a file handle, it invokes this function. The first thing the function does is convert the backslashes in the filename to slashes. This is necessary because URLs use the slash as the delimiter instead of the backslash used by Windows. The function then fills the file Context with the counter and increments the counter. After that, it checks whether the application is opening a file or the root directory. If it is opening the root directory, the function just sets the IsDirectory field of the Dokan file handle to true and returns 0. Otherwise, it checks whether the filename key exists in filetable. If the key does not exist, it invokes the GetNamedPipe() method. GetNamedPipe() sends a request to the pipe server, which is the application that opened the file. If the application replies with a proper token, GetNamedPipe() fills filetable with the received data and returns the application PID. Otherwise it returns 0. GetNamedPipe() is described in more detail in the Named Pipe section of the Secure Channels subchapter.

If the returned PID is 0, or the PID is not the same as the PID of the application that opened the file, CreateFile() returns a file-not-found error. After that, CreateFile() accesses the file header on the web server to verify that the file exists there. If the file exists, the function returns 0; otherwise it returns a file-not-found error.


4.2.3 OpenDirectory() function

Virtually, the file system is empty: it does not allow access by any application other than the legitimate application, and the legitimate application already knows which files exist. In other words, applications are not allowed to open directories. However, to avoid problems when opening a file, OpenDirectory() always returns 0 (success).


4.2.4 GetFileInformation() function

    public int GetFileInformation(
        String filename,
        FileInformation fileinfo,
        DokanFileInfo info)
    {
        string path = HttpUtility.UrlPathEncode(filename.Replace("\\", "/"));
        if (!filetable.ContainsKey(path)) return -1;
        Items file = (Items)filetable[path];
        // type: the file attribute kept in Items (not listed in Figure 4).
        fileinfo.Attributes = file.type;
        // File times are not significant here, so the current time is used.
        fileinfo.CreationTime = DateTime.Now;
        fileinfo.LastAccessTime = DateTime.Now;
        fileinfo.LastWriteTime = DateTime.Now;
        fileinfo.Length = file.size;
        return 0;
    }

Figure 6: GetFileInformation() code

This function is invoked when the application that opened the file tries to get the file information. GetFileInformation() takes the file information from the filetable hashtable, fills in the fileinfo parameter and returns 0. As the file times are not important, they are simply set to the current time. If the requested filename is not available in the hashtable, the function returns -1 (error).


4.2.5 ReadFile() function

    public int ReadFile(String filename, Byte[] buffer, ref uint readBytes,
        long offset, DokanFileInfo info)
    {
        string path = HttpUtility.UrlPathEncode(filename.Replace("\\", "/"));
        if (!filetable.ContainsKey(path)) return -1;
        Items file = (Items)filetable[path];
        int filesize = file.size;
        byte[] key = file.key;
        byte[] nonce = file.nonce;
        if (offset >= filesize)
        {
            // Reading at or past the end of the file: succeed with 0 bytes.
            readBytes = 0;
            return 0;
        }
        ... calculate block counts ...
        ... get position of block in file ...
        ... find new offset ...
        byte[] encryptedData = AccessData(path, offset, bytesize);
        byte[] decryptedData = Decrypt(encryptedData, firstBlock, key, nonce,
            filesize, offsetInFirstBlock);
        Array.Copy(decryptedData, buffer, decryptedData.Length);
        readBytes = (uint)decryptedData.Length;
        return 0;
    }

Figure 7: ReadFile() function code

This is the most important function, as almost all the ideas of this research go into it. Like the other callback functions, it converts the backslashes in the filename variable into slashes. After that it tries to get the meta information from the hashtable. If this fails, it returns -1 (error). The meta information used in this function is the file size, the file key and the file nonce. It then verifies whether the offset is still within the file size. If it is not, the function returns 0 (success) with readBytes set to 0. This verification is vital to avoid errors when accessing data on the web server.

After the necessary verifications, it calculates the block number for the given offset and how much data needs to be accessed to decrypt it properly. When these calculations are done, it uses the results to access the data portion from the web server. The AccessData() function is described in Section 4.4. AccessData() returns an array of bytes, which is placed in the encryptedData variable. The Decrypt() function is then called to decrypt the obtained data portion, taking encryptedData, the first block number, the file key, the file nonce, the file size and the offset within the first block as parameters. Details of Decrypt() are given in Section 6.5.

The returned array of decrypted bytes is copied into buffer using the length of decryptedData, which is in some cases shorter than the buffer length. The length of decryptedData is also returned as readBytes.
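The block calculation that is elided in Figure 7 can be illustrated with a minimal sketch. This is not the code used in our module: the class BlockRange, the struct Aligned and the parameter paddedSize (the size of the encrypted file on the server) are names introduced here for illustration only. The sketch assumes 16-byte blocks and shows how a requested range is widened to whole blocks.

```csharp
using System;

public static class BlockRange
{
    public const int BlockSize = 16; // AES block size in bytes

    // Result of aligning a read request to cipher-block boundaries.
    public struct Aligned
    {
        public long FirstBlock;        // index of the first block to fetch
        public long AlignedOffset;     // byte offset of that block in the file
        public int OffsetInFirstBlock; // where the requested data starts inside it
        public int ByteSize;           // number of encrypted bytes to fetch
    }

    // paddedSize is assumed to be a multiple of BlockSize.
    public static Aligned Align(long offset, int length, long paddedSize)
    {
        Aligned a;
        a.FirstBlock = offset / BlockSize;
        a.AlignedOffset = a.FirstBlock * BlockSize;
        a.OffsetInFirstBlock = (int)(offset - a.AlignedOffset);
        long end = offset + length; // exclusive end of the request
        // Round the end up to the next block boundary, but not past the file.
        long alignedEnd = (end + BlockSize - 1) / BlockSize * BlockSize;
        if (alignedEnd > paddedSize) alignedEnd = paddedSize;
        a.ByteSize = (int)(alignedEnd - a.AlignedOffset);
        return a;
    }
}
```

For a request of 100 bytes at offset 1000 in a file padded to 1040 bytes, this yields block 62, an aligned offset of 992, an offset of 8 within the first block and 48 bytes to fetch.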



4.3 Secure Channels


It is vital to prevent malicious applications from opening the protected files. To realize this, a
file needs to be encrypted and the legitimate application must pass a token

to the VFS module,
which is needed to decrypt the requested file. To pass the token, a secure channel between the
VFS module and the application needs to be established. We are considering two ways to
implement this: a named pipe and a filename extension.


4.3.1 Named Pipe

A named pipe is a named, one-way or duplex pipe (shared memory) for communication between the pipe server and one or more pipe clients. All instances of a named pipe share the same pipe name, but each instance has its own buffers and handles, and provides a separate conduit for client/server communication. In our case, the application acts as the pipe server and creates a named pipe that can be accessed by the VFS module.

    public ulong GetNamedPipe(String filename)
    {
        IInterProcessConnection clientConnection = null;
        try
        {
            clientConnection = new ClientPipeConnection("PipeName", ".");
            clientConnection.Connect();
            clientConnection.Write(filename);
            string base64data = clientConnection.Read();
            clientConnection.Close();
            ... omitted (decoding base64data, split it, decrypt and put it into hashtable) ...
            return PID;
        }
        catch
        {
            // The connection is still null if the constructor itself failed.
            if (clientConnection != null) clientConnection.Dispose();
            return 0;
        }
    }

Figure 8: Part of the GetNamedPipe() code

.NET itself does not provide a library for named pipes, so we use a freely available named pipe library for C# []. Upon execution, the legitimate application creates a named pipe and listens on it. As shown in Figure 8, whenever the application opens a file on the file system, the user-level VFS module connects to the pipe named “PipeName” and sends a request to the application with the filename as the parameter. The application replies with a token consisting of the filename, the file size, the file key, the file nonce and its own PID. All these elements are encrypted, converted to base64 strings and concatenated with delimiters, which makes them easy to transfer across the named pipe. Upon receiving the response from the application, this function reverses the procedure performed by the application: it splits the string, converts the parts back to byte arrays and decrypts them. After that it puts the data into the filetable hashtable. If the data is successfully inserted into the hashtable, the function returns the PID. If any error occurs, it returns 0. Possible errors are that the string cannot be split, that data is missing (for example, there is no nonce or no key in the string), or that the received key is not long enough.
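The validation described above can be sketched as follows. This is only an illustration under stated assumptions, not the code omitted from Figure 8: the '|' delimiter, the field order and the names TokenParser and ParsedToken are hypothetical, and the decryption of the individual elements is left out. The sketch shows the three error cases (string cannot be split, data missing, key too short), each of which results in a rejected token.

```csharp
using System;

public static class TokenParser
{
    public sealed class ParsedToken
    {
        public string Name;
        public int Size;
        public byte[] Key;
        public byte[] Nonce;
        public int Pid;
    }

    // Returns null when the reply is malformed, so the caller can return 0.
    public static ParsedToken Parse(string reply)
    {
        if (reply == null) return null;
        // '|' is an assumed delimiter; it never occurs in base64 text.
        string[] parts = reply.Split('|');
        if (parts.Length != 5) return null; // cannot be split as expected
        try
        {
            ParsedToken t = new ParsedToken();
            t.Name = parts[0];
            t.Size = int.Parse(parts[1]);
            t.Key = Convert.FromBase64String(parts[2]);
            t.Nonce = Convert.FromBase64String(parts[3]);
            t.Pid = int.Parse(parts[4]);
            // AES keys are 128, 192 or 256 bits; the nonce is 64 bits.
            bool keyOk = t.Key.Length == 16 || t.Key.Length == 24 || t.Key.Length == 32;
            if (!keyOk || t.Nonce.Length != 8) return null;
            return t;
        }
        catch (FormatException) { return null; }   // bad number or bad base64
        catch (OverflowException) { return null; } // number out of range
    }
}
```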


4.3.2 Filename Extension

    public int CreateFile(String filename, FileAccess access, FileShare share,
        FileMode mode, FileOptions options, DokanFileInfo info)
    {
        string path = HttpUtility.UrlPathEncode(filename.Replace("\\", "/"));
        info.Context = count_++;
        if (path.Equals("/"))
        {
            info.IsDirectory = true;
            return 0;
        }
        // Split "name;token" at the last semicolon.
        int semiColon = path.LastIndexOf(";");
        if (semiColon > -1) path = path.Substring(0, semiColon);
        if (!filetable.ContainsKey(path))
        {
            // The index is taken from filename itself, because URL encoding
            // can shift character positions in path.
            string token = filename.Substring(filename.LastIndexOf(';') + 1);
            ... convert token to bytes arrays, decrypts it and put it into the hashtable ...
            ... omitted ...
        }
        try
        {
            ... omitted ...

Figure 9: CreateFile() function code when using filename extension

In this method, a token is passed with the filename upon opening a file. For example, when opening a file named “sports.wmv”, the legitimate application must open “sports.wmv;MithJmRCTHVCNzIzcyhEZ3N0S0orWmNAdw==” instead, where “MithJmRCTHVCNzIzcyhEZ3N0S0orWmNAdw==” is the base64-encoded token and the semicolon is the delimiter.

Notice that with this method the CreateFile() function is slightly different, as shown in Figure 9. Instead of calling GetNamedPipe(), it splits the requested filename into the real filename and the token at the last semicolon. Other callback functions also need this split if the filename extension method is used. After that, it follows the same procedure as GetNamedPipe(): it splits the token using a special delimiter, converts the base64 strings to byte arrays, decrypts them and puts the result into the hashtable.

However, this method has a limitation. In Windows, a filename is limited to 255 characters in Windows XP and 260 characters in Windows Vista. This can prevent the application from opening the file, since the token itself can exceed 40 characters.
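The split and the length limitation can be illustrated with a small sketch. The class FilenameToken and its methods are names introduced here for illustration, and the limit is passed in as a parameter rather than taken from the operating system.

```csharp
using System;

public static class FilenameToken
{
    // Splits "name;token" at the last semicolon.
    // Returns false when no token is present.
    public static bool TrySplit(string requested, out string name, out string token)
    {
        int semi = requested.LastIndexOf(';');
        if (semi < 0 || semi == requested.Length - 1)
        {
            name = requested;
            token = null;
            return false;
        }
        name = requested.Substring(0, semi);
        token = requested.Substring(semi + 1);
        return true;
    }

    // True when the real filename plus delimiter plus token fits in the limit.
    public static bool FitsInLimit(string name, string token, int limit)
    {
        return name.Length + 1 + token.Length <= limit;
    }
}
```

With the example above, TrySplit separates “sports.wmv” from the 36-character token, and FitsInLimit shows how quickly a long path and a larger token approach the 255-character limit.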

4.4 Performance Issues

In file systems, performance always matters. To avoid adding high latency to the whole system, we perform several optimizations.


    if (c == null || !c.Connected)
    {
        c = new TcpClient(host_, port_);
        s = (Stream)c.GetStream();
        se = new SimpleEncoding();
        r = new StreamReader(s, se);
        w = new StreamWriter(s, se);
    }
    w.WriteLine("GET " + HttpUtility.UrlPathEncode(path) + " HTTP/1.1");
    w.WriteLine("HOST: " + host_);
    w.WriteLine("Accept: */*");
    w.WriteLine("Keep-Alive: 1000");
    w.WriteLine("Connection: keep-alive");
    w.WriteLine("range: bytes=" + offset + "-" + (offset + bytesize));
    w.WriteLine("");
    w.Flush();
    WebHeaderCollection h = new WebHeaderCollection();
    while (true)
    {
        try
        {
            string str = r.ReadLine();
            if (str.Length == 0) break;
            if (str.IndexOf(":") != -1) h.Add(str);
        }
        catch (Exception)
        {
            ... omitted ... (reconnected if the connection closed)
        }
    }
    int len = int.Parse(h[HttpRequestHeader.ContentLength]);
    char[] tmp = new char[len];
    int offset2 = 0;
    while (true)
    {
        int ret = r.Read(tmp, offset2, tmp.Length - offset2);
        if (ret <= 0) break;
        offset2 += ret;
        if (offset2 >= len) break;
    }
    byte[] data = se.GetBytes(tmp);
    return data;

Figure 10: Portion of AccessData() function code


4.4.1 Pooled Connections

When an application accesses data, it usually reads small chunks from random positions in the file. This leads to many ReadFile() requests in a very short time; if a new connection were created each time ReadFile() is called, every request would pay the cost of establishing a connection. To avoid this delay, we use pooled connections to the server. In other words, the VFS module reuses the same connection to access several portions of data from the same file.

Since no method to create pooled connections is available in .NET, we created one ourselves using the TcpClient class. We use a custom SimpleEncoding class to detect the stream encoding and return the proper supported encoding. The stream returned by GetStream() is wrapped in a StreamWriter to send the request and in a StreamReader to read the response. In Figure 10, the WriteLine() calls send the request header to the web server. After that, the AccessData() function reads the response header from the web server in the first while loop. We use try and catch to handle errors caused by a lost connection: if the connection is lost due to a timeout or because the keep-alive request limit has been reached, we re-establish the connection. Finally, the function reads the response body and returns it.
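As an illustration of the request header, the following minimal sketch builds a ranged GET request for a persistent connection. The class RangeRequest and its parameters are hypothetical names for this example. One detail is worth stating explicitly: in HTTP/1.1 both byte positions of a Range header are inclusive, so a request for byteCount bytes starting at offset ends at offset + byteCount - 1, and a server that honours the range answers with status 206 (Partial Content).

```csharp
using System;
using System.Text;

public static class RangeRequest
{
    // Builds the text of a ranged GET request for a keep-alive connection.
    public static string Build(string host, string path, long offset, int byteCount)
    {
        if (offset < 0) throw new ArgumentOutOfRangeException("offset");
        if (byteCount <= 0) throw new ArgumentOutOfRangeException("byteCount");
        long last = offset + byteCount - 1; // the last byte position is inclusive
        StringBuilder sb = new StringBuilder();
        sb.Append("GET ").Append(path).Append(" HTTP/1.1\r\n");
        sb.Append("Host: ").Append(host).Append("\r\n");
        sb.Append("Connection: keep-alive\r\n");
        sb.Append("Range: bytes=").Append(offset).Append('-').Append(last).Append("\r\n");
        sb.Append("\r\n"); // an empty line ends the header
        return sb.ToString();
    }
}
```

Requesting 48 bytes at offset 992 therefore produces the header line "Range: bytes=992-1039".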


4.4.2 Cache and Prefetch

Since we implement a decryption capability, data needs to be downloaded in units of the cipher block size instead of the requested buffer size. The VFS module therefore downloads slightly more data than the requested buffer size. Data chunks are cached first and then decrypted before the module sends them to the application.

In addition, when playing audio and video files, either from the beginning or after seeking within the file, data is usually accessed sequentially. Therefore, we improve the performance by prefetching. The algorithm guesses the next data, fetches it and stores it in memory. Each time the user skips backward or forward, a new prefetching session is started. To support prefetching properly, we run the prefetch function in a dedicated thread.


Chapter 5

Server-side Programs

For the web server, any web server application that supports HTTP version 1.1 is usable. These include Apache, Microsoft IIS, lighttpd, etc. HTTP 1.1 is vital because it provides what we need to make random access possible: since version 1.1, HTTP supports persistent connections (keep-alive) and range requests (byte serving) []. Range requests are needed for random access, and persistent connections are essential for pooled connections.


5.1 Persistent Connections

In HTTP/0.9 and 1.0, the connection is closed after a single request/response pair. HTTP/1.1 introduced a keep-alive mechanism, in which a connection can be reused for more than one request. Such persistent connections reduce lag considerably, because the client does not need to re-negotiate the TCP connection after the first request has been sent. The advantages of persistent connections are:

• less CPU and memory usage (because fewer connections are open simultaneously)
• enables HTTP pipelining of requests and responses
• reduced network congestion (fewer TCP connections)
• reduced latency in subsequent requests (no handshaking)
• errors can be reported without the penalty of closing the TCP connection


5.2 Byte serving

Byte serving is the process of sending only a portion of an HTTP/1.1 message from a server to a client. Clients request byte serving in cases where a large file has been only partially delivered, or where only a limited portion of the file in a particular range is needed. Byte serving is therefore a method of bandwidth optimization. In the HTTP/1.0 standard, clients were only able to request an entire document. With byte serving, clients may choose to request any portion of the resource. One advantage of this capability is that when a large, properly formatted media file is being requested, the client may be able to request just the portions of the file known to be of interest.


Chapter 6

Encryption and Decryption

The encryption we use in our mechanism is the Advanced Encryption Standard (AES) in Counter (CTR) mode. We use CTR mode because it is suitable for random access.


6.1 Advanced Encryption Standard

Advanced Encryption Standard (AES), also known as Rijndael, is a block cipher adopted as an encryption standard by the U.S. government. It has been analyzed extensively and is now used widely worldwide, as was the case with its predecessor, the Data Encryption Standard (DES). AES was announced by the National Institute of Standards and Technology (NIST) as U.S. FIPS PUB 197 (FIPS 197) on November 26, 2001 after a 5-year standardization process. It became effective as a standard on May 26, 2002. As of 2006, AES is one of the most popular algorithms used in symmetric key cryptography.

The cipher was developed by two Belgian cryptographers, Joan Daemen and Vincent Rijmen, and submitted to the AES selection process under the name "Rijndael", a portmanteau of the names of the inventors. AES is not precisely Rijndael (although in practice the names are used interchangeably), as Rijndael supports a larger range of block and key sizes: AES has a fixed block size of 128 bits and a key size of 128, 192, or 256 bits, whereas Rijndael can be specified with key and block sizes in any multiple of 32 bits, with a minimum of 128 bits and a maximum of 256 bits. Due to the fixed block size of 128 bits, AES operates on a 4×4 array of bytes, termed the state (versions of Rijndael with a larger block size have additional columns in the state).


6.2 Counter mode

Counter mode (CTR mode) is a block cipher mode of operation which turns a block cipher into a stream cipher. It generates the next keystream block by encrypting successive values of a "counter". The counter can be any simple function which produces a sequence that is guaranteed not to repeat for a long time, although an actual counter is the simplest and most popular. CTR mode has similar characteristics to Output Feedback (OFB) mode, but also allows random access during decryption, and is believed to be as secure as the block cipher being used. The initialization vector (IV) in this mode is the combination of the nonce and the counter. The nonce and the counter can be concatenated, added, or XORed together to produce the unique counter block, which we use as the IV for encryption. CTR mode is also well suited to operation on a multi-processor machine, where blocks can be encrypted in parallel.
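The random access property can be illustrated with a minimal sketch of textbook CTR mode: the counter block (an 8-byte nonce followed by an 8-byte block index) is encrypted with AES to produce a keystream block, which is XORed with the data. The class CtrSketch and its method names are introduced here for illustration; the sketch uses the Aes class of the .NET library in ECB mode only as the raw block cipher, and it is not claimed to produce output that is byte-compatible with our encryption software.

```csharp
using System;
using System.Security.Cryptography;

public static class CtrSketch
{
    // Counter block: 8-byte nonce followed by the 8-byte block index.
    public static byte[] CounterBlock(byte[] nonce, long blockIndex)
    {
        byte[] block = new byte[16];
        Array.Copy(nonce, 0, block, 0, 8);
        Array.Copy(BitConverter.GetBytes(blockIndex), 0, block, 8, 8);
        return block;
    }

    // Encrypts or decrypts data that starts at byte startOffset of the file.
    // In CTR mode both directions are the same XOR with the keystream.
    public static byte[] Transform(byte[] key, byte[] nonce, long startOffset, byte[] data)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Mode = CipherMode.ECB;       // raw block cipher, one block at a time
            aes.Padding = PaddingMode.None;
            aes.Key = key;
            using (ICryptoTransform enc = aes.CreateEncryptor())
            {
                byte[] output = new byte[data.Length];
                byte[] keystream = new byte[16];
                long currentBlock = -1;
                for (int i = 0; i < data.Length; i++)
                {
                    long pos = startOffset + i;
                    long blockIndex = pos / 16;
                    if (blockIndex != currentBlock)
                    {
                        // A new block is reached: derive its keystream block.
                        byte[] counter = CounterBlock(nonce, blockIndex);
                        enc.TransformBlock(counter, 0, 16, keystream, 0);
                        currentBlock = blockIndex;
                    }
                    output[i] = (byte)(data[i] ^ keystream[(int)(pos % 16)]);
                }
                return output;
            }
        }
    }
}
```

Because the keystream for a block depends only on the key, the nonce and the block index, a slice taken from the middle of the ciphertext can be decrypted by calling Transform() with the offset of that slice, without touching the preceding data.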

6.1.1 Padding

Because AES works on units of a fixed size (16 bytes) but the original data comes in a variety of lengths, this mechanism requires that the final block be padded before encryption. Several padding schemes exist. The simplest is to add null bytes to the original data to bring its length up to a multiple of the block size, but care must be taken that the original length of the data can be recovered; this is so, for example, if the original data is a C-style string which contains no null bytes except at the end.


6.3 Token

In our mechanism, a token is a combination of the nonce, the key and the original file size. The token may also contain any additional useful data. The original file size is included because the VFS module needs it when decrypting the file. Both the nonce and the key are generated randomly by the encryption software for each file, which provides stronger protection against malicious access. The token size varies, depending on the original file size and the key size used. The supported key sizes are 128 bits, 192 bits and 256 bits. The nonce size is 64 bits, as it is later concatenated with the counter bytes.


[Figure 11 diagram: three Block Cipher Encryption stages, each taking the Nonce (c2b3f342…), a Counter (00000000, 00000001, 00000002) and the Key, and turning Original bytes into Ciphered bytes]

Figure 11: Counter mode encryption


6.4 Encryption software

    string filename = args[0];
    string filename2 = "enc_" + filename;
    FileInfo info = new FileInfo(filename);
    long size = info.Length;
    // Number of 16-byte blocks, rounded up (e.g. 1030 bytes -> 65 blocks).
    int blockCount = (int)((size + 15) / 16);
    try
    {
        FileStream fs = File.OpenRead(filename);
        FileStream fs2 = File.OpenWrite(filename2);
        int offset = 0;
        byte[] buffer = new byte[16];
        byte[] encrypted = new byte[16];
        byte[] nonce = new byte[8];
        byte[] iv = new byte[16];
        RijndaelManaged transform = new RijndaelManaged();
        transform.Padding = PaddingMode.Zeros;
        transform.GenerateIV();
        transform.GenerateKey();
        byte[] key = transform.Key;
        // The nonce is the first 8 bytes of the generated IV.
        Array.Copy(transform.IV, nonce, 8);
        for (int i = 0; i < blockCount - 1; i++)
        {
            // IV = nonce (8 bytes) followed by the block counter (8 bytes).
            byte[] counter = BitConverter.GetBytes((long)i);
            Array.Copy(nonce, 0, iv, 0, 8);
            Array.Copy(counter, 0, iv, 8, 8);
            fs.Seek(offset, SeekOrigin.Begin);
            offset = fs.Read(buffer, 0, buffer.Length) + offset;
            transform.IV = iv;
            ICryptoTransform encrypt = transform.CreateEncryptor();
            encrypt.TransformBlock(buffer, 0, 16, encrypted, 0);
            fs2.Write(encrypted, 0, 16);
        }
        // The last block is zero-padded to 16 bytes by TransformFinalBlock().
        byte[] counter2 = BitConverter.GetBytes((long)blockCount - 1);
        Array.Copy(nonce, 0, iv, 0, 8);
        Array.Copy(counter2, 0, iv, 8, 8);
        fs.Seek(offset, SeekOrigin.Begin);
        offset = fs.Read(buffer, 0, buffer.Length);
        transform.IV = iv;
        ICryptoTransform encrypt2 = transform.CreateEncryptor();
        byte[] lastBLock = encrypt2.TransformFinalBlock(buffer, 0, offset);
        fs2.Write(lastBLock, 0, 16);
        fs.Close();
        fs2.Close();
    }

Figure 12: Portion of Encrypt() function

Our encryption software is simple. We use the RijndaelManaged class in the .NET library to simplify the encryption process. When it encrypts a file, it generates a random nonce and key using the GenerateIV() and GenerateKey() methods respectively. Since the RijndaelManaged class has no notion of a nonce, we use the IV truncated to 8 bytes as the nonce. Since the .NET library does not provide a counter mode implementation, we must implement the mode ourselves. The software divides the given file into blocks of 128 bits (16 bytes) and calculates the block count. The division does not always yield an integer; when it yields a fraction, the result is rounded up (away from zero). For example, a file of 1030 bytes divided by 16 gives 64.375; instead of being rounded to 64, this is rounded up to 65.

As shown in the for loop of Figure 12, the function reads the original file, encrypts it and writes it to the destination file. Each iteration starts by generating a counter value: it casts i to a long and converts it into a byte array. The reason for casting i to long is to make the counter 8 bytes long. The counter byte array is concatenated with the nonce byte array to produce a temporary IV byte array, which is set in the transform.IV property. In each iteration, 16 bytes of data are read from the original file and offset is increased by the number of bytes read. We use the ICryptoTransform interface to perform the block transformation with the TransformBlock() method. Once a block has been encrypted, it is immediately written to the destination file. This continues until the second-to-last block. When the loop finishes, the function performs the transformation of the last block.

In this way, each block is assigned an incrementing counter value, and the software encrypts each block using the key, the nonce and the counter. Since the AES algorithm can only encrypt 16-byte blocks, the last block needs to be padded if it is shorter than 16 bytes. In our mechanism we use zero padding, which adds zero bytes to fill the block. For example, a file of 1030 bytes is split into 64 blocks of 16 bytes and a final block of 6 bytes. The final block is padded with 10 zero bytes (0x00) to make it 16 bytes. The resulting file size is 1040 bytes.
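The arithmetic of this example can be checked with a short sketch. The class name PaddingMath is introduced here for illustration. It uses integer ceiling division; note that Math.Round() with MidpointRounding.AwayFromZero only affects values that lie exactly halfway between two integers, so it would round 64.375 down to 64 and is not a substitute for rounding up.

```csharp
using System;

public static class PaddingMath
{
    public const int BlockSize = 16; // AES block size in bytes

    // Number of blocks needed to hold size bytes (ceiling division).
    public static long BlockCount(long size)
    {
        return (size + BlockSize - 1) / BlockSize;
    }

    // Size of the encrypted file after zero padding.
    public static long PaddedSize(long size)
    {
        return BlockCount(size) * BlockSize;
    }

    // Number of zero bytes appended to the last block.
    public static int PaddingBytes(long size)
    {
        return (int)(PaddedSize(size) - size);
    }
}
```

For a 1030-byte file, BlockCount() gives 65, PaddedSize() gives 1040 and PaddingBytes() gives 10, matching the example above.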

After finishing the encryption, the software uploads the encrypted file to the web server and registers the filename, file size, file key and nonce with the authentication server.


6.3.1 Multithreads

We take advantage of current technology and of the suitability of CTR mode for multithreading. The whole file is divided into blocks, and the whole set of blocks is divided again into as many subsets as the desired thread count. For example, if we want to use 4 threads, a 1030-byte file is divided into a set of 65 blocks. This block set is divided again into 4 subsets. The first, second and third subsets contain 16 blocks each, while the fourth subset contains 17 blocks. The counter value for the first block of each subset is derived from the number of blocks in the previous subsets. In this case:

Blocks subset   First counter value   Blocks contained   Calculation
First           1                     16                 0 + 1
Second          17                    16                 16 + 1
Third           33                    16                 16 + 17
Fourth          49                    17                 16 + 33

Table 1: Multithreads calculation

All four threads execute simultaneously, resulting in a faster encryption process.
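The division into subsets can be sketched as follows. The class Partition and the struct Subset are names introduced here for illustration, and the sketch assumes that the last subset receives the remaining blocks. Counters in the sketch are zero-based, as in the encryption loop of Figure 12; the first counter values in Table 1 are counted from 1, so they are larger by one.

```csharp
using System;

public static class Partition
{
    public struct Subset
    {
        public long FirstCounter; // zero-based counter of the first block
        public long Blocks;       // number of blocks in this subset
    }

    // Splits blockCount blocks into one subset per thread.
    public static Subset[] Split(long blockCount, int threads)
    {
        if (threads <= 0) throw new ArgumentOutOfRangeException("threads");
        Subset[] result = new Subset[threads];
        long baseSize = blockCount / threads;
        long next = 0;
        for (int t = 0; t < threads; t++)
        {
            // The last subset takes whatever is left over.
            long n = (t == threads - 1) ? blockCount - next : baseSize;
            result[t].FirstCounter = next;
            result[t].Blocks = n;
            next += n;
        }
        return result;
    }
}
```

For 65 blocks and 4 threads this gives subsets of 16, 16, 16 and 17 blocks that start at the zero-based counters 0, 16, 32 and 48. Each thread can then encrypt its subset independently, because the counter block of every block is known in advance.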


6.5 Decryption

    private byte[] Decrypt(byte[] sourceFile, int startBlock, byte[] key,
        byte[] nonce, int filesize, int offsBl)
    {
        int blockCount = sourceFile.Length / 16;
        int lastBlock = 0;
        // Total number of 16-byte blocks in the file, rounded up.
        int totalBlock = (filesize + 15) / 16;
        int padding = 0;
        byte[] lastBlk = null;
        if ((startBlock * 16 + sourceFile.Length) > filesize)
        {
            // sourceFile contains the padded last block of the file.
            lastBlock = 1;
            padding = totalBlock * 16 - filesize;
        }
        try
        {
            int offset = 0;
            byte[] iv = new byte[16];
            byte[] decrypted = new byte[sourceFile.Length - (lastBlock * 16)];
            int i = startBlock;
            RijndaelManaged transform = new RijndaelManaged();
            transform.Padding = PaddingMode.Zeros;
            transform.Key = key;
            for (int k = 0; k < blockCount - lastBlock; k++)
            {
                // IV = nonce (8 bytes) followed by the block counter (8 bytes).
                byte[] counter = BitConverter.GetBytes((long)i);
                Array.Copy(nonce, 0, iv, 0, 8);
                Array.Copy(counter, 0, iv, 8, 8);
                transform.IV = iv;
                ICryptoTransform decrypt = transform.CreateDecryptor();
                decrypt.TransformBlock(sourceFile, offset, 16, decrypted, offset);
                offset = offset + 16;
                i++;
            }
            int dataLen = sourceFile.Length - offsBl - padding;
            byte[] data = new byte[dataLen];
            Array.Copy(decrypted, offsBl, data, 0, decrypted.Length - offsBl);
            if (lastBlock == 1)
            {
                byte[] counter = BitConverter.GetBytes((long)i);
                Array.Copy(nonce, 0, iv, 0, 8);
                Array.Copy(counter, 0, iv, 8, 8);
                transform.IV = iv;
                ICryptoTransform decryptLast = transform.CreateDecryptor();
                // Decrypt the whole 16-byte block, then keep only the unpadded bytes.
                lastBlk = decryptLast.TransformFinalBlock(sourceFile, offset, 16);
                Array.Copy(lastBlk, 0, data, decrypted.Length - offsBl, 16 - padding);
            }
            return data;
        }
        ... omit ...

Figure 13: Decrypt() method code


The decryption process is performed in the user-level module of the VFS, within the ReadFile() function. Instead of placing the code directly in ReadFile(), we made an independent method, Decrypt(), which is called from ReadFile(). The parameters of Decrypt() are the byte array sourceFile, which holds the encrypted data portion; the byte arrays key and nonce; the integer startBlock, which is the counter of the first block; filesize, which is the original file size; and offsBl, which is the offset within the first block.

On initialization, the function counts the blocks in the given sourceFile, checks whether sourceFile contains the last block of the file and, if so, calculates the padding. Afterward it begins the block transformation by initializing the RijndaelManaged class. Using a for statement, it loops over the blocks. In each iteration, it generates the IV from the given nonce and an incrementing counter that begins with startBlock, and decrypts one block. After the loop, it copies the decrypted data into the destination byte array, data.

If the last block is present, the loop count is decreased by 1 and the last block is transformed after the loop. The decrypted last block is copied into data, and data is returned.


Chapter 7

Experiments


Some experiments have been done to demonstrate the capability of this mechanism. For the experiments, we use one PC to act as both the administrator PC (used to upload files to the web server and register them with the authentication server) and the user PC (used to access the files on the web server).


7.1 Experiment environment

• User PC/Admin PC:
  o CPU: Intel Core2 Quad 2.4 GHz
  o RAM: 4 GB
  o Ethernet: 100 Mbps connection
  o OS: MS Windows XP SP2
• Web Server:
  o CPU: Intel Pentium 4 3.0 GHz with Hyper-Threading
  o RAM: 1 GB
  o Ethernet: 100 Mbps connection
  o OS: Ubuntu Linux 7.10
  o Software: Apache 2.2.4 as web server software and ProFTPd as FTP server

Both the user PC/admin PC and the web server are located in the same intranet. We used a quad-core PC for the admin PC to test the full capability of the encoding software.


7.2 Encryption Rate

To analyse the encryption rate, we experimented with files of several sizes: 5, 10, 50 and 100 MB. Each file was encrypted using 1, 2 and 4 threads. We used the user PC to run the experiment. As the table and the graph in Figure 14 show, the encryption rate nearly doubles when the thread count is doubled. The content of the file (whether it is a media file, text file or compressed file) does not affect the encryption rate.
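As a worked check of the scaling claim, the short Python sketch below converts measured durations into encryption rates and speedups. The durations are the data labels of the Figure 14 graph; assigning them to thread counts in the order 1, 2 and 4 threads is an assumption based on the order in which the labels appear. The function names are illustrative only.

```python
# Durations in seconds read from the data labels of Figure 14.
# Assumed mapping: thread count -> {file size in MB: duration}.
DURATIONS = {
    1: {5: 2.328125, 10: 4.40625, 50: 22.125, 100: 44.234375},
    2: {5: 1.265625, 10: 2.46875, 50: 12.109375, 100: 24.0625},
    4: {5: 0.625, 10: 1.28125, 50: 6.65625, 100: 12.796875},
}


def encryption_rate(size_mb, seconds):
    """Encryption rate in MB/s for one run."""
    return size_mb / seconds


def speedup(size_mb, threads):
    """Speedup relative to the single-threaded run for the same file size."""
    return DURATIONS[1][size_mb] / DURATIONS[threads][size_mb]


for threads in sorted(DURATIONS):
    rate = encryption_rate(100, DURATIONS[threads][100])
    print(f"{threads} thread(s): {rate:.2f} MB/s, "
          f"speedup {speedup(100, threads):.2f}x")
```

For the 100 MB file this gives a speedup of roughly 1.8 with 2 threads and roughly 3.5 with 4 threads, consistent with "nearly doubles".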






Figure 14: Graph showing encryption rate using different thread counts


7.3 Transfer rate

We benchmarked the transfer rate by comparing several methods of transferring data over the network: direct HTTP/HTTPS access using a web client, the Server Message Block (SMB) protocol, the WebDAV protocol using the WebDAV Mini Redirector built into Windows XP (shell namespace extension), and our VFS module with and without decryption.

For HTTP and HTTPS, we measured the transfer rate with the WGET [] software. For SMB, our VFS module and WebDAV, we mounted a logical drive in My Computer and used the FastCopy [] program to copy a file from the remote server to the local drive.

As the graph in Figure 15 shows, HTTP gives the fastest transfer rate, as expected. Surprisingly, the transfer rate of our VFS module is higher than that of the SMB/CIFS protocol. Prefetching in our VFS module does not improve the transfer rate, because its purpose is to reduce the number of requests. As expected, enabling decryption in the VFS module decreases the transfer rate to about 25% of the original rate. However, even if we take this figure as the maximum speed of our mechanism, it can still sustain video streaming at intermediate bit rates.

For example, if we want to stream a video file, the mechanism can still stream at up to 20 Mbit/s.
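The two figures quoted here can be reproduced with a few lines of arithmetic. The Python sketch below uses the transfer rates shown as data labels in Figure 15 and assumes that MB and Mbit use the same prefix, so that 1 MB/s equals 8 Mbit/s; the helper name is illustrative.

```python
# Transfer rates in MB/s read from the data labels of Figure 15.
TRANSFER_MB_S = {
    "VFS Module": 10.42,
    "VFS Module (decryption)": 2.64,
}


def to_mbit_per_s(mb_per_s):
    """Convert MB/s to Mbit/s (assumes 8 bits per byte, same prefix)."""
    return mb_per_s * 8


# Fraction of the plain VFS rate that remains once decryption is enabled.
ratio = TRANSFER_MB_S["VFS Module (decryption)"] / TRANSFER_MB_S["VFS Module"]

# Highest video bit rate the decrypting VFS module could sustain.
stream_limit = to_mbit_per_s(TRANSFER_MB_S["VFS Module (decryption)"])

print(f"decryption keeps {ratio:.0%} of the rate, "
      f"about {stream_limit:.1f} Mbit/s")
```

The result is about 25% of the original rate and slightly more than 21 Mbit/s, which matches the 20 Mbit/s streaming figure once some headroom is left.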


Figure 14 data: encryption duration (seconds) by file size and thread count

Filesize (MB)   1 thread     2 threads    4 threads
5               2.328125     1.265625     0.625
10              4.40625      2.46875      1.28125
50              22.125       12.109375    6.65625
100             44.234375    24.0625      12.796875



Figure 15: Graph of transfer rate using different methods

Method                     Transfer rate (MB/s)
Samba                      9.64
VFS Module                 10.42
VFS Module (prefetch)      10.51
VFS Module (decryption)    2.64
HTTP                       11.32
HTTPS                      10.84
WebDAV                     10.34

Figure 16: Graph of CPU usage of different methods (CPU usage in percent for Samba, VFS Module, VFS Module (prefetch), VFS Module (decryption), HTTP, HTTPS and WebDAV)



Random access






Chapter 8

Integration with Movie Database System of Japan Institute of Sports Sciences


This mechanism is scheduled to replace the current data distribution mechanism in the movie database system of the Japan Institute of Sports Sciences (JISS).

8.1 Current data distribution mechanism in JISS movie database system

As shown in Figure 17, the process starts with the client logging in to the user administrator (user admin) server by sending a username and password; the client then receives a list of movie IDs. The client can access a movie on the list by sending its movie ID back to the user admin server to receive a URL. The user admin server logs the request, creates a random key and stores it in a temporary table. Besides the key, the table also contains the client IP address, the movie ID and a timestamp. After that, the user admin server responds to the client PC with the URL of the proxy, which includes the key. The client connects to the proxy using the key as the filename. The proxy verifies the request by contacting the user admin server and sending the requested key and the client IP address. The user admin server then compares the key, IP address and timestamp with the temporary table. If they are valid and the table entry has not expired, the user admin server responds to the proxy server with the real URL. The proxy server then accesses the real file on the streaming server and relays it to the client's PC.
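The key handling on the user admin server can be sketched as below. This is a minimal Python illustration of the temporary-table check described above, not the JISS implementation: the class name, the 60-second lifetime, the key length and the in-memory dictionary are all assumptions made for the example.

```python
import secrets
import time

KEY_TTL_SECONDS = 60  # assumed lifetime of a temporary-table entry


class UserAdminServer:
    """Hypothetical model of the key issue/verify steps (steps 2, 4 and 5)."""

    def __init__(self, movie_urls, ttl=KEY_TTL_SECONDS, clock=time.time):
        self.movie_urls = movie_urls  # movie ID -> real URL on streaming server
        self.ttl = ttl
        self.clock = clock
        self.temp_table = {}          # key -> (client IP, movie ID, timestamp)

    def request_movie(self, client_ip, movie_id):
        """Create a random key, store it, and return the proxy URL."""
        if movie_id not in self.movie_urls:
            raise KeyError(f"unknown movie ID: {movie_id}")
        key = secrets.token_hex(16)
        self.temp_table[key] = (client_ip, movie_id, self.clock())
        return "http://proxy/" + key

    def verify(self, key, client_ip):
        """Return the real URL if key, IP and timestamp are valid, else None."""
        entry = self.temp_table.get(key)
        if entry is None:
            return None
        stored_ip, movie_id, issued_at = entry
        if stored_ip != client_ip:
            return None
        if self.clock() - issued_at > self.ttl:
            del self.temp_table[key]  # drop the expired entry
            return None
        return self.movie_urls[movie_id]
```

In this sketch the proxy would call verify() with the key taken from the requested filename and the address of the connecting client, and relay the stream only when a URL is returned.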


8.2 New data distribution mechanism proposed to JISS movie database system

The current mechanism, which was developed 2 years ago, suffers from high latency because data is relayed through a proxy. In our mechanism we simplify the system by removing the proxy server. Compared to the current mechanism, which has 4 components, our mechanism has only 3 components. As shown in Figure 18, the process starts with the client logging in to the authentication server with a username and password and receiving a list of movie IDs. When the client wants to access a movie on the list, it requests the movie filename by sending the movie ID to the authentication server, which replies with the movie's filename and a token. The client application accesses the file directly from the local virtual drive. The VFS module on the client PC opens a connection to the web server, accesses the requested file, decrypts it using the key and nonce contained in the token, and passes it to the application.
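One way the VFS module could fetch only the part of the encrypted file that an application asks for is an HTTP/1.1 Range request aligned to cipher block boundaries. The Python sketch below illustrates that idea under stated assumptions: this section does not say how the module issues its requests, the 16-byte block size is assumed, and the function names are hypothetical. The request object is only built, not sent.

```python
import urllib.request

BLOCK_SIZE = 16  # assumed cipher block size in bytes


def aligned_range(offset, length, encrypted_size, block_size=BLOCK_SIZE):
    """Expand a byte range to whole cipher blocks, clipped to the file end."""
    if offset < 0 or length <= 0 or offset >= encrypted_size:
        raise ValueError("requested range lies outside the file")
    first = (offset // block_size) * block_size
    last = ((offset + length - 1) // block_size + 1) * block_size - 1
    return first, min(last, encrypted_size - 1)


def build_range_request(url, offset, length, encrypted_size):
    """Build (but do not send) a GET request for the covering blocks."""
    first, last = aligned_range(offset, length, encrypted_size)
    request = urllib.request.Request(url)
    request.add_header("Range", f"bytes={first}-{last}")
    return request


# Example: the application reads 50 bytes at offset 100 of a 1008-byte file.
req = build_range_request("http://server/moviefile", 100, 50, 1008)
```

Fetching whole blocks lets the module decrypt the response and then discard the bytes before the requested offset, so the application sees exactly the range it asked for.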







8.3 The advantages compared to current mechanism

As the explanation above shows, the new mechanism has several advantages over the current mechanism.

Fewer components

With fewer working components, the latency of the whole system is reduced. It also lowers the overall cost of the system, as less hardware is needed.

No proxy

As the proxy is no longer involved, the movie file is downloaded directly from the web server instead of being relayed through the proxy. In this mechanism, the VFS module acts as the proxy but yields much better performance because it resides on the client PC instead of on a separate, remote server.

Figure 17: Current data distribution mechanism in movie database system of JISS

Components: Client PC, User admin server, Proxy, Streaming server, and Movie files publisher (registers movie IDs and uploads movie files).

1. Client PC to user admin server: username, password, movieID
2. User admin server to client PC: http://proxy/key
3. Client PC accesses http://proxy/key
4. Proxy to user admin server: key, client IP address
5. User admin server to proxy: http://server/moviefile
6. Proxy accesses http://server/moviefile
7. Streaming server streams the file
8. Proxy relays the streaming




Encryption

There is no encryption in the current data distribution mechanism of the JISS movie database system. Our new mechanism introduces encryption, which provides a much more secure system. For example, in the current mechanism, if the real URL of a movie file is leaked, a malicious user can gain access to the file and the file loses its confidentiality. In our mechanism, on the other hand, even if the file can be accessed, a malicious user cannot open it without the proper key.




Figure 18: New data distribution mechanism proposed to movie database system of JISS

Components: Client PC, Authentication server (user admin server), Web server (streaming server), and Movie files publisher (registers movie IDs; encrypts and uploads movie files).

1. Client PC to authentication server: username, password, movie ID
2. Authentication server to client PC: movie filename, token
3. Client PC accesses the encrypted file on the web server
4. Web server responds with the encrypted file


Chapter 9

Conclusions

In this paper, we have proposed a new data protection mechanism and described how it works. Using this mechanism, users can access protected data over HTTP with random access capability while tight security is maintained. In the future, we will evaluate the use of multiple pooled connections for each file to relieve stress on a single connection. We are also considering implementing support for FTP servers.







Acknowledgements


First and foremost, I would like to thank my supervisors, Professor Kozo Itano, Professor Yasushi Shinjo, Professor Akira Sato and Professor Hisashi Nakai of the Graduate School of Systems and Information Engineering at the University of Tsukuba, for being exceptional advisors. They never cease to amaze me with their unlimited patience and attention to detail, and with their help in improving the presentation style of this thesis. I can't imagine how I could have perfected this thesis without their support.

I am also grateful to all my lab partners in the Software Laboratory at the University of Tsukuba, especially Mr. Daiyuu Nobori, for offering their support and encouragement since the day I joined the research lab.

I owe my deepest debt of gratitude to my parents, Che Abdullah and Nik Rahmah, for providing me with the resources to succeed in life. They have constantly advised and supported me in everything that I have done. I am especially grateful to my mother for teaching me that time management, organization, and diligence are keys to success.

Finally, I would like to acknowledge the financial support that made this research possible. All my years in Japan were supported by the Public Service Department of Malaysia, and I am grateful for being given the opportunity to further my studies in a foreign country to widen my knowledge.




References

[1] Chikara Miyagi, Koji Ito, and Jun Shimizu: "Creating the SMART system - A Database for Sports Movement", The Engineering of Sport 6, Vol.3, pp.179-184, 2006.
[2] Hiroki Asakawa: Design and implementation of user-mode file system library, 2008, http://decas-dev.net/en/
[3] Microsoft Developer Network (MSDN) Documentation: DirectShow, 2008
[4] MSDN Documentation: Registering Shell Extensions, 2008
[5] Galaxy File system Toolkit: Chad Yoshikawa, 2005, http://galaxy.sourceforge.net/
[6] GMail Drive shell extension: Bjarke Viksoe, 2007, http://www.viksoe.dk/code/gmail.htm
[] Filesystem in Userspace: http://fuse.sourceforge.net/
[] Ivan Latunov: Inter-Process Communication in .NET Using Named Pipes, http://ivanweb.com/articles/namedpipes/
[] RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1, 1999
[] GNU Wget: Hrvoje Nikšić, http://www.gnu.org/software/wget/ Accessed on 18 January 2008
[] FastCopy: SHIROUZU Hiroaki, http://www.ipmsg.org/tools/fastcopy.html.en Accessed on 18 January 2008