THE EFFECTS OF THE ECONOMIC CRISIS ON THE SOFTWARE INDUSTRY

MSc in International Economic Consulting

Author: Peter Benedek Balog

Academic Supervisor:

Dr. Philipp Schröder

THE EFFECTS OF THE ECONOMIC CRISIS ON THE SOFTWARE INDUSTRY

WITH SPECIAL ATTENTION TO THE OPEN SOURCE SECTOR


Aarhus School of Business, University of Aarhus

2010.11.30.




Statement of originality

This work has not previously been submitted for a degree or a diploma in any university. To the best of my knowledge and belief, the thesis contains no material previously published or written by any other person except where due reference is made in the thesis itself.



Abstract

In recent years the Open Source Software (OSS henceforward) development movement has gained considerable attention from various academic research fields. Because the OSS development process involves developers at many different locations and organizations sharing code to develop and reform programs, it requires an interdisciplinary understanding.

Building on a review of the existing literature, in this dissertation I develop an econometric model in order to investigate different macroeconomic factors that might influence the behavior of OSS activity. By analyzing the software industry as a whole I put the results into an exploratory framework which helps their interpretation.

The panel dataset used in the analysis consists of quarterly macroeconomic figures on 25 countries from 1998 Q1 until 2010 Q2. It also includes data on four OSS project hosting sites. The model analyzes the aggregate OSS activity as well as the differences across the “forges”.

The research concluded that the software industry was able to overcome the negative effects of the recent crisis fairly quickly. The reason might be that the industry is a major driver of the R&D sector; therefore the reduced ICT budgets are likely to be replenished quickly as the economy stabilizes.

The results show that changes in the real economy have a limited effect on OSS development activity in general. However, there are considerable differences among the forges in terms of the magnitude and type of the different effects. There is a significant positive relationship between GDP growth and the activity of the individual communities.

Keywords: open source software, productivity, economic crisis, software industry, business cycle theory




Table of Contents

Abstract
1. Introduction
   1.1. Problem Statement
   1.2. Methodology
   1.3. Delimitation
   1.4. Structure
2. Literature Overview
   2.1. Open Source Phenomenon
   2.2. Business Cycle Theory
3. Analysis of the software industry
   3.1. Data
      3.1.1. FORBES Global 2000
      3.1.2. Truffle 100
      3.1.3. Global Software 100
      3.1.4. Software 500
      3.1.5. Demand side
   3.2. Conclusion of the analysis
4. Analysis of the Forges
   4.1. Data
      4.1.1. OSS dataset
      4.1.2. Productivity and performance measure
      4.1.2. Macroeconomic data
   4.1. Methodology
   4.2. Model
   4.3. Results
   4.4. Conclusions of the analyses
5. Conclusion
List of References
Appendix





List of Figures

Figure 1.1. Growth in the Number of New Projects in each Repository by Year
Figure 3.1. Number of Software Companies on Forbes2000 list
Figure 3.2. Key Figures of Software Companies on Forbes2000 list (billion $)
Figure 3.3. Revenue of the top 100 European vendors from software activity (billion €)
Figure 3.4. Revenue of the Software 500 list’s companies (billion $)
Figure 3.5. Change in IT budget by company size, 2009
Figure 3.6. IT expenditure by countries (billion $)
Figure 4.1. Number of Shared Names across each Repository
Figure 4.2. Number of new projects registered in a month to Rubyforge
Figure 4.3. Number of new projects registered in a month to Freshmeat and Rubyforge
Figure 4.4. Number of new projects registered in a month to Sourceforge

List of Tables

Table 3.1. Number of Employees by Software 500 companies
Table 3.2. Top ten companies based on Revenue / Employee
Table 4.1. Variables and their descriptions in the OSS dataset
Table 4.2. Software Productivity (ESLOC/SM) by selected Application Domains
Table 4.3. Selected records from the analysis’ dataset
Table 4.4. Macroeconomic variables used in the analysis
Table 4.5. RE Estimation Results for Models 1-4 (robust standard errors)
Table 4.6. RE Estimation Results for Models 1-4 (normal standard errors)
Table 4.6. RE Estimation Results for Models 4-10 (robust standard errors)
Table 4.7. Results for Forge wise RE estimation with (M4) regressors
Table 4.8. Results for Forge wise RE estimation with (M5) regressors
Table 4.9. Results for Forge wise RE estimation with (M6) regressors
Table 4.9. Results for Forge wise RE estimation with (M7) regressors
Table 4.9. Results for Forge wise RE estimation with (M8) regressors
Table 4.9. Results for Forge wise RE estimation with (M9) regressors
Table 4.9. Results for Forge wise RE estimation with (M10) regressors
Appendix A. Table of countries weighted by the number of active developers
Appendix B. Table of calculations of the effects’ size





1. Introduction

Over the past decade, the Open Source Software phenomenon has had a global impact on the way organizations and individuals create, distribute, acquire and use software and software-based services. OSS has challenged the conventional wisdom of the software engineering and software business communities, has been instrumental for educators and researchers, and has become an important aspect of e-government and information society initiatives. OSS is a complex phenomenon and requires an interdisciplinary understanding of its engineering, technical, economic, legal and socio-cultural dynamics.

The open source movement has attracted the interest of many academic researchers due to the success of famous OSS products like the ‘Linux’ operating system, the ‘Apache Web Server’ or the ‘Firefox’ web browser.

1.1. Problem Statement

The purpose of this thesis is to provide an answer to the following question:

How is the Software Industry affected by the 2008 Economic Crisis?

It will be answered by an analysis of the software industry's performance, with special attention to the OSS sector.

During fall 2008 the world economy experienced dramatic falls in GDP, leading to a deep recession, for various reasons such as the collapse of the US real estate bubble. Major economies like the American and the European ones even experienced negative growth rates, while slow but positive growth was fueled by the emerging economies of China and India.

According to the FLOSSmole project, an internet-based collaborative collection and analysis of free/libre/open source project data (Howison et al., 2006), the number of projects registered at the OSS hosting sites started to decrease significantly when the economic crisis began in late 2008, as Figure 1.1 shows below.

Figure 1.1. Growth in the Number of New Projects in each Repository by Year

Source: flossmole.org

1.2. Methodology

In this thesis I attempt to identify the factors that can explain the recent changes in the output of the OSS sector. I focus on those micro- and macroeconomic factors that can affect the performance of the sector and on which the recent economic recession has had a heavy effect, for instance stock prices, GDP, the level of unemployment or changes in retail sales. I will also look for alternative explanations of the shrinking number of OSS projects that do not originate in the ongoing crisis.

In contrast, I will also analyze the proprietary software sector to provide a clearer picture of the prospects of the software industry as a whole.

I strongly believe that the topic is related to the existing academic literature in macroeconomics; more narrowly, it connects to the business cycle theories, which concern the sources and the nature of macroeconomic fluctuations.

It is critical to specify what ‘software industry’ means in this paper. Due to the exponential growth of computer usage in various sectors and for different purposes, there are tens of thousands of different software products. The term ‘software’ in this analysis means application software, system software and all the necessary tools.

1.3. Delimitation

It proved very hard to gain access to reliable data on the software industry as a whole, and it was even harder in the case of the OSS sector. The author is aware that the activity level of OSS is usually measured by the number of messages generated in a given period of time, and that output in the software sector is generally measured by lines of code. Due to resource constraints, a different approach has been used instead. A detailed discussion of the data constraints and the construction of the dataset can be found in section 4.1.

1.4. Structure

The rest of the thesis is structured as follows:

Section 2 describes the evolution of the Open Source phenomenon and provides an insight into the existing academic literature on business cycle theory.

Section 3 presents an analysis of the software industry in the light of the recent economic crisis. It provides various data about the sector and presents the results as well as the conclusions of the analysis.

Section 4 consists of the analysis of the willingness to start an OSS project. It presents the model used in the thesis and discusses the dataset and the variables in detail.

Section 5 concludes on the findings of the thesis.




2. Literature Overview

2.1. Open Source Phenomenon

The Free/Open Source Software (F/OSS) phenomenon has attracted an increasing amount of attention in the academic literature in recent years. The term “open source” refers to the fact that the program source code is accessible, available and thus alterable by its user, in contrast to the source code of proprietary software, which is only distributed as compiled machine code (Bitzer and Schröder, 2006). Due to the growing number of contributions from various academic fields, a wide range of interesting issues has emerged. This section provides an account of these contributions in order to understand the state of the phenomenon.

Among the research conducted on F/OSS, individual incentives and motivations have received by far the most attention. According to a widely accepted classification of the results (Osterloh et al., 2001; Hars and Ou, 2002; Lakhani and Wolf, 2005), individuals' motivations can be grouped under two headings: individuals are driven by either intrinsic or extrinsic motivations. In the first case, according to Deci and Ryan (1985), intrinsic motivation is defined as doing an activity for its inherent satisfactions rather than for some separable consequence. The individual is moved to act for fun or challenge rather than for rewards. On the other hand, Lerner and Tirole (2002) state that a programmer only engages in a project, whether commercial or F/OSS, if she derives a net benefit from it. This means that the source of motivation depends on external factors: reputation, user needs, learning and performance improvement.

The success of OSS development was rather surprising to some scholars, since it seems to contradict “Brooks's Law”, which states that adding manpower to a late software project makes it later. OSS development, on the other hand, seems to be driven precisely by the high number of skilled developers. Raymond (1998) summarizes the main features of the OSS production mode, which generated a large amount of academic commentary: the advantages of parallel code development (Feller and Fitzgerald, 2002) and the integration of users into the production of software code, with von Hippel and von Krogh (2003) describing F/OSS as a private-collective model of innovation.

Numerous studies turned their attention to the governance and coordination structure of OSS projects. Rossi (2005) summarizes the principal findings of these studies: according to Krishnamurthy (2002) the median number of developers per project is four, while it is only one according to Healy and Sussmann (2003).

The success of the Linux operating system, which accounts for a 38% share of the server operating systems market (Bitzer 2004) despite the seemingly huge superiority of Microsoft's Windows in many fields, is referenced by an extensive number of authors. The issue of competition between open and closed software production is discussed widely (Gaudeul 2004, Economides and Katsamakas 2005, Harison and Koski 2008). An increasing number of for-profit firms base their services on the OSS phenomenon, to mention a few of the biggest: Red Hat, Cygnus, VA Linux. Meanwhile established hardware and software producers (IBM, Hewlett Packard and Oracle) have turned their attention to the OSS sector.

The studies analyze the competition in two dimensions: the quality differences and the dynamics of innovation between the competing sectors. The models suggest that the driver of innovation is meeting users' needs, which leads to a quality increase (Kuan, 2001; Bessen, 2002). Bonaccorsi and Rossi (2003) take into account the network effects and externalities in the competition. Casadesus-Masanell and Ghemawat's (2003) model of mixed-duopoly competition focuses on the influence of strategic pricing decisions on consumers' valuation, while Bitzer (2004) looks at the effects of product differentiation on the competition.

So far a very important aspect of the F/OSS phenomenon has been overlooked: the OSS licenses. These licenses play a huge role in all the previously mentioned issues: the motivation of developers, the coordination of projects, the effectiveness of the OSS development model and the competition with the commercial sector. The Open Source Definition (OSD)¹ defines the “rights that a software license must grant you to be certified as Open Source” (Perens, 1999). The main principles of the OSD are the following:

• Free Redistribution: the license may not restrict any party from selling or giving away the software.

• Availability of the source code: the license allows modifications and derived works and allows the distribution of these under the same terms as the license of the original software.

• Integrity of the source code: the license may restrict source code from being distributed in modified form only if the license allows the distribution of "patch files" with the source code for the purpose of modifying the program at build time.

¹ I refer to the OSD v1.9, the latest version available at http://opensource.org/osd.html (accessed November 26, 2010).

The confusion over whether F/OSS belongs to the public domain is addressed by a wide range of scholars: Lee (1999), Perens (1999), Lanzi (2005). There is a difference between F/OSS and software that is simply put into the public domain because, although it is free and available to all, developers do not surrender their rights to their software creations. Rather, they retain copyright over their work and adopt licenses to ensure free access to and modification of the source code, as Rossi (2005) summarizes the findings of the above-mentioned papers. Lerner and Tirole (2005) found in their empirical research that restrictive licenses are more likely to be adopted when the software is directed at end-users, whereas less-restrictive licenses are more frequently adopted for projects aimed at developers, the Internet or proprietary operating systems.

As Rossi (2005) summarizes, numerous studies suggest the need to rethink the current state of intellectual property protection for software programs (Moglen, 1999; Osterloh et al., 2001; Benkler, 2002). OSS seems to represent a “new intellectual property paradigm” (Maurer & Scotchmer, 2006), i.e. a new type of ownership concept that leads to different allocations of intellectual property rights and different modes of organization as compared to so-called proprietary software.


2.2. Business Cycle Theory

A business cycle is identified by the behavior of aggregate economic activity, which is measured by a wide variety of series including output, sales, employment, and income (Moore, 1983). The analysis conducted in this thesis fits into this context. Using the approach of business cycle theory, I will try to shed light on the formation of OSS activity², explaining the behavior of OSS development activity through macroeconomic series. A brief overview of the evolution of the theories follows.

The early theories: The classical theory dominated macroeconomics from the late 18th century until the 1930s. Although it was discredited during the Keynesian revolution, the theory provides the foundation for several modern theories of the business cycle, such as the monetarist and real business cycle models. The model itself focuses on the supply-side effects in the economy. It views positive shocks as causing expansions and negative shocks as causing recessions.

Keynes, in The General Theory of Employment, Interest and Money (1936), points out that there is something truly wrong with a theory (the classical one) that cannot explain the severe unemployment of the 1930s. His theory focuses on the nature of investment, the variable found to be the most unstable. He therefore shifted the focus of the analysis from the exogenous variables (supply side) to the endogenous (demand side) ones.

Monetarism: Friedman and Schwartz (1963) examined the changes in monetary growth and argued that they are the main source of economic instability. The evidence showed that growth of the money supply indeed leads to GNP growth and, because prices are slow to adjust, generates business cycles (Hall 1990).




² OSS activity: I refer to OSS activity as the number of projects registered at the “forges”.

The rational expectations theory (Miller, 1994) provides insight into the behavior of economic agents. It argues for policy ineffectiveness (Sargent and Wallace, 1975), that is, that discretionary monetary and fiscal policy may be unable to alter aggregate output.

The New-Keynesian school further improved Keynes’ original theory, answering many of its critics. One of its major contributions is the prediction that unstable aggregate demand and supply are important determinants of the business cycle. Aggregate demand causes cycles because wages and prices are not flexible in the short run; on the supply side, real changes in the labour market and/or the production function alter output (Hall 1990).

Real business cycle theory is the first to approach the nature of cycles from the supply side again since Keynes discredited the classical theory. It states that the dominant causes of instability are external shocks to supply (Barro 1989, Plosser 1989). This theory can serve as the basis of the research conducted in this thesis, since the assumption is that the macroeconomic environment affects the sector indirectly, through the developers, who provide the supply for the sector.

Modern theories: The above-mentioned major competing theories of the business cycle identified the four major variables of the theory: the role of price and wage adjustment, the nature of cyclical unemployment, and the relative roles of aggregate demand and supply. However, each school differs in its judgment of these factors. This moved the scope to new research areas. Considerable attention is devoted to the analysis of stock market fluctuations (Chauvet 2001, Chordia and Shivakumar 2001, Casarin and Trecroci 2007). Due to the nature of the field, the subject of forecasting cycles is also addressed (Zimmermann 2001). Rötheli (2006) showed that business practitioners' forecasting of the business cycle was ahead of business cycle theorizing for many years. Theorists of the 1930s and 1940s were not yet ready to incorporate into their theories the business forecasting methods that had become widespread entrepreneurial practice. Instead, theorists either speculated on business psychology or built tractable dynamic mathematical models with accelerator-type investment. However, by the 1940s, a model of the cycle with forward-looking investors might actually have been developed.


Finally, as Hall (1990) summarizes, a common way to characterize the behavior of economic variables is in terms of their cyclical nature during business cycles:

• procyclical: the variable typically moves in the same direction as aggregate economic activity,

• countercyclical: the variable usually moves in the opposite direction to aggregate economic activity,

• acyclical: the variable shows no consistent pattern in terms of its movement over business cycles.
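The three categories above can be made operational in several ways. The following is only a minimal sketch, assuming that co-movement is measured by the Pearson correlation between a variable and an indicator of aggregate activity (for example GDP growth). The function names and the threshold of 0.2 are hypothetical choices made for illustration; Hall (1990) does not prescribe them, and the sketch is not presented as the method of this thesis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def classify_cyclicality(series, activity, threshold=0.2):
    """Label a series by its co-movement with aggregate activity.

    threshold is an assumed cut-off, not a value taken from the literature.
    """
    r = pearson(series, activity)
    if r > threshold:
        return "procyclical"
    if r < -threshold:
        return "countercyclical"
    return "acyclical"


# Made-up illustrative numbers, not data from the thesis.
activity = [1.0, -1.0, 1.0, -1.0]
print(classify_cyclicality([2.0, -2.0, 2.0, -2.0], activity))
print(classify_cyclicality([-2.0, 2.0, -2.0, 2.0], activity))
print(classify_cyclicality([1.0, 1.0, -1.0, -1.0], activity))
```

A correlation-based rule ignores leads and lags, so a variable that follows the cycle with a delay could be labelled acyclical by this sketch.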




3. Analysis of the software industry

Interaction between the real economy and the virtual world

The above statement may sound odd, but with two simple examples I show that there is a serious level of interaction, or more correctly a dependence of the virtual world on the real economy. For example, the resource that keeps the virtual world “alive” is electricity, which is produced in the real economy and is therefore affected by all sorts of economic or ecological effects, to mention the most obvious ones. According to one calculation, a ‘Second Life’ avatar consumes 1,752 kWh annually, almost as much as the average Brazilian, who consumes 1,884 kWh. Furthermore, it has been reported that Google alone consumes 2.1 TWh of electricity in a year, which equals the production of 2 average nuclear reactors. The financial crisis that started in 2008 has had a huge impact on the worldwide real economy, leading to growing unemployment rates, shrinking consumption, and decreased or even negative GDP growth rates.

This explains the massive decrease in IT budgets around the world. There was huge pressure on CEOs to lower costs in order to stay competitive in any sector. And here the OSS sector comes into the picture: by 2009 FLOSS had been recognized by CEOs as an opportunity to lower costs.

The following section will provide a software market analysis.

We can say that among the different OSS business models, those that had 50-60% of their revenue from subscription fees and 40-50% from services gained from the crisis.




3.1. Data

During the analysis and the data collection period I had to realize that it is very hard, if not impossible, to find aggregate data on the industry. Therefore a productivity analysis seems impossible to conduct with the resources available to this thesis.

On the other hand, there are numerous organizations (professional journals, market researchers) that conduct some kind of analysis or data collection, mostly survey based. Thus I chose to present the data from these sources and conduct an analysis on it.

3.1.1. FORBES Global 2000

The Forbes Global 2000 is a comprehensive list of the world’s biggest and most powerful companies, as measured by a composite ranking for sales, profits, assets, and market value.³ It is possible to search the list by industry; the Software and services industry has been included on the list since 2004.

Figure 3.1. Number of Software Companies on Forbes2000 list

Source: Author’s own calculation, Forbes2000

As the graph shows, the number of companies has been between 28 and 35 since 2004. It has grown constantly since 2006 from an all-time low of 28. As of the 2010 list the number is at its highest (35) since the introduction of the sector to the list. Based on the trend in the number of companies, the sector is growing despite the global economic crisis.

The Forbes 2000 list also contains figures on key financial indicators; the following figure depicts their behavior.




³ Forbes 2000 sources: Exshare, FT Interactive Data, Reuters Fundamentals, and Worldscope via FactSet Research Systems, Bloomberg Financial Markets.

Figure 3.2. Key Figures of Software Companies on Forbes2000 list (billion $)⁴

Source: Author’s own calculations, Forbes2000

The figure captures the effects of the crisis very well. The market value of the companies grew by 48.7% from 2004 to 2007. The following year brought a halt to this massive growth: during 2008 a significant 31.9% decrease took place. The sector seems to have recovered very fast, reaching an all-time high of 1159.05 billion USD.

The values of sales and assets show very similar behavior: very slow growth until 2006, then accelerating growth until 2008, followed by a slow decrease in the values which may continue over 2010.

3.1.2. Truffle 100⁵

The Truffle 100 ranks the top European software companies. Including an analysis of the Truffle 100 list makes it possible to draw a picture of the European software industry, which can increase the resolution of the global picture. According to the list’s authors, the software industry is characterized by periodic technological disruptions that pave the way for new arrivals; for that reason its results, research and statistics are released on a yearly basis.




⁴ Market value is as of February 28, 2010.

⁵ The Truffle 100 is compiled from survey & research conducted by IDC & CXP for the purpose of the Truffle 100 ranking. Europe is defined as: EU + Switzerland + Norway. A company is defined as a European software vendor if its headquarters and R&D management are based in Europe.

According to the results, the revenue of the top European vendors is constantly growing despite the ongoing economic slowdown. The European market is very concentrated: 79% of the revenue comes from the top 25 companies, up from 70% in 2008. The vendors are also facing very heavy competition from huge global actors like Microsoft, whose revenue alone was €40.9 billion in 2009, 52% higher than the aggregate revenue of the top 100 European vendors. This clearly shows that there is room for improvement. The report “emphasizes the impressive dynamism and exceptional resilience of the European software industry. In a challenging environment, software vendors have demonstrated their ability to bounce back quickly (with 8.4% year-on-year growth) and remain profitable (€3.7 billion), while maintaining a heavy level of investment in R&D (€3.8 billion).” (Kroes, 2010).

Figure 3.3. Revenue of the top 100 European vendors from software activity (billion €)

Source: Author’s own calculations, Truffle 100

3.1.3. Global Software 100⁶

The Global Software Top 100 list is very similar to the Truffle list, but its focus is on the worldwide software industry. The companies are ranked according to their revenues⁷. The authors gathered data from SEC filings, annual reports and corporate websites.

Similarly to the European market, the global software market is also very concentrated. The top ten vendors account for nearly 60% of the top 100’s revenue, which was over $220 billion in 2009. 46 vendors out of the 100 reported lower revenues than in the previous year; still, the average growth was 3.2% among the listed companies. The report concludes that the major effect of the crisis was that many companies cut back their IT budgets, but since the sector seems to have recovered very fast and the industry is one of the R&D drivers, the budgets will increase again in the future and will grow bigger than before. The other effect is that the “credit crisis abruptly stopped acquisitions by private equity firms”.

⁶ http://www.softwaretop100.org/methodology (accessed October 27, 2010)

⁷ Revenue contains 'prepackaged' software sales; subscription and support activities; certain service activities are excluded; hosted software solutions (Software as a Service) are included.

3.1.4. Software 500

Software Magazine’s Software 500 list is a comprehensive look at the software industry targeting enterprise IT organizations with software and services. The authors collect data based on the magazine’s survey.

The rankings are based on total worldwide software and service revenue⁸. Total 2009 Software 500 revenue is $491.2 billion, an 8.7% increase over last year's total. Once again, the analysis seems to describe an industry that healed very quickly after the global economic crisis started.

Figure 3.4. Revenue of the Software 500 list’s companies (billion $)

Source: Author’s own calculation, Software500

⁸ It includes revenues from software licenses, maintenance and support, training, and software-related services and consulting. www.softwaremag.com (accessed October 26, 2010)


15

The graph above confirms the assumption: the industry seems to grow despite the ongoing crisis.

Another interesting aspect of the Software 500 list is that data on the number of employees were also collected. The total number of employees in the Software 500 is up a healthy 26% to 3 707 957, compared to 2 953 016 the year before. This suggests that in 2007 many companies were cutting costs and conserving cash. The big increase in 2009 is primarily due to the addition of Hitachi to the list, with 389 752 employees, and Emerson Electric, with 140 700 employees. Because the companies on the list may change from year to year, a time series analysis of these figures can be biased; caution is therefore advised when interpreting the data.

Table 3.1. Number of employees of the Software 500 companies

Year    Employees    Growth
2008    3 707 957    25.6%
2007    2 953 016     1.3%
2006    2 914 480    14.7%
2005    2 539 872    -4.6%
2004    2 660 023    13.7%

Source: Software 500

It is, however, possible to calculate labour productivity and compare the companies in the industry. Table 3.2 shows the top ten companies based on revenue per employee.




Table 3.2. Top ten companies based on Revenue / Employee

Average for 500 = $ 191 417

Rank   Company                               2008 Employees   2008 Revenue (thousand $)   Revenue/Employee ($)
236    Innodata Isogen, Inc.                             45                      75 001              1 666 689
197    Lighthouse Computer Services, Inc.                95                     119 700              1 260 000
78     ePlus inc.                                       658                     727 159              1 105 105
138    Technology Integration Group                     300                     274 500                915 000
498    JangoMail                                          6                       5 288                881 313
215    GreenPages Technology Solutions                  150                     100 000                666 667
2      Microsoft Corp.                               91 000                  52 280 000                574 505
70     Akamai Technologies, Inc.                      1 500                     790 924                527 283
106    SolidWorks Corp.                                 787                     407 300                517 535
27     Juniper Networks, Inc.                         7 014                   3 572 376                509 321

Source: Software 500

The table captures the productivity differences among the companies very well: the first company on the list realizes revenue per employee about 8.7 times the average of the list. This variance can be explained by the different activities performed.
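The revenue per employee figures can be recomputed directly from the two other columns of Table 3.2. The following is a minimal Python sketch of that arithmetic; the function name is an illustrative assumption, and only two rows of the table are used.

```python
def revenue_per_employee(revenue_thousand_usd: float, employees: int) -> float:
    """Revenue per employee in USD; revenue is given in thousand USD as in Table 3.2."""
    return revenue_thousand_usd * 1_000 / employees


# Average revenue per employee for the 500 companies, as stated above Table 3.2.
LIST_AVERAGE = 191_417

# Two rows taken from Table 3.2 (revenue in thousand USD, number of employees).
innodata = revenue_per_employee(75_001, 45)
microsoft = revenue_per_employee(52_280_000, 91_000)

for name, value in (("Innodata Isogen", innodata), ("Microsoft", microsoft)):
    # Ratio to the list average shows how far a company is from the typical firm.
    print(name, round(value), round(value / LIST_AVERAGE, 1))
```

Comparing each value with the list average makes the spread between a very small service firm and a very large vendor easy to see.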

3.1.5. Demand side

It seems that the software and related service providers were able to overcome a very challenging situation and were able to grow their revenues. The major driver of these revenues, however, was not new orders but already existing maintenance contracts. Thus it is worth inspecting the demand side as well. Decreasing ICT expenditure was reported in every sector due to heavy cost-cutting pressure. According to IDC's survey-based research, the companies planning to increase or not cut back on ICT spending outnumber those planning to decrease it. Only one third of the small businesses plan to decrease the budget, while the share is about 50% for medium-sized companies, and the large ones are planning to cut back less.

Figure 3.5. Change in IT budget by company size, 2009

Source: IDC

Overall, 42% of the companies planned to decrease their spending, 27% planned no change and 31% planned an increase.

At country level a growing tendency is taking shape. However, only part of the ICT expenditure is spent on software and related services, and the covered period ends in 2008, which makes the graph less informative.




Figure 3.6. IT expenditure by countries [9] (billion $)

Source: Author's own calculation, OECD, USCB

3.2. Conclusion of the analysis

Above I have presented four different analyses of the software industry. The lists created by the various organizations included only the biggest actors of the sector. It is nevertheless possible to draw a comprehensive picture of the effects of the crisis on the industry.

• Across the different studies the revenue figures share a theme: the sector experienced minor but positive growth last year, which has recently started to accelerate again.

• The market is concentrated; therefore an unforeseen economic effect on the major players can reshape the landscape of the software industry.

• Many companies cut back on their ICT budgets to stay competitive, but as the studies suggest, the sector is one of the key players in R&D and is moreover necessary to increase productivity and reach optimal resource allocation; therefore the budgets are expected to be filled up again along with the stabilization.





[9] Europe equals EU27, Switzerland, Norway and Turkey; exchange rate as of Oct 25: EUR/USD = 1.4031 (http://www.x-rates.com/d/USD/EUR/graph120.html)

4. Analysis of the Forges

In the following section I attempt to answer the problem that started this research. As Figure 1.1 in the introduction shows, the number of projects started to decrease significantly at the same time. It is very surprising that the OSS sector seems to experience a massive decrease in “output”. In this case output loss means a decrease in the number of projects registered in each repository since the beginning of the economic crisis.

During an economic crisis most agents of the real economy are under pressure to decrease their costs. OSS is in many cases a cheap alternative to the available proprietary software. The combination of the cost constraint and the cheap alternative should result in an increase in demand, and thus in output as well, or output should at least stay steady. That is why the decrease in the number of projects in the observed time period is surprising.

At first sight the volume of the decrease seems to be the same across the forges; however, it is not, since the y axis of the graph (Number of Projects) has a logarithmic scale. Therefore the volatility in the graphs does not represent the true values relative to each other. The number of projects hosted at Objectweb is in the 10-30 range, at Rubyforge it is around 1000 and at Sourceforge it is in the 10000 range. If the graph used a linear scale instead of the logarithmic one, the Objectweb line would look like a flat line on the x axis.
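To illustrate why equal-looking drops on such an axis can hide very different absolute changes, the following minimal Python sketch compares two hypothetical declines. The project counts are illustrative values chosen only to match the orders of magnitude quoted above; they are not figures from the dataset. On a base-10 logarithmic axis the plotted distance depends only on the ratio of the two values.

```python
import math


def log_axis_distance(before: float, after: float) -> float:
    """Vertical distance between two values on a base-10 logarithmic axis."""
    return math.log10(before) - math.log10(after)


# Hypothetical project counts (assumptions for illustration only).
small_forge = (30, 15)          # a forge in the Objectweb size range
large_forge = (10_000, 5_000)   # a forge in the Sourceforge size range

for name, (before, after) in (("small", small_forge), ("large", large_forge)):
    print(name,
          "absolute drop:", before - after,
          "distance on log axis:", round(log_axis_distance(before, after), 3))
```

Both declines are a halving, so they cover the same distance on the logarithmic axis even though the absolute drops differ by several hundred times.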


On the one hand, there is a massive output decrease in the OSS sector, which can be the result of productivity deterioration. On the other hand, a deep economic recession started at the same time, which can also be the cause. My assumption is that the real economy has an effect on the OSS productivity level. The purpose of the analysis is therefore to shed light on the relationship between the real economy and OSS productivity.

The following chapter can be divided into four parts. The first part describes a number of variables that can describe the performance of the OSS sector. The second part discusses the factors that can affect these performance measures. In the third part an econometric model is presented to determine the effects of the different, previously defined factors on the performance of the forges. Finally, a detailed discussion of the results follows.

4.1. Data

The construction of the dataset is presented in two subsections: the first describes the OSS dataset, the second provides an overview of the macroeconomic dataset.

4.1.1. OSS dataset

The OSS dataset contains data about projects hosted by the Freshmeat, Objectweb, Rubyforge and Sourceforge forges. Table 4.1 summarizes the variables selected from Flossmole (Howison, 2006) and their descriptions.

Table 4.1. Variables and their descriptions in the OSS dataset

Variable           Description
proj_unixname      Name under which the project is registered at the forge.
datasource_id      Identification number of the Flossmole dataset into which the record was collected. Differs across forges and time.
url                URL of the project on the hosting site.
real_url           The project's own address, if it has one in addition to the above, for example its own website or a mailing list different from the one on the host.
date_registered    Date the project was registered at the hosting site.
proj_long_name     Full name of the project.
proj_id            Identification number of the project. Identifies a project with a single number instead of the name.
dev_count          Total number of developers for the project.
date_collected     Date the data were collected into the Flossmole dataset.

4.1.2. Productivity and performance measure

Productivity is usually defined as the ratio of the output of a production process to the inputs used. In the case of the software industry it can be defined as the rate of producing some output using a set of inputs in a given unit of time.

Thus the first step is to measure the output, which in our case is the software itself. Unfortunately, due to the complexity of the product, there are several different software size measures. According to Chemuturi and Kaligotla these are the following:



• Function Points: A unit of measurement to express the amount of business functionality an information system provides to the end user. The cost (in dollars or hours) of a single unit is calculated from past projects (Cutting, 2009). This unit of measure is used by the IFPUG Functional Size Measurement Method, which is one of the five ISO-recognized software metrics for sizing an information system. It is based on the functionality that is perceived by the user of the information system, independent of the technology used to implement the information system [10].

• Object Points: According to Guiliano et al. (1999), the object point method estimates the size of object-oriented (OO) software development projects. The experience that has been obtained with function points in traditional software development is carried over into the OO paradigm. To adapt function points to OO, function point concepts must be mapped to object-oriented concepts, and OO-specific concepts must be handled.

• Equivalent Source Lines of Code (ESLOC): Source Lines of Code means the number of lines in a software's source code. There are two different approaches. One is counting the physical lines, where every 'enter' or hard line break stands for one line. The other is counting the logical executable statements as lines. Equivalent SLOC takes into account the differences in effort required to incorporate new vs. inherited code into a delivered system, as well as the additional effort required to modify reused/adapted code for inclusion into the software product (Hihn, 2004).

• Test Points: A size measure for testing a project, in which a Test Point is equivalent to a normalized test case. It is common knowledge that test cases differ widely in terms of complexity and the activities necessary to execute them. Therefore, the test cases need to be normalized, just the way Function Points are normalized into one common measure using weighting factors. At present there are no uniformly agreed measures for normalizing test cases to a common size [11].

• Use Case Points: A measurement of how much effort is required to write software based on how much work the software is intended to do. The method was created by Gustav Karner of Rational Software Corporation in the mid-1990s. The method was based on a study of about 200 projects with an average size of 5 man-years of effort. The use case point method of estimation was found to be within 10% of the actual results for over 95% of the projects (Blain, 2007).

• Feature Points: In 1986, Software Productivity Research, Inc. developed an experimental method for applying Function Point logic to system software such as operating systems, telephone switching systems, and the like. To avoid confusion with the IBM Function Point method this experimental alternative was called 'Feature Points'. When Function Points are applied to such systems, they of course generate counts. However, the counts appear to be misleading for software that is high in algorithmic complexity, but sparse in inputs and outputs. From both a psychological and practical vantage point, these kinds of systems software seem to require a counting method that is equivalent to Function Points, but sensitive to the difficulties brought on by high algorithmic complexity [12].

• Etc.

[10] Wikipedia contributors. “Function point.” Wikipedia, The Free Encyclopedia. 16 September 2010, 18:54 UTC. http://en.wikipedia.org/w/index.php?title=Function_point&oldid=385215816. (accessed October 20, 2010)

[11] aralikatte. “Test Point Estimation.” Scribd. http://www.scribd.com/doc/4939959/Test-Point-Estimation. (accessed October 20, 2010)

[12] “What are Feature Points?” Software Productivity Research. http://www.spr.com/feature-points.html. (accessed October 20, 2010)

Beyond the large number of different software size measures, there is no generally accepted way of converting them from one to another. Therefore it is possible that the relative sizes of software products change depending on the measurement system. The fact that there is no clear way to conduct the Lines of Code methodology (whether to count the logical statements or the physical lines, how to treat inline documentation) makes the situation even more complicated.
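The ambiguity of the Lines of Code approach can be shown on a few lines of code. The sketch below is only an illustration in Python: the counting rules, the function names and the sample snippet are assumptions made for this example, not the conventions used by any of the cited authors. It counts the same source in three ways: all physical lines, physical lines without blanks and comments, and logical statements.

```python
import ast

# A small hypothetical source file used only for this illustration.
SAMPLE = '''# compute a total
total = 0
for x in [1, 2, 3]:
    total += x

print(total); print("done")
'''


def physical_lines(source: str) -> int:
    """Every hard line break counts as one line, including blanks and comments."""
    return len(source.splitlines())


def code_lines(source: str) -> int:
    """Physical lines that are neither blank nor pure comment lines."""
    stripped = (line.strip() for line in source.splitlines())
    return sum(1 for line in stripped if line and not line.startswith("#"))


def logical_statements(source: str) -> int:
    """Number of statements in the parsed syntax tree, regardless of layout."""
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(source)))


print(physical_lines(SAMPLE), code_lines(SAMPLE), logical_statements(SAMPLE))
```

The three counts differ for the same snippet, which is the reason why size figures are hard to compare unless the counting convention is stated.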

However, the issues mentioned above are only one side of the coin. The other major problem with software productivity measurement is that we want to explain a rather complex process with one single empirical figure. Developers have to work in an ever-changing, continuously evolving environment: focusing on new technology in the telecommunications industry, software complexity increased by a factor of 10 in the early 2000s (Groth, 2004). Almost simultaneously the transition to Web-based services with Java took place. Moreover, a revolution took place in software outsourcing and organizational structure, which further increased the complexity of software development by a wide margin. Needless to say, the skill levels of these activities are different, and the tools, inputs and outputs are all different. In my opinion, lumping them together, calling the result “Software Development” and then giving it a single productivity figure can at best yield a very rough estimate.

There are attempts to give a range for the productivity figures, such as 10 hours per Function Point, but it can vary from 2 to 135 depending on the product (Chemuturi and Kaligotla), or from 45 to 975 ESLOC/SM (equivalent source lines of code per staff month) depending on the application domain (Reifer, 2004). Table 4.2 summarizes selected results from Reifer's (2004) productivity calculations.




Table 4.2. Software Productivity (ESLOC/SM) by selected Application Domains

Application Domain       Range (ESLOC/SM)
Banking                  155 to 550
Command and Control      95 to 350
Data processing          165 to 500
Scientific               130 to 360
Telecommunications       175 to 440
Web Business             190 to 985

Source: Reifer (2004)

Reifer's analysis illustrates how complex a procedure it is to measure the size of software. The original calculation was conducted on 600 projects taken from Reifer's database of more than 1,800 projects. These projects were completed between 1997 and 2004 by any of 40 organizations. A project is defined as the delivery of software to system integration. Projects include builds and products that are delivered externally, not internally. Both the delivery of a product to market and of a build to integration fit this definition. The scope of all projects starts with software requirements analysis and finishes with the completion of software testing. The average number of hours per staff month was 152 in the original analysis, taking holidays, vacation, etc. into account.

SLOC is defined as a logical source line of code using the conventions of Florac and Carleton (1999). ESLOC is defined to take into account reworked and reused code (Boehm, 1981). Reifer defined function point sizes using the conventions of the International Function Point Users Group (IFPUG). Function point sizes were converted to SLOC using the backfiring factors published by IFPUG in 2000, as available on their web site.

As Table 4.2 shows, the variance of the results across the domains is quite large, which does not allow us to obtain a reasonably precise sector-wide estimate.

To sum up, three factors were common themes among the vendors:

• the lack of an industry-wide standard definition for software productivity,

• software applications' increasing complexity,

• the need for more formalized processes in the industry as a whole.

As a solution to these complications Chemuturi and Kaligotla suggest to “shift focus from macro productivity to micro productivity”.

This means that the software development process should be divided into sub-segments, and each should be treated differently and described with a different productivity estimate. This is beyond the scope of this thesis and the available resources.

Therefore a different approach has been used. The research does not study the productivity of the OSS sector, first and foremost because, as section 1.4 describes, access to data is limited and it was not possible to retrieve any data about the number of lines coded in any given period. Instead the study analyzes the activity of the community. The activity is proxied by counting the number of projects registered in a given time period. The steps of the dataset transformation are shown through the Rubyforge dataset. The same transformations were made in each of the four OSS hosting site datasets; the only difference is the number of observations.

The Rubyforge dataset contains data about 9000 projects hosted on its site. 40 Flossmole datasets with collection dates since July 2006 were combined to obtain a historical table. The original dataset contains 211633 observations. Throughout the transformation the aim was to create a dataset with information about each project and its registration date. The variable proj_id [13] was used to distinguish the projects from each other. The reason behind this is the problem of shared names across hosts. The following analysis discusses the problem, then the description of the Rubyforge dataset transformation continues.

[13] For descriptions see Table 4.1.

OSS development environment: Forges. With the birth of forges, the maturation and creation of a large-scale market of users and developers of open source became possible. Forges provide a basic, no-cost infrastructure for the fundamental necessities of a project, such as a mailing list (which greatly increases the effectiveness of communication within the community around the project) or free file storage space. On the other hand, the widespread use of forges has a downside. Three of the most important disadvantages are the following.

• First and foremost, information dissemination: it is not clear what happens in a project, as information is lost between bug tracking and mailing lists; projects that rely on forums find it difficult to interact with each other; moreover, it is very hard, almost impossible, to track evolution between projects.

• The second negative effect of the forges is distributed development, which at first sight might seem a big advantage. It enables large-scale development by individual groups through distributed version control. It can lead to huge success (the Linux kernel), but combined with the first problem, the lack of information, it can often lead to confusion: it is hard to keep track of whether a certain problem has already been solved, which can lead to double coding.

• Finally, the shared names problem, namely that different projects can run under the same name across the different forges. According to the FLOSSmole project there were 1367 projects with shared names on Rubyforge and Sourceforge in June 2009. For instance, starfish is a project listed on both Sourceforge and Rubyforge. On Rubyforge it is described as a “tool to make programming ridiculously easy”, but on Sourceforge the starfish project is described as a password management application (Howison, 2006). Figure 4.1 shows the number of shared names across the largest project hosting sites.

Figure 4.1. Number of Shared Names across each Repository



Source: flossmole.org
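The shared names problem, and the reason for identifying projects by a number instead of a name, can be sketched in a few lines of Python. Apart from starfish, all project names and all identification numbers below are hypothetical and serve only as an illustration.

```python
def shared_names(forge_a, forge_b):
    """Project names that appear on both forges."""
    return set(forge_a) & set(forge_b)


# Hypothetical name lists; only "starfish" is taken from the text.
rubyforge_names = {"starfish", "alpha-tool", "beta-lib"}
sourceforge_names = {"starfish", "gamma-app", "beta-lib"}

print(sorted(shared_names(rubyforge_names, sourceforge_names)))

# Records as (forge, proj_id, name); the id values are made up.
projects = [
    ("rubyforge", 101, "starfish"),
    ("sourceforge", 2001, "starfish"),
]

# Counting by name merges two different projects into one ...
distinct_by_name = {name for _, _, name in projects}
# ... while counting by (forge, proj_id) keeps them apart.
distinct_by_id = {(forge, proj_id) for forge, proj_id, _ in projects}

print(len(distinct_by_name), len(distinct_by_id))
```

Counting by name would understate the number of projects, which is why an identifier that is unique within each forge is the safer key.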

The description of the Rubyforge dataset transformation continues here. As the first step, records with no project id were dropped, which amounted to 31643 observations. The second step was to remove the repeated records of projects that were observed multiple times during the collection period. This resulted in a dataset with 9181 different projects. Then the projects with no registration date were removed, together with those registered in the first and the last observed month, because the collection was not conducted on the last day of the month. As the final step the projects were counted and aggregated at a monthly level. The final dataset contains data about 8696 projects before the aggregation. The same dataset contains 231581, 55247 and 146 observations in the case of Sourceforge, Freshmeat and Objectweb, respectively.
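The transformation steps described above can be sketched as follows. This is a minimal Python illustration on made-up records, not the procedure actually run on the Flossmole data; the function name and the sample values are assumptions, and the first and last months are taken here to be the earliest and latest registration months present in the data.

```python
from collections import Counter
from datetime import date


def monthly_registrations(records):
    """Count newly registered projects per (year, month)."""
    # Step 1: drop records without a project id.
    with_id = [r for r in records if r.get("proj_id") is not None]

    # Step 2: keep a single record per project, however often it was collected.
    unique = {}
    for record in with_id:
        unique.setdefault(record["proj_id"], record)

    # Step 3: drop projects without a registration date.
    dates = [r["date_registered"] for r in unique.values()
             if r.get("date_registered") is not None]
    months = [(d.year, d.month) for d in dates]
    if not months:
        return {}

    # Step 3 (continued): drop the first and last month, which are only partly observed.
    first, last = min(months), max(months)
    kept = [m for m in months if m not in (first, last)]

    # Step 4: aggregate to monthly counts.
    return dict(sorted(Counter(kept).items()))


# Hypothetical records for illustration only.
records = [
    {"proj_id": 1, "date_registered": date(2006, 7, 3)},
    {"proj_id": 1, "date_registered": date(2006, 7, 3)},   # same project, later snapshot
    {"proj_id": 2, "date_registered": date(2006, 8, 10)},
    {"proj_id": 3, "date_registered": date(2006, 8, 21)},
    {"proj_id": None, "date_registered": date(2006, 8, 5)},
    {"proj_id": 4, "date_registered": None},
    {"proj_id": 5, "date_registered": date(2006, 9, 1)},
    {"proj_id": 6, "date_registered": date(2006, 10, 15)},
]

print(monthly_registrations(records))
```

The resulting monthly counts are the kind of series plotted in the figures of this section.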

Figure 4.2 shows the results for Rubyforge.

Figure 4.2. Number of new projects registered in a month at Rubyforge

Source: Author's own calculation

Figure 4.2 supports the Flossmole observation from Figure 1.1 very well. Namely, since the beginning of the 2008 depression the number of new projects has been decreasing; the growth of the forges is slowing down.

The next figure, representing the Rubyforge dataset complemented with the Freshmeat data, shows similar results. The growth of the total number is decreasing; however, this decline starts earlier, around mid-2004, compared to the Rubyforge site.

Figure 4.3. Number of new projects registered in a month at Freshmeat and Rubyforge



Source: Author’s own calculation

Interestingly enough, the Sourceforge community shows different trends over the same observation period. The first big and quick increase occurred at the end of 2000, when the so far steady number of new projects tripled in just two months. That might be a delayed result of the collapse of the so-called dot-com bubble (the “IT bubble” collapsed on May 10, 2000). The activity level stabilized at 1500-2000 new projects per month for the next five years. Following this period, until mid-2008, the level of activity grew along with the variance of the number of newly registered projects. From 2008 until mid-2009 a massive 26%-30% decline took place, only for activity to take off again, double the number of projects and reach an all-time high of 4500.

Figure 4.4. Number of new projects registered in a month at Sourceforge



Source: Author’s own calculation

The observations mentioned above, with their wide variety of patterns, neither rule out nor confirm any relationship between the real economy and the level of OSS activity.

To shed light on the relationship, in the following I present an econometric model together with a discussion of the related methodology and the results of the evaluation.

4.1.2. Macroeconomic data

The dataset consists of panel data with quarters and countries being the two dimensions. The choice of timeframe, number of countries and data sources is described below.

The period from 1998-Q1 to 2010-Q2 was chosen as the timeframe of the analysis. The period was limited to the last twelve years since this yields reliable figures on OSS projects. With the use of quarterly data, an adequate number of 50 time periods is obtained for each country.
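The number of time periods follows from simple arithmetic on the first and last quarter of the window. A short Python sketch of the count, with a function name chosen only for this illustration:

```python
def quarters_in_window(start_year, start_quarter, end_year, end_quarter):
    """Number of quarters in a window, counting both end points."""
    return (end_year - start_year) * 4 + (end_quarter - start_quarter) + 1


# The window used in the analysis: 1998-Q1 to 2010-Q2.
periods = quarters_in_window(1998, 1, 2010, 2)
print(periods)
```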

To determine the countries included in the analysis I first searched for the origin of OSS development. Engelhardt and Freytag (2009) state that, according to previous studies, OSS developers firstly have to be well-educated software engineers (or ICT students) in order to be able to write software code (i.e. programming), and secondly must be able to think in abstract terms and logic. Additionally, most programming languages are based on English and the whole communication and coordination of OSS projects is done in English. Therefore macroeconomic data have been collected for the OECD member countries and Brazil, Indonesia, the Russian Federation and South Africa.

At this stage, however, I face a problem: I have the same number of projects registered for all the different countries in each period. Table 4.3 visualizes the problem by presenting selected rows and columns from the dataset. Records of the dependent variable (1F) [14] are constant over countries.

Table 4.3. Selected records from the analysis' dataset

year   quarter   country   country id   unemp level   unemp rate (%)   1F
1999   1         Austria   2                 152667   4.0              471
1998   4         Austria   2                 160333   4.2              324
1998   3         Austria   2                 168667   4.4              313
1999   1         Canada    4                1224600   7.9              471
1998   4         Canada    4                1245467   8.1              324
1998   3         Canada    4                1258833   8.2              313

Source: STATA dataset

It is beyond the scope and resources of this thesis to explore the geographical origin of each project. Therefore I used Engelhardt and Freytag's (2009) top 30 list of countries by active users to weight the dependent variables. As a result, the number of countries included in the analysis shrank to 25 [15].
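The exact weighting scheme is not spelled out here. One simple reading, shown purely as an assumption, is to distribute the aggregate number of projects across countries in proportion to their share of active users. The shares in the Python sketch below are hypothetical and are not taken from Engelhardt and Freytag (2009); the total of 471 is the 1F value of 1999-Q1 from Table 4.3.

```python
import math


def weight_by_country(total_projects, user_shares):
    """Split an aggregate project count across countries in proportion to user shares."""
    total_share = sum(user_shares.values())
    return {country: total_projects * share / total_share
            for country, share in user_shares.items()}


# Hypothetical shares of active users (assumption for illustration only).
hypothetical_shares = {"Austria": 0.25, "Canada": 0.75}

weighted_1f = weight_by_country(471, hypothetical_shares)
print(weighted_1f)
```

After such a weighting the dependent variable is no longer constant over countries, while its sum over countries still equals the aggregate figure.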

In order to bring consistency to the analysis my aim was to collect data from as few sources as possible. The only source of the data is SourceOECD. Table 4.4 provides the description of each macroeconomic variable used in the analysis.

[14] See the description of 1F on p. 36.

[15] In alphabetical order the countries included are: Australia, Austria, Belgium, Brazil, Canada, Czech Republic, Denmark, Finland, France, Germany, Israel, Italy, Japan, Mexico, Netherlands, New Zealand, Norway, Poland, Russian Federation, South Africa, Spain, Sweden, Switzerland, United Kingdom, United States


Table 4.4. Macroeconomic variables used in the analysis

Variable               Description
emp                    Employment: People in civilian employment above a specified age who during the reference period were either paid employees, employers and self-employed, unpaid family workers or students with a temporary paid job (household survey based).
h_unem_r               Harmonized unemployment rate: Gives the number of unemployed people as a percentage of the civilian labour force. The civilian labour force consists of the employed and unemployed people.
g_indprod              Industrial production: An index covering production in mining, manufacturing and public utilities (electricity, gas and water), but excluding construction (growth since last quarter, seasonally adjusted).
g_retsales_vol         Retail trade volume index: Calculated by dividing total retail trade turnover in current prices by an appropriate price deflator (growth since last quarter, seasonally adjusted).
g_unit_labour_costs    Unit labour costs: Measure the average cost of labour per unit of output and are calculated as the ratio of total labour costs to real output (growth since last quarter, seasonally adjusted).
g_cons_prices          Consumer prices: Measure changes over time in the general level of prices of goods and services that a reference population acquires, uses or pays for consumption (growth since last quarter).
g_gdp                  Gross Domestic Product: At constant prices, seasonally adjusted (growth since last quarter).
b_money                Broad money supply: In addition to currency in circulation plus sight deposits held by domestic non-banks, also includes time deposits as well as savings deposits at short notice held by domestic non-banks (growth since last quarter, seasonally adjusted).
share_p                Share prices: Prices of common shares of companies traded on national or foreign stock exchanges (growth since last quarter).
exp                    Export of goods: Consists of exports of national products, exports without transformation of goods and exports from bonded warehouses which have not been transformed since import (in billions of USD, growth since last quarter, seasonally adjusted).
imp                    Import of goods: Consists of imports for direct domestic consumption; withdrawals from bonded warehouses and free zones for domestic consumption; and imports into bonded warehouses and free zones (in billions of USD, growth since last quarter, seasonally adjusted).
s_exp                  Service exports: Economic flows streaming into the economy from the rest of the world (in USD, growth since last quarter, seasonally adjusted).
s_imp                  Service imports: Economic flows streaming from the economy to the rest of the world (in USD, growth since last quarter, seasonally adjusted).

4.1. Methodology

A linear regression attempts to model the relationship between the dependent variable and the independent (or explanatory) variable(s) (Wooldridge, 2008). The linear relationship can be visualized through a scatter plot.

Equation (4.1) is the equation for the linear regression.

(4.1) $y = \beta_0 + \beta_1 x + u$

Where:

$y$ is the dependent variable,

$\beta_0$ is the constant term,

$x$ is the explanatory variable,

$u$ is the error term.

In the case of panel data, however, the model has to be modified. Panel data is a dataset containing observations on the same topic from different times and different places (Wooldridge, 2008), for instance the growth of GDP between 1998 and 2010 in the different OECD countries. Panel data models are used as a way of controlling for cross-sectional heterogeneity, which means that there is something “different” about the observed units, but it is not possible to reduce these differences completely to the observable data. The extended equation (4.2) can be estimated by either a fixed or a random effect model.

(4.2) $y_{it} = \beta_0 + \beta_1 x_{it} + \gamma z_i + a_i + u_{it}$

Where:

$y_{it}$ is the dependent variable,

$\beta_0$ is the constant term,

$x_{it}$ is the explanatory variable,

$u_{it}$ is the error term,

$z_i$ is an observed effect that cannot be estimated through the fixed effect model; these are time-invariant factors,

$a_i$ is the unobserved individual-specific effect, a fixed effect for each individual across time.

The difference between the two models lies in the underlying assumptions (Wooldridge, 2008):

• By using the fixed effect model we assume that the individual-specific effect is correlated with the independent variables. In this case the time-invariant factors (such as gender, name, etc.) are excluded from the equation by taking the difference between each observation and the within-group mean values in order to get rid of the individual-specific effect term $a_i$:

$y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + (u_{it} - \bar{u}_i)$



















• On the other hand, in the case of the random effect model we assume that the individual-specific effects are uncorrelated with the explanatory variables. All the coefficients are estimated, whether they belong to time-variant or time-invariant variables. Since in this case there is no fixed individual-specific effect, $a_i$ and $u_{it}$ can be combined to form a new error term $v_{it} = a_i + u_{it}$. Therefore we do not need to take differences, and all variables are included:

$y_{it} = \beta_0 + \beta_1 x_{it} + \gamma z_i + v_{it}$
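The effect of taking differences from the within-group means can be illustrated on a tiny artificial panel. The Python sketch below is only an illustration under stated assumptions: the data are made up (two units, true slope of 2, no noise, and a unit-specific effect that is correlated with the explanatory variable), and the helper names are hypothetical. It compares the slope of a pooled regression with the slope obtained after the within transformation.

```python
from collections import defaultdict


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def within_transform(groups, values):
    """Subtract each group's mean from the values belonging to that group."""
    sums, counts = defaultdict(float), defaultdict(int)
    for group, value in zip(groups, values):
        sums[group] += value
        counts[group] += 1
    return [value - sums[group] / counts[group] for group, value in zip(groups, values)]


# Artificial panel: y = 2 * x + a_i, with a_A = 0 and a_B = 10 (assumed values).
unit = ["A", "A", "A", "B", "B", "B"]
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 18, 20, 22]

pooled_slope = ols_slope(x, y)
within_slope = ols_slope(within_transform(unit, x), within_transform(unit, y))

print(round(pooled_slope, 3), round(within_slope, 3))
```

In this constructed example the pooled slope is pulled away from the true value by the unit-specific effect, while the within transformation removes that effect and recovers the slope. When the unit-specific effect is uncorrelated with the explanatory variable, this correction is not needed, which is the case the random effect model assumes.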

















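4.2. Model

After taking the above into consideration, I decided to use the random effect model to estimate the relationship between the number of projects registered and the changes in certain macroeconomic factors. The final model is expressed in equation (4.3).

(4.3) $F_{it} = \beta_0 + \mathbf{x}_{it}' \boldsymbol{\beta} + \delta_t + \theta_i + v_{it}$

Where:

$F_{it}$ is the dependent variable, the number of projects registered,

$\mathbf{x}_{it}$ is the vector of the macroeconomic variables listed in Table 4.4,

$i$ represents the given country,

$t$ represents the given quarter,

$\delta_t$ captures the time-variant factors by quarter dummies,

$\theta_i$ captures the country-variant factors by country dummies.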

4.3.

Results

Altogether, eight different regressions have been conducted. The eight models can be separated into two groups in the following way:

• the first group uses the robust standard error calculation method: Models 1-4,
• the second group computes the standard errors in the usual way: Models 1B-4B.

The models differ in the number of quarters observed and in the dependent variables used, rather than in the independent variables. Three different datasets and two dependent variables have been constructed in order to capture differences across the forges:

• the first dataset (1DS) contains data on all 50 observed quarters (from 1998-Q1 to 2010-Q2),
• the second (2DS) keeps information on the variables from 2003-Q3 to 2009-Q4,
• finally, the third dataset (3DS) has data on 40 quarters from 2000-Q1 to 2009-Q4,
• the first dependent variable (1F) aggregates the four forges' records,
• the second (2F) sums the Sourceforge and Freshmeat data, which account for 97% of the observed number of projects.
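The quarter counts of the three datasets can be checked with a short calculation. The helper `quarters_between` is a hypothetical name; the start and end quarters are the ones listed above, counted inclusively.

```python
def quarters_between(start, end):
    """Count quarters from start to end inclusive; each is (year, quarter)."""
    (y0, q0), (y1, q1) = start, end
    return (y1 - y0) * 4 + (q1 - q0) + 1

spans = {
    "1DS": quarters_between((1998, 1), (2010, 2)),
    "2DS": quarters_between((2003, 3), (2009, 4)),
    "3DS": quarters_between((2000, 1), (2009, 4)),
}
```

Under this inclusive count the first dataset spans 50 quarters and the third 40 quarters, matching the figures in the text, while the second spans 26 quarters.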



Tables 4.5 and 4.6 summarize the results of the equations; the description of the model specifications follows after the output tables.

Table 4.5. RE Estimation Results for Models 1-4 (robust standard errors)

                        Model 1     Model 2     Model 3     Model 4
Employment              -0.000*     -0.000       0.000      -0.000**
                        (0.000)     (0.000)     (0.000)     (0.000)
Unemployment Rate       -0.012       0.007       0.014       0.019*
                        (0.034)     (0.044)     (0.011)     (0.011)
Industrial Production   -0.003      -0.002       0.007*      0.006
                        (0.012)     (0.013)     (0.003)     (0.005)
Retail Sales Volume      0.004      -0.025       0.006       0.006
                        (0.032)     (0.043)     (0.006)     (0.008)
Unit Labour Costs        0.013       0.059      -0.008       0.014
                        (0.068)     (0.092)     (0.012)     (0.014)
Consumer Prices         -0.044      -0.065       0.002       0.012
                        (0.040)     (0.051)     (0.011)     (0.014)
GDP Growth               0.035       0.064      -0.022**     0.003
                        (0.034)     (0.047)     (0.009)     (0.013)
Broad Money Supply      -0.001      -0.004      -0.005      -0.011*
                        (0.018)     (0.019)     (0.005)     (0.006)
Share Prices            -0.002      -0.005       0.001       0.000
                        (0.004)     (0.005)     (0.001)     (0.001)
Export of Goods          0.008       0.009      -0.002       0.002
                        (0.008)     (0.009)     (0.002)     (0.002)
Import of Goods         -0.023**    -0.024**    -0.001      -0.007**
                        (0.010)     (0.012)     (0.003)     (0.003)
Service Exports          0.011       0.009       0.002**     0.001
                        (0.008)     (0.009)     (0.001)     (0.001)
Service Imports         -0.008      -0.007      -0.003**    -0.004*
                        (0.008)     (0.010)     (0.002)     (0.002)

Constant                 1.664*      2.233       0.061       0.304***
                        (0.910)     (1.381)     (0.103)     (0.107)

Overall R²               0.140       0.139       0.393       0.421
Number of obs.           516         516         276         416

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
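The significance stars can be related to the reported coefficients and standard errors with a rough sketch. It assumes a two-sided test based on the standard normal distribution; the function names are hypothetical. Because the table values are rounded to three decimals, stars recomputed this way may differ from the STATA output in some cells.

```python
import math

def two_sided_p(coef, se):
    """Two-sided p-value of coef/se under a standard normal approximation."""
    if se == 0:
        raise ValueError("standard error is zero (value rounded away in the table)")
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))

def stars(coef, se):
    """Map a coefficient and its standard error to the table's star notation."""
    p = two_sided_p(coef, se)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""

# Values taken from Table 4.5 (rounded as printed).
import_goods_m1 = stars(-0.023, 0.010)
constant_m4 = stars(0.304, 0.107)
gdp_growth_m1 = stars(0.035, 0.034)
```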



Table 4.6. RE Estimation Results for Models 1B-4B (normal standard errors)

                        Model 1B    Model 2B    Model 3B    Model 4B
Employment              -0.000***   -0.000***    0.000      -0.000**
                        (0.000)     (0.000)     (0.000)     (0.000)
Unemployment Rate       -0.012       0.007       0.014*      0.019**
                        (0.037)     (0.048)     (0.009)     (0.009)
Industrial Production   -0.003      -0.002       0.007*      0.006
                        (0.017)     (0.022)     (0.003)     (0.004)
Retail Sales Volume      0.004      -0.025       0.006       0.006
                        (0.028)     (0.036)     (0.006)     (0.007)
Unit Labour Costs        0.013       0.059      -0.008       0.014
                        (0.054)     (0.071)     (0.012)     (0.013)
Consumer Prices         -0.044      -0.065       0.002       0.012
                        (0.054)     (0.071)     (0.011)     (0.012)
GDP Growth               0.035       0.064      -0.022**     0.003
                        (0.049)     (0.064)     (0.010)     (0.012)
Broad Money Supply      -0.001      -0.004      -0.005      -0.011**


(0.023)