Longitudinal Social Network Data

frequentverse, 16 Nov 2013

Snijders, Tom A. B. 2005. Models for longitudinal network data. In Models and methods in social network analysis, edited by P. J. Carrington, J. Scott and S. Wasserman. New York: Cambridge University Press.



Discusses three approaches for statistical modeling of network dynamics: the independent arcs model, the reciprocity model, and the actor-oriented model.


These models assume that the network is observed in at least two discrete time periods and that there is an unobserved evolution of the network between time periods. The first observation is not modeled but is regarded as given, so the history leading up to it is disregarded; the real interest is in the change between time periods.


It is also not assumed that change within the network between time periods is at a steady state.


It is also assumed that the network evolves in continuous time (even though we observe it at discrete points in time) and that network changes occur throughout the time periods in a feedback mechanism, as the current network structure is a determinant of the likelihood that change will occur in the network.


It is also assumed in these models that, for the purposes of statistical analysis, the network evolves as a continuous-time Markov chain.





When looking at longitudinal data it is often helpful to first look at the network's descriptive statistics: average degree, mutuality index, transitivity index, and fraction missing.
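As a rough illustration, the four descriptive statistics can be computed from a single observed adjacency matrix. The sketch below assumes common definitions that the notes do not spell out: average degree as ties per actor, mutuality as the fraction of ties that are reciprocated, and transitivity as the fraction of two-paths i -> j -> h that are closed by a tie i -> h. The function name `describe` and the use of `None` for a missing entry are hypothetical choices.

```python
# Sketch under assumed definitions; `describe` and the None-for-missing
# convention are illustrative, not taken from the source.

def describe(adj):
    n = len(adj)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    observed = [(i, j) for i, j in pairs if adj[i][j] is not None]
    ties = sum(adj[i][j] for i, j in observed)

    # Ordered pairs (i, j) where both i -> j and j -> i are present.
    mutual = sum(1 for i, j in observed if adj[i][j] == 1 and adj[j][i] == 1)

    # Two-paths i -> j -> h over distinct actors, and how many are closed.
    two_paths = closed = 0
    for i in range(n):
        for j in range(n):
            for h in range(n):
                if len({i, j, h}) == 3 and adj[i][j] == 1 and adj[j][h] == 1:
                    two_paths += 1
                    closed += adj[i][h] == 1

    return {
        "average_degree": ties / n,
        "mutuality": mutual / ties if ties else 0.0,
        "transitivity": closed / two_paths if two_paths else 0.0,
        "fraction_missing": 1 - len(observed) / len(pairs),
    }

# Toy wave with one missing entry (actor 3's report about actor 2).
wave = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, None, 0],
]
stats = describe(wave)
```

Comparing these numbers across waves gives a first impression of how much the network changes before any model is fitted.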


The focus of the model is on directed adjacency matrices with multiple observation points, in which there is a set of states in time and a transition function with a probability for each transition, that is, a probability that the next state is s_j given that the current state is s_i.
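To make the idea of a transition function concrete, here is a minimal sketch of a toy two-state chain for a single arc (absent or present). In the models themselves the states are whole networks; the two-state chain, the matrix `P`, and its probabilities are hypothetical values chosen only for illustration.

```python
# Toy Markov chain: P[i][j] is the probability that the next state is s_j
# given that the current state is s_i. All numbers are made up.

STATES = ["absent", "present"]
P = [
    [0.9, 0.1],  # from "absent": stay absent, become present
    [0.3, 0.7],  # from "present": become absent, stay present
]

def step(dist, P):
    """Push a probability distribution over states one transition forward."""
    k = len(P)
    return [sum(dist[i] * P[i][j] for i in range(k)) for j in range(k)]

def n_step_distribution(start_state, P, n):
    """Distribution over states after n transitions from a known start."""
    dist = [1.0 if s == start_state else 0.0 for s in range(len(P))]
    for _ in range(n):
        dist = step(dist, P)
    return dist

# Probability that an initially absent arc is absent/present two steps later.
after_two = n_step_distribution(0, P, 2)
```

The Markov property shows up in `step`: the next distribution depends only on the current one, not on the earlier history.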


The simplest model is the independent arcs model, in which all arcs follow independent Markov processes. (It does not take into account continuous processes.)


The reciprocity model is a continuous-time Markov chain model for directed graphs in which all dyads are assumed to be independent and to have the same transition distribution. This model allows for change rates that depend on covariates, but the assumption that dyads are independent runs counter to the basic ideas of social network analysis.


The popularity model proposes that transition rates depend on the in-degrees of the actors; thus the popularity of an actor, as measured by in-degree, is determined endogenously by network evolution. A similar model is the expansiveness model, in which transition rates are determined by out-degrees.


Actor-oriented models: the previous models each took into account only one effect (e.g. reciprocity or popularity). In actor-oriented models, the probabilities of relational changes depend on the entire network structure, both macro (the whole network) and micro (individual actor ties).


The actor view means that for each change in the network the perspective is taken from the actor whose tie is changing. It is assumed that only one tie changes at a time; such a single change is called a ministep. The moment an actor changes his tie and the particular change he makes can depend on the network structure and on the attributes represented by the observed covariates. The moment of change is determined by the rate function; the particular change to be made is determined by the objective function and the gratification function.


As noted in a great summary by van Duijn et al. and Snijders et al. 2009, the main idea of this model is that actors in the network evaluate their position and try to obtain a "better" configuration of ties that increases their social well-being. Between time periods, time flows continuously and actors may change their relations at random moments in this period of time. A change might be that an actor forms a new tie or that an existing tie is withdrawn. The second time period is the dependent variable in the model.


Both observations influence tie changes: the actor evaluates the network structure in the first time period in order to maximize their social well-being in the second time period. This is done through the objective and gratification functions, with the number of ministeps determined by the rate function.



The objective function is the value to the actor of making a change, and it is assumed that actors will maximize their utility. It is the sum of the weighted effects, where the weights are estimated from the data. There are three groups of effects:

- Standard network effects that incorporate well-known structural properties: density (defined by out-degree), reciprocity (defined by the number of reciprocated ties), transitivity (defined by the number of transitive patterns), balance (defined by the similarity between the outgoing ties of an actor and the outgoing ties of the other actors), the number of geodesic distances two effect (defined by the number of actors to whom the actor is indirectly tied), popularity (defined by the sum of the in-degrees of the others to whom the actor is tied), and activity (defined by the sum of the out-degrees of the others to whom the actor is tied).

- Actor attribute effects: attribute-related popularity (defined by the sum of the covariate over all actors), activity (defined by the actor's out-degree weighted by his covariate value), and dissimilarity (defined by the sum of the absolute covariate differences between the actor and the others to whom he is tied).

- Dyadic attribute effects, which are modeled as covariate-related preferences.
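The weighted-sum structure of the objective function can be sketched for three of the structural effects listed above. The effect counts follow the verbal definitions in these notes (out-degree, reciprocated ties, transitive patterns); the function names and the weights are hypothetical, since in practice the weights are estimated from the data.

```python
# Sketch of an objective function as a weighted sum of effect statistics.
# `effects`, `objective`, and the weights below are illustrative assumptions.

def effects(adj, i):
    """Effect statistics for actor i in a 0/1 directed adjacency matrix."""
    others = [j for j in range(len(adj)) if j != i]
    return {
        # density effect: number of outgoing ties of i
        "density": sum(adj[i][j] for j in others),
        # reciprocity effect: outgoing ties of i that are returned
        "reciprocity": sum(adj[i][j] * adj[j][i] for j in others),
        # transitivity effect: patterns i -> j, j -> h, i -> h
        "transitivity": sum(
            adj[i][j] * adj[j][h] * adj[i][h]
            for j in others for h in others if j != h
        ),
    }

def objective(adj, i, weights):
    """Sum of weight times statistic over the effects named in `weights`."""
    s = effects(adj, i)
    return sum(weights[name] * s[name] for name in weights)

network = [
    [0, 1, 1],
    [1, 0, 1],
    [0, 0, 0],
]
weights = {"density": -1.5, "reciprocity": 2.0, "transitivity": 0.5}
value_for_actor_0 = objective(network, 0, weights)
```

With these made-up weights, a negative density weight makes ties costly unless they are reciprocated or embedded in transitive patterns.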


The gratification function takes into consideration that the effects for creating and breaking a
tie may operate differently, which the objective function does not account for.


The objective and gratification functions are combined in the model for the ministep. It is assumed that for each ministep the actor makes the change that maximizes the sum of the objective function evaluated at the new state, the gratification from the change, and a random component, which captures the deviation between theoretical expectation and reality (or the actor's drives and limited foresight that cannot be readily modeled).


The random term is assumed to have a Gumbel distribution with a mean of 0 and scale parameter of 1, which means the actors behave according to a discrete choice model (creation or withdrawal of a tie).




An actor assesses the value of the objective function that would be obtained after each possible change in one of his ties and makes a stochastic choice in which the probability of a change in a particular tie is larger when this change would lead to a greater increase in the objective function.
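A Gumbel-distributed random term leads to multinomial logit choice probabilities, so the stochastic choice described above can be sketched as a softmax over the objective function values of the candidate changes. The sketch below is a simplification under stated assumptions: the objective function contains only hypothetical density and reciprocity weights, the gratification function is left out, and the actor must toggle exactly one tie (no "no change" option).

```python
import math

# Simplified ministep choice model; function names and weights are assumptions.

def objective(adj, i, beta_density, beta_recip):
    others = [j for j in range(len(adj)) if j != i]
    density = sum(adj[i][j] for j in others)
    reciprocity = sum(adj[i][j] * adj[j][i] for j in others)
    return beta_density * density + beta_recip * reciprocity

def ministep_probabilities(adj, i, beta_density, beta_recip):
    """Probability that actor i toggles the tie to each other actor j."""
    values = {}
    for j in range(len(adj)):
        if j == i:
            continue
        candidate = [row[:] for row in adj]
        candidate[i][j] = 1 - candidate[i][j]  # create or withdraw the tie
        values[j] = objective(candidate, i, beta_density, beta_recip)
    top = max(values.values())  # subtract the maximum for numerical stability
    expv = {j: math.exp(v - top) for j, v in values.items()}
    total = sum(expv.values())
    return {j: e / total for j, e in expv.items()}

# Actor 1 already nominates actor 0, so a tie 0 -> 1 would be reciprocated.
network = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 0],
]
probs = ministep_probabilities(network, 0, beta_density=-1.0, beta_recip=2.0)
```

Here creating the reciprocated tie to actor 1 is much more likely than creating the unreciprocated tie to actor 2, but the less attractive change still has positive probability.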


It is assumed that every actor has the same objective and gratification functions, but that actors differ in individual characteristics, which lead to different probabilities of change.


It is also assumed that all actors behave independently and have full network knowledge.


The timing of an actor's changes is governed by a rate function for changes in their ties, which can be constant or can be modeled as a function of actor attributes and degrees.


Given the individual change rates, the times between the ministeps are independent and identically distributed exponential variables.
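The timing of ministeps can be sketched for the simplest case, a constant rate shared by all actors. Under that assumption the waiting time until the next ministep by any actor is exponential with the summed rate, and the acting actor is drawn uniformly. The function name, rate value, and seed below are hypothetical.

```python
import random

# Sketch for a constant rate function; names and values are assumptions.

def simulate_ministep_times(n_actors, rate, period, seed=0):
    """Return (time, actor) pairs for ministeps within [0, period]."""
    rng = random.Random(seed)
    t = 0.0
    events = []
    while True:
        # Waiting time to the next ministep by any of the n actors.
        t += rng.expovariate(n_actors * rate)
        if t > period:
            break
        # With equal rates, each actor is equally likely to be the one acting.
        events.append((t, rng.randrange(n_actors)))
    return events

events = simulate_ministep_times(n_actors=5, rate=2.0, period=1.0, seed=42)
times = [t for t, _ in events]
```

On average this produces about `n_actors * rate * period` ministeps per period, which is how the rate function controls the number of ministeps between two observations.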


This results in the following parameters within this type of model: the constant change rate and the weights that indicate the influence of attributes and degrees on the change rate; the weights of the objective function; and the weights of the gratification function.


To estimate these parameters, the Robbins-Monro approximation method is used, together with Monte Carlo simulation methods (which repeatedly simulate the evolution of the network) to obtain expected values of the relevant statistics and parameters. A corresponding t-statistic, based on the estimated parameters and their standard errors from the Robbins-Monro method mentioned above, is used to test the significance of the estimated parameters.
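The Robbins-Monro idea can be shown on a toy problem: adjust a parameter until the simulated statistic matches the observed one on average. This is only a sketch of the general recursion, not the estimation algorithm used for actor-oriented models. The one-parameter "network" (independent ties with a logistic tie probability), the gain sequence, and all names are assumptions made for illustration.

```python
import math
import random

# Toy Robbins-Monro recursion; the simulator and gain values are hypothetical.

def simulate_tie_count(theta, n_dyads, rng):
    """Simulate a tie count where each tie exists with probability logistic(theta)."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return sum(rng.random() < p for _ in range(n_dyads))

def robbins_monro(target, n_dyads, steps=2000, gain=0.04, seed=1):
    """Find theta so that the expected simulated tie count equals `target`."""
    rng = random.Random(seed)
    theta = 0.0
    for n in range(1, steps + 1):
        simulated = simulate_tie_count(theta, n_dyads, rng)
        # Move against the deviation, with a step size that shrinks as 1/n.
        theta -= (gain / n) * (simulated - target)
    return theta

theta_hat = robbins_monro(target=30, n_dyads=100)
expected_ties = 100 / (1 + math.exp(-theta_hat))
```

The shrinking step size is what lets the recursion settle despite the simulation noise; a significance test would then divide the estimate by its standard error to form the t-statistic.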


Overall, these stochastic models allow us to test hypotheses about tendencies and to estimate parameters expressing their strengths while controlling for other tendencies or confounders.


Introduction to Stochastic Models


Snijders, T. A. B., C. E. G. Steglich, and G. G. Van de Bunt. 2009. Introduction to actor-based models for network dynamics. Social Networks.



Data requirements for this type of analysis: at least two observations are needed, and the number of observations is often much less than 10. With more than two waves, you study the differences between the first and second time points and then progress the analysis. The number of actors will usually be larger than 20. A total of 40 changes (cumulated over successive panel waves) is on the low side; more changes give more information. Network data also need to be relatively complete, although a limited amount of missing data can be dealt with. Additionally, the use of structural zeros allows several small networks to be combined into one structure for analysis.
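Two of these data checks are easy to sketch: counting the tie changes cumulated over successive waves, and combining small networks into one structure in which ties between the groups are marked as impossible. The function names and the sentinel value `STRUCTURAL_ZERO` are hypothetical; real software has its own coding for structural zeros.

```python
# Sketch of two data-preparation helpers; names and the sentinel are assumptions.

STRUCTURAL_ZERO = 10  # hypothetical marker for "this tie cannot exist"

def count_changes(waves):
    """Total number of tie changes cumulated over successive waves."""
    total = 0
    for before, after in zip(waves, waves[1:]):
        n = len(before)
        total += sum(
            before[i][j] != after[i][j]
            for i in range(n) for j in range(n) if i != j
        )
    return total

def combine_networks(a, b):
    """Block-diagonal combination with structural zeros between the groups."""
    na, nb = len(a), len(b)
    combined = [[STRUCTURAL_ZERO] * (na + nb) for _ in range(na + nb)]
    for i in range(na):
        for j in range(na):
            combined[i][j] = a[i][j]
    for i in range(nb):
        for j in range(nb):
            combined[na + i][na + j] = b[i][j]
    return combined

w1 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
w2 = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
w3 = [[0, 0, 1], [1, 0, 0], [0, 0, 0]]
n_changes = count_changes([w1, w2, w3])

small = [[0, 1], [0, 0]]
joint = combine_networks(w1, small)
```

A change count far below the figure of 40 mentioned above would suggest that the data carry little information about the dynamics.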


Strategy for model selection:

- It is often advisable to start with a model that includes all effects that are expected to be strong; however, in complicated models forward selection might be easier.

- For simple models a simple standard initial value for the estimation algorithm works best; for complicated models it may be easier to start with an initial value obtained from a simpler model.

- High parameter correlations do not mean that an effect should be dropped, as network statistics are highly correlated by nature. Parameters that are significant may be added in the next model round, and it is important for model selection to be guided by theory.



- Among structural effects, the outdegree, reciprocity, and some form of transitivity effect should be included in models by default.

- It is good practice to include control effects from the start.

- At some point in the modeling process one should check the degree-based effects (popularity, activity, assortativity) to see their influence in the model.

- It is good to check the indegrees and outdegrees for outliers and then seek actor covariates that can explain the outliers, or use dummy variables to capture this effect.
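The last check in the list above, looking for degree outliers, can be sketched as follows. The threshold rule (more than two standard deviations above the mean) is an arbitrary assumption chosen for illustration, as are the function names; the notes do not say how outliers should be identified.

```python
import statistics

# Sketch of a degree-outlier check; the 2-standard-deviation rule is an assumption.

def degrees(adj):
    """Return (indegrees, outdegrees) for a 0/1 directed adjacency matrix."""
    n = len(adj)
    out = [sum(adj[i][j] for j in range(n) if j != i) for i in range(n)]
    inn = [sum(adj[i][j] for i in range(n) if i != j) for j in range(n)]
    return inn, out

def flag_high(values, k=2.0):
    """Indices of actors whose degree exceeds mean + k * standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [idx for idx, v in enumerate(values) if v > mean + k * sd]

# Toy network in which everyone nominates actor 0.
network = [
    [0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
]
indeg, outdeg = degrees(network)
popular = flag_high(indeg)
active = flag_high(outdeg)
```

A flagged actor such as actor 0 here is a candidate for an explanatory covariate or a dummy variable.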



Example: Started with a simple model. A friendship network of 26 students in a Dutch school class was studied; network and other data were collected at 4 points in time at intervals of three months. The students (17 girls and 9 boys) were aged 11-13. The analysis found a high degree of reciprocity, segregation according to sex, a strong effect of previous friendships, and evidence of transitive closure. Also noted were a negative 3-cycle effect, a negative effect of out-degree popularity, and a sender effect of sex. The authors then proceeded to a more complicated model.


Evolution of Sociology Freshmen into a Friendship Network


Van Duijn, M. A. J., E. P. H. Zeggelink, M. Huisman, F. N. Stokman, and F. W. Wasseur. 2003. Evolution of sociology freshmen into a friendship network. Journal of Mathematical Sociology 27:153-191.


This was a study that looked to answer the research question: “What kinds of individual and
network variables explain changes over time within a friendship network? At what stages and
why, are these variables important?”


The group felt that there were 4 main factors that determine change in meeting and friendship networks over time: physical proximity, visible similarity (gender, ethnicity, age), invisible similarity (attitudes and activities), and network opportunity (the ability to meet people through other people). These effects are expected to be more important at different stages in the meeting and friendship formation process (initial, middle, and final stages).


The study looked at 32 sociology students who were in either a traditional or an accelerated program, who were approximately 18 or 22 years of age depending on the group, and who were given questionnaires seven times during their first year at the university.




The students were questioned more frequently at the beginning than at the end of the year, as the thought was that more change would occur at the beginning of the year. There were 5 network measurements in total, with 38, 25, 28, 18, and 18 students completing the questionnaire, respectively.


Proximity variables were program and smoking behavior. The visible variable included in the study was gender. The invisible variables included activities like going out; five were chosen based on responses to the questionnaire. An additional variable included in the study was marijuana use, which may indicate a subculture among the group.


Results:

- The amount of change in the network decreases over time, and the amount of change is lower in the mating than in the meeting network.

- Balance plays an important role in all three stages of the mating process, especially in the initial stage, and the role of balance is only modest in the final stage of the meeting network.

- Popularity is especially important in the initial stages of both the meeting and mating networks, which implies a preference for popular others.

- The proximity parameter program turned out to be important, especially in the early stages of both networks.

- The visibility parameter of gender is significant only in the initial stages of the mating process but not in the meeting network.