Egocentric network analysis with R
2024-06-20
Chapter 1 Introduction
This book is a companion to my workshop on egocentric network analysis with R. Over the past several years I have taught this workshop at different conferences and summer schools on social network analysis, personal networks, and network science – such as INSNA Sunbelt conferences, NetSci conferences, and the UAB Barcelona Course on Personal Network Analysis. The book is a work in progress and I’ll keep updating it as I continue to teach the workshop and related courses.
A Spanish translation of part of this book has appeared in the journal REDES - Revista hispana para el análisis de redes sociales. My colleagues and I provide a more in-depth discussion of personal network analysis and the methods covered in this workshop in our textbook on personal network research (McCarty et al., 2019). See the bibliography at the end of the book for a list of other useful references on egocentric or personal network analysis.
Feel free to contact me to learn more about this workshop and how to take it, to give me feedback, or to report any issue with this book.
1.1 Workshop setup
To take this workshop you need to:
1. Download the latest version of R here (select a location near you).
   - Follow the instructions to install R on your computer.
   - If you downloaded R some time ago, please update it to the latest version.
2. Download the latest version of RStudio (free version) here.
   - Follow the instructions to install RStudio on your computer.
   - If you downloaded RStudio some time ago, please update it to the latest version.
3. Install the R packages listed below.
   - Open RStudio and go to Top menu > Tools > Install packages...
   - Install each package in the list.
4. Bring your laptop to the workshop.
5. Download the workshop folder and save it to your computer: see below.
   - I recommend that you do this in class at the beginning of the workshop so as to download the most up-to-date version of the folder.
6. Once in class, go to the workshop folder on your computer (point 5 above) and double-click on the R project file in it (.Rproj extension).
   - That will open RStudio: you’re all set!
NOTE: It’s very important that you save the workshop folder as downloaded to a location on your computer, and open the .Rproj file within that folder. By doing so, you will be opening RStudio and setting the workshop folder as your R working directory. All our R scripts assume that working directory. In particular, they assume that the Data subfolder is in your R working directory. You can type getwd() in your R console to see the path to your R working directory and make sure that it correctly points to the location of the workshop folder on your computer.
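As a quick check after opening the project, something like the following could be run in the R console. It is only a minimal sketch in base R, and it assumes the subfolder is named Data exactly as described above.

```r
# Path to the current R working directory
wd <- getwd()
print(wd)

# TRUE if a "Data" subfolder sits directly inside the working directory
has_data <- dir.exists(file.path(wd, "Data"))

if (!has_data) {
  message("No 'Data' subfolder found in ", wd,
          ": open RStudio via the .Rproj file in the workshop folder.")
}
```

If the message appears, RStudio was probably not launched from the .Rproj file in the workshop folder.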
1.2 Workshop materials
The materials for this workshop consist of this book and the workshop folder.
You can download the workshop folder from this GitHub repository:
- Click on the green Code button > Download ZIP
- Unzip the folder and save it to your computer
The workshop folder contains several files and folders, but you only need to focus on the following:
- Scripts subfolder: all the R code shown in this book.
- Data subfolder: all the data we’re going to use.
- egocentric-r.Rproj: the workshop’s R project file (you use this to launch RStudio).
The Scripts subfolder includes different R script (.R) files. You can access and run the R code in each script by opening the corresponding .R file in RStudio. Each script in Scripts corresponds to one of the following chapters (see the table of contents):
- Basics of the R language (02_Basics script and slideshow).
- Representing and visualizing ego-networks in R (03_Representing_egonets.R script and slideshow).
- Analyzing ego-network composition (04_Composition script and slideshow).
- Analyzing ego-network structure (05_Structure script and slideshow).
- Modeling tie- or alter-level variables with multilevel models (06_Multilevel script and slideshow).
- Introduction to the egor package (07_egor script and slideshow).
- Supplementary topics (08_Supplementary script and slideshow).
The slides used for this workshop can be downloaded here.
1.3 R settings
1.3.1 Required R packages
- For descriptive statistics:
- For network data, measures, visualization:
- To fit multilevel statistical models and view their results:
- General: tidyverse. This is a collection of different packages that share a common language and set of principles, including dplyr, ggplot2, and purrr. See Wickham and Grolemund (2017) for more information.
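As an alternative to the RStudio menu, packages can also be installed from the console. The sketch below is illustrative only: the vector pkgs contains just the packages named in this chapter (tidyverse, igraph, egor) and is not the complete list required for the workshop. The interactive() guard is my own addition, so that nothing is installed when the code runs in a non-interactive session.

```r
# Packages named in this chapter (assumed, incomplete list)
pkgs <- c("tidyverse", "igraph", "egor")

# Packages from the list that are not yet installed
to_install <- setdiff(pkgs, rownames(installed.packages()))

# Install only what is missing, and only in an interactive session
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```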
1.3.2 RStudio options
RStudio gives you the ability to select and change various settings and features of its interface: see the Preferences... menu option. These are some of the settings you should pay attention to:
- Preferences... > Code > Editing > Soft-wrap R source file. Here you can decide whether or not to wrap long code lines in the editor. When code lines in a script are not wrapped, be aware that some code will be hidden if script lines are longer than your editor window’s width (you’ll have to scroll right to see the rest of the code). With a script (.R) file open in the editor, try both options (checked and unchecked) to see what you’re more comfortable with.
- Preferences... > Code > Display > Highlight R function calls. This allows you to highlight all pieces of code that call an R function (“command”). I find function highlights very helpful to navigate a script and suggest that you check this option.
1.4 Data
This workshop uses real-world data collected in 2012 with a personal network survey among 107 Sri Lankan immigrants in Milan, Italy. Out of the 107 respondents, 102 reported their personal network. All data are in the Data subfolder.
The data files include ego-level data (gender, age, educational level, etc. for each Sri Lankan respondent), alter attributes (alter’s nationality, country of residence, closeness to ego, etc.), and information on alter-alter ties. Each personal network has a fixed size of 45 alters. Information about data variables and categories is available in ./Data/codebook.xlsx.
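The figures above imply some simple totals, which the following base R sketch works out. It assumes that alter-alter ties are undirected, which is not stated here, so the maximum number of ties in one network is the number of unordered pairs of alters.

```r
n_networks <- 102  # respondents who reported a personal network
n_alters   <- 45   # fixed number of alters per network

# Total number of alters across all reported networks
total_alters <- n_networks * n_alters

# Maximum number of undirected alter-alter ties within one network
max_ties <- choose(n_alters, 2)

print(c(total_alters = total_alters, max_ties = max_ties))
```

Under these assumptions there are 4590 alters in total, and each network can contain at most 990 alter-alter ties.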
All data objects are saved as R objects in the R data file data.rda. Data objects are the following:
- ego.df: A data frame with ego-level attributes for all respondents (egos).
- alter.attr.all: A data frame with alter-level attributes for all alters from all respondents.
- gr.list: A list. Each list element is one ego-network stored as an igraph object.
- alter.attr.28: A data frame with alter-level attributes only for alters nominated by ego ID 28.
- gr.28: The ego-network of ego ID 28 stored as an igraph object.
- gr.ego.28: The same as gr.28, but with the node of the ego included in the igraph object.
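A minimal sketch of loading these objects and taking a first look at them is given below. It assumes that data.rda sits directly in the Data subfolder and that the working directory is the workshop folder; the check with file.exists() is my own addition so that the code fails gracefully otherwise.

```r
# Assumed location of the R data file, relative to the working directory
data_file <- file.path("Data", "data.rda")

if (file.exists(data_file)) {
  # load() returns the names of the objects it restores
  loaded <- load(data_file)
  print(loaded)

  # Number of ego-networks stored in the list, and class of the first one
  print(length(gr.list))
  print(class(gr.list[[1]]))
} else {
  message("File not found: ", data_file, " (check getwd())")
}
```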
The R data objects above were imported from raw csv data files. All csv files are in the ./Data/raw_data/ subfolder:
- ego_data.csv: A csv file with ego-level data for all the egos.
- alter_attributes.csv: A single csv file including attributes of all alters from all egos.
- alter_ties_028.csv: The edge list for ego ID 28’s egocentric network.
- alter_attributes_028.csv: The alter attributes in ego ID 28’s egocentric network.
- adj_028.csv: The adjacency matrix for ego ID 28’s egocentric network.
- alter_ties.csv: A single csv file with the edge list for all alters from all egos.
For most of the workshop we will directly use R data objects in data.rda. We will not focus on importing ego-network data from outside sources (for example, csv files). However, Section 7.1 shows you how the data objects in data.rda were created by importing csv files with the egor package. Section 8.2 covers importing ego-network data using just tidyverse and igraph.