Quickstart to Inference

Cassy Dorff and Shahryar Minhas

2026-06-25

This is a minimal end-to-end tour of netify that uses only the data bundled with the package – no peacesciencer, no countrycode, no external downloads. If you want the full IR-data walkthrough with covariates from outside sources, see the Foundations article on the project site.

We’ll cover the four things netify is for:

  1. Build a network object from dyadic data
  2. Explore it with summary statistics and a quick plot
  3. Test a couple of basic inferential questions
  4. Bridge out to other packages when you want to model

library(netify)
library(ggplot2)

1. build

The bundled icews dataset has ICEWS event-data slices for 152 countries from 2002 to 2014 with the four “quad” variables (verbal/material by cooperation/conflict) plus a handful of nodal covariates.

data(icews)
head(icews[, c("i", "j", "year", "verbCoop", "matlConf", "i_polity2", "i_region")])
#>             i       j year verbCoop matlConf i_polity2   i_region
#> 2 Afghanistan Albania 2002        6        0        NA South Asia
#> 3 Afghanistan Albania 2003        1        0        NA South Asia
#> 4 Afghanistan Albania 2004       10        1        NA South Asia
#> 5 Afghanistan Albania 2005        0        0        NA South Asia
#> 6 Afghanistan Albania 2006        6       21        NA South Asia
#> 7 Afghanistan Albania 2007        3        0        NA South Asia
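For intuition about what "dyadic data to network" means, here is a minimal base-R sketch that turns a toy dyadic data frame into one sender-by-receiver matrix per year. It is not netify's implementation; the data frame `dyads`, the helper `to_matrix()`, and all values are hypothetical.

```r
# Toy dyadic data: one row per sender-receiver-year (values are made up)
dyads <- data.frame(
  i    = c("A", "A", "B", "C", "A", "B"),
  j    = c("B", "C", "A", "A", "B", "C"),
  year = c(2002, 2002, 2002, 2002, 2003, 2003),
  w    = c(5, 2, 4, 1, 3, 7),
  stringsAsFactors = FALSE
)
actors <- sort(unique(c(dyads$i, dyads$j)))

# Hypothetical helper: fill a sender-by-receiver matrix for one year.
# Dyads that never appear in the data stay at 0.
to_matrix <- function(d) {
  m <- matrix(0, length(actors), length(actors),
              dimnames = list(actors, actors))
  m[cbind(d$i, d$j)] <- d$w   # index cells by (sender, receiver) names
  diag(m) <- NA               # self-ties are undefined here
  m
}

# One matrix per year, named by year
nets <- lapply(split(dyads, dyads$year), to_matrix)
nets[["2002"]]
```

Rows are senders and columns are receivers, so `nets[["2002"]]["A", "B"]` is what A sent to B in 2002.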

Turn it into a netify object with one call:

verb_coop <- netify(
    icews,
    actor1 = "i", actor2 = "j", time = "year",
    symmetric = FALSE,
    weight   = "verbCoop",
    nodal_vars = c("i_polity2", "i_log_gdp", "i_region"),
    dyad_vars  = c("matlCoop", "verbConf")
)

print(verb_coop)
#> ✔ Network data created.
#> • Unipartite
#> • Asymmetric
#> • Weights from `verbCoop`
#> • Longitudinal: 13 Periods
#> • # Unique Actors: 152
#> Network Summary Statistics (averaged across time):
#>           dens miss   mean recip trans
#> verbCoop 0.418    0 45.869 0.976 0.627
#> • Nodal Features: i_polity2, i_log_gdp, i_region
#> • Dyad Features: matlCoop, verbConf

The print() summary tells you the network is unipartite, directed (reported as “Asymmetric”), and longitudinal (13 periods), with 152 actors and with nodal and dyadic features attached. The numeric table shows density, missingness, mean edge weight, reciprocity, and transitivity averaged across time.
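To make a few of those numbers concrete, here is a rough base-R sketch on a toy directed, weighted matrix. The matrix `A` is hypothetical, and the exact formulas are assumptions: density as the share of non-zero ordered pairs, the mean taken over all ordered pairs (zeros included), and transitivity computed on the binarized, symmetrized graph. netify's own definitions may differ in detail.

```r
# Toy directed, weighted network (made-up values); rows send, columns receive
A <- matrix(c(0, 3, 0, 1,
              2, 0, 0, 2,
              0, 5, 0, 4,
              1, 0, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))
off_diag <- row(A) != col(A)

# Density: share of ordered pairs with a non-zero tie
dens <- mean(A[off_diag] != 0)

# Mean edge weight over all ordered pairs (assumption: zeros included)
mean_weight <- mean(A[off_diag])

# Transitivity on the binarized, symmetrized graph:
# closed two-paths divided by all two-paths
B <- (A != 0 | t(A) != 0) * 1
diag(B) <- 0
B2 <- B %*% B                          # B2[i, j] = number of shared partners
closed <- sum(B2 * B)                  # two-paths whose endpoints are tied
two_paths <- sum(B2) - sum(diag(B2))   # two-paths with distinct endpoints
trans <- closed / two_paths

c(dens = dens, mean = mean_weight, trans = trans)
```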

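A quick glossary, since several of these terms appear before they’re defined elsewhere:

  - Unipartite: all actors belong to one set, so any actor can send to or receive from any other.
  - Asymmetric: ties are directed, so the i-to-j value can differ from the j-to-i value.
  - Longitudinal: one network per time period (here, one per year).
  - dens (density): the share of possible ties that are non-zero.
  - miss (missingness): the share of dyads with a missing value.
  - mean: the mean edge weight.
  - recip (reciprocity): how closely the ties an actor sends match the ties it receives.
  - trans (transitivity): how often two actors that share a partner are themselves tied.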

2. explore

summary() returns one row per time period with graph-level statistics:

gs <- summary(verb_coop)
head(gs[, c("net", "num_actors", "density", "reciprocity", "mutual", "transitivity")])
#>    net num_actors   density reciprocity    mutual transitivity
#> 1 2002        152 0.3787034   0.9778217 0.8537001    0.6058952
#> 2 2003        152 0.3871994   0.9632488 0.8479933    0.6072045
#> 3 2004        152 0.4145173   0.9769563 0.8452289    0.6215978
#> 4 2005        152 0.4071976   0.9804325 0.8386779    0.6215075
#> 5 2006        152 0.4108139   0.9771928 0.8510012    0.6277829
#> 6 2007        152 0.4243203   0.9783703 0.8511690    0.6330626

Note that summary() reports both reciprocity (the correlation between \(A\) and \(A^T\), useful for weighted networks) and mutual (the classic mutual-dyad proportion). Report whichever fits your audience.
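The two measures can diverge, so a small base-R sketch may help. The toy matrix `A` is hypothetical. The correlation version follows the description above; the binary version uses one common definition of the mutual-dyad proportion (the share of directed ties that are returned), which is an assumption and may not match netify's exact formula.

```r
# Toy directed, weighted network (made-up values); rows send, columns receive
A <- matrix(c(0, 3, 0, 1,
              2, 0, 0, 2,
              0, 5, 0, 4,
              1, 0, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))
off_diag <- row(A) != col(A)

# Weighted reciprocity: correlation between A and its transpose,
# using off-diagonal cells only
recip_cor <- cor(A[off_diag], t(A)[off_diag])

# Binary version (assumed definition): share of directed ties that are returned
B <- A != 0
mutual_share <- sum(B & t(B) & off_diag) / sum(B & off_diag)

c(reciprocity = recip_cor, mutual = mutual_share)
```

The correlation uses the weights, so it can be low even when most ties are returned, if the returned amounts are very unequal.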

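Actor-level stats – degree, prop_ties, centrality, strength – come from summary_actor(). Quick definitions of what shows up in the columns:

  - degree_in / degree_out: the number of ties an actor receives / sends.
  - betweenness: how often an actor sits on the shortest paths between other actors.
  - authority_score / hub_score: HITS-style scores; authorities receive ties from strong hubs, and hubs send ties to strong authorities.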

as_ <- summary_actor(verb_coop)
head(as_[, c("actor", "time", "degree_in", "degree_out",
             "betweenness", "authority_score", "hub_score")])
#>         actor time degree_in degree_out  betweenness authority_score
#> 1 Afghanistan 2002        92         81 1.324503e-02      0.26866096
#> 2     Albania 2002        46         49 0.000000e+00      0.01252200
#> 3     Algeria 2002        68         65 0.000000e+00      0.01406673
#> 4      Angola 2002        64         67 8.830022e-05      0.01905067
#> 5   Argentina 2002        48         48 0.000000e+00      0.03580770
#> 6     Armenia 2002        71         66 0.000000e+00      0.05049222
#>     hub_score
#> 1 0.183233794
#> 2 0.009730031
#> 3 0.010671854
#> 4 0.011275109
#> 5 0.024732267
#> 6 0.039622725
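The degree columns are easy to reproduce by hand, which is a useful sanity check. This base-R sketch uses a hypothetical toy matrix `A`; the `strength_in` and `strength_out` column names are illustrative and not necessarily what summary_actor() returns.

```r
# Toy directed, weighted network (made-up values); rows send, columns receive
A <- matrix(c(0, 3, 0, 1,
              2, 0, 0, 2,
              0, 5, 0, 4,
              1, 0, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))

# Degree counts ties; strength sums their weights
degree_out   <- rowSums(A != 0)   # ties sent
degree_in    <- colSums(A != 0)   # ties received
strength_out <- rowSums(A)        # total weight sent
strength_in  <- colSums(A)        # total weight received

actor_stats <- data.frame(
  actor = rownames(A),
  degree_in, degree_out, strength_in, strength_out,
  row.names = NULL
)
actor_stats
```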

Plot it. The default uses auto_format = TRUE and adapts to network size – for 152 actors over 13 years it’ll suppress text labels and tone down edge alpha automatically:

plot(verb_coop,
     time_filter = c("2004", "2008", "2012"),
     node_color_by = "i_region",
     edge_alpha = 0.1) +
  theme(legend.position = "bottom")

3. test (basic inferential)

A common first question is homophily: do similar countries cooperate more? The call below computes only the descriptive homophily statistic, which is fast. Set significance_test = TRUE when you also want the permutation p-value and interval; otherwise p_value, ci_lower, and ci_upper are NA.

hom <- homophily(verb_coop,
                 attribute = "i_polity2",
                 method = "correlation",
                 significance_test = FALSE)
head(hom)
#>    net    layer attribute      method threshold_value homophily_correlation
#> 1 2002 verbCoop i_polity2 correlation               0            0.04822429
#> 2 2003 verbCoop i_polity2 correlation               0            0.06630047
#> 3 2004 verbCoop i_polity2 correlation               0            0.04929628
#> 4 2005 verbCoop i_polity2 correlation               0            0.05016483
#> 5 2006 verbCoop i_polity2 correlation               0            0.04604959
#> 6 2007 verbCoop i_polity2 correlation               0            0.03108231
#>   mean_similarity_connected mean_similarity_unconnected similarity_difference
#> 1                 -6.897821                   -7.462007             0.5641861
#> 2                 -6.745410                   -7.514014             0.7686034
#> 3                 -6.874874                   -7.452968             0.5780945
#> 4                 -6.745738                   -7.336404             0.5906658
#> 5                 -6.787059                   -7.331676             0.5446173
#> 6                 -6.869626                   -7.230992             0.3613660
#>   p_value ci_lower ci_upper n_connected_pairs n_unconnected_pairs n_missing
#> 1      NA       NA       NA              8260               13792         3
#> 2      NA       NA       NA              8225               13237         5
#> 3      NA       NA       NA              8903               12853         4
#> 4      NA       NA       NA              8916               13136         3
#> 5      NA       NA       NA              8979               13073         3
#> 6      NA       NA       NA              9281               12771         3
#>   n_pairs
#> 1   22052
#> 2   21462
#> 3   21756
#> 4   22052
#> 5   22052
#> 6   22052
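If you want a feel for what a correlation-based homophily statistic and its permutation test involve, here is a self-contained base-R sketch on simulated data. Everything in it is an assumption for illustration: similarity is taken as the negative absolute difference in the attribute (consistent with the negative mean_similarity values above, but not confirmed as netify's formula), the statistic is the correlation between similarity and tie presence, and the null shuffles the attribute across actors. The helper `homophily_cor()` is hypothetical.

```r
set.seed(42)
n <- 30
attr_val <- rnorm(n)                          # hypothetical nodal attribute

# Simulate a directed binary network where similar actors tie more often
sim <- -abs(outer(attr_val, attr_val, "-"))   # similarity: 0 = identical
tie <- matrix(rbinom(n * n, 1, plogis(0.5 + 1.5 * sim)), n, n)
off_diag <- row(tie) != col(tie)

# Hypothetical helper: correlation between dyadic similarity and tie presence
homophily_cor <- function(x, tie) {
  s <- -abs(outer(x, x, "-"))
  cor(s[off_diag], tie[off_diag])
}
obs <- homophily_cor(attr_val, tie)

# Permutation null: shuffle the attribute across actors, keep the network fixed
n_perm <- 500
perm <- replicate(n_perm, homophily_cor(sample(attr_val), tie))

# Two-sided p-value with the usual +1 correction
p_value <- (1 + sum(abs(perm) >= abs(obs))) / (1 + n_perm)

# Central 95% range of the null distribution (not a CI for obs itself)
null_range <- quantile(perm, c(0.025, 0.975))
```

Shuffling the attribute rather than the ties keeps the network's structure intact, so the null only breaks the link between who is similar and who is connected.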

For a categorical attribute, mixing_matrix() gives the full who-with-whom table:

mm <- mixing_matrix(verb_coop, attribute = "i_region", normalized = TRUE)
round(mm$mixing_matrices[[1]], 2)
#>                            East Asia & Pacific Europe & Central Asia
#> East Asia & Pacific                       0.03                  0.04
#> Europe & Central Asia                     0.04                  0.18
#> Latin America & Caribbean                 0.01                  0.03
#> Middle East & North Africa                0.02                  0.04
#> North America                             0.00                  0.01
#> South Asia                                0.01                  0.02
#> Sub-Saharan Africa                        0.02                  0.04
#>                            Latin America & Caribbean Middle East & North Africa
#> East Asia & Pacific                             0.01                       0.02
#> Europe & Central Asia                           0.02                       0.05
#> Latin America & Caribbean                       0.03                       0.01
#> Middle East & North Africa                      0.01                       0.04
#> North America                                   0.00                       0.00
#> South Asia                                      0.00                       0.01
#> Sub-Saharan Africa                              0.01                       0.03
#>                            North America South Asia Sub-Saharan Africa
#> East Asia & Pacific                 0.00       0.01               0.02
#> Europe & Central Asia               0.01       0.02               0.04
#> Latin America & Caribbean           0.00       0.00               0.01
#> Middle East & North Africa          0.00       0.01               0.03
#> North America                       0.00       0.00               0.01
#> South Asia                          0.00       0.00               0.01
#> Sub-Saharan Africa                  0.01       0.01               0.07
mm$summary_stats[1, ]
#>    net    layer attribute assortativity diagonal_proportion entropy modularity
#> 1 2002 verbCoop  i_region     0.1717675           0.3507823 3.33186  0.1346415
#>   n_groups total_ties
#> 1        7       8692
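A mixing matrix is just a cross-tabulation of ties by sender group and receiver group. The base-R sketch below builds one for a hypothetical toy network with a made-up `region` attribute, then normalizes it so the cells sum to 1 and adds up the diagonal, which is how the normalized table and diagonal_proportion above appear to relate. It is an illustration, not netify's code.

```r
# Toy directed network (made-up values); rows send, columns receive
A <- matrix(c(0, 3, 0, 1,
              2, 0, 0, 2,
              0, 5, 0, 4,
              1, 0, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))
region <- c(a = "North", b = "North", c = "South", d = "South")  # hypothetical

# One row per existing tie, as (row, col) index pairs
ties <- which(A != 0, arr.ind = TRUE)
lv <- unique(region)
sender_region   <- factor(region[rownames(A)[ties[, "row"]]], levels = lv)
receiver_region <- factor(region[colnames(A)[ties[, "col"]]], levels = lv)

# Counts of ties by sender group (rows) and receiver group (columns)
mix <- table(sender = sender_region, receiver = receiver_region)

# Normalized version: cells sum to 1
mix_norm <- mix / sum(mix)

# Share of ties that stay within a group
diag_prop <- sum(diag(mix_norm))
round(mix_norm, 2)
```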

compare_networks() is the all-purpose comparison tool. For a longitudinal netify it returns pairwise temporal comparisons:

temp_cmp <- compare_networks(verb_coop, method = "correlation")
head(temp_cmp$summary)
#>        metric      mean         sd       min       max
#> 1 correlation 0.8391837 0.05285993 0.7127432 0.9364287
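One way to read that summary row: correlate the edge weights of every pair of periods, then summarize the pairwise values. The base-R sketch below does this on simulated matrices. It assumes the comparison is a plain Pearson correlation over off-diagonal cells, which may not be exactly what compare_networks() computes; the list `nets` and its values are made up.

```r
set.seed(1)
n <- 10
base <- matrix(rpois(n * n, 3), n, n)

# Three hypothetical periods: a shared core plus period-specific noise
nets <- list(
  "2002" = base,
  "2003" = base + matrix(rpois(n * n, 1), n, n),
  "2004" = base + matrix(rpois(n * n, 2), n, n)
)
off_diag <- row(base) != col(base)

# Flatten each period to a vector of off-diagonal weights (dyads x periods)
flat <- sapply(nets, function(m) m[off_diag])

# Period-by-period correlation matrix, then the unique pairs
cmat <- cor(flat)
pairwise <- cmat[upper.tri(cmat)]

c(mean = mean(pairwise), sd = sd(pairwise),
  min = min(pairwise), max = max(pairwise))
```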

You can also slice by a nodal attribute and compare the resulting subnetworks:

by_region <- compare_networks(
  subset(verb_coop, time = "2010"),
  by = "i_region",
  method = "correlation"
)
by_region$by_group$n_actors_per_group
#>        East Asia & Pacific      Europe & Central Asia 
#>                         19                         43 
#>  Latin America & Caribbean Middle East & North Africa 
#>                         23                         19 
#>              North America                 South Asia 
#>                          2                          6 
#>         Sub-Saharan Africa 
#>                         40

4. bridge

netify intentionally stops at descriptives + basic inference. For statistical models, hand off to a downstream package:

Want to fit…                           | Use this                                             | netify exporter
Latent factor / AME                    | amen                                                 | to_amen(netlet)
ERGM                                   | vignette("pipeline_netify_ergm", package = "netify") | to_statnet(netlet)
Community detection / graph algorithms | igraph                                               | to_igraph(netlet)
Roll your own dyadic regression        | base R or modeling packages                          | unnetify(netlet) for a long data frame

Additional project-site articles cover optional workflows, including latent-factor and multilayer modeling handoffs.

Example: convert to igraph and use a function netify doesn’t expose:

ig <- to_igraph(verb_coop)             # list of igraph objects, one per year
ig_2010 <- ig[["2010"]]
length(igraph::cluster_walktrap(ig_2010))
#> [1] 6

Or flatten back to a dyadic data frame:

df <- unnetify(subset(verb_coop, time = "2010"), remove_zeros = TRUE)
head(df[, c("from", "to", "verbCoop", "matlCoop", "i_polity2_from", "i_polity2_to")])
#>          from         to verbCoop matlCoop i_polity2_from i_polity2_to
#> 1 Afghanistan  Argentina        1        0             NA            8
#> 2 Afghanistan    Armenia        7        2             NA            5
#> 3 Afghanistan  Australia      125        0             NA           10
#> 4 Afghanistan    Austria        1        0             NA           10
#> 5 Afghanistan Azerbaijan        7        0             NA           -7
#> 6 Afghanistan    Bahrain        3        0             NA           -5

tl;dr

net <- netify(df, actor1 = "i", actor2 = "j", time = "year", weight = "x")
summary(net)            # graph-level stats
summary_actor(net)      # actor-level stats
plot(net)               # ggplot-based network visual
homophily(net, attribute = "v")   # do similar actors connect?
compare_networks(net)             # how do periods/layers/groups differ?
to_amen(net)  # or to_dbn / to_statnet / to_igraph when you're ready to model

For the long version with peacesciencer data and a full IR walkthrough, see the Foundations article on the project site.

a note for non-time use cases

Most of netify’s docs talk about “longitudinal” networks because that is the common social-science use case. The underlying longit_list structure is more general: it is a list of matrices, each with its own actor set. Common non-time examples include one network per subject, per replicate, or per experimental condition.

For these, build a list of matrices (one per partition) and use either new_netify(list_of_matrices) or netify(df, ..., time = "partition_id"). The same per-period operations still apply: summary(), summary_actor(), faceted plot(), and subset(time = ...). The column is called time, but it can hold a subject, replicate, condition, or other slice id. If that framing gets confusing in your context, alias it locally:

# build a per-subject network library
subjects <- new_netify(list_of_subject_matrices)
# treat the "time" dimension as subject
per_subject_stats <- summary(subjects)

In other words, longit_list is best read as a general partition format rather than a strictly temporal one.
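As a rough picture of that "list of matrices, each with its own actor set" idea, here is a base-R sketch with one simulated network per subject and a per-partition summary. It does not call netify; the helper `make_net()`, the subject ids, and the actor labels are all hypothetical.

```r
set.seed(7)

# Hypothetical helper: random directed binary network over a given actor set
make_net <- function(actors) {
  n <- length(actors)
  m <- matrix(rbinom(n * n, 1, 0.4), n, n, dimnames = list(actors, actors))
  diag(m) <- 0   # no self-ties
  m
}

# One matrix per subject; actor sets differ across partitions
by_subject <- list(
  subj_01 = make_net(c("a", "b", "c")),
  subj_02 = make_net(c("a", "b", "c", "d", "e")),
  subj_03 = make_net(c("b", "d", "f", "g"))
)

# Per-partition summary: one row per subject instead of one row per year
stats <- do.call(rbind, lapply(names(by_subject), function(id) {
  m <- by_subject[[id]]
  n <- nrow(m)
  data.frame(partition = id, num_actors = n,
             density = sum(m) / (n * (n - 1)))
}))
stats
```

Nothing in the loop cares whether the list names are years or subject ids, which is the sense in which the "time" slot is really a partition slot.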