xSub offers data in four formats, formed by combining two types of data sources (individual and multiple) with two data structures (spatial panel data and raw events).
For spatial panel data from an individual source, users select the following through the online interface:
a country (or multiple countries)
a data source
a unit of analysis
space: country / province / district / grid / electoral constituency
time: year / month / week
The website then generates a compressed archive with:
dataset
file format: comma-separated values text file (.csv)
data structure: cross-sectional time-series
number of observations: N × T, where N is the number of spatial units (e.g. districts, grid cells), and T is the number of time units (e.g. months, weeks)
variables included: unit IDs, violence event counts (broken down by actors and tactics), covariates (local demographics, geography, ethnicity, weather)
map and time plot of violence
file format: portable network graphics (.png)
map: number of violent events per spatial unit (darker shade = more violence)
time plot: number of violent events per time unit (higher bars = more violence)
codebook
file format: portable document format (.pdf)
contents: information on variable names, data structure
actor dictionary
file format: portable document format (.pdf)
contents: information on which specific actors are classified as “government,” “rebel,” and other categories in the file
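As an illustration of the N × T structure described above, the following minimal Python sketch checks that a panel file has exactly one row per spatial unit and time period. The column names (`unit_id`, `month`, `event_count`) and the sample rows are hypothetical and are not taken from the xSub codebook; consult the codebook in the archive for the actual variable names.

```python
import csv
import io


def panel_shape(rows, unit_key, time_key):
    """Return (N, T, is_balanced) for a list of dict rows."""
    units = {row[unit_key] for row in rows}
    times = {row[time_key] for row in rows}
    pairs = {(row[unit_key], row[time_key]) for row in rows}
    n, t = len(units), len(times)
    # Balanced panel: every unit-time pair appears exactly once.
    balanced = len(rows) == n * t and len(pairs) == n * t
    return n, t, balanced


# Made-up sample standing in for a downloaded .csv file
# (hypothetical column names and values).
SAMPLE = """unit_id,month,event_count
D1,2010-01,3
D1,2010-02,0
D2,2010-01,1
D2,2010-02,5
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
n, t, balanced = panel_shape(rows, "unit_id", "month")
print(n, t, balanced)
```

For a real file, the `io.StringIO(SAMPLE)` object would be replaced by an open file handle for the extracted .csv.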
For spatial panel data pooled from multiple sources, users select the following through the online interface:
a country (or multiple countries)
a geographic and date matching window for event de-duplication:
geographic: 1 km / 5 km
date: 1 day / 2 days
all events of the same type (i.e. same actors, actions, targets) that fall within the same matching window (e.g. occurred within 1 km of each other on the same day) are treated as a single unique event
a dyad type:
A: directed (includes only events where initiator and target are both known)
B: undirected (all events, including those with some ambiguity over which actor was initiator vs. target)
ACLED and GED contain only undirected dyad information; most other sources include directed dyad information
a unit of analysis
space: country / province / district / grid / electoral constituency
time: year / month / week / day
The website then generates a compressed archive with:
dataset
file format: comma-separated values text file (.csv)
data structure: cross-sectional time-series
number of observations: N × T, where N is the number of spatial units (e.g. districts, grid cells), and T is the number of time units (e.g. months, weeks)
variables included: unit IDs, violence event counts (broken down by actors and tactics), covariates (local demographics, geography, ethnicity, weather)
map and time plot of violence
file format: portable network graphics (.png)
map: number of violent events per spatial unit (darker shade = more violence)
time plot: number of violent events per time unit (higher bars = more violence)
codebook
file format: portable document format (.pdf)
contents: information on variable names, data structure
actor dictionary
file format: portable document format (.pdf)
contents: information on which specific actors are classified as “government,” “rebel,” and other categories in the file
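The matching-window rule described above (events of the same type that fall within the same geographic and date window count as one event) could be sketched roughly as follows. This is a simple greedy illustration, not xSub's actual de-duplication procedure. The field names (`type`, `date`, `lat`, `lon`, `source`), the use of great-circle distance, and the reading of a "1 day" window as "same calendar day" are all assumptions made for the example.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def deduplicate(events, max_km=1.0, window_days=1):
    """Greedy sketch: keep an event unless an already-kept event has the
    same type, lies within max_km, and is fewer than window_days apart.
    Assumption: window_days=1 means "same calendar day"."""
    kept = []
    for ev in sorted(events, key=lambda e: e["date"]):
        is_duplicate = any(
            ev["type"] == k["type"]
            and abs((ev["date"] - k["date"]).days) < window_days
            and haversine_km(ev["lat"], ev["lon"], k["lat"], k["lon"]) <= max_km
            for k in kept
        )
        if not is_duplicate:
            kept.append(ev)
    return kept


# Hypothetical events; "type" stands for (actor, action, target).
ATTACK = ("rebel", "attack", "government")
events = [
    {"source": "A", "type": ATTACK, "date": date(2010, 1, 5), "lat": 10.0, "lon": 20.0},
    # Same day, roughly 0.15 km away: matches under a 1 km / 1 day window.
    {"source": "B", "type": ATTACK, "date": date(2010, 1, 5), "lat": 10.001, "lon": 20.001},
    # Same place, next day: matches only under a 2-day window.
    {"source": "B", "type": ATTACK, "date": date(2010, 1, 6), "lat": 10.0, "lon": 20.0},
    # Same day, roughly 55 km away: never matches at 1 km or 5 km.
    {"source": "A", "type": ATTACK, "date": date(2010, 1, 5), "lat": 10.5, "lon": 20.0},
]

print(len(deduplicate(events, max_km=1.0, window_days=1)))
print(len(deduplicate(events, max_km=1.0, window_days=2)))
```

Wider windows merge more reports into a single event, so the choice between 1 km and 5 km, or between 1 day and 2 days, trades missed duplicates against the risk of merging distinct events.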
For raw event data from an individual source, users select the following through the online interface:
a country (or multiple countries)
a data source
The website then generates a compressed archive with:
dataset
file format: comma-separated values text file (.csv)
data structure: event data
number of observations: N, the number of political events that occurred within a country over the time period covered by the selected data source
variables included: unit IDs, violence event counts (broken down by actors and tactics), with more specific event categories than those included in the aggregate panel datasets
map and time plot of violence
file format: portable network graphics (.png)
map: number of violent events per spatial unit (darker shade = more violence)
time plot: number of violent events per time unit (higher bars = more violence)
codebook
file format: portable document format (.pdf)
contents: information on variable names, data structure
actor dictionary
file format: portable document format (.pdf)
contents: information on which specific actors are classified as “government,” “rebel,” and other categories in the file
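To show how the two data structures relate, the sketch below aggregates event-level rows (one row per event) into a panel with one row per spatial unit and month, keeping unit-months with zero events. It is only an illustration under assumed field names (`unit_id`, `date`, `event_count`) and does not reproduce how xSub builds its own panel files.

```python
from collections import Counter
from datetime import date


def events_to_panel(events, units, months):
    """Count events per (unit, month), keeping zero-count cells."""
    counts = Counter(
        (ev["unit_id"], ev["date"].strftime("%Y-%m")) for ev in events
    )
    return [
        {"unit_id": u, "month": m, "event_count": counts[(u, m)]}
        for u in units
        for m in months
    ]


# Hypothetical event-level rows.
events = [
    {"unit_id": "D1", "date": date(2010, 1, 5)},
    {"unit_id": "D1", "date": date(2010, 1, 20)},
    {"unit_id": "D2", "date": date(2010, 2, 3)},
]

panel = events_to_panel(events, units=["D1", "D2"], months=["2010-01", "2010-02"])
for row in panel:
    print(row)
```

The result has N × T rows (here 2 × 2), whereas the event data has one row per event.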
For raw event data pooled from multiple sources, users select the following through the online interface:
a country (or multiple countries)
a geographic and date matching window for event de-duplication:
geographic: 1 km / 5 km
date: 1 day / 2 days
all events of the same type (i.e. same actors, actions, targets) that fall within the same matching window (e.g. occurred within 1 km of each other on the same day) are treated as a single unique event
a dyad type:
A: directed (includes only events where initiator and target are both known)
B: undirected (all events, including those with some ambiguity over which actor was initiator vs. target)
ACLED and GED contain only undirected dyad information; most other sources include directed dyad information
a unit of analysis
space: country / province / district / grid / electoral constituency
time: year / month / week / day
The website then generates a compressed archive with:
dataset
file format: comma-separated values text file (.csv)
data structure: event data
number of observations: N, the number of political events that occurred within a country, after pooling across multiple sources and removing duplicate events
variables included: unit IDs, violence event counts (broken down by actors and tactics), with more specific event categories than those included in the aggregate panel datasets
map and time plot of violence
file format: portable network graphics (.png)
map: number of violent events per spatial unit (darker shade = more violence)
time plot: number of violent events per time unit (higher bars = more violence)
codebook
file format: portable document format (.pdf)
contents: information on variable names, data structure
actor dictionary
file format: portable document format (.pdf)
contents: information on which specific actors are classified as “government,” “rebel,” and other categories in the file
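The difference between the two dyad types described above can be illustrated with a short sketch: type A keeps only events where both initiator and target are known, while type B keeps all events. The field names (`initiator`, `target`) and the use of `None` for an unknown actor are assumptions for the example, not xSub's actual coding.

```python
def select_by_dyad_type(events, dyad_type):
    """Filter events by dyad type: 'A' (directed) or 'B' (undirected)."""
    if dyad_type == "A":
        # Directed: initiator and target must both be known.
        return [
            ev for ev in events
            if ev["initiator"] is not None and ev["target"] is not None
        ]
    if dyad_type == "B":
        # Undirected: keep every event, including ambiguous ones.
        return list(events)
    raise ValueError(f"unknown dyad type: {dyad_type!r}")


# Hypothetical events; None marks an unknown actor.
events = [
    {"id": 1, "initiator": "rebel", "target": "government"},
    {"id": 2, "initiator": None, "target": "civilian"},
    {"id": 3, "initiator": "government", "target": None},
]

print([ev["id"] for ev in select_by_dyad_type(events, "A")])
print([ev["id"] for ev in select_by_dyad_type(events, "B")])
```

Because type A drops ambiguous events, it will generally return fewer events than type B for the same country and matching window.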
To batch download and merge multiple data files, please use the xSub R package (Stata code also available here).