How to Use xSub

xSub offers data in four formats, defined by two types of data source (individual and multiple) crossed with two data structures (spatial panel data and raw event-level data).

Spatial Panel Data: Individual Sources

Through the online interface, users select:

  1. a country (or multiple countries)

  2. a data source

  3. a unit of analysis

    • space: country / province / district / grid / electoral constituency

    • time: year / month / week

The website then generates a compressed archive with:

  1. dataset

    • file format: comma-separated values text file (.csv)

    • data structure: cross-sectional time-series

    • number of observations: N × T, where N is the number of spatial units (e.g. districts, grid cells), and T is the number of time units (e.g. months, weeks)

    • variables included: unit IDs, violence event counts (broken down by actors and tactics), covariates (local demographics, geography, ethnicity, weather)

  2. map and time plot of violence

    • file format: portable network graphics (.png)

    • map: number of violent events per spatial unit (darker shade = more violence)

    • time plot: number of violent events per time unit (higher bars = more violence)

  3. codebook

    • file format: portable document format (.pdf)

    • contents: information on variable names, data structure

  4. actor dictionary

    • file format: portable document format (.pdf)

    • contents: information on which specific actors are classified as “government,” “rebel,” and other categories in the file
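The N × T structure of the panel dataset can be illustrated with a minimal Python sketch. It is not xSub code: the column names (unit_id, period, event_count) and the district IDs are hypothetical, chosen only to show that a balanced panel has one row for every combination of spatial unit and time unit.

```python
from itertools import product

def panel_skeleton(units, periods):
    """Return one row per (spatial unit, time unit) pair: N x T rows in total."""
    return [
        {"unit_id": unit, "period": period, "event_count": 0}
        for unit, period in product(units, periods)
    ]

districts = ["D01", "D02", "D03"]   # N = 3 spatial units (hypothetical IDs)
months = ["2015-01", "2015-02"]     # T = 2 time units
rows = panel_skeleton(districts, months)
print(len(rows))  # 6, i.e. N x T = 3 x 2
```

Units with no recorded violence in a given period still appear in the panel, with a count of zero.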

Spatial Panel Data: Multiple Sources

Through the online interface, users select:

  1. a country (or multiple countries)

  2. a geographic and date matching window for event de-duplication:

    • geographic: 1 km / 5 km

    • date: 1 day / 2 days

    All events of the same type (i.e. same actors, actions, targets) that fall within the same matching window (e.g. within 1 km of each other on the same day) are treated as a single event.

  3. a dyad type:

    • A: directed (includes only events where initiator and target are both known)

    • B: undirected (all events, including those with some ambiguity over which actor was initiator vs. target)

    ACLED and GED contain only undirected dyad information; most other sources include directed dyad information

  4. a unit of analysis

    • space: country / province / district / grid / electoral constituency

    • time: year / month / week / day
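The matching-window logic in step 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure xSub actually runs: the field names (type, date, lat, lon) are hypothetical, distance is computed with the haversine formula, a "1 day" window is read as "same calendar day" and a "2 days" window as "at most one day apart", and events are matched greedily against those already kept.

```python
import math
from datetime import date

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    radius = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def deduplicate(events, max_km=1.0, max_days=1):
    """Greedy pass: keep an event unless it matches one that was already kept."""
    kept = []
    for ev in sorted(events, key=lambda e: e["date"]):
        is_duplicate = any(
            ev["type"] == k["type"]                                   # same actors, action, target
            and abs((ev["date"] - k["date"]).days) < max_days         # inside the date window (assumed reading)
            and haversine_km(ev["lat"], ev["lon"], k["lat"], k["lon"]) <= max_km
            for k in kept
        )
        if not is_duplicate:
            kept.append(ev)
    return kept

kind = ("government", "rebel", "shelling")  # hypothetical event type
events = [
    {"type": kind, "date": date(2015, 3, 1), "lat": 50.000, "lon": 30.0},
    {"type": kind, "date": date(2015, 3, 1), "lat": 50.005, "lon": 30.0},  # about 0.56 km away
    {"type": kind, "date": date(2015, 3, 1), "lat": 50.030, "lon": 30.0},  # about 3.3 km away
    {"type": kind, "date": date(2015, 3, 2), "lat": 50.000, "lon": 30.0},  # next day
]
print(len(deduplicate(events)))                          # 3
print(len(deduplicate(events, max_km=5.0, max_days=2)))  # 1
```

A wider window merges more reports into a single event, so the 5 km / 2 day setting yields fewer unique events than the 1 km / 1 day setting. Greedy matching depends on processing order; a production implementation might cluster events instead.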

The website then generates a compressed archive with:

  1. dataset

    • file format: comma-separated values text file (.csv)

    • data structure: cross-sectional time-series

    • number of observations: N × T, where N is the number of spatial units (e.g. districts, grid cells), and T is the number of time units (e.g. months, weeks)

    • variables included: unit IDs, violence event counts (broken down by actors and tactics), covariates (local demographics, geography, ethnicity, weather)

  2. map and time plot of violence

    • file format: portable network graphics (.png)

    • map: number of violent events per spatial unit (darker shade = more violence)

    • time plot: number of violent events per time unit (higher bars = more violence)

  3. codebook

    • file format: portable document format (.pdf)

    • contents: information on variable names, data structure

  4. actor dictionary

    • file format: portable document format (.pdf)

    • contents: information on which specific actors are classified as “government,” “rebel,” and other categories in the file

Event-Level Data: Individual Sources

Through the online interface, users select:

  1. a country (or multiple countries)

  2. a data source

The website then generates a compressed archive with:

  1. dataset

    • file format: comma-separated values text file (.csv)

    • data structure: event data

    • number of observations: N, the number of political events that occurred within a country over the time period covered by each data source

    • variables included: unit IDs, violence event counts (broken down by actors and tactics), including more specific event categories than are included in the aggregate panel datasets

  2. map and time plot of violence

    • file format: portable network graphics (.png)

    • map: number of violent events per spatial unit (darker shade = more violence)

    • time plot: number of violent events per time unit (higher bars = more violence)

  3. codebook

    • file format: portable document format (.pdf)

    • contents: information on variable names, data structure

  4. actor dictionary

    • file format: portable document format (.pdf)

    • contents: information on which specific actors are classified as “government,” “rebel,” and other categories in the file
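The relationship between the event-level files and the aggregate panel files can be sketched in Python: each event row is assigned to a spatial unit and a time period, and the rows are then counted. The field names (unit_id, date) are hypothetical, and this is only a sketch of the general idea, not xSub's own aggregation code.

```python
from collections import Counter
from datetime import date

def count_events(events, time_unit="month"):
    """Count events per (spatial unit, period); supports year, month and ISO week."""
    counts = Counter()
    for ev in events:
        d = ev["date"]
        if time_unit == "year":
            period = f"{d.year}"
        elif time_unit == "month":
            period = f"{d.year}-{d.month:02d}"
        elif time_unit == "week":
            iso = d.isocalendar()  # (ISO year, ISO week, weekday)
            period = f"{iso[0]}-W{iso[1]:02d}"
        else:
            raise ValueError(f"unsupported time unit: {time_unit}")
        counts[(ev["unit_id"], period)] += 1
    return counts

events = [
    {"unit_id": "D01", "date": date(2015, 3, 1)},
    {"unit_id": "D01", "date": date(2015, 3, 15)},
    {"unit_id": "D02", "date": date(2015, 3, 2)},
    {"unit_id": "D01", "date": date(2015, 4, 1)},
]
monthly = count_events(events, "month")
print(monthly[("D01", "2015-03")])  # 2
```

Because Counter returns zero for missing keys, unit-periods without events can be looked up without special handling.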

Event-Level Data: Multiple Sources

Through the online interface, users select:

  1. a country (or multiple countries)

  2. a geographic and date matching window for event de-duplication:

    • geographic: 1 km / 5 km

    • date: 1 day / 2 days

    All events of the same type (i.e. same actors, actions, targets) that fall within the same matching window (e.g. within 1 km of each other on the same day) are treated as a single event.

  3. a dyad type:

    • A: directed (includes only events where initiator and target are both known)

    • B: undirected (all events, including those with some ambiguity over which actor was initiator vs. target)

    ACLED and GED contain only undirected dyad information; most other sources include directed dyad information

  4. a unit of analysis

    • space: country / province / district / grid / electoral constituency

    • time: year / month / week / day
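The difference between the two dyad types in step 3 can be shown with a short Python sketch. The field names (initiator, target, actors) are hypothetical and the records are invented for illustration; the only logic taken from the description above is that type A keeps events where both initiator and target are known, while type B keeps every event.

```python
def select_events(events, dyad_type):
    """Filter events by dyad type: 'A' (directed) or 'B' (undirected)."""
    if dyad_type == "A":
        # Directed: initiator and target must both be known.
        return [e for e in events if e.get("initiator") and e.get("target")]
    if dyad_type == "B":
        # Undirected: keep all events, including ambiguous ones.
        return list(events)
    raise ValueError(f"unknown dyad type: {dyad_type}")

events = [
    {"actors": ["government", "rebel"], "initiator": "government", "target": "rebel"},
    {"actors": ["government", "rebel"], "initiator": None, "target": None},  # direction unknown
    {"actors": ["rebel", "civilian"], "initiator": "rebel", "target": "civilian"},
]
print(len(select_events(events, "A")))  # 2
print(len(select_events(events, "B")))  # 3
```

Under this reading, events from sources that record only undirected dyads would be dropped by option A, so option B retains more observations at the cost of directional detail.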

The website then generates a compressed archive with:

  1. dataset

    • file format: comma-separated values text file (.csv)

    • data structure: event data

    • number of observations: N, the number of political events that occurred within a country, after pooling across multiple sources and removing duplicate events

    • variables included: unit IDs, violence event counts (broken down by actors and tactics), including more specific event categories than are included in the aggregate panel datasets

  2. map and time plot of violence

    • file format: portable network graphics (.png)

    • map: number of violent events per spatial unit (darker shade = more violence)

    • time plot: number of violent events per time unit (higher bars = more violence)

  3. codebook

    • file format: portable document format (.pdf)

    • contents: information on variable names, data structure

  4. actor dictionary

    • file format: portable document format (.pdf)

    • contents: information on which specific actors are classified as “government,” “rebel,” and other categories in the file

To batch download and merge multiple data files, please use the xSub R package; Stata code is also available.
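For readers who have already unpacked several archives by hand, the merge step can also be approximated with the Python standard library. The sketch below is not part of xSub and does not use the xSub R package; the function name and the added source_file column are hypothetical. It stacks the rows of several .csv files and keeps the union of their columns, leaving missing values blank.

```python
import csv
from pathlib import Path

def merge_csv_files(paths):
    """Stack rows from several CSV files, keeping the union of their columns."""
    columns, rows = [], []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as fh:
            reader = csv.DictReader(fh)
            for name in reader.fieldnames or []:
                if name not in columns:
                    columns.append(name)
            for row in reader:
                row["source_file"] = Path(path).name  # record where each row came from
                rows.append(row)
    if "source_file" not in columns:
        columns.append("source_file")
    return columns, rows

def write_merged(paths, out_path):
    """Write the stacked rows to one CSV; absent columns are left empty."""
    columns, rows = merge_csv_files(paths)
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A call such as write_merged(["a.csv", "b.csv"], "merged.csv"), with placeholder file names, returns the number of rows written. Stacking only makes sense for files that share a data structure, for example panel files built with the same unit of analysis.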