Package 'flows' reference manual

Title:	Selections on Flow Matrices, Statistics on Selected Flows, Map and Graph Visualisations
Description:	The analysis and representation of flows often presuppose a selection to facilitate interpretation. Various methods have been proposed for selecting flows, one of the most widely used being based on major flows: it selects only the most important flows, absolute or relative, on a local or global scale. These methods often highlight hierarchies between locations, but the loss of information caused by selection is rarely taken into account. We propose statistical indicators to assess the loss of information and the characteristics of selected flows. We provide functions that select flows (main, dominant or major flows), provide statistics on selections and offer visualizations in the form of maps and graphs. See Beauguitte et al (2015) <doi:10.4000/netcom.2134>.
Authors:	Timothée Giraud [cre, aut], Laurent Beauguitte [aut], Marianne Guérois [aut]
Maintainer:	Timothée Giraud <[email protected]>
License:	GPL-3
Version:	2.0.0
Built:	2025-03-02 04:19:24 UTC
Source:	https://github.com/riatelab/flows

Commuters datasets

Description

Data on commuters between Urban Areas of the French Grand Est region in 2011. Fields:

i: Code of the urban area of residence
namei: Name of the urban area of residence
wi: Total number of active occupied persons in the urban area of residence
j: Code of the urban area of work
namej: Name of the urban area of work
wj: Total number of active occupied persons in the urban area of work
fij: Number of commuters between i and j

Geopackage of the Grand Est region in France and its urban areas (2010 delineation).

References

Commuters dataset: https://www.insee.fr/fr/statistiques/2022113
Spatial dataset: https://www.data.gouv.fr/en/datasets/geofla-r

Examples

nav <- read.csv(system.file("csv/nav.csv", package = "flows"))
library(sf)
UA <- st_read(system.file("gpkg/GE.gpkg", package = "flows"), layer = "urban_area")
GE <- st_read(system.file("gpkg/GE.gpkg", package = "flows"), layer = "region")

Comparison of two matrices

Description

Compares two matrices of same dimension, with same column and row names.

Usage

compare_mat(mat1, mat2, digits = 0)

Arguments

`mat1`	A square matrix of flows.
`mat2`	A square matrix of flows.
`digits`	An integer indicating the number of decimal places to be used when printing the data.frame in the console (see round).

Value

A data.frame that provides statistics on differences between mat1 and mat2: absdiff are the absolute differences and reldiff are the relative differences (in percent).
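As a rough illustration of what absolute and relative differences mean here, the following base R sketch compares the total flow of a toy matrix with that of a filtered copy. The matrices m1 and m2 and the helper diff_total() are hypothetical names introduced for this example; the statistics actually reported by compare_mat may be computed differently.

```r
# Toy 3 x 3 flow matrix (hypothetical values), empty diagonal
m1 <- matrix(c( 0, 10, 5,
               20,  0, 1,
                3,  2, 0),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# A filtered copy that keeps only flows of at least 5
m2 <- m1 * (m1 >= 5)

# Hypothetical helper (not part of flows): absolute and relative
# (percent) difference between the total flows of two matrices
diff_total <- function(x, y, digits = 0) {
  absdiff <- abs(sum(x) - sum(y))
  reldiff <- absdiff / sum(x) * 100
  round(c(absdiff = absdiff, reldiff = reldiff), digits)
}

res_diff <- diff_total(m1, m2, digits = 1)
res_diff
```

The digits argument plays the same rounding role as in compare_mat; everything else in this sketch is an assumption.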

Examples

# Import data
nav <- read.csv(system.file("csv/nav.csv", package = "flows"))
mat <- prepare_mat(x = nav, i = "i", j = "j", fij = "fij")
# Remove the matrix diagonal
diag(mat) <- 0

# Select the first flows
flowSel1 <- select_flows(mat = mat, method = "nfirst", k = 1)

# Select flows greater than 2000
flowSel2 <- select_flows(mat = mat, method = "xfirst", k = 2000)

# Combine selections
flowSel <- mat * flowSel1 * flowSel2

# Compare flow matrices
compare_mat(mat1 = mat, mat2 = flowSel, digits = 1)

Package description

Description

Selections on flow matrices, statistics on selected flows, map and graph visualisations.

An introduction to the package's conceptual background and usage is provided in a vignette (see vignette(topic = "flows")) and a paper (Beauguitte, Giraud & Guérois 2015).

Author(s)

Maintainer: Timothée Giraud [email protected] (ORCID)

Authors:

Laurent Beauguitte
Marianne Guérois

References

L. Beauguitte, T. Giraud & M. Guérois, 2015. "Un outil pour la sélection et la visualisation de flux : le package flows", Netcom, 29-3/4:399-408. https://journals.openedition.org/netcom/2134.

Nodal flows map

Description

Performs a Nystuen & Dacey dominant (or nodal) flows analysis and plots a dominant flows map.

Usage

map_nodal_flows(
  mat,
  x,
  inches = 0.15,
  col_node = c("red", "orange", "yellow"),
  breaks = "equal",
  nbreaks = 4,
  lwd = c(1, 5, 10, 20),
  col_flow = "grey20",
  leg_node = c("Dominant", "Intermediate", "Dominated",
    "Size proportional\nto sum of inflows"),
  leg_flow = "Flow intensity",
  leg_pos_flow = "topleft",
  leg_pos_node = "topright",
  add = FALSE
)

Arguments

`mat`	A square matrix of flows.
`x`	An sf object, the first column contains a unique identifier matching mat column and row names.
`inches`	Size of the largest circle.
`col_node`	Node colors, a vector of 3 colors.
`breaks`	How to classify flows: either a numeric vector of break values or the name of a classification method (see mf_get_breaks()).
`nbreaks`	Number of classes.
`lwd`	Flow widths.
`col_flow`	Flow color.
`leg_node`	Labels for the nodes legend.
`leg_flow`	Label for the flows legend.
`leg_pos_flow`	Position of the flows legend.
`leg_pos_node`	Position of the nodes legend.
`add`	A boolean, if TRUE, add the layer to an existing plot.

Value

A list of sf objects is returned. The first element contains the nodes with their weight and classification (dominant, intermediary, dominated). The second element contains the flows (i, j, fij).

Examples

library(sf)
library(mapsf)
nav <- read.csv(system.file("csv/nav.csv", package = "flows"))
mat <- prepare_mat(x = nav, i = "i", j = "j", fij = "fij")
UA <- st_read(system.file("gpkg/GE.gpkg", package = "flows"), layer = "urban_area")
GE <- st_read(system.file("gpkg/GE.gpkg", package = "flows"), layer = "region")
mf_map(GE)
map_nodal_flows(
  mat = mat, x = UA,
  col_node = c("red", "orange", "yellow"),
  col_flow = "grey30",
  breaks = c(4, 100, 1000, 2500, 8655),
  lwd = c(1, 4, 8, 16), add = TRUE
)
mf_title("Dominant flows")

Nodal flows selection

Description

Performs a Nystuen & Dacey dominant flows analysis.

Usage

nodal_flows(mat)

Arguments

`mat`	A square matrix of flows.

Value

The matrix of the selected flows is returned.
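The two selection rules documented for select_flows (the largest outflow of each origin, and a destination weight greater than the origin weight) can be combined by hand to approximate this kind of analysis. The sketch below uses base R on a toy matrix and assumes that weights are sums of inflows. It is not the package's implementation, and m, w, first, bigger and nodal are illustrative names.

```r
# Toy flow matrix (hypothetical values): rows are origins, columns destinations
m <- matrix(c(0, 8, 1,
              3, 0, 1,
              2, 6, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Assumed weights: sum of inflows of each unit
w <- colSums(m)

# Rule 1: keep the largest positive outflow of each origin
# (apply() returns one column per row of m, hence the transpose;
# tied maxima would all be kept in this simple version)
first <- t(apply(m, 1, function(r) r == max(r) & r > 0))

# Rule 2: keep i -> j only if weight of j / weight of i is greater than 1
bigger <- outer(w, w, function(wi, wj) wj / wi > 1)

# Flows satisfying both rules
nodal <- m * first * bigger
nodal
```

In this toy case unit b sends its largest flow to a smaller unit, so it keeps no outgoing flow and ends up as the only destination of the selected flows.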

Examples

nav <- read.csv(system.file("csv/nav.csv", package = "flows"))
mat <- prepare_mat(x = nav, i = "i", j = "j", fij = "fij")
res <- nodal_flows(mat)
res[1:5, 1:5]

Nodal flows graph

Description

This function plots a dominant flows graph.

Usage

plot_nodal_flows(
  mat,
  leg_pos_flows = "topright",
  leg_flow = "Flows Intensity",
  leg_pos_node = "bottomright",
  leg_node = c("Dominant", "Intermediary", "Dominated",
    "Size proportional\nto sum of inflows"),
  labels = FALSE
)

Arguments

`mat`	A square matrix of dominant flows (see nodal_flows).
`leg_pos_flows`	Position of the flows legend, one of "topleft", "top", "topright", "left", "right", "bottomleft", "bottom", "bottomright".
`leg_flow`	Title of the flows legend.
`leg_pos_node`	Position of the nodes legend, one of "topleft", "top", "topright", "left", "right", "bottomleft", "bottom", "bottomright".
`leg_node`	Text of the nodes legend.
`labels`	A boolean, if TRUE, labels of dominant and intermediary nodes are plotted.

Details

This function uses the Davidson-Harel layout algorithm from igraph.

Note

As square matrices can easily be plotted with the plot.igraph or gplot functions from the igraph and sna packages, we do not provide visualisations for other outputs.

Examples

nav <- read.csv(system.file("csv/nav.csv", package = "flows"))
mat <- prepare_mat(x = nav, i = "i", j = "j", fij = "fij")
res <- nodal_flows(mat)

# Plot dominant flows graph
plot_nodal_flows(mat = res)

Flow matrix preparation

Description

Converts a long format matrix to a wide format matrix.

Usage

prepare_mat(x, i, j, fij)

Arguments

`x`	A data.frame of flows between origins and destinations: long format matrix (origins, destinations, flows intensity).
`i`	A character giving the origin field name in x.
`j`	A character giving the destination field name in x.
`fij`	A character giving the flow field name in x.

Value

A square matrix of flows. Diagonal can be filled or empty depending on data used.
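To make the long-to-wide conversion concrete, here is a minimal base R sketch on a toy data.frame. It is only an illustration of the idea, not the code used by prepare_mat; the objects long, ids and wide are hypothetical, and filling missing origin-destination pairs with 0 is an assumption of this sketch.

```r
# Long format: one row per origin-destination pair (hypothetical values)
long <- data.frame(
  i = c("a", "a", "b", "c"),
  j = c("b", "c", "a", "b"),
  fij = c(10, 5, 20, 2),
  stringsAsFactors = FALSE
)

# All units appearing as origin or destination
ids <- sort(unique(c(long$i, long$j)))

# Empty square matrix with units as row and column names
wide <- matrix(0, nrow = length(ids), ncol = length(ids),
               dimnames = list(ids, ids))

# Fill cells by (origin, destination) name pairs
wide[cbind(long$i, long$j)] <- long$fij
wide
```

Rows are origins and columns are destinations, which matches the way the examples in this manual use rowSums for selections made from origins.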

Examples

# Import data
nav <- read.csv(system.file("csv/nav.csv", package = "flows"))
# Prepare data
myflows <- prepare_mat(x = nav, i = "i", j = "j", fij = "fij")
myflows[1:5, 1:5]

Flow selection

Description

Flow selection from origins.

Usage

select_flows(mat, method = "nfirst", ties = "first", global = FALSE, k, w)

Arguments

`mat`	A square matrix of flows.
`method`	A method of flow selection, one of "dominant", "nfirst", "xfirst" or "xsumfirst": "dominant" selects the dominant flows (see Details); "nfirst" selects the k first flows from origins; "xfirst" selects flows greater than k; "xsumfirst" selects as many flows as necessary for each origin so that their sum is at least equal to k. If k is not reached for one origin, all its flows are selected.
`ties`	In case of equality with "nfirst" method, use "random" or "first" (see rank).
`global`	If TRUE, flow selection is done at the matrix scale.
`k`	Selection threshold for nfirst, xfirst and xsumfirst methods, ratio for dominant method.
`w`	A vector of unit weights (sum of incoming flows, sum of outgoing flows...).

Details

If method = "dominant", the function selects which flow (fij or fji) must be kept. If the ratio of the destination weight (wj) to the origin weight (wi) is greater than k, then fij is selected and fji is not. This method can perform the second criterion of Nystuen & Dacey's dominant flows analysis.

Value

A boolean matrix of selected flows. Use element-wise multiplication to get flows intensity.
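The idea of a boolean selection matrix can be illustrated without the package. The base R sketch below builds two masks by hand on a toy matrix, one in the spirit of "xfirst" and one in the spirit of "nfirst", and multiplies them with the flows. It is an approximation under stated assumptions, not the code of select_flows; m, sel_x, sel_n and kept are hypothetical names.

```r
# Toy flow matrix (hypothetical values): rows are origins
m <- matrix(c( 0, 10, 5,
              20,  0, 1,
               3,  2, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# "xfirst"-like mask: flows greater than a threshold
sel_x <- m > 4

# "nfirst"-like mask: the k largest flows of each origin (row).
# Ties are broken by position, as with ties = "first"; note that
# this simple version would also pick a zero flow in an empty row.
k <- 1
sel_n <- t(apply(m, 1, function(r) rank(-r, ties.method = "first") <= k))

# Element-wise multiplication turns the masks back into flow intensities
kept <- m * sel_x * sel_n
kept
```

Multiplying several masks keeps only the flows that satisfy every criterion at once, which is the pattern used in the compare_mat example.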

Examples

# Import data
nav <- read.csv(system.file("csv/nav.csv", package = "flows"))
# Prepare data
mat <- prepare_mat(x = nav, i = "i", j = "j", fij = "fij")
# remove diagonal
diag(mat) <- 0

# Select the first flow from each origin
res <- select_flows(mat = mat, method = "nfirst", global = FALSE, k = 1)
rowSums(res)

# Select the 5 first flows of the matrix
res <- select_flows(mat = mat, method = "nfirst", global = TRUE, k = 5)
sum(res)

# Select the flows greater than 5000
res <- select_flows(mat = mat, method = "xfirst", k = 5000)
r <- mat * res
r[r > 0]

# Select as many flows as necessary for each origin so that their sum is at least equal to 500.
res <- select_flows(mat = mat, method = "xsumfirst", global = FALSE, k = 500)
r <- mat * res
rowSums(r)

# Select as many flows in the matrix so that their sum is at least equal to 50000.
res <- select_flows(mat = mat, method = "xsumfirst", global = TRUE, k = 50000)
r <- mat * res
sum(rowSums(r))

# Select dominant flows
m <- mat[1:5, 1:5]
ws <- colSums(m)
res <- select_flows(mat = m, method = "dominant", k = 1, w = ws)
# 2nd element has a lower weight than 3rd element (ratio > 1)
ws[3] / ws[2]
# The flow from 2nd element to 3rd element is kept
res[2, 3]
# The flow from 3rd element to 2nd element is removed
res[3, 2]

Descriptive statistics on flow matrix

Description

This function provides various indicators and graphical outputs on a flow matrix.

Usage

stat_mat(mat, output = "all", verbose = TRUE)

Arguments

`mat`	A square matrix of flows.
`output`	Graphical output. Choices are "all" for all graphics, "none" to avoid any graphical output, "degree" for degree distribution, "wdegree" for weighted degree distribution, "lorenz" for Lorenz curve of link weights and "boxplot" for boxplot of link weights (see 'Details').
`verbose`	A boolean, if TRUE, returns statistics in the console.

Details

Graphical outputs concern outdegrees by default. If the matrix is transposed, outputs concern indegrees.

Value

The function returns a list of statistics and may plot graphics.

nblinks: number of cells with values > 0
density: number of links divided by number of possible links (also called gamma index by geographers), loops excluded
connectcomp: number of connected components (isolates included, weakly connected: use of clusters where mode = "weak")
connectcompx: number of connected components (isolates deleted, weakly connected: use of clusters where mode = "weak")
sizecomp: a data.frame of connected components: size and sum of flows per component (isolates included)
compocomp: a data.frame of connected components giving membership of units (isolates included)
degrees: a data.frame of nodes degrees and weighted degrees
sumflows: sum of flows
min: minimum flow
Q1: first quartile of flows
median: median flow
Q3: third quartile of flows
max: maximum flow
mean: mean flow
sd: standard deviation of flows
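
Some of these indicators are simple enough to be recomputed by hand, which can help when reading the output. The base R sketch below does so on a toy matrix. It assumes that a link is a strictly positive cell and that the diagonal is empty; it is not the implementation of stat_mat, and all object names are hypothetical.

```r
# Toy flow matrix (hypothetical values)
m <- matrix(c( 0, 10, 5,
              20,  0, 0,
               3,  2, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))
diag(m) <- 0  # loops excluded
n <- nrow(m)

# Links are cells with a value > 0
links <- m > 0
nblinks <- sum(links)

# Density: links divided by possible links, loops excluded
density <- nblinks / (n * (n - 1))

# Outdegrees are row counts; transposing the matrix gives indegrees
outdeg <- rowSums(links)
indeg <- rowSums(t(links))

# Sum and quartiles of the positive flows
sumflows <- sum(m)
quantile(m[links], probs = c(0.25, 0.5, 0.75))
```

The transposition step mirrors the remark in Details: feeding t(mat) to a function that works on rows switches the analysis from outdegrees to indegrees.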

Examples

# Import data
nav <- read.csv(system.file("csv/nav.csv", package = "flows"))
myflows <- prepare_mat(x = nav, i = "i", j = "j", fij = "fij")

# Get statistics and graphs about the matrix
mystats <- stat_mat(mat = myflows, output = "all", verbose = TRUE)

# Size of connected components
mystats$sizecomp

# Sum of flows
mystats$sumflows

# Plot Lorenz curve only
stat_mat(mat = myflows, output = "lorenz", verbose = FALSE)

# Statistics only
mystats <- stat_mat(mat = myflows, output = "none", verbose = FALSE)
str(mystats)
