vazul is an R package that provides functions for data blinding, allowing researchers to scramble data while preserving each variable's own values and distribution.
Installation
You can install the development version of vazul from GitHub with:
# install.packages("devtools")
devtools::install_github("nthun/vazul")
Main Functions
scramble_values()
Scrambles the values within a vector without replacement, preserving the original values but changing their order.
library(vazul)
# Scramble numeric values
set.seed(123)
scramble_values(1:10)
#> [1] 3 10 2 8 6 9 1 7 5 4
# Scramble character values
scramble_values(letters[1:5])
#> [1] "b" "c" "a" "d" "e"
# Scramble logical values
scramble_values(c(TRUE, FALSE, TRUE, FALSE))
#> [1] FALSE TRUE FALSE TRUE
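To illustrate what "preserving the original values" means, here is a minimal base-R sketch that does not call vazul at all. It assumes that scrambling amounts to a random permutation without replacement, as described above. `permute()` is a hypothetical helper written for this illustration, not a function from the package, and it is not meant to reflect how scramble_values() is implemented internally.

```r
# Hypothetical helper (not part of vazul): permute a vector without replacement.
# Indexing with sample.int() avoids the sample() pitfall for length-1 numeric input.
permute <- function(x) x[sample.int(length(x))]

set.seed(1)
x <- c(2.5, 3.1, 4.8, 1.2, 3.9)
x_scrambled <- permute(x)

# The same values are present, so order-independent summaries are unchanged
identical(sort(x), sort(x_scrambled))
all.equal(mean(x), mean(x_scrambled))
all.equal(sd(x), sd(x_scrambled))
```

All three checks should return TRUE: a permutation changes which position each value occupies, but not the set of values, so summaries such as the mean and standard deviation stay the same.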
scramble_variables()
Scrambles the values of specified variables (columns) within a data frame, with optional grouping.
# Create example data
df <- data.frame(
x = 1:6,
y = letters[1:6],
group = c("A", "A", "A", "B", "B", "B")
)
# Scramble variables across entire dataset
scramble_variables(df, c("x", "y"))
# Scramble within groups using .groups parameter
scramble_variables(df, "y", .groups = "group")
# Or using dplyr grouping
library(dplyr)
df |>
group_by(group) |>
scramble_variables("x") |>
ungroup()
# Can use multiple groups
marp |>
group_by(country, gender) |>
scramble_variables(c("rel_1", "rel_2")) |>
ungroup()
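The effect of scrambling within groups can be sketched in base R without vazul or dplyr. The example below assumes that grouped scrambling means permuting values separately inside each group. `permute()` is again a hypothetical helper, and `ave()` stands in for whatever mechanism the package actually uses.

```r
# Hypothetical helper (not part of vazul): permute a vector without replacement
permute <- function(x) x[sample.int(length(x))]

df <- data.frame(
  x = 1:6,
  group = c("A", "A", "A", "B", "B", "B")
)

set.seed(1)
# ave() applies permute() separately within each level of `group`
df$x_scrambled <- ave(df$x, df$group, FUN = permute)

# Each group keeps its own set of values; only the order within the group changes
tapply(df$x, df$group, sort)
tapply(df$x_scrambled, df$group, sort)
```

Both `tapply()` calls list the same sorted values per group, which shows that no value moved from group A to group B. Group-level summaries are therefore preserved, while the link between a scrambled column and the other columns in the same row is broken.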
Included Datasets
MARP Dataset
The Many Analysts Religion Project (MARP) dataset contains data from 10,535 participants across 24 countries, exploring the relationship between religiosity and well-being.
data(marp)
dim(marp)
#> [1] 10535 46
# Explore countries in the dataset
length(unique(marp$country))
#> [1] 24
# Example: scramble religiosity variables by country
marp |>
group_by(country) |>
scramble_variables(c("rel_1", "rel_2", "rel_3")) |>
ungroup()
Variables include:
- rel_1 to rel_9: Religiosity measures
- wb_*: Well-being indicators (general, physical, psychological, social)
- country: Participant country
- age, gender, ses, education: Demographics
- gdp: Country-level GDP data
Williams Dataset
A study dataset with 112 participants examining the risk-taking behavior of high- or low-wealth individuals.
data(williams)
dim(williams)
#> [1] 112 25
table(williams$ecology)
#> Desperate Hopeful
#> 56 56
# Example: scramble perception measures within ecology conditions
williams |>
group_by(ecology) |>
scramble_variables(c("SexUnres_1", "Impuls_1")) |>
ungroup()
Key variables:
- ecology: Experimental condition (“Desperate” vs “Hopeful”)
- SexUnres_*: Sexual unresponsiveness measures
- Impuls_*: Impulsivity measures
- Opport_*: Long-term planning opportunity measures
- age, gender: Participant demographics
Explanation of the package name
Vazul was a historical figure (a Hungarian prince) in the 11th century. He was blinded by the king to make him unfit for the throne. More info: https://en.wikipedia.org/wiki/Vazul
Documentation
- Package documentation: help(package = "vazul")
- Function help: ?scramble_values, ?scramble_variables
- Dataset documentation: ?marp, ?williams
- Package website: https://nthun.github.io/vazul/
Authors
- Tamás Nagy - Package author and maintainer
- Alexandra Sarafoglou - Data contributor and author
- Márton Kovács - Author
License
This project is licensed under the MIT License - see the LICENSE file for details.