| Title: | Wasserstein Index Generation (WIG) Model |
|---|---|
| Description: | Efficient implementation of several Optimal Transport algorithms in Fangzhou Xie (2025) <doi:10.48550/arXiv.2504.08722> and the Wasserstein Index Generation (WIG) model in Fangzhou Xie (2020) <doi:10.1016/j.econlet.2019.108874>. |
| Authors: | Fangzhou Xie [aut, cre, cph] |
| Maintainer: | Fangzhou Xie <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-22 19:01:30 UTC |
| Source: | https://github.com/fangzhou-xie/rwig |
Barycenter algorithm to solve for entropy-regularized Optimal Transport
Barycenter problems.
For a more detailed explaination, please refer to
vignette("barycenter").
barycenter( A, C, w, b_ext = NULL, barycenter_control = list(reg = 0.1, with_grad = FALSE, use_cuda = TRUE, n_threads = 0, method = "auto", threshold = 0.1, max_iter = 1000, zero_tol = 1e-06, verbose = 0) )barycenter( A, C, w, b_ext = NULL, barycenter_control = list(reg = 0.1, with_grad = FALSE, use_cuda = TRUE, n_threads = 0, method = "auto", threshold = 0.1, max_iter = 1000, zero_tol = 1e-06, verbose = 0) )
A |
numeric matrix, source discrete densities (M * S) |
C |
numeric matrix, cost matrix between source and target (M * N) |
w |
numeric vector, weight vector (S) |
b_ext |
numeric vector, only used to calculate quadratic loss against the computed barycenter (default = NULL) |
barycenter_control |
list, control parameters for the computation
|
This is the general function to solve OT Barycenter problem,
and it will use either parallel (method = "parallel") or
log-stablized Barycenter algorithm (method = "log")
for solving the problem.
list of results
b: numeric vector, computed barycenter
grad_A: gradient w.r.t. A (only with with_grad = TRUE)
grad_w: gradient w.r.t. w (only with with_grad = TRUE)
loss: double, quadratic loss between b and b_ext
(only with with_grad = TRUE)
U, V: scaling variables for the Sinkhorn algorithm
(only with method = "parallel")
F, G: scaling variables for the Sinkhorn algorithm
(only with method = "log")
iter: iterations of the algorithm
err: condition for convergence
return_status: 0 (convergence), 1 (max iteration reached), 2 (other)
Peyré, G., & Cuturi, M. (2019). Computational Optimal Transport: With Applications to Data Science. Foundations and Trends® in Machine Learning, 11(5–6), 355–607. https://doi.org/10.1561/2200000073
Xie, F. (2025). Deriving the Gradients of Some Popular Optimal Transport Algorithms (No. arXiv:2504.08722). arXiv. https://doi.org/10.48550/arXiv.2504.08722
vignette("gradient"),
vignette("threading")
A <- rbind(c(.3, .2), c(.2, .1), c(.1, .2), c(.1, .1), c(.3, .4)) C <- rbind( c(.1, .2, .3, .4, .5), c(.2, .3, .4, .3, .2), c(.4, .3, .2, .1, .2), c(.3, .2, .1, .2, .5), c(.5, .5, .4, .0, .2) ) w <- c(.4, .6) b <- c(.2, .2, .2, .2, .2) reg <- .1 # simple barycenter example sol <- barycenter(A, C, w, barycenter_control = list(reg = reg)) # you can also supply arguments to control the computation # for example, including the loss and gradient w.r.t. `A` sol <- barycenter(A, C, w, b, barycenter_control = list(reg = reg, with_grad = TRUE))A <- rbind(c(.3, .2), c(.2, .1), c(.1, .2), c(.1, .1), c(.3, .4)) C <- rbind( c(.1, .2, .3, .4, .5), c(.2, .3, .4, .3, .2), c(.4, .3, .2, .1, .2), c(.3, .2, .1, .2, .5), c(.5, .5, .4, .0, .2) ) w <- c(.4, .6) b <- c(.2, .2, .2, .2, .2) reg <- .1 # simple barycenter example sol <- barycenter(A, C, w, barycenter_control = list(reg = reg)) # you can also supply arguments to control the computation # for example, including the loss and gradient w.r.t. `A` sol <- barycenter(A, C, w, b, barycenter_control = list(reg = reg, with_grad = TRUE))
Check if CUDA is available for GPU computations.
check_cuda()check_cuda()
logical, TRUE if CUDA is available, FALSE otherwise
if (check_cuda()) { cat("CUDA is available for GPU computations.\n") } else { cat("CUDA is not available.\n") }if (check_cuda()) { cat("CUDA is available for GPU computations.\n") } else { cat("CUDA is not available.\n") }
NYT headlines by economic policy uncertainty
headlinesheadlines
A data frame with 18,507 rows and 2 columns:
Headline of the NYT News
Date of the News
www.nytimes.com
Xie, F. (2020). Wasserstein index generation model: Automatic generation of time-series index with application to economic policy uncertainty. Economics Letters, 186, 108874. https://doi.org/10.1016/j.econlet.2019.108874
Sinkhorn algorithm to solve entropy-regularized Optimal Transport problems.
For a more detailed explaination, please refer to
vignette("sinkhorn").
sinkhorn( a, b, C, sinkhorn_control = list(reg = 0.1, with_grad = FALSE, use_cuda = TRUE, n_threads = 0, method = "auto", threshold = 0.1, max_iter = 1000L, zero_tol = 1e-06, verbose = 0L) )sinkhorn( a, b, C, sinkhorn_control = list(reg = 0.1, with_grad = FALSE, use_cuda = TRUE, n_threads = 0, method = "auto", threshold = 0.1, max_iter = 1000L, zero_tol = 1e-06, verbose = 0L) )
a |
numeric vector, source discrete density (probability vector) |
b |
numeric vector, target discrete density (probability vector) |
C |
numeric matrix, cost matrix between source and target |
sinkhorn_control |
list, control parameters for the computation
|
This is the general function to solve the OT problem,
and it will use either vanilla (method = "vanilla") or
log-stabilized Sinkhorn algorithm (method = "log")
for solving the problem.
list of results
P: optimal coupling matrix
grad_a: gradient of loss w.r.t. a (only with with_grad = TRUE)
u, v: scaling vectors
loss: regularized loss
iter: iterations of the algorithm
err: condition for convergence
return_status: 0 (convergence), 1 (max iteration reached), 2 (other)
Peyré, G., & Cuturi, M. (2019). Computational Optimal Transport: With Applications to Data Science. Foundations and Trends® in Machine Learning, 11(5–6), 355–607. https://doi.org/10.1561/2200000073
Xie, F. (2025). Deriving the Gradients of Some Popular Optimal Transport Algorithms (No. arXiv:2504.08722). arXiv. https://doi.org/10.48550/arXiv.2504.08722
vignette("gradient"),
vignette("threading")
# simple sinkhorn example a <- c(.3, .4, .1, .1, .1) b <- c(.4, .5, .1) C <- rbind( c(.1, .2, .3), c(.2, .3, .4), c(.4, .3, .2), c(.3, .2, .1), c(.5, .5, .4) ) reg <- .1 sol <- sinkhorn(a, b, C, sinkhorn_control = list(reg = reg, verbose = 0)) # you can also supply arguments to control the computation # for example, calculate the gradient w.r.t. a sol <- sinkhorn(a, b, C, sinkhorn_control = list(reg = reg, with_grad = TRUE, verbose = 0))# simple sinkhorn example a <- c(.3, .4, .1, .1, .1) b <- c(.4, .5, .1) C <- rbind( c(.1, .2, .3), c(.2, .3, .4), c(.4, .3, .2), c(.3, .2, .1), c(.5, .5, .4) ) reg <- .1 sol <- sinkhorn(a, b, C, sinkhorn_control = list(reg = reg, verbose = 0)) # you can also supply arguments to control the computation # for example, calculate the gradient w.r.t. a sol <- sinkhorn(a, b, C, sinkhorn_control = list(reg = reg, with_grad = TRUE, verbose = 0))
Truncated Singular Value Decomposition algorithm
tsvd(M, k = 1, flip_sign = c("auto", "sklearn", "none"))tsvd(M, k = 1, flip_sign = c("auto", "sklearn", "none"))
M |
matrix, data to be analyzed |
k |
int, number of columns/features to be kept |
flip_sign |
character, one of the following: "auto", "sklearn", "none" |
Compute the truncated SVD for dimension reduction. Note that SVD suffers from "sign indeterminacy," which means that the signs of the output vectors could depend on the algorithm and random state. Two variants of "sign flipping methods" are implemented here, one following the sklearn implementation on Truncated SVD, another by Bro et al. (2008).
matrix after dimension reduction
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.TruncatedSVD.html
Bro, R., Acar, E., & Kolda, T. G. (2008). Resolving the sign ambiguity in the singular value decomposition. Journal of Chemometrics, 22(2), 135–140. https://doi.org/10.1002/cem.1122
Truncated SVD (sklearn)
vignette("tsvd")
A <- rbind(c(1,3), c(2,4)) tsvd(A)A <- rbind(c(1,3), c(2,4)) tsvd(A)
Wasserstein Dictionary Learning (WDL) model for topic modeling
wdl(docs, ...) ## S3 method for class 'character' wdl(docs, specs = wdl_specs(), verbose = TRUE, ...) ## S3 method for class 'wdl' print(x, topic = 0, token_per_topic = 10, ...) ## S3 method for class 'wdl' summary(object, topic = 1, token_per_topic = 10, ...)wdl(docs, ...) ## S3 method for class 'character' wdl(docs, specs = wdl_specs(), verbose = TRUE, ...) ## S3 method for class 'wdl' print(x, topic = 0, token_per_topic = 10, ...) ## S3 method for class 'wdl' summary(object, topic = 1, token_per_topic = 10, ...)
docs |
character vector, sentences to be analyzed |
... |
only for compatibility |
specs |
list, model specification for the WDL
see |
verbose |
bool, whether to print useful info |
x |
WDL model |
topic |
int, number of topic to be printed |
token_per_topic |
int, number of tokens to be printed |
object |
WDL model |
This is the re-implementation of WDL model from ground up,
and it calls the barycenter under the hood
(to be precise directly calling the underlying C++ routine
for barycenter)
topics and weights computed from the WDL given the input data
Peyré, G., & Cuturi, M. (2019). Computational Optimal Transport: With Applications to Data Science. Foundations and Trends® in Machine Learning, 11(5–6), 355–607. https://doi.org/10.1561/2200000073
Schmitz, M. A., Heitz, M., Bonneel, N., Ngolè, F., Coeurjolly, D., Cuturi, M., Peyré, G., & Starck, J.-L. (2018). Wasserstein dictionary learning: Optimal transport-based unsupervised nonlinear dictionary learning. SIAM Journal on Imaging Sciences, 11(1), 643–678. https://doi.org/10.1137/17M1140431
Xie, F. (2025). Deriving the Gradients of Some Popular Optimal Transport Algorithms (No. arXiv:2504.08722). arXiv. https://doi.org/10.48550/arXiv.2504.08722
vignette("wdl-model")
# simple WDL example sentences <- c("this is a sentence", "this is another one") wdl_fit <- wdl( sentences, specs = wdl_specs(wdl_control = list(num_topics = 2),word2vec_control = list(min_count = 1)), verbose = TRUE)# simple WDL example sentences <- c("this is a sentence", "this is another one") wdl_fit <- wdl( sentences, specs = wdl_specs(wdl_control = list(num_topics = 2),word2vec_control = list(min_count = 1)), verbose = TRUE)
Control the parameters of WDL and WIG models
wdl_specs( wdl_control = list(num_topics = 4, batch_size = 64, epochs = 2, shuffle = TRUE, rng_seed = 42), tokenizer_control = list(stopwords = stopwords::stopwords()), word2vec_control = list(type = "cbow", dim = 10, min_count = 3), barycenter_control = list(reg = 0.1, with_grad = TRUE, use_cuda = TRUE, n_threads = 0, method = "auto", threshold = 0.1, max_iter = 20, zero_tol = 1e-06), optimizer_control = list(optimizer = "adamw", lr = 0.005, decay = 0.01, beta1 = 0.9, beta2 = 0.999, eps = 1e-08) ) wig_specs( wig_control = list(group_unit = "month", svd_method = "topics", standardize = TRUE), wdl_control = list(num_topics = 4, batch_size = 64, epochs = 2, shuffle = TRUE, rng_seed = 42), tokenizer_control = list(stopwords = stopwords::stopwords()), word2vec_control = list(type = "cbow", dim = 10, min_count = 1), barycenter_control = list(reg = 0.1, with_grad = TRUE, use_cuda = TRUE, method = "auto", threshold = 0.1, max_iter = 20, zero_tol = 1e-06), optimizer_control = list(optimizer = "adamw", lr = 0.005, decay = 0.01, beta1 = 0.9, beta2 = 0.999, eps = 1e-08) )wdl_specs( wdl_control = list(num_topics = 4, batch_size = 64, epochs = 2, shuffle = TRUE, rng_seed = 42), tokenizer_control = list(stopwords = stopwords::stopwords()), word2vec_control = list(type = "cbow", dim = 10, min_count = 3), barycenter_control = list(reg = 0.1, with_grad = TRUE, use_cuda = TRUE, n_threads = 0, method = "auto", threshold = 0.1, max_iter = 20, zero_tol = 1e-06), optimizer_control = list(optimizer = "adamw", lr = 0.005, decay = 0.01, beta1 = 0.9, beta2 = 0.999, eps = 1e-08) ) wig_specs( wig_control = list(group_unit = "month", svd_method = "topics", standardize = TRUE), wdl_control = list(num_topics = 4, batch_size = 64, epochs = 2, shuffle = TRUE, rng_seed = 42), tokenizer_control = list(stopwords = stopwords::stopwords()), word2vec_control = list(type = "cbow", dim = 10, min_count = 1), barycenter_control = list(reg = 0.1, with_grad = TRUE, use_cuda = TRUE, method = "auto", threshold = 0.1, max_iter = 20, zero_tol = 1e-06), optimizer_control = list(optimizer = "adamw", lr = 0.005, decay = 0.01, beta1 = 0.9, beta2 = 0.999, eps = 1e-08) )
wdl_control |
list, parameters for WDL |
tokenizer_control |
list, parameters for
|
word2vec_control |
list, parameters for
|
barycenter_control |
list, parameters for
|
optimizer_control |
list, parameters for the optimizer (SGD, Adam, AdamW) |
wig_control |
list, parameters for WIG model |
See vignette("specs") for details on the parameters.
list of the control lists
Peyré, G., & Cuturi, M. (2019). Computational Optimal Transport: With Applications to Data Science. Foundations and Trends® in Machine Learning, 11(5–6), 355–607. https://doi.org/10.1561/2200000073
Schmitz, M. A., Heitz, M., Bonneel, N., Ngolè, F., Coeurjolly, D., Cuturi, M., Peyré, G., & Starck, J.-L. (2018). Wasserstein dictionary learning: Optimal transport-based unsupervised nonlinear dictionary learning. SIAM Journal on Imaging Sciences, 11(1), 643–678. https://doi.org/10.1137/17M1140431
Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. International Conference on Learning Representations (ICLR).
Loshchilov, I., & Hutter, F. (2019). Decoupled Weight Decay Regularization (No. arXiv:1711.05101). arXiv. https://doi.org/10.48550/arXiv.1711.05101
Xie, F. (2020). Wasserstein index generation model: Automatic generation of time-series index with application to economic policy uncertainty. Economics Letters, 186, 108874. https://doi.org/10.1016/j.econlet.2019.108874
Xie, F. (2025). Deriving the Gradients of Some Popular Optimal Transport Algorithms (No. arXiv:2504.08722). arXiv. https://doi.org/10.48550/arXiv.2504.08722
wig_specs(), barycenter(),
word2vec::word2vec(), tokenizers::tokenize_words(),
vignette("specs")
Wasserstein Index Generation (WIG) model for time-series sentiment index autogeneration
wig(.data, date_col, docs_col, ...) ## S3 method for class 'data.frame' wig(.data, date_col, docs_col, specs = wig_specs(), verbose = TRUE, ...) ## S3 method for class 'wig' print(x, topic = 1, token_per_topic = 10, ...) ## S3 method for class 'wig' summary(object, topic = 1, token_per_topic = 10, ...)wig(.data, date_col, docs_col, ...) ## S3 method for class 'data.frame' wig(.data, date_col, docs_col, specs = wig_specs(), verbose = TRUE, ...) ## S3 method for class 'wig' print(x, topic = 1, token_per_topic = 10, ...) ## S3 method for class 'wig' summary(object, topic = 1, token_per_topic = 10, ...)
.data |
a dataframe containing the dates/datetimes and documents |
date_col |
name of the column for dates/datetimes |
docs_col |
name of the column for the texts/documents |
... |
only for compatibility |
specs |
list, model specification for WIG
see |
verbose |
bool, whether to print useful info |
x |
WIG model |
topic |
int, number of topic to be printed |
token_per_topic |
int, number of tokens to be printed |
object |
WIG model |
This is the re-implementation of WIG model from scratch in R.
"wig" class, i.e. list of the index and the WDL model
Xie, F. (2020). Wasserstein index generation model: Automatic generation of time-series index with application to economic policy uncertainty. Economics Letters, 186, 108874. https://doi.org/10.1016/j.econlet.2019.108874
vignette("wdl-model")
# create a small dataset wigdf <- data.frame( ref_date = as.Date(c("2012-01-01", "2012-02-01")), docs = c("this is a sentence", "this is another sentence")) wigfit <- wig(wigdf, ref_date, docs, specs = wig_specs(wdl_control = list(num_topics = 2),word2vec_control = list(min_count = 1)), verbose = FALSE)# create a small dataset wigdf <- data.frame( ref_date = as.Date(c("2012-01-01", "2012-02-01")), docs = c("this is a sentence", "this is another sentence")) wigfit <- wig(wigdf, ref_date, docs, specs = wig_specs(wdl_control = list(num_topics = 2),word2vec_control = list(min_count = 1)), verbose = FALSE)