| Title: | An Improved Version of WA-PLS | 
| Version: | 0.1.3 | 
| Description: | The goal of this package is to provide an improved version of WA-PLS (Weighted Averaging Partial Least Squares) by including the tolerances of taxa and the frequency of the sampled climate variable. This package also provides a way of leave-out cross-validation that removes both the test site and sites that are both geographically close and climatically close for each cycle, to avoid the risk of pseudo-replication. | 
| License: | GPL-3 | 
| Encoding: | UTF-8 | 
| URL: | https://github.com/special-uor/fxTWAPLS/, https://special-uor.github.io/fxTWAPLS/, https://research.reading.ac.uk/palaeoclimate/ | 
| BugReports: | https://github.com/special-uor/fxTWAPLS/issues/ | 
| Imports: | doFuture, foreach, future, geosphere, ggplot2, JOPS, MASS, parallel, progressr | 
| Suggests: | magrittr, progress, scales, tictoc | 
| Depends: | R (≥ 3.6) | 
| RoxygenNote: | 7.3.1 | 
| Language: | en-GB | 
| NeedsCompilation: | no | 
| Packaged: | 2024-06-25 11:27:00 UTC; r.villegas-diaz | 
| Author: | Mengmeng Liu | 
| Maintainer: | Roberto Villegas-Diaz <r.villegas-diaz@outlook.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2024-06-25 11:50:13 UTC | 
fxTWAPLS: An Improved Version of WA-PLS
Description
The goal of this package is to provide an improved version of WA-PLS (Weighted Averaging Partial Least Squares) by including the tolerances of taxa and the frequency of the sampled climate variable. This package also provides a way of leave-out cross-validation that removes both the test site and sites that are both geographically close and climatically close for each cycle, to avoid the risk of pseudo-replication.
Author(s)
Maintainer: Roberto Villegas-Diaz r.villegas-diaz@outlook.com (ORCID)
Authors:
- Mengmeng Liu m.liu18@imperial.ac.uk (ORCID) 
- Iain Colin Prentice c.prentice@imperial.ac.uk (ORCID) 
- Cajo J. F. ter Braak cajo.terbraak@wur.nl (ORCID) 
- Sandy P. Harrison s.p.harrison@reading.ac.uk (ORCID) 
Other contributors:
- SPECIAL Research Group @ University of Reading [copyright holder] 
See Also
Useful links:
- Report bugs at https://github.com/special-uor/fxTWAPLS/issues/ 
TWA-PLS predict function
Description
TWA-PLS predict function
Usage
TWAPLS.predict.w(TWAPLSoutput, fossil_taxa)
Arguments
| TWAPLSoutput | The output of the  | 
| fossil_taxa | Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon. | 
Value
A list of the reconstruction results. Each element in the list is described below:
- fit
- the fitted values using each number of components. 
- nPLS
- the total number of components extracted. 
See Also
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Load reconstruction data
Holocene <- read.csv("/path/to/Holocene.csv")
taxaColMin <- which(colnames(Holocene) == "taxa0")
taxaColMax <- which(colnames(Holocene) == "taxaN")
core <- Holocene[, taxaColMin:taxaColMax]
## Train
fit_t_Tmin <- fxTWAPLS::TWAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin <- fxTWAPLS::TWAPLS.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
fit_t_Tmin2 <- fxTWAPLS::TWAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
## Predict
fossil_t_Tmin <- fxTWAPLS::TWAPLS.predict.w(fit_t_Tmin, core)
fossil_tf_Tmin <- fxTWAPLS::TWAPLS.predict.w(fit_tf_Tmin, core)
fossil_t_Tmin2 <- fxTWAPLS::TWAPLS.predict.w(fit_t_Tmin2, core)
fossil_tf_Tmin2 <- fxTWAPLS::TWAPLS.predict.w(fit_tf_Tmin2, core)
## End(Not run)
TWA-PLS training function
Description
TWA-PLS training function, which can perform fx correction.
1/fx^2 correction will be applied at step 7.
Usage
TWAPLS.w(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA
)
Arguments
| modern_taxa | The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. | 
| modern_climate | The modern climate value at each sampling site. | 
| nPLS | The number of components to be extracted. | 
| usefx | Boolean flag on whether or not use  | 
| fx_method | Binned or p-spline smoothed  | 
| bin | Binwidth to get fx, needed for both binned and p-splined method.
if  | 
Value
A list of the training results, which will be used by the predict function. Each element in the list is described below:
- fit
- the fitted values using each number of components. 
- x
- the observed modern climate values. 
- taxon_name
- the name of each taxon. 
- optimum
- the updated taxon optimum 
- comp
- each component extracted (will be used in step 7 regression). 
- u
- taxon optimum for each component (step 2). 
- t
- taxon tolerance for each component (step 2). 
- z
- a parameter used in standardization for each component (step 5). 
- s
- a parameter used in standardization for each component (step 5). 
- orth
- a list that stores orthogonalization parameters (step 4). 
- alpha
- a list that stores regression coefficients (step 7). 
- meanx
- mean value of the observed modern climate values. 
- nPLS
- the total number of components extracted. 
See Also
fx, TWAPLS.predict.w, and
WAPLS.w
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Training
fit_t_Tmin <- fxTWAPLS::TWAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin <- fxTWAPLS::TWAPLS.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
## End(Not run)
TWA-PLS training function v2
Description
TWA-PLS training function, which can perform fx correction.
1/fx correction will be applied at step 2 and step 7.
Usage
TWAPLS.w2(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA
)
Arguments
| modern_taxa | The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. | 
| modern_climate | The modern climate value at each sampling site. | 
| nPLS | The number of components to be extracted. | 
| usefx | Boolean flag on whether or not use  | 
| fx_method | Binned or p-spline smoothed  | 
| bin | Binwidth to get fx, needed for both binned and p-splined method.
if  | 
Value
A list of the training results, which will be used by the predict function. Each element in the list is described below:
- fit
- the fitted values using each number of components. 
- x
- the observed modern climate values. 
- taxon_name
- the name of each taxon. 
- optimum
- the updated taxon optimum 
- comp
- each component extracted (will be used in step 7 regression). 
- u
- taxon optimum for each component (step 2). 
- t
- taxon tolerance for each component (step 2). 
- z
- a parameter used in standardization for each component (step 5). 
- s
- a parameter used in standardization for each component (step 5). 
- orth
- a list that stores orthogonalization parameters (step 4). 
- alpha
- a list that stores regression coefficients (step 7). 
- meanx
- mean value of the observed modern climate values. 
- nPLS
- the total number of components extracted. 
See Also
fx, TWAPLS.predict.w, and
WAPLS.w
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Training
fit_t_Tmin2 <- fxTWAPLS::TWAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
## End(Not run)
WA-PLS predict function
Description
WA-PLS predict function
Usage
WAPLS.predict.w(WAPLSoutput, fossil_taxa)
Arguments
| WAPLSoutput | The output of the  | 
| fossil_taxa | Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon. | 
Value
A list of the reconstruction results. Each element in the list is described below:
- fit
- The fitted values using each number of components. 
- nPLS
- The total number of components extracted. 
See Also
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Load reconstruction data
Holocene <- read.csv("/path/to/Holocene.csv")
taxaColMin <- which(colnames(Holocene) == "taxa0")
taxaColMax <- which(colnames(Holocene) == "taxaN")
core <- Holocene[, taxaColMin:taxaColMax]
## Train
fit_Tmin <- fxTWAPLS::WAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin <- fxTWAPLS::WAPLS.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
fit_Tmin2 <- fxTWAPLS::WAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin2 <- fxTWAPLS::WAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
## Predict
fossil_Tmin <- fxTWAPLS::WAPLS.predict.w(fit_Tmin, core)
fossil_f_Tmin <- fxTWAPLS::WAPLS.predict.w(fit_f_Tmin, core)
fossil_Tmin2 <- fxTWAPLS::WAPLS.predict.w(fit_Tmin2, core)
fossil_f_Tmin2 <- fxTWAPLS::WAPLS.predict.w(fit_f_Tmin2, core)
## End(Not run)
WA-PLS training function
Description
WA-PLS training function, which can perform fx correction.
1/fx^2 correction will be applied at step 7.
Usage
WAPLS.w(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA
)
Arguments
| modern_taxa | The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. | 
| modern_climate | The modern climate value at each sampling site. | 
| nPLS | The number of components to be extracted. | 
| usefx | Boolean flag on whether or not use  | 
| fx_method | Binned or p-spline smoothed  | 
| bin | Binwidth to get fx, needed for both binned and p-splined method.
if  | 
Value
A list of the training results, which will be used by the predict function. Each element in the list is described below:
- fit
- the fitted values using each number of components. 
- x
- the observed modern climate values. 
- taxon_name
- the name of each taxon. 
- optimum
- the updated taxon optimum (u* in the WA-PLS paper). 
- comp
- each component extracted (will be used in step 7 regression). 
- u
- taxon optimum for each component (step 2). 
- z
- a parameter used in standardization for each component (step 5). 
- s
- a parameter used in standardization for each component (step 5). 
- orth
- a list that stores orthogonalization parameters (step 4). 
- alpha
- a list that stores regression coefficients (step 7). 
- meanx
- mean value of the observed modern climate values. 
- nPLS
- the total number of components extracted. 
See Also
fx, TWAPLS.w, and
WAPLS.predict.w
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Training
fit_Tmin <- fxTWAPLS::WAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin <- fxTWAPLS::WAPLS.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
## End(Not run)
WA-PLS training function v2
Description
WA-PLS training function, which can perform fx correction.
1/fx correction will be applied at step 2 and step 7.
Usage
WAPLS.w2(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA
)
Arguments
| modern_taxa | The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. | 
| modern_climate | The modern climate value at each sampling site. | 
| nPLS | The number of components to be extracted. | 
| usefx | Boolean flag on whether or not use  | 
| fx_method | Binned or p-spline smoothed  | 
| bin | Binwidth to get fx, needed for both binned and p-splined method.
if  | 
Value
A list of the training results, which will be used by the predict function. Each element in the list is described below:
- fit
- the fitted values using each number of components. 
- x
- the observed modern climate values. 
- taxon_name
- the name of each taxon. 
- optimum
- the updated taxon optimum (u* in the WA-PLS paper). 
- comp
- each component extracted (will be used in step 7 regression). 
- u
- taxon optimum for each component (step 2). 
- z
- a parameter used in standardization for each component (step 5). 
- s
- a parameter used in standardization for each component (step 5). 
- orth
- a list that stores orthogonalization parameters (step 4). 
- alpha
- a list that stores regression coefficients (step 7). 
- meanx
- mean value of the observed modern climate values. 
- nPLS
- the total number of components extracted. 
See Also
fx, TWAPLS.w, and
WAPLS.predict.w
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Training
fit_Tmin2 <- fxTWAPLS::WAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin2 <- fxTWAPLS::WAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
## End(Not run)
Pseudo-removed leave-out cross-validation
Description
Pseudo-removed leave-out cross-validation
Usage
cv.pr.w(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  trainfun,
  predictfun,
  pseudo,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA,
  cpus = 4,
  test_mode = TRUE,
  test_it = 5
)
Arguments
| modern_taxa | The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. | 
| modern_climate | The modern climate value at each sampling site. | 
| nPLS | The number of components to be extracted. | 
| trainfun | Training function you want to use, either
 | 
| predictfun | Predict function you want to use: if  | 
| pseudo | The geographically and climatically close sites to each test
site, obtained from  | 
| usefx | Boolean flag on whether or not use  | 
| fx_method | Binned or p-spline smoothed  | 
| bin | Binwidth to get fx, needed for both binned and p-splined method.
if  | 
| cpus | Number of CPUs for simultaneous iterations to execute, check
 | 
| test_mode | Boolean flag to execute the function with a limited number
of iterations,  | 
| test_it | Number of iterations to use in the test mode. | 
Value
Leave-one-out cross validation results.
See Also
fx, TWAPLS.w,
TWAPLS.predict.w, WAPLS.w, and
WAPLS.predict.w
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
point <- modern_pollen[, c("Long", "Lat")]
test_mode <- TRUE # It should be set to FALSE before running
dist <- fxTWAPLS::get_distance(
  point,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
pseudo_Tmin <- fxTWAPLS::get_pseudo(
  dist,
  modern_pollen$Tmin,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
cv_pr_tf_Tmin2 <- fxTWAPLS::cv.pr.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  fxTWAPLS::TWAPLS.w2,
  fxTWAPLS::TWAPLS.predict.w,
  pseudo_Tmin,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
cv_pr_tf_Tmin2 <- fxTWAPLS::cv.pr.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  fxTWAPLS::TWAPLS.w2,
  fxTWAPLS::TWAPLS.predict.w,
  pseudo_Tmin,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2, # Remove the following line
  test_mode = test_mode
) %>%
  fxTWAPLS::pb()
## End(Not run)
Leave-one-out cross-validation
Description
Leave-one-out cross-validation as
rioja (https://cran.r-project.org/package=rioja).
Usage
cv.w(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  trainfun,
  predictfun,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA,
  cpus = 4,
  test_mode = FALSE,
  test_it = 5
)
Arguments
| modern_taxa | The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. | 
| modern_climate | The modern climate value at each sampling site. | 
| nPLS | The number of components to be extracted. | 
| trainfun | Training function you want to use, either
 | 
| predictfun | Predict function you want to use: if  | 
| usefx | Boolean flag on whether or not use  | 
| fx_method | Binned or p-spline smoothed  | 
| bin | Binwidth to get fx, needed for both binned and p-splined method.
if  | 
| cpus | Number of CPUs for simultaneous iterations to execute, check
 | 
| test_mode | boolean flag to execute the function with a limited number
of iterations,  | 
| test_it | number of iterations to use in the test mode. | 
Value
leave-one-out cross validation results
See Also
fx, TWAPLS.w,
TWAPLS.predict.w, WAPLS.w, and
WAPLS.predict.w
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
## LOOCV
test_mode <- TRUE # It should be set to FALSE before running
cv_tf_Tmin2 <- fxTWAPLS::cv.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  fxTWAPLS::TWAPLS.w2,
  fxTWAPLS::TWAPLS.predict.w,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
cv_tf_Tmin2 <- fxTWAPLS::cv.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  fxTWAPLS::TWAPLS.w2,
  fxTWAPLS::TWAPLS.predict.w,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2, # Remove the following line
  test_mode = test_mode
) %>% fxTWAPLS::pb()
## End(Not run)
Get frequency of the climate value
Description
Function to get the frequency of the climate value, which will be used to
provide fx correction for WA-PLS and TWA-PLS.
Usage
fx(x, bin, show_plot = FALSE)
Arguments
| x | Numeric vector with the modern climate values. | 
| bin | Binwidth to get the frequency of the modern climate values. | 
| show_plot | Boolean flag to show a plot of  | 
Value
Numeric vector with the frequency of the modern climate values.
See Also
cv.w, cv.pr.w, and
sse.sample
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Get the frequency of each climate variable fx
fx_Tmin <- fxTWAPLS::fx(modern_pollen$Tmin, bin = 0.02, show_plot = TRUE)
fx_gdd <- fxTWAPLS::fx(modern_pollen$gdd, bin = 20, show_plot = TRUE)
fx_alpha <- fxTWAPLS::fx(modern_pollen$alpha, bin = 0.002, show_plot = TRUE)
## End(Not run)
Get frequency of the climate value with p-spline smoothing
Description
Function to get the frequency of the climate value, which will be used to
provide fx correction for WA-PLS and TWA-PLS.
Usage
fx_pspline(x, bin, show_plot = FALSE)
Arguments
| x | Numeric vector with the modern climate values. | 
| bin | Binwidth to get the frequency of the modern climate values, the curve will be p-spline smoothed later | 
| show_plot | Boolean flag to show a plot of  | 
Value
Numeric vector with the frequency of the modern climate values.
See Also
cv.w, cv.pr.w, and
sse.sample
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Get the frequency of each climate variable fx
fx_pspline_Tmin <- fxTWAPLS::fx_pspline(
  modern_pollen$Tmin,
  bin = 0.02,
  show_plot = TRUE
)
fx_pspline_gdd <- fxTWAPLS::fx_pspline(
  modern_pollen$gdd,
  bin = 20,
  show_plot = TRUE
)
fx_pspline_alpha <- fxTWAPLS::fx_pspline(
  modern_pollen$alpha,
  bin = 0.002,
  show_plot = TRUE
)
## End(Not run)
Get the distance between points
Description
Get the distance between points, the output will be used in
get_pseudo.
Usage
get_distance(point, cpus = 4, test_mode = FALSE, test_it = 5)
Arguments
| point | Each row represents a sampling site, the first column is longitude and the second column is latitude, both in decimal format. | 
| cpus | Number of CPUs for simultaneous iterations to execute, check
 | 
| test_mode | Boolean flag to execute the function with a limited number
of iterations,  | 
| test_it | Number of iterations to use in the test mode. | 
Value
Distance matrix, the value at the i-th row, means the distance
between the i-th sampling site and the whole sampling sites.
See Also
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
point <- modern_pollen[, c("Long", "Lat")]
test_mode <- TRUE # It should be set to FALSE before running
dist <- fxTWAPLS::get_distance(
  point,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
dist <- fxTWAPLS::get_distance(
  point,
  cpus = 2, # Remove the following line
  test_mode = test_mode
) %>%
  fxTWAPLS::pb()
## End(Not run)
Get geographically and climatically close sites
Description
Get the sites which are both geographically and climatically close to the
test site, which could result in pseudo-replication and inflate the
cross-validation statistics. The output will be used in
cv.pr.w.
Usage
get_pseudo(dist, x, cpus = 4, test_mode = FALSE, test_it = 5)
Arguments
| dist | Distance matrix which contains the distance from other sites. | 
| x | The modern climate values. | 
| cpus | Number of CPUs for simultaneous iterations to execute, check
 | 
| test_mode | Boolean flag to execute the function with a limited number
of iterations,  | 
| test_it | Number of iterations to use in the test mode. | 
Value
The geographically and climatically close sites to each test site.
See Also
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
point <- modern_pollen[, c("Long", "Lat")]
test_mode <- TRUE # It should be set to FALSE before running
dist <- fxTWAPLS::get_distance(
  point,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
pseudo_Tmin <- fxTWAPLS::get_pseudo(
  dist,
  modern_pollen$Tmin,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
pseudo_Tmin <- fxTWAPLS::get_pseudo(
  dist,
  modern_pollen$Tmin,
  cpus = 2, # Remove the following line
  test_mode = test_mode
) %>%
  fxTWAPLS::pb()
## End(Not run)
Show progress bar
Description
Show progress bar
Usage
pb(expr, ...)
Arguments
| expr | R expression. | 
| ... | Arguments passed on to  
 | 
Value
Return data from the function called.
Plot the residuals
Description
Plot the residuals, the black line is 0 line, the red line is the locally estimated scatterplot smoothing, which shows the degree of local compression.
Usage
plot_residuals(train_output, col)
Arguments
| train_output | Training output, can be the output of WA-PLS, WA-PLS with
 | 
| col | Choose which column of the fitted value to plot, in other words, how many number of components you want to use. | 
Value
Plotting status.
See Also
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
nsig <- 3 # This should be got from the random t-test of the cross validation
fxTWAPLS::plot_residuals(fit_tf_Tmin2, nsig)
## End(Not run)
Plot the training results
Description
Plot the training results, the black line is the 1:1 line, the red line is
the linear regression line to fitted and x, which shows the degree
of overall compression.
Usage
plot_train(train_output, col)
Arguments
| train_output | Training output, can be the output of WA-PLS, WA-PLS with
 | 
| col | Choose which column of the fitted value to plot, in other words, how many number of components you want to use. | 
Value
Plotting status.
See Also
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
nsig <- 3 # This should be got from the random t-test of the cross validation
fxTWAPLS::plot_train(fit_tf_Tmin2, nsig)
## End(Not run)
Random t-test
Description
Do a random t-test to the cross-validation results.
Usage
rand.t.test.w(cvoutput, n.perm = 999)
Arguments
| cvoutput | |
| n.perm | The number of permutation times to get the p value, which assesses whether using the current number of components is significantly different from using one less. | 
Value
A matrix of the statistics of the cross-validation results. Each component is described below:
- R2
- the coefficient of determination (the larger, the better the fit). 
- Avg.Bias
- average bias. 
- Max.Bias
- maximum bias. 
- Min.Bias
- minimum bias. 
- RMSEP
- root-mean-square error of prediction (the smaller, the better the fit). 
- delta.RMSEP
- the percent change of RMSEP using the current number of components than using one component less. 
- p
- assesses whether using the current number of components is significantly different from using one component less, which is used to choose the last significant number of components to avoid over-fitting. 
- -
- The degree of overall compression is assessed by doing linear regression to the cross-validation result and the observed climate values. -  Compre.b0: the intercept.
-  Compre.b1: the slope (the closer to 1, the less the overall compression).
-  Compre.b0.se: the standard error of the intercept.
-  Compre.b1.se: the standard error of the slope.
 
-  
See Also
Examples
## Not run: 
## Random t-test
rand_pr_tf_Tmin2 <- fxTWAPLS::rand.t.test.w(cv_pr_tf_Tmin2, n.perm = 999)
# note: choose the last significant number of components based on the p-value,
# see details at Liu Mengmeng, Prentice Iain Colin, ter Braak Cajo J. F.,
# Harrison Sandy P.. 2020 An improved statistical approach for reconstructing
# past climates from biotic assemblages. Proc. R. Soc. A. 476: 20200346.
# <https://doi.org/10.1098/rspa.2020.0346>
## End(Not run)
Calculate Sample Specific Errors
Description
Calculate Sample Specific Errors
Usage
sse.sample(
  modern_taxa,
  modern_climate,
  fossil_taxa,
  trainfun,
  predictfun,
  nboot,
  nPLS,
  nsig,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA,
  cpus = 4,
  seed = NULL,
  test_mode = FALSE,
  test_it = 5
)
Arguments
| modern_taxa | The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. | 
| modern_climate | The modern climate value at each sampling site | 
| fossil_taxa | Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon. | 
| trainfun | Training function you want to use, either
 | 
| predictfun | Predict function you want to use: if  | 
| nboot | The number of bootstrap cycles you want to use. | 
| nPLS | The number of components to be extracted. | 
| nsig | The significant number of components to use to reconstruct past climates, this can be obtained from the cross-validation results. | 
| usefx | Boolean flag on whether or not use  | 
| fx_method | Binned or p-spline smoothed  | 
| bin | Binwidth to get fx, needed for both binned and p-splined method.
if  | 
| cpus | Number of CPUs for simultaneous iterations to execute, check
 | 
| seed | Seed for reproducibility. | 
| test_mode | Boolean flag to execute the function with a limited number
of iterations,  | 
| test_it | Number of iterations to use in the test mode. | 
Value
The bootstrapped standard error for each site.
See Also
fx, TWAPLS.w,
TWAPLS.predict.w, WAPLS.w, and
WAPLS.predict.w
Examples
## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Load reconstruction data
Holocene <- read.csv("/path/to/Holocene.csv")
taxaColMin <- which(colnames(Holocene) == "taxa0")
taxaColMax <- which(colnames(Holocene) == "taxaN")
core <- Holocene[, taxaColMin:taxaColMax]
## SSE
nboot <- 5 # Recommended 1000
nsig <- 3 # This should be got from the random t-test of the cross validation
sse_tf_Tmin2 <- fxTWAPLS::sse.sample(
  modern_taxa = taxa,
  modern_climate = modern_pollen$Tmin,
  fossil_taxa = core,
  trainfun = fxTWAPLS::TWAPLS.w2,
  predictfun = fxTWAPLS::TWAPLS.predict.w,
  nboot = nboot,
  nPLS = 5,
  nsig = nsig,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2,
  seed = 1
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
sse_tf_Tmin2 <- fxTWAPLS::sse.sample(
  modern_taxa = taxa,
  modern_climate = modern_pollen$Tmin,
  fossil_taxa = core,
  trainfun = fxTWAPLS::TWAPLS.w2,
  predictfun = fxTWAPLS::TWAPLS.predict.w,
  nboot = nboot,
  nPLS = 5,
  nsig = nsig,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2,
  seed = 1
) %>% fxTWAPLS::pb()
## End(Not run)