| Type: | Package | 
| Title: | Bootstrapping Estimates of Clustering Stability | 
| Version: | 0.4.2 | 
| Description: | Implementation of the bootstrapping approach for the estimation of clustering stability and its application in estimating the number of clusters, as introduced by Yu et al. (2016) <doi:10.1142/9789814749411_0007>. Implementation of the non-parametric bootstrap approach to assessing the stability of module detection in a graph, the extension for the selection of a parameter set that defines a graph from data in a way that optimizes stability, and the corresponding visualization functions, as introduced by Tian et al. (2021) <doi:10.1002/sam.11495>. Implementation of the out-of-bag stability estimation function and the Smin-based k-selection function, as introduced by Liu et al. (2022) <doi:10.1002/sam.11593>. Implementation of an ensemble clustering method based on k-means, spectral, and hierarchical clustering. | 
| Depends: | R (≥ 3.5.1) | 
| Imports: | cluster, mclust (≥ 5.0.0), flexclust, fpc, plyr, dplyr, doParallel, foreach, igraph (≥ 1.2.0), compiler, stats, parallel, grid, grDevices, ggplot2, gridExtra, intergraph, GGally, network, kernlab, sna, progress | 
| License: | GPL-2 | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| RoxygenNote: | 7.3.2 | 
| NeedsCompilation: | no | 
| Packaged: | 2025-06-17 13:26:12 UTC; tianm | 
| Author: | Han Yu [aut], Mingmei Tian [aut], Tianmou Liu | 
| Maintainer: | Tianmou Liu <tianmouliu@outlook.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-06-17 23:00:02 UTC | 
Calculate agreement between two clustering results
Description
Calculate agreement between two clustering results
Usage
agreement(clst1, clst2)
Arguments
| clst1 | First clustering result | 
| clst2 | Second clustering result | 
Value
Vector of agreement values
Calculate agreement between two clustering results with known number of clusters
Description
Calculate agreement between two clustering results with known number of clusters
Usage
agreement_nk(clst1, clst2, nk)
Arguments
| clst1 | First clustering result | 
| clst2 | Second clustering result | 
| nk | Number of clusters | 
Value
Vector of agreement values
Multi-Method Ensemble Clustering Analysis for Multiple-Objective Clustering (MOC) Datasets
Description
Performs ensemble clustering analysis on multiple datasets using different clustering methods and compares their performance.
Usage
analyze_moc_datasets(
  datasets,
  selected,
  n_ref = 3,
  B = 100,
  plot = TRUE,
  plot_file = NULL
)
Arguments
| datasets | List of datasets to analyze | 
| selected | Indices of datasets to analyze | 
| n_ref | Number of reference distributions (default: 3) | 
| B | Number of bootstrap samples (default: 100) | 
| plot | Whether to generate plots (default: TRUE) | 
| plot_file | Output file for plots (default: NULL) | 
Value
A list containing:
- results
- Results for each dataset 
- ari_table
- Adjusted Rand Index comparison table 
- runtime_table
- Runtime comparison table 
- plots
- List of generated plots if plot=TRUE 
Calculate Comparison Statistics
Description
Calculate Comparison Statistics
Usage
calculate_comparison_stats(results, method_names)
Arguments
| results | List of results for each method | 
| method_names | Names of combination methods | 
Value
List of comparison statistics
Calculate Stability Measures for a Clustering Method
Description
Calculate Stability Measures for a Clustering Method
Usage
calculate_stability_measures(
  x,
  k,
  scheme,
  B = 100,
  n_ref = 3,
  hc.method = "ward.D",
  dist_method = "euclidean"
)
Arguments
| x | Input data matrix | 
| k | Number of clusters | 
| scheme | Clustering scheme ("kmeans", "hc", or "spectral") | 
| B | Number of bootstrap samples | 
| n_ref | Number of reference distributions | 
| hc.method | Hierarchical clustering method | 
| dist_method | Distance method | 
Value
List containing stability measures and clustering results
Compare MOC Results
Description
Compare MOC Results
Usage
compare_moc_results(results, metric = "ari", plot = TRUE)
Arguments
| results | Results from analyze_moc_datasets | 
| metric | Metric to compare ("ari", "runtime", or "modularity") | 
| plot | Whether to generate comparison plot (default: TRUE) | 
Create Graph and Find Communities Using Different Methods
Description
Create Graph and Find Communities Using Different Methods
Usage
create_graph_and_communities(inc, method = "fastgreedy")
Arguments
| inc | Incidence matrix | 
| method | Community detection method: "fastgreedy", "metis", "hmetis", or "all" (default: "fastgreedy") | 
Value
List containing graph and community detection results from specified method(s)
Create Incidence Matrix for Graph Construction
Description
Create Incidence Matrix for Graph Construction
Usage
create_incidence_matrix(
  results_km,
  results_hc,
  results_sc,
  k_km,
  k_hc,
  k_sc,
  Smin_delta_km,
  Smin_delta_hc,
  Smin_delta_sc,
  combine_fn,
  n_samples
)
Arguments
| results_km | K-means results | 
| results_hc | Hierarchical clustering results | 
| results_sc | Spectral clustering results | 
| k_km | Number of k-means clusters | 
| k_hc | Number of hierarchical clusters | 
| k_sc | Number of spectral clusters | 
| Smin_delta_km | K-means stability delta | 
| Smin_delta_hc | Hierarchical clustering stability delta | 
| Smin_delta_sc | Spectral clustering stability delta | 
| combine_fn | Function to combine stability measures | 
| n_samples | Number of samples | 
Value
Incidence matrix for graph construction
Define Stability Combination Methods
Description
Define Stability Combination Methods
Usage
define_combination_methods(alpha = 0.5)
Arguments
| alpha | Weight for weighted combination | 
Value
List of combination functions
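As a rough illustration of what such a list of combination functions might look like, the sketch below builds one closure per combination named in this manual (product, arithmetic, geometric, harmonic, weighted). The name `combination_methods_sketch`, the two-argument signature, and the weighted form `alpha * a + (1 - alpha) * b` are assumptions made for illustration and are not taken from the package source.

```r
# Hypothetical sketch, not the package implementation.
# Each function combines two stability values a and b (assumed to lie in (0, 1]).
combination_methods_sketch <- function(alpha = 0.5) {
  list(
    product    = function(a, b) a * b,
    arithmetic = function(a, b) (a + b) / 2,
    geometric  = function(a, b) sqrt(a * b),
    harmonic   = function(a, b) 2 * a * b / (a + b),
    # Assumed form: alpha weights the first value, (1 - alpha) the second.
    weighted   = function(a, b) alpha * a + (1 - alpha) * b
  )
}

fns <- combination_methods_sketch(alpha = 0.25)
# Apply every combination to the same pair of made-up stability values.
combined <- sapply(fns, function(f) f(0.8, 0.4))
print(combined)
```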
Multi-Method Ensemble Clustering with Graph-based Consensus
Description
Implements ensemble clustering by combining multiple clustering methods (k-means, hierarchical, and spectral clustering) using a graph-based consensus approach.
Usage
ensemble.cluster.multi(
  x,
  k_km,
  k_hc,
  k_sc,
  n_ref = 3,
  B = 100,
  hc.method = "ward.D",
  dist_method = "euclidean"
)
Arguments
| x | data.frame or matrix where rows are observations and columns are features | 
| k_km | number of clusters for k-means clustering | 
| k_hc | number of clusters for hierarchical clustering | 
| k_sc | number of clusters for spectral clustering | 
| n_ref | number of reference distributions for stability assessment (default: 3) | 
| B | number of bootstrap samples for stability estimation (default: 100) | 
| hc.method | hierarchical clustering method (default: "ward.D") | 
| dist_method | distance method for spectral clustering (default: "euclidean") | 
Details
This function implements a multi-method ensemble clustering approach that:
1. Applies multiple clustering methods (k-means, hierarchical, spectral)
2. Assesses the stability of each clustering through bootstrapping
3. Constructs a weighted bipartite graph representing all clusterings
4. Uses fast greedy community detection for the final consensus
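To make the bipartite-graph step more concrete, the sketch below shows one common way to represent several clusterings as an observation-by-cluster incidence matrix, which can then serve as the biadjacency matrix of a bipartite graph. The helper `membership_to_incidence` is hypothetical, and the matrix here is unweighted; how the package weights the edges by stability is not shown in this manual and is not reproduced.

```r
# Hypothetical sketch: binary incidence matrix for several clusterings.
# memberships: named list of integer membership vectors of equal length.
membership_to_incidence <- function(memberships) {
  blocks <- lapply(names(memberships), function(nm) {
    m  <- memberships[[nm]]
    ks <- sort(unique(m))
    # One indicator column per cluster of this method.
    block <- outer(m, ks, "==") * 1
    colnames(block) <- paste0(nm, "_", ks)
    block
  })
  do.call(cbind, blocks)
}

km  <- c(1, 1, 2, 2)   # made-up k-means labels
hc  <- c(1, 2, 2, 2)   # made-up hierarchical labels
inc <- membership_to_incidence(list(km = km, hc = hc))
print(inc)
```

Each row has exactly one 1 per clustering method, so row sums equal the number of methods combined.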
Value
A list containing:
- membership
- Final cluster assignments from ensemble consensus 
- k_consensus
- Number of clusters found in consensus 
- individual_results
- List of results from individual clustering methods 
- stability_measures
- Stability measures for each method 
- graph
- igraph object of the ensemble graph 
Examples
data(iris)
df <- iris[,1:4]
result <- ensemble.cluster.multi(df, k_km=3, k_hc=3, k_sc=3)
plot(df[,1:2], col=result$membership, pch=16)
Multi-Method Ensemble Clustering with Multiple Stability Combinations
Description
Implements ensemble clustering using multiple methods for combining stability measures, generating separate consensus results for each combination method.
Usage
ensemble_cluster_multi_combinations(
  x,
  k_km,
  k_hc,
  k_sc,
  n_ref = 3,
  B = 100,
  hc.method = "ward.D",
  dist_method = "euclidean",
  alpha = 0.25
)
Arguments
| x | data.frame or matrix where rows are observations and columns are features | 
| k_km | number of clusters for k-means clustering | 
| k_hc | number of clusters for hierarchical clustering | 
| k_sc | number of clusters for spectral clustering | 
| n_ref | number of reference distributions for stability assessment (default: 3) | 
| B | number of bootstrap samples for stability estimation (default: 100) | 
| hc.method | hierarchical clustering method (default: "ward.D") | 
| dist_method | distance method for spectral clustering (default: "euclidean") | 
| alpha | weight for weighted combination (default: 0.25) | 
Value
A list containing results for each combination method:
- product
- Results using product combination 
- arithmetic
- Results using arithmetic mean 
- geometric
- Results using geometric mean 
- harmonic
- Results using harmonic mean 
- weighted
- Results using weighted combination 
Each method's results contain:
- fastgreedy
- Results from fast greedy community detection 
- metis
- Results from METIS (leading eigenvector) community detection 
- hmetis
- Results from hMETIS (Louvain) community detection 
- graph
- igraph object of the ensemble graph 
- edge_weights
- Edge weights of the graph 
- individual_results
- Results from individual clustering methods 
- stability_measures
- Stability measures 
- incidence_matrix
- Incidence matrix used for graph construction 
Each community detection method's results contain:
- membership
- Final cluster assignments 
- k_consensus
- Number of clusters found 
The function also returns comparison statistics for each community detection method:
- comparison$fastgreedy
- Comparison stats for fast greedy results 
- comparison$metis
- Comparison stats for METIS results 
- comparison$hmetis
- Comparison stats for hMETIS results 
Examples
data(iris)
df <- iris[,1:4]
results <- ensemble_cluster_multi_combinations(df, k_km=3, k_hc=3, k_sc=3)
# Compare cluster assignments from different methods
table(product = results$product$membership, 
      arithmetic = results$arithmetic$membership)
Estimate the stability of a clustering based on non-parametric bootstrap out-of-bag scheme, with option for subsampling scheme
Description
Estimate the stability of a clustering based on non-parametric bootstrap out-of-bag scheme, with option for subsampling scheme
Usage
esmbl.stability(
  x,
  k,
  scheme = "kmeans",
  B = 100,
  hc.method = "ward.D",
  cut_ratio = 0.5,
  dist_method = "euclidean"
)
Arguments
| x | data.frame or matrix where rows are observations and columns are features | 
| k | number of clusters for which to estimate the stability | 
| scheme | clustering method to use ("kmeans", "hc", or "spectral") | 
| B | number of bootstrap re-samples | 
| hc.method | hierarchical clustering method (default: "ward.D") | 
| cut_ratio | ratio for subsampling (default: 0.5) | 
| dist_method | distance method for spectral clustering (default: "euclidean") | 
Details
This function estimates the stability through out-of-bag observations. It estimates the stability at the (1) observation level, (2) cluster level, and (3) overall.
Value
- membership
- vector of membership for each observation from the reference clustering 
- obs_wise
- vector of estimated observation-wise stability 
- clust_wise
- vector of estimated cluster-wise stability 
- overall
- numeric estimated overall stability 
- Smin
- numeric estimated Smin through out-of-bag scheme 
Author(s)
Tianmou Liu
Examples
set.seed(123)
data(iris)
df <- iris[,1:4]
result <- esmbl.stability(df, k=3, scheme="kmeans")
Estimate number of clusters
Description
Estimate number of clusters by bootstrapping stability
Usage
k.select(x, range = 2:7, B = 20, r = 5, threshold = 0.8, scheme_2 = TRUE)
Arguments
| x | a data.frame or matrix of the data set | 
| range | a vector of integers giving the candidate numbers of clusters | 
| B | number of bootstrap re-samplings | 
| r | number of runs of k-means | 
| threshold | the threshold for determining k | 
| scheme_2 | logical; TRUE uses Scheme 2 and FALSE uses Scheme 1 (see Details) | 
Details
This function estimates the number of clusters through a bootstrapping approach and a measure, Smin, which is based on an observation-wise similarity among clusterings. The number of clusters k is selected as the largest number of clusters for which Smin is greater than a threshold. The threshold is often selected between 0.8 and 0.9. Two schemes are provided. Scheme 1 uses the clustering of the original data as the reference for stability calculations. Scheme 2 searches across the clustering samples for the one that gives the most stable clustering.
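The selection rule described above can be sketched in a few lines of base R. The function `select_k_sketch` and the Smin values are made up for illustration; returning NA when no candidate exceeds the threshold is an assumption, since this manual does not say what the package does in that case.

```r
# Hypothetical sketch of the rule "largest k whose Smin exceeds the threshold".
select_k_sketch <- function(profile, k_range, threshold = 0.8) {
  ok <- profile > threshold
  # Assumption: no candidate passes the threshold, so no k is selected.
  if (!any(ok)) return(NA_integer_)
  max(k_range[ok])
}

smin_profile <- c(0.95, 0.91, 0.62, 0.85, 0.40, 0.33)  # made-up Smin for k = 2..7
k_hat <- select_k_sketch(smin_profile, k_range = 2:7, threshold = 0.8)
print(k_hat)
```

Note that the rule picks the largest passing k even if a smaller k in between falls below the threshold.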
Value
- profile
- a vector of Smin measures for determining k
- k
- integer, estimated number of clusters
Author(s)
Han Yu
References
Bootstrapping estimates of stability for clusters, observations and model selection. Han Yu, Brian Chapman, Arianna DiFlorio, Ellen Eischen, David Gotz, Matthews Jacob and Rachael Hageman Blair.
Examples
set.seed(1)
data(wine)
x0 <- wine[,2:14]
x <- scale(x0)
k.select(x, range = 2:10, B=20, r=5, scheme_2 = TRUE)
Estimate number of clusters
Description
Estimate number of clusters by bootstrapping stability
Usage
k.select_ref(df, k_range = 2:7, n_ref = 5, B = 100, B_ref = 50, r = 5)
Arguments
| df | data.frame or matrix of the data set | 
| k_range | vector of integers giving the candidate numbers of clusters | 
| n_ref | number of reference distribution to be generated | 
| B | number of bootstrap re-samples | 
| B_ref | number of bootstrap resamples for the reference distributions | 
| r | number of runs of k-means | 
Details
This function uses the out-of-bag scheme to estimate the number of clusters in a dataset. The function calculates the Smin of the dataset and, at the same time, generates a reference dataset with the same range as the original dataset in each dimension and calculates Smin_ref. The difference between Smin and Smin_ref at each k, Smin_diff(k), is taken into consideration as well as the standard deviation of the differences. The selected k is the argmax of ( Smin_diff(k) - ( Smin_diff(k+1) + se(Smin_diff(k+1)) ) ). If Smin_diff(k) is less than 0.1 for all k in k_range, the estimate is k = 1.
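The selection criterion can be written out as a small base R sketch. The function `select_k_ref_sketch`, the argument names, and the numbers are hypothetical; in particular, how the package treats the last candidate in k_range (which has no k+1 neighbour) is not stated here, so this sketch simply leaves it without a score.

```r
# Hypothetical sketch of the criterion
#   Smin_diff(k) - ( Smin_diff(k+1) + se(Smin_diff(k+1)) ).
select_k_ref_sketch <- function(smin_diff, se_diff, k_range, min_diff = 0.1) {
  # If no k shows a meaningful gap over the reference, report a single cluster.
  if (all(smin_diff < min_diff)) return(list(profile = NULL, k = 1L))
  n <- length(k_range)
  # Assumption: the last candidate has no k + 1 value and gets no score.
  profile <- smin_diff[-n] - (smin_diff[-1] + se_diff[-1])
  names(profile) <- k_range[-n]
  list(profile = profile, k = k_range[which.max(profile)])
}

smin_diff <- c(0.30, 0.35, 0.10, 0.05)   # made-up values for k = 2..5
se_diff   <- c(0.02, 0.02, 0.03, 0.01)   # made-up standard errors
fit <- select_k_ref_sketch(smin_diff, se_diff, k_range = 2:5)
print(fit$k)
```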
Value
- profile
- vector of ( Smin_diff(k) - ( Smin_diff(k+1) + se(Smin_diff(k+1)) ) ) measures for the researcher's inspection
- k
- estimated number of clusters 
Author(s)
Tianmou Liu
References
Bootstrapping estimates of stability for clusters, observations and model selection. Han Yu, Brian Chapman, Arianna DiFlorio, Ellen Eischen, David Gotz, Matthews Jacob and Rachael Hageman Blair.
Examples
set.seed(1)
data(iris)
df <- data.frame(iris[,1:4])
df <- scale(df)
k.select_ref(df, k_range = 2:7, n_ref = 5, B=500, B_ref = 500, r=5)
Load Multiple-Objective Clustering (MOC) Datasets
Description
Loads and processes datasets for multiple-objective clustering analysis. The function loads CSV files from a specified directory and processes them by removing NA columns.
Usage
load_moc_datasets(data_dir = getwd())
Arguments
| data_dir | Directory containing the CSV datasets (default: current working directory) | 
Value
A list containing:
- datasets
- Named list of processed datasets 
Examples
## Not run: 
# Load datasets
result <- load_moc_datasets("path/to/MOC_Data")
# Access a specific dataset
spiral <- result$datasets$Spiral
## End(Not run)
Calculate minimum agreement across clusters
Description
Calculates the minimum average agreement value across all clusters
Usage
min_agreement(clst, agrmt)
Arguments
| clst | clustering result vector | 
| agrmt | agreement values vector | 
Value
minimum average agreement value across clusters
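A minimal base R sketch of this computation, assuming that "average agreement" means the mean of the agreement values within each cluster. The name `min_agreement_sketch` and the input values are illustrative only and do not come from the package source.

```r
# Hypothetical sketch: mean agreement per cluster, then the minimum of those means.
min_agreement_sketch <- function(clst, agrmt) {
  cluster_means <- tapply(agrmt, clst, mean)
  min(cluster_means)
}

clst  <- c(1, 1, 2, 2, 2)              # made-up cluster labels
agrmt <- c(0.9, 0.7, 0.6, 0.5, 0.4)    # made-up agreement values
worst <- min_agreement_sketch(clst, agrmt)
print(worst)
```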
Estimate of module detection stability
Description
Estimate of module detection stability
Usage
network.stability(
  data.input,
  threshold,
  B = 20,
  cor.method,
  large.size,
  PermuNo,
  scheme_2 = FALSE
)
Arguments
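| data.input | a data.frame of the data set | 
| threshold | a numeric value giving the threshold used to construct the correlation graph | 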
| B | number of bootstrap re-samplings | 
| cor.method | the correlation method applied to the data set, for example 'pearson' as in the Examples | 
| large.size | the smallest set of modules; the Examples use large.size = 0 | 
| PermuNo | number of random graphs for the null | 
| scheme_2 | logical; whether to use Scheme 2 instead of Scheme 1 (default: FALSE) | 
Details
This function estimates the modules' stability through a bootstrapping approach for the given threshold. The approach to stability estimation is to compare the module composition of the reference correlation graph to the various bootstrapped correlation graphs, and to assess the stability at the (1) node-level, (2) module-level, and (3) overall.
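For readers unfamiliar with thresholded correlation graphs, the sketch below shows one plausible construction in base R: variables are nodes, and two nodes are joined when their correlation reaches the threshold. The function `cor_adjacency_sketch`, the use of the absolute correlation, and the `>=` comparison are assumptions for illustration; the package may define the graph differently.

```r
# Hypothetical sketch: adjacency matrix of a thresholded correlation graph.
cor_adjacency_sketch <- function(data, threshold, cor.method = "pearson") {
  r <- cor(data, method = cor.method)
  # Assumption: edges are defined on the absolute correlation.
  adj <- (abs(r) >= threshold) * 1
  diag(adj) <- 0   # no self-loops
  adj
}

adj <- cor_adjacency_sketch(mtcars, threshold = 0.7)
# Number of neighbours per variable in the resulting graph.
print(rowSums(adj))
```

Repeating this construction on bootstrap resamples of the rows gives the bootstrapped graphs whose modules are compared with those of the reference graph.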
Value
- stabilityresult
- a list of results for node-wise stability
- modularityresult
- list of modularity information with the given threshold
- jaccardresult
- list of the estimated unconditional observed stability and the estimates of expected stability under the null
- originalinformation
- list of information for the original data, igraph object and adjacency matrix constructed with the given threshold
Author(s)
Mingmei Tian
References
A framework for stability-based module detection in correlation graphs. Mingmei Tian, Rachael Hageman Blair, Lina Mu, Matthew Bonner, Richard Browne and Han Yu.
Examples
set.seed(1)
data(wine)
x0 <- wine[1:50,]
mytest<-network.stability(data.input=x0,threshold=0.7, B=20, 
cor.method='pearson',large.size=0,
PermuNo = 10,
scheme_2 = FALSE)
Plot method for objects from threshold.select
Description
Plot method for objects from threshold.select
Usage
network.stability.output(input, optimal.only = FALSE)
Arguments
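| input | an object returned by threshold.select | 
| optimal.only | a logical value indicating whether to plot only the network with the optimal threshold (default: FALSE) | 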
Details
network.stability.output is used to generate a series of network plots based on the given threshold.seq, where the nodes are
colored by the level of stability. The network with the optimal
threshold value selected by the function threshold.select is colored red.
Value
Plot of network figures
Author(s)
Mingmei Tian
References
A framework for stability-based module detection in correlation graphs. Mingmei Tian, Rachael Hageman Blair, Lina Mu, Matthew Bonner, Richard Browne and Han Yu.
Examples
set.seed(1)
data(wine)
x0 <- wine[1:50,]
mytest<-threshold.select(data.input=x0,threshold.seq=seq(0.1,0.5,by=0.05), B=20, 
cor.method='pearson',large.size=0,
PermuNo = 10,
no_cores=1,
scheme_2 = FALSE)
network.stability.output(mytest)
Estimate the stability of a clustering based on non-parametric bootstrap out-of-bag scheme, with option for subsampling scheme
Description
Estimate the stability of a clustering based on non-parametric bootstrap out-of-bag scheme, with option for subsampling scheme
Usage
ob.stability(x, k, B = 500, r = 5, subsample = FALSE, cut_ratio = 0.5)
Arguments
| x | data.frame or matrix where rows are observations and columns are features | 
| k | number of clusters for which to estimate the stability | 
| B | number of bootstrap re-samples | 
| r | integer parameter in the kmeansCBI() function | 
| subsample | logical parameter to use the subsampling scheme option in the resampling process (instead of bootstrap) | 
| cut_ratio | numeric parameter between 0 and 1 for subsampling scheme training set ratio | 
Details
This function estimates the stability through out-of-bag observations. It estimates the stability at the (1) observation level, (2) cluster level, and (3) overall.
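The out-of-bag idea can be illustrated with a single resampling round in base R: draw a bootstrap sample, cluster it, and then label the observations that were left out. The function `oob_assign_sketch` is hypothetical, and assigning out-of-bag rows to the nearest k-means centroid is an assumption made here for illustration; the package's own assignment and stability calculations are not reproduced.

```r
# Hypothetical sketch of one out-of-bag round with k-means.
oob_assign_sketch <- function(x, k) {
  x <- as.matrix(x)
  n <- nrow(x)
  in_bag <- sample(n, n, replace = TRUE)      # bootstrap indices
  oob    <- setdiff(seq_len(n), in_bag)       # rows never drawn
  fit    <- kmeans(x[in_bag, , drop = FALSE], centers = k, nstart = 5)
  x_oob  <- x[oob, , drop = FALSE]
  # Squared Euclidean distance from every out-of-bag row to every centroid.
  d <- matrix(0, nrow = length(oob), ncol = k)
  for (j in seq_len(k)) {
    d[, j] <- rowSums(sweep(x_oob, 2, fit$centers[j, ])^2)
  }
  # Assumption: each out-of-bag row takes the label of its nearest centroid.
  list(in_bag = in_bag, oob = oob, cluster = max.col(-d, ties.method = "first"))
}

set.seed(123)
res <- oob_assign_sketch(iris[, 1:4], k = 3)
print(length(res$oob))
```

Comparing such out-of-bag labels with a reference clustering over many rounds is what allows stability to be summarized per observation, per cluster, and overall.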
Value
- membership
- vector of membership for each observation from the reference clustering
- obs_wise
- vector of estimated observation-wise stability
- clust_wise
- vector of estimated cluster-wise stability
- overall
- numeric, estimated overall stability
- Smin
- numeric, estimated Smin through the out-of-bag scheme
Author(s)
Tianmou Liu
References
Bootstrapping estimates of stability for clusters, observations and model selection. Han Yu, Brian Chapman, Arianna DiFlorio, Ellen Eischen, David Gotz, Matthews Jacob and Rachael Hageman Blair.
Examples
set.seed(123)
data(iris)
df <- data.frame(iris[,1:4])
# You can choose to scale df before clustering by 
# df <- scale(df)
ob.stability(df, k = 2, B=500, r=5)
Create a Grid Plot of MOC Results
Description
Creates a grid plot with datasets as rows and clustering methods as columns. This function is designed to visualize multiple datasets and methods in a single plot.
Usage
plot_moc_grid(
  results,
  dataset_names = NULL,
  methods = c("kmeans", "hierarchical", "spectral", "fastgreedy", "metis", "hmetis"),
  plot_file = NULL,
  format = "pdf",
  mar = c(2, 2, 2, 1),
  cex = 0.7,
  point_size = 0.8,
  family = "serif",
  label_style = TRUE,
  maintain_aspect_ratio = TRUE
)
Arguments
| results | Results from analyze_moc_datasets | 
| dataset_names | Names of datasets to plot (default: all datasets in results) | 
| methods | Methods to plot (default: all available methods) | 
| plot_file | Output file for plots (default: NULL) | 
| format | Output format, either "pdf" or "eps" (default: "pdf") | 
| mar | Margins for plots (default: c(2, 2, 2, 1)) | 
| cex | Text size multiplier (default: 0.7) | 
| point_size | Point size for scatter plots (default: 0.8) | 
| family | Font family (default: "serif" for Times New Roman) | 
| label_style | Whether to add row/column labels (default: TRUE) | 
| maintain_aspect_ratio | Whether to maintain aspect ratio in PDF (default: TRUE) | 
Value
Invisibly returns the layout information
Plot MOC Results
Description
Plot MOC Results
Usage
plot_moc_results(
  results,
  dataset_names,
  methods = c("kmeans", "hierarchical", "spectral", "fastgreedy", "metis", "hmetis"),
  plot_file = NULL,
  max_plots_per_page = 12,
  mar = c(2, 2, 2, 1)
)
Arguments
| results | Results from analyze_moc_datasets | 
| dataset_names | Name or vector of names of datasets to plot | 
| methods | Methods to plot (default: c("kmeans", "hierarchical", "spectral", "fastgreedy", "metis", "hmetis")) | 
| plot_file | Output file for plots (default: NULL) | 
| max_plots_per_page | Maximum number of plots per page (default: 12) | 
| mar | Margins for plots (default: c(2, 2, 2, 1)) | 
Generate reference distribution for stability assessment
Description
Generates a reference distribution by sampling from uniform distributions with ranges determined by the original data.
Usage
ref_dist(df)
Arguments
| df | data.frame or matrix of the original dataset | 
Details
Generate Reference Distribution
Value
A scaled matrix containing the reference distribution
Examples
data(iris)
df <- iris[,1:4]
ref <- ref_dist(df)
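The description above can be sketched in base R as follows. The function `ref_dist_sketch` is a hypothetical stand-in written from the description (uniform sampling within each column's observed range, followed by scaling) and is not the package source.

```r
# Hypothetical sketch: uniform reference data over each column's observed range.
ref_dist_sketch <- function(df) {
  df <- as.matrix(df)
  ref <- apply(df, 2, function(col) runif(length(col), min(col), max(col)))
  # Center and scale each column, matching the "scaled matrix" in Value.
  scale(ref)
}

set.seed(1)
ref_sketch <- ref_dist_sketch(iris[, 1:4])
print(dim(ref_sketch))
```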
Generate reference distribution for binary data
Description
Generates a reference distribution by randomly permuting each column of the original binary dataset.
Usage
ref_dist_bin(df)
Arguments
| df | data.frame or matrix of the original binary dataset | 
Details
Generate Binary Reference Distribution
Value
A matrix containing the permuted binary reference distribution
Examples
binary_data <- matrix(sample(0:1, 100, replace=TRUE), ncol=5)
ref <- ref_dist_bin(binary_data)
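Permuting each column independently keeps every column's proportion of ones while breaking the association between columns. A base R sketch of that idea is given below; `ref_dist_bin_sketch` is a hypothetical name, and the code is written from the description rather than from the package source.

```r
# Hypothetical sketch: permute each column of a binary matrix independently.
ref_dist_bin_sketch <- function(df) {
  df <- as.matrix(df)
  # sample() on a vector of length > 1 returns a random permutation of it.
  apply(df, 2, sample)
}

set.seed(1)
binary_sketch <- matrix(sample(0:1, 100, replace = TRUE), ncol = 5)
ref_bin <- ref_dist_bin_sketch(binary_sketch)
# Column totals are unchanged by the permutation.
print(rbind(original = colSums(binary_sketch), permuted = colSums(ref_bin)))
```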
Generate PCA-based reference distribution
Description
Generates a reference distribution in PCA space by sampling from uniform distributions with ranges determined by the PCA-transformed data.
Usage
ref_dist_pca(df)
Arguments
| df | data.frame or matrix of the original dataset | 
Details
Generate Reference Distribution using PCA
Value
A scaled matrix containing the reference distribution in PCA space
Examples
data(iris)
df <- iris[,1:4]
ref <- ref_dist_pca(df)
Subsampling-based Hierarchical Clustering
Description
Subsampling-based Hierarchical Clustering
Usage
scheme_sub_hc(
  df,
  nk,
  B = 10,
  cut_ratio = 0.5,
  hc.method = "ward.D",
  dist_method = "euclidean"
)
Arguments
| df | data.frame or matrix of the dataset | 
| nk | number of clusters | 
| B | number of bootstrap samples | 
| cut_ratio | ratio for subsampling (default: 0.5) | 
| hc.method | hierarchical clustering method (default: "ward.D") | 
| dist_method | distance method (default: "euclidean") | 
Value
List containing clustering results and stability measures
Subsampling-based K-means Clustering
Description
Subsampling-based K-means Clustering
Usage
scheme_sub_km(df, nk, B = 10, cut_ratio = 0.5, r = 5)
Arguments
| df | data.frame or matrix of the dataset | 
| nk | number of clusters | 
| B | number of bootstrap samples | 
| cut_ratio | ratio for subsampling (default: 0.5) | 
| r | number of k-means runs (default: 5) | 
Value
List containing clustering results and stability measures
Subsampling-based Spectral Clustering
Description
Subsampling-based Spectral Clustering
Usage
scheme_sub_spectral(df, nk, B = 10, cut_ratio = 0.5)
Arguments
| df | data.frame or matrix of the dataset | 
| nk | number of clusters | 
| B | number of bootstrap samples | 
| cut_ratio | ratio for subsampling (default: 0.5) | 
Value
List containing clustering results and stability measures
Estimate clustering stability of k-means
Description
Estimate of k-means bootstrapping stability
Usage
stability(x, k, B = 20, r = 5, scheme_2 = TRUE)
Arguments
| x | a data.frame or matrix of the data set | 
| k | an integer giving the number of clusters | 
| B | number of bootstrap re-samplings | 
| r | number of runs of k-means | 
| scheme_2 | logical; TRUE uses Scheme 2 and FALSE uses Scheme 1 (see Details) | 
Details
This function estimates the clustering stability through a bootstrapping approach. Two schemes are provided. Scheme 1 uses the clustering of the original data as the reference for stability calculations. Scheme 2 searches across the clustering samples for the one that gives the most stable clustering.
Value
- membership
- a vector of membership for each observation from the reference clustering
- obs_wise
- vector of estimated observation-wise stability
- overall
- numeric, estimated overall stability
Author(s)
Han Yu
References
Bootstrapping estimates of stability for clusters, observations and model selection. Han Yu, Brian Chapman, Arianna DiFlorio, Ellen Eischen, David Gotz, Matthews Jacob and Rachael Hageman Blair.
Examples
 
set.seed(1)
data(wine)
x0 <- wine[,2:14]
x <- scale(x0)
stability(x, k = 3, B=20, r=5, scheme_2 = TRUE)
Estimate of the overall Jaccard stability
Description
Estimate of the overall Jaccard stability
Arguments
| data.input | a data.frame of the data set | 
| threshold.seq | a sequence of candidate thresholds | 
| B | number of bootstrap re-samplings | 
| cor.method | the correlation method applied to the data set, for example 'pearson' as in the Examples | 
| large.size | the smallest set of modules; the Examples use large.size = 0 | 
| PermuNo | number of random graphs for the estimation of expected stability | 
| no_cores | number of cores to use | 
Details
threshold.select is used to estimate the overall Jaccard stability from 
a sequence of given threshold candidates, threshold.seq.
Value
- stabilityresult
- a list of results for node-wise stability
- modularityresult
- a list of modularity information for each candidate threshold
- jaccardresult
- a list of the estimated unconditional observed stability and the estimates of expected stability under the null
- originalinformation
- a list of information for the original data, igraph object and adjacency matrix constructed with each candidate threshold
- threshold.seq
- a list of the candidate thresholds given to the function
Author(s)
Mingmei Tian
References
A framework for stability-based module detection in correlation graphs. Mingmei Tian, Rachael Hageman Blair, Lina Mu, Matthew Bonner, Richard Browne and Han Yu.
Examples
set.seed(1)
data(wine)
x0 <- wine[1:50,]
mytest<-threshold.select(data.input=x0,threshold.seq=seq(0.5,0.8,by=0.05), B=20, 
cor.method='pearson',large.size=0,
PermuNo = 10,
no_cores=1,
scheme_2 = FALSE)
Wine Data Set
Description
These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.
Usage
data(wine)
Format
The data set wine contains a data.frame of 14 variables. The first variable is the type
of wine. The other 13 variables are quantities of the constituents.
References
https://archive.ics.uci.edu/ml/datasets/wine