Title: | Cluster-Based Multiple Comparisons |
---|---|
Description: | Multiple comparison techniques are typically applied following an F test from an ANOVA to decide which means are significantly different from one another. As an alternative to traditional methods, cluster analysis can be performed to group the means of different treatments into non-overlapping clusters. Treatments in different groups are considered statistically different. Several approaches have been proposed, with varying clustering methods and cut-off criteria. This package implements cluster-based multiple comparisons tests and also provides a visual representation in the form of a dendrogram. Di Rienzo, J. A., Guzman, A. W., & Casanoves, F. (2002) <jstor.org/stable/1400690>. Bautista, M. G., Smith, D. W., & Steiner, R. L. (1997) <doi:10.2307/1400402>. |
Authors: | Santiago Garcia Sanchez [aut, cre, cph] |
Maintainer: | Santiago Garcia Sanchez <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.1 |
Built: | 2025-02-24 04:48:37 UTC |
Source: | https://github.com/sgs2000/clustmc |
Includes the volumes (ml) of 85 loaves of bread made under controlled conditions from 100-gram batches of dough made with 17 different varieties of wheat flour and 5 levels of potassium bromate (mg).
bread
A tibble with 85 rows and 3 columns:
a factor indicating the variety of flour used.
a number denoting the amount of potassium bromate used (milligrams).
a number denoting the volume of the loaf made under each condition (milliliters).
Data from a bread-baking experiment by Larmour (1941), later reproduced by Scheffe (1959) and then used by Duncan (1965) to contrast different multiple comparison methods. Jolliffe (1975) used this dataset to illustrate his cluster-based test.
Larmour, R. K. (1941). A comparison of hard red spring and hard red winter wheats. Cereal Chemistry, 18(6), 778-789. Available at: https://archive.org/details/sim_cereal-chemistry_1941-11_18_6
Duncan, D. B. (1965). A Bayesian approach to multiple comparisons. Technometrics, 7(2), 171-222. doi:10.2307/1266670
Jolliffe, I. T. (1975). Cluster analysis as a multiple comparison method. Applied Statistics: Proceedings of Conference at Dalhousie University, Halifax, 159-168.
Scheffe, H. (1959). The analysis of variance. Wiley-Interscience Publication.
data(bread)
summary(bread)
Bautista, Smith and Steiner (BSS) test for multiple comparisons. Implements a procedure for grouping treatments following the determination of differences among them. First, a cluster analysis of the treatment means is performed and the two closest means are grouped. A nested analysis of variance from the original ANOVA is then constructed with the treatment source now partitioned into "groups" and "treatments within groups". This process is repeated until there are no differences among the group means or there are differences among the treatments within groups.
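The first pass of the procedure described above can be sketched in base R. This is only an illustration, not the package's implementation: the variable names are hypothetical, and testing both partitioned sources against the error mean square of the original ANOVA is an assumption made here for simplicity.

```r
# Illustrative sketch of one merge-and-partition step; NOT the clustmc code.
data("PlantGrowth")
y <- PlantGrowth$weight
trt <- PlantGrowth$group

# Treatment means and the error term of the original one-way ANOVA
means <- tapply(y, trt, mean)
n_trt <- tapply(y, trt, length)
fit <- aov(y ~ trt)
df_error <- df.residual(fit)
mse <- deviance(fit) / df_error

# Step 1: find the two closest treatment means and merge them
d <- as.matrix(dist(means))
diag(d) <- Inf
closest <- which(d == min(d), arr.ind = TRUE)[1, ]
merged <- names(means)[closest]

# Step 2: relabel observations so the merged treatments share one group
grp <- factor(ifelse(trt %in% merged,
                     paste(merged, collapse = "+"),
                     as.character(trt)))

# Step 3: partition the treatment sum of squares into
# "groups" and "treatments within groups"
ss_trt <- sum(n_trt * (means - mean(y))^2)
ss_groups <- sum(tapply(y, grp, length) * (tapply(y, grp, mean) - mean(y))^2)
ss_within <- ss_trt - ss_groups
df_groups <- nlevels(grp) - 1
df_within <- nlevels(trt) - nlevels(grp)

# Step 4: F tests for each source (assumed here to use the original MSE)
p_groups <- pf((ss_groups / df_groups) / mse, df_groups, df_error,
               lower.tail = FALSE)
p_within <- pf((ss_within / df_within) / mse, df_within, df_error,
               lower.tail = FALSE)
```

In the full procedure this step would be repeated, merging the next closest means, until the stopping rule stated above is met.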
bss_test(
  y,
  trt,
  alpha = 0.05,
  show_plot = TRUE,
  console = TRUE,
  abline_options,
  ...
)
y |
Either a model (created with |
trt |
If |
alpha |
Numeric value corresponding to the significance level of the test. The default value is 0.05. |
show_plot |
Logical value indicating whether the constructed dendrogram should be plotted or not. |
console |
Logical value indicating whether the results should be printed on the console or not. |
abline_options |
|
... |
Optional arguments for the |
A list with three data.frame objects and one hclust object:
stats |
|
groups |
|
parameters |
|
dendrogram_data |
object of class |
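A short sketch of how the returned list might be inspected, assuming the clustmc package is installed and that the element names match the table above. The guard with requireNamespace() is only there so the snippet does nothing when the package is unavailable.

```r
# Sketch: inspect the elements of the list returned by bss_test().
# Assumes clustmc is installed; otherwise `res` stays NULL.
res <- NULL
if (requireNamespace("clustmc", quietly = TRUE)) {
  data("PlantGrowth")
  res <- clustmc::bss_test(
    y = PlantGrowth$weight,
    trt = PlantGrowth$group,
    show_plot = FALSE,
    console = FALSE
  )
  print(res$groups)            # group assignment of each treatment
  print(res$parameters)        # settings used for the test
  plot(res$dendrogram_data)    # hclust object, drawn with base graphics
}
```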
Santiago Garcia Sanchez
Bautista, M. G., Smith, D. W., & Steiner, R. L. (1997). A Cluster-Based Approach to Means Separation. Journal of Agricultural, Biological, and Environmental Statistics, 2(2), 179-197. doi:10.2307/1400402
data("PlantGrowth")

# Using vectors -------------------------------------------------------
weights <- PlantGrowth$weight
treatments <- PlantGrowth$group
bss_test(y = weights, trt = treatments, show_plot = FALSE)

# Using a model -------------------------------------------------------
model <- lm(weights ~ treatments)
bss_test(y = model, trt = "treatments", show_plot = FALSE)
Includes the nitrogen content (mg) of 30 red clover plants inoculated with one of four single-strain cultures of Rhizobium trifolii or a composite of five Rhizobium meliloti strains, resulting in six treatments in total.
clover
A tibble with 30 rows and 2 columns:
a factor denoting the treatment applied to each plant.
a number denoting the nitrogen content of each plant (milligrams).
Data originally from an experiment by Erdman (1946), conducted in a greenhouse using a completely random design. The current dataset was presented by Steel and Torrie (1980) and later used by Bautista et al. (1997) to illustrate their proposed procedure.
Steel, R., & Torrie, J. (1980). Principles and procedures of statistics: A biometrical approach (2nd ed.). San Francisco: McGraw-Hill. Available at: https://archive.org/details/principlesproce00stee
Bautista, M. G., Smith, D. W., & Steiner, R. L. (1997). A Cluster-Based Approach to Means Separation. Journal of Agricultural, Biological, and Environmental Statistics, 2(2), 179-197. doi:10.2307/1400402
Erdman, L. W. (1946). Studies to determine if antibiosis occurs among rhizobia. Journal of the American Society of Agronomy, 38, 251-258. doi:10.2134/agronj1946.00021962003800030005x
data(clover)
summary(clover)
Di Rienzo, Guzman and Casanoves (DGC) test for multiple comparisons.
Implements a cluster-based method for identifying groups of nonhomogeneous
means. Average linkage clustering is applied to a distance matrix obtained
from the sample means. The distribution of the root node distance (the
distance between the source and the root node of the tree) is used to build
a test with a significance level of alpha. Groups whose means join above the
alpha-level cut-off criterion are statistically different.
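The clustering step can be sketched with base R's hclust() and cutree(). This is a simplified illustration, not the package's implementation: the distance matrix is built from the raw means (the published method may scale it differently), and the cut-off value below is a hypothetical placeholder rather than a value derived from the root node distance distribution.

```r
# Illustrative sketch of average-linkage clustering of treatment means.
# NOT the clustmc code; `cutoff` is a made-up value for demonstration.
data("PlantGrowth")
means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)

# Distance matrix from the sample means (raw differences, an assumption)
tree <- hclust(dist(means), method = "average")

# Height at which the last two clusters join (the root of the tree)
root_height <- max(tree$height)

# Treatments that join above the cut-off fall into different groups
cutoff <- 0.5  # hypothetical cut-off criterion
groups <- cutree(tree, h = cutoff)
print(groups)
```

Calling plot(tree) followed by abline(h = cutoff) draws the dendrogram with the cut-off line, which is similar in spirit to the plot the package produces.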
dgc_test(
  y,
  trt,
  alpha = 0.05,
  show_plot = TRUE,
  console = TRUE,
  abline_options,
  ...
)
y |
Either a model (created with |
trt |
If |
alpha |
Numeric value, either 0.05 or 0.01, corresponding to the significance level of the test. The default value is 0.05. |
show_plot |
Logical value indicating whether the constructed dendrogram should be plotted or not. |
console |
Logical value indicating whether the results should be printed on the console or not. |
abline_options |
|
... |
Optional arguments for the |
A list with three data.frame objects and one hclust object:
stats |
|
groups |
|
parameters |
|
dendrogram_data |
object of class |
Santiago Garcia Sanchez
Di Rienzo, J. A., Guzman, A. W., & Casanoves, F. (2002). A Multiple-Comparisons Method Based on the Distribution of the Root Node Distance of a Binary Tree. Journal of Agricultural, Biological, and Environmental Statistics, 7(2), 129-142. <jstor.org/stable/1400690>
data("PlantGrowth")

# Using vectors -------------------------------------------------------
weights <- PlantGrowth$weight
treatments <- PlantGrowth$group
dgc_test(y = weights, trt = treatments, show_plot = FALSE)

# Using a model -------------------------------------------------------
model <- lm(weights ~ treatments)
dgc_test(y = model, trt = "treatments", show_plot = FALSE)
I.T. Jolliffe test for multiple comparisons.
Implements a cluster-based alternative closely linked to the
Student-Newman-Keuls (SNK) multiple comparison method. Single-linkage cluster
analysis is applied, using the p-values obtained with the SNK test for
pairwise mean comparison as a similarity measure. Groups whose means join
beyond the cut-off determined by alpha are statistically different.
Complete-linkage cluster analysis can be applied instead.
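The idea of clustering on pairwise p-values can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: it assumes a balanced design, computes SNK-style p-values with ptukey() using the number of ordered means spanned by each pair, converts similarity to distance as 1 - p, and cuts the tree at 1 - alpha.

```r
# Illustrative sketch of p-value-based single-linkage clustering.
# NOT the clustmc code; the 1 - p distance and 1 - alpha cut are assumptions.
data("PlantGrowth")
y <- PlantGrowth$weight
trt <- PlantGrowth$group
alpha <- 0.1

fit <- aov(y ~ trt)
df_error <- df.residual(fit)
mse <- deviance(fit) / df_error
n <- length(y) / nlevels(trt)       # replicates per treatment (balanced)
means <- sort(tapply(y, trt, mean)) # ordered treatment means
k <- length(means)

# Pairwise p-values from the studentized range distribution
p <- matrix(1, k, k, dimnames = list(names(means), names(means)))
for (i in seq_len(k - 1)) {
  for (j in (i + 1):k) {
    q <- (means[[j]] - means[[i]]) / sqrt(mse / n)
    span <- j - i + 1               # number of ordered means spanned
    p[i, j] <- p[j, i] <- ptukey(q, nmeans = span, df = df_error,
                                 lower.tail = FALSE)
  }
}

# p-values act as similarities, so 1 - p is used as the distance
tree <- hclust(as.dist(1 - p), method = "single")
groups <- cutree(tree, h = 1 - alpha)
print(groups)
```

Replacing method = "single" with method = "complete" in hclust() gives the complete-linkage variant mentioned above.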
jolliffe_test(
  y,
  trt,
  alpha = 0.05,
  method = "single",
  show_plot = TRUE,
  console = TRUE,
  abline_options,
  ...
)
y |
Either a model (created with |
trt |
If |
alpha |
Numeric value corresponding to the significance level of the test. The default value is 0.05. |
method |
|
show_plot |
Logical value indicating whether the constructed dendrogram should be plotted or not. |
console |
Logical value indicating whether the results should be printed on the console or not. |
abline_options |
|
... |
Optional arguments for the |
A list with three data.frame objects and one hclust object:
stats |
|
groups |
|
parameters |
|
dendrogram_data |
object of class |
Santiago Garcia Sanchez
Jolliffe, I. T. (1975). Cluster analysis as a multiple comparison method. Applied Statistics: Proceedings of Conference at Dalhousie University, Halifax, 159-168.
data("PlantGrowth")

# Using vectors -------------------------------------------------------
weights <- PlantGrowth$weight
treatments <- PlantGrowth$group
jolliffe_test(y = weights, trt = treatments, alpha = 0.1, show_plot = FALSE)

# Using a model -------------------------------------------------------
model <- lm(weights ~ treatments)
jolliffe_test(y = model, trt = "treatments", alpha = 0.1, show_plot = FALSE)