5 Statistics in R
A lot of the helpful information used here was inspired by a more detailed site that can be found here
And please take note that the writer of this tutorial is no statistician, so if you spot issues feel free to let me know!
5.1 Summary stats
To do some basic stats we use the nutrient growth data set, which you find in the R_exercises folder.
mean()
= print the mean of a certain column
summary()
= print the mean, median, max, min, … of a certain column.
#get the means of our root length
mean(growth_data$Rootlength)
[1] 9.046117
#return summary stats for the root length
summary(growth_data$Rootlength)
Min. 1st Qu. Median Mean 3rd Qu. Max.
4.236 6.721 8.099 9.046 11.793 14.716
We can also summarize data of several columns
#return summary stats for the root length and the shoot fresh weight
summary(growth_data[,c("Rootlength", "FW_shoot_mg")])
Rootlength FW_shoot_mg
Min. : 4.236 Min. : 3.75
1st Qu.: 6.721 1st Qu.: 7.69
Median : 8.099 Median :11.62
Mean : 9.046 Mean :15.23
3rd Qu.:11.793 3rd Qu.:22.05
Max. :14.716 Max. :39.39
We can also run summary() on our complete dataframe.
#run summary on all our data
summary(growth_data)
SampleID Nutrient Condition FW_shoot_mg
Length:105 Length:105 Length:105 Min. : 3.75
Class :character Class :character Class :character 1st Qu.: 7.69
Mode :character Mode :character Mode :character Median :11.62
Mean :15.23
3rd Qu.:22.05
Max. :39.39
Rootlength
Min. : 4.236
1st Qu.: 6.721
Median : 8.099
Mean : 9.046
3rd Qu.:11.793
Max. :14.716
This way we not only get summary stats for our numeric values, but also some basic information (length, class, mode) for the character columns.
5.2 The table command
The function table builds a contingency table of the counts at each factor level, or combination of factor levels. This is quite useful to count the number of data points with certain metadata associated. So for example for our growth data we can ask how many measurements we made for each sample ID.
#summarize how many measurements we have for each treatment
table(growth_data$SampleID)
P P_101 P_230 P_28 noP noP_101 noP_230 noP_28
15 12 13 13 13 13 11 15
Here, we see that we have a slightly different number of measurements for each condition and timepoint. For this tutorial we can ignore this, but large differences might be relevant for some statistical analyses. For larger datasets, this approach allows you to easily check for samples that might be outliers in terms of the number of measurements per sample.
5.3 The ddply command
The table command can get rather slow for large datasets, and there are some useful packages to speed things up and run more complicated operations. One example is the ddply()
function of the plyr package. A useful feature of ddply() is that it stores the output as a dataframe instead of a table.
Below we see examples where we summarize the root length data across different nutrient conditions. I.e. we do not want the mean of all root lengths, but want to see whether roots have different lengths under low and high P conditions.
#load our package
library(plyr)
#calculate the mean root length across our nutrient conditions
growth_data_summarized <- ddply(growth_data, .(Nutrient), summarize, Mean_RootLength = mean(Rootlength))
#view data
head(growth_data_summarized)
The structure of the command is as follows:
ddply(Input_Data, .(Column_we_want_to_summarize_across), summarize, New_Column_name = mean(Column_we_want_to_calc_the_mean_for))
We can also summarize the data combining different conditions (i.e. nutrient and condition).
#Summarize our data across Nutrients and Conditions
growth_data_summarized <- ddply(growth_data, .(Nutrient, Condition), summarize, Mean_RootLength = mean(Rootlength))
#view data
head(growth_data_summarized)
We can also calculate the mean, sd and se in one line of code
#and we can use even fancier functions (i.e. get the se and sd), check the plyr package for more details
growth_data_summarized <- ddply(growth_data, .(Nutrient, Condition), summarize, RootLength = mean(Rootlength), sd = sd(Rootlength), se = sd(Rootlength) / sqrt(length(Rootlength)))
#view data
head(growth_data_summarized)
5.4 tidyr
tidyr is another package to summarize data but also to transform data. For now we will just discuss the basics of summarizing data, but we will try to add an extended section on this package later.
You notice here that the syntax is a bit unusual and we use the %>%
symbol (a so-called forward pipe operator). This symbol is commonly used in the dplyr and tidyr packages. It passes the left-hand side of the operator to the first argument of the function on the right-hand side.
In the following examples, the growth_data dataframe gets passed as the first argument to count()
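As a minimal toy illustration of how the pipe works (using the magrittr package, which provides %>% and is loaded automatically with dplyr/tidyverse; the vector here is made up for illustration):

```r
#load the package that provides %>% (also comes with dplyr/tidyverse)
library(magrittr)

#toy vector, just for illustration
x <- c(1, 4, 9)

#the pipe passes the left-hand side as the first argument of the function:
#x %>% sqrt() is the same as sqrt(x)
x %>% sqrt()
identical(x %>% sqrt(), sqrt(x))
```

The same idea applies below: growth_data %>% count(...) is just count(growth_data, ...).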
New functions:
count()
= a function of the dplyr package that counts the number of rows for each combination of the columns we specify (i.e. roughly, the columns we want to keep).
summarize()
= creates a new data frame with one (or more) rows for each combination of grouping variables; if there are no grouping variables, the output will have a single row summarising all observations in the input. It will contain one column for each grouping variable and one column for each of the summary statistics that you have specified.
mean()
= calculate the arithmetic mean.
#read in package
library(tidyverse)
#count how many measurements we have per conditions
growth_data %>% count(Condition, sort = FALSE)
#count how many measurements we have per nutrient and conditions
growth_data %>% count(Nutrient, Condition, sort = FALSE)
#get more comprehensive data stats and summarize for the whole dataset
growth_data %>% summarise(
  count = n(),
  mean_root = mean(Rootlength, na.rm = TRUE))
Good things about tidyr are:
- it is extremely fast, which is important when dealing with larger datasets
- we can combine several commands by using the pipe
I.e. below we first group the input data and then summarize the grouped data.
#get more comprehensive data stats and summarize and grouping for the different conditions
growth_data %>%
  group_by(Nutrient, Condition) %>%
  summarise(
    count = n(),
    mean_root = mean(Rootlength, na.rm = TRUE), .groups = 'drop')
We can also do other stats than just calculating the mean:
#get more comprehensive data stats and summarize and grouping for the different conditions
growth_data %>%
  group_by(Nutrient, Condition) %>%
  summarise(
    count = n(),
    mean_root = mean(Rootlength, na.rm = TRUE),
    sd_root = sd(Rootlength), .groups = 'drop')
Other useful summary functions:
mean(x)
: sum of x divided by the length
median(x)
: 50% of x is above and 50% is below
sd(x)
: standard deviation
IQR(x)
: interquartile range (robust equivalent of sd when outliers are present in the data)
mad(x)
: median absolute deviation (robust equivalent of sd when outliers are present in the data)
min(x)
: minimum value of x
max(x)
: maximum value of x
quantile(x, 0.25)
: 25% of x is below this value
first(x)
: equivalent to x[1]
nth(x, 2)
: equivalent to n <- 2; x[n]
last(x)
: equivalent to x[length(x)]
n()
: the number of elements in the current group
sum(!is.na(x))
: count non-missing values
n_distinct(x)
: count the number of unique values
sum(x > 10)
: count the number of elements where x > 10
mean(y == 0)
: proportion of elements where y = 0
5.5 More stats: tapply
tapply() is a base R function that applies a function to each cell of a ragged array, that is, to each (non-empty) group of values given by a unique combination of the levels of certain factors. It can be used as an alternative to ddply().
Suppose now we want to estimate the mean root length for each growth condition. Notice how we can vary the index.
#index = factors we want to select
tapply(X = growth_data$Rootlength, INDEX = growth_data$Nutrient, FUN = mean,na.rm = TRUE)
P noP
11.513931 6.530845
#same but for two-way tables (this is not so useful here, but might be handy when you have different conditions for the same organism or a timecourse)
tapply(growth_data$Rootlength, INDEX = list(growth_data$Nutrient, growth_data$Condition),FUN = mean, na.rm = TRUE)
MgCl Strain101 Strain230 Strain28
P 9.853457 12.729133 11.814678 12.007390
noP 5.805198 7.094616 7.172883 6.200309
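tapply() is not limited to mean(); any summary function can be plugged in via the FUN argument. A toy sketch (made-up numbers, not our growth data):

```r
#toy data: lengths for two groups
len <- c(5.1, 6.2, 7.3, 11.0, 12.5, 11.8)
grp <- c("noP", "noP", "noP", "P", "P", "P")

#FUN = length reproduces the counts that table(grp) would give
tapply(len, grp, FUN = length)

#FUN = sd gives the group-wise standard deviation
tapply(len, grp, FUN = sd)
```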
5.6 Linear models
5.6.1 Basics
It is very simple to investigate linear relationships among variables in R. We want to estimate how a quantitative dependent variable changes according to the levels of one or more independent variables. In the command below, the linear relationship between Rootlength (the dependent variable, i.e. the one we want to explain) and FW_shoot_mg (the independent variable) is estimated.
We need to use the function summary to see the results of that command; coef extracts the best-fit coefficients, anova performs an analysis of variance; there are many other extractor functions.
In this case let's work with the data where we do not have to deal with the different time points, to simplify things.
#read in data where we have several measurements
growth_data <- read.table("../data/Growth_Data.txt", sep="\t", header=T, fill=TRUE, quote = "")
#is there a correlation between freshweight and root length?
linearmodel <- lm(Rootlength ~ FW_shoot_mg, data = growth_data)
linearmodel
Call:
lm(formula = Rootlength ~ FW_shoot_mg, data = growth_data)
Coefficients:
(Intercept) FW_shoot_mg
5.2912 0.2465
#let's extract the entire t-table
summary(linearmodel)
Call:
lm(formula = Rootlength ~ FW_shoot_mg, data = growth_data)
Residuals:
Min 1Q Median 3Q Max
-4.5263 -0.9978 -0.0181 1.0094 3.9292
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.29124 0.30469 17.37 <2e-16 ***
FW_shoot_mg 0.24648 0.01712 14.40 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.615 on 103 degrees of freedom
Multiple R-squared: 0.6681, Adjusted R-squared: 0.6649
F-statistic: 207.4 on 1 and 103 DF, p-value: < 2.2e-16
Here, one of the values is the model p-value (bottom last line), and another is the p-value of the individual predictor variables (extreme right column, under Pr(>|t|)).
- Residuals: This section summarizes the residuals, the error between the prediction of the model and the actual results. Smaller residuals are better.
- Coefficients: For each variable and the intercept, a weight is produced and that weight has other attributes like the standard error, a t-value and significance.
- Estimate: This is the weight given to the variable. In the simple regression case (one variable plus the intercept), for every unit increase in shoot fresh weight, the model predicts an increase in root length of 0.24.
- Std. Error: Tells you how precisely the estimate was measured. It's really only useful for calculating the t-value.
- t-value and Pr(>|t|): The t-value is calculated by taking the coefficient divided by the Std. Error. It is then used to test whether or not the coefficient is significantly different from zero. If it isn't significant, then the coefficient really isn't adding anything to the model and could be dropped or investigated further. Pr(>|t|) is the significance level.
Performance Measures:
Three sets of measurements are provided.
- Residual Standard Error: This is the standard deviation of the residuals. Smaller is better.
- Multiple / Adjusted R-squared: For one variable, the distinction doesn't really matter. R-squared shows the amount of variance explained by the model. Adjusted R-squared takes into account the number of variables and is most useful for multiple regression.
- F-Statistic: The F-test checks if at least one variable's weight is significantly different from zero. This is a global test to help assess a model. If the p-value is not significant (e.g. greater than 0.05) then your model is essentially not doing anything.
We also can just print parts of the data:
#print only the coefficients
coef(summary(linearmodel))
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.2912436 0.30469198 17.36588 2.278273e-32
FW_shoot_mg 0.2464783 0.01711628 14.40023 2.039703e-26
#print only the anova stats
anova(linearmodel)
#plot the data and add the best-fit line
with(growth_data, plot(Rootlength ~ FW_shoot_mg, col = 2))
abline(linearmodel)
If we look at the stats and the p value we see a nice correlation but also that we have two distinct clusters as well as more spread in the cluster that is more to the right. These clusters likely are the two different nutrient conditions and sometimes it might make sense to separate data to get a clearer picture. Something else to consider is to ask whether the data is normally distributed and based on that what statistical test to choose.
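A fitted model can also be used to predict new values with predict(). A minimal sketch on toy data (a perfectly linear relationship, so the coefficients come out exact; the numbers are made up, not our growth data):

```r
#toy data: y = 3 + 2x exactly
d <- data.frame(x = 1:10, y = 3 + 2 * (1:10))
fit <- lm(y ~ x, data = d)

coef(fit)                                  #intercept 3, slope 2
predict(fit, newdata = data.frame(x = 11)) #3 + 2 * 11 = 25
```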
5.6.2 Analysing residuals
Anyone can fit a linear model in R. The real test is analyzing the residuals (the error or the difference between actual and predicted results).
There are four things we are looking for when analyzing residuals.
- The mean of the errors is zero (and the sum of the errors is zero)
- The distribution of the errors are normal.
- All of the errors are independent.
- Variance of errors is constant (Homoscedastic)
In R, you pull out the residuals by referencing the model and then the resid variable inside the model. Below we use the simple linear regression model (stored as simple.fit) to plot and inspect them.
Below some examples:
simple.fit <- linearmodel
layout(matrix(c(1,1,2,3),2,2,byrow=T))
#Rootlength x Residuals Plot
plot(simple.fit$resid ~ growth_data$Rootlength,
     main="Rootlength x Residuals\nfor Simple Regression",
     xlab="Rootlength", ylab="Residuals")
abline(h=0,lty=2)
#Histogram of Residuals
hist(simple.fit$resid, main="Histogram of Residuals",
ylab="Residuals")
#Q-Q Plot
qqnorm(simple.fit$resid)
qqline(simple.fit$resid)
The histogram and Q-Q plot are the ways to visually evaluate whether the residuals fit a normal distribution.
- If the histogram looks like a bell-curve it might be normally distributed.
- If the QQ-plot has the vast majority of points on or very near the line, the residuals may be normally distributed.
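As a formal complement to these visual checks, the Shapiro-Wilk test (covered in section 5.7.3 below) can also be run directly on the residuals; a toy sketch on simulated data:

```r
#toy data: linear trend plus normal noise (seed set for reproducibility)
set.seed(1)
d <- data.frame(x = 1:30)
d$y <- d$x + rnorm(30)
fit <- lm(y ~ x, data = d)

#a large p-value means no evidence against normality of the residuals
shapiro.test(resid(fit))
```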
5.7 Normal distribution
5.7.1 Visualize our data via density plots
There are different ways to visualize this, one example is ggdensity of the ggpubr package.
library("ggpubr")
Attaching package: 'ggpubr'
The following object is masked from 'package:plyr':
mutate
#is the data for my different variables normally distributed
ggdensity(growth_data$Rootlength)
We see nicely that we have two tails that likely represent the two nutrient conditions. To test this, we can simply subset the data as we have done before.
ggdensity(growth_data, x = "Rootlength",
add = "mean", rug = TRUE,
color = "Nutrient", palette = c("#00AFBB", "#E7B800"))
Now we see that the two tails indeed seem to be due to our two nutrient conditions.
5.7.2 Visualize our data via Q-Q plots
Another way to represent the data is a Q-Q plot: a Q-Q plot (or quantile-quantile plot) draws the correlation between a given sample and the normal distribution. A 45-degree reference line is also plotted.
#plot all data
ggqqplot(growth_data$Rootlength)
#plot by group
ggqqplot(growth_data, x = "Rootlength",
color = "Nutrient", palette = c("#00AFBB", "#E7B800"))
Again, here we see that our data for the individual growth conditions fit quite nicely into a normal distribution.
5.7.3 Test for normality
#for all data
shapiro.test(growth_data$Rootlength)
Shapiro-Wilk normality test
data: growth_data$Rootlength
W = 0.91773, p-value = 6.744e-06
#separate nutrient conditions
noP_data <- growth_data[growth_data$Nutrient == "noP", ]
#test for noP only
shapiro.test(noP_data$Rootlength)
Shapiro-Wilk normality test
data: noP_data$Rootlength
W = 0.96403, p-value = 0.1171
We can use dplyr and the dlookr package for doing a group-wise comparison.
library(dplyr)
library(dlookr)
Attaching package: 'dlookr'
The following object is masked from 'package:tidyr':
extract
The following object is masked from 'package:base':
transform
growth_data %>%
  group_by(Nutrient) %>%
  normality()
The shapiro.test tests the null hypothesis that the samples came from a normal distribution. This means that if your p-value is <= 0.05, you would reject the null hypothesis that the samples came from a normal distribution.
From the output, the p-value < 0.05 for our complete dataset implies that the distribution of the data is significantly different from a normal distribution. In other words, we can not assume normality. However, we expect quite some differences depending on the growth pattern, and once we only look at our low P data we see that our data indeed is normally distributed.
Notice: Shapiro-Wilk only works for sample sizes between 3 and 5000, since the more data you feed it, the larger the chance of the null hypothesis being rejected becomes. An alternative is the Anderson-Darling test, which however has a similar problem as the Shapiro-Wilk test: for large samples, you are most likely to reject the null hypothesis, so be aware of this.
library(nortest)
ad.test(noP_data$Rootlength)
Anderson-Darling normality test
data: noP_data$Rootlength
A = 0.58838, p-value = 0.1187
5.8 ANOVA
When we visualize the data, we see that there is a difference between the nutrient conditions but we want to know whether it is significant and more importantly, whether there is also a difference based on our treatments with different strains of microbes.
#lets visually compare our data with ggpubr again
library("ggpubr")
ggboxplot(growth_data, x = "Condition", y = "Rootlength", color = "Nutrient",
palette = c("#00AFBB", "#E7B800"))
# We want to know whether root length depends on nutrient treatment
aov <- aov(Rootlength ~ Nutrient, data = growth_data)
summary(aov)
Df Sum Sq Mean Sq F value Pr(>F)
Nutrient 1 651.8 651.8 425 <2e-16 ***
Residuals 103 157.9 1.5
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Here, we see that there are significant differences based on our nutrient treatments. Now let's see how we can look at both the nutrient treatment and the growth conditions.
# We want to know if root length depends on condition and nutrient
aov <- aov(Rootlength ~ Nutrient + Condition, data = growth_data)
summary(aov)
Df Sum Sq Mean Sq F value Pr(>F)
Nutrient 1 651.8 651.8 712.58 < 2e-16 ***
Condition 3 66.5 22.2 24.23 7.27e-12 ***
Residuals 100 91.5 0.9
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the ANOVA table we can conclude that both nutrient condition and treatment are statistically significant. Nutrient treatment is the most significant factor variable.
Note that the above fitted model is called an additive model. It makes the assumption that the two factor variables are independent. If you think that these two variables might interact to create a synergistic effect, replace the plus symbol (+) with an asterisk (*), as follows.
# Two-way ANOVA with interaction effect
# These two calls are equivalent
aov <- aov(Rootlength ~ Nutrient * Condition, data = growth_data)
aov <- aov(Rootlength ~ Nutrient + Condition + Nutrient:Condition, data = growth_data)
#summarize the aov
summary(aov)
Df Sum Sq Mean Sq F value Pr(>F)
Nutrient 1 651.8 651.8 817.131 < 2e-16 ***
Condition 3 66.5 22.2 27.780 4.72e-13 ***
Nutrient:Condition 3 14.1 4.7 5.891 0.000976 ***
Residuals 97 77.4 0.8
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
It can be seen that the two main effects (Nutrient and Condition) are statistically significant, as well as their interaction.
Note that, in the situation where the interaction is not significant you should use the additive model.
5.9 TUKEY
In an ANOVA test, a significant p-value indicates that some of the group means are different, but we don't know which pairs of groups differ.
As the ANOVA test is significant, we can compute Tukey HSD (Tukey Honest Significant Differences). The Tukey test is a single-step multiple comparison procedure and statistical test. It is a post-hoc analysis, which means that it is used in conjunction with an ANOVA.
#test with anova
aov <- aov(Rootlength ~ Nutrient * Condition, data = growth_data)
#run tukey
TukeyHSD(aov)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Rootlength ~ Nutrient * Condition, data = growth_data)
$Nutrient
diff lwr upr p adj
noP-P -4.983086 -5.329067 -4.637105 0
$Condition
diff lwr upr p adj
Strain101-MgCl 2.1029052 1.4604930 2.7453173 0.0000000
Strain230-MgCl 1.6836191 1.0341749 2.3330633 0.0000000
Strain28-MgCl 1.2784796 0.6545138 1.9024455 0.0000034
Strain230-Strain101 -0.4192861 -1.0864726 0.2479004 0.3597470
Strain28-Strain101 -0.8244256 -1.4668377 -0.1820134 0.0061445
Strain28-Strain230 -0.4051395 -1.0545837 0.2443048 0.3663501
$`Nutrient:Condition`
diff lwr upr p adj
noP:MgCl-P:MgCl -4.0482585 -5.0967311 -2.9997859 0.0000000
P:Strain101-P:MgCl 2.8756763 1.8040558 3.9472967 0.0000000
noP:Strain101-P:MgCl -2.7588405 -3.8073131 -1.7103679 0.0000000
P:Strain230-P:MgCl 1.9612207 0.9127482 3.0096933 0.0000023
noP:Strain230-P:MgCl -2.6805735 -3.7789218 -1.5822252 0.0000000
P:Strain28-P:MgCl 2.1539326 1.1054600 3.2024052 0.0000002
noP:Strain28-P:MgCl -3.6531484 -4.6634819 -2.6428149 0.0000000
P:Strain101-noP:MgCl 6.9239348 5.8162834 8.0315861 0.0000000
noP:Strain101-noP:MgCl 1.2894180 0.2041458 2.3746903 0.0087961
P:Strain230-noP:MgCl 6.0094793 4.9242070 7.0947515 0.0000000
noP:Strain230-noP:MgCl 1.3676850 0.2341551 2.5012149 0.0073003
P:Strain28-noP:MgCl 6.2021911 5.1169189 7.2874633 0.0000000
noP:Strain28-noP:MgCl 0.3951101 -0.6533625 1.4435827 0.9391499
noP:Strain101-P:Strain101 -5.6345168 -6.7421681 -4.5268654 0.0000000
P:Strain230-P:Strain101 -0.9144555 -2.0221069 0.1931958 0.1845494
noP:Strain230-P:Strain101 -5.5562498 -6.7112241 -4.4012755 0.0000000
P:Strain28-P:Strain101 -0.7217437 -1.8293950 0.3859077 0.4749256
noP:Strain28-P:Strain101 -6.5288247 -7.6004451 -5.4572042 0.0000000
P:Strain230-noP:Strain101 4.7200613 3.6347890 5.8053335 0.0000000
noP:Strain230-noP:Strain101 0.0782670 -1.0552629 1.2117969 0.9999989
P:Strain28-noP:Strain101 4.9127731 3.8275008 5.9980453 0.0000000
noP:Strain28-noP:Strain101 -0.8943079 -1.9427805 0.1541647 0.1537008
noP:Strain230-P:Strain230 -4.6417943 -5.7753242 -3.5082644 0.0000000
P:Strain28-P:Strain230 0.1927118 -0.8925604 1.2779841 0.9993212
noP:Strain28-P:Strain230 -5.6143692 -6.6628417 -4.5658966 0.0000000
P:Strain28-noP:Strain230 4.8345061 3.7009762 5.9680360 0.0000000
noP:Strain28-noP:Strain230 -0.9725749 -2.0709232 0.1257734 0.1223122
noP:Strain28-P:Strain28 -5.8070810 -6.8555536 -4.7586084 0.0000000
We can see that most differences are significant, with the exception of Strain28, which in most cases does not show an effect.
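Instead of reading the printed table, the adjusted p-values can also be extracted programmatically from the object TukeyHSD() returns; a toy one-way sketch (made-up groups, not our growth data):

```r
#toy data: groups A and B share a mean, C is shifted upwards
set.seed(2)
d <- data.frame(g = factor(rep(c("A", "B", "C"), each = 10)),
                y = c(rnorm(10, mean = 5), rnorm(10, mean = 5), rnorm(10, mean = 9)))
tk <- TukeyHSD(aov(y ~ g, data = d))

#one adjusted p-value per pairwise comparison (B-A, C-A, C-B)
tk$g[, "p adj"]
```

This is handy when you want to filter comparisons by significance level rather than scan a long table by eye.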
For some representations it is useful to plot significant letters. We can do this using some extra packages as follows:
#load library
library(agricolae)
Attaching package: 'agricolae'
The following objects are masked from 'package:dlookr':
kurtosis, skewness
#separate nutrient conditions
noP_data <- growth_data[growth_data$Nutrient == "noP", ]
#run an anova
aov_noP <- aov(Rootlength ~ Condition, data = noP_data)
#run test
HSD.test(aov_noP, "Condition", group=TRUE, console=TRUE)
Study: aov_noP ~ "Condition"
HSD Test for Rootlength
Mean Square Error: 0.606883
Condition, means
Rootlength std r Min Max
MgCl 5.805198 0.7738124 13 4.236348 6.834720
Strain101 7.094616 0.5939804 13 5.947965 7.973207
Strain230 7.172883 0.6117203 11 6.120089 8.197116
Strain28 6.200309 0.9988989 15 4.384756 7.428674
Alpha: 0.05 ; DF Error: 48
Critical Value of Studentized Range: 3.763749
Groups according to probability of means differences and alpha level( 0.05 )
Treatments with the same letter are not significantly different.
Rootlength groups
Strain230 7.172883 a
Strain101 7.094616 a
Strain28 6.200309 b
MgCl 5.805198 b
5.10 Check the homogeneity of variances
The residuals versus fits plot is used to check the homogeneity of variances. In the plot below, there is no evident relationship between residuals and fitted values (the means of each group), which is good. So, we can assume the homogeneity of variances. Only a few points (41, 58 and 77) are detected as outliers, which can severely affect normality and homogeneity of variance. It can be useful to remove outliers to meet the test assumptions.
#check for homogeneity
plot(aov, 1)
#Use Levene's test to check the homogeneity of variances.
library(car)
Loading required package: carData
Attaching package: 'car'
The following object is masked from 'package:dplyr':
recode
The following object is masked from 'package:purrr':
some
leveneTest(Rootlength ~ Nutrient * Condition, data = growth_data)
From the output above we can see that the p-value is not less than the significance level of 0.05. This means that there is no evidence to suggest that the variance across groups is statistically significantly different. Therefore, we can assume the homogeneity of variances in the different treatment groups.
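Base R also offers Bartlett's test for the same question; note that it is more sensitive to departures from normality than Levene's test. A toy sketch on simulated data:

```r
#toy data: two groups drawn from the same normal distribution
set.seed(3)
d <- data.frame(g = rep(c("A", "B"), each = 15), y = rnorm(30))

#a large p-value: no evidence that the group variances differ
bartlett.test(y ~ g, data = d)
```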
5.11 Check for normality v2
Normality plot of the residuals. In the plot below, the quantiles of the residuals are plotted against the quantiles of the normal distribution. A 45-degree reference line is also plotted. The normal probability plot of residuals is used to verify the assumption that the residuals are normally distributed.
The normal probability plot of the residuals should approximately follow a straight line, which we can see below. Again, we see the points marked as potential outliers.
## plot
plot(aov, 2)