MST 2034 Design of Experiments Assignment
This research supports Goal 3 of the Sustainable Development Goals (SDGs), Good Health and Well-being, which aims to ensure healthy lives and promote well-being for all at all ages. The dataset was obtained from kaggle.com, which offers reliable open datasets. The dataset, titled ‘Do we still need gyms?’, has 12 variables and records from 1000 people who were tested with three new medicines intended to improve physical features. To keep the model simple, only medicine 1 and medicine 2 are considered in this research; the dataset does not state which kinds of medicines were used.
The objective of this research is to investigate how initial weight and the medicines taken affect the weight difference of people aged 19 to 50.
The experimental units are the people who took part in the experiment. There is no fixed number of replications, because the number of observations differs from one treatment combination to another. The first treatment is the initial weight (in kilograms), denoted by the Greek letters (α, β, γ, and δ), and the second is the medicines taken, denoted by the Latin letters (A, B, C, and D). Each treatment therefore has four levels in the design. The minimum and maximum initial weights are 51.6 kg and 109.2 kg respectively, and the weights are grouped into four ranges: 51.6 – 75 kg (α), 75.1 – 80 kg (β), 80.1 – 85 kg (γ), and 85.1 – 109.2 kg (δ). Some ranges are much wider than others. This is deliberate: it balances the frequencies of the categories so that the groups come close to a uniform distribution, whereas equal-width intervals would make it difficult to sample in a balanced way. The other treatment is divided into only medicine 1 taken (A), only medicine 2 taken (B), both medicines taken (C), and neither medicine taken (D). Medicine 3 is also recorded in the dataset, but it is not evaluated because it would complicate the current model. The response variable is the weight difference, computed as the variable ‘KgAfter’ minus the variable ‘KgBefore’. The signed difference is used rather than its absolute value, so that the direction of the weight change is retained in the model.
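The level coding described above can be sketched in Python as follows. This is a minimal illustration rather than the code used for the analysis: the function names and the boolean medicine flags are hypothetical, and the upper bound of each weight range is assumed to be inclusive, with weights recorded to one decimal place.

```python
def weight_level(kg_before: float) -> str:
    """Map an initial weight in kg to its Greek treatment level.

    Assumes the upper bound of each range is inclusive.
    """
    if kg_before <= 75.0:
        return "α"  # 51.6 – 75 kg
    if kg_before <= 80.0:
        return "β"  # 75.1 – 80 kg
    if kg_before <= 85.0:
        return "γ"  # 80.1 – 85 kg
    return "δ"      # 85.1 – 109.2 kg


def medicine_level(took_med1: bool, took_med2: bool) -> str:
    """Map the two medicine flags (hypothetical inputs) to the Latin level."""
    if took_med1 and took_med2:
        return "C"  # both medicines taken
    if took_med1:
        return "A"  # only medicine 1 taken
    if took_med2:
        return "B"  # only medicine 2 taken
    return "D"      # neither medicine taken


def weight_difference(kg_before: float, kg_after: float) -> float:
    """Signed response: 'KgAfter' minus 'KgBefore'."""
    return kg_after - kg_before
```

A negative response then indicates weight loss, which an absolute value would hide.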
Randomization is present in the research because the participants vary in gender, age, initial weight, initial kilometres run before taking the medicines, and the medicines taken according to their physical fitness, health, diseases, and so on. The samples were also taken in no particular order. Replication is not possible because there is often no more than one sample for a treatment combination. Where a treatment combination does have multiple samples, their values are averaged to give a single value for that combination. Blocking is used to eliminate the variability transmitted from nuisance factors, and it is possible here because the dataset contains several binary variables. The first blocking factor combines the variable ‘gender’ with ‘age’, which gives four levels: ‘male aged 19 – 34’, ‘female aged 19 – 34’, ‘male aged 35 – 50’, and ‘female aged 35 – 50’. The second blocking factor combines the variable ‘side effects’ with the difference in kilometres run, computed as the variable ‘KmAfter’ minus the variable ‘KmBefore’, which again gives four levels: ‘negative difference in kilometres run without side effects’, ‘negative difference in kilometres run with side effects’, ‘positive difference in kilometres run without side effects’, and ‘positive difference in kilometres run with side effects’. Each blocking factor therefore also has four levels in the design. For simplicity, the levels of each blocking factor are coded as groups 1, 2, 3, and 4, as shown in the table given later.
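The blocking and averaging steps described above might be coded along the following lines. This is only a sketch under stated assumptions: the function names are hypothetical, gender is assumed to be stored as the strings "male" and "female", the group numbers follow the order in which the levels are listed, and a kilometre difference of exactly zero (which the text does not cover) is grouped with the positive case.

```python
from collections import defaultdict
from statistics import mean


def demographic_block(gender: str, age: int) -> int:
    """Blocking factor 1: 1 = male 19-34, 2 = female 19-34,
    3 = male 35-50, 4 = female 35-50."""
    is_female = gender.lower() == "female"
    is_older = age >= 35
    return 1 + int(is_female) + 2 * int(is_older)


def fitness_block(km_difference: float, side_effects: bool) -> int:
    """Blocking factor 2: 1 = negative difference without side effects,
    2 = negative with, 3 = positive without, 4 = positive with.

    Assumption: a difference of exactly zero counts as positive.
    """
    is_positive = km_difference >= 0
    return 1 + int(side_effects) + 2 * int(is_positive)


def cell_means(observations):
    """Average all responses that share a treatment combination.

    observations: iterable of (cell_key, weight_difference) pairs, where
    cell_key identifies one combination of blocks and treatments.
    """
    grouped = defaultdict(list)
    for key, value in observations:
        grouped[key].append(value)
    return {key: mean(values) for key, values in grouped.items()}
```

Averaging leaves exactly one value per cell, which is what the square design requires.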
Linear model for the experimental design:
$y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + \delta_l + \varepsilon_{ijkl}$
where
$i = 1, 2, 3, 4$
$j = 1, 2, 3, 4$
$k = 1, 2, 3, 4$
$l = 1, 2, 3, 4$
$\mu$ = overall mean
$\alpha_i$ = effect of initial weights (treatment 1)
$\beta_j$ = effect of medicines taken (treatment 2)
$\gamma_k$ = effect of gender/age (blocking 1)
$\delta_l$ = effect of side effect/KmRunDifference (blocking 2)
$\varepsilon_{ijkl}$ = random error $\sim \mathrm{NID}(0, \sigma^2)$
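To make the model concrete, the sums of squares for a 4 × 4 Graeco-Latin square can be computed as in the Python sketch below. It is illustrative only: the layout in `LATIN` and `GREEK` is one standard pair of orthogonal Latin squares chosen for the example, not necessarily the arrangement used in this study, and the function name is hypothetical. Rows and columns stand for the two blocking factors.

```python
from collections import defaultdict

# One standard pair of orthogonal 4x4 Latin squares (illustrative layout only).
LATIN = ["ABCD", "BADC", "CDAB", "DCBA"]
GREEK = ["αβγδ", "γδαβ", "δγβα", "βαδγ"]


def graeco_latin_anova(y, latin=LATIN, greek=GREEK):
    """ANOVA for a p x p Graeco-Latin square with one response per cell.

    y[r][c] is the response in row block r and column block c.
    Requires p >= 4 so that the error term has positive degrees of freedom.
    """
    p = len(y)
    n = p * p
    cells = [(r, c, latin[r][c], greek[r][c], y[r][c])
             for r in range(p) for c in range(p)]
    correction = sum(cell[4] for cell in cells) ** 2 / n
    ss_total = sum(cell[4] ** 2 for cell in cells) - correction

    def ss_for(position):
        # Sum of squares from the level totals of one factor.
        totals = defaultdict(float)
        for cell in cells:
            totals[cell[position]] += cell[4]
        return sum(t * t for t in totals.values()) / p - correction

    table = {}
    for name, position in (("rows", 0), ("columns", 1), ("latin", 2), ("greek", 3)):
        ss = ss_for(position)
        table[name] = {"df": p - 1, "ss": ss, "ms": ss / (p - 1)}

    # Error is what remains after the four factors are removed.
    ss_error = ss_total - sum(entry["ss"] for entry in table.values())
    df_error = (p - 1) * (p - 3)
    ms_error = ss_error / df_error
    for entry in table.values():
        entry["f0"] = entry["ms"] / ms_error if ms_error > 0 else float("nan")

    table["error"] = {"df": df_error, "ss": ss_error, "ms": ms_error}
    table["total"] = {"df": n - 1, "ss": ss_total}
    return table
```

With p = 4 the error term is left with (4 − 1)(4 − 3) = 3 degrees of freedom, which is why the F tests in this design have little power.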
Pairwise Treatment Effects
Greek Treatment (Initial Weights)
Latin Treatment (Medicines Taken)
$q_{0.05}(4, 3) = 6.824526$
$T_{0.05} = 6.824526 \times (52.01)^{1/2} = 49.2171$
A total of 68 samples were taken across all treatment combinations, although the final model uses only one value for each treatment combination.
Any pair of treatment averages that differ in absolute value by more than 24.6085, which is $q_{0.05}(4, 3)\sqrt{MS_E/4}$ with $MS_E = 52.01$, would imply that the corresponding pair of population means are significantly different. However, no pair of means exceeds this value, meaning that there is no significant difference between any two levels of the initial weights or of the medicines taken.
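As a rough check of the threshold arithmetic, the following Python sketch reproduces the 24.6085 cut-off from the tabulated studentized range value and flags any pair of means that exceeds it. The function names are hypothetical, and n = 4 observations per treatment mean is assumed from the 4 × 4 square.

```python
import math
from itertools import combinations


def tukey_threshold(q_critical: float, ms_error: float, n_per_mean: int) -> float:
    """Tukey's critical difference: q * sqrt(MS_E / n)."""
    return q_critical * math.sqrt(ms_error / n_per_mean)


def significant_pairs(means: dict, threshold: float) -> list:
    """Return the pairs of levels whose means differ by more than the threshold."""
    return [(a, b) for a, b in combinations(means, 2)
            if abs(means[a] - means[b]) > threshold]


# Values quoted in the text: q_0.05(4, 3) = 6.824526 and MS_E = 52.01.
threshold = tukey_threshold(6.824526, 52.01, 4)
```

An empty list from `significant_pairs` corresponds to the conclusion that no pair of levels differs significantly.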
Standard error = 49.2171, hence the 95% confidence interval for all treatment means:
Greek Treatment (Initial Weights)
Latin Treatment (Medicines Taken)
Contrast
An orthogonal contrast is proposed for each treatment, comparing levels 1 and 2 as a group against levels 3 and 4 as a group. The null hypothesis in each case is that the mean weight difference of the first group equals that of the second group.
| Level | Latin Treatment | Level | Greek Treatment |
| --- | --- | --- | --- |
| A | 38.575 | α | 92.0867 |
| B | 45.7283 | β | 47.2167 |
| C | 40.5667 | γ | 24.8625 |
| D | 58.9625 | δ | 19.6667 |
| Source of variation | df | Sum of Squares | Mean Square | F0 |
| --- | --- | --- | --- | --- |
| Latin | 3 | 63.19 | 21.06 | 1.7452 |
| Greek | 3 | 816.42 | 272.14 | 22.5491 |
| Rows | 3 | 75.74 | 25.25 | 2.0918 |
| Columns | 3 | 42.15 | 14.05 | 1.1643 |
| Orthogonal contrasts | | | | |
| C1 | 1 | 140.35 | 140.35 | 11.6288 |
| C2 | 1 | 3.62 | 3.62 | 0.3001 |
| Error | 1 | 12.07 | 12.07 | |
| Total | 15 | 1153.54 | | |

$C_1 = \bar{y}_{1\cdot\cdot\cdot} + \bar{y}_{2\cdot\cdot\cdot} - \bar{y}_{3\cdot\cdot\cdot} - \bar{y}_{4\cdot\cdot\cdot}$
$C_2 = \bar{y}_{\cdot 1\cdot\cdot} + \bar{y}_{\cdot 2\cdot\cdot} - \bar{y}_{\cdot 3\cdot\cdot} - \bar{y}_{\cdot 4\cdot\cdot}$
$F_{0.05,3,1} = 215.707$
$F_{0.05,1,1} = 161.448$
From the F statistics ($F_0 < F_{0.05,1,1}$ for both contrasts), we conclude at the 5% significance level that the mean weight difference of levels 1 and 2 does not differ significantly from that of levels 3 and 4, both for the initial weights and for the medicines taken.
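Alternatively, by Scheffé’s method:
$C_1 = \bar{y}_{1\cdot\cdot\cdot} + \bar{y}_{2\cdot\cdot\cdot} - \bar{y}_{3\cdot\cdot\cdot} - \bar{y}_{4\cdot\cdot\cdot} = 23.6935$
$C_2 = \bar{y}_{\cdot 1\cdot\cdot} + \bar{y}_{\cdot 2\cdot\cdot} - \bar{y}_{\cdot 3\cdot\cdot} - \bar{y}_{\cdot 4\cdot\cdot} = -3.8065$
$S_{C_u} = 14.4239$
$S_{0.05,u} = 76.092$
where $u = 1, 2$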
Because $|C_1| < S_{0.05,1}$, we fail to reject the hypothesis that the contrast equals zero; that is, the mean weight difference of initial weight levels 1 and 2 as a group does not differ significantly from that of levels 3 and 4 as a group. Likewise, because $|C_2| < S_{0.05,2}$, the mean weight difference of medicine levels 1 and 2 as a group does not differ significantly from that of levels 3 and 4 as a group, at the 5% significance level.
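The Scheffé critical value quoted above can be reproduced with a few lines of Python. The sketch assumes the usual form S = S_C · sqrt((a − 1) · F), with a = 4 levels and the critical value F0.05,3,3 = 9.27663 quoted in this report; the function names are hypothetical.

```python
import math


def scheffe_critical(se_contrast: float, n_levels: int, f_critical: float) -> float:
    """Scheffé critical value: S_C * sqrt((a - 1) * F_critical)."""
    return se_contrast * math.sqrt((n_levels - 1) * f_critical)


def contrast_is_significant(contrast: float, critical: float) -> bool:
    """A contrast differs significantly from zero when |C| exceeds the critical value."""
    return abs(contrast) > critical


# Values quoted in the text: S_C = 14.4239, a = 4, F_0.05,3,3 = 9.27663.
critical = scheffe_critical(14.4239, 4, 9.27663)
```

Both contrasts, 23.6935 and −3.8065, fall well inside this bound, in line with the conclusion drawn above.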
The $F_0$ of the treatment factor initial weights, 5.2322, is the closest to the critical value $F_{0.05,3,3} = 9.27663$, so its $H_0$ is the most likely to be rejected among all the treatment and blocking factors. It is therefore recommended to eliminate the initial weights treatment factor, reducing the design to a Latin square with the one remaining treatment and the two blocking factors. The 1000 records are resampled for the reduced design, since eliminating a factor yields more samples for each treatment combination.
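The value 5.2322 can be reproduced from the tabulated sums of squares if the error term is taken with 3 degrees of freedom, that is, with the two contrast rows pooled back into the error line. This reading is an assumption based on the arithmetic, not something the report states explicitly:

```latex
\begin{align*}
SS_E &= 12.07 + 140.35 + 3.62 = 156.04, \qquad df_E = 3,\\
MS_E &= \frac{156.04}{3} \approx 52.01,\\
F_0(\text{initial weights}) &= \frac{272.14}{52.01} \approx 5.23 < F_{0.05,3,3} = 9.27663 .
\end{align*}
```

The same $MS_E = 52.01$ is the value used in the pairwise comparison threshold.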
Linear model for the experimental design:
$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \varepsilon_{ijk}$
where
$i = 1, 2, 3, 4$
$j = 1, 2, 3, 4$
$k = 1, 2, 3, 4$
$\mu$ = overall mean
$\alpha_i$ = effect of medicines taken (treatment)
$\beta_j$ = effect of gender/age (blocking 1)
$\gamma_k$ = effect of side effect/KmRunDifference (blocking 2)
$\varepsilon_{ijk}$ = random error $\sim \mathrm{NID}(0, \sigma^2)$
Compared with the Graeco-Latin square design, the sums of squares differ considerably, perhaps because the resampling changed the value of each treatment combination. However, the $F_0$ values do not differ as much, because the error term has more degrees of freedom. Although the critical value falls to $F_{0.05,3,6} = 4.75706$ with the increased error degrees of freedom, none of the $H_0$ is rejected, the same outcome as in the previous design.
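The change in error degrees of freedom between the two designs follows from a simple count, sketched below in Python. The helper names are hypothetical; the formula assumes a p × p square with one observation per cell, two blocking factors, and either one (Latin) or two (Graeco-Latin) treatment factors.

```python
def error_df(p: int, n_treatment_factors: int) -> int:
    """Error degrees of freedom for a p x p square design.

    Total df is p*p - 1; rows, columns, and each treatment factor
    take p - 1 df each, leaving (p - 1) * (p - 1 - n_treatment_factors).
    """
    return (p - 1) * (p - 1 - n_treatment_factors)


def reject_h0(f0: float, f_critical: float) -> bool:
    """Reject H0 when the observed F0 exceeds the critical value."""
    return f0 > f_critical


latin_square_df = error_df(4, 1)         # Latin square: 6
graeco_latin_square_df = error_df(4, 2)  # Graeco-Latin square: 3
```

These counts match the denominators of the two critical values used in the text, $F_{0.05,3,6}$ and $F_{0.05,3,3}$.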
Reducing the Graeco-Latin square to a Latin square does not change the conclusions in this case, although in other research such a reduction may allow more error. The increase in error degrees of freedom lowers the critical value for rejecting $H_0$, and dropping a factor solely for that purpose is generally not a proper way of conducting research. In this case, however, the Latin square design may be preferable, as the Graeco-Latin square does not improve on it in any way.
To further improve the design of the experiment for this research topic, it is suggested that more variables be considered, such as Body Mass Index (BMI) in place of the current treatment factor ‘initial weights’, because BMI is a more accurate representation and classification of human health.
Reference
Sara, L. (2021). Do We Still Need Gyms? Kaggle. https://www.kaggle.com/datasets/saralattarulo/do-we-still-need-gyms