Analysis with Galton’s original data set

Part 1: Analysis with Galton’s original data set
Galton’s work on children and parents’ height was published in: Galton, F. (1886): “Regression
towards mediocrity in hereditary stature”, Journal of the Anthropological Institute, 15: 246-63. In
this first part of the project you are asked to reconstruct the original data from this original article
and replicate his analysis.
• Question 1. Find Galton’s original article (on or LEARN). Table I of his article summarizes
the data used. You need to create a STATA data set that contains the 928
observations that Galton collected. It is recommended that you first type the data into an Excel
file and then have STATA read that file. Some versions of the Galton data set are available
online. You are advised NOT to use them. Part of this project is to show that you
understand how to build a data set from such a table. There are important conceptual issues
that you will miss if you borrow the data from somewhere else.
For the observations that Table I of Galton’s article reports as “below” or “above” the minimum and maximum height values, you need to assume particular values. State
these explicitly in a table and justify them in one sentence. Define “tall parents”
and “short parents” according to your data. Then divide your sample into these two groups
and report relevant statistics for the adult children and for the parents in each group. Report this
information in a table and comment on it.
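The step from a frequency table to one row per observation can be sketched in STATA as follows. This is only an illustration under stated assumptions: the three cells typed with `input` are made-up placeholders, not figures from Galton’s Table I; the variable names `parent`, `child`, `freq` and `tall` are hypothetical; and splitting at the sample median is just one possible definition of “tall parents”.

```stata
* Toy frequency table: each row is one cell (parent height, child height, count).
* These numbers are placeholders, NOT values from Galton's Table I.
clear
input parent child freq
66.5 65.2 2
68.5 68.2 3
70.5 69.2 2
end

* Turn each cell into as many identical rows as its count, one per observation.
expand freq
drop freq
count

* One possible definition: "tall" parents are those above the sample median.
quietly summarize parent, detail
generate tall = parent > r(p50)

* Mean, standard deviation and number of observations in each group.
tabstat child parent, by(tall) statistics(mean sd n)
```

With the real table the row count after `expand` should equal the 928 observations mentioned above, which is a useful check on the typing.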
• Question 2. Galton was the first to describe and explain the phenomenon of “regression towards the mean”. Being concerned about the height of the English aristocracy, he interpreted
his results as “regression to mediocrity” (hence the name “regression”).
Regress the height of adult children against the height of parents. Report your results in a table
and interpret the estimated coefficients. What can you say about the relationship between the
height of parents and their children? Are children of tall (short) parents as tall (short) as
their parents?
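A minimal sketch of this regression in STATA is given below. The five data points are invented for illustration and are not Galton’s; the variable names `parent` and `child` are assumptions. The `test` line is one possible way to check whether a difference in parents’ height carries over one-for-one to their children.

```stata
* Toy data, NOT Galton's: one row per adult child with the parents' height.
clear
input parent child
64 65
66 67
68 66
70 69
72 68
end

* OLS of adult children's height on parents' height.
regress child parent

* Test the hypothesis that the slope equals one, i.e. that a one-inch
* difference between parents carries over one-for-one to their children.
test parent = 1
```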
• Question 3. Taking your regression results from question 2, and using your definition of
“tall parents” and “short parents” from question 1:
Calculate the predicted height of adult children whose parents are “tall” after 1, 2, 3, …, Z
generations. And similarly, for adult children of “short” parents. Report your results in a
table. Is there convergence in heights? If so, how many generations does it take? Is Galton’s
prediction correct?
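One way to organise this calculation is to feed each generation’s predicted height back into the fitted line as the next generation’s parental height. This is an assumption about how to iterate, not something stated in Galton’s article, and the symbols below are introduced here for illustration only.

```latex
% Notation (assumed here, not from the handout): \hat{\alpha} and \hat{\beta} are the
% estimated intercept and slope, h_0 is the mean height of the chosen parent group.
\begin{align*}
h_t &= \hat{\alpha} + \hat{\beta}\, h_{t-1}, \qquad t = 1, 2, \dots, Z, \\
h^{*} &= \frac{\hat{\alpha}}{1 - \hat{\beta}} \quad (\hat{\beta} \neq 1), \\
h_t - h^{*} &= \hat{\beta}^{\,t} \, (h_0 - h^{*}).
\end{align*}
```

Under this assumption the gap to the fixed point shrinks by a factor of the slope each generation when the slope is below one in absolute value. It never reaches exactly zero, so any statement about the number of generations needed for convergence requires a tolerance, for example the rounding precision of the height data.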
• Question 4. Using the same data set,
Regress the height of parents against the height of adult children. Report your results in a
table. Is this regression equivalent to that in question 2? Are the estimated parameters the
same? Why or why not?
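The two regressions can be compared through the standard OLS slope formulas. The notation below is introduced here for illustration: p is the parents’ height, c the adult children’s height, and r their sample correlation.

```latex
\begin{align*}
\hat{\beta}_{c \mid p} &= \frac{\widehat{\mathrm{Cov}}(p, c)}{\widehat{\mathrm{Var}}(p)}, &
\hat{\beta}_{p \mid c} &= \frac{\widehat{\mathrm{Cov}}(p, c)}{\widehat{\mathrm{Var}}(c)}, \\
\hat{\beta}_{c \mid p}\, \hat{\beta}_{p \mid c} &= \frac{\widehat{\mathrm{Cov}}(p, c)^2}{\widehat{\mathrm{Var}}(p)\,\widehat{\mathrm{Var}}(c)} = r^2 .
\end{align*}
```

The second slope is therefore the reciprocal of the first only when the correlation is perfect, which gives one check on the estimates reported for questions 2 and 4.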
EofE Groups project, page 3 of 4
Template Answer Sheet
Group number: XX
Group members (student numbers only): s123, s345, etc.
Group XX, composed of students s123, s345, etc., confirms that the data collection has been conducted under the ethical guidelines of the School of Economics (
The data collection has only involved individuals 18 years old and over. The information collected
is not sensitive in any way that can harm the well-being and dignity of the subjects interviewed.
To maintain confidentiality, no names or any contact details have been collected. All interviewed
subjects have been told that this is part of the EofE course project.
Recall: each question has to be answered in one page maximum, font 12, double spaced.
• Question 1.1. Table 1.1.a. summarizes the height values assumed for cases below/above the
minimum/maximum height. We have assumed these values because ….
Table 1.1.a. Heights below/above the minimum/maximum height
                                   Assumed height
Adult children
  Heights below the minimum        xyz
  Heights above the maximum        xyz
Mid parents
  Heights below the minimum        xyz
  Heights above the maximum        xyz
Reference: Table I in Galton (1886).
Table 1.1.b reports the mean and standard deviation for adult children and parents.
Table 1.1.b. Summary statistics
                  Mean    Standard deviation    Number of obs.
Adult children    xyz     xyz                   xyz
Mid parents       xyz     xyz                   xyz
Source: Galton’s data with authors’ assumptions.
• Question 1.2. We have defined tall and short as… (i) Table 1.2. reports …. (ii) The
assumption of having 928 parents rather than 205…
• Question 1.3. Table 1.3 reports…
• Question 1.4. Table 1.4 reports…
• Question 1.5. Table 1.5 reports…
• Question 2.1. Description of own data set: population of interest, data collection process,
survey … Table 2.1 reports ….
• Question 2.2. Table 2.2. reports… Figure 2.2. shows…
• Question 2.3. Table 2.3. reports…
• Question 2.4. Table 2.4. reports…
• Question 2.5. Another literature in Economics that has analysed regression towards the
mean is….
Template Appendix: log file resulting from running:
capture log close
set more off
log using project.log, replace
/*question 1.1*/
use namedatagalton.dta
. sum X
Variable | Obs Mean Std. Dev. Min Max
X | n xyz ….
/*question 1.2*/
log close
