Sample and Population

Today’s in-class assignment focused on the samples and populations of a research study. The key points from the class notes include:

  • Relationships between samples and populations most often are described in terms of probability
    • Joint prob = being in education AND a PhD student
    • Conditional prob = the probability that one event happens given that another event happens; e.g. going to a Star Wars movie given that you are a Star Wars fan
  • Because our samples are never truly random samples (the constant probability of selection that random sampling requires is rarely met), we are not studying the sample we think we are
  • Sampling Error – difference between our sampling statistics and our population statistics
  • Inferences about the population will be based on characteristics of your sample.
  • Conclusions from your study will be based on how well your sample represents the population of interest.
  • Populations, then, need to be clearly identified and samples chosen accordingly.
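
To make the probability and sampling error points above concrete for myself, here is a minimal Python sketch. All of the data are made up: the ten-person population, the scores, and the variable names are hypothetical and only for illustration. Taking every third score is a simple stand-in for drawing a sample, not a truly random draw.

```python
from fractions import Fraction

# Hypothetical population of 10 people: (works_in_education, is_phd_student)
population = [
    (True, True), (True, True), (True, False), (True, False), (True, False),
    (False, True), (False, False), (False, False), (False, False), (False, False),
]

n = len(population)
in_education = [p for p in population if p[0]]
education_and_phd = [p for p in population if p[0] and p[1]]

# Joint probability: being in education AND a PhD student
p_joint = Fraction(len(education_and_phd), n)  # 2 of 10 people = 1/5

# Conditional probability: being a PhD student GIVEN being in education
p_conditional = Fraction(len(education_and_phd), len(in_education))  # 2 of 5 = 2/5

# Sampling error: sample statistic minus population statistic (here, the mean)
population_scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 15]  # hypothetical scores
sample_scores = population_scores[::3]                # every third score: [2, 5, 8, 15]

population_mean = sum(population_scores) / len(population_scores)  # 7.0
sample_mean = sum(sample_scores) / len(sample_scores)              # 7.5
sampling_error = sample_mean - population_mean                     # 0.5

print(p_joint, p_conditional, sampling_error)
```

The conditional probability is larger than the joint probability here because it only looks within the people who are already in education, which is the same reason a sample drawn from a narrow group can misrepresent the wider population.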

Here is my ‘in-class’ response with the article referenced:

Hood, N., & Littlejohn, A. (2017). Knowledge typologies for professional learning: educators’ (re)generation of knowledge when learning open educational practice. Educational Technology Research and Development, 65(6), 1583–1604.

From this mixed methods research, Hood & Littlejohn identify the primary target population as adult educators in the United Kingdom. Because they used social media and electronic surveys to gather data, the sampling process resulted in a random sample of participants from across the UK and Europe. This use of social media and electronic surveys may have limited the generalizability of the sample to the identified population, since many adult educators in higher education may not be active on social media or able to navigate electronic surveys capably. Even so, I believe this method of acquiring the sample does provide an equal chance for participation.

The demographics of the sample (N=521) were described by their adult education job description, with the majority (N=468) being university instructors. There were no other reported demographic details such as years of teaching, job status (part-time, full time, contract, permanent), gender, or locations, but these may have been collected during the survey, since eight countries were represented geographically. This sample size is well above the 30 participants recommended for reducing measurement error and drawing inferences.

Since this research paper used the quantitative data collection to inform the qualitative phase, the researchers further explained how the quantitative data was analyzed in order to determine the subset sample (N=30) that was selected and interviewed for the qualitative research phase. I believe the demographic details for this smaller sample, as shared in Table 1, show a representative sample of adult educators from the larger general population of UK and European instructors in higher education, e.g. M=11, F=19, the mean age is 56.4, the median is 57.5, and the range is 31. However, the generalizability of this sample would need to be further explored statistically to see if it fits a normal distribution.

Feedback on this in-class submission included:

  • a bit of a tricky case with regards to the sample and population
  • the sample comes from a broader area than the UK
  • there are some definite questions regarding the population of interest
  • you raise some important considerations for the information that is not shared.

Additional notes from today’s readings and additional readings done for this topic:

Kranzler, J. H. (2017). Section II: Describing univariate data. Statistics for the terrified (6th Edition), (pp. 33-87). Lanham, Maryland: Rowman & Littlefield.

  • Three measures of central tendency – looking for one score to best represent the level of a set of scores
    • Mean – most often used; formula M = ∑ 𝒙 / N
    • Median – order from highest to lowest; find middle score
    • Mode – most frequently occurring score in a set of scores
  • Measures of variability
    • Range – highest score minus lowest score
    • Variance – SD²
    • Deviation scores – subtract the mean from the score
      • Computational variance formula
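
As a quick check on these definitions, here is a minimal sketch using Python's standard `statistics` module. The nine scores are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
import statistics

scores = [3, 5, 5, 6, 7, 8, 8, 8, 13]  # hypothetical set of scores

# Measures of central tendency
mean = statistics.mean(scores)      # sum of scores / N = 63 / 9 = 7
median = statistics.median(scores)  # middle score once ordered = 7
mode = statistics.mode(scores)      # most frequently occurring score = 8

# Measures of variability
score_range = max(scores) - min(scores)  # highest minus lowest = 13 - 3 = 10
deviations = [x - mean for x in scores]  # each score minus the mean

print(mean, median, mode, score_range)
print(deviations)  # deviation scores always sum to zero
```

Because the deviation scores sum to zero, they cannot be averaged directly to describe spread, which is why the variance squares them first.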

Caldwell, S. P. M. (2018). Statistics unplugged. Belmont, CA: Wadsworth Cengage Learning.

  • Sample – defined, representative p. 13
  • Standard deviation – p. 47-48
  • Sample size
    • chi-square test p. 279; confidence interval p. 129; effect size p. 48
    • margin of error p. 146
    • 𝑡 distribution p. 132-135

Aron, A., Aron, E. N., & Coups, E. J. (2009). Central tendency and variability. In Statistics for psychology (5th Edition), (pp. 33-63). Upper Saddle River, NJ: Pearson Education Inc.

  • Describe a set of scores as representative or typical, such as an average; central tendency
  • Amount of variance or variability among scores – two measures include variance & SD
  • Central tendency of set of scores refers to the middle of the group – mean, mode, median
    • Mean – average; equal distance from either end to the middle; balance the scores; sum of the scores divided by the number of scores
      • Formula M=∑𝒙/N
    • Mode – most common single value in a set of numbers; value with the highest frequency; compare mean to mode to see congruency
    • Median – line up all the numbers from low to high, count how many scores there are, and find the position of the middle score by adding 1 to the total # of scores and dividing by 2; helps eliminate the influence of ‘outliers’, since they influence the mean; use this for rank-ordered data or data with outliers
  • Variability – how spread out the scores are in a set of numbers; around the mean; 2 measures
    • Variance – subtract the mean from each score (deviation scores); square each deviation score; add up the squared deviation scores, called the sum of squared deviations; divide the sum by the number of scores; the result is the average squared deviation from the mean
      • This is important but rarely reported or used as descriptive statistic
    • Standard deviation – take the variance; take the square root
      • Is the average amount that scores differ from the mean; ordinary, not squared deviations from the mean
      • Formula SD² = ∑(𝒙 – M)² / N OR SD = √(SD²)
  • Variance and SD are heavily influenced by one or more outliers
  • Computational formula – shortcuts developed to simplify the figuring; don’t tend to use these
    • Variance computation formula:
      • SD² = (∑𝒙² – (∑𝒙)² / N) / N
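
To convince myself that the definitional and computational variance formulas give the same answer, here is a minimal Python sketch. The scores and the function names are hypothetical, and both functions divide by N (the number of scores), as in the formulas in these notes. The last few lines swap one score for an extreme value to show how strongly an outlier moves the variance while the median stays put.

```python
import math
import statistics


def variance_definitional(scores):
    """SD^2 = sum of squared deviations from the mean, divided by N."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


def variance_computational(scores):
    """SD^2 = (sum of x^2 - (sum of x)^2 / N) / N."""
    n = len(scores)
    return (sum(x * x for x in scores) - sum(scores) ** 2 / n) / n


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5

var_def = variance_definitional(scores)    # 32 / 8 = 4.0
var_comp = variance_computational(scores)  # (232 - 200) / 8 = 4.0
sd = math.sqrt(var_def)                    # square root of the variance = 2.0

# Replace the highest score with an extreme value to see the outlier effect
with_outlier = scores[:-1] + [49]
var_outlier = variance_computational(with_outlier)  # jumps from 4.0 to 219.0
median_before = statistics.median(scores)           # 4.5
median_after = statistics.median(with_outlier)      # still 4.5

print(var_def, var_comp, sd)
print(var_outlier, median_before, median_after)
```

One changed score takes the variance from 4 to 219 while the median does not move at all, which matches the note that the median is the safer choice when outliers are present.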