Quant Methods Qts « phd monkey

phd monkey An expedition through the forest of academia.


Understanding Relationship Strength

On relationship strength:
The relationship between two sets of scores has two characteristics: strength and direction. The strength of a relationship tells how closely scores on one variable are related to scores on the other. Strength is stated from .00 to 1.00. The higher the number (regardless of sign), the stronger the relationship. A correlation of .80 is strong, whereas a correlation of .15 is weak. In a perfect relationship, all data points line up in a straight line.

Steinberg (2011) offers two great examples to explain relationship strength:

Example 1
By knowing the temperature on the Celsius scale, we can exactly predict the temperature on the Fahrenheit scale. Thus, the correlation between Celsius and Fahrenheit is 1.00 (p. 422).

A correlation of .00, at the other extreme, indicates no relationship.

Example 2
For example, there is no relationship between adult IQ and shoe size.
Adults with high, medium, or low IQs are equally likely to have small, medium, or large shoe sizes. Thus, the data points fall in a circular “blob.” (p. 422).

[Figure: scatterplots of the two examples, a straight line for Celsius vs. Fahrenheit and a circular blob for adult IQ vs. shoe size]
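To put numbers on the two extremes, here is a minimal Python sketch that computes Pearson's r for each example. The helper name `pearson_r` and all data values are illustrative assumptions, not taken from Steinberg (2011). The IQ and shoe-size pairs are made up so that the deviations cancel exactly.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Example 1: Fahrenheit is an exact linear function of Celsius
celsius = [-10, 0, 10, 20, 30, 40]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]
r_temp = pearson_r(celsius, fahrenheit)

# Example 2: made-up IQ / shoe-size pairs with no linear relationship
iq = [85, 85, 115, 115]
shoe = [7, 11, 7, 11]
r_blob = pearson_r(iq, shoe)

print(round(r_temp, 2), round(r_blob, 2))
```

With these values the temperature correlation comes out at 1.00 and the IQ and shoe-size correlation at .00, matching the two ends of the scale described above.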

Steinberg, W. J. (2011). Statistics alive! (2nd ed.). Thousand Oaks, CA: Sage Publications.


What is Regression Analysis?

Don’t be fooled by his name: “Regression”, or Mr. Regression as he prefers it. Mr. Regression “has nothing to do with the common meaning of returning to an earlier or a lower stage or declining to a previous level”, “such as a 10-year-old who starts wetting the bed and sucking his thumb” (Vogt, 2007, p. 145). What Mr. Regression tries to do is predict and explain. For this reason he might be better called Mr. Predictor or Mr. Explainer.

Mr. Regression’s analysis is by far “the most widely employed method for studying quantitative evidence” (Vogt, 2007, p. 145). Some of the more advanced forms of Mr. Regression’s analysis are highly technical and complex.

According to Vogt (2007), even the most advanced and complex forms of Mr. Regression’s analysis always ask some version of one basic question:
“How much better can I predict (or explain) a dependent variable (Y) if I know an independent variable (X)?” (p. 146)

The objective of Mr. Regression’s analysis is always to answer a form of, or an expansion on, Vogt’s question.
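One way to make Vogt’s question concrete is to compare two predictions of Y: one that ignores X and always guesses the mean of Y, and one that uses a least-squares line through X. The sketch below is only an illustration under assumed data (hypothetical hours studied and exam scores), and the helper name `fit_line` is made up for this example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a simple regression of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: X = hours studied, Y = exam score
hours = [1, 2, 3, 4, 5]
score = [52, 58, 61, 70, 74]

slope, intercept = fit_line(hours, score)
mean_score = sum(score) / len(score)

# Squared prediction error when X is ignored (always guess the mean of Y)
sse_mean = sum((y - mean_score) ** 2 for y in score)
# Squared prediction error when X is used (guess from the fitted line)
sse_line = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(hours, score))

# Proportion of prediction error removed by knowing X (R squared)
r_squared = 1 - sse_line / sse_mean
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))
```

For this made-up data, knowing X removes about 98% of the squared prediction error, which is one numeric answer to "how much better can I predict Y if I know X?"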

Vogt, W. P. (2007). Quantitative research methods for professionals. Boston, MA: Allyn and Bacon.


Peer Review vs. Reports

As students, we regularly research and report for university. We quickly learn each instructor's likes and make sure our assignments meet their guidelines.

Peer Review
“The purpose of the peer review process is to pick out the publishable manuscript and prune it prior to the print run” (Nayak, Maniar, & Moreker, 2005, p. 153). A paper is publishable if it makes an adequate contribution to the scientific community and advances its understanding. According to Nayak et al. (2005), the ultimate test of acceptability lies in the fulfillment of the question: “What will the readers learn?” (p. 154). If a reader can gain something from a paper, it may be worth publishing. An author’s work improves through the process of revision. “Peer review validates the author’s work, assures quality and authenticity, guards against plagiarism, and may support a job or funding application” (Nayak et al., 2005, p. 155).

Nayak, B. K., Maniar, R. N., & Moreker, S. (2005). The agony and the ecstasy of the peer-review process. Indian Journal of Ophthalmology, 153-155.


Chi Square: Goodness-Of-Fit v. Test of Independence

What is Chi Square?
Statistics like t-tests and ANOVA apply to interval and ratio variables, when the dependent variables being measured are continuous. Chi-square, on the other hand, applies when the variables are nominal or ordinal. Chi-square tests whether the observed counts in a set of categories are higher or lower than you would expect by chance.

According to Marczyk, DeMatteo, and Festinger (2005):
Chi-square summarizes the discrepancy between observed and expected frequencies. The smaller the overall discrepancy is between the observed and expected scores, the smaller the value of the chi-square will be. Conversely, the larger the discrepancy is between the observed and expected scores, the larger the value of the chi-square will be (p. 223).


Goodness-Of-Fit vs. Test of Independence

A goodness-of-fit test is a one-variable Chi-square test. According to Steinberg (2011), “the goal of a Chi-square goodness-of-fit test is to determine whether a set of frequencies or proportions is similar to and therefore ‘fits’ with a hypothesized set of frequencies or proportions” (p. 371). A Chi-square goodness-of-fit test is similar to a one-sample t-test. It determines whether a sample is similar to, and representative of, a population.

Example of Goodness-Of-Fit:
We might compare the proportion of M&M’s of each color in a given bag of M&M’s to the proportion of M&M’s of each color that Mars (the manufacturer) claims to produce. In this example there is only one variable: M&M color. That variable can take many categories, such as Red, Yellow, Green, Blue, and Brown, but it is still only one variable.

Steinberg (2011) notes: “the Chi-square goodness-of-fit test will determine whether or not the relative frequencies in the observed categories are similar to, or statistically different from, the hypothesized relative frequencies within those same categories” (p. 371).
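The goodness-of-fit idea can be sketched in a few lines of Python. Everything numeric here is an assumption for illustration: the bag counts are invented, and the claimed proportions are set to equal shares rather than Mars's actual figures. The statistic is the usual sum of (observed - expected)² / expected, and 9.488 is the standard .05 critical value for 4 degrees of freedom.

```python
# Hypothetical counts from one bag of 50 candies (illustrative only)
observed = {"Red": 12, "Yellow": 9, "Green": 11, "Blue": 10, "Brown": 8}

# Hypothetical claimed proportions (assumed equal; not the manufacturer's real figures)
claimed = {"Red": 0.20, "Yellow": 0.20, "Green": 0.20, "Blue": 0.20, "Brown": 0.20}

total = sum(observed.values())
# Expected frequency for each color if the claim were true
expected = {color: total * p for color, p in claimed.items()}

# Chi-square: sum over categories of (observed - expected)^2 / expected
chi_square = sum(
    (observed[c] - expected[c]) ** 2 / expected[c] for c in observed
)

df = len(observed) - 1       # one variable with k categories gives k - 1
CRITICAL_05_DF4 = 9.488      # chi-square critical value, alpha = .05, df = 4
fits = chi_square < CRITICAL_05_DF4

print(round(chi_square, 3), df, fits)
```

Here the discrepancy is small (chi-square of 1.0), so this made-up bag "fits" the hypothesized proportions.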

Test of Independence
A test of independence is a two-variable Chi-square test. Like any Chi-square test, the data are frequencies, so there are no scores and no means or standard deviations. Steinberg (2011) points out, “the goal of a two-variable Chi-square is to determine whether or not the first variable is related to—or independent of—the second variable” (p. 382). A two-variable Chi-square test, or test of independence, is similar to the test for an interaction effect in ANOVA, which asks: “Is the outcome in one variable related to the outcome in some other variable” (Steinberg, 2011, p. 382).

Example of Test of Independence
To continue with the M&M’s example, we might investigate whether purchasers of a bag of M&M’s eat certain colors of M&M’s first. Here there are two variables: (1) the color of the M&M and (2) the order in which the bag holder/purchaser eats the candies.
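A two-variable Chi-square can be sketched the same way. The table below is entirely hypothetical: rows are colors and columns count how many candies of that color were eaten in the first half versus the second half of the bag. Expected frequencies come from the row and column totals, which is the standard approach for a test of independence. The value 5.991 is the standard .05 critical value for 2 degrees of freedom.

```python
# Hypothetical contingency table: color -> [eaten in first half, eaten in second half]
table = {
    "Red":   [20, 10],
    "Green": [15, 15],
    "Blue":  [10, 20],
}

row_totals = {color: sum(counts) for color, counts in table.items()}
col_totals = [sum(counts[j] for counts in table.values()) for j in range(2)]
grand_total = sum(row_totals.values())

chi_square = 0.0
for color, counts in table.items():
    for j, obs in enumerate(counts):
        # Expected count if color and eating order were independent
        exp = row_totals[color] * col_totals[j] / grand_total
        chi_square += (obs - exp) ** 2 / exp

df = (len(table) - 1) * (len(col_totals) - 1)  # (rows - 1) * (columns - 1)
CRITICAL_05_DF2 = 5.991                        # alpha = .05, df = 2
related = chi_square > CRITICAL_05_DF2

print(round(chi_square, 3), df, related)
```

With these invented counts the statistic is about 6.67, which exceeds the critical value, so color and eating order would be judged related rather than independent.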

Steinberg, W. J. (2011). Statistics alive! (2nd ed.). Thousand Oaks, CA: Sage Publications.

Marczyk, G., DeMatteo, D., & Festinger, D. (2005). Essentials of research design and methodology. Hoboken, NJ: John Wiley & Sons.


What is a Data Codebook?

A data codebook is like a dictionary that defines and explains in detail the variables included in a database. As a visual learner, I value references such as this and understand that while the creation of such a codebook takes a bit of time, it will also save time when the researcher begins the process of examining their data. Marczyk, DeMatteo, and Festinger (2005) suggest that a data codebook “serves as a permanent database guide, so that the researcher, when attempting to reanalyze certain data, will not be stuck trying to remember what certain variable names mean or what data were used for a certain analysis” (p. 203).

What is in a Data Codebook?

According to Marczyk et al. (2005) at a minimum, a data codebook should contain:

  • Variable name
  • Variable description
  • Variable format (number, data, text)
  • Instrument or method of collection
  • Date collected
  • Respondent or group
  • Variable location (in database)
  • Notes (p. 203).
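As a rough sketch, the minimum fields listed above map naturally onto a small record type. The class name `CodebookEntry`, the field names, and the example variable are all hypothetical choices made for illustration, not part of Marczyk et al. (2005).

```python
from dataclasses import dataclass, asdict

@dataclass
class CodebookEntry:
    """One codebook row; the fields mirror the minimum list above."""
    name: str             # variable name as it appears in the database
    description: str      # what the variable means
    var_format: str       # format of the stored value
    instrument: str       # instrument or method of collection
    date_collected: str   # when the data were collected
    respondent: str       # respondent or group
    location: str         # where the variable lives in the database
    notes: str = ""       # anything else worth remembering

# The codebook itself: a dictionary keyed by variable name
codebook = {}

entry = CodebookEntry(
    name="attn_score",  # hypothetical variable
    description="Classroom attentiveness rating, 1 (low) to 5 (high)",
    var_format="number",
    instrument="Teacher observation form",
    date_collected="2012-03-15",
    respondent="Grade 10 students",
    location="survey table, column 7",
    notes="Rated once per class period",
)
codebook[entry.name] = entry

# Look up what a variable name means months later
print(asdict(codebook["attn_score"])["description"])
```

Keying the codebook by variable name keeps the "dictionary" analogy literal: when reanalyzing data, a single lookup answers what a variable name means.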

Marczyk, G., DeMatteo, D., & Festinger, D. (2005). Essentials of research design and methodology. Hoboken, NJ: John Wiley & Sons.


When Considering Correlations..

Correlations are descriptive in that they show the relationship between variables: positive, negative, or none.
But correlations are also used in inferential statistics, so that we can infer whether what we find in our sample accurately represents what would be seen in the larger population.

1. If there are two variables, we ask: is the level of measurement (of the data) nominal or ordinal?
a. If yes, we use Point-biserial (rpbi), Spearman rank-order (rs), Phi (Φ), Gamma (γ), or Kendall's tau.
b. If no (interval or ratio), we use Pearson's r.

2. If there are more than two variables, we ask again: what is the level of measurement?
a. For nominal or ordinal levels, we use logistic regression.
b. For interval or ratio levels, we use multiple regression or path analysis.

When research asks for relationship and..

When there are 2 Variables and..
The level of measurement is: nominal or ordinal we use:

  • Point-biserial (rpbi)
  • Spearman rank-order (rs)
  • Phi (Φ)
  • Gamma (γ)
  • Kendall's tau

When there are 2 Variables and..
The level of measurement is interval or ratio we use:

  • Pearson's r

When there are More Than 2 Variables and..
The level of measurement is: nominal or ordinal we use:

  • Logistic regression

When there are More Than 2 Variables and..
The level of measurement is interval or ratio we use:

  • Multiple regression
  • Path analysis.
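The decision rules above can be written as a small lookup function. This is only a sketch of the rules as stated in this post. The function name `suggest_statistics` and its parameters are made up for illustration, and real analyses often need finer distinctions than these four branches.

```python
LEVELS = {"nominal", "ordinal", "interval", "ratio"}

def suggest_statistics(n_variables, level):
    """Return the statistics this post lists for a relationship question."""
    level = level.lower()
    if level not in LEVELS:
        raise ValueError(f"unknown level of measurement: {level}")
    if n_variables < 2:
        raise ValueError("a relationship needs at least two variables")

    categorical = level in {"nominal", "ordinal"}
    if n_variables == 2:
        if categorical:
            return ["Point-biserial", "Spearman rank-order", "Phi",
                    "Gamma", "Kendall's tau"]
        return ["Pearson's r"]
    # More than two variables
    if categorical:
        return ["Logistic regression"]
    return ["Multiple regression", "Path analysis"]

print(suggest_statistics(2, "interval"))
print(suggest_statistics(4, "ordinal"))
```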

Steinberg, W. J. (2011). Statistics alive! (2nd ed.). Thousand Oaks, CA: Sage Publications.


Quantitative Report Critique Checklist – (Vogt, 2007)

Checklist of questions to ask when critiquing the typical quantitative research report.

From Vogt (2007, p. 300):
A. First, and in General
1. What is the hypothesis or the research question, if any, guiding the research?
2. Why do the authors believe it is important to investigate this hypothesis/question?
3. What methods did the authors use to collect evidence? What was their design?
4. Were the methods appropriate to address this problem/question?
5. What are the main findings or conclusions of the article?
6. Are the conclusions convincing?

B. Questions about the Variables
7. What is the dependent or outcome variable (OV)?
8. What are the independent or predictor variables (PVs)?
10. Should these or any other mediating/intervening variables have been studied?
11. Are any control variables considered?
12. Should other control variables have been examined?
13. Does the article discuss the possible moderating variables and interaction effects? Should it?
14. How are the variables defined and measured; that is, how are they operationalized?
15. Are the definitions and measurements of the variables appropriate for this study?

C. Questions about the Sample/Subjects
16. Who is studied and are the subjects appropriate given the goals of the study?
17. How many are studied and is this enough for the purposes of the study?
18. Is the sample representative of a population? How broadly can the conclusions be generalized?

D. Questions about the Conclusions
19. Are the findings statistically significant?
20. Are the findings scientifically significant?
21. How big are the effects discovered?
22. Are the findings practically significant?
23. Are the conclusions really supported by the evidence cited in the article?

E. Finally and Implied in the Answers to the Above Questions
24. How could the research have been improved?
25. What questions or problems does the article leave unanswered?
26. How could you go about doing a better job?

Vogt, W. P. (2007). Quantitative research methods for professionals. Boston, MA: Allyn and Bacon.


Normal Curves and “Normal” Distributions – (Vogt, 2007)

On “Normal”:
It helps to think of the normal distribution as a “normal”, or regular, spread of data. It is likely that no empirical distribution of data will match it exactly, but countless distributions of data in the real world look like the normal curve. If a real distribution that researchers are studying comes quite close to the normal distribution, the researchers will already know a great deal about the real distribution without further work (Vogt, 2007). The most common data are toward the middle, at the mean, which corresponds to a z-score of zero. The least common data are far from the mean, the mode, and the median, which are all identical in a normal distribution (Vogt, 2007).

Real World Example: Normal Distribution
An example of something we might expect to see distributed normally would be:
The age, in months, at which children begin to walk.
(Here we would expect to see children achieve this developmental milestone at a similar time.)

Real World Example: NOT a Normal Distribution
An example of something we might not expect to see distributed normally would be:
Annual incomes of members of the working population.
(Here we would expect a skewed distribution: most incomes cluster toward the lower end, while a small number of very high incomes stretch the upper tail and pull the mean above the median.)
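A quick way to see the difference between the two examples is to compare the mean, median, and mode. In the sketch below both data sets are invented for illustration: the walking ages are constructed to be symmetric, and the incomes include one very high value.

```python
import statistics

# Hypothetical ages (in months) at which 11 children began to walk
walking_months = [9, 10, 11, 11, 12, 12, 12, 13, 13, 14, 15]

# Hypothetical annual incomes, with one very high earner
incomes = [22000, 25000, 28000, 30000, 32000, 35000, 40000, 48000, 250000]

walk_mean = statistics.mean(walking_months)
walk_median = statistics.median(walking_months)
walk_mode = statistics.mode(walking_months)

income_mean = statistics.mean(incomes)
income_median = statistics.median(incomes)

# Symmetric, bell-like data: mean, median, and mode coincide
print(walk_mean, walk_median, walk_mode)
# Skewed data: the single high income pulls the mean well above the median
print(round(income_mean), income_median)
```

In the symmetric set all three measures equal 12, as the description of the normal distribution above leads us to expect. In the income set the mean lands far above the median, a sign that the distribution is not normal.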

Vogt, W. P. (2007). Quantitative research methods for professionals. Boston, MA: Allyn and Bacon.


“the standard deviation is the flour” – (Vogt, 2007)

On standard deviation:
Standard deviation (SD) and the correlation coefficient (Pearson r) are old friends who rely on each other. According to Vogt (2007), “The standard deviation is used to describe the variation in a distribution of scores. The correlation coefficient is used to describe how two distributions of scores are related to each other” (p. 19).

The standard deviation (SD) is a measure of the spread of a data collection. It tells you how much the scores are spread out (high standard deviation) or clustered together (low standard deviation) (Vogt, 2007). In other words, the standard deviation measures how much the data in the collection differ, on average, from the mean.

SD’s foundation is the mean. Other statistics are built on variations of the mean. The standard deviation and its square, the variance, are essential in statistics. Vogt (2007) describes the importance of standard deviation: “If the collection of statistical techniques is like a bakery shop filled with a wide variety of breads and pastries, the standard deviation is the flour with which most of them are made” (p. 20).

Vogt (2007) also nicely describes the road to standard deviation:

First the mean or average is found. Then come the deviation data, which you get by subtracting the mean from each piece of data. You can take the average of the (absolute) deviation data to get the average deviation. Next comes the all-important variance, which is the mean of the squared deviation data. Finally, there is the standard deviation, which is the square root of the variance. It is hard to over-emphasize the importance of the mean, the deviation scores, the variance, and the standard deviation (p. 22).
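The steps in that passage can be followed one at a time in Python. The scores below are made up for illustration. Following the wording of the passage, the variance here is the plain mean of the squared deviations (dividing by N). Many textbooks and software packages divide by N - 1 when estimating from a sample, which would give a slightly larger result.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

# Step 1: the mean
mean = sum(scores) / len(scores)

# Step 2: deviation scores (each score minus the mean)
deviations = [x - mean for x in scores]

# Step 3: average (absolute) deviation
average_deviation = sum(abs(d) for d in deviations) / len(deviations)

# Step 4: variance, the mean of the squared deviations
variance = sum(d ** 2 for d in deviations) / len(deviations)

# Step 5: standard deviation, the square root of the variance
standard_deviation = math.sqrt(variance)

print(mean, average_deviation, variance, standard_deviation)

# Cross-check against the standard library's population SD
print(statistics.pstdev(scores))
```

For these scores the mean is 5, the average deviation is 1.5, the variance is 4, and the standard deviation is 2.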

Vogt, W. P. (2007). Quantitative research methods for professionals. Boston, MA: Allyn and Bacon.


Variables – Independent vs. Dependent

According to Marczyk et al. (2005), the independent variable is the factor that is manipulated or controlled by the researcher, while the dependent variable is a measure of the effect (if any) of the independent variable. Typically scholars are interested in exploring the effects of the independent variable. In its simplest form, the independent variable has two levels: present or absent (Marczyk et al., 2005). In the question “Do mobile devices negatively impact classroom attentiveness of high school students?”, the independent variable is mobile device use and the dependent variable is the students' classroom attentiveness.

Marczyk, G., DeMatteo, D., & Festinger, D. (2005). Essentials of research design and methodology. Hoboken, NJ: John Wiley & Sons.


Hypothesis: Null, Nondirectional, and Directional

Null Hypothesis
In research studies involving two groups of participants (e.g., experimental group vs. control group), the null hypothesis always predicts that there will be no differences between the groups being studied (Kazdin, 1992).

Nondirectional Hypotheses
If the hypothesis simply predicts that there will be a difference between the two groups, then it is a nondirectional hypothesis (Marczyk, DeMatteo, & Festinger, 2005). It is nondirectional because it predicts that there will be a difference but does not specify how the groups will differ.

Directional Hypotheses
If, however, the hypothesis uses so-called comparison terms, such as “greater,” “less,” “better,” or “worse,” then it is a directional hypothesis. It is directional because it predicts that there will be a difference between the two groups and it specifies how the two groups will differ (Marczyk, DeMatteo, & Festinger, 2005).
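The sources above do not discuss how the choice plays out in testing, but one common consequence is that a nondirectional hypothesis is usually checked with a two-tailed test and a directional one with a one-tailed test. The sketch below assumes a hypothetical z statistic of 1.8 from comparing two groups and uses the standard normal distribution from Python's `statistics` module. It illustrates that convention and is not a procedure taken from Kazdin (1992) or Marczyk et al. (2005).

```python
from statistics import NormalDist

z = 1.8          # hypothetical test statistic (experimental minus control)
alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# Directional hypothesis ("experimental group scores higher"):
# only the upper tail counts as evidence
p_directional = 1 - std_normal.cdf(z)

# Nondirectional hypothesis ("the groups differ"):
# both tails count, so the probability doubles
p_nondirectional = 2 * (1 - std_normal.cdf(abs(z)))

print(round(p_directional, 4), p_directional < alpha)
print(round(p_nondirectional, 4), p_nondirectional < alpha)
```

With this assumed z value, the directional test falls below .05 while the nondirectional test does not, which is why the form of the hypothesis should be fixed before the data are examined.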

Kazdin, A. E. (1992). Research design in clinical psychology (2nd ed.). Boston: Allyn & Bacon.

Marczyk, G., DeMatteo, D., & Festinger, D. (2005). Essentials of research design and methodology. Hoboken, NJ: John Wiley & Sons.


Validity Threats of Quantitative Research – (Vogt, 2007)

According to Vogt (2007), some of the most common threats to validity in quantitative research are self-selection effects, volunteer effects, attrition, history effects, maturation effects, and communication among subjects.

Self-Selection Effects
This occurs if subjects are not randomly assigned to the groups that interest the researcher (Vogt, 2007). This can mean that the subjects assign themselves to, or choose, their own groups, which is not random assignment. Researchers should assign subjects to groups themselves to avoid self-selection.

Volunteer Effects
This occurs because people usually cannot be studied without their prior consent, but those who give their consent are likely to differ in important ways from people who do not consent (Vogt, 2007). When possible, researchers can study people without their consent by observing their public behavior and using the information when applicable.

Attrition
Attrition occurs when subjects drop out or choose to no longer participate in a study. Researchers can help avoid high levels of attrition by careful screening of subjects.

History Effects
History effects refer to threats to validity that arise from events occurring during the extended periods of time that pass during a study. According to Vogt (2007), one way to guard against the threat that history effects pose to validity is to take frequent measurements of the outcome variable rather than just one measurement at the end of the study.

Maturation Effects
Maturation effects also occur as a result of extended periods of time that pass during a study. However, in this case the change that occurs is due to the individual development of participants in the research, not to external events (Vogt, 2007). Here again recurrent measurements can help guard against the risks of subject maturation.

Communication Among Subjects
Communication among subjects, and the complications it can cause, can take many forms. To help mitigate this, Vogt (2007) recommends assigning institutions, rather than individuals, to control and treatment groups.

Vogt, W. P. (2007). Quantitative research methods for professionals. Boston, MA: Allyn and Bacon.


Qualitative vs. Quantitative Research – (Schmidt & Brown, 2009)

“Quantitative” and “Qualitative” distinguish different research types.

Quantitative research views the world as objective. Quantitative studies characteristically test a hypothesis. Quantitative research requires that researchers separate themselves from the phenomena being studied. The focus is on collecting empirical evidence, in other words, evidence gathered through the five senses (Schmidt & Brown, 2009). Observations are measured as numbers that the researcher can later analyze statistically.

A key difference in qualitative research is that the world is viewed as subjective. There can be multiple realities because the context of the situation is different for each person and can change with time (Schmidt & Brown, 2009). Qualitative research stresses verbal descriptions of human behavior. The focus is on providing a detailed description of the meanings people give to their experiences (Schmidt & Brown, 2009).

from p. 15


Schmidt, N., & Brown, J. (2009). Evidence-based practice for nurses. Sudbury, MA: Jones and Bartlett.


The Belmont Report – (Marczyk, DeMatteo, & Festinger, 2005)

The Belmont Report’s basic principles are respect for persons, beneficence, and justice.

Respect for Persons
Respect for persons holds two primary ethical beliefs. The first is that individuals should be treated as autonomous agents. The second is that persons with diminished autonomy are entitled to protection.

Beneficence
According to Marczyk, DeMatteo, and Festinger (2005), the term “beneficence” is often understood to cover acts of kindness or charity that go beyond strict obligation. Two main ideas associated with beneficence when using human subjects are to maximize benefits and to cause no harm or the least harm possible.

Justice
According to Marczyk et al. (2005), justice represents “fairness in distribution”, “getting what is deserved”, and recognizing that equals should be treated equally. No subject should be forced to deal with any problem as a result of the research. No subject should be denied any benefits for which they are eligible.

Marczyk, G., DeMatteo, D., & Festinger, D. (2005). Essentials of research design and methodology. Hoboken, NJ: John Wiley & Sons.


Dissertation Tip – Feasibility – (Corner, 2002)

A pilot study is appropriate for determining feasibility of a larger study. Corner’s (2002) comments regarding saving time and money suggested feasibility to me. During the design process we must determine feasibility related to availability of a sample (who meets the requirements for our research question to be examined), the skills of the investigator, the actual costs associated with methods and sample, and the time required to complete the study (extremely important for doctoral students).

As you progress through the steps of the research process, from review of the literature to presentation of results, feasibility is considered. As an example, some of my research studies required measuring melatonin in saliva and serum. After reviewing the literature, I discovered the feasibility of using these media to accurately measure changes in melatonin in humans, and responses to stimuli in animals. However, I also discovered that such measures were expensive and required special training in the laboratory procedures. These feasibility considerations shaped the methods selected and the timeline for my research.

As you develop your studies, keep in mind Corner's ideas about economy. Each element of the research process feeds back to preceding ones in determining the most economical but valid and reliable methods for obtaining an answer.

Corner, P. D. (2002). An integrative model for teaching quantitative research design. Journal of Management Education, 26, 671-692.


Accuracy vs. Reliability – (Marczyk, DeMatteo, & Festinger, 2005)

Accuracy indicates whether a measurement is correct.
Reliability refers to whether the measurement is consistent.

Marczyk, DeMatteo, and Festinger (2005) make a dart-throwing comparison:

When throwing darts at a dart board, “Accuracy” refers to whether the darts are hitting the bull’s eye (an accurate dart thrower will throw darts that hit the bull’s eye). “Reliability,” on the other hand, refers to whether the darts are hitting the same spot (a reliable dart thrower will throw darts that hit the same spot). Therefore, an accurate and reliable dart thrower will consistently throw the darts in the bull’s eye. (p. 10)
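The dart analogy can be turned into two simple numbers. In this sketch, which is one possible way to operationalize the idea rather than a method from Marczyk et al. (2005), accuracy is measured as the distance from the center of a thrower's hits to the bull's eye, and reliability as the average distance of the hits from their own center. The coordinates and function names are hypothetical.

```python
import math

BULLSEYE = (0.0, 0.0)

def center(hits):
    """Average position of a list of (x, y) dart hits."""
    n = len(hits)
    return (sum(x for x, _ in hits) / n, sum(y for _, y in hits) / n)

def accuracy_error(hits):
    """Distance from the center of the hits to the bull's eye (lower = more accurate)."""
    cx, cy = center(hits)
    return math.hypot(cx - BULLSEYE[0], cy - BULLSEYE[1])

def spread(hits):
    """Average distance of hits from their own center (lower = more reliable)."""
    cx, cy = center(hits)
    return sum(math.hypot(x - cx, y - cy) for x, y in hits) / len(hits)

# Reliable but not accurate: tightly grouped, far from the bull's eye
reliable_only = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.1), (5.0, 5.0)]

# Accurate and reliable: tightly grouped around the bull's eye
accurate_and_reliable = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]

for name, hits in [("reliable only", reliable_only),
                   ("accurate and reliable", accurate_and_reliable)]:
    print(name, round(accuracy_error(hits), 2), round(spread(hits), 2))
```

The first thrower shows a small spread but a large accuracy error, which is the case of a measurement that is consistent yet incorrect.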

Marczyk, G., DeMatteo, D., & Festinger, D. (2005). Essentials of research design and methodology. Hoboken, NJ: John Wiley & Sons.