NUR 705 Assignment 10.1: Correlations

Assignment Guidelines

Part One

Download the NUR705 Week 10 Dataset (CSV) and use JASP to run a correlation on the following variables: Depression Score and Perceived Health Score. Then run a correlation on the variables Depression Score and Number of Health-Related Visits to a primary care office in the past two years.

Interpret your results.
Create a short report on the output. (Look at pages 169–170 in your Kim, Mallory, & Vallerio textbook to see how to report results.) Be sure to check Confidence Interval under the “Additional Options”. It should be set to 95%.
Include a table for each of your results.
Discuss the relationship between the variables in your short paragraph about your results.
Be sure to use APA format.
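
For orientation, a 95% confidence interval for a Pearson r is commonly approximated with the Fisher z transformation. The sketch below is only an illustration under that assumption: the values r = 0.50 and n = 30 are hypothetical, not results from the Week 10 dataset, and the function name is made up for this example. The interval that JASP prints is the one to report.

```python
import math

def pearson_ci(r, n, z_crit=1.959964):
    """Approximate confidence interval for Pearson r (Fisher z method)."""
    z = math.atanh(r)                    # Fisher z transform of r
    se = 1.0 / math.sqrt(n - 3)          # standard error on the z scale
    lower = math.tanh(z - z_crit * se)   # back-transform the lower bound
    upper = math.tanh(z + z_crit * se)   # back-transform the upper bound
    return lower, upper

# Hypothetical inputs, chosen only to show the calculation.
lower, upper = pearson_ci(0.50, 30)
print(f"95% CI [{lower:.2f}, {upper:.2f}]")
```

Note that the interval is not symmetric around r, because the transformation stretches values near -1 and 1.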

Using JASP to Conduct a Correlation Screencast

JASP Correlation Transcript

For part one of the assignment, submit screenshots of the items above. It is best to copy these and put them in a Word document.

Part Two

For part two of the assignment:

Prepare a short narrative to describe the correlation. Your narrative should use APA formatting.
This narrative should be approximately one paragraph, double-spaced.

Submission

Submit your assignment and review full grading criteria on the Assignment 10.1: Correlations page.

Week 10: Relationships Among Variables

Lesson 1: Relationships Among Variables

Introduction

Examining relationships among variables is common in nursing studies. Correlation focuses on how we determine whether such a relationship is statistically significant.

In statistics, we can look at positive and negative correlations of variables. The correlation coefficient can help us describe both the strength and direction of the relationship.

To compute a Pearson correlation, both variables must be measured at the interval or ratio level. Correlation also provides a foundation for further statistical tests involving regression, but for now, let’s take a look at the principles of correlation.

Learning Outcomes

At the end of this lesson, you will be able to:

Use JASP to compute the Pearson correlation coefficient.
Correctly interpret r-values.
Understand strength of the relationship between variables.

Before attempting to complete your learning activities for this week, review the following learning materials:

Learning Materials

Read the following in your Polit & Beck (2021) Nursing research: Generating and assessing evidence for practice textbook:

Chapter 15, “Measurements & Statistics,” starting at page 315

Read the following in your Kim, Mallory, & Vallerio (2022) Statistics for evidence-based practice in nursing textbook:

Chapter 9, “Examining Relationships Between and Among Variables”

Against All Odds: Correlation

Review the presentation by Dr. Pardis Sabeti to learn about correlation:

Sabeti, P. (Host), & Villiger, M. (Writer/Producer/Director). (2014). Correlation [Video Unit 12]. Against All Odds: Inside Statistics. Retrieved from Annenberg Learner. (Closed captioning is provided.)

Against All Odds: Correlation Transcript

Lecture: Correlation

Review the lecture to learn more about correlation.

Lecture: Correlation Transcript

Lecture: Correlation

Slide 1

Okay, welcome to our next lecture. In the next two lectures, we are going to talk about correlation and regression. Let’s start with correlation. With correlation, we change the independent variable. Up until now, the dependent variable has always been quantitative and the independent variable has always been categorical, but now we’re going to be looking at cases where both variables are quantitative.

Slide 2

Instead of talking about a difference between two groups, we’re going to change it, and the change is subtle but important. We’re now going to talk about associations or relationships. We are going to say variables are significantly related or significantly associated with one another. I’ll illustrate the difference and what exactly that means.

Slide 3

If both variables are quantitative, then we can put them on a plot. This is something you might remember from a high school geometry class, where we have a horizontal axis, the x-axis, called the abscissa, and a vertical axis, the y-axis, called the ordinate. The intersection of the independent and dependent variable can be represented by a point on this plot, and we’ll create a scatterplot. Now we start to look at what these plots, or these patterns of dots, mean.

Slide 4

One classic pattern we see all the time in higher education is the relationship between ACT score and grade point average. If we overlay quadrants, four equal squares, on the plot, we will see not a random scattering of points but a pattern starting to develop. It’s a crude pattern, but it is a pattern. You will see the majority of dots are found in two quadrants, the lower left and the upper right. This would be an example of a positive relationship or a positive association. As ACT increases, GPA increases as well. It’s not a random relationship. There appears to be a pattern: as one goes up, so does the other.
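
The quadrant idea can be sketched in a few lines of code. This is a minimal illustration, assuming the quadrants are drawn at the mean of each variable; the ACT and GPA values are invented for the example and do not come from the lecture slide.

```python
# Hypothetical ACT and GPA values, invented for illustration only.
act = [18, 20, 21, 23, 24, 26, 27, 29, 31, 33]
gpa = [2.1, 2.6, 2.4, 2.9, 3.3, 3.0, 3.4, 3.2, 3.7, 3.9]

mean_act = sum(act) / len(act)
mean_gpa = sum(gpa) / len(gpa)

# Lower-left or upper-right quadrant: both values fall on the same side of their means.
same_direction = sum(
    1 for x, y in zip(act, gpa)
    if (x - mean_act) * (y - mean_gpa) > 0
)
# Upper-left or lower-right quadrant: the values fall on opposite sides of their means.
opposite_direction = sum(
    1 for x, y in zip(act, gpa)
    if (x - mean_act) * (y - mean_gpa) < 0
)
print(same_direction, opposite_direction)
```

When most points land in the lower-left and upper-right quadrants, as they do here, the association is positive.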

Slide 5

In correlation, the placement of the variables on the y-axis or the x-axis is arbitrary, so I can flip this one around. If ACT correlates to GPA, then GPA correlates to ACT. There’s no chronology in correlation. We can’t say one comes first, with the implication that it causes the second. We can’t say that with correlations, so if A correlates to B, B correlates to A.

Slide 6

Up until now, we haven’t really done anything with the coefficients that SPSS has calculated. They haven’t really had any meaning on their own. We really focused on their associated significance, so when we dealt with a z-score or a t-value or an F-value, we didn’t really look at that value. We looked at the associated significance. With correlation, there is a Pearson r value. That r value has some meaning. The r values range between negative 1 and positive 1, and they can tell us two basic things. One is the direction: if the r value is positive, we have a positive relationship, and if it is a negative number, we have a negative relationship. The other is the strength or the magnitude of the relationship. The closer that number is to 1 or negative 1, the more tightly clustered the points are, and the more that scatterplot starts to resemble a line.
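
To make the direction and strength of r concrete, here is a minimal sketch of the standard Pearson formula written out by hand. The data lists are made up for illustration, and in practice JASP or SPSS does this calculation for you.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # co-variation
    sxx = sum((x - mx) ** 2 for x in xs)                    # variation in x
    syy = sum((y - my) ** 2 for y in ys)                    # variation in y
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])     # points on a rising line
r_down = pearson_r(x, [10, 8, 6, 4, 2])   # points on a falling line
r_mixed = pearson_r(x, [2, 1, 4, 3, 5])   # upward trend with some scatter
print(r_up, r_down, round(r_mixed, 2))
```

The sign of each result gives the direction, and the distance from zero gives the strength.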

Slide 7

Here are some examples of correlation coefficients. The three along the top are all pretty rare. We don’t see them very often, but we can manufacture artificial ones. The first one is the perfect negative correlation of −1, where the data points line up perfectly, almost in a stair-step fashion in a downward trend, so the points are found in the upper left or lower right quadrant. As we move from left to right, the y goes down. The opposite is a perfect positive correlation of 1, where it ladders up and forms a perfect line, and then the final one on the top row there is a correlation of 0, where it just looks like one big blob. If we were to overlay the quadrants, we would probably see very close to an equal number in all four quadrants. Those three are pretty rare. We don’t see them often in the natural world. What we see commonly, and what we are left to interpret, are the ones at the bottom. Not perfect by any means, but still suggesting a relationship. The first one in the lower left, that’s what a positive r value of 0.6 looks like, and then next to that a −0.6. Again, not a perfect line, but certainly a trend moving from the upper left to the lower right.

Slide 8

With correlation, there are some important assumptions we need to deal with, so let’s look at them one at a time.

Slide 9

First one, correlation assumes a linear relationship. If we saw a scatterplot like this, where you see the dots rising, hitting a peak, a point of diminishing return, and then a decline, so we saw both a positive and a negative relationship, that would show as an r value of zero, but clearly there’s a relationship there. It’s just not a linear one, so this is a curvilinear relationship. This is why it is important with correlation to always ask for a scatterplot, because if we were just to run the correlation and get a Pearson r of zero, we would just say there isn’t a relationship. Well, there clearly is one. It’s just not a linear one, and so the Pearson r would be an inappropriate measure of that. There’s another statistic (we’re not going to deal with it in this class) that fits a curved line to a relationship, and if you saw this pattern, you would probably need to contact a statistician and work on that. We see that sometimes in healthcare. We certainly see it in rehabilitation, where practice or therapy can actually go too far, and you’ll see increases to a certain point, and then a person can actually overwork a muscle, hit a point of diminishing return, and actually see the output decline. You see it in athletic training and high-end athletes. If you have a son or daughter, or if you were involved, in swimming or in cross country, you’re familiar with the concept of tapering. Tapering is a concept that tries to avoid that decline by cutting back practice rather than increasing it, so you don’t see that downward curve. You’re trying to avoid that.
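
A small sketch shows how a clear curvilinear pattern can still produce r = 0. The numbers are invented: hypothetical practice hours, with performance that rises to a peak at 3 hours and then falls off symmetrically.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: output climbs, peaks at 3 hours, then declines.
hours = [0, 1, 2, 3, 4, 5, 6]
output = [9 - (h - 3) ** 2 for h in hours]   # [0, 5, 8, 9, 8, 5, 0]

r_curve = pearson_r(hours, output)
print(r_curve)
```

The rising half and the falling half cancel each other out, which is why the scatterplot has to be checked alongside the r value.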

Slide 10

The second assumption is your big word of the day: homoscedasticity. Homoscedasticity just means that you have consistent variance, or a consistent spread, in the data on one variable as you move from lower to higher values on your other variable. This is kind of an exaggerated example; you’ll never see anything that looks like this. You’ll see that for low ACT scores, you get primarily low GPAs. For high ACT scores, you get relatively high GPAs.

Slide 11

That previous slide would be an example of not violating the assumption of homoscedasticity. This would be a visualization of a violation of that assumption. In this example, again with ACT and GPA, with lower ACT scores you get lower GPAs, but with higher ACT scores you get both high and low GPAs. You see the data points exploding on GPA as you move from low ACT scores to high ACT scores. You need to make sure your data has a relatively consistent variation on one variable as you move from low to high on the other.
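
One crude way to see this kind of fanning-out in numbers is to compare the spread of GPA in the lower and upper halves of the ACT range. The sketch below is only a rough check on invented data, not a formal test of homoscedasticity, and the scatterplot remains the main tool.

```python
import statistics

# Hypothetical (ACT, GPA) pairs, invented to mimic the fanning-out pattern.
pairs = [(16, 2.0), (17, 2.2), (18, 2.1), (19, 2.3), (20, 2.4),
         (28, 1.8), (29, 3.9), (30, 2.5), (31, 3.4), (32, 2.0)]

median_act = statistics.median(x for x, _ in pairs)
low_half = [y for x, y in pairs if x <= median_act]    # GPAs at lower ACT scores
high_half = [y for x, y in pairs if x > median_act]    # GPAs at higher ACT scores

sd_low = statistics.stdev(low_half)
sd_high = statistics.stdev(high_half)
print(round(sd_low, 2), round(sd_high, 2))
```

A much larger spread in one half than in the other, as in this made-up example, is the pattern the slide describes.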

Slide 12

The third assumption is that you do not have a range restriction. Range restrictions will usually result in an artificial reduction in the r value. You need to avoid that. Here are two classic examples. The first one would be to offer some type of study skills program to see if hours in the program would relate to an increase in GPA. If you offered that at a community college, you would have a full spectrum of academic performance there. You would have high-performing students, and you would have low-performing students. You would have a good shot at showing a relationship there. If you went to a highly selective university and offered a similar program, let’s say Harvard or Yale or MIT, you probably would have no low-achieving students. It would be hard for a study skills program to improve GPA when you have such high-achieving students. They don’t really have anywhere to go. They’re already at the high end, so that’s a range restriction. If you were to sample at a school like Harvard, you would probably show that a study skills program wouldn’t have any effect. That probably would not be a reflection on the value of the study skills program. It would just be a reflection of the fact that these students were all high achieving. They really can’t get much better. It would be hard to move that variable very much.

Slide 13

Another one we see in athletic training, in physical therapy, or in sports medicine is trying to improve a performance outcome. Let’s say we’re looking at improving someone’s free throw percentage, so we’re looking at the relationship of practice to how many free throws they can make. How many out of 10 free throws could you make? If we get a group of average individuals, probably the more they practice, the better they’ll get, but if we get college-level Division 1 or NBA players, they’re already at the high end. Many of them already have free throw percentages at 80% or 90% or higher. Practice may not improve that enough to get a significant r value. Again, in PT, whenever we’re looking at interventions, if our students are collecting data at a Division 1 college, for example, they might not get a significant r value because they have a range restriction. Those individuals are all high-end athletes, so it’s hard to show a relationship when the low end is not well represented.
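
The effect of a range restriction can be sketched with the free throw example. The practice hours and free throw counts below are invented for illustration; the restricted sample simply keeps the shooters who already make 7 or more out of 10.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: weekly practice hours and free throws made out of 10.
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
made = [2, 3, 3, 5, 4, 6, 7, 7, 9, 8]

r_full = pearson_r(hours, made)

# Range restriction: keep only the high-end shooters.
elite = [(h, m) for h, m in zip(hours, made) if m >= 7]
r_restricted = pearson_r([h for h, _ in elite], [m for _, m in elite])

print(round(r_full, 2), round(r_restricted, 2))
```

In this made-up sample the correlation drops once the low end is removed, even though the underlying relationship between practice and performance has not changed.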

Slide 14

When we get a Pearson r score, there are three classic ways we can interpret that score. There are three different areas we can talk about when we deal with a Pearson r score, so let’s look at those one at a time.

Slide 15

First, with the Pearson r value, we can interpret the strength, or make some comment on the strength, of the association. This is a chart that is found in some stats textbooks that just quantifies low correlations, moderate correlations, or high correlations. Your book doesn’t have this chart, and I’m glad it doesn’t, because I think this can be a little misleading. These numbers are arbitrary. What we need to do is consider context. Context is very important. We can’t just say a number is low because a chart says so. We need to look at the context.

Slide 16

As an example of considering context, pay close attention when you watch the video on correlation, the link I give you to the “Against All Odds” program. They talk about a twin study that was done, and they got a relatively low correlation. It was a significant correlation, but it was relatively low, and it was still surprising given the context. These twins, for example the two guys in the picture, had a high correlation in physical traits. They’re both bald and they both need glasses. But they also had a significant correlation in psychological traits. You’ll learn they’re both firefighters. They both prefer the same brand of beer. They both hold the beer can the same way. They’re both confirmed bachelors. These are all findings from twin studies. These guys were raised apart. We wouldn’t expect them to be so similar in these psychological characteristics, so given the context of twins reared apart, that low correlation score that they’re going to report is surprising.

Slide 17

The second thing we can do with the Pearson r score is to square it. This is kind of a preview for our next lesson, but when we square it, it’s called the coefficient of determination. Basically, that is the percent of variance in one variable that is shared with, or can be explained by, the other variable. We’ll go into this in a little more detail in our next lesson.
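
As a quick illustration, here is the squaring step applied to the r values of 0.6 and −0.6 from the earlier slide. The function name is made up for this sketch.

```python
def coefficient_of_determination(r):
    """Square a Pearson r to get the proportion of shared variance."""
    return r ** 2

shared_pos = coefficient_of_determination(0.6)
shared_neg = coefficient_of_determination(-0.6)
print(f"{shared_pos:.0%}", f"{shared_neg:.0%}")
```

Squaring removes the sign, so a positive and a negative correlation of the same size share the same amount of variance.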

Slide 18

The third thing we can do with the r value is to look at the associated significance, or the p value, for the Pearson r. That just tells us, as with F scores and t scores, the probability that chance can explain it. We can get chance relationships too. We have to look at our associated p value and say whether that Pearson r is statistically significant or not.

Slide 19

The one thing we need to be careful about with correlation is you can get some crazy nonsensical significant correlations out there. Sometimes they’re called fishing expeditions or searching for significance. If you correlate enough variables with each other, you will get some that correlate. You just will out of chance, so we have to be very careful that the correlations make sense. If there’s a study that’s throwing a lot of variables out there and correlating them with each other, something will eventually stick, so we have to be skeptical when we see those types of studies.
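
A back-of-the-envelope calculation shows why these fishing expeditions are risky. The sketch below assumes every test uses a 0.05 cutoff and, as a simplification, that the tests are independent; pairwise correlations among the same variables are not truly independent, so treat the result as a rough guide. The function names are made up for this example.

```python
def num_pairs(num_variables):
    """Number of distinct variable pairs that can be correlated."""
    return num_variables * (num_variables - 1) // 2

def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability of at least one p < alpha when no real relationships exist,
    assuming the tests are independent."""
    return 1 - (1 - alpha) ** num_tests

pairs = num_pairs(10)                    # correlating 10 variables with each other
risk = chance_of_false_positive(pairs)
print(pairs, round(risk, 2))
```

With only 10 variables there are already 45 correlations to run, and under these assumptions at least one spurious "significant" result is very likely.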

Slide 20

Finally, I covered this in the video introduction, but correlation is not causation. To say that two variables are correlated with one another doesn’t mean that one causes the other. We’ll talk about the challenges of establishing a causal relationship in a future lecture. I always think of the nonsensical example of the correlation between the length of your pants and height. If I looked at a grade school or a high school and correlated student height to student pant length, there would be a significant positive correlation, so if parents wanted their sons or daughters to grow tall, should they buy them long pants? Of course not, it makes no sense, but it is a significant correlation.

Slide 21

Okay, let’s look at a correlation example. This is a phenomenon I remember learning about when I was in college. There are people who say they don’t smoke, or that they only smoke when they drink. They only smoke when they’re at bars or when they have a beer in their hand or when they’re in social settings with alcohol. We’re going to explore that relationship and see if it exists. In this study, 15 male college students who smoked and who drank were asked to keep track of how much they spent on these products per week.

Slide 22

Now the SPSS output is pretty simple here. It gives us a matrix where both variables are listed down the side as rows, you’ll see alcohol and tobacco there, and then again across the top as columns, so we get some redundant information, some information we don’t really need. First, on the main diagonal, you’ll see each variable correlated with itself, so if we follow alcohol to alcohol or tobacco to tobacco, we get a correlation of one, which makes sense. A variable correlated to itself will result in a perfect correlation of one, but what we’re interested in is the other diagonal, which looks at tobacco correlated to alcohol or alcohol correlated to tobacco. We can just pick either one. Again, it’s arbitrary, but we see a correlation of 0.632, so it’s positive and relatively strong, and the significance is 0.011, which is less than 0.05, so we say there is a positive relationship. It’s fairly well represented by a line, and we say it is significant. It’s likely not a chance relationship. We should also look at the picture, so we’ll ask for a plot of this.
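
As a rough check on the numbers in this output, the p value can be approximated from r and the sample size alone. The sketch below assumes n = 15 (the 15 students in the study), uses the rounded r of 0.632, converts r to a t statistic with n − 2 degrees of freedom, and integrates the t distribution numerically with Simpson's rule. The function names are illustrative, and statistical software reports this value directly.

```python
import math

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + u * u / df) ** (-(df + 1) / 2)

def p_value_for_r(r, n, steps=2000):
    """Two-tailed p value for a Pearson r, testing against no linear relationship."""
    df = n - 2
    t = abs(r) * math.sqrt(df) / math.sqrt(1 - r * r)
    # Simpson's rule (steps must be even) for the area under the density from 0 to |t|.
    h = t / steps
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # The distribution is symmetric, so both tails together are 1 minus twice that area.
    return t, 1 - 2 * area

t_stat, p = p_value_for_r(0.632, 15)
print(round(t_stat, 2), round(p, 3))
```

Because r was rounded to three decimals, the result should land close to, but not necessarily exactly on, the 0.011 shown in the output.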

Slide 23

Here’s what our scatterplot looks like, kind of a slightly upward-tilting relationship. The points are fairly tightly clustered, but you will see one data point there, all by itself. That is certainly an outlier. I would want to check that score, look at the raw data, and see if maybe something was entered wrong. It looks like they weren’t drinking alcohol at all, so we would want to make sure. Again, we are only including people in this study who both drink and smoke. You know, if the person maybe was sick that week, or for some reason stopped drinking alcohol, that might be a reason to exclude them from the study. We might even get a stronger correlation if we eliminated that point. This just points out that we need to always ask for a scatterplot and look at the picture when dealing with correlation.
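
To show how much a single point like this can matter, here is a small sketch with invented spending figures. These are not the 15 students from the lecture; the last pair is a made-up outlier who spent on tobacco but nothing on alcohol.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical weekly spending; the final pair is the outlier.
tobacco = [2, 3, 4, 5, 6, 7, 7]
alcohol = [3, 4, 4, 6, 6, 8, 0]

r_with = pearson_r(tobacco, alcohol)
r_without = pearson_r(tobacco[:-1], alcohol[:-1])   # outlier dropped
print(round(r_with, 2), round(r_without, 2))
```

Dropping a point should always be justified by the study's inclusion criteria or a data entry error, not by the fact that it improves the correlation.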

Slide 24

If you would like to go through this presentation again, simply click the Replay button, and you will be returned to the beginning.

Assignment 10.1: Correlations Rubric
Criteria	Ratings	Pts
Screenshots	3 to >2 pts Meets Expectations Completed screenshots of the results from JASP are included and are correct. 2 to >0 pts Does Not Meet Expectations Completed screenshots are not included. Comments Just missing your Confidence interval for the two Pearson’s tables and you needed to run the interval data, not the ordinal data.	2 / 3 pts
Narrative	4 to >3 pts Meets Expectations Narrative includes an adequate interpretation of your results. Narrative includes a short report on the output, a table for each of your results, and a discussion of the relationship between the variables. 3 to >1 pts Nearly Meets Expectations Narrative includes an interpretation of your results. Narrative includes a short report on the output, a table for each of your results, and/or a discussion of the relationship between the variables. 1 to >0 pts Does Not Meet Expectations Narrative does not include an interpretation of your results. Narrative does not include a short report on the output, a table for each of your results, and a discussion of the relationship between the variables. Comments See Kim et al page 170 to see how to report the statistics properly in the narrative.	3.5 / 4 pts
Documentation and Mechanics	4 to >3 pts Meets Expectations No errors in grammar, spelling, punctuation, or sentence structure. 3 to >1 pts Nearly Meets Expectations Few errors in grammar, spelling, punctuation, or sentence structure. 1 to >0 pts Does Not Meet Expectations Numerous and distracting errors in grammar, spelling, punctuation, or sentence structure.	4 / 4 pts
APA Formatting	4 to >3 pts Meets Expectations APA formatting is followed in accordance with the 7th edition for reporting statistical results. 3 to >1 pts Nearly Meets Expectations Contains one or two APA errors in reporting statistical results. 1 to >0 pts Does Not Meet Expectations Does not follow APA 7th edition guidelines.	4 / 4 pts
Total Points: 13.5