One way to describe the association between two variables is to assume that the value of one variable is a linear function of the value of the other. If this relationship is perfect, it can be described by the slope-intercept equation for a straight line, Y = a + bX, where a is the intercept and b is the slope. Even if the relationship is not perfect, one may still be able to describe it as an imperfect linear relationship.
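As a minimal sketch of the idea, the following Python fragment fits Y = a + bX by ordinary least squares to one perfectly linear data set and one imperfectly linear data set. The function name `fit_line` and all data values are made up for illustration, and least squares is assumed here as the fitting method.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares, both about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


xs = [0, 1, 2, 3, 4]
perfect_ys = [2 + 3 * x for x in xs]    # exactly Y = 2 + 3X
noisy_ys = [2.5, 4.6, 8.3, 10.9, 14.2]  # hypothetical, roughly linear

a_perfect, b_perfect = fit_line(xs, perfect_ys)
a_noisy, b_noisy = fit_line(xs, noisy_ys)

# Residuals: how far each observed Y falls from the fitted line
perfect_residuals = [y - (a_perfect + b_perfect * x) for x, y in zip(xs, perfect_ys)]
noisy_residuals = [y - (a_noisy + b_noisy * x) for x, y in zip(xs, noisy_ys)]

print("perfect:", a_perfect, b_perfect, perfect_residuals)
print("noisy:  ", a_noisy, b_noisy, noisy_residuals)
```

With the perfect data every residual is zero. With the noisy data the line still summarizes the relationship, but the residuals are nonzero.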
Correlation and regression are very closely related topics. Technically, if the X variable (often called the “independent” variable, even in nonexperimental research) is fixed, that is, if it includes all of the values of X to which the researcher wants to generalize the results, and the probability distribution of the values of X matches that in the population of interest, then the analysis is a regression analysis. If both the X and the Y variable (often called the “dependent” variable, even in nonexperimental research) are random, free to vary (were the research repeated, different values and sample probability distributions of X and Y would be obtained), then the analysis is a correlation analysis. For example, suppose I decide to study the relationship between dose of alcohol (X) and reaction time (Y). If I arbitrarily decide to use as values of X doses of 0, 1, 2, and 3 ounces of 190 proof grain alcohol, restrict X to those values, and have equal numbers of subjects at each level of X, then I’ve fixed X and should do a regression analysis. If I allow X to vary “randomly,” for example, by recruiting subjects from a local bar, measuring their blood alcohol (X), and then testing their reaction time, then a correlation analysis is appropriate.
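The fixed-versus-random distinction can be illustrated with a small simulation, offered only as a sketch. Everything numeric here is an assumption made for illustration: the linear model for reaction time (250 ms baseline, 40 ms per unit of X, normal noise), the sample sizes, and the uniform distribution used for the random X values. X is in arbitrary units in both designs.

```python
import random

DOSES = [0, 1, 2, 3]  # the fixed levels of X chosen by the researcher


def simulate_reaction_time(x, rng):
    # Hypothetical model, not taken from any real data
    return 250 + 40 * x + rng.gauss(0, 30)


def fixed_x_study(rng, n_per_dose=5):
    """X is fixed: the same doses, equally often, in every replication."""
    xs = [dose for dose in DOSES for _ in range(n_per_dose)]
    ys = [simulate_reaction_time(x, rng) for x in xs]
    return xs, ys


def random_x_study(rng, n=20):
    """X is random: each subject arrives with whatever X they happen to have."""
    xs = [rng.uniform(0, 3) for _ in range(n)]
    ys = [simulate_reaction_time(x, rng) for x in xs]
    return xs, ys


rng = random.Random(0)  # seeded so the sketch is reproducible
fixed_first, fixed_second = fixed_x_study(rng), fixed_x_study(rng)
random_first, random_second = random_x_study(rng), random_x_study(rng)

# Repeating the fixed-X study reproduces the X values exactly; only Y changes.
print(fixed_first[0] == fixed_second[0])    # True
# Repeating the random-X study yields new X values as well as new Y values.
print(random_first[0] == random_second[0])  # False
```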
In practice, when one is using linear models to develop a way to predict Y given X, the typical behavioral researcher is likely to say she is doing regression analysis. If she is using linear models to measure the degree of association between X and Y, she is likely to say she is doing correlation analysis.
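The two uses draw on the same sums of squares and cross-products, as the following sketch suggests. It computes the least-squares line (for prediction) and Pearson's r (for degree of association) from one hypothetical data set; the function name `summarize` and the data values are made up for illustration.

```python
import math


def summarize(xs, ys):
    """Return intercept a, slope b, Pearson r, and the SDs of X and Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                   # regression slope, used for prediction
    a = mean_y - b * mean_x         # regression intercept
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation, degree of association
    sd_x = math.sqrt(sxx / (n - 1))
    sd_y = math.sqrt(syy / (n - 1))
    return {"a": a, "b": b, "r": r, "sd_x": sd_x, "sd_y": sd_y}


xs = [0, 1, 2, 3, 4]
ys = [2.5, 4.6, 8.3, 10.9, 14.2]  # hypothetical values

stats = summarize(xs, ys)

# "Regression" use: predict Y for a given X
predicted_y = stats["a"] + stats["b"] * 2.5

# "Correlation" use: report the strength of the linear association
print("r =", stats["r"])
print("predicted Y at X = 2.5:", predicted_y)

# The slope is the correlation rescaled by the two standard deviations
slope_from_r = stats["r"] * stats["sd_y"] / stats["sd_x"]
print("b =", stats["b"], "and r * sd_y / sd_x =", slope_from_r)
```

The last two printed values agree apart from floating-point rounding, which is one way to see how closely the two analyses are related.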