

The third line will import the pyplot from matplotlib - also, we will refer to it as plt.Īnd %matplotlib inline sets your environment so you can directly plot charts into your Jupyter Notebook! The first two lines will import pandas and numpy. And you’ll also have to make a small tweak in your Jupyter environment. Just as we have done in the histogram article, as a first step, you’ll have to import the libraries you’ll use. Step #1: Import pandas, numpy and matplotlib! Note: By the way, I prefer the matplotlib solution because I find it a bit more transparent.

The two solutions are fairly similar, the whole process is ~90% the same… The only difference is in the last few lines of code. Scatter plot in pandas and matplotlibĪs I mentioned before, I’ll show you two ways to create your scatter plot.
Pandas plot scatter how to#
It’s time to see how to create one in Python! Okay, I hope I set your expectations about scatter plots high enough. But in the remaining 1%, you might find gold! Well, in 99% of cases it will turn out to be either a triviality, or a coincidence. There are always exceptions and outliers!)īut it’s also possible that you’ll get a negative correlation:Īnd in real-life data science projects, you’ll see no correlation often, too:Īnyway: if you see a sign of positive or negative correlation between two variables in a data science project, that’s a good indicator that you found something interesting - something that’s worth digging deeper into. (Of course, this is a generalization of the data set. The greater is the height value, the greater is the expected weight value, too. This above is called a positive correlation.
Pandas plot scatter code#
Note: this article is not about regression machine learning models, but if you want to get started with that, go here: Linear Regression in Python using numpy + polyfit (with code base) regression line) to this data set and try to describe this relationship with a mathematical formula. Looking at the chart above, you can immediately tell that there’s a strong correlation between weight and height, right? As we discussed in my linear regression article, you can even fit a trend line (a.k.a. Scatter plots play an important role in data science – especially in building/prototyping machine learning models. So, for instance, this person’s (highlighted with red) weight and height is 66.5 kg and 169 cm.
