Plotting a Pandas Series
This article explores how to plot a Pandas Series, and the columns of a DataFrame, using Pandas.
Whether you are exploring a dataset to hone your skills or aiming to make a good presentation for your company’s performance analysis, visualization plays an important role.
Python's Pandas library provides various options through its .plot() function to transform our data into presentable forms.
Even an amateur Python developer can easily figure out how to use the library after understanding the steps and following the correct procedures to generate valuable insights.
However, to do this, we first need to understand what the library does and how it helps analysts provide value to their companies.
Various Plots in Pandas
Let's start this tutorial by understanding how many different graphs there are.
- line: Line graph (this is the default graph)
- bar: Bar chart parallel to the Y-axis (vertical)
- barh: Bar chart parallel to the X-axis (horizontal)
- hist: Histogram
- box: Box plot
- kde: Kernel density estimation plot
- density: Same as kde
- area: Area chart
- pie: Pie chart
Pandas uses the plot() method for visualization. Additionally, the pyplot module of the Matplotlib library can be used for illustrations.
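As a minimal sketch of how these names are used, the kind argument of plot() accepts the strings listed above, and each kind is also available as a method such as plot.bar(). The data and variable names below are made up for illustration, and the Agg backend line is only an assumption so that the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run; remove to display windows with plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data, used only to illustrate the API.
s = pd.Series([3, 1, 4, 1, 5], index=list("abcde"), name="demo")

# One subplot per plot kind; ax= tells Pandas which Axes to draw on.
kinds = ["line", "bar", "barh", "area"]
fig, axes = plt.subplots(1, len(kinds), figsize=(12, 3))
for ax, kind in zip(axes, kinds):
    s.plot(kind=kind, ax=ax, title=kind)

# s.plot.bar() is shorthand for s.plot(kind="bar"); both return a Matplotlib Axes.
fig2, ax2 = plt.subplots()
returned = s.plot.bar(ax=ax2)
```

Because plot() returns a Matplotlib Axes, the usual pyplot calls (titles, labels, savefig) can be applied to the result afterwards.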
This tutorial covers important plot types and how to use them effectively.
Plotting a Bar Chart from a Pandas Series
As the name suggests, Series plots are useful when the data is in the form of a Series, that is, a one-dimensional sequence of values with an index. The values should be comparable with one another (the same quantity in the same units); otherwise, placing them side by side does not allow a meaningful comparison.
Below is an example of drawing a basic bar chart based on dummy data given in dictionary form. We can use a CSV file based on real data or we can use custom created dummy data to explore various options for development and research.
import pandas as pd
import matplotlib.pyplot as plt
s = pd.Series(
{
16976: 2,
1: 39,
2: 49,
3: 187,
4: 159,
5: 158,
16947: 14,
16977: 1,
16948: 7,
16978: 1,
16980: 1,
},
name="article_id",
)
print(s)
# Name: article_id, dtype: int64
s.plot.bar()
plt.show()
The above code gives this output.
As we can see, a bar graph is displayed to aid in comparison and analysis.
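A bar chart is often easier to compare when the bars are ordered and the axes are labeled. The following is a small sketch of that idea using part of the dummy data above; the axis label texts are placeholders chosen for illustration, and the Agg backend line is only an assumption for running without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run; remove to display windows with plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# A subset of the dummy data from the example above.
s = pd.Series(
    {16976: 2, 1: 39, 2: 49, 3: 187, 4: 159, 5: 158, 16947: 14},
    name="article_id",
)

# Sort by value so the tallest bar comes first.
ordered = s.sort_values(ascending=False)

fig, ax = plt.subplots()
ordered.plot.bar(ax=ax, rot=45)  # rot rotates the tick labels
ax.set_xlabel("index label")     # placeholder label text
ax.set_ylabel("value")           # placeholder label text
```

Sorting with sort_index() instead keeps the bars in index order, which may be preferable when the index itself is meaningful.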
Plotting a Line Chart of a Pandas Series
Let's consider one more example where our purpose is to draw a line chart based on given dummy data. Here, we do not need to add any extra arguments; calling plot() on its own is enough, because the line chart is the default.
# using Series.plot() method
s = pd.Series([0.1, 0.4, 0.16, 0.3, 0.9, 0.81])
s.plot()
plt.show()
The above code gives this output.
It is also possible to plot a graph with multiple variables on the Y-axis as shown below. Including multiple variables in a single graph makes it more illustrative and feasible to compare elements belonging to the same category.
For example, if a graph of students' scores in a particular exam is created, it will help the professor analyze the performance of each student in a specific time interval.
import numpy as np
ts = pd.Series(np.random.randn(1000), index=pd.date_range("1/1/2000", periods=1000))
df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index, columns=list("ABCD"))
df = df.cumsum()
plt.figure()
df.plot()
plt.show()
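To make the students' scores scenario concrete, here is a small sketch with a hand-written DataFrame. The student names, exam labels, and scores are all hypothetical, and the Agg backend line is only an assumption for running without a display. Each column becomes one line on the shared Y-axis.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run; remove to display windows with plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical scores: one column per student, one row per exam.
scores = pd.DataFrame(
    {
        "Alice": [62, 70, 75, 81],
        "Bob": [55, 58, 66, 64],
        "Chen": [90, 88, 93, 95],
    },
    index=["Exam 1", "Exam 2", "Exam 3", "Exam 4"],
)

# DataFrame.plot() draws one line per column and adds a legend by default.
ax = scores.plot(marker="o", title="Scores per exam")
ax.set_ylabel("score")
```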
Plotting a Boxplot of a Pandas Series
The plot() method allows for other plotting styles besides the default line plot. We can provide the kind argument to the plot function.
We can draw a box plot to illustrate the distribution of values within each column by calling Series.plot.box() or DataFrame.plot.box(). A box plot tells us a lot about the data, such as the median.
We can also find out the first, second, and third quartiles by looking at the box plot; the second quartile is the median.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.DataFrame(np.random.rand(10, 5), columns=["A", "B", "C", "D", "E"])
df.plot.box()
plt.show()
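The quartiles that a box plot displays can also be computed directly, which helps when reading the chart. The sketch below assumes a seeded random generator so that the numbers are repeatable; the seed and variable names are illustrative choices, and the Agg backend line is only an assumption for running without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run; remove to display windows with plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded so the numbers are repeatable
df = pd.DataFrame(rng.random((10, 5)), columns=["A", "B", "C", "D", "E"])

# First, second (median), and third quartile of every column.
quartiles = df.quantile([0.25, 0.5, 0.75])
print(quartiles)

# The box edges and the middle line of each box should sit at these values.
ax = df.plot.box()
```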
Additionally, the box plot can be colored by passing the color keyword. It accepts a dict with the keys boxes, whiskers, medians, and caps, and the colors are handed to matplotlib for the corresponding parts of every box.
A colored box plot like this takes just one line of code.
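A minimal sketch of the color keyword follows. The specific color names are arbitrary choices for illustration, and the Agg backend line is only an assumption for running without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run; remove to display windows with plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5), columns=["A", "B", "C", "D", "E"])

# One color per box plot element; the color names are arbitrary.
color = {
    "boxes": "DarkGreen",
    "whiskers": "DarkOrange",
    "medians": "DarkBlue",
    "caps": "Gray",
}

# sym sets the marker used for outlier points.
ax = df.plot.box(color=color, sym="r+")
```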
Plotting a Hexagonal Bin Chart in Pandas
Next, we will learn how to plot hexagonal bin plots and autocorrelation plots.
Hexagonal bin plots are created using the DataFrame.plot.hexbin() syntax. These are a good alternative to scatter plots if your data is too dense to plot each point clearly.
A very important keyword here is gridsize, as it controls the number of hexagons along the horizontal direction. A larger gridsize means more, smaller bins.
Here is a code snippet based on random data.
df = pd.DataFrame(np.random.randn(1000, 2), columns=["a", "b"])
df["b"] = df["b"] + np.arange(1000)
df["z"] = np.random.uniform(0, 3, 1000)
df.plot.hexbin(x="a", y="b", C="z", reduce_C_function=np.max, gridsize=25)
plt.show()
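To see the effect of gridsize, the sketch below draws the same random data twice and counts the hexagons in each plot. The seed and the two gridsize values are arbitrary, the Agg backend line is only an assumption for running without a display, and reading the count from the first collection on the Axes is an assumption about how Matplotlib stores the hexagons.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run; remove to display windows with plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((1000, 2)), columns=["a", "b"])

counts = {}
for gridsize in (10, 30):
    ax = df.plot.hexbin(x="a", y="b", gridsize=gridsize)
    ax.set_title(f"gridsize={gridsize}")
    # The hexagons live in the first collection on the Axes;
    # its array holds one value per drawn hexagon.
    counts[gridsize] = len(ax.collections[0].get_array())

print(counts)  # the larger gridsize produces more hexagons
```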
The above code gives this output.
For more information on hexagonal bin plots, see the hexbin method in the official Pandas documentation.
Plotting the Autocorrelation of a Pandas Series
We end this tutorial with the most complex type of plot: the autocorrelation plot. This plot is often used in time series analysis, for example as a check on the data before a forecasting model is fitted.
It is used to describe whether the elements in a time series are positively correlated, negatively correlated, or not dependent on each other. We can find the value of the autocorrelation function (ACF) on the Y-axis, ranging from -1 to 1.
It helps to check for randomness in the time series. We obtain the data by calculating the autocorrelation at different time lags.
The lines parallel to the x-axis correspond to the 95% and 99% confidence bands. The dashed lines are the 99% confidence bands.
Let's see how to create this graph.
from pandas.plotting import autocorrelation_plot
plt.figure()
spacing = np.linspace(-9 * np.pi, 9 * np.pi, num=1000)
data = pd.Series(0.7 * np.random.rand(1000) + 0.3 * np.sin(spacing))
autocorrelation_plot(data)
plt.show()
If the time series is random, the autocorrelations should be near zero for all time lags. If the time series is non-random, one or more of the autocorrelations will be significantly non-zero.
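The same rule can be checked numerically with Series.autocorr(), which returns the correlation between a series and a lagged copy of itself. The sketch below compares pure noise with a sine wave; the seed, the lags, and the variable names are illustrative assumptions, and 1.96 / sqrt(n) is the usual approximation for a 95% band under randomness. Since autocorr() is a Pearson correlation against the shifted series, its values may differ slightly from those drawn by autocorrelation_plot.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000

noise = pd.Series(rng.standard_normal(n))                          # random series
wave = pd.Series(np.sin(np.linspace(-9 * np.pi, 9 * np.pi, num=n)))  # non-random series

# Approximate 95% confidence band for a random series of length n.
band_95 = 1.96 / np.sqrt(n)
print("95% band: +/-", round(band_95, 3))

for lag in (1, 10, 50):
    print(lag, round(noise.autocorr(lag=lag), 3), round(wave.autocorr(lag=lag), 3))
```

The noise column should stay close to zero, while the sine wave column stays far outside the band at these lags.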