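Fill NaN Values in Pandas
This tutorial explains how to use the DataFrame.fillna() method to fill NaN values with a specified value.
We will use the following DataFrame throughout this article.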
import numpy as np
import pandas as pd

# Sample data; np.nan marks the missing entries.
student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)
print(student_df)
Output:
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      NaN       Bob           NaN   NaN
3    504.0      Emma          30.0  16.0
4    505.0      Luna           NaN  18.0
5    506.0     Anish           NaN   NaN
DataFrame.fillna() Method
Syntax
DataFrame.fillna(
    value=None, method=None, axis=None, inplace=False, limit=None, downcast=None
)
The DataFrame.fillna() method lets us fill the NaN values in a DataFrame with a specified value or by a specified fill method.
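As a rough illustration of two of the parameters in the signature, the sketch below shows limit and inplace on a small made-up DataFrame named scores, which is hypothetical and not part of this article's example data. It assumes a fill value is passed rather than a fill method.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric-only frame, used only for this illustration.
scores = pd.DataFrame(
    {
        "Math": [90, np.nan, np.nan, 70],
        "Art": [np.nan, 85, np.nan, np.nan],
    }
)

# With a fill value, limit=1 fills at most one NaN per column.
limited = scores.fillna(0, limit=1)
print(limited)

# inplace=True modifies the calling DataFrame instead of returning a new one.
scores_copy = scores.copy()
scores_copy.fillna(0, inplace=True)
print(scores_copy)
```

Here limited still contains NaN values beyond the first one in each column, while scores_copy has none left and the original scores is untouched.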
Fill the Entire DataFrame With a Specified Value Using DataFrame.fillna()
import numpy as np
import pandas as pd

student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)

# Replace every NaN in the DataFrame with 0; the original is left unchanged.
filled_df = student_df.fillna(0)

print("DataFrame with NaN values")
print(student_df, "\n")
print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")
Output:
DataFrame with NaN values
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      NaN       Bob           NaN   NaN
3    504.0      Emma          30.0  16.0
4    505.0      Luna           NaN  18.0
5    506.0     Anish           NaN   NaN

After applying fillna() to the DataFrame:
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      0.0       Bob           0.0   0.0
3    504.0      Emma          30.0  16.0
4    505.0      Luna           0.0  18.0
5    506.0     Anish           0.0   0.0
It replaces all NaN values in student_df with 0, the value passed as the argument to DataFrame.fillna().
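To check the effect of the fill, one option is to count the NaN values before and after. The sketch below uses a numeric subset of the student data and also shows, as an optional extra step that the example above does not perform, casting the filled columns back to integers.

```python
import numpy as np
import pandas as pd

# Numeric subset of the article's data, kept small for the illustration.
student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)

filled_df = student_df.fillna(0)

# fillna() returns a new DataFrame by default; the original keeps its NaN values.
nan_before = int(student_df.isna().sum().sum())
nan_after = int(filled_df.isna().sum().sum())
print(nan_before, nan_after)

# NaN forces a float dtype; once filled, the columns can be cast to int.
int_df = filled_df.astype({"Roll No": "int64", "Age": "int64"})
print(int_df.dtypes)
```

Whether 0 is a sensible placeholder depends on the column: a Roll No of 0 is only a marker for "missing", not a real roll number.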
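Instead of a fixed value, we can pass a fill method. With method="ffill", each NaN is replaced by the last valid value above it in the same column.
import numpy as np
import pandas as pd

student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)

# Forward fill: propagate the last valid value down each column.
# Note: the method argument is deprecated since pandas 2.1;
# student_df.ffill() is the equivalent call.
filled_df = student_df.fillna(method="ffill")

print("DataFrame with NaN values")
print(student_df, "\n")
print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")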
Output:
DataFrame with NaN values
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      NaN       Bob           NaN   NaN
3    504.0      Emma          30.0  16.0
4    505.0      Luna           NaN  18.0
5    506.0     Anish           NaN   NaN

After applying fillna() to the DataFrame:
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2    502.0       Bob         400.0  18.0
3    504.0      Emma          30.0  16.0
4    505.0      Luna          30.0  18.0
5    506.0     Anish          30.0  18.0
It fills every NaN value in student_df with the last valid value that precedes it in the same column.
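Because the method argument is deprecated in newer pandas releases, the same forward fill can also be written with DataFrame.ffill(), and DataFrame.bfill() fills in the opposite direction. The following is a minimal sketch on a numeric subset of the student data.

```python
import numpy as np
import pandas as pd

# Numeric subset of the article's data.
student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)

# Forward fill: each NaN takes the last valid value above it.
forward = student_df.ffill()
print(forward)

# Backward fill: each NaN takes the next valid value below it.
# Trailing NaN values stay NaN because nothing follows them.
backward = student_df.bfill()
print(backward)
```

With a backward fill, the last two Income(in $) entries and the last Age entry remain NaN, since no valid value comes after them.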
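Fill NaN Values in Specified Columns With Specified Values
To fill the NaN values of specific columns with specific values, we pass a dictionary to the fillna() method, with the column names as keys and the replacement for that column's NaN values as values.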
import numpy as np
import pandas as pd

student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Income(in $)": [200, 400, np.nan, 300, np.nan, np.nan],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)

# Columns not listed in the dictionary (here "Roll No") are left untouched.
filled_df = student_df.fillna({"Age": 17, "Income(in $)": 300})

print("DataFrame with NaN values")
print(student_df, "\n")
print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")
Output:
DataFrame with NaN values
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      NaN       Bob           NaN   NaN
3    504.0      Emma         300.0  16.0
4    505.0      Luna           NaN  18.0
5    506.0     Anish           NaN   NaN

After applying fillna() to the DataFrame:
   Roll No      Name  Income(in $)   Age
0    501.0  Jennifer         200.0  17.0
1    502.0    Travis         400.0  18.0
2      NaN       Bob         300.0  17.0
3    504.0      Emma         300.0  16.0
4    505.0      Luna         300.0  18.0
5    506.0     Anish         300.0  17.0
It fills all NaN values in the Age column with 17 and all NaN values in the Income(in $) column with 300. The NaN value in the Roll No column remains unchanged.
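The dictionary values do not have to be hard-coded. As one possible variation, the sketch below computes a per-column statistic and passes it in the same way. The choice of median for Age and mean for Income(in $) is an assumption made for illustration, not a recommendation from this tutorial.

```python
import numpy as np
import pandas as pd

student_df = pd.DataFrame(
    {
        "Roll No": [501, 502, np.nan, 504, 505, 506],
        "Income(in $)": [200, 400, np.nan, 300, np.nan, np.nan],
        "Age": [17, 18, np.nan, 16, 18, np.nan],
    }
)

# Series.median() and Series.mean() skip NaN values by default.
fill_values = {
    "Age": student_df["Age"].median(),
    "Income(in $)": student_df["Income(in $)"].mean(),
}

# "Roll No" is not in the dictionary, so its NaN is left as-is.
filled_df = student_df.fillna(fill_values)
print(fill_values)
print(filled_df)
```

Here the Age gaps are filled with 17.5 and the Income(in $) gaps with 300.0, while the missing Roll No stays NaN.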