How to Exclude Columns in Pandas?
Excluding columns in a Pandas DataFrame is a common operation when you want to work with only relevant data. In this article, we will discuss various methods to exclude columns from a DataFrame, including using .loc[], .drop(), and other techniques.
Exclude One Column using .loc[]
We can exclude a column by its label using the .loc[] indexer. The code below compares every column name with 'cost', which produces a boolean mask, and keeps only the columns where the mask is True.
import pandas as pd
d = pd.DataFrame({'food_id': [1, 2, 3, 4],
                  'name': ['idly', 'dosa', 'poori', 'chapathi'],
                  'city': ['delhi', 'goa', 'hyd', 'chennai'],
                  'cost': [12, 34, 21, 23]})
# Exclude the 'cost' column
ex = d.loc[:, d.columns != 'cost']
print("DataFrame after excluding 'cost' column using .loc[]")
print(ex)
Output:
DataFrame after excluding 'cost' column using .loc[]
   food_id      name     city
0        1      idly    delhi
1        2      dosa      goa
2        3     poori      hyd
3        4  chapathi  chennai
If we need to exclude multiple columns, we can use the .isin() method on the column index and invert the result with ~.
# Exclude the 'name' and 'food_id' columns
print(d.loc[:, ~d.columns.isin(['name', 'food_id'])])
Output
      city  cost
0    delhi    12
1      goa    34
2      hyd    21
3  chennai    23
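To make the masking step easier to follow, here is a minimal sketch that prints the boolean mask itself before passing it to .loc[]. It rebuilds the same DataFrame so that it runs on its own; 'price' is a hypothetical column name used only to show what happens when a name is absent.

```python
import pandas as pd

d = pd.DataFrame({'food_id': [1, 2, 3, 4],
                  'name': ['idly', 'dosa', 'poori', 'chapathi'],
                  'city': ['delhi', 'goa', 'hyd', 'chennai'],
                  'cost': [12, 34, 21, 23]})

# isin() yields one boolean per column label; ~ flips it,
# so True marks the columns we want to keep
mask = ~d.columns.isin(['name', 'food_id'])
print(mask.tolist())  # [False, False, True, True]

kept = d.loc[:, mask]
print(list(kept.columns))  # ['city', 'cost']

# A name that is not a column ('price' is hypothetical) never matches,
# so every column is kept and no error is raised
same = d.loc[:, ~d.columns.isin(['price'])]
print(list(same.columns))
```

Because a missing name is silently ignored here, this approach suits cases where the list of columns to exclude may not fully match the DataFrame.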
Other methods to exclude columns in Pandas, apart from using DataFrame.loc[], are discussed below:
Excluding Columns Using .remove()
.remove() is a method of Python lists, not of Pandas. We can still use it by copying the column names into a list, removing the unwanted name, and selecting the remaining columns from the DataFrame.
# Convert columns to a list and remove the column 'cost'
columns = list(d.columns)
columns.remove('cost')
# Create a new DataFrame with the remaining columns
ex = d[columns]
print(ex)
Output
   food_id      name     city
0        1      idly    delhi
1        2      dosa      goa
2        3     poori      hyd
3        4  chapathi  chennai
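One thing to keep in mind with this approach is that list.remove() raises a ValueError when the name is not in the list. The sketch below shows that behaviour and one possible way to guard against it; 'price' is a hypothetical column name that does not exist in the DataFrame.

```python
import pandas as pd

d = pd.DataFrame({'food_id': [1, 2, 3, 4],
                  'name': ['idly', 'dosa', 'poori', 'chapathi'],
                  'city': ['delhi', 'goa', 'hyd', 'chennai'],
                  'cost': [12, 34, 21, 23]})

columns = list(d.columns)

# Removing a name that is not in the list raises ValueError
try:
    columns.remove('price')  # 'price' is hypothetical and absent
except ValueError:
    print("'price' is not a column, nothing was removed")

# Checking membership first avoids the exception
to_exclude = 'cost'
if to_exclude in columns:
    columns.remove(to_exclude)

ex = d[columns]
print(list(ex.columns))  # ['food_id', 'name', 'city']
```

The original DataFrame d is left unchanged, since only the copied list of names is modified.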
Excluding Columns by Name Using drop()
The drop() method is one of the most common ways to exclude specific columns by name. By default it returns a new DataFrame and leaves the original unchanged:
# Exclude the 'cost' column
ex = d.drop(columns=['cost'])
print(ex)
Output
   food_id      name     city
0        1      idly    delhi
1        2      dosa      goa
2        3     poori      hyd
3        4  chapathi  chennai
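As a further illustration, drop() also accepts several names at once, and its errors parameter controls what happens when a name is missing. This is a small sketch rather than a complete tour of the method; 'price' is a hypothetical column name that is not in the DataFrame.

```python
import pandas as pd

d = pd.DataFrame({'food_id': [1, 2, 3, 4],
                  'name': ['idly', 'dosa', 'poori', 'chapathi'],
                  'city': ['delhi', 'goa', 'hyd', 'chennai'],
                  'cost': [12, 34, 21, 23]})

# Drop more than one column in a single call
ex_multi = d.drop(columns=['cost', 'city'])
print(list(ex_multi.columns))  # ['food_id', 'name']

# By default a missing label raises KeyError;
# errors='ignore' skips labels that are not present
ex_safe = d.drop(columns=['cost', 'price'], errors='ignore')
print(list(ex_safe.columns))  # ['food_id', 'name', 'city']

# d itself still has all four columns
print(list(d.columns))
```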
Exclude Columns by Data Type Using select_dtypes()
We can exclude columns based on their data type using the select_dtypes() method.
# Exclude all numeric columns
ex = d.select_dtypes(exclude=['number'])
print(ex)
Output
       name     city
0      idly    delhi
1      dosa      goa
2     poori      hyd
3  chapathi  chennai
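The exclude argument can also take more than one data type. The sketch below assumes an extra boolean column, 'veg', which is hypothetical and added only for this illustration, to show that 'number' does not cover boolean columns and that 'bool' has to be listed separately.

```python
import pandas as pd

d = pd.DataFrame({'food_id': [1, 2, 3, 4],
                  'name': ['idly', 'dosa', 'poori', 'chapathi'],
                  'city': ['delhi', 'goa', 'hyd', 'chennai'],
                  'cost': [12, 34, 21, 23]})

# 'veg' is a hypothetical boolean column, added only for this example
d2 = d.assign(veg=[True, True, True, True])

# Excluding 'number' removes the integer columns but keeps the boolean one
ex_num = d2.select_dtypes(exclude=['number'])
print(list(ex_num.columns))  # ['name', 'city', 'veg']

# Listing 'bool' as well removes the boolean column too
ex_both = d2.select_dtypes(exclude=['number', 'bool'])
print(list(ex_both.columns))  # ['name', 'city']
```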
Excluding Columns Dynamically Using List Comprehensions
If we need to exclude columns dynamically based on a condition, we can use list comprehensions. For example, to exclude columns whose names start with a specific letter:
# Exclude columns whose names start with 'c'
ex = d[[col for col in d.columns if not col.startswith('c')]]
print(ex)
Output
   food_id      name
0        1      idly
1        2      dosa
2        3     poori
3        4  chapathi
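If the same kind of condition is needed in several places, the list comprehension can be wrapped in a small helper. The function below, exclude_where, is a hypothetical name introduced here for illustration and is not part of Pandas; it is one possible sketch of the idea.

```python
import pandas as pd

d = pd.DataFrame({'food_id': [1, 2, 3, 4],
                  'name': ['idly', 'dosa', 'poori', 'chapathi'],
                  'city': ['delhi', 'goa', 'hyd', 'chennai'],
                  'cost': [12, 34, 21, 23]})

def exclude_where(df, predicate):
    """Return df without the columns whose name satisfies predicate.

    Hypothetical helper for this article, not a Pandas function.
    """
    return df[[col for col in df.columns if not predicate(col)]]

# Exclude columns whose names start with 'c'
no_c = exclude_where(d, lambda col: col.startswith('c'))
print(list(no_c.columns))  # ['food_id', 'name']

# Exclude columns whose names end with '_id'
no_ids = exclude_where(d, lambda col: col.endswith('_id'))
print(list(no_ids.columns))  # ['name', 'city', 'cost']
```

Passing the condition as a function keeps the selection logic in one place while the rule itself can change from call to call.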
These techniques make column exclusion in Pandas flexible and straightforward, helping us focus on the data we need.