
Indexing and Selecting Data with Pandas

Last Updated : 12 Jun, 2025

Indexing and selecting data lets us retrieve specific rows, columns or subsets of a DataFrame. Whether we're filtering rows on a condition, extracting particular columns or accessing data by label or position, these techniques are essential for working with large datasets. In this article, we'll look at the main ways to index and select data in Pandas.

1. Indexing Data using the [] Operator

The [] operator is the most basic and most frequently used way to index in Pandas. Given a column name or a list of column names, it selects columns; given a boolean condition, it filters rows.

1. Selecting a Single Column

To select a single column, we simply put the column name inside square brackets:

Python
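import pandas as pd

# Load the dataset and use the "Name" column as the row index
data = pd.read_csv("/content/nba.csv", index_col="Name")
print("Dataset")
# display() is available in Jupyter/Colab notebooks; use print() in a plain script
display(data.head(5))

# A column name inside [] returns that column as a Series
first = data["Age"]
print("\nSingle Column selected from Dataset")
display(first.head(5))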

Output:

[Image: single column selected from the dataset]
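A detail worth knowing here is the type of the result. The sketch below uses a small made-up DataFrame (the player names and values are invented stand-ins, not rows from nba.csv) to show that a string key returns a Series, while a one-element list returns a one-column DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the dataset; every value here is invented.
data = pd.DataFrame(
    {
        "Age": [25.0, 31.0, 22.0],
        "College": ["Texas", "Duke", "Duke"],
        "Salary": [7.0e6, 5.0e6, 1.2e6],
    },
    index=pd.Index(["Player A", "Player B", "Player C"], name="Name"),
)

as_series = data["Age"]    # string key: the column comes back as a Series
as_frame = data[["Age"]]   # one-element list: a DataFrame with one column

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame.shape)
```

The distinction matters when a later step expects a particular type, for example a method that exists only on DataFrame.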

2. Selecting Multiple Columns

To select multiple columns, pass a list of column names inside the [] operator:

Python
first = data[["Age", "College", "Salary"]]
print("\nMultiple Columns selected from Dataset")
display(first.head(5))  

Output:

[Image: multiple columns selected from the dataset]
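The [] operator can also filter rows when it is given a boolean condition. Here is a minimal sketch of that, assuming a small invented DataFrame in place of nba.csv (the names and numbers are placeholders):

```python
import pandas as pd

# Hypothetical stand-in for the dataset; every value here is invented.
data = pd.DataFrame(
    {
        "Age": [25.0, 31.0, 22.0],
        "College": ["Texas", "Duke", "Duke"],
    },
    index=pd.Index(["Player A", "Player B", "Player C"], name="Name"),
)

# A boolean Series inside [] keeps only the rows where it is True
older = data[data["Age"] > 24]

# Combine conditions with & (and) or | (or); each condition needs parentheses
older_duke = data[(data["Age"] > 24) & (data["College"] == "Duke")]

print(older)
print(older_duke)
```

Python's `and` and `or` keywords do not work element-wise on a Series, which is why the sketch uses `&` with parentheses around each comparison.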

2. Indexing with .loc[ ]

The .loc[] indexer is used for label-based indexing. It lets us access rows and columns by their labels. Unlike the [] operator, it can select a subset of rows and columns in a single call.

1. Selecting a Single Row by Label

We can select a single row by its label:

Python
import pandas as pd
data = pd.read_csv("/content/nba.csv", index_col="Name")

row = data.loc["Avery Bradley"]
print(row)

Output:

[Image: single row selected by label]
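To make the shape of that result concrete, here is a small sketch on invented data (the player names and values are placeholders, not taken from nba.csv). With a unique index, a single label returns a Series whose index is the column names, and a label that does not exist raises a KeyError.

```python
import pandas as pd

# Hypothetical stand-in for the dataset; every value here is invented.
data = pd.DataFrame(
    {
        "Team": ["Team X", "Team Y"],
        "Age": [25.0, 31.0],
        "Salary": [7.0e6, 5.0e6],
    },
    index=pd.Index(["Player A", "Player B"], name="Name"),
)

row = data.loc["Player A"]   # one row, returned as a Series
print(row.name)              # the row label
print(list(row.index))       # the column names
print(row["Age"])            # a single value from that row

# A label that is not in the index raises KeyError
try:
    data.loc["No Such Player"]
    found = True
except KeyError:
    found = False
print(found)
```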

2. Selecting Multiple Rows by Label

To select multiple rows, pass a list of labels:

Python
rows = data.loc[["Avery Bradley", "R.J. Hunter"]]
print(rows)


3. Selecting Specific Rows and Columns

We can select specific rows and columns by providing lists of row labels and column names:

DataFrame.loc[["row1", "row2"], ["column1", "column2", "column3"]]

Python
selection = data.loc[["Avery Bradley", "R.J. Hunter"], ["Team", "Number", "Position"]]
print(selection)


4. Selecting All Rows and Specific Columns

We can select all rows and specific columns by using a colon (:) to indicate all rows, followed by the list of column names:

DataFrame.loc[:, ["column1", "column2", "column3"]]

Python
all_rows_specific_columns = data.loc[:, ["Team", "Position", "Salary"]]
print(all_rows_specific_columns)

Output:

[Image: all rows and specific columns]
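Besides single labels and lists, .loc[] also accepts label slices and boolean conditions. The sketch below illustrates both on an invented DataFrame (names and values are placeholders). Note that a label slice includes both of its endpoints, unlike ordinary Python slicing.

```python
import pandas as pd

# Hypothetical stand-in for the dataset; every value here is invented.
data = pd.DataFrame(
    {
        "Team": ["Team X", "Team X", "Team Y", "Team Y"],
        "Number": [0, 30, 28, 12],
        "Position": ["PG", "SF", "SG", "C"],
        "Age": [25.0, 31.0, 22.0, 28.0],
        "Salary": [7.0e6, 5.0e6, 1.2e6, 3.4e6],
    },
    index=pd.Index(["Player A", "Player B", "Player C", "Player D"], name="Name"),
)

# Label slices are inclusive: rows B through C, columns Team through Position
by_slice = data.loc["Player B":"Player C", "Team":"Position"]

# A boolean condition picks the rows, a list of labels picks the columns
by_mask = data.loc[data["Age"] > 24, ["Team", "Salary"]]

print(by_slice)
print(by_mask)
```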

3. Indexing with .iloc[ ]

The .iloc[] indexer is used for position-based indexing. It lets us access rows and columns by their integer positions, counted from 0. It works like .loc[] but takes integer positions (single integers, lists or slices of integers) instead of labels.

1. Selecting a Single Row by Position

To select a single row using .iloc[], provide the integer position of the row:

Python
import pandas as pd
data = pd.read_csv("/content/nba.csv", index_col="Name")
row = data.iloc[3]
print(row)


2. Selecting Multiple Rows by Position

We can select multiple rows by passing a list of integer positions:

Python
rows = data.iloc[[3, 5, 7]]
print(rows)


3. Selecting Specific Rows and Columns by Position

We can select specific rows and columns by providing integer positions for both rows and columns:

Python
selection = data.iloc[[3, 4], [1, 2]]
print(selection)


4. Selecting All Rows and Specific Columns by Position

To select all rows and specific columns, use a colon (:) for all rows and a list of column positions:

Python
selection = data.iloc[:, [1, 2]]
print(selection)

Output:

[Image: all rows and specific columns by position]
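.iloc[] also supports integer slices and negative positions. The following sketch, again on invented placeholder data, shows that integer slices follow normal Python rules: the end position is excluded and -1 refers to the last row.

```python
import pandas as pd

# Hypothetical stand-in for the dataset; every value here is invented.
data = pd.DataFrame(
    {
        "Team": ["Team X", "Team X", "Team Y", "Team Y"],
        "Number": [0, 30, 28, 12],
        "Age": [25.0, 31.0, 22.0, 28.0],
    },
    index=pd.Index(["Player A", "Player B", "Player C", "Player D"], name="Name"),
)

first_two = data.iloc[0:2]     # positions 0 and 1; position 2 is excluded
last_row = data.iloc[-1]       # negative positions count from the end
corner = data.iloc[1:3, 0:2]   # rows 1-2 and columns 0-1

print(first_two)
print(last_row)
print(corner)
```

This is the opposite of .loc[], where a label slice includes its end label.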

4. Other Useful Indexing Methods

Pandas also provides several other methods that we may find useful for indexing and manipulating DataFrames:

1. .head(): Returns the first n rows of a DataFrame

Python
print(data.head(5))

Output:

[Image: output of .head()]

2. .tail(): Returns the last n rows of a DataFrame

Python
print(data.tail(5))

Output:

[Image: output of .tail()]
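Both .head() and .tail() default to n=5 when no argument is given, and they also accept a negative n. A short sketch on an invented seven-row DataFrame:

```python
import pandas as pd

# Hypothetical seven-row frame; the labels and numbers are invented.
data = pd.DataFrame(
    {"Number": list(range(1, 8))},
    index=[f"Player {c}" for c in "ABCDEFG"],
)

default_head = data.head()         # no argument: first 5 rows
last_two = data.tail(2)            # last 2 rows
all_but_last_two = data.head(-2)   # negative n: everything except the last 2 rows

print(len(default_head))
print(last_two)
print(all_but_last_two)
```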

3. .at[]: Access a single value for a row/column label pair

Python
value = data.at["Avery Bradley", "Age"]
print(value)

Output:

25.0
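.at[] can be used to set a single value as well as read one. The sketch below assumes a tiny invented DataFrame (placeholder names and values) and shows a read followed by an in-place update.

```python
import pandas as pd

# Hypothetical stand-in for the dataset; every value here is invented.
data = pd.DataFrame(
    {"Age": [25.0, 31.0], "Salary": [7.0e6, 5.0e6]},
    index=pd.Index(["Player A", "Player B"], name="Name"),
)

# Read one cell by row label and column label
age_before = data.at["Player A", "Age"]

# Assign to the same cell; this modifies the DataFrame in place
data.at["Player A", "Age"] = 26.0
age_after = data.loc["Player A", "Age"]   # .loc can read the same cell

print(age_before, age_after)
```

.loc[] accepts the same pair of labels; .at[] is the narrower tool, intended for exactly one cell.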

4. .query(): Query the DataFrame using a boolean expression

Python
result = data.query("Age > 25 and College == 'Duke'")
print(result)

Output:

[Image: output of .query()]
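Inside a .query() string, a Python variable can be referenced by prefixing its name with @. The sketch below uses invented placeholder data and a hypothetical `min_age` variable, and checks that the query matches the equivalent boolean-mask selection.

```python
import pandas as pd

# Hypothetical stand-in for the dataset; every value here is invented.
data = pd.DataFrame(
    {
        "Age": [25.0, 31.0, 22.0, 28.0],
        "College": ["Texas", "Duke", "Duke", "Duke"],
    },
    index=pd.Index(["Player A", "Player B", "Player C", "Player D"], name="Name"),
)

min_age = 25  # hypothetical threshold, referenced in the query as @min_age

result = data.query("Age > @min_age and College == 'Duke'")

# The same selection written with a boolean mask
same = data[(data["Age"] > min_age) & (data["College"] == "Duke")]

print(result)
print(result.equals(same))
```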

More methods for indexing in a Pandas DataFrame include:

| Function | Description |
|---|---|
| DataFrame.iat[] | Access a single value for a row/column pair by integer position. |
| DataFrame.pop() | Return item and drop from DataFrame. |
| DataFrame.xs() | Return a cross-section (row(s) or column(s)) from the DataFrame. |
| DataFrame.get() | Get item from object for given key (e.g. DataFrame column). |
| DataFrame.isin() | Return a boolean DataFrame showing whether each element is contained in values. |
| DataFrame.where() | Return an object of the same shape with entries from self where cond is True, otherwise from other. |
| DataFrame.mask() | Return an object of the same shape with entries from self where cond is False, otherwise from other. |
| DataFrame.insert() | Insert a column into DataFrame at a specified location. |
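As a rough illustration of the methods in the table, the sketch below runs each of them once on a small invented DataFrame. The names, values and the "Height" column are placeholders chosen for the example.

```python
import pandas as pd

# Hypothetical stand-in for the dataset; every value here is invented.
data = pd.DataFrame(
    {
        "Team": ["Team X", "Team Y", "Team X"],
        "Age": [25, 31, 22],
        "Salary": [7.0, 5.0, 1.2],
    },
    index=["Player A", "Player B", "Player C"],
)

cell = data.iat[0, 1]                         # row position 0, column position 1
xs_row = data.xs("Player B")                  # cross-section: one row as a Series
missing = data.get("Height", "no column")     # default returned if the column is absent
in_team_x = data.isin({"Team": ["Team X"]})   # boolean DataFrame, checked per column

ages = data[["Age"]]
kept = ages.where(ages > 24)     # keeps values where the condition is True, NaN elsewhere
hidden = ages.mask(ages > 24)    # the reverse: NaN where the condition is True

data.insert(1, "Number", [0, 30, 28])   # adds a column at position 1, in place
salary = data.pop("Salary")             # removes the column and returns it

print(cell, missing)
print(list(data.columns))
```

Note that .insert() and .pop() change the DataFrame in place, while the other calls return new objects.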

With these indexing methods, we can navigate and manipulate data in Pandas efficiently, which makes day-to-day data analysis simpler.

