How to Reference the Next Row in a Pandas DataFrame

Last Updated : 04 Dec, 2024

To reference the next row in a Pandas DataFrame, you can use the .shift() method. This method shifts the data by a specified number of periods (rows), so that each row can be aligned with the value from a previous or later row in the same column. It is useful for comparing consecutive rows or calculating differences between rows.

Method 1: Using the `shift()` Method

By shifting a column's values by a specified number of periods (-1 for the next row), shift() lets you align data from different rows. Rows that have no counterpart, such as the last row when shifting by -1, are filled with NaN.

Python

import pandas as pd

data = {'X': [10, 20, 30], 'Y': [5, 15, 25]}
df = pd.DataFrame(data)

# Create a new column referencing the next row
df['X_next'] = df['X'].shift(-1)
print(df)

Output:

    X   Y  X_next
0  10   5    20.0
1  20  15    30.0
2  30  25     NaN

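shift() also accepts other values:

shift(1): Shifts the values down by one row, so each row holds the previous row's value.
shift(n): Shifts by any number of rows. A positive n moves values down (referencing earlier rows) and a negative n moves them up (referencing later rows).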

Example 1: Let us consider a DataFrame and use shift(-4) and shift(2).

Python

import pandas as pd

df = pd.DataFrame({'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'],
                   'Sales': [200, 220, 250, 300, 280, 310]})
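
# Negative shift: each row gets the Sales value from 4 rows below
df['Shifted_Sales'] = df['Sales'].shift(-4)
print(df)

# Positive shift: each row gets the Sales value from 2 rows above
df['Shifted_Sales'] = df['Sales'].shift(2)
print(df)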

Output:

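  Month  Sales  Shifted_Sales
0   Jan    200          280.0
1   Feb    220          310.0
2   Mar    250            NaN
3   Apr    300            NaN
4   May    280            NaN
5   Jun    310            NaN

  Month  Sales  Shifted_Sales
0   Jan    200            NaN
1   Feb    220            NaN
2   Mar    250          200.0
3   Apr    300          220.0
4   May    280          250.0
5   Jun    310          300.0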
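A common reason to reference the next row is to compute the change between consecutive rows. The following is a minimal sketch that reuses the Sales data from Example 1. The column names Next_Sales and Change_To_Next are illustrative choices and are not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'],
                   'Sales': [200, 220, 250, 300, 280, 310]})

# Next month's sales aligned with the current row (the last row has no next row)
df['Next_Sales'] = df['Sales'].shift(-1)

# Change from the current month to the next month (illustrative column name)
df['Change_To_Next'] = df['Next_Sales'] - df['Sales']

print(df)
```

The last row of both new columns is NaN because there is no row after Jun to compare against.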

Method 2: iloc for Row Access

iloc provides integer position-based indexing. By slicing a column with iloc and padding the end, we can reproduce what shift(-1) does. Consider the following example.

Python

import pandas as pd

# Sample DataFrame
df = pd.DataFrame({
    'A': [10, 20, 30, 40]
})

# Create a new column by referencing the next row using iloc
df['A_next_iloc'] = pd.concat([df['A'].iloc[1:].reset_index(drop=True), pd.Series([None])], ignore_index=True)

# Print the resulting DataFrame
print(df)

Output:

The printed DataFrame shows column A (10, 20, 30, 40) next to A_next_iloc, which holds 20, 30, and 40 in the first three rows and a missing value in the last row.

With iloc we can use positional indexing and slicing to access the rows of any DataFrame. Here iloc[1:] skips the first row and returns the values from the second row onward. reset_index(drop=True) renumbers those values from 0 so that they line up with the original rows, and the appended None fills the last row, which has no next row.
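As a quick check, the slicing approach can be compared against shift(-1). The sketch below is a variation on the example above: it pads the sliced values with NaN through NumPy instead of None (an assumption made here to keep the column numeric), and the column name A_next_shift is chosen only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30, 40]})

# Take rows 1..end by position, then pad the final slot with NaN
next_values = np.append(df['A'].iloc[1:].to_numpy(dtype=float), np.nan)
df['A_next_iloc'] = next_values

# The same column built with shift(-1)
df['A_next_shift'] = df['A'].shift(-1)

# Series.equals treats NaN values in the same position as equal
print(df['A_next_iloc'].equals(df['A_next_shift']))
```

If the two columns agree, shift(-1) is usually the simpler choice, since it handles the padding and index alignment itself.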

Method 3: Lambda function to refer to the next row

Using a lambda function with apply(), we can perform row-wise operations with a condition. In the example below, apply(axis=1) calls the lambda once per row. row.name is the row's index label, so row.name + 1 is the position of the next row. If that position is still less than the length of the DataFrame, the lambda returns the next row's Score; otherwise (for the last row) it returns None. This relies on the DataFrame having the default integer index 0, 1, 2, and so on.

Python

import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Score': [85, 90, 88, 92]})

# Create a new column by referencing the next row using apply with lambda.
# row.name is the index label; this assumes the default 0..n-1 index.
df['Next_Score'] = df.apply(
    lambda row: df['Score'].iloc[row.name + 1] if row.name + 1 < len(df) else None,
    axis=1)
print(df)

Output:

The printed DataFrame shows Name and Score next to Next_Score, which holds the following row's score (90, 88, 92) in the first three rows and a missing value in the last row.
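The lambda above uses row.name + 1, which only works when the index is the default 0, 1, 2, ... sequence. As a hedged illustration, the sketch below gives the same data a hypothetical string index (the labels s1 to s4 are invented for this example) and uses shift(-1) instead, which moves values by position regardless of the index labels.

```python
import pandas as pd

# Same data as above, but with a non-numeric index (labels are hypothetical)
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Score': [85, 90, 88, 92]},
                  index=['s1', 's2', 's3', 's4'])

# row.name would be a string here, so row.name + 1 would fail;
# shift(-1) does not depend on the index labels
df['Next_Score'] = df['Score'].shift(-1)
print(df)
```

An alternative, if the lambda form is preferred, would be to call reset_index(drop=True) first so that the labels match the row positions.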

Handling NaN Values When Referencing the Next Row

Each of the methods above leaves a missing value (NaN or None) in the rows that have no next row, such as the last row. These missing values are often unwanted in the DataFrame. There are different techniques to handle them. Some of them are as follows: