How to Merge DataFrames of different length in Pandas?

Last Updated : 12 Nov, 2024

Merging DataFrames of different lengths in Pandas can be done using merge() and concat(). These functions allow you to combine data based on shared columns or indices, even if the DataFrames have unequal lengths. By choosing the appropriate join type (left, right, inner, or outer), you decide how to handle rows that don't have matching values in both DataFrames.

In this article, we will discuss how to merge two DataFrames of different lengths in Pandas using merge(), concat(), and join().


Merging DataFrames of different lengths Using merge()

The merge() function allows you to combine two DataFrames based on common columns or indices. By default, it performs an inner join, which only keeps rows with matching values in both DataFrames. However, if you're dealing with DataFrames of different lengths, it's common to use a left join or outer join to keep all rows from one or both DataFrames.

Left Join: Keeps all rows from the left DataFrame and matches rows from the right DataFrame where possible. Missing values are filled with NaN.
Outer Join: Keeps all rows from both DataFrames, filling missing values with NaN where no match is found.
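To make the difference between the join types concrete, here is a minimal sketch that runs the same merge with each value of how and counts the resulting rows. The students and ages DataFrames are hypothetical sample data chosen for illustration, and ID 4 is added so that the right and outer joins differ from the left join.

```python
import pandas as pd

# Hypothetical sample data: three students, ages for only two of them,
# plus one age record (ID 4) that has no matching student.
students = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
ages = pd.DataFrame({'ID': [1, 2, 4], 'Age': [25, 30, 41]})

# Run the same merge with each join type and record the row count.
row_counts = {}
for how in ['inner', 'left', 'right', 'outer']:
    merged = students.merge(ages, how=how, on='ID')
    row_counts[how] = len(merged)
    print(f"{how}: {len(merged)} rows")
```

The inner join keeps only IDs 1 and 2, the left and right joins each keep every row of one side, and the outer join keeps all four distinct IDs.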

Below are some examples that depict how to merge data frames of different lengths using the above method:

Example 1: Below is a program to merge two student data frames of different lengths.

Python

import pandas as pd

# Creating two DataFrames of different lengths
df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [1, 2], 'Age': [25, 30]})
print("DataFrame df1:")
print(df1)
print("\nDataFrame df2:")
print(df2)

# Merging with a left join
result = df1.merge(df2, how='left', on='ID')
print("\nResult of left join:")
print(result)

Output:

DataFrame df1:
   ID     Name
0   1    Alice
1   2      Bob
2   3  Charlie


DataFrame df2:
   ID  Age
0   1   25
1   2   30

Result of left join:
   ID     Name   Age
0   1    Alice  25.0
1   2      Bob  30.0
2   3  Charlie   NaN
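merge() can also align rows on the index instead of a column. The sketch below assumes the ID values are first moved into the index with set_index(), then passes left_index=True and right_index=True so that merge() matches index labels. The variable names left, right, and by_index are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [1, 2], 'Age': [25, 30]})

# Move ID into the index so the merge can align on index labels.
left = df1.set_index('ID')
right = df2.set_index('ID')

# left_index/right_index tell merge() to use the index instead of a column.
by_index = left.merge(right, how='left', left_index=True, right_index=True)
print(by_index)
```

As with the column-based left join, the row for ID 3 is kept and its Age is NaN because df2 has no entry for it.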

Merging DataFrames of different lengths Using concat()

The concat() function is useful when you want to stack DataFrames either vertically (row-wise) or horizontally (column-wise). When concatenating along columns (axis=1), it aligns rows by index label rather than by a shared column such as ID, and fills the positions that have no counterpart in the shorter DataFrame with NaN. Because no key column is matched, a column that appears in both DataFrames (here, ID) shows up twice in the result.

Example 2: Below is a program that uses the concat() method. Note that the DataFrames are the same as those used in Example 1.

Python

# Concatenating along columns (axis=1)
result = pd.concat([df1, df2], axis=1)
print("\nResult after concat:")
print(result)

Output:

Result after concat:
   ID     Name   ID   Age
0   1    Alice  1.0  25.0
1   2      Bob  2.0  30.0
2   3  Charlie  NaN   NaN
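The duplicated ID column in the output above appears because concat() lines rows up by index position, not by ID. One possible way to avoid it, sketched here as an assumption rather than the only approach, is to make ID the index of both DataFrames before concatenating. The name aligned is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [1, 2], 'Age': [25, 30]})

# concat() aligns on index labels, so make ID the index of both frames first.
aligned = pd.concat([df1.set_index('ID'), df2.set_index('ID')], axis=1)
print(aligned)
```

With ID as the index, each student appears once, and the student with no age record (ID 3) gets NaN in the Age column.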

Merging DataFrames of different lengths Using join()

The join() method is similar to merge(), but by default it joins on indices rather than columns and performs a left join. This is useful when your DataFrames share an index but have different lengths.
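A short sketch of join() on the same two DataFrames follows. It assumes ID is used as the shared index. The second call shows the on parameter, which matches a column of the calling DataFrame against the index of the other one. The names joined and joined_on are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [1, 2], 'Age': [25, 30]})

# Index-to-index join; how='left' is the default, so all rows of df1 are kept.
joined = df1.set_index('ID').join(df2.set_index('ID'))
print(joined)

# Alternative: keep ID as a column in df1 and match it against df2's index.
joined_on = df1.join(df2.set_index('ID'), on='ID')
print(joined_on)
```

Both calls keep all three students and leave Age as NaN for ID 3. They differ only in whether ID ends up as the index or as a regular column.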