Pandas Dataframe.sample() | Python

Last Updated : 11 Apr, 2025

The Pandas DataFrame.sample() method returns a random selection of rows or columns from a DataFrame. It is particularly helpful with large datasets, where we want to test or analyze a small representative subset. We can specify the number or proportion of items to sample with n or frac, and control the randomness with random_state.

Example: Sampling a Single Random Row

In this example, we load a dataset and generate a single random row using the sample() method by setting n=1.

Python

import pandas as pd

# Load dataset
d = pd.read_csv("employees.csv")

# Sample one random row
r_row = d.sample(n=1)

# Display the result
r_row

Output

Calling sample(n=1) returns one randomly selected row from the DataFrame. Because random_state is not set, the row can differ on each run.

Syntax

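DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None, ignore_index=False)

Parameters:

n: int, optional. Number of items (rows by default) to return. Defaults to 1 when frac is also None. n cannot be used with frac.
frac: float, optional. Fraction of the items to return; for example, 0.25 returns about 25% of the rows. frac cannot be used with n.
replace: bool, default False. If True, sample with replacement, so the same row can be selected more than once.
weights: str or array-like, optional. Relative sampling weights. By default every item is equally likely to be selected.
random_state: int or numpy.random.RandomState, optional. If set to a particular integer, the same rows are returned on every run.
axis: 0 or 'index' for rows, 1 or 'columns' for columns. Defaults to rows for a DataFrame.
ignore_index: bool, default False. If True, the result is relabeled 0, 1, ..., n-1 (available in pandas 1.3 and later).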

Return Type: New object of same type as caller.
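
The examples in this article sample rows only. As a minimal sketch of the axis and weights parameters, the snippet below uses a small made-up DataFrame; the column names and values are assumptions for illustration and are not taken from employees.csv.

```python
import pandas as pd

# Hypothetical in-memory data; not the employees.csv used elsewhere
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dina"],
    "team": ["Ops", "Ops", "Dev", "Dev"],
    "salary": [50, 60, 70, 80],
    "bonus": [0, 0, 0, 1],
})

# axis=1 samples columns instead of rows
two_cols = df.sample(n=2, axis=1, random_state=0)
print(two_cols.columns.tolist())

# weights sets each row's relative chance of being picked.
# A column name can be passed; rows with weight 0 are never picked.
picked = df.sample(n=1, weights="bonus", random_state=0)
print(picked["name"].tolist())  # ['Dina'], the only row with a non-zero weight
```

Passing a column name to weights works only when sampling rows; for other cases an array-like of the right length can be supplied instead.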


Examples of Pandas Dataframe.sample()

Example 1: Sample 25% of the DataFrame

In this example, we generate a random sample consisting of 25% of the entire DataFrame by using the frac parameter.

Python

import pandas as pd
d = pd.read_csv("employees.csv")

# Sample 25% of the data
sr = d.sample(frac=0.25)

# Verify the number of rows
print(f"Original rows: {len(d)}")
print(f"Sampled rows (25%): {len(sr)}")

# Display the result
sr

Output

The printed row counts confirm that the sample contains 25% of the DataFrame's rows (rounded to a whole number). The rows are chosen at random, so they can differ on each run.
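
To check the row count without the CSV file, here is a minimal sketch on a made-up 8-row DataFrame (the column name id is an assumption for illustration). It also shows frac=1, which returns every row in random order and is a common way to shuffle a DataFrame.

```python
import pandas as pd

# Hypothetical data: 8 rows with a single illustrative column
df = pd.DataFrame({"id": range(8)})

# 25% of 8 rows is 2 rows
quarter = df.sample(frac=0.25, random_state=1)
print(len(quarter))  # 2

# frac=1 keeps every row but in shuffled order
shuffled = df.sample(frac=1, random_state=1)
print(len(shuffled))  # 8
print(sorted(shuffled["id"]) == list(range(8)))  # True
```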

Example 2: Sampling with Replacement and a Fixed Random State

This example demonstrates how to sample multiple rows with replacement (i.e., allowing the same row to appear more than once) and how to make the result reproducible with a fixed random seed.

Python

import pandas as pd
d = pd.read_csv("employees.csv")

# Sample 3 rows with replacement and fixed seed
sd = d.sample(n=3, replace=True, random_state=42)

sd

Output

Figure: sampling with replacement and a fixed random_state

The replace=True parameter allows the same row to be sampled more than once, which makes it suitable for bootstrapping. Setting random_state=42 makes the result reproducible across multiple runs, which is very useful during testing and debugging.
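
Both behaviours can be checked on a small made-up DataFrame. This is a minimal sketch; the three-row data and the column name id are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical 3-row data, used only for illustration
df = pd.DataFrame({"id": [1, 2, 3]})

# Same seed, same call: the two samples are identical
a = df.sample(n=3, replace=True, random_state=42)
b = df.sample(n=3, replace=True, random_state=42)
print(a.equals(b))  # True

# With replace=True, n may exceed the number of rows,
# so at least one row must repeat here
big = df.sample(n=5, replace=True, random_state=42)
print(len(big))  # 5
print(big.index.has_duplicates)  # True

# Without replacement, asking for more rows than exist raises ValueError
try:
    df.sample(n=5)
except ValueError as err:
    print("ValueError:", err)
```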

Kartikaybhutani
