Convert Excel to CSV in Python
In this article, we will discuss how to convert an Excel (.xlsx) file to a CSV (.csv) file using Python. Excel files are commonly used to store data, and sometimes you may need to convert these files into CSV format for better compatibility or easier processing.
Excel File Formats
- .xlsx: The newer Excel format, based on the Office Open XML standard and the default since Excel 2007.
- .xls: The older binary Excel format used by Excel 97-2003.
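Because the two formats are stored differently, they are usually read by different libraries. As a rough sketch, the extension can be mapped to the engine name that pandas accepts for that format. The helper name `pick_engine` and the mapping table are illustrative assumptions, not part of pandas.

```python
from pathlib import Path

# Extension -> pandas engine name commonly used for that format.
# (Illustrative table; only the two formats discussed above are covered.)
ENGINE_BY_SUFFIX = {
    ".xlsx": "openpyxl",  # Office Open XML workbooks
    ".xls": "xlrd",       # legacy Excel 97-2003 workbooks
}


def pick_engine(path):
    """Return the engine name for an Excel file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in ENGINE_BY_SUFFIX:
        raise ValueError(f"Unsupported Excel extension: {suffix!r}")
    return ENGINE_BY_SUFFIX[suffix]
```

The returned string can then be passed along as `pd.read_excel(path, engine=pick_engine(path))`, provided the corresponding library is installed.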
Let's consider a dataset from a shopping store that records each customer's serial number, name, ID, and product cost, stored in an Excel file named Test.xlsx.
import pandas as pd
# Load the first sheet of the workbook into a DataFrame
df = pd.read_excel("Test.xlsx")
print(df)
Output:
Let's see different ways to convert an Excel file into a CSV file:
Method 1: Convert Excel to CSV using Pandas
Pandas is an open-source data manipulation and analysis library. It provides powerful data structures and operations for handling large datasets and supports reading from and writing to various formats including Excel and CSV.
The read_excel() function reads the Excel file and stores the content in a DataFrame. The to_csv() method converts the DataFrame into a CSV file.
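import pandas as pd
# Read the first sheet of the workbook into a DataFrame
read_file = pd.read_excel("Test.xlsx")
# Write the DataFrame to CSV, keeping the column headers but not the row index
read_file.to_csv("Test.csv", index=False, header=True)
# Read the CSV back to confirm the conversion
df = pd.read_csv("Test.csv")
print(df)
Output:
Explanation:
- pd.read_excel("Test.xlsx"): Reads the first sheet of the Excel file into a DataFrame. For .xlsx files, pandas relies on the openpyxl package being installed.
- to_csv("Test.csv", index=False, header=True): Writes the DataFrame to a CSV file. index=False prevents writing the DataFrame index, and header=True (the default) includes column headers in the CSV.
- pd.read_csv("Test.csv"): Reads the newly created CSV file back into a DataFrame so the result can be checked.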
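By default, read_excel() loads only the first sheet. If the workbook has several sheets, one possible approach is to pass sheet_name=None, which returns a dictionary mapping each sheet name to its DataFrame, and write one CSV per sheet. The sketch below assumes that layout; the helper names `csv_path_for_sheet` and `export_all_sheets` and the file-naming rule are hypothetical choices, not part of pandas.

```python
import re
from pathlib import Path


def csv_path_for_sheet(out_dir, sheet_name):
    """Build a file-system-friendly CSV path from a sheet name."""
    # Replace anything other than letters, digits, "_" and "-" with "_"
    safe = re.sub(r"[^A-Za-z0-9_-]+", "_", sheet_name).strip("_") or "sheet"
    return Path(out_dir) / f"{safe}.csv"


def export_all_sheets(xlsx_path, out_dir="."):
    """Write every sheet of the workbook to its own CSV file."""
    import pandas as pd  # third-party; reading .xlsx also needs openpyxl

    # sheet_name=None returns {sheet name: DataFrame} for all sheets
    sheets = pd.read_excel(xlsx_path, sheet_name=None)
    written = []
    for name, frame in sheets.items():
        target = csv_path_for_sheet(out_dir, name)
        frame.to_csv(target, index=False)
        written.append(target)
    return written
```

Calling `export_all_sheets("Test.xlsx")` would then produce one CSV file per sheet in the current directory and return the list of paths it wrote.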
Method 2: Convert Excel to CSV using xlrd and csv libraries
xlrd is a library for reading Excel files, and csv is Python's standard library module for reading and writing CSV files. In this method, we read data from an Excel file using xlrd and write it into a CSV file using the csv module. Note that xlrd 2.0 and later read only the legacy .xls format, so the example below uses a .xls copy of the data (Test.xls); opening a .xlsx file with a current xlrd release raises an error.
import xlrd
import csv
import pandas as pd
# xlrd 2.0+ supports only .xls workbooks; Test.xls is assumed to be a .xls copy of the data
sheet = xlrd.open_workbook("Test.xls").sheet_by_index(0)
with open("T.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # Copy every row of the sheet into the CSV file
    for row in range(sheet.nrows):
        writer.writerow(sheet.row_values(row))
# Read the CSV back to confirm the conversion
df = pd.read_csv("T.csv")
print(df)
Output:
Explanation:
- xlrd.open_workbook("Test.xls"): Opens the Excel workbook.
- sheet_by_index(0): Selects the first sheet of the workbook.
- csv.writer(f): Creates a writer object to write data to a CSV file.
- sheet.row_values(row): Retrieves all values of a row as a list, which is then written to the CSV file as one line.
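One thing to watch for with this method: xlrd returns every numeric cell as a Python float, so a whole number such as 101 ends up in the CSV as 101.0. A minimal sketch of one way to tidy this up, assuming whole-number floats should be written as integers; `tidy_value` and `tidy_row` are hypothetical helpers, not part of xlrd.

```python
def tidy_value(value):
    """Turn whole-number floats (e.g. 101.0) into ints; leave everything else alone."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


def tidy_row(values):
    """Apply tidy_value to every cell value in a row."""
    return [tidy_value(v) for v in values]
```

In the loop above, the row would then be written with `writer.writerow(tidy_row(sheet.row_values(row)))`. This is only appropriate when no column is meant to keep a trailing `.0`.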
Method 3: Convert Excel to CSV using openpyxl and csv libraries
openpyxl is a library that allows reading and writing Excel files (particularly .xlsx files). This method involves using openpyxl to load the Excel file and csv to write the data to a CSV file.
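import openpyxl
import csv
import pandas as pd
# Load the workbook and pick the sheet that was active when it was last saved
excel = openpyxl.load_workbook("Test.xlsx")
sheet = excel.active
with open("tt.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # Write the value of every cell, one worksheet row per CSV row
    for row in sheet.rows:
        writer.writerow([cell.value for cell in row])
# Read the CSV back to confirm the conversion
df = pd.read_csv("tt.csv")
print(df)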
Output:
Explanation:
- openpyxl.load_workbook("Test.xlsx"): Loads the Excel file.
- sheet = excel.active: Selects the active sheet in the Excel file.
- sheet.rows: Yields the rows of the sheet one at a time, each as a tuple of cell objects.
- writer.writerow([cell.value for cell in row]): Extracts each cell's value from the row and writes it to the CSV file.
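For large workbooks, a possible variation is to open the file in openpyxl's read-only mode and iterate with iter_rows(values_only=True), which yields plain tuples of values instead of cell objects. The sketch below assumes the active sheet is the one to export; `write_rows` and `xlsx_to_csv` are hypothetical helper names.

```python
import csv


def write_rows(rows, csv_path):
    """Write an iterable of rows to csv_path and return the number of rows written."""
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)  # None values are written as empty fields
            count += 1
    return count


def xlsx_to_csv(xlsx_path, csv_path):
    """Stream the active sheet of an .xlsx workbook into a CSV file."""
    import openpyxl  # third-party dependency

    # read_only=True avoids loading the whole workbook into memory;
    # data_only=True returns cached formula results rather than formula text
    workbook = openpyxl.load_workbook(xlsx_path, read_only=True, data_only=True)
    try:
        rows = workbook.active.iter_rows(values_only=True)
        return write_rows(rows, csv_path)
    finally:
        workbook.close()  # read-only workbooks keep the file open until closed
```

A call such as `xlsx_to_csv("Test.xlsx", "tt.csv")` would write the same data as the loop above, without building a cell object for every value.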