pandas tutorial

Convert Text File to CSV in Python Pandas

Use pandas to convert text files to CSV with full control: read_csv with the right sep, header and encoding, read_fwf for fixed-width text, and to_csv options for index, quoting and Excel-friendly output.

Step by stepCopy-paste codeFree converter included
Prefer not to code? Use the text to CSV converter — it runs in your browser, free, with nothing uploaded.

Open the converter

Advertisement
Overview

Convert text to CSV with pandas

pandas turns messy text into a clean DataFrame and writes CSV in a single call, which makes it the go-to for data work. The key is passing the correct parameters to read_csv so your columns and types come out right, then using to_csv to control the output.

If you only need a quick conversion, the browser tool is faster; reach for pandas when you also want to filter, rename or clean the data.

read_csv with the right delimiter and header

read_csv reads delimited text, not just commas. Set sep to your delimiter, and use header/names to control column names:

pandas — read delimited text
import pandas as pd

# tab-delimited with a header row
df = pd.read_csv("input.txt", sep="\t", encoding="utf-8")

# no header in the file: supply your own column names
df = pd.read_csv("input.txt", sep="|", header=None,
                 names=["id", "name", "city"])

# let pandas sniff the delimiter
df = pd.read_csv("input.txt", sep=None, engine="python")

Read fixed-width text with read_fwf

When columns are aligned by position rather than a delimiter, use read_fwf:

pandas — fixed-width text to DataFrame
import pandas as pd

# infer column widths automatically
df = pd.read_fwf("input.txt")

# or specify exact widths
df = pd.read_fwf("input.txt", widths=[10, 8, 12],
                 names=["name", "code", "city"])

to_csv: index, quoting and Excel-friendly output

Write the DataFrame to CSV, controlling the common options:

pandas — write CSV
import csv

df.to_csv(
    "output.csv",
    index=False,                 # omit the row index
    sep=",",                     # output delimiter
    quoting=csv.QUOTE_MINIMAL,   # or csv.QUOTE_ALL
    encoding="utf-8-sig",        # BOM so Excel reads UTF-8
    lineterminator="\n",         # or "\r\n" for Windows
)

Use encoding="utf-8-sig" for Excel, and dtype=str in read_csv if you must keep leading zeros while loading.

Good to know

Reading and writing without surprises

pandas is powerful but opinionated, so a few parameters save a lot of cleanup. On read, pass dtype=str (or a per-column mapping) to stop identifier columns being coerced to numbers and losing leading zeros, and use keep_default_na=False if literal strings like NA or empty cells should stay as text rather than becoming NaN. When the delimiter is uncertain, prefer an explicit sep over the sniffer, which can misread files with quoted fields.

On write, index=False is almost always what you want — otherwise pandas adds an unlabeled first column of row numbers that confuses downstream tools. Use encoding="utf-8-sig" for Excel, and set lineterminator to match the target platform. To confirm the round trip, read your output back with the same parameters and compare df.shape before and after; a changed row or column count is an immediate signal that quoting or the delimiter needs attention.

For files too large to fit in memory, pandas can read in chunks with chunksize and append each piece to the output, writing the header only once. That keeps memory flat regardless of file size while still giving you pandas’ parsing. When you only need a straight format conversion rather than analysis, though, the streaming csv module or a command-line tool is lighter weight than loading a DataFrame at all.

In short, pandas is the right tool when you also need to clean or reshape the data on the way to CSV; for a pure format change, a lighter approach is fine. Either way, read identifier columns as text, write with index=False and a UTF-8 BOM for Excel, and confirm the shape round-trips.

FAQ

Convert Text File to CSV in Python Pandas — FAQ

How do I set the delimiter in read_csv?

Pass sep="\t", sep="|", etc., or sep=None, engine="python" to auto-detect.

How do I keep leading zeros when reading?

Load with dtype=str (or per-column) so pandas does not convert the values to numbers.

How do I read fixed-width text?

Use pd.read_fwf, either inferring widths or passing widths=[...].

How do I make to_csv Excel-friendly?

Use encoding="utf-8-sig" for a BOM and index=False to drop the index column.