
DuckDB provides first-class support for CSV (Comma-Separated Values) files with automatic dialect detection, type inference, and flexible configuration options.

Reading CSV Files

The simplest way to read a CSV file is to reference it directly in a query:
SELECT * FROM 'data.csv';
DuckDB automatically:
  • Detects the delimiter (comma, tab, pipe, etc.)
  • Identifies header rows
  • Infers column types
  • Handles quoted values and escape characters
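To inspect what the sniffer decided before running a full query, recent DuckDB versions expose it as the sniff_csv table function (a sketch; the exact result columns shown here depend on your DuckDB version):

```sql
-- Show the detected delimiter, quote character, header flag, and column types
SELECT Delimiter, Quote, HasHeader, Columns
FROM sniff_csv('data.csv');
```

This is useful for debugging when auto-detection picks an unexpected dialect.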

Using read_csv_auto

For finer control, use the read_csv_auto function; any option you pass explicitly overrides the detected value:
SELECT * FROM read_csv_auto('sales.csv');

CSV Options

Delimiter Options

-- Pipe-delimited file
SELECT * FROM read_csv_auto('data.csv', delim = '|');

-- Tab-delimited file
SELECT * FROM read_csv_auto('data.tsv', delim = '\t');

Header Options

-- Force the first row to be treated as column names
SELECT * FROM read_csv_auto('data.csv', header = true);

Null Handling

-- Treat 'NA', 'NULL', and empty strings as NULL
SELECT * FROM read_csv_auto('data.csv',
  nullstr = ['NA', 'NULL', '']
);

Quote and Escape Characters

-- DuckDB string literals do not process backslash escapes,
-- so '\' below is a single backslash character
SELECT * FROM read_csv_auto('data.csv',
  quote = '"',
  escape = '\'
);

Writing CSV Files

Use the COPY statement to write data to CSV files:
COPY customers TO 'customers.csv';
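COPY accepts the same dialect options as the reader; for example, to write a pipe-delimited file with a header row (the output file name is illustrative):

```sql
COPY customers TO 'customers.psv' (HEADER, DELIMITER '|');
```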

Loading into Tables

Import CSV data into existing tables:
CREATE TABLE test (a INTEGER, b INTEGER, c VARCHAR(10));

COPY test FROM 'test.csv';
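COPY ... FROM takes dialect options too, which is useful when the file does not match the detected defaults (the file name here is illustrative):

```sql
-- Load a headerless, pipe-delimited file into the table above
COPY test FROM 'test.psv' (DELIMITER '|', HEADER false);
```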

Partial Column Import/Export

Work with specific columns only:
COPY test (a, c) TO 'partial.csv' (
  DELIMITER ',',
  HEADER false
);

Multiple Files

Read multiple CSV files at once:
SELECT * FROM 'data/sales_*.csv';
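You can also pass an explicit list of files, and the filename option adds a column recording which file each row came from (the file names below are illustrative):

```sql
SELECT * FROM read_csv_auto(
  ['data/sales_jan.csv', 'data/sales_feb.csv'],
  filename = true
);
```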

Hive Partitioning

DuckDB supports Hive-style partitioning for efficient file filtering:
-- Read from partitioned directory structure
SELECT id, value, part, date 
FROM read_csv_auto(
  'data/part=*/date=*/data.csv',
  hive_partitioning = true
)
WHERE part = 'a' AND date > '2024-01-01';
With Hive partitioning, DuckDB:
  • Extracts partition values from directory names
  • Adds them as columns to the result
  • Skips reading files that don’t match filter conditions
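The inverse also works: COPY with PARTITION_BY writes one directory per distinct partition value. A sketch, with illustrative table and column names:

```sql
-- Writes data/part=a/..., data/part=b/..., etc.
COPY (SELECT * FROM events)
TO 'data' (FORMAT csv, PARTITION_BY (part));
```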

Type Detection

Override automatic type detection:
SELECT * FROM read_csv_auto('data.csv',
  dtypes = {'Year': 'INTEGER', 'Quarter': 'TINYINT'}
);

Common Patterns

Skip Rows

SELECT * FROM read_csv_auto('data.csv', skip = 5);

Limit Rows During Import

COPY (SELECT * FROM read_csv_auto('large.csv') LIMIT 1000)
TO 'sample.csv';

Handle Different Line Endings

DuckDB automatically handles Windows (CRLF), Unix (LF), and classic Mac (CR) line endings:
-- Works with any line ending format
SELECT * FROM 'mixed_line_endings.csv';

Error Handling

Control how DuckDB handles malformed rows:
-- Reject malformed rows and save to error table
SELECT * FROM read_csv_auto('data.csv',
  rejects_table = 'csv_errors',
  rejects_limit = 100
);

-- View rejected rows
SELECT * FROM csv_errors;
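When you only want to skip bad rows rather than inspect them, the ignore_errors flag drops malformed lines silently:

```sql
SELECT * FROM read_csv_auto('data.csv', ignore_errors = true);
```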

Performance Tips

  • DuckDB automatically parallelizes CSV reading across multiple threads
  • Use COPY or a single INSERT INTO ... SELECT rather than row-by-row INSERT statements for bulk loading
  • For very large files, consider converting to Parquet format for faster subsequent queries
  • Use column projection to read only needed columns: SELECT col1, col2 FROM 'file.csv'
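The Parquet tip above can be sketched as a one-time conversion; subsequent queries read the columnar file instead of re-parsing the CSV (file and column names are illustrative):

```sql
COPY (SELECT * FROM read_csv_auto('large.csv'))
TO 'large.parquet' (FORMAT parquet);

SELECT col1 FROM 'large.parquet';  -- column projection, no CSV parsing
```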