Core In-Tree Extensions

DuckDB includes several core in-tree extensions that provide fundamental functionality. These extensions are maintained by the DuckDB team and are considered essential to the database’s capabilities.

Parquet

Apache Parquet file format support

Read and write Parquet files with full support for:
  • Columnar storage format
  • Compression (Snappy, ZSTD, etc.)
  • Encryption and decryption
  • Statistics and bloom filters
  • Nested data types
  • Dictionary encoding
The Parquet extension is statically linked into most DuckDB builds.

JSON

JSON data type and functions

Comprehensive JSON support including:
  • Native JSON data type
  • JSON parsing and generation functions
  • JSONPath-style extraction
  • JSON schema inference
  • Reading/writing JSON and NDJSON files
  • Streaming JSON processing
Includes table functions like read_json() and scalar functions for JSON manipulation.

ICU

International Components for Unicode

Provides internationalization support:
  • Locale-based collation for string ordering
  • Time zone support for timestamp operations
  • Support for 100+ locales
  • Case-insensitive comparisons
  • Unicode normalization
The ICU extension bundles all the data it needs for collation and time zone operations, with a compiled size of approximately 6 MB.

Autocomplete

SQL autocomplete functionality

Powers intelligent SQL completion:
  • PEG-based grammar parser
  • Context-aware suggestions
  • Keyword and identifier completion
  • Schema-aware table/column suggestions
The extension uses a parsing expression grammar (PEG) to parse partial SQL statements and generate completions.
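
Completion can also be tried directly from SQL. The sketch below calls the sql_auto_complete table function with a partial statement. It assumes the autocomplete extension is loaded: it ships with the CLI client, while other clients may need INSTALL autocomplete; LOAD autocomplete; first. The input string 'SEL' is only an illustration.

```sql
-- Ask for completions of a partial statement.
-- Each returned row is one suggestion plus the position where it would be inserted.
SELECT *
FROM sql_auto_complete('SEL');
```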

TPC-H

TPC-H benchmark data generator

Generate TPC-H benchmark data:
  • Scale factor support (0.01 to large scales)
  • 22 TPC-H benchmark queries
  • Answer validation for scale factors 0.01, 0.1, and 1
  • Multi-threaded data generation
Functions:
  • dbgen(sf => scale_factor) - Generate TPC-H schema and data
  • tpch_queries() - List all TPC-H queries
  • tpch_answers() - Get expected query results
  • PRAGMA tpch(query_number) - Run specific TPC-H query

TPC-DS

TPC-DS benchmark data generator

Generate TPC-DS benchmark data:
  • Scale factor support
  • Complete TPC-DS query suite
  • Answer validation for multiple scale factors
  • Schema generation with optional keys
Functions:
  • dsdgen(sf => scale_factor) - Generate TPC-DS schema and data
  • tpcds_queries() - List all TPC-DS queries
  • tpcds_answers() - Get expected query results
  • PRAGMA tpcds(query_number) - Run specific TPC-DS query

Usage Examples

TPC-H Benchmark

-- Generate TPC-H data at scale factor 0.1
CALL dbgen(sf => 0.1);

-- Run TPC-H query 1
PRAGMA tpch(1);

-- List all available queries
SELECT * FROM tpch_queries();
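
The answer validation mentioned earlier can be sketched with tpch_answers() and tpch_queries(). The column names used here (query_nr, scale_factor, query) are assumptions about the functions' output; run SELECT * first if they differ in your version.

```sql
-- Expected result of query 1 at scale factor 0.1,
-- for comparison with the output of PRAGMA tpch(1)
SELECT *
FROM tpch_answers()
WHERE query_nr = 1 AND scale_factor = 0.1;

-- Show the SQL text of query 1 instead of running it
SELECT query
FROM tpch_queries()
WHERE query_nr = 1;
```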

TPC-DS Benchmark

-- Generate TPC-DS data at scale factor 1
CALL dsdgen(sf => 1);

-- Run TPC-DS query 1
PRAGMA tpcds(1);

-- Generate with custom schema and catalog
CALL dsdgen(sf => 1, catalog => 'my_catalog', schema => 'benchmark');
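
For the optional keys and the query listing described above, a hedged sketch follows. It assumes dsdgen accepts a boolean keys parameter and that tpcds_queries() exposes query_nr and query columns. Run it in a fresh database so the generated tables do not collide with an earlier dsdgen call.

```sql
-- Generate a small data set with key constraints on the tables
-- (the keys parameter is an assumption; omit it if your version rejects it)
CALL dsdgen(sf => 0.01, keys => true);

-- Show the SQL text of TPC-DS query 1
SELECT query
FROM tpcds_queries()
WHERE query_nr = 1;
```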

Parquet Files

-- Read a Parquet file
SELECT * FROM 'data.parquet';

-- Write to Parquet with compression
COPY (SELECT * FROM my_table) 
TO 'output.parquet' 
(FORMAT PARQUET, COMPRESSION ZSTD);
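
The statistics and encryption features listed for Parquet can be explored with a sketch like the one below. It reuses output.parquet and my_table from the example above. The key name key128, the key value, and the file name encrypted.parquet are placeholders, not recommendations.

```sql
-- Row-group and column-chunk metadata, including min/max statistics
SELECT row_group_id, path_in_schema, stats_min, stats_max, compression
FROM parquet_metadata('output.parquet');

-- Column names and types as stored in the file
SELECT name, type
FROM parquet_schema('output.parquet');

-- Register a named encryption key (placeholder 128-bit value)
PRAGMA add_parquet_key('key128', '0123456789112345');

-- Write an encrypted file, then read it back with the same key
COPY (SELECT * FROM my_table)
TO 'encrypted.parquet'
(ENCRYPTION_CONFIG {footer_key: 'key128'});

SELECT *
FROM read_parquet('encrypted.parquet', encryption_config = {footer_key: 'key128'});
```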

JSON Processing

-- Read NDJSON file
SELECT * FROM 'data.ndjson';

-- Extract JSON field
SELECT json_extract(data, '$.user.name') 
FROM json_table;

-- Convert to JSON
SELECT to_json(my_struct) FROM my_table;
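
A few more variations, sketched with the same hypothetical names (data.ndjson, json_table, my_table). They show read_json() with an explicit format, the ->> extraction operator, structure inspection, and writing JSON. The file name output.json is an assumption.

```sql
-- Read NDJSON with an explicit format instead of relying on the file extension
SELECT *
FROM read_json('data.ndjson', format = 'newline_delimited');

-- Extract a field as VARCHAR with the ->> operator
SELECT data->>'$.user.name' AS user_name
FROM json_table;

-- Inspect the structure of each JSON value
SELECT json_structure(data)
FROM json_table;

-- Write query results as JSON
COPY (SELECT * FROM my_table)
TO 'output.json'
(FORMAT JSON);
```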

ICU Collation

-- Order strings using German collation
SELECT name 
FROM customers 
ORDER BY name COLLATE de;

-- Case-insensitive comparison (NOCASE is built into DuckDB and does not require ICU)
SELECT * 
FROM users 
WHERE name COLLATE NOCASE = 'john';
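
The time zone support listed for ICU can be sketched as follows, assuming the ICU extension is loaded. The zone names and the timestamp literal are illustrative.

```sql
-- Set the session time zone used for TIMESTAMP WITH TIME ZONE values
SET TimeZone = 'Europe/Berlin';

-- Convert an instant to local time in another zone
SELECT TIMESTAMPTZ '2024-01-01 12:00:00+00' AT TIME ZONE 'America/New_York' AS ny_local;

-- List the time zones known to ICU
SELECT name, abbrev, utc_offset
FROM pg_timezone_names()
ORDER BY name;

-- List the available collations
SELECT * FROM pragma_collations();
```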

Extension Information

To see which extensions are loaded or available:
-- List all extensions and their status
SELECT extension_name, loaded, installed 
FROM duckdb_extensions();
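
To narrow the listing to the extensions covered on this page, and to load one that is not yet available in the session, a sketch along these lines may help. The extension names are taken from the sections above. INSTALL may be unnecessary if the extension is already bundled with your build.

```sql
-- Status of the core in-tree extensions only
SELECT extension_name, loaded, installed
FROM duckdb_extensions()
WHERE extension_name IN ('parquet', 'json', 'icu', 'autocomplete', 'tpch', 'tpcds');

-- Install and load an extension explicitly, here TPC-H
INSTALL tpch;
LOAD tpch;
```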

Next Steps

Loading Extensions

Learn how to install and load extensions at runtime

Building Extensions

Build custom extensions from source