Quick Start

This guide will get you up and running with DuckDB quickly, covering both CLI usage and embedded C programming.

Using the CLI

The fastest way to start using DuckDB is through the command-line interface.

Start DuckDB

Launch the DuckDB CLI:

./duckdb

This opens an in-memory database. To use a persistent database:

./duckdb mydata.db

Create a table and insert data

CREATE TABLE users(id INTEGER, name VARCHAR, email VARCHAR);
INSERT INTO users VALUES 
    (1, 'Alice', 'alice@example.com'),
    (2, 'Bob', 'bob@example.com'),
    (3, 'Charlie', 'charlie@example.com');

Query your data

SELECT * FROM users WHERE id > 1;

Output:

┌───────┬─────────┬──────────────────────┐
│  id   │  name   │        email         │
│ int32 │ varchar │       varchar        │
├───────┼─────────┼──────────────────────┤
│     2 │ Bob     │ bob@example.com      │
│     3 │ Charlie │ charlie@example.com  │
└───────┴─────────┴──────────────────────┘

Querying Files Directly

One of DuckDB’s most powerful features is the ability to query data files directly without importing:

SELECT * FROM 'data.csv' WHERE amount > 100;

DuckDB automatically detects file formats and handles schema inference, making data exploration incredibly fast.

Embedded C Example

For applications that need embedded analytics, DuckDB provides a simple C API. Here’s a complete working example:

#include "duckdb.h"
#include <stdio.h>

int main() {
	duckdb_database db = NULL;
	duckdb_connection con = NULL;
	duckdb_result result;

	if (duckdb_open(NULL, &db) == DuckDBError) {
		fprintf(stderr, "Failed to open database\n");
		goto cleanup;
	}
	if (duckdb_connect(db, &con) == DuckDBError) {
		fprintf(stderr, "Failed to open connection\n");
		goto cleanup;
	}
	if (duckdb_query(con, "CREATE TABLE integers(i INTEGER, j INTEGER);", NULL) == DuckDBError) {
		fprintf(stderr, "Failed to query database\n");
		goto cleanup;
	}
	if (duckdb_query(con, "INSERT INTO integers VALUES (3, 4), (5, 6), (7, NULL);", NULL) == DuckDBError) {
		fprintf(stderr, "Failed to query database\n");
		goto cleanup;
	}
	if (duckdb_query(con, "SELECT * FROM integers", &result) == DuckDBError) {
		fprintf(stderr, "Failed to query database\n");
		goto cleanup;
	}
	// print the names of the result
	idx_t row_count = duckdb_row_count(&result);
	idx_t column_count = duckdb_column_count(&result);
	for (size_t i = 0; i < column_count; i++) {
		printf("%s ", duckdb_column_name(&result, i));
	}
	printf("\n");
	// print the data of the result
	for (size_t row_idx = 0; row_idx < row_count; row_idx++) {
		for (size_t col_idx = 0; col_idx < column_count; col_idx++) {
			char *val = duckdb_value_varchar(&result, col_idx, row_idx);
			printf("%s ", val);
			duckdb_free(val);
		}
		printf("\n");
	}
cleanup:
	duckdb_destroy_result(&result);
	duckdb_disconnect(&con);
	duckdb_close(&db);
}

Understanding the C API

Initialize database and connection

duckdb_database db = NULL;
duckdb_connection con = NULL;

// Open database (NULL = in-memory)
duckdb_open(NULL, &db);

// Create connection to database
duckdb_connect(db, &con);

Pass a filename to duckdb_open() for persistent storage:

duckdb_open("mydata.db", &db);

Execute queries

// Queries that don't return results (DDL, DML)
duckdb_query(con, "CREATE TABLE integers(i INTEGER, j INTEGER);", NULL);
duckdb_query(con, "INSERT INTO integers VALUES (3, 4), (5, 6);", NULL);

// Queries that return results
duckdb_result result;
duckdb_query(con, "SELECT * FROM integers", &result);

Process query results

// Get result metadata
idx_t row_count = duckdb_row_count(&result);
idx_t column_count = duckdb_column_count(&result);
const char *col_name = duckdb_column_name(&result, 0);

// Access values
for (size_t row = 0; row < row_count; row++) {
    for (size_t col = 0; col < column_count; col++) {
        char *val = duckdb_value_varchar(&result, col, row);
        printf("%s ", val);
        duckdb_free(val);  // Important: free allocated strings
    }
}

Clean up resources

duckdb_destroy_result(&result);
duckdb_disconnect(&con);
duckdb_close(&db);

Always clean up in reverse order of creation.

Embedded C++ Example

The C++ API provides a more ergonomic interface with RAII and exceptions:

#include "duckdb.hpp"

using namespace duckdb;

int main() {
	// Open database (nullptr = in-memory)
	DuckDB db(nullptr);

	// Create connection
	Connection con(db);

	// Execute queries
	con.Query("CREATE TABLE integers(i INTEGER)");
	con.Query("INSERT INTO integers VALUES (3)");
	
	// Query and print results
	auto result = con.Query("SELECT * FROM integers");
	result->Print();
}

The C++ API automatically handles resource cleanup through destructors, making it less error-prone than the C API.

Common SQL Operations

Here are some common analytical queries you can run:

SELECT 
    department,
    COUNT(*) as employee_count,
    AVG(salary) as avg_salary,
    MAX(salary) as max_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;

Working with Real Data

DuckDB excels at analyzing real-world data files:

-- Analyze a large CSV file without importing
SELECT 
    DATE_TRUNC('month', timestamp) as month,
    COUNT(*) as events,
    COUNT(DISTINCT user_id) as unique_users
FROM 'logs.csv'
WHERE event_type = 'purchase'
GROUP BY month
ORDER BY month;

-- Query multiple Parquet files at once
SELECT 
    product_category,
    SUM(revenue) as total_revenue
FROM 'sales_data/*.parquet'
GROUP BY product_category;

When querying large files, DuckDB automatically uses streaming and parallel processing, but keep in mind that complex operations may require significant memory.

Next Steps

Now that you understand the basics, explore more advanced features:

SQL Reference

Learn about DuckDB’s rich SQL dialect and advanced features

Data Import

Deep dive into importing and exporting data in various formats

Client APIs

Explore Python, R, Java, and other client libraries

Extensions

Extend DuckDB with additional functionality

Tips and Tricks

Use DESCRIBE table_name; to see table schema
Use .tables to list all tables (CLI)
Use .schema table_name to see CREATE statement (CLI)
Use EXPLAIN to see query execution plans
Enable profiling with PRAGMA enable_profiling; to optimize queries
Use COPY command for fast bulk data import/export

Getting Started

Core Concepts

SQL Guide

Data Import & Export

Extensions

Performance

Quick Start

Quick Start

Using the CLI

Querying Files Directly

Embedded C Example

Understanding the C API

Embedded C++ Example

Common SQL Operations

Working with Real Data

Next Steps

SQL Reference

Data Import

Client APIs

Extensions

Tips and Tricks

Getting Started

Core Concepts

SQL Guide

Data Import & Export

Extensions

Performance

Documentation Index

​Quick Start

​Using the CLI

​Querying Files Directly

​Embedded C Example

​Understanding the C API

​Embedded C++ Example

​Common SQL Operations

​Working with Real Data

​Next Steps

SQL Reference

Data Import

Client APIs

Extensions

​Tips and Tricks

Quick Start

Using the CLI

Querying Files Directly

Embedded C Example

Understanding the C API

Embedded C++ Example

Common SQL Operations

Working with Real Data

Next Steps

Tips and Tricks