rayyildiz

CC2P — CSV to Parquet converter

● Active

High-performance Rust CLI that converts CSV files to Apache Parquet, with an interactive TUI for column selection, multi-threaded conversion, and schema inference.

Tags: Rust, Parquet, CLI
Built with: Rust, Tokio, Arrow, Parquet

CC2P (Convert CSV To Parquet) is a high-performance command-line tool written in Rust that efficiently converts CSV files to the Apache Parquet format. Parquet is a columnar storage file format that offers efficient data compression and encoding schemes, making it ideal for big data processing.

Why Use CC2P?

  • Performance: Leverages Rust’s speed and multi-threading for fast conversions
  • Memory Efficiency: Processes files with minimal memory footprint
  • Flexibility: Supports various CSV formats with different delimiters and header options
  • Schema Inference: Automatically detects column types from your data
  • Batch Processing: Convert multiple CSV files in a single command
  • Interactive Mode: Browse and selectively export columns using the TUI

Installation

If you have Rust installed, you can install CC2P directly from crates.io:

cargo install cc2p

From GitHub Releases

You can download pre-built binaries from the GitHub Releases page.

From Source

git clone https://github.com/rayyildiz/cc2p.git
cd cc2p
cargo build --release

Usage

Basic usage:

cc2p [OPTIONS] [PATH]

Where PATH is the path to a CSV file or a glob pattern (default: *.csv).

Examples

Convert a single CSV file:

cc2p data.csv

Convert all CSV files in the current directory:

cc2p

Convert CSV files with semicolon delimiter:

cc2p --delimiter ";" *.csv

Convert CSV files without headers:

cc2p --no-header data_files/*.csv
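
To illustrate what the delimiter and header options change, here is a minimal sketch of reading delimited lines with and without a header row. It is not CC2P's actual code: split_line and read_table are hypothetical helpers, the column_N placeholder names are an assumption, and quoted fields are deliberately not handled.

```rust
/// Split one CSV line on a single-character delimiter.
/// Simplification: quoted fields containing the delimiter are not handled.
fn split_line(line: &str, delimiter: char) -> Vec<String> {
    line.split(delimiter).map(|f| f.trim().to_string()).collect()
}

/// Return (column names, data rows) for the given lines.
/// With a header, the first line supplies the names; without one,
/// placeholder names are generated (assumed naming scheme: column_1, column_2, ...).
fn read_table(
    lines: &[&str],
    delimiter: char,
    has_header: bool,
) -> (Vec<String>, Vec<Vec<String>>) {
    let mut rows: Vec<Vec<String>> = lines.iter().map(|l| split_line(l, delimiter)).collect();
    if rows.is_empty() {
        return (Vec::new(), rows);
    }
    let names = if has_header {
        // The first line is consumed as the header and is not a data row.
        rows.remove(0)
    } else {
        // Every line is data; invent one name per column.
        (1..=rows[0].len()).map(|i| format!("column_{i}")).collect()
    };
    (names, rows)
}

fn main() {
    let lines = ["id;name", "1;Ada", "2;Linus"];
    let (names, rows) = read_table(&lines, ';', true);
    println!("{names:?} -> {} data rows", rows.len());
    let (names, rows) = read_table(&lines, ';', false);
    println!("{names:?} -> {} data rows", rows.len());
}
```

Passing --no-header to a file that does have a header would turn the header line into a data row, which in turn can push every column toward a string type during schema inference.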

Use 4 worker threads for faster processing:

cc2p --worker 4 large_data.csv
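
How CC2P schedules work across its workers is not described here. As a rough sketch of one common approach, the example below lets a fixed number of threads pull file names from a shared queue. The convert function is a hypothetical stand-in that only derives an output name (the .parquet naming is an assumption), and convert_all is likewise illustrative.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Placeholder for the real conversion step (hypothetical): it only
/// derives an output file name from the input name.
fn convert(path: &str) -> String {
    match path.strip_suffix(".csv") {
        Some(stem) => format!("{stem}.parquet"),
        None => format!("{path}.parquet"),
    }
}

/// Process `paths` with `workers` threads pulling from a shared queue.
/// Returns the produced output names in sorted order.
fn convert_all(paths: Vec<String>, workers: usize) -> Vec<String> {
    let queue = Arc::new(Mutex::new(paths));
    let done = Arc::new(Mutex::new(Vec::new()));
    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let queue = Arc::clone(&queue);
            let done = Arc::clone(&done);
            thread::spawn(move || loop {
                // Take the next file; the queue lock is released at the end of this statement.
                let next = queue.lock().unwrap().pop();
                match next {
                    Some(path) => {
                        // Convert without holding any lock, then record the result.
                        let out = convert(&path);
                        done.lock().unwrap().push(out);
                    }
                    None => break, // queue drained, worker exits
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let mut out = done.lock().unwrap().clone();
    out.sort();
    out
}

fn main() {
    let files = vec!["a.csv".to_string(), "b.csv".to_string(), "c.csv".to_string()];
    for out in convert_all(files, 4) {
        println!("{out}");
    }
}
```

With a per-file queue like this, extra workers only help when there are several files to convert; whether CC2P also parallelizes within a single file is not stated in this document.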

Options

  • -d, --delimiter: Delimiter character used in the CSV files (default: ,)
  • -n, --no-header: Treat the CSV files as having no header row (default: false)
  • -w, --worker: Number of worker threads to use (default: 1)
  • -s, --sampling: Number of rows to sample when inferring the schema (default: 2048)
  • -i, --interactive: Show an interactive UI to browse files and select columns
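
The sampling option limits how many rows are inspected before column types are fixed. The sketch below shows the general idea of sample-based inference using only the standard library. The ColumnType enum, the widening rules in merge, and infer_schema are assumptions made for illustration; they are not CC2P's or Arrow's actual inference logic.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnType {
    Null, // no non-empty value seen in the sample yet
    Int64,
    Float64,
    Boolean,
    Utf8,
}

/// Classify a single field by trying progressively looser parses.
fn type_of(value: &str) -> ColumnType {
    let v = value.trim();
    if v.is_empty() {
        ColumnType::Null
    } else if v.parse::<i64>().is_ok() {
        ColumnType::Int64
    } else if v.parse::<f64>().is_ok() {
        ColumnType::Float64
    } else if v.eq_ignore_ascii_case("true") || v.eq_ignore_ascii_case("false") {
        ColumnType::Boolean
    } else {
        ColumnType::Utf8
    }
}

/// Combine the type seen so far with a newly observed type.
fn merge(a: ColumnType, b: ColumnType) -> ColumnType {
    use ColumnType::*;
    match (a, b) {
        (x, y) if x == y => x,
        (Null, x) | (x, Null) => x,                     // empty fields do not change the type
        (Int64, Float64) | (Float64, Int64) => Float64, // widen integers to floats
        _ => Utf8,                                      // any other mix falls back to strings
    }
}

/// Infer one type per column from at most `sampling` rows.
fn infer_schema(rows: &[Vec<&str>], sampling: usize) -> Vec<ColumnType> {
    let width = rows.first().map_or(0, |r| r.len());
    let mut schema = vec![ColumnType::Null; width];
    for row in rows.iter().take(sampling) {
        for (slot, value) in schema.iter_mut().zip(row) {
            *slot = merge(*slot, type_of(value));
        }
    }
    schema
}

fn main() {
    let rows = vec![vec!["1", "2.5"], vec!["2", "3"], vec!["oops", "4"]];
    println!("{:?}", infer_schema(&rows, 2)); // third row is outside the sample
    println!("{:?}", infer_schema(&rows, 3)); // third row is included
}
```

In this sketch, a sample of two rows types the first column as Int64 even though the third row holds text. Raising the sample size trades some speed for a schema that reflects more of the file.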

Interactive Mode

CC2P includes an interactive Terminal User Interface (TUI) that allows you to browse CSV files, view inferred schemas, and selectively export columns.

cc2p -i

Key      Action
↑/↓      Navigate through files or columns
Tab      Switch between File List and Column List panels
Space    Select/Unselect the highlighted column
Enter    Export selected columns to Parquet
Q        Quit the application
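
The select-then-export flow can be pictured as a set of chosen column indices applied to each row. The following is a small sketch of that idea, assuming a toggle-style selection; toggle and project are hypothetical helpers, not functions from CC2P.

```rust
use std::collections::BTreeSet;

/// Toggle a column index in the selection (the role Space plays in this sketch).
fn toggle(selected: &mut BTreeSet<usize>, column: usize) {
    // `remove` returns false when the index was not selected, so insert it.
    if !selected.remove(&column) {
        selected.insert(column);
    }
}

/// Keep only the selected columns of a row, in original column order.
/// Indices beyond the row's width are ignored.
fn project<'a>(row: &[&'a str], selected: &BTreeSet<usize>) -> Vec<&'a str> {
    selected.iter().filter_map(|&i| row.get(i).copied()).collect()
}

fn main() {
    let header = ["id", "name", "email"];
    let mut selected = BTreeSet::new();
    toggle(&mut selected, 2);
    toggle(&mut selected, 0);
    toggle(&mut selected, 1);
    toggle(&mut selected, 1); // a second press unselects the column
    println!("{:?}", project(&header, &selected)); // prints ["id", "email"]
}
```

A BTreeSet keeps the indices sorted, so the exported columns keep their original order regardless of the order in which they were selected.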

Platform Notes

macOS

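The macOS binary is not yet signed or notarized by Apple, so macOS may block it on first launch. After downloading, clear the file's extended attributes (including the quarantine flag):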

xattr -c cc2p

Linux

Install via Snap:

sudo snap install cc2p

Technical Requirements

  • Rust: 1.88.0+ (Edition 2024)
  • Minimum Memory: Depends on CSV file size

License

MIT — see LICENSE.