tap uses Arrow, and thus the Arrow type system, for query execution. The SQL types from sqlparser-rs are mapped to Arrow data types according to the tables below. This mapping occurs when defining the schema in a CREATE EXTERNAL TABLE command or when performing a SQL CAST operation.
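As a rough illustration of where this mapping applies, the statement below declares a schema using SQL types. Going by the tables in this section, the three columns would be stored as Int64, Utf8, and Date32. The table name, column names, file path, and the STORED AS and LOCATION clauses are assumptions made for this sketch and are not taken from this page.

```sql
-- Hypothetical table and file; the declared column types are the point here.
CREATE EXTERNAL TABLE users (
    id     BIGINT,   -- maps to Arrow Int64
    name   VARCHAR,  -- maps to Arrow Utf8
    joined DATE      -- maps to Arrow Date32
)
STORED AS CSV
LOCATION 'users.csv';
```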
You can see the corresponding Arrow type for any SQL expression using the arrow_typeof function. For example:
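A minimal query of this kind, with an interval literal chosen only for illustration:

```sql
-- Reports the Arrow type of the interval expression.
SELECT arrow_typeof(INTERVAL '1 month');
```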
You can cast a SQL expression to a specific Arrow type using the arrow_cast function. For example, to cast the output of now() to a Timestamp with second precision:
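A sketch of that cast, assuming arrow_cast takes the target type as a string written the way it appears under Supported Arrow Types:

```sql
-- The second argument names the target Arrow type (assumed string form).
SELECT arrow_cast(now(), 'Timestamp(Second, None)');
```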
Character Types
SQL DataType | Arrow DataType |
---|---|
CHAR | Utf8 |
VARCHAR | Utf8 |
TEXT | Utf8 |
STRING | Utf8 |
Numeric Types
SQL DataType | Arrow DataType |
---|---|
TINYINT | Int8 |
SMALLINT | Int16 |
INT or INTEGER | Int32 |
BIGINT | Int64 |
TINYINT UNSIGNED | UInt8 |
SMALLINT UNSIGNED | UInt16 |
INT UNSIGNED or INTEGER UNSIGNED | UInt32 |
BIGINT UNSIGNED | UInt64 |
FLOAT | Float32 |
REAL | Float32 |
DOUBLE | Float64 |
DECIMAL(precision, scale) | Decimal128(precision, scale) |
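The numeric mappings can be spot-checked by combining CAST with arrow_typeof. The literals below are arbitrary, and the types in the comments are what the table above predicts rather than captured output.

```sql
SELECT
    arrow_typeof(CAST(42 AS BIGINT)),           -- table above: Int64
    arrow_typeof(CAST(42 AS INT UNSIGNED)),     -- table above: UInt32
    arrow_typeof(CAST(1.5 AS DECIMAL(10, 2)));  -- table above: Decimal128(10, 2)
```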
Date/Time Types
SQL DataType | Arrow DataType |
---|---|
DATE | Date32 |
TIME | Time64(Nanosecond) |
TIMESTAMP | Timestamp(Nanosecond, None) |
INTERVAL | Interval(IntervalMonthDayNano) |
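The same check works for date and time types. The string literals are illustrative, and their accepted formats are an assumption; the comments restate the table above.

```sql
SELECT
    arrow_typeof(CAST('2024-01-15' AS DATE)),                -- table above: Date32
    arrow_typeof(CAST('2024-01-15 10:30:00' AS TIMESTAMP));  -- table above: Timestamp(Nanosecond, None)
```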
Boolean Types
SQL DataType | Arrow DataType |
---|---|
BOOLEAN | Boolean |
Binary Types
SQL DataType | Arrow DataType |
---|---|
BYTEA | Binary |
You can create binary literals using a hex string literal such as X'1234' to create a Binary value of two bytes, 0x12 and 0x34.
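A short query showing the literal next to its reported type:

```sql
-- X'1234' is the two-byte value 0x12 0x34; arrow_typeof reports its Arrow type.
SELECT X'1234', arrow_typeof(X'1234');
```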
Unsupported SQL Types
SQL DataType | Arrow DataType |
---|---|
UUID | Not yet supported |
BLOB | Not yet supported |
CLOB | Not yet supported |
BINARY | Not yet supported |
VARBINARY | Not yet supported |
REGCLASS | Not yet supported |
NVARCHAR | Not yet supported |
CUSTOM | Not yet supported |
ARRAY | Not yet supported |
ENUM | Not yet supported |
SET | Not yet supported |
DATETIME | Not yet supported |
Supported Arrow Types
The following types are supported by the arrow_typeof function:
Arrow Type |
---|
Null |
Boolean |
Int8 |
Int16 |
Int32 |
Int64 |
UInt8 |
UInt16 |
UInt32 |
UInt64 |
Float16 |
Float32 |
Float64 |
Utf8 |
LargeUtf8 |
Binary |
Timestamp(Second, None) |
Timestamp(Millisecond, None) |
Timestamp(Microsecond, None) |
Timestamp(Nanosecond, None) |
Time32 |
Time64 |
Duration(Second) |
Duration(Millisecond) |
Duration(Microsecond) |
Duration(Nanosecond) |
Interval(YearMonth) |
Interval(DayTime) |
Interval(MonthDayNano) |
FixedSizeBinary(<len>) (e.g. FixedSizeBinary(16)) |
Decimal128(<precision>, <scale>) (e.g. Decimal128(3, 10)) |
Decimal256(<precision>, <scale>) (e.g. Decimal256(3, 10)) |
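Several of these types have no SQL type name in the mapping tables. One way to produce them is arrow_cast, sketched here on the assumption that it accepts the type names exactly as written in this list:

```sql
SELECT
    arrow_typeof(arrow_cast(1, 'Int8')),             -- requested type: Int8
    arrow_typeof(arrow_cast('hello', 'LargeUtf8'));  -- requested type: LargeUtf8
```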