---
title: 'Scalar Functions'
sidebar_label: 'Scalar Functions'
pagination_prev: 'reference/sql/ai'
sidebar_position: 6
---

:::info
Spice is built on Apache DataFusion and uses the PostgreSQL dialect, even when querying datasources with different SQL dialects. When using a data accelerator like DuckDB, function support is specific to each acceleration engine, and not all functions are supported by all acceleration engines.
:::

Scalar functions help transform, compute, and manipulate data at the row level. These functions are evaluated for each row in a query result and return a single value per invocation. Spice.ai supports a broad set of scalar functions, including math, string, conditional, date/time, array, struct, map, regular expression, and hashing functions. The function set closely follows the PostgreSQL dialect.

Spark-compatible scalar functions registered by Spice are documented here only when their Spark-specific behavior differs from the PostgreSQL equivalent, or when they have no PostgreSQL analogue. For functions not listed below, refer to the Spark SQL built-in function reference for semantics.

Math Functions

Math functions in Spice.ai SQL help perform numeric calculations, transformations, and analysis. These functions operate on numeric expressions, which can be constants, columns, or results of other functions and operators. The following math functions are supported:

`abs`

Returns the absolute value of a numeric expression. If the input is negative, the result is its positive equivalent; if the input is positive or zero, the result is unchanged.

Arguments

numeric_expression: Numeric value to evaluate. Accepts constants, columns, or expressions.

Example

`acos`

Returns the arc cosine (inverse cosine) of a numeric expression. The input must be in the range [-1, 1]. The result is in radians.

Arguments

numeric_expression: Value between -1 and 1.

`acosh`

Returns the inverse hyperbolic cosine of a numeric expression. The input must be greater than or equal to 1.

Arguments

numeric_expression: Value greater than or equal to 1.

`asin`

Returns the arc sine (inverse sine) of a numeric expression. The input must be in the range [-1, 1]. The result is in radians.

Arguments

numeric_expression: Value between -1 and 1.

`asinh`

Returns the inverse hyperbolic sine of a numeric expression.

Arguments

numeric_expression: Numeric value.

`atan`

Returns the arc tangent (inverse tangent) of a numeric expression. The result is in radians.

Arguments

numeric_expression: Numeric value.

`atan2`

Returns the arc tangent of the quotient of its arguments, that is, atan(expression_y / expression_x). The result is in radians and takes into account the signs of both arguments to determine the correct quadrant.

Arguments

expression_y: Numerator value.
expression_x: Denominator value.
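
The quadrant handling described above can be illustrated with Python's `math.atan2`, which uses the same (y, x) argument order. This is a sketch of the mathematical behavior only, not of Spice's implementation; the helper name `quadrant_demo` is made up for the illustration.

```python
import math


def quadrant_demo():
    """Compare atan2 with plain atan for two points that share the ratio y/x = 1."""
    first = math.atan2(1.0, 1.0)    # point (x=1, y=1), first quadrant
    third = math.atan2(-1.0, -1.0)  # point (x=-1, y=-1), third quadrant
    plain = math.atan(-1.0 / -1.0)  # atan only sees the ratio, so the quadrant is lost
    return first, third, plain


first, third, plain = quadrant_demo()
print(first, third, plain)
```

The two-argument form distinguishes the first and third quadrants, while the single-argument form returns the same angle for both points.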

`atanh`

Returns the inverse hyperbolic tangent of a numeric expression. The input must be in the range (-1, 1).

Arguments

numeric_expression: Value between -1 and 1 (exclusive).

`cbrt`

Returns the cube root of a numeric expression.

Arguments

numeric_expression: Numeric value.

`ceil`

Returns the smallest integer greater than or equal to the input value.

Arguments

numeric_expression: Numeric value.

`cos`

Returns the cosine of a numeric expression, where the input is in radians.

Arguments

numeric_expression: Value in radians.

`cosh`

Returns the hyperbolic cosine of a numeric expression.

Arguments

numeric_expression: Numeric value.

`cot`

Returns the cotangent of a numeric expression, where the input is in radians.

Arguments

numeric_expression: Value in radians.

`degrees`

Converts radians to degrees.

Arguments

numeric_expression: Value in radians.

`exp`

Returns the value of e (Euler's number) raised to the power of the input value.

Arguments

numeric_expression: Exponent value.

`factorial`

Returns the factorial of a non-negative integer. For values less than 2, returns 1.

Arguments

numeric_expression: Non-negative integer value.

`floor`

Returns the largest integer less than or equal to the input value.

Arguments

numeric_expression: Numeric value.

`gcd`

Returns the greatest common divisor of two integer expressions. If both inputs are zero, returns 0.

Arguments

expression_x: First integer value.
expression_y: Second integer value.

`isnan`

Returns true if the input is NaN (not a number), otherwise returns false.

Arguments

numeric_expression: Numeric value.

`iszero`

Returns true if the input is +0.0 or -0.0, otherwise returns false.

Arguments

numeric_expression: Numeric value.

`lcm`

Returns the least common multiple of two integer expressions. If either input is zero, returns 0.

Arguments

expression_x: First integer value.
expression_y: Second integer value.

`mod`

Returns the remainder after dividing the first argument by the second, matching the Spark SQL %/mod semantics for signed values.

Arguments

dividend: Numeric expression to divide.
divisor: Numeric expression that cannot be zero.

Example

Reference: Spark SQL mod.

`pmod`

Returns a non-negative remainder for integer or floating-point division when the divisor is positive. When the standard remainder is negative, the divisor is added to produce a non-negative result, mirroring Spark SQL behavior.

Arguments

dividend: Numeric expression to divide.
divisor: Numeric expression that cannot be zero.

Example
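
The difference between `mod` and `pmod` can be modeled in a few lines of Python. This sketch assumes the Spark-style rule in which `mod` is a truncated remainder (the sign follows the dividend) and `pmod` adds the divisor back when that remainder is negative. The function names `spark_mod` and `spark_pmod` are hypothetical and exist only for this illustration.

```python
import math


def spark_mod(dividend, divisor):
    """Truncated remainder: the result takes the sign of the dividend."""
    return math.fmod(dividend, divisor)


def spark_pmod(dividend, divisor):
    """Remainder shifted into the non-negative range for a positive divisor."""
    remainder = math.fmod(dividend, divisor)
    if remainder < 0:
        # Add the divisor once, then reduce again.
        remainder = math.fmod(remainder + divisor, divisor)
    return remainder


print(spark_mod(-7, 3), spark_pmod(-7, 3))
```

For positive dividends the two functions agree; they differ only when the truncated remainder is negative.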

Reference: Spark SQL pmod.

`ln`

Returns the natural logarithm (base e) of a numeric expression.

Arguments

numeric_expression: Positive numeric value.

`log`

Returns the logarithm of a numeric expression. If a base is provided, returns the logarithm to that base; otherwise, returns the base-10 logarithm.

Arguments

base: Base of the logarithm (optional).
numeric_expression: Positive numeric value.

`log10`

Returns the base-10 logarithm of a numeric expression.

Arguments

numeric_expression: Positive numeric value.

`log2`

Returns the base-2 logarithm of a numeric expression.

Arguments

numeric_expression: Positive numeric value.

`nanvl`

Returns the first argument if it is not NaN; otherwise, returns the second argument.

Arguments

expression_x: Value to return if not NaN.
expression_y: Value to return if the first argument is NaN.

`pi`

Returns an approximate value of π (pi).

`pow` and `power`

Returns the value of the first argument raised to the power of the second argument. pow is an alias for power.

Arguments

base: Numeric value to raise.
exponent: Power to raise the base to.

`radians`

Converts degrees to radians.

Arguments

numeric_expression: Value in degrees.

`random`

Returns a random floating-point value in the range [0, 1). The random seed is unique for each row.

`round`

Rounds a numeric expression to the nearest integer or to a specified number of decimal places.

Arguments

numeric_expression: Value to round.
decimal_places: Optional. Number of decimal places to round to. Defaults to 0.

`rint`

Rounds a double-precision value to the nearest integer using IEEE-754 "round to nearest, ties to even" rules and returns the rounded value as a floating-point number, matching Spark SQL semantics.

Arguments

numeric_expression: Floating-point expression to round. Integers are implicitly cast to double.

Example
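
Python's built-in `round` also uses "ties to even" for floats, so it can serve as a small model of the rounding rule. This is a sketch of the rule rather than Spice's implementation; passing NaN and infinities through unchanged is an assumption based on usual IEEE-754 behavior.

```python
import math


def rint(value):
    """Round to the nearest integer, ties to even, and return a float."""
    x = float(value)  # integers are widened to double first
    if math.isnan(x) or math.isinf(x):
        return x  # assumption: special values pass through unchanged
    return float(round(x))


for sample in (2.5, 3.5, -1.5, 12.3456):
    print(sample, rint(sample))
```

Note that 2.5 and 3.5 round to different neighbors (2.0 and 4.0), which is the visible difference from "round half away from zero".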

Reference: Spark SQL rint.

`signum`

Returns the sign of a numeric expression. Returns -1 for negative numbers, 1 for zero and positive numbers.

Arguments

numeric_expression: Numeric value.

`sin`

Returns the sine of a numeric expression, where the input is in radians.

Arguments

numeric_expression: Value in radians.

`sinh`

Returns the hyperbolic sine of a numeric expression.

Arguments

numeric_expression: Numeric value.

`sqrt`

Returns the square root of a numeric expression.

Arguments

numeric_expression: Non-negative numeric value.

`tan`

Returns the tangent of a numeric expression, where the input is in radians.

Arguments

numeric_expression: Value in radians.

`tanh`

Returns the hyperbolic tangent of a numeric expression.

Arguments

numeric_expression: Numeric value.

`trunc`

Truncates a numeric expression to a whole number or to a specified number of decimal places. If decimal_places is positive, truncates digits to the right of the decimal point; if negative, truncates digits to the left.

Arguments

numeric_expression: Value to truncate.
decimal_places: Optional. Number of decimal places to truncate to. Defaults to 0.
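
The effect of positive and negative `decimal_places` can be sketched in Python. The version below uses the `decimal` module to avoid binary floating-point noise; it is a model of the behavior described above, not the engine's implementation.

```python
from decimal import Decimal, ROUND_DOWN


def trunc(value, decimal_places=0):
    """Drop digits beyond decimal_places, always rounding toward zero."""
    # 10 ** -decimal_places as an exact decimal, e.g. 0.01 for 2 or 1E+2 for -2.
    quantum = Decimal(1).scaleb(-decimal_places)
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_DOWN))


print(trunc(123.456, 2))   # keeps two digits to the right of the decimal point
print(trunc(1234.5, -2))   # zeroes two digits to the left of the decimal point
print(trunc(-7.9))         # truncation moves toward zero, not toward -infinity
```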

`width_bucket`

Assigns a value to an equiwidth histogram bucket. Returns 0 when the value is below min_value, num_bucket + 1 when it is above max_value, and otherwise the 1-based bucket index, mirroring Spark SQL behavior.

Arguments

value: Numeric or interval expression to bin.
min_value: Lower bound of the histogram range.
max_value: Upper bound of the histogram range.
num_bucket: Positive integer specifying the number of buckets.

Example
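
The bucketing rule can be sketched in Python as follows. The sketch assumes an ascending range (`min_value < max_value`) and treats a value equal to `max_value` as overflow, which is the usual `width_bucket` convention; both points are assumptions rather than statements about Spice's implementation.

```python
import math


def width_bucket(value, min_value, max_value, num_bucket):
    """Return the 1-based equiwidth bucket index for value."""
    if num_bucket <= 0:
        raise ValueError("num_bucket must be a positive integer")
    if value < min_value:
        return 0  # underflow bucket
    if value >= max_value:
        return num_bucket + 1  # overflow bucket (assumption: max_value itself overflows)
    fraction = (value - min_value) / (max_value - min_value)
    return math.floor(num_bucket * fraction) + 1


print(width_bucket(5.3, 0.2, 10.6, 5))
```

With five buckets over [0.2, 10.6), each bucket is 2.08 wide, so 5.3 lands in the third bucket.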

Reference: Spark SQL width_bucket.

Conditional Functions

Conditional functions help handle null values, select among alternatives, and compare multiple expressions. These are useful for data cleaning and conditional logic in queries.

`CASE`

Standard SQL CASE expression, supported in both simple and searched forms.

Example

`coalesce`

Returns the first non-null value from its arguments. Returns NULL only if every argument is NULL.

Arguments

expression1, ..., expression_n: Expressions to evaluate in order. All arguments must share a common type.

Example

`greatest`

Returns the largest value among the arguments, ignoring NULLs. Returns NULL only if every argument is NULL.

Arguments

expression1, ..., expression_n: Expressions to compare. Must share a common type.

Example

`if`

Evaluates a boolean condition and returns one of two expressions, matching the Spark SQL if function semantics.

Arguments

condition: Boolean expression determining which branch to take.
true_value: Expression to return when the condition evaluates to true.
false_value: Expression to return when the condition evaluates to false or NULL.

Example

Reference: Spark SQL if.

`least`

Returns the smallest value among the arguments, ignoring NULLs. Returns NULL only if every argument is NULL.

Arguments

expression1, ..., expression_n: Expressions to compare. Must share a common type.

Example
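
The NULL handling of `greatest` and `least` can be modeled in Python with `None` standing in for NULL. This is a sketch of the documented rule (NULLs are skipped, and the result is NULL only when every argument is NULL), not the engine's code.

```python
def greatest(*args):
    """Largest non-None argument, or None if every argument is None."""
    present = [a for a in args if a is not None]
    return max(present) if present else None


def least(*args):
    """Smallest non-None argument, or None if every argument is None."""
    present = [a for a in args if a is not None]
    return min(present) if present else None


print(greatest(1, None, 3), least(4, None, 2), least(None, None))
```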

`nullif`

Returns NULL if expression1 equals expression2; otherwise returns expression1. Useful for converting sentinel values to NULL.

Arguments

expression1: Value to return if it differs from expression2.
expression2: Value to compare against.

Example

`nvl`

Returns expression2 when expression1 is NULL; otherwise returns expression1. Equivalent to coalesce(expression1, expression2).

Alias: ifnull.

`nvl2`

Returns expression2 if expression1 is not NULL; otherwise returns expression3.
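
The NULL-handling functions in this section follow a small set of rules that can be written out in Python, with `None` standing in for NULL. The definitions below are a sketch of the documented semantics only.

```python
def coalesce(*args):
    """First non-None argument, or None if there is none."""
    return next((a for a in args if a is not None), None)


def nullif(expression1, expression2):
    """None when both arguments are equal, otherwise expression1."""
    return None if expression1 == expression2 else expression1


def nvl(expression1, expression2):
    """expression2 when expression1 is None, otherwise expression1."""
    return expression2 if expression1 is None else expression1


def nvl2(expression1, expression2, expression3):
    """expression2 when expression1 is not None, otherwise expression3."""
    return expression2 if expression1 is not None else expression3


print(coalesce(None, None, "fallback"), nullif(-1, -1), nvl(None, 0), nvl2("x", "set", "unset"))
```

`nullif(-1, -1)` shows the sentinel use case: a placeholder value such as -1 is turned into NULL so that later functions like `coalesce` can replace it.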

String Functions

String functions in Spice.ai SQL help manipulate, analyze, and transform text data. These functions operate on string expressions, which can be constants, columns, or results of other functions. The implementation closely follows the PostgreSQL dialect. The following string functions are supported:

`ascii`

Returns the Unicode code point of the first character in a string. If the string is empty, returns 0.

Arguments

str: String expression. Accepts constants, columns, or expressions.

Example

Related function: chr

`bit_length`

Returns the number of bits in the string. Each character is counted according to its byte representation (8 bits per byte).

Arguments

str: String expression.

Example

Related functions: length, octet_length

`btrim`

Removes the longest string containing only characters in trim_str from the start and end of str. If trim_str is omitted, whitespace is removed.

Arguments

str: String expression.
trim_str: Optional string of characters to trim. Defaults to whitespace.

Example

Alternative syntax: trim(BOTH trim_str FROM str) or trim(trim_str FROM str)

Aliases: trim

Related functions: ltrim, rtrim

`char_length`

Alias of character_length.

`character_length`

Returns the number of characters in a string, not bytes. Handles Unicode correctly.

Arguments

str: String expression.

Example

Aliases: length, char_length

Related functions: bit_length, octet_length

`chr`

Returns the character with the specified Unicode code point.

Arguments

expression: Integer code point.

Example

Related function: ascii

`concat`

Concatenates two or more strings into a single string.

Arguments

str: String expression.
str_n: Additional string expressions.

Example

Related function: concat_ws

`concat_ws`

Concatenates strings using a separator between each value.

Arguments

separator: String separator.
str: String expression.
str_n: Additional string expressions.

Example

Related function: concat

`contains`

Returns true if search_str is found within str. The search is case-sensitive.

Arguments

str: String expression.
search_str: Substring to search for.

Example

`like`

Performs SQL pattern matching using % to match zero or more characters and _ to match a single character. The optional SQL ESCAPE clause can be used to treat a wildcard literally, matching Spark SQL behavior.

Arguments

str: String expression to compare.
pattern: Pattern containing literal text plus % and _ wildcards.

Example

Reference: Spark SQL like.

`ilike`

Case-insensitive variant of like that treats ASCII characters in str and pattern without regard to case. The optional SQL ESCAPE clause may be used to treat % or _ literally.

Arguments

str: String expression to compare.
pattern: Case-insensitive pattern containing literal text plus % and _ wildcards.

Example
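
One way to reason about `like` and `ilike` patterns is to translate them into anchored regular expressions. The Python sketch below does that; the helper `like_to_regex` and the `escape` keyword are hypothetical names for this illustration, and the ASCII-only case folding mirrors the description of `ilike` above.

```python
import re


def like_to_regex(pattern, escape=None):
    """Translate a SQL LIKE pattern into a Python regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped wildcard is literal
            i += 2
            continue
        if ch == "%":
            out.append(".*")  # zero or more characters
        elif ch == "_":
            out.append(".")   # exactly one character
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


def like(s, pattern, escape=None):
    return re.fullmatch(like_to_regex(pattern, escape), s, re.DOTALL) is not None


def ilike(s, pattern, escape=None):
    flags = re.DOTALL | re.IGNORECASE | re.ASCII
    return re.fullmatch(like_to_regex(pattern, escape), s, flags) is not None


print(like("Spice", "Sp%"), like("Spice", "sp%"), ilike("Spice", "sp%"))
```

The pattern must match the whole string, which is why the sketch uses `re.fullmatch` rather than `re.search`.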

Reference: Spark SQL ilike.

`ends_with`

Returns true if str ends with the substring substr.

Arguments

str: String expression.
substr: Substring to test for.

Example

`find_in_set`

Returns the position (1-based) of str in the comma-separated list strlist. Returns 0 if not found.

Arguments

str: String to find.
strlist: Comma-separated list of substrings.

Example

`initcap`

Capitalizes the first character of each word in the string. Words are delimited by non-alphanumeric characters.

Arguments

str: String expression.

Example

Related functions: lower, upper

`instr`

Alias of strpos.

`left`

Returns the first n characters from the left side of the string.

Arguments

str: String expression.
n: Number of characters to return.

Example

Related function: right

`length`

Alias of character_length.

`levenshtein`

Returns the Levenshtein distance between two strings.

Arguments

str1: First string.
str2: Second string.

Example
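
The Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A standard dynamic-programming formulation, shown here in Python as an illustration of the metric rather than of the engine's code, looks like this:

```python
def levenshtein(str1, str2):
    """Edit distance using two rolling rows of the DP table."""
    previous = list(range(len(str2) + 1))  # distance from "" to each prefix of str2
    for i, ch1 in enumerate(str1, start=1):
        current = [i]  # distance from the str1 prefix to ""
        for j, ch2 in enumerate(str2, start=1):
            current.append(min(
                previous[j] + 1,                 # deletion
                current[j - 1] + 1,              # insertion
                previous[j - 1] + (ch1 != ch2),  # substitution (free if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```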

`lower`

Converts all characters in the string to lower case.

Arguments

str: String expression.

Example

Related functions: initcap, upper

`luhn_check`

Validates that a string of digits satisfies the Luhn checksum, returning true for valid numbers and false otherwise. This matches the Spark SQL implementation and is useful for validating identifiers such as credit card numbers.

Arguments

str: String expression containing digits.

Example
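
The Luhn checksum itself is simple to state: starting from the rightmost digit, double every second digit, subtract 9 from any result above 9, and require the total to be divisible by 10. The Python sketch below implements that rule; returning false for empty or non-digit input is an assumption made for this illustration.

```python
def luhn_check(value):
    """Return True when the digit string passes the Luhn checksum."""
    if not value or not value.isascii() or not value.isdigit():
        return False  # assumption: empty or non-digit input is invalid
    total = 0
    for index, ch in enumerate(reversed(value)):
        digit = int(ch)
        if index % 2 == 1:  # every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_check("79927398713"), luhn_check("79927398710"))
```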

Reference: Spark SQL luhn_check.

`lpad`

Pads the left side of the string with another string until the result reaches the specified length. If the padding string is omitted, a space is used.

Arguments

str: String expression.
n: Target length.
padding_str: Optional string to pad with.

Example

Related function: rpad

`ltrim`

Removes the longest string containing only characters in trim_str from the start of str. If trim_str is omitted, whitespace is removed.

Arguments

str: String expression.
trim_str: Optional string of characters to trim. Defaults to whitespace.

Example

Alternative syntax: trim(LEADING trim_str FROM str)

Related functions: btrim, rtrim

`octet_length`

Returns the number of bytes in the string.

Arguments

str: String expression.

Example

Related functions: bit_length, length

`overlay`

Replaces a substring of str with substr, starting at position pos for count characters. If count is omitted, uses the length of substr.

Arguments

str: String expression.
substr: Replacement string.
pos: Start position (1-based).
count: Optional number of characters to replace.

Example
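
The replacement rule can be modeled with Python slicing. This sketch follows the description above (1-based `pos`, `count` defaulting to the length of `substr`) and does not attempt to cover edge cases such as out-of-range positions.

```python
def overlay(s, substr, pos, count=None):
    """Replace count characters of s, starting at 1-based pos, with substr."""
    if count is None:
        count = len(substr)
    start = pos - 1  # convert to a 0-based index
    return s[:start] + substr + s[start + count:]


print(overlay("Txxxxas", "hom", 2, 4))
print(overlay("Spice AI", "ai", 7))
```

In the first call, four characters (`xxxx`) are replaced by three (`hom`), so the result is shorter than the input.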

`parse_url`

Extracts a component from a URL, or retrieves an individual query parameter when provided a key, following Spark SQL semantics. Supported parts include HOST, PATH, QUERY, REF, PROTOCOL, FILE, and AUTHORITY.

Arguments

url: URL string expression.
part_to_extract: Case-insensitive token identifying which component to extract.
key: Optional query parameter key to extract from the QUERY part.

Example
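
The parts listed above map closely onto the components returned by Python's `urllib.parse.urlsplit`, which makes it a convenient way to see what each token refers to. This is an approximate model: details such as percent-decoding of query values and host-name case may differ from the engine's behavior.

```python
from urllib.parse import parse_qs, urlsplit


def parse_url(url, part_to_extract, key=None):
    """Return one URL component, or one query parameter when key is given."""
    parts = urlsplit(url)
    part = part_to_extract.upper()  # the token is case-insensitive
    if part == "QUERY" and key is not None:
        values = parse_qs(parts.query).get(key)
        return values[0] if values else None
    file_part = parts.path + ("?" + parts.query if parts.query else "")
    components = {
        "HOST": parts.hostname,
        "PATH": parts.path,
        "QUERY": parts.query or None,
        "REF": parts.fragment or None,
        "PROTOCOL": parts.scheme or None,
        "FILE": file_part,
        "AUTHORITY": parts.netloc or None,
    }
    return components.get(part)


url = "http://spark.apache.org/path?query=1"
print(parse_url(url, "HOST"), parse_url(url, "QUERY"), parse_url(url, "QUERY", "query"))
```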

Reference: Spark SQL parse_url.

`position`

Alias of strpos.

`repeat`

Returns a string consisting of the input string repeated n times.

Arguments

str: String expression.
n: Number of repetitions.

Example

`replace`

Replaces all occurrences of substr in str with replacement.

Arguments

str: String expression.
substr: Substring to replace.
replacement: Replacement string.

Example

`reverse`

Returns the string with the character order reversed.

Arguments

str: String expression.

Example

`right`

Returns the last n characters from the right side of the string.

Arguments

str: String expression.
n: Number of characters to return.

Example

Related function: left

`rpad`

Pads the right side of the string with another string until the result reaches the specified length. If the padding string is omitted, a space is used.

Arguments

str: String expression.
n: Target length.
padding_str: Optional string to pad with.

Example

Related function: lpad

`rtrim`

Removes the longest string containing only characters in trim_str from the end of str. If trim_str is omitted, whitespace is removed.

Arguments

str: String expression.
trim_str: Optional string of characters to trim. Defaults to whitespace.

Example

Alternative syntax: trim(TRAILING trim_str FROM str)

Related functions: btrim, ltrim

`split_part`

Splits the string on the specified delimiter and returns the substring at the given position (1-based).

Arguments

str: String expression.
delimiter: Delimiter string.
pos: Position of the part to return (1-based).

Example

`starts_with`

Returns true if str starts with the substring substr.

Arguments

str: String expression.
substr: Substring to test for.

Example

`strpos`

Returns the position (1-based) of the first occurrence of substr in str. Returns 0 if not found.

Arguments

str: String expression.
substr: Substring to search for.

Example

Alternative syntax: position(substr in str)

Aliases: instr, position

`substr`

Extracts a substring from str, starting at start_pos for length characters. If length is omitted, returns the rest of the string.

Arguments

str: String expression.
start_pos: Start position (1-based).
length: Optional number of characters to extract.

Example

Alternative syntax: substring(str from start_pos for length)

Aliases: substring

`substr_index`

Returns the substring from str before or after a specified number of occurrences of the delimiter delim. If count is positive, returns everything to the left of the final delimiter (counting from the left). If count is negative, returns everything to the right of the final delimiter (counting from the right).

Arguments

str: String expression.
delim: Delimiter string.
count: Number of occurrences (positive or negative).

Example
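
The positive and negative `count` cases can be modeled by splitting on the delimiter and rejoining a slice of the parts. The Python sketch below follows the description above; returning an empty string for a count of zero is an assumption.

```python
def substr_index(s, delim, count):
    """Return the part of s before (count > 0) or after (count < 0) the count-th delimiter."""
    if count == 0:
        return ""  # assumption: a zero count yields an empty string
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])  # counting delimiters from the left
    return delim.join(parts[count:])      # counting delimiters from the right


print(substr_index("www.apache.org", ".", 2))
print(substr_index("www.apache.org", ".", -1))
```

If the string contains fewer delimiters than `abs(count)`, the slice covers every part and the whole string is returned.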

Aliases: substring_index

`substring`

Alias of substr.

`substring_index`

Alias of substr_index.

`to_hex`

Converts an integer to its hexadecimal string representation.

Arguments

int: Integer expression.

Example

`translate`

Replaces each character in str that matches a character in chars with the corresponding character in translation. If translation is shorter than chars, extra characters are removed.

Arguments

str: String expression.
chars: Characters to translate.
translation: Replacement characters.

Example
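
The character-by-character mapping, including removal of characters that have no counterpart in `translation`, can be modeled with Python's `str.translate`. This is a sketch of the rule described above; letting the first occurrence win when `chars` contains duplicates is an assumption.

```python
def translate(s, chars, translation):
    """Map each character in chars to the character at the same index in translation."""
    table = {}
    for index, ch in enumerate(chars):
        if ord(ch) in table:
            continue  # assumption: the first occurrence of a duplicate wins
        # Characters beyond the end of translation map to None, which deletes them.
        table[ord(ch)] = translation[index] if index < len(translation) else None
    return s.translate(table)


print(translate("twice", "wic", "her"))
print(translate("abcdef", "abc", "x"))
```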

`trim`

Alias of btrim.

`upper`

Converts all characters in the string to upper case.

Arguments

str: String expression.

Example

Related functions: initcap, lower

`uuid`

Returns a UUID v4 string value that is unique per row.

Example

Binary String Functions

Binary string functions help encode and decode binary data, such as base64 and hexadecimal conversions. These are useful for working with encoded data or binary blobs.

`bit_get`

Returns the bit (0 or 1) at the specified zero-based position when counting from the least-significant bit of an integral or binary expression, matching Spark SQL semantics.

Arguments

value: Integer or binary expression whose bits are inspected.
position: Zero-based index of the bit to return. Must be non-negative.

Example

Reference: Spark SQL bit_get.

`bit_count`

Counts the number of set bits in an integral or binary expression. Useful for quick popcount operations on bitmaps or packed flags, aligned with Spark SQL behavior.

Arguments

value: Integer or binary expression.

Example
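
For integer inputs, `bit_get` and `bit_count` reduce to a shift-and-mask and a population count. The Python sketch below models only the integer case; treating values as 64-bit two's complement is an assumption, and binary inputs are not covered.

```python
def bit_get(value, position):
    """Bit at zero-based position, counting from the least-significant bit."""
    if position < 0:
        raise ValueError("position must be non-negative")
    return (value >> position) & 1


def bit_count(value):
    """Number of set bits, viewing the value as a 64-bit two's complement integer."""
    return bin(value & 0xFFFFFFFFFFFFFFFF).count("1")


# 11 is 0b1011: bits 0, 1, and 3 are set.
print([bit_get(11, p) for p in range(4)], bit_count(11))
```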

Reference: Spark SQL bit_count.

`bitmap_count`

Returns the number of set bits in a binary bitmap produced by functions such as bitmap_construct_agg, mirroring the Spark SQL implementation.

Arguments

bitmap: Binary expression representing a bitmap.

Example

Reference: Spark SQL bitmap_count.

Regular Expression Functions

Regular expression functions help match, extract, and replace patterns in strings. Spice.ai uses a PCRE-like regular expression syntax. Spice supports the following regular expression functions:

regexp_like
regexp_match
regexp_replace
regexp_count
regexp_instr

`regexp_like`

Returns true if a regular expression has at least one match in a string, false otherwise.

Arguments

str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?

Example
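
The effect of the `i`, `m`, and `s` flags can be tried out with Python's `re` module, which has direct equivalents for those three. This is only an approximation: Python's regex engine is not the one Spice uses, and the `R` and `U` flags have no counterpart here, so the sketch rejects them.

```python
import re

# Flags with a direct Python equivalent; R and U are intentionally not modeled.
_FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def regexp_like(s, regexp, flags=""):
    """Return True when regexp matches anywhere in s."""
    combined = 0
    for flag in flags:
        if flag not in _FLAGS:
            raise ValueError(f"flag {flag!r} is not modeled in this sketch")
        combined |= _FLAGS[flag]
    return re.search(regexp, s, combined) is not None


print(regexp_like("Spice", "spice"), regexp_like("Spice", "spice", "i"))
print(regexp_like("a\nb", "^b$"), regexp_like("a\nb", "^b$", "m"))
print(regexp_like("a\nb", "a.b"), regexp_like("a\nb", "a.b", "s"))
```

Each pair of calls differs only in the flag, which isolates what that flag changes.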

`regexp_match`

Returns the first match of a regular expression in a string.

Arguments

str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
regexp: Regular expression to match against. Can be a constant, column, or function.
flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?

Example

`regexp_replace`

Replaces substrings in a string that match a regular expression.

Arguments

str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
regexp: Regular expression to match against. Can be a constant, column, or function.
replacement: Replacement string expression to operate on. Can be a constant, column, or function, and any combination of operators.
flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- g: (global) Search globally and don’t return after the first match
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?

Example

`regexp_count`

Returns the number of matches that a regular expression has in a string.

Arguments

str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
start: Optional start position (the first position is 1) to search for the regular expression. Can be a constant, column, or function.
flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?

Example

`regexp_instr`

Returns the position in a string where the specified occurrence of a regular expression match is located.

Arguments

str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
start: Optional start position (the first position is 1) to search for the regular expression. Can be a constant, column, or function. Defaults to 1.
N: Optional. The N-th occurrence of pattern to find. Defaults to 1 (first match). Can be a constant, column, or function.
flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?
subexpr: Optional. Specifies which capture group (subexpression) to return the position for. Defaults to 0, which returns the position of the entire match.

Example
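
How `start`, `N`, and `subexpr` interact can be sketched in Python. The model below returns 1-based positions relative to the full string and returns 0 when there is no such match; the zero result and the use of Python's `re` engine are assumptions for the illustration, and flags are omitted for brevity.

```python
import re


def regexp_instr(s, regexp, start=1, n=1, subexpr=0):
    """1-based position of the n-th match (or of one capture group within it)."""
    # Slicing means anchors such as ^ apply to the slice; a simplification.
    for occurrence, match in enumerate(re.finditer(regexp, s[start - 1:]), start=1):
        if occurrence == n:
            offset = match.start(subexpr)  # 0-based within the slice, -1 if unused group
            return 0 if offset < 0 else offset + start
    return 0  # assumption: no match is reported as 0


print(regexp_instr("hello world hello", "hello"))
print(regexp_instr("hello world hello", "hello", n=2))
print(regexp_instr("key=value", r"(\w+)=(\w+)", subexpr=2))
```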

Time and Date Functions

Time and date functions help extract, format, and manipulate temporal data. Functions include current_date, now, date_part, date_trunc, and various conversion functions. These are essential for time series analysis and working with timestamps.

current_date
current_time
current_timestamp
date_bin
date_format
date_part
date_trunc
datepart
datetrunc
from_unixtime
make_date
now
to_char
to_date
to_local_time
to_timestamp
to_timestamp_micros
to_timestamp_millis
to_timestamp_nanos
to_timestamp_seconds
to_unixtime
today

`current_date`

Returns the current UTC date.

The current_date() return value is determined at query time and will return the same date, no matter when in the query plan the function executes.

Aliases

today

`current_time`

Returns the current UTC time.

The current_time() return value is determined at query time and will return the same time, no matter when in the query plan the function executes.

`current_timestamp`

Alias of now.

`date_bin`

Calculates time intervals and returns the start of the interval that contains the specified timestamp. Use date_bin to downsample time series data by grouping rows into time-based "bins" or "windows" and applying an aggregate or selector function to each window.

For example, if you "bin" or "window" data into 15 minute intervals, an input timestamp of 2023-01-01T18:18:18Z will be updated to the start time of the 15 minute bin it is in: 2023-01-01T18:15:00Z.

Arguments

interval: Bin interval. The following interval units are supported:
- nanoseconds
- microseconds
- milliseconds
- seconds
- minutes
- hours
- days
- weeks
- months
- years
- century
expression: Time expression to operate on. Can be a constant, column, or function.
origin-timestamp: Optional. Starting point used to determine bin boundaries. If not specified, defaults to 1970-01-01T00:00:00Z (the UNIX epoch in UTC).

Example
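
The binning arithmetic can be sketched in Python for fixed-width intervals: count how many whole intervals fit between the origin and the timestamp, then step that many intervals forward from the origin. The sketch does not model calendar units such as months or years, and it is an illustration rather than the engine's implementation.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_bin(interval, timestamp, origin=EPOCH):
    """Return the start of the fixed-width bin that contains timestamp."""
    whole_bins = (timestamp - origin) // interval  # floor division of two timedeltas
    return origin + whole_bins * interval


ts = datetime(2023, 1, 1, 18, 18, 18, tzinfo=timezone.utc)
print(date_bin(timedelta(minutes=15), ts))
print(date_bin(timedelta(minutes=15), ts, origin=EPOCH + timedelta(minutes=5)))
```

The second call shifts the origin by five minutes, which moves every bin boundary by the same amount (bins then start at :05, :20, :35, and :50).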

`date_add`

Adds a number of days to a DATE or TIMESTAMP expression, matching Spark SQL semantics. Negative offsets move backwards in time.

Arguments

start_date: DATE or TIMESTAMP expression.
num_days: Integer number of days to add.

Example

Reference: Spark SQL date_add.

`date_sub`

Subtracts a number of days from a DATE or TIMESTAMP expression using Spark-compatible behavior.

Arguments

start_date: DATE or TIMESTAMP expression.
num_days: Integer number of days to subtract.

Example

Reference: Spark SQL date_sub.

`last_day`

Returns the last day of the month that contains the input date or timestamp, matching Spark SQL semantics.

Arguments

expression: DATE or TIMESTAMP expression.

Example
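
The month-end calculation, including leap years, can be modeled with Python's `calendar` module. This is a sketch of the documented result for DATE inputs only.

```python
import calendar
from datetime import date


def last_day(d):
    """Return the last calendar day of the month that contains d."""
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)


print(last_day(date(2024, 2, 10)), last_day(date(2023, 2, 10)))
```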

Reference: Spark SQL last_day.

`next_day`

Returns the first date after start_date that matches the requested day of week. Valid day names include full names (e.g., Monday) or abbreviations such as Mon, matching Spark SQL behavior.

Arguments

start_date: DATE or TIMESTAMP expression.
day_of_week: String literal naming the target weekday.

Example
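
The "first date after start_date" rule means the result is always strictly later than the input, even when the input already falls on the requested weekday. A Python sketch of that rule, which recognizes day names by their first three letters (a simplification of the accepted name forms), looks like this:

```python
from datetime import date, timedelta

_WEEKDAYS = {"mon": 0, "tue": 1, "wed": 2, "thu": 3, "fri": 4, "sat": 5, "sun": 6}


def next_day(start_date, day_of_week):
    """First date strictly after start_date that falls on day_of_week."""
    target = _WEEKDAYS[day_of_week.strip().lower()[:3]]
    # 1..7 days ahead; a same-weekday input moves a full week forward.
    days_ahead = (target - start_date.weekday() - 1) % 7 + 1
    return start_date + timedelta(days=days_ahead)


# 2015-01-14 is a Wednesday.
print(next_day(date(2015, 1, 14), "Tue"), next_day(date(2015, 1, 14), "Wednesday"))
```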

Reference: Spark SQL next_day.

`date_format`

Alias of to_char.

`date_part`

Returns the specified part of the date as an integer.

Arguments

part: Part of the date to return. The following date parts are supported:
- year
- quarter (emits value in inclusive range [1, 4] based on which quarter of the year the date is in)
- month
- week (week of the year)
- day (day of the month)
- hour
- minute
- second
- millisecond
- microsecond
- nanosecond
- dow (day of the week where Sunday is 0)
- doy (day of the year)
- epoch (seconds since Unix epoch)
- isodow (day of the week where Monday is 0)
expression: Time expression to operate on. Can be a constant, column, or function.
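
A few of the less obvious parts (quarter, dow, isodow, doy) can be modeled in Python to show how they are numbered. The sketch follows the numbering listed above and covers only a subset of parts; it is not the engine's implementation.

```python
from datetime import datetime

_SIMPLE_PARTS = ("year", "month", "day", "hour", "minute", "second")


def date_part(part, ts):
    """Return one numeric component of ts, using the numbering listed above."""
    part = part.lower()
    if part in _SIMPLE_PARTS:
        return getattr(ts, part)
    if part == "quarter":
        return (ts.month - 1) // 3 + 1       # 1..4
    if part == "dow":
        return (ts.weekday() + 1) % 7        # Sunday is 0
    if part == "isodow":
        return ts.weekday()                  # Monday is 0, as listed above
    if part == "doy":
        return ts.timetuple().tm_yday        # 1-based day of the year
    raise ValueError(f"part {part!r} is not modeled in this sketch")


ts = datetime(2024, 8, 18, 12, 30)  # a Sunday
print({p: date_part(p, ts) for p in ("quarter", "dow", "isodow", "doy")})
```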

Alternative Syntax

Aliases

datepart

`date_trunc`

Truncates a timestamp value to a specified precision.

Arguments

precision: Time precision to truncate to. The following precisions are supported:
- year / YEAR
- quarter / QUARTER
- month / MONTH
- week / WEEK
- day / DAY
- hour / HOUR
- minute / MINUTE
- second / SECOND
- millisecond / MILLISECOND
- microsecond / MICROSECOND
expression: Time expression to operate on. Can be a constant, column, or function.

Aliases

datetrunc

`datepart`

Alias of date_part.

`datetrunc`

Alias of date_trunc.

`from_unixtime`

Converts an integer to a timestamp in RFC3339 format (YYYY-MM-DDT00:00:00.000000000Z). Integers and unsigned integers are interpreted as seconds since the Unix epoch (1970-01-01T00:00:00Z), and the corresponding timestamp is returned.

Arguments

expression: The expression to operate on. Can be a constant, column, or function, and any combination of operators.
timezone: Optional timezone to use when converting the integer to a timestamp. If not provided, the default timezone is UTC.

Example
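
The conversion from epoch seconds to a timestamp can be reproduced with Python's `datetime`. The sketch below models the optional timezone argument with `zoneinfo`, which depends on time zone data being available on the system; it illustrates the arithmetic and is not the engine's implementation.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def from_unixtime(seconds, tz=None):
    """Interpret seconds since the Unix epoch, in UTC unless tz is given."""
    zone = ZoneInfo(tz) if tz else timezone.utc
    return datetime.fromtimestamp(seconds, tz=zone)


print(from_unixtime(1599572549).isoformat())
```

Passing a zone name such as `"America/New_York"` as `tz` yields the same instant expressed with that zone's offset.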

`make_date`

Creates a date from year, month, and day component parts.

Arguments

year: Year to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
month: Month to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
day: Day to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.

Example

`now`

Returns the current UTC timestamp.

The now() return value is determined at query time and will return the same timestamp, no matter when in the query plan the function executes.

Aliases

current_timestamp

`to_char`

Returns a string representation of a date, time, timestamp, or duration based on a Chrono format. Unlike the PostgreSQL equivalent of this function, numerical formatting is not supported.

Arguments

expression: Expression to operate on. Can be a constant, column, or function that results in a date, time, timestamp, or duration.
format: A Chrono format string to use to convert the expression.

Example

Aliases

date_format

`to_date`

Converts a value to a date (YYYY-MM-DD). Supports strings, integer and double types as input. Strings are parsed as YYYY-MM-DD (e.g. '2023-07-20') if no Chrono formats are provided. Integers and doubles are interpreted as days since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding date.

Note: to_date returns Date32, which represents its values as the number of days since the unix epoch (1970-01-01), stored as a signed 32-bit value. The largest supported date value is 9999-12-31.

Arguments

expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example
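
A minimal sketch of the three input styles described above. The literals are illustrative.

```sql
-- Default parsing: the string is read as YYYY-MM-DD.
SELECT to_date('2023-01-31');

-- Formats are tried left to right; the second one matches this input.
SELECT to_date('2023/01/31', '%Y-%m-%d', '%Y/%m/%d');

-- Integer input: 19388 days after 1970-01-01, which is 2023-01-31.
SELECT to_date(19388);
```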

`to_local_time`

Converts a timestamp with a timezone to a timestamp without a timezone (with no offset or timezone information). This function handles daylight saving time changes.

Arguments

expression: Time expression to operate on. Can be a constant, column, or function.

Example

`to_timestamp`

Converts a value to a timestamp (YYYY-MM-DDT00:00:00Z). Supports strings, integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers, unsigned integers, and doubles are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.

Note: to_timestamp returns Timestamp(Nanosecond). The supported range for integer input is between -9223372037 and 9223372036. The supported range for string input is between 1677-09-21T00:12:44.0 and 2262-04-11T23:47:16.0. Use to_timestamp_seconds for input outside these bounds.

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example
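
An illustrative sketch covering string, integer, and format-driven input. The literals and format strings are arbitrary examples, not values taken from the Spice sources.

```sql
-- RFC3339 string with an explicit offset.
SELECT to_timestamp('2023-01-31T09:26:56.123456789-05:00');

-- Integer input is read as seconds since the unix epoch: 2023-07-20T05:44:00Z.
SELECT to_timestamp(1689831840);

-- Chrono formats are tried in order until one parses the string.
SELECT to_timestamp('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
```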

`to_timestamp_micros`

Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers and unsigned integers are interpreted as microseconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

`to_timestamp_millis`

Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers and unsigned integers are interpreted as milliseconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

`to_timestamp_nanos`

Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers and unsigned integers are interpreted as nanoseconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

`to_timestamp_seconds`

Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers and unsigned integers are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

`to_unixtime`

Converts a value to seconds since the unix epoch (1970-01-01T00:00:00Z). Supports strings, dates, timestamps and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided.

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example
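
A minimal illustrative query using an RFC3339 string. The input is an arbitrary example.

```sql
-- Noon UTC on 2020-09-08 is 1599566400 seconds after the unix epoch.
SELECT to_unixtime('2020-09-08T12:00:00+00:00');  -- 1599566400
```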

`today`

Alias of current_date.

Array Functions

Array functions in Spice.ai SQL help construct, transform, and query array data types. These functions operate on array expressions, which can be constants, columns, or results of other functions. The implementation closely follows the PostgreSQL dialect. The following array functions are supported:

`array`

Constructs an array from the provided expressions using Spark-compatible semantics. Inputs are evaluated left to right, cast to a common element type, and collected into a single Arrow list value without removing duplicates or nulls.

Arguments

expression: Value to include in the array. Expressions must be implicitly castable to a shared element type.
expression_n: Additional expressions to append to the array.

Example

Reference: Spark SQL array.

`array_any_value`

Returns the first non-null element in the array. If all elements are null, returns null.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

Aliases

list_any_value

`array_append`

Appends an element to the end of an array and returns the new array.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to append to the array.

Example
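
A short sketch using an array literal. The values are illustrative.

```sql
SELECT array_append([1, 2, 3], 4);  -- [1, 2, 3, 4]
```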

Aliases

list_append
array_push_back
list_push_back

`array_cat`

Alias of array_concat.

`array_concat`

Concatenates two or more arrays into a single array.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.
array_n: Additional array expressions to concatenate.

Example

Aliases

array_cat
list_concat
list_cat

`array_contains`

Returns true if the array contains the specified element.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to search for in the array.

Example

Note: For array-to-array containment operations, use the @> operator.

`array_dims`

Returns an array of the array's dimensions. For a 2D array, returns the number of rows and columns.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

Aliases

list_dims

`array_distance`

Returns the Euclidean distance between two input arrays of equal length.

Arguments

array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
array2: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

Aliases

list_distance

`array_distinct`

Returns a new array with duplicate elements removed, preserving the order of first occurrence.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

Aliases

list_distinct

`array_element`

Extracts the element at the specified index from the array. Indexing is 1-based.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.
index: Index to extract the element from the array (1-based).

Example
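
A short sketch of 1-based indexing, with illustrative values:

```sql
-- Index 3 selects the third element.
SELECT array_element([10, 20, 30, 40], 3);  -- 30
```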

Aliases

array_extract
list_element
list_extract

`array_except`

Returns an array containing elements in array1 that are not in array2, preserving first-occurrence order and without duplicates.

Alias: list_except.

`array_has`

Returns true if the array contains the specified element.

Aliases: array_contains, list_has.

`array_has_all`

Returns true if every element of sub_array is present in array.

Alias: list_has_all.

`array_has_any`

Returns true if array and sub_array share at least one element.

Aliases: list_has_any, arrays_overlap.

`array_intersect`

Returns an array of elements present in both input arrays, deduplicated.

Alias: list_intersect.

`array_length`

Returns the length of the array at the given (optional) dimension. Dimension defaults to 1.

Alias: list_length.

`array_max`

Returns the maximum element of the array, ignoring NULLs.

Alias: list_max.

`array_min`

Returns the minimum element of the array, ignoring NULLs.

Alias: list_min.

`array_ndims`

Returns the number of dimensions of the array.

Alias: list_ndims.

`array_pop_back`

Returns the array with the last element removed.

Alias: list_pop_back.

`array_pop_front`

Returns the array with the first element removed.

Alias: list_pop_front.

`array_position`

Returns the 1-based position of the first occurrence of element in array, or NULL if not found. An optional from_index starts the search at a later position.

Aliases: list_position, array_indexof, list_indexof.

`array_positions`

Returns a 1-based array of all positions where element occurs in array.

Alias: list_positions.

`array_prepend`

Prepends an element to the beginning of an array.

Aliases: list_prepend, array_push_front, list_push_front.

`array_remove`

Returns the array with the first occurrence of element removed.

Alias: list_remove.

`array_remove_n`

Returns the array with the first max occurrences of element removed, where max is the maximum number of occurrences to remove.

Alias: list_remove_n.

`array_remove_all`

Returns the array with all occurrences of element removed.

Alias: list_remove_all.

`array_repeat`

Returns an array containing element repeated count times.

Alias: list_repeat.

`array_replace`

Replaces the first occurrence of from with to in array.

Alias: list_replace.

`array_replace_n`

Replaces the first max occurrences of from with to in array, where max is the maximum number of replacements to make.

Alias: list_replace_n.

`array_replace_all`

Replaces every occurrence of from with to in array.

Alias: list_replace_all.

`array_resize`

Resizes array to the given length, padding with value (or NULL if omitted) when growing.

Alias: list_resize.

`array_reverse`

Returns the array with elements in reverse order.

Alias: list_reverse.

`array_slice`

Returns a slice of the array from begin to end (1-based, inclusive). Negative indices count from the end.

Alias: list_slice.
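
As a sketch of the 1-based, inclusive bounds, assuming the argument order (array, begin, end) implied by the description:

```sql
-- Elements at positions 3 through 6, both ends included.
SELECT array_slice([1, 2, 3, 4, 5, 6, 7, 8], 3, 6);  -- [3, 4, 5, 6]
```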

`array_sort`

Returns array sorted in ascending order (default). Optional arguments control sort direction (ASC/DESC) and null placement (NULLS FIRST/NULLS LAST).

Alias: list_sort.

`array_to_string`

Concatenates array elements into a single string using the given delimiter. Optional null_string replaces NULL elements.

Aliases: list_to_string, array_join, list_join.

`array_union`

Returns the set-union of two arrays, deduplicated.

Alias: list_union.

`arrays_zip`

Merges the given arrays element-wise into an array of structs. Shorter arrays are padded with NULLs.

Alias: list_zip.

`cardinality`

Returns the total number of elements in an array (including nested elements) or the number of entries in a map.

`empty`

Returns true if the array has length 0 (or is NULL).

Aliases: array_empty, list_empty.

`flatten`

Flattens a nested array into a single-level array.

`make_array`

Constructs an array (Arrow list) from the given expressions. SQL [expr1, expr2, ...] literal syntax compiles to this function.

Alias: make_list.

`range`

Generates a numeric or date range as an array, half-open on the upper bound. When the step is omitted, the default is 1.

For dates, step is an interval literal, e.g. interval '1 day'. Use generate_series for the inclusive-upper-bound variant.

`generate_series`

Like range, but the upper bound is inclusive.
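
The difference between the two bounds can be sketched as follows. The argument order (start, stop, step) is an assumption, since the arguments are not listed here, and the values are illustrative.

```sql
SELECT range(1, 5);            -- [1, 2, 3, 4]      (upper bound excluded)
SELECT generate_series(1, 5);  -- [1, 2, 3, 4, 5]   (upper bound included)
SELECT range(0, 10, 3);        -- [0, 3, 6, 9]

-- Date ranges take an interval as the step.
SELECT range(DATE '2024-01-01', DATE '2024-01-04', INTERVAL '1' DAY);
```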

`string_to_array`

Splits a string into an array of substrings using the given delimiter. An optional null_string turns matching substrings into NULLs.

Alias: string_to_list.

Struct Functions

Struct functions help construct and access structured data types (Arrow structs). These are useful for working with nested or composite data.

`struct`

Constructs an anonymous Arrow struct from the given values. Field names default to c0, c1, ... in the order provided.

Example

`named_struct`

Constructs an Arrow struct from alternating field-name / field-value pairs.

Example
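
A minimal sketch with hypothetical field names (id, name), followed by a lookup through get_field. The argument order get_field(struct, field_name) is an assumption.

```sql
-- Field names and values alternate: 'id' -> 1, 'name' -> 'spice'.
SELECT named_struct('id', 1, 'name', 'spice') AS s;

-- Read one field back by name.
SELECT get_field(named_struct('id', 1, 'name', 'spice'), 'name');  -- spice
```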

`get_field`

Extracts a field by name from a struct or map. The struct.field and struct['field'] shorthand forms invoke this function.

Map Functions

Map functions help construct and query key-value data structures. These are useful for semi-structured or JSON-like data.

`map`

Constructs an Arrow map from alternating key/value arguments, or from two arrays (one of keys, one of values).

Example
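
A sketch of the two construction forms described above, using illustrative keys and values:

```sql
-- Two arrays: one of keys, one of values.
SELECT map(['POST', 'HEAD'], [41, 33]) AS m;

-- Alternating key/value arguments.
SELECT map('POST', 41, 'HEAD', 33) AS m;
```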

`map_keys`

Returns the keys of a map as an array.

`map_values`

Returns the values of a map as an array.

`map_entries`

Returns the entries of a map as an array of structs [{key, value}, ...].

`map_extract`

Looks up a key in a map and returns the associated value, or NULL if the key is absent.

Alias: element_at.

Hashing Functions

Hashing functions compute cryptographic hashes and checksums for data integrity, fingerprinting, and security applications. Binary digest output is returned as a Binary (bytes) array; use encode(..., 'hex') to render as hex.

`digest`

Computes the digest of the input using the named hash algorithm. Supported algorithms: 'md5', 'sha224', 'sha256', 'sha384', 'sha512', 'blake2s', 'blake2b', 'blake3'.

Example
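
A minimal illustrative query. The input string is arbitrary, and the argument order (input, algorithm) is an assumption, since the arguments are not listed here.

```sql
-- Returns the raw binary digest.
SELECT digest('foo', 'sha256');

-- Wrap in encode(..., 'hex') to render the digest as a hex string.
SELECT encode(digest('foo', 'sha256'), 'hex');
```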

`md5`

Computes the MD5 128-bit hash of a string and returns the result as a lowercase hex string.

`sha224`

Computes the SHA-224 hash and returns a binary digest.

`sha256`

Computes the SHA-256 hash and returns a binary digest.

`sha384`

Computes the SHA-384 hash and returns a binary digest.

`sha512`

Computes the SHA-512 hash and returns a binary digest.

Encoding Functions

Binary encoding utilities for converting between binary data and text representations.

`encode`

Encodes a string or binary value using the specified encoding. Supported encodings: 'hex', 'base64'.

Example
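
A short sketch of a hex round trip, using an arbitrary input string:

```sql
-- Each byte of 'spice' becomes two hex digits.
SELECT encode('spice', 'hex');  -- 7370696365

-- decode reverses the encoding and returns binary.
SELECT decode('7370696365', 'hex');
```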

`decode`

Decodes text back to binary using the specified encoding. Supported encodings: 'hex', 'base64'.

Union Functions

Union functions help work with union (variant) data types.

`union_extract`

Extracts the value of a named member from a union, returning NULL if the union's active member doesn't match.

`union_tag`

Returns the name of the active member of a union value as a string.

Metadata Functions

PostgreSQL-compatible functions for reading table and column comments from registered datasets. Comments originate from the dataset's source (for connectors that surface COMMENT ON TABLE / COMMENT ON COLUMN metadata, such as PostgreSQL, MySQL, Snowflake, and Databricks) or from description metadata attached to the dataset schema.

`obj_description`

Returns the comment attached to a registered table, or NULL if the table has no comment.

Arguments

table_identifier: Either a string holding a possibly-qualified table name ('table', 'schema.table', or 'catalog.schema.table') or an integer table OID. Unqualified names are resolved against the session's default catalog and schema.
catalog_name: When supplied as the second argument with 'pg_class', the call is treated as PostgreSQL-style obj_description(oid, 'pg_class'); any other value returns NULL.
schema_name, table_name: Explicit schema and table parts. The three-argument form additionally takes the catalog as the first argument.

Example

`col_description`

Returns the comment attached to a column on a registered table, or NULL if no comment exists.

Arguments

table_identifier: Possibly-qualified table name (string) or table OID (integer), resolved the same way as in obj_description.
column: Either a column name (string) or a 1-based ordinal position (integer).
catalog_name, schema_name, table_name: Explicit catalog, schema, and table parts when the four-argument form is used.

Example

Other Functions

Additional scalar functions include type casting, type inspection, and version reporting.

`arrow_cast`

Casts an expression to a specific Arrow data type. Use this function when you need precise control over the target Arrow type, such as specifying timestamp precision.

Arguments

expression: The value to cast.
arrow_type: A string specifying the target Arrow type (e.g., 'Int32', 'Utf8', 'Timestamp(Second, None)').

Example
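
An illustrative sketch that casts a literal and then inspects a cast result with arrow_typeof. The type strings follow the examples in the arguments above.

```sql
-- Cast an integer literal to a 32-bit integer.
SELECT arrow_cast(42, 'Int32');

-- Reduce a timestamp to second precision and report the resulting Arrow type.
SELECT arrow_typeof(arrow_cast(now(), 'Timestamp(Second, None)'));
```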

See Data Types Reference for supported Arrow types.

`arrow_try_cast`

Like arrow_cast but returns NULL instead of erroring when the cast fails.

`arrow_typeof`

Returns the Arrow data type of the given expression as a string.

Arguments

expression: Any SQL expression.

Example

`arrow_metadata`

Returns the Arrow schema metadata associated with an expression as a map of key/value strings. Useful for inspecting field-level metadata (units, comments, logical type hints) attached during ingest.

`version`

Returns the underlying DataFusion runtime version string.

`ai` and `embed`

See AI Functions for ai() (LLM text generation) and embed() (vector embedding generation).

`bucket`

Assigns a deterministic bucket identifier for a value by hashing the input and projecting it into a fixed number of buckets. Helpful for partition_by expressions and for co-locating related rows during acceleration refreshes.

Arguments

num_buckets: Positive integer literal indicating how many buckets to distribute values across. Must be in the range [1, 1_000_000]. The literal's integer type (Int8 … Int64, UInt8 … UInt64) determines the return type.
value: Expression to hash. Accepts strings, numbers, and other scalar types supported by the query engine.

Return Type

Returns an integer in the range [0, num_buckets - 1], matching the integer type of num_buckets. The same input value always maps to the same bucket for a given num_buckets (the hash uses a fixed seed, so buckets are stable across processes and runtime restarts).

Example
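
A minimal sketch using a hypothetical key value ('user-123') and 16 buckets. The exact bucket number depends on the hash, so none is shown.

```sql
-- Returns an integer in [0, 15].
SELECT bucket(16, 'user-123') AS bucket_id;

-- The same input always maps to the same bucket.
SELECT bucket(16, 'user-123') = bucket(16, 'user-123') AS is_stable;  -- true
```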

In spicepod.yaml, use the function directly inside partition_by to build file-based accelerations:

`truncate`

Iceberg-style truncate transform. For numeric values, rounds down to the nearest multiple of width. For strings and binary, returns the first width characters/bytes. Useful for partitioning by wide numeric ranges or string prefixes.

Arguments

width: Positive Int64 literal that defines the bucket size or, for strings/binary, the number of leading units to retain. Maximum: i64::MAX / 2.
value: Expression to truncate. Accepts:
- Signed integers: Int8, Int16, Int32, Int64
- Unsigned integers: UInt8, UInt16, UInt32, UInt64
- Decimals: Decimal128, Decimal256
- Strings: Utf8
- Binary: Binary

Return Type

Returns the same type as value. For numbers, the result is the largest multiple of width that is less than or equal to value. For strings/binary, the result is the first width characters/bytes.

Example
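
A sketch of the numeric and string behavior described above, with illustrative values:

```sql
-- Largest multiple of 10 that is less than or equal to the value.
SELECT truncate(10, 47);  -- 40
SELECT truncate(10, -3);  -- -10

-- For strings, keep the first 3 characters.
SELECT truncate(3, 'spiceai');  -- spi
```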

Spice.ai aims for compatibility with PostgreSQL, but some functions or behaviors may differ depending on the underlying engine version.

Math Functions

`abs`

Returns the absolute value of a numeric expression. If the input is negative, the result is its positive equivalent; if the input is positive or zero, the result is unchanged.

Arguments

numeric_expression: Numeric value to evaluate. Accepts constants, columns, or expressions.

Example

`acos`

Returns the arc cosine (inverse cosine) of a numeric expression. The input must be in the range [-1, 1]. The result is in radians.

Arguments

numeric_expression: Value between -1 and 1.

`acosh`

Returns the inverse hyperbolic cosine of a numeric expression. The input must be greater than or equal to 1.

Arguments

numeric_expression: Value greater than or equal to 1.

`asin`

Returns the arc sine (inverse sine) of a numeric expression. The input must be in the range [-1, 1]. The result is in radians.

Arguments

numeric_expression: Value between -1 and 1.

`asinh`

Returns the inverse hyperbolic sine of a numeric expression.

Arguments

numeric_expression: Numeric value.

`atan`

Returns the arc tangent (inverse tangent) of a numeric expression. The result is in radians.

Arguments

numeric_expression: Numeric value.

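`atan2`

Returns the arc tangent (inverse tangent) of expression_y / expression_x. The result is in radians.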

Arguments

expression_y: Numerator value.
expression_x: Denominator value.

`atanh`

Returns the inverse hyperbolic tangent of a numeric expression. The input must be in the range (-1, 1).

Arguments

numeric_expression: Value between -1 and 1 (exclusive).

`cbrt`

Returns the cube root of a numeric expression.

Arguments

numeric_expression: Numeric value.

`ceil`

Returns the smallest integer greater than or equal to the input value.

Arguments

numeric_expression: Numeric value.

`cos`

Returns the cosine of a numeric expression, where the input is in radians.

Arguments

numeric_expression: Value in radians.

`cosh`

Returns the hyperbolic cosine of a numeric expression.

Arguments

numeric_expression: Numeric value.

`cot`

Returns the cotangent of a numeric expression, where the input is in radians.

Arguments

numeric_expression: Value in radians.

`degrees`

Converts radians to degrees.

Arguments

numeric_expression: Value in radians.

`exp`

Returns the value of e (Euler's number) raised to the power of the input value.

Arguments

numeric_expression: Exponent value.

`factorial`

Returns the factorial of a non-negative integer. For values less than 2, returns 1.

Arguments

numeric_expression: Non-negative integer value.

`floor`

Returns the largest integer less than or equal to the input value.

Arguments

numeric_expression: Numeric value.

`gcd`

Returns the greatest common divisor of two integer expressions. If both inputs are zero, returns 0.

Arguments

expression_x: First integer value.
expression_y: Second integer value.

`isnan`

Returns true if the input is NaN (not a number), otherwise returns false.

Arguments

numeric_expression: Numeric value.

`iszero`

Returns true if the input is +0.0 or -0.0, otherwise returns false.

Arguments

numeric_expression: Numeric value.

`lcm`

Returns the least common multiple of two integer expressions. If either input is zero, returns 0.

Arguments

expression_x: First integer value.
expression_y: Second integer value.

`mod`

Returns the remainder after dividing the first argument by the second, matching the Spark SQL %/mod semantics for signed values.

Arguments

dividend: Numeric expression to divide.
divisor: Numeric expression that cannot be zero.

Example

Reference: Spark SQL mod.

`pmod`

Returns a positive remainder for integer or floating-point division. When the standard remainder is negative, the divisor is added to produce a non-negative result, mirroring Spark SQL behavior.

Arguments

dividend: Numeric expression to divide.
divisor: Numeric expression that cannot be zero.

Example
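
The contrast with mod can be sketched with a negative dividend. The values are illustrative.

```sql
SELECT mod(-7, 3);   -- -1  (remainder keeps the sign of the dividend)
SELECT pmod(-7, 3);  -- 2   (-1 + 3)
SELECT pmod(7, 3);   -- 1   (already non-negative, unchanged)
```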

Reference: Spark SQL pmod.

`ln`

Returns the natural logarithm (base e) of a numeric expression.

Arguments

numeric_expression: Positive numeric value.

`log`

Returns the logarithm of a numeric expression. If a base is provided, returns the logarithm to that base; otherwise, returns the base-10 logarithm.

Arguments

base: Base of the logarithm (optional).
numeric_expression: Positive numeric value.

`log10`

Returns the base-10 logarithm of a numeric expression.

Arguments

numeric_expression: Positive numeric value.

`log2`

Returns the base-2 logarithm of a numeric expression.

Arguments

numeric_expression: Positive numeric value.

`nanvl`

Returns the first argument if it is not NaN; otherwise, returns the second argument.

Arguments

expression_x: Value to return if not NaN.
expression_y: Value to return if the first argument is NaN.

`pi`

Returns an approximate value of π (pi).

`pow` and `power`

Returns the value of the first argument raised to the power of the second argument. pow is an alias for power.

Arguments

base: Numeric value to raise.
exponent: Power to raise the base to.

`radians`

Converts degrees to radians.

Arguments

numeric_expression: Value in degrees.

`random`

Returns a random floating-point value in the range [0, 1). The random seed is unique for each row.

`round`

Rounds a numeric expression to the nearest integer or to a specified number of decimal places.

Arguments

numeric_expression: Value to round.
decimal_places: Optional. Number of decimal places to round to. Defaults to 0.

`rint`

Rounds a double-precision value to the nearest integer using IEEE-754 "round to nearest, ties to even" rules and returns the rounded value as a floating-point number, matching Spark SQL semantics.

Arguments

numeric_expression: Floating-point expression to round. Integers are implicitly cast to double.

Example
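
A short sketch of ties-to-even rounding, with illustrative inputs:

```sql
-- Halfway cases round to the nearest even integer; other values round to nearest.
SELECT rint(2.5), rint(3.5), rint(-2.5), rint(2.4);  -- 2.0, 4.0, -2.0, 2.0
```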

Reference: Spark SQL rint.

`signum`

Returns the sign of a numeric expression. Returns -1 for negative numbers, 1 for zero and positive numbers.

Arguments

numeric_expression: Numeric value.

`sin`

Returns the sine of a numeric expression, where the input is in radians.

Arguments

numeric_expression: Value in radians.

`sinh`

Returns the hyperbolic sine of a numeric expression.

Arguments

numeric_expression: Numeric value.

`sqrt`

Returns the square root of a numeric expression.

Arguments

numeric_expression: Non-negative numeric value.

`tan`

Returns the tangent of a numeric expression, where the input is in radians.

Arguments

numeric_expression: Value in radians.

`tanh`

Returns the hyperbolic tangent of a numeric expression.

Arguments

numeric_expression: Numeric value.

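`trunc`

Truncates a numeric expression to a whole number, or to the specified number of decimal places, by discarding the remaining digits.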

Arguments

numeric_expression: Value to truncate.
decimal_places: Optional. Number of decimal places to truncate to. Defaults to 0.

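`width_bucket`

Returns the number of the bucket that value falls into in an equal-width histogram that divides the range from min_value to max_value into num_bucket buckets.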

Arguments

value: Numeric or interval expression to bin.
min_value: Lower bound of the histogram range.
max_value: Upper bound of the histogram range.
num_bucket: Positive integer specifying the number of buckets.

Example
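
A minimal illustrative query with arbitrary bounds. Each of the 5 buckets spans (10.6 - 0.2) / 5 = 2.08, so 5.3 lands in the third bucket.

```sql
SELECT width_bucket(5.3, 0.2, 10.6, 5);  -- 3
```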

Reference: Spark SQL width_bucket.

Conditional Functions

Conditional functions help handle null values, select among alternatives, and compare multiple expressions. These are useful for data cleaning and conditional logic in queries.

`CASE`

Standard SQL CASE expression, supported in both simple and searched forms.

Example
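
A short sketch of both forms, using literal values for illustration:

```sql
SELECT
  -- Searched form: each WHEN holds its own boolean condition.
  CASE WHEN 5 > 3 THEN 'greater' ELSE 'not greater' END AS searched_form,  -- greater
  -- Simple form: one operand is compared against each WHEN value.
  CASE 2 WHEN 1 THEN 'one' WHEN 2 THEN 'two' ELSE 'other' END AS simple_form;  -- two
```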

`coalesce`

Returns the first non-null value from its arguments. Returns NULL only if every argument is NULL.

Arguments

expression1, ..., expression_n: Expressions to evaluate in order. All arguments must share a common type.

Example
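
A minimal illustrative query:

```sql
-- The first two arguments are NULL, so the third is returned.
SELECT coalesce(NULL, NULL, 'fallback');  -- fallback
```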

`greatest`

Returns the largest value among the arguments, ignoring NULLs. Returns NULL only if every argument is NULL.

Arguments

expression1, ..., expression_n: Expressions to compare. Must share a common type.

Example

`if`

Evaluates a boolean condition and returns one of two expressions, matching the Spark SQL if function semantics.

Arguments

condition: Boolean expression determining which branch to take.
true_value: Expression to return when the condition evaluates to true.
false_value: Expression to return when the condition evaluates to false or NULL.

Example

Reference: Spark SQL if.

`least`

Returns the smallest value among the arguments, ignoring NULLs. Returns NULL only if every argument is NULL.

Arguments

expression1, ..., expression_n: Expressions to compare. Must share a common type.

Example

`nullif`

Returns NULL if expression1 equals expression2; otherwise returns expression1. Useful for converting sentinel values to NULL.

Arguments

expression1: Value to return if it differs from expression2.
expression2: Value to compare against.

Example
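
A short sketch that treats a hypothetical sentinel string ('N/A') as missing data:

```sql
SELECT nullif('N/A', 'N/A');    -- NULL
SELECT nullif('spice', 'N/A');  -- spice
```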

`nvl`

Returns expression2 when expression1 is NULL; otherwise returns expression1. Equivalent to coalesce(expression1, expression2).

Alias: ifnull.

`nvl2`

Returns expression2 if expression1 is not NULL; otherwise returns expression3.

String Functions

`ascii`

Returns the Unicode code point of the first character in a string. If the string is empty, returns 0.

Arguments

str: String expression. Accepts constants, columns, or expressions.

Example

Related function: chr

`bit_length`

Returns the number of bits in the string. Each character is counted according to its byte representation (8 bits per byte).

Arguments

str: String expression.

Example

Related functions: length, octet_length

`btrim`

Removes the longest string containing only characters in trim_str from the start and end of str. If trim_str is omitted, whitespace is removed.

Arguments

str: String expression.
trim_str: Optional string of characters to trim. Defaults to whitespace.

Example
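
A minimal illustrative query, shown in the function form and in the alternative trim syntax:

```sql
SELECT btrim('__spice__', '_');          -- spice
SELECT trim(BOTH '_' FROM '__spice__');  -- spice
```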

Alternative syntax: trim(BOTH trim_str FROM str) or trim(trim_str FROM str)

Aliases: trim

Related functions: ltrim, rtrim

`char_length`

Alias of character_length.

`character_length`

Returns the number of characters in a string, not bytes. Handles Unicode correctly.

Arguments

str: String expression.

Example

Aliases: length, char_length

Related functions: bit_length, octet_length

`chr`

Returns the character with the specified Unicode code point.

Arguments

expression: Integer code point.

Example

Related function: ascii

`concat`

Concatenates two or more strings into a single string.

Arguments

str: String expression.
str_n: Additional string expressions.

Example

Related function: concat_ws

`concat_ws`

Concatenates strings using a separator between each value.

Arguments

separator: String separator.
str: String expression.
str_n: Additional string expressions.

Example

Related function: concat

`contains`

Returns true if search_str is found within str. The search is case-sensitive.

Arguments

str: String expression.
search_str: Substring to search for.

Example

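`like`

Returns true if str matches pattern. In the pattern, % matches any sequence of zero or more characters and _ matches any single character. Matching is case-sensitive.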

Arguments

str: String expression to compare.
pattern: Pattern containing literal text plus % and _ wildcards.

Example

Reference: Spark SQL like.

`ilike`

Case-insensitive variant of like: ASCII characters in str and pattern are compared without regard to case. The optional SQL ESCAPE clause may be used to treat % or _ literally.

Arguments

str: String expression to compare.
pattern: Case-insensitive pattern containing literal text plus % and _ wildcards.

Example

Reference: Spark SQL ilike.

`ends_with`

Returns true if str ends with the substring substr.

Arguments

str: String expression.
substr: Substring to test for.

Example

`find_in_set`

Returns the position (1-based) of str in the comma-separated list strlist. Returns 0 if not found.

Arguments

str: String to find.
strlist: Comma-separated list of substrings.

Example
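
A short sketch with illustrative literals, showing both a hit and a miss:

```sql
SELECT find_in_set('b', 'a,b,c,d');   -- returns 2 ('b' is the second entry)
SELECT find_in_set('z', 'a,b,c,d');   -- returns 0 (not found)
```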

`initcap`

Capitalizes the first character of each word in the string. Words are delimited by non-alphanumeric characters.

Arguments

str: String expression.

Example

Related functions: lower, upper

`instr`

Alias of strpos.

`left`

Returns the first n characters from the left side of the string.

Arguments

str: String expression.
n: Number of characters to return.

Example

Related function: right

`length`

Alias of character_length.

`levenshtein`

Returns the Levenshtein distance between two strings.

Arguments

str1: First string.
str2: Second string.

Example
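
The distance is the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into the other. A sketch using the textbook pair:

```sql
-- kitten -> sitten -> sittin -> sitting: three edits.
SELECT levenshtein('kitten', 'sitting');   -- returns 3
```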

`lower`

Converts all characters in the string to lower case.

Arguments

str: String expression.

Example

Related functions: initcap, upper

`luhn_check`

Returns true if the digit string str passes the Luhn checksum algorithm, and false otherwise.

Arguments

str: String expression containing digits.

Example
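
A sketch using a well-known Luhn test number; the second call changes only the final check digit, so it should fail validation:

```sql
SELECT luhn_check('79927398713');   -- returns true
SELECT luhn_check('79927398714');   -- returns false
```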

Reference: Spark SQL luhn_check.

`lpad`

Pads the left side of the string with another string until the result reaches the specified length. If the padding string is omitted, a space is used.

Arguments

str: String expression.
n: Target length.
padding_str: Optional string to pad with.

Example
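
Two illustrative calls, one with a multi-character padding string and one that zero-pads a number held as text:

```sql
SELECT lpad('Dolly', 10, 'hello');   -- returns 'helloDolly'
SELECT lpad('42', 5, '0');           -- returns '00042'
```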

Related function: rpad

`ltrim`

Removes the longest string containing only characters in trim_str from the start of str. If trim_str is omitted, whitespace is removed.

Arguments

str: String expression.
trim_str: Optional string of characters to trim. Defaults to whitespace.

Example

Alternative syntax: trim(LEADING trim_str FROM str)

Related functions: btrim, rtrim

`octet_length`

Returns the number of bytes in the string.

Arguments

str: String expression.

Example

Related functions: bit_length, length

`overlay`

Replaces a substring of str with substr, starting at position pos for count characters. If count is omitted, uses the length of substr.

Arguments

str: String expression.
substr: Replacement string.
pos: Start position (1-based).
count: Optional number of characters to replace.

Example
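
A sketch of the SQL-standard keyword form, with illustrative literals. Four characters starting at position 2 are replaced by the three-character replacement:

```sql
SELECT overlay('Txxxxas' PLACING 'hom' FROM 2 FOR 4);   -- returns 'Thomas'
```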

`parse_url`

Extracts a component from a URL string.

Arguments

url: URL string expression.
part_to_extract: Case-insensitive token identifying which component to extract.
key: Optional query parameter key to extract from the QUERY part.

Example

Reference: Spark SQL parse_url.

`position`

Alias of strpos.

`repeat`

Returns a string consisting of the input string repeated n times.

Arguments

str: String expression.
n: Number of repetitions.

Example

`replace`

Replaces all occurrences of substr in str with replacement.

Arguments

str: String expression.
substr: Substring to replace.
replacement: Replacement string.

Example

`reverse`

Returns the string with the character order reversed.

Arguments

str: String expression.

Example

`right`

Returns the last n characters from the right side of the string.

Arguments

str: String expression.
n: Number of characters to return.

Example

Related function: left

`rpad`

Pads the right side of the string with another string until the result reaches the specified length. If the padding string is omitted, a space is used.

Arguments

str: String expression.
n: Target length.
padding_str: Optional string to pad with.

Example

Related function: lpad

`rtrim`

Removes the longest string containing only characters in trim_str from the end of str. If trim_str is omitted, whitespace is removed.

Arguments

str: String expression.
trim_str: Optional string of characters to trim. Defaults to whitespace.

Example

Alternative syntax: trim(TRAILING trim_str FROM str)

Related functions: btrim, ltrim

`split_part`

Splits the string on the specified delimiter and returns the substring at the given position (1-based).

Arguments

str: String expression.
delimiter: Delimiter string.
pos: Position of the part to return (1-based).

Example
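
For illustration, with an arbitrary dotted string:

```sql
SELECT split_part('1.2.3.4.5', '.', 3);   -- returns '3'
```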

`starts_with`

Returns true if str starts with the substring substr.

Arguments

str: String expression.
substr: Substring to test for.

Example

`strpos`

Returns the position (1-based) of the first occurrence of substr in str. Returns 0 if not found.

Arguments

str: String expression.
substr: Substring to search for.

Example
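
A brief sketch showing the function form and the equivalent position syntax; the literals are illustrative:

```sql
SELECT strpos('datafusion', 'fus');         -- returns 5
SELECT position('fus' IN 'datafusion');     -- returns 5
SELECT strpos('datafusion', 'xyz');         -- returns 0 (not found)
```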

Alternative syntax: position(substr in str)

Aliases: instr, position

`substr`

Extracts a substring from str, starting at start_pos for length characters. If length is omitted, returns the rest of the string.

Arguments

str: String expression.
start_pos: Start position (1-based).
length: Optional number of characters to extract.

Example

Alternative syntax: substring(str from start_pos for length)

Aliases: substring

`substr_index`

Returns the portion of str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned.

Arguments

str: String expression.
delim: Delimiter string.
count: Number of occurrences (positive or negative).

Example
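
A sketch of how the sign of count selects which side of the string is kept; the host name is only an illustration:

```sql
SELECT substr_index('www.apache.org', '.', 1);    -- returns 'www'
SELECT substr_index('www.apache.org', '.', 2);    -- returns 'www.apache'
SELECT substr_index('www.apache.org', '.', -1);   -- returns 'org'
```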

Aliases: substring_index

`substring`

Alias of substr.

`substring_index`

Alias of substr_index.

`to_hex`

Converts an integer to its hexadecimal string representation.

Arguments

int: Integer expression.

Example

`translate`

Replaces each character in str that matches a character in chars with the corresponding character in translation. If translation is shorter than chars, extra characters are removed.

Arguments

str: String expression.
chars: Characters to translate.
translation: Replacement characters.

Example
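
Characters are mapped positionally: the first character of chars becomes the first character of translation, and so on. An illustrative call:

```sql
-- w -> h, i -> e, c -> r; 't' and 'e' are left unchanged.
SELECT translate('twice', 'wic', 'her');   -- returns 'there'
```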

`trim`

Alias of btrim.

`upper`

Converts all characters in the string to upper case.

Arguments

str: String expression.

Example

Related functions: initcap, lower

`uuid`

Returns a UUID v4 string value that is unique per row.

Example

Binary String Functions

Binary string functions help encode and decode binary data, such as base64 and hexadecimal conversions. These are useful for working with encoded data or binary blobs.

`bit_get`

Returns the bit (0 or 1) at the specified zero-based position when counting from the least-significant bit of an integral or binary expression, matching Spark SQL semantics.

Arguments

value: Integer or binary expression whose bits are inspected.
position: Zero-based index of the bit to return. Must be non-negative.

Example

Reference: Spark SQL bit_get.

`bit_count`

Counts the number of set bits in an integral or binary expression. Useful for quick popcount operations on bitmaps or packed flags, aligned with Spark SQL behavior.

Arguments

value: Integer or binary expression.

Example

Reference: Spark SQL bit_count.

`bitmap_count`

Returns the number of set bits in a binary bitmap produced by functions such as bitmap_construct_agg, mirroring the Spark SQL implementation.

Arguments

bitmap: Binary expression representing a bitmap.

Example

Reference: Spark SQL bitmap_count.

Regular Expression Functions

Regular expression functions help match, extract, and replace patterns in strings. Spice.ai uses a PCRE-like regular expression syntax. Spice supports the following regular expression functions:

regexp_like
regexp_match
regexp_replace
regexp_count
regexp_instr

`regexp_like`

Returns true if a regular expression has at least one match in a string, false otherwise.

Arguments

str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?

Example
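
A sketch of a plain match and of the i flag, using illustrative literals:

```sql
SELECT regexp_like('Köln', '[a-zA-Z]ö[a-zA-Z]{2}');   -- returns true
SELECT regexp_like('aBc', '(b|d)');                   -- returns false (case-sensitive)
SELECT regexp_like('aBc', '(b|d)', 'i');              -- returns true
```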

`regexp_match`

Returns the first match of a regular expression in a string, as an array of the matched substrings.

Arguments

str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
regexp: Regular expression to match against. Can be a constant, column, or function.
flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?

Example

`regexp_replace`

Replaces substrings in a string that match a regular expression.

Arguments

str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
regexp: Regular expression to match against. Can be a constant, column, or function.
replacement: Replacement string expression to operate on. Can be a constant, column, or function, and any combination of operators.
flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- g: (global) Search globally and don’t return after the first match
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?

Example
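
The g flag decides whether only the first match or every match is replaced. A sketch with illustrative literals:

```sql
SELECT regexp_replace('foo bar foo', 'foo', 'baz');        -- returns 'baz bar foo'
SELECT regexp_replace('foo bar foo', 'foo', 'baz', 'g');   -- returns 'baz bar baz'
```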

`regexp_count`

Returns the number of matches that a regular expression has in a string.

Arguments

str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
start: Optional start position (the first position is 1) to search for the regular expression. Can be a constant, column, or function.
flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?

Example
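
A sketch showing how start and flags change the count for the same input; the literals are illustrative:

```sql
SELECT regexp_count('abcAbAbc', 'abc');           -- returns 1 (only the leading 'abc')
SELECT regexp_count('abcAbAbc', 'abc', 1, 'i');   -- returns 2 ('abc' and 'Abc')
SELECT regexp_count('abcAbAbc', 'abc', 2, 'i');   -- returns 1 (search starts at 'b')
```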

`regexp_instr`

Returns the position in a string where the specified occurrence of a POSIX regular expression is located.

Arguments

str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
start: Optional start position (the first position is 1) to search for the regular expression. Can be a constant, column, or function. Defaults to 1.
N: Optional. The N-th occurrence of pattern to find. Defaults to 1 (first match). Can be a constant, column, or function.
flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?
subexpr: Optional. Specifies which capture group (subexpression) to return the position for. Defaults to 0, which returns the position of the entire match.

Example

Time and Date Functions

current_date
current_time
current_timestamp
date_bin
date_format
date_part
date_trunc
datepart
datetrunc
from_unixtime
make_date
now
to_char
to_date
to_local_time
to_timestamp
to_timestamp_micros
to_timestamp_millis
to_timestamp_nanos
to_timestamp_seconds
to_unixtime
today

`current_date`

Returns the current UTC date.

The current_date() return value is determined at query time and will return the same date, no matter when in the query plan the function executes.

Aliases

today

`current_time`

Returns the current UTC time.

The current_time() return value is determined at query time and will return the same time, no matter when in the query plan the function executes.

`current_timestamp`

Alias of now.

`date_bin`

Calculates time intervals and returns the start of the interval that contains the specified timestamp. Use date_bin to downsample time series data by grouping rows into time-based bins and applying an aggregate function to each bin.

Arguments

interval: Bin interval. The following intervals are supported:
- nanoseconds
- microseconds
- milliseconds
- seconds
- minutes
- hours
- days
- weeks
- months
- years
- century
expression: Time expression to operate on. Can be a constant, column, or function.
origin-timestamp: Optional. Starting point used to determine bin boundaries. If not specified, defaults to 1970-01-01T00:00:00Z (the UNIX epoch in UTC).
Example
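
A sketch of 15-minute binning with and without an explicit origin; the timestamps are illustrative:

```sql
-- Default origin (UNIX epoch): bins start at :00, :15, :30, :45.
SELECT date_bin(interval '15 minutes', timestamp '2024-01-01 10:37:00');
-- returns 2024-01-01T10:30:00

-- Custom origin shifts the bin boundaries to :05, :20, :35, :50.
SELECT date_bin(interval '15 minutes',
                timestamp '2024-01-01 10:37:00',
                timestamp '2024-01-01 00:05:00');
-- returns 2024-01-01T10:35:00
```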

`date_add`

Adds a number of days to a DATE or TIMESTAMP expression, matching Spark SQL semantics. Negative offsets move backwards in time.

Arguments

start_date: DATE or TIMESTAMP expression.
num_days: Integer number of days to add.

Example
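
An illustrative sketch, assuming DATE literals as input; a negative offset moves backwards, which matches what date_sub does with a positive one:

```sql
SELECT date_add(date '2016-07-30', 1);    -- returns 2016-07-31
SELECT date_add(date '2016-07-30', -1);   -- returns 2016-07-29
SELECT date_sub(date '2016-07-30', 1);    -- returns 2016-07-29
```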

Reference: Spark SQL date_add.

`date_sub`

Subtracts a number of days from a DATE or TIMESTAMP expression using Spark-compatible behavior.

Arguments

start_date: DATE or TIMESTAMP expression.
num_days: Integer number of days to subtract.

Example

Reference: Spark SQL date_sub.

`last_day`

Returns the last day of the month that contains the input date or timestamp, matching Spark SQL semantics.

Arguments

expression: DATE or TIMESTAMP expression.

Example

Reference: Spark SQL last_day.

`next_day`

Returns the first date after start_date that matches the requested day of week. Valid day names include full names (e.g., Monday) or abbreviations such as Mon, matching Spark SQL behavior.

Arguments

start_date: DATE or TIMESTAMP expression.
day_of_week: String literal naming the target weekday.

Example

Reference: Spark SQL next_day.

`date_format`

Alias of to_char.

`date_part`

Returns the specified part of the date as an integer.

Arguments

part: Part of the date to return. The following date parts are supported:
- year
- quarter (emits value in inclusive range [1, 4] based on which quarter of the year the date is in)
- month
- week (week of the year)
- day (day of the month)
- hour
- minute
- second
- millisecond
- microsecond
- nanosecond
- dow (day of the week where Sunday is 0)
- doy (day of the year)
- epoch (seconds since Unix epoch)
- isodow (day of the week where Monday is 0)
expression: Time expression to operate on. Can be a constant, column, or function.

Alternative Syntax
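
The SQL-standard extract form is equivalent to the function call. A sketch with an illustrative timestamp:

```sql
SELECT date_part('month', timestamp '2024-05-13 10:37:00');     -- returns 5
SELECT extract(month FROM timestamp '2024-05-13 10:37:00');     -- returns 5
```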

Aliases

datepart

`date_trunc`

Truncates a timestamp value to a specified precision.

Arguments

precision: Time precision to truncate to. The following precisions are supported:
- year / YEAR
- quarter / QUARTER
- month / MONTH
- week / WEEK
- day / DAY
- hour / HOUR
- minute / MINUTE
- second / SECOND
- millisecond / MILLISECOND
- microsecond / MICROSECOND
expression: Time expression to operate on. Can be a constant, column, or function.
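
As a sketch with an illustrative timestamp, every field below the chosen precision is reset to its minimum value:

```sql
SELECT date_trunc('hour', timestamp '2024-05-13 10:37:42');    -- returns 2024-05-13T10:00:00
SELECT date_trunc('month', timestamp '2024-05-13 10:37:42');   -- returns 2024-05-01T00:00:00
```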

Aliases

datetrunc

`datepart`

Alias of date_part.

`datetrunc`

Alias of date_trunc.

`from_unixtime`

Converts an integer to a timestamp. The integer is interpreted as seconds since the Unix epoch (1970-01-01T00:00:00Z).

Arguments

expression: The expression to operate on. Can be a constant, column, or function, and any combination of operators.
timezone: Optional timezone to use when converting the integer to a timestamp. If not provided, the default timezone is UTC.

Example

`make_date`

Creates a date from year, month, and day component parts.

Arguments

year: Year to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
month: Month to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
day: Day to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.

Example
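
For illustration, with constant components:

```sql
SELECT make_date(2023, 1, 31);   -- returns 2023-01-31
```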

`now`

Returns the current UTC timestamp.

The now() return value is determined at query time and will return the same timestamp, no matter when in the query plan the function executes.

Aliases

current_timestamp

`to_char`

Returns a string representation of a date, time, timestamp, or duration based on a Chrono format. Unlike the PostgreSQL equivalent of this function, numerical formatting is not supported.

Arguments

expression: Expression to operate on. Can be a constant, column, or function that results in a date, time, timestamp or duration.
format: A Chrono format string to use to convert the expression.

Example
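
A sketch using Chrono strftime-style specifiers (%d, %m, %Y); the date is illustrative:

```sql
SELECT to_char(date '2023-03-01', '%d-%m-%Y');   -- returns '01-03-2023'
```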

Aliases

date_format

`to_date`

Converts a value to a date (YYYY-MM-DD). Strings are parsed as YYYY-MM-DD (e.g. '2023-07-20') if no Chrono formats are provided.

Note: to_date returns Date32, which represents its values as the number of days since the Unix epoch (1970-01-01), stored as a signed 32-bit value. The largest supported date value is 9999-12-31.

Arguments

expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example
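
A sketch of default parsing and of format fallback; in the second call the first format does not match, so the second one is tried. The input strings are illustrative:

```sql
SELECT to_date('2023-01-31');                               -- returns 2023-01-31
SELECT to_date('2023/01/31', '%Y-%m-%d', '%Y/%m/%d');       -- returns 2023-01-31
```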

`to_local_time`

Converts a timestamp with a timezone to a timestamp without a timezone (with no offset or timezone information). This function handles daylight saving time changes.

Arguments

expression: Time expression to operate on. Can be a constant, column, or function.

Example

`to_timestamp`

Converts a value to a timestamp. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers are interpreted as seconds since the Unix epoch (1970-01-01T00:00:00Z).

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

`to_timestamp_micros`

Converts a value to a timestamp with microsecond precision. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers are interpreted as microseconds since the Unix epoch (1970-01-01T00:00:00Z).

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

`to_timestamp_millis`

Converts a value to a timestamp with millisecond precision. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers are interpreted as milliseconds since the Unix epoch (1970-01-01T00:00:00Z).

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

`to_timestamp_nanos`

Converts a value to a timestamp with nanosecond precision. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers are interpreted as nanoseconds since the Unix epoch (1970-01-01T00:00:00Z).

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

`to_timestamp_seconds`

Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers and unsigned integers are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example
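
A sketch of the integer interpretation described above, using small illustrative values:

```sql
SELECT to_timestamp_seconds(0);       -- returns 1970-01-01T00:00:00
SELECT to_timestamp_seconds(86400);   -- returns 1970-01-02T00:00:00 (one day later)
```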

`to_unixtime`

Converts a value to seconds since the Unix epoch (1970-01-01T00:00:00Z). Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided.

Arguments

expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

`today`

Alias of current_date.

Array Functions

`array`

Returns an array containing the given elements.

Arguments

expression: Value to include in the array. Expressions must be implicitly castable to a shared element type.
expression_n: Additional expressions to append to the array.

Example

Reference: Spark SQL array.

`array_any_value`

Returns the first non-null element in the array. If all elements are null, returns null.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

Aliases

list_any_value

`array_append`

Appends an element to the end of an array and returns the new array.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to append to the array.

Example

Aliases

list_append
array_push_back
list_push_back

`array_cat`

Alias of array_concat.

`array_concat`

Concatenates two or more arrays into a single array.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.
array_n: Additional array expressions to concatenate.

Example

Aliases

array_cat
list_concat
list_cat

`array_contains`

Returns true if the array contains the specified element.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.
element: Element to search for in the array.

Example

Note: For array-to-array containment operations, use the @> operator.

`array_dims`

Returns an array of the array's dimensions. For a 2D array, returns the number of rows and columns.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

Aliases

list_dims

`array_distance`

Returns the Euclidean distance between two input arrays of equal length.

Arguments

array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
array2: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

Aliases

list_distance

`array_distinct`

Returns a new array with duplicate elements removed, preserving the order of first occurrence.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

Aliases

list_distinct

`array_element`

Extracts the element at the specified index from the array. Indexing is 1-based.

Arguments

array: Array expression. Can be a constant, column, or function, and any combination of array operators.
index: Index to extract the element from the array (1-based).

Example

Aliases

array_extract
list_element
list_extract

`array_except`

Returns an array containing elements in array1 that are not in array2, preserving first-occurrence order and without duplicates.

Alias: list_except.

`array_has`

Returns true if the array contains the specified element.

Aliases: array_contains, list_has.

`array_has_all`

Returns true if every element of sub_array is present in array.

Alias: list_has_all.

`array_has_any`

Returns true if array and sub_array share at least one element.

Aliases: list_has_any, arrays_overlap.

`array_intersect`

Returns an array of elements present in both input arrays, deduplicated.

Alias: list_intersect.

`array_length`

Returns the length of the array at the given (optional) dimension. Dimension defaults to 1.

Alias: list_length.

`array_max`

Returns the maximum element of the array, ignoring NULLs.

Alias: list_max.

`array_min`

Returns the minimum element of the array, ignoring NULLs.

Alias: list_min.

`array_ndims`

Returns the number of dimensions of the array.

Alias: list_ndims.

`array_pop_back`

Returns the array with the last element removed.

Alias: list_pop_back.

`array_pop_front`

Returns the array with the first element removed.

Alias: list_pop_front.

`array_position`

Returns the 1-based position of the first occurrence of element in array, or NULL if not found. An optional from_index starts the search at a later position.

Aliases: list_position, array_indexof, list_indexof.

`array_positions`

Returns a 1-based array of all positions where element occurs in array.

Alias: list_positions.

`array_prepend`

Prepends an element to the beginning of an array.

Aliases: list_prepend, array_push_front, list_push_front.

`array_remove`

Returns the array with the first occurrence of element removed.

Alias: list_remove.

`array_remove_n`

Returns the array with the first max occurrences of element removed.

Alias: list_remove_n.

`array_remove_all`

Returns the array with all occurrences of element removed.

Alias: list_remove_all.

`array_repeat`

Returns an array containing element repeated count times.

Alias: list_repeat.

`array_replace`

Replaces the first occurrence of from with to in array.

Alias: list_replace.

`array_replace_n`

Replaces the first max occurrences of from with to in array.

Alias: list_replace_n.

`array_replace_all`

Replaces every occurrence of from with to in array.

Alias: list_replace_all.

`array_resize`

Resizes array to the given length, padding with value (or NULL if omitted) when growing.

Alias: list_resize.

`array_reverse`

Returns the array with elements in reverse order.

Alias: list_reverse.

`array_slice`

Returns a slice of the array from begin to end (1-based, inclusive). Negative indices count from the end.

Alias: list_slice.
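
A sketch of the 1-based, inclusive bounds, using make_array to build an illustrative input:

```sql
SELECT array_slice(make_array(1, 2, 3, 4, 5), 2, 4);   -- returns [2, 3, 4]
```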

`array_sort`

Returns array sorted in ascending order (default). Optional arguments control sort direction (ASC/DESC) and null placement (NULLS FIRST/NULLS LAST).

Alias: list_sort.

`array_to_string`

Concatenates array elements into a single string using the given delimiter. Optional null_string replaces NULL elements.

Aliases: list_to_string, array_join, list_join.

`array_union`

Returns the set-union of two arrays, deduplicated.

Alias: list_union.

`arrays_zip`

Merges the given arrays element-wise into an array of structs. Shorter arrays are padded with NULLs.

Alias: list_zip.

`cardinality`

Returns the total number of elements in an array (including nested elements) or the number of entries in a map.

`empty`

Returns true if the array has length 0 (or is NULL).

Aliases: array_empty, list_empty.

`flatten`

Flattens a nested array into a single-level array.

`make_array`

Constructs an array (Arrow list) from the given expressions. SQL [expr1, expr2, ...] literal syntax compiles to this function.

Alias: make_list.

`range`

Generates a numeric or date range as an array, half-open on the upper bound. When the step is omitted, the default is 1.

For dates, step is an interval literal, e.g. interval '1 day'. Use generate_series for the inclusive-upper-bound variant.

`generate_series`

Like range, but the upper bound is inclusive.
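
The difference between the two functions is only the treatment of the upper bound. A sketch with illustrative integer bounds:

```sql
SELECT range(1, 5);             -- returns [1, 2, 3, 4]
SELECT generate_series(1, 5);   -- returns [1, 2, 3, 4, 5]
SELECT range(0, 10, 3);         -- returns [0, 3, 6, 9]
```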

`string_to_array`

Splits a string into an array of substrings using the given delimiter. An optional null_string turns matching substrings into NULLs.

Alias: string_to_list.

Struct Functions

Struct functions help construct and access structured data types (Arrow structs). These are useful for working with nested or composite data.

`struct`

Constructs an anonymous Arrow struct from the given values. Field names default to c0, c1, ... in the order provided.

Example

`named_struct`

Constructs an Arrow struct from alternating field-name / field-value pairs.

Example
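
A sketch with hypothetical field names (id, name) chosen only for illustration; names and values alternate in the argument list:

```sql
SELECT named_struct('id', 1, 'name', 'spice');
-- produces a struct with fields id = 1 and name = 'spice'
```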

`get_field`

Extracts a field by name from a struct or map. struct.field and struct['field'] sugar invoke this function.

Map Functions

Map functions help construct and query key-value data structures. These are useful for semi-structured or JSON-like data.

`map`

Constructs an Arrow map from alternating key/value arguments, or from two arrays (one of keys, one of values).

Example
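
A sketch of both construction forms described above; the keys and values are illustrative:

```sql
-- Single key/value pair.
SELECT map('type', 'test');

-- Two parallel arrays: keys first, then values.
SELECT map(['POST', 'HEAD', 'PATCH'], [41, 33, NULL]);
```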

`map_keys`

Returns the keys of a map as an array.

`map_values`

Returns the values of a map as an array.

`map_entries`

Returns the entries of a map as an array of structs [{key, value}, ...].

`map_extract`

Looks up a key in a map and returns the associated value, or NULL if the key is absent.

Alias: element_at.

Hashing Functions

`digest`

Computes the digest of the input using the named hash algorithm. Supported algorithms: 'md5', 'sha224', 'sha256', 'sha384', 'sha512', 'blake2s', 'blake2b', 'blake3'.

Example

`md5`

Computes the MD5 128-bit hash of a string and returns the result as a lowercase hex string.

`sha224`

Computes the SHA-224 hash and returns a binary digest.

`sha256`

Computes the SHA-256 hash and returns a binary digest.

`sha384`

Computes the SHA-384 hash and returns a binary digest.

`sha512`

Computes the SHA-512 hash and returns a binary digest.

Encoding Functions

Binary encoding utilities for converting between binary data and text representations.

`encode`

Encodes a string or binary value using the specified encoding. Supported encodings: 'hex', 'base64'.

Example
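
A brief sketch of a hex round trip with an illustrative input. decode returns binary rather than text:

```sql
SELECT encode('hello', 'hex');        -- returns '68656c6c6f'
SELECT decode('68656c6c6f', 'hex');   -- returns the original bytes as binary
```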

`decode`

Decodes text back to binary using the specified encoding. Supported encodings: 'hex', 'base64'.

Union Functions

Union functions help work with union (variant) data types.

`union_extract`

Extracts the value of a named member from a union, returning NULL if the union's active member doesn't match.

`union_tag`

Returns the name of the active member of a union value as a string.

Metadata Functions

`obj_description`

Returns the comment attached to a registered table, or NULL if the table has no comment.

Arguments

table_identifier: Either a string holding a possibly-qualified table name ('table', 'schema.table', or 'catalog.schema.table') or an integer table OID. Unqualified names are resolved against the session's default catalog and schema.
catalog_name: When supplied as the second argument with 'pg_class', the call is treated as PostgreSQL-style obj_description(oid, 'pg_class'); any other value returns NULL.
schema_name, table_name: Explicit schema and table parts. The three-argument form additionally takes the catalog as the first argument.

Example

`col_description`

Returns the comment attached to a column on a registered table, or NULL if no comment exists.

Arguments

table_identifier: Possibly-qualified table name (string) or table OID (integer), resolved the same way as in obj_description.
column: Either a column name (string) or a 1-based ordinal position (integer).
catalog_name, schema_name, table_name: Explicit catalog, schema, and table parts when the four-argument form is used.

Example

Other Functions

Additional scalar functions include type casting, type inspection, and version reporting.

`arrow_cast`

Casts an expression to a specific Arrow data type. Use this function when you need precise control over the target Arrow type, such as specifying timestamp precision.

Arguments

expression: The value to cast.
arrow_type: A string specifying the target Arrow type (e.g., 'Int32', 'Utf8', 'Timestamp(Second, None)').

Example
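
A sketch that casts an integer literal to a narrower Arrow type and then checks the result with arrow_typeof; the value is illustrative:

```sql
SELECT arrow_cast(-5, 'Int8');                 -- returns -5 as an Int8
SELECT arrow_typeof(arrow_cast(-5, 'Int8'));   -- returns 'Int8'
```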

See Data Types Reference for supported Arrow types.

`arrow_try_cast`

Like arrow_cast but returns NULL instead of erroring when the cast fails.

`arrow_typeof`

Returns the Arrow data type of the given expression as a string.

Arguments

expression: Any SQL expression.

Example
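
For illustration, with literals whose inferred types are unambiguous:

```sql
SELECT arrow_typeof(1);      -- returns 'Int64'
SELECT arrow_typeof(true);   -- returns 'Boolean'
```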

`arrow_metadata`

Returns the Arrow schema metadata associated with an expression as a map of key/value strings. Useful for inspecting field-level metadata (units, comments, logical type hints) attached during ingest.

`version`

Returns the underlying DataFusion runtime version string.

`ai` and `embed`

See AI Functions for ai() (LLM text generation) and embed() (vector embedding generation).

`bucket`

Arguments

num_buckets: Positive integer literal indicating how many buckets to distribute values across. Must be in the range [1, 1_000_000]. The literal's integer type (Int8 … Int64, UInt8 … UInt64) determines the return type.
value: Expression to hash. Accepts strings, numbers, and other scalar types supported by the query engine.

Return Type

Example

In spicepod.yaml, use the function directly inside partition_by to build file-based accelerations:

`truncate`

Arguments

width: Positive Int64 literal that defines the bucket size or, for strings/binary, the number of leading units to retain. Maximum: i64::MAX / 2.
value: Expression to truncate. Accepts:
- Signed integers: Int8, Int16, Int32, Int64
- Unsigned integers: UInt8, UInt16, UInt32, UInt64
- Decimals: Decimal128, Decimal256
- Strings: Utf8
- Binary: Binary
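
The argument description above can be modeled with a short Python sketch. It is illustrative only: the helper name `truncate` mirrors the SQL function, decimals are left out, and the treatment of negative integers (rounding down to the start of the bucket, in the style of the Iceberg truncate transform) is an assumption that the text above does not confirm.

```python
def truncate(width: int, value):
    """Illustrative model only; negative-number handling is an assumption."""
    if width <= 0:
        raise ValueError("width must be positive")
    if isinstance(value, (str, bytes)):
        # Strings keep the first `width` characters, binary the first `width` bytes.
        return value[:width]
    # Integers snap down to the start of their width-sized bucket.
    # Python's % is a floor modulo, so negative values also round downward.
    return value - (value % width)


print(truncate(10, 37))       # 30
print(truncate(3, "spice"))   # spi
```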

Return Type

Example

Spice.ai aims for compatibility with PostgreSQL, but some functions or behaviors may differ depending on the underlying engine version.

abs(numeric_expression)

> select abs(-5);
+----------------+
| abs(Int64(-5)) |
+----------------+
| 5              |
+----------------+

acos(numeric_expression)

acosh(numeric_expression)

asin(numeric_expression)

asinh(numeric_expression)

atan(numeric_expression)

atan2(expression_y, expression_x)

atanh(numeric_expression)

cbrt(numeric_expression)

ceil(numeric_expression)

cos(numeric_expression)

cosh(numeric_expression)

cot(numeric_expression)

degrees(numeric_expression)

exp(numeric_expression)

factorial(numeric_expression)

floor(numeric_expression)

gcd(expression_x, expression_y)

isnan(numeric_expression)

iszero(numeric_expression)

lcm(expression_x, expression_y)

mod(dividend, divisor)

> select mod(-10, 3);
+----------------------------+
| mod(Int64(-10),Int64(3))   |
+----------------------------+
| -1                         |
+----------------------------+

pmod(dividend, divisor)

> select pmod(-10, 3);
+-----------------------------+
| pmod(Int64(-10),Int64(3))   |
+-----------------------------+
| 2                           |
+-----------------------------+
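
The mod and pmod examples above differ only in how a negative dividend is handled. The following Python sketch models both results; the helper names `trunc_mod` and `pmod` are illustrative and are not Spice's implementation.

```python
def trunc_mod(a: int, b: int) -> int:
    # Remainder takes the sign of the dividend, as in the mod example.
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def pmod(a: int, b: int) -> int:
    # Shift a negative remainder by the divisor, then reduce it again.
    r = trunc_mod(a, b)
    return trunc_mod(r + b, b) if r < 0 else r


print(trunc_mod(-10, 3))  # -1
print(pmod(-10, 3))       # 2
```

For non-negative dividends the two functions agree, so the choice only matters when the dividend can be negative.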

ln(numeric_expression)

log(base, numeric_expression)
log(numeric_expression)

log10(numeric_expression)

log2(numeric_expression)

nanvl(expression_x, expression_y)

power(base, exponent)
pow(base, exponent)

radians(numeric_expression)

random()

round(numeric_expression[, decimal_places])

rint(numeric_expression)

> select rint(12.5);
+-------------------------+
| rint(Float64(12.5))     |
+-------------------------+
| 12.0                    |
+-------------------------+
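
The result 12.0 for rint(12.5) reflects tie-breaking toward the nearest even integer. A minimal Python sketch of that rule, contrasted with rounding ties upward, is shown below; the function names are illustrative only.

```python
import math


def rint(x: float) -> float:
    # Python's built-in round() resolves ties to the nearest even integer.
    return float(round(x))


def round_half_up(x: float) -> float:
    # Contrast: ties always move toward positive infinity.
    return float(math.floor(x + 0.5))


print(rint(12.5), rint(13.5))   # 12.0 14.0
print(round_half_up(12.5))      # 13.0
```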

signum(numeric_expression)

sin(numeric_expression)

sinh(numeric_expression)

sqrt(numeric_expression)

tan(numeric_expression)

tanh(numeric_expression)

trunc(numeric_expression[, decimal_places])

width_bucket(value, min_value, max_value, num_bucket)

> select width_bucket(5.3, 0.2, 10.6, 5);
+----------------------------------------------------------------+
| width_bucket(Float64(5.3),Float64(0.2),Float64(10.6),Int64(5)) |
+----------------------------------------------------------------+
| 3                                                              |
+----------------------------------------------------------------+
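
The bucket number in the example above can be reproduced with a small Python sketch of an equal-width histogram. It covers only the ascending case (min_value < max_value). The underflow bucket 0 and overflow bucket num_bucket + 1 follow the common SQL convention and are assumptions here, not confirmed Spice behavior.

```python
import math


def width_bucket(value: float, min_value: float, max_value: float, num_bucket: int) -> int:
    # Sketch for the ascending case only (min_value < max_value).
    if value < min_value:
        return 0                      # underflow bucket
    if value >= max_value:
        return num_bucket + 1         # overflow bucket
    width = (max_value - min_value) / num_bucket
    return math.floor((value - min_value) / width) + 1


print(width_bucket(5.3, 0.2, 10.6, 5))  # 3
```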

CASE expression
  WHEN value1 THEN result1
  [WHEN value2 THEN result2 ...]
  [ELSE default_result]
END

CASE
  WHEN condition1 THEN result1
  [WHEN condition2 THEN result2 ...]
  [ELSE default_result]
END

> SELECT CASE WHEN score >= 90 THEN 'A' WHEN score >= 80 THEN 'B' ELSE 'C' END AS grade
  FROM (VALUES (95), (82), (70)) AS t(score);
+-------+
| grade |
+-------+
| A     |
| B     |
| C     |
+-------+

coalesce(expression1[, ..., expression_n])

> SELECT coalesce(NULL, NULL, 'spice', 'datafusion');
+------------------------------------------------------+
| coalesce(NULL,NULL,Utf8("spice"),Utf8("datafusion")) |
+------------------------------------------------------+
| spice                                                |
+------------------------------------------------------+

greatest(expression1[, ..., expression_n])

> SELECT greatest(3, 7, NULL, 4);
+------------------------------------------------+
| greatest(Int64(3),Int64(7),NULL,Int64(4))      |
+------------------------------------------------+
| 7                                              |
+------------------------------------------------+

if(condition, true_value, false_value)

> select if(temperature > 70, 'warm', 'cool') as label from (values (65), (72)) as t(temperature);
+------------+
| label      |
+------------+
| cool       |
| warm       |
+------------+

least(expression1[, ..., expression_n])

> SELECT least(3, 7, NULL, 4);
+--------------------------------------------+
| least(Int64(3),Int64(7),NULL,Int64(4))     |
+--------------------------------------------+
| 3                                          |
+--------------------------------------------+

nullif(expression1, expression2)

> SELECT nullif('unknown', 'unknown');
+-------------------------------------------+
| nullif(Utf8("unknown"),Utf8("unknown"))   |
+-------------------------------------------+
|                                           |
+-------------------------------------------+
> SELECT nullif('spice', 'unknown');
+-------------------------------------------+
| nullif(Utf8("spice"),Utf8("unknown"))     |
+-------------------------------------------+
| spice                                     |
+-------------------------------------------+

nvl(expression1, expression2)

nvl2(expression1, expression2, expression3)

ascii(str)

> select ascii('abc');
+--------------------+
| ascii(Utf8("abc")) |
+--------------------+
| 97                 |
+--------------------+
> select ascii('🚀');
+-------------------+
| ascii(Utf8("🚀")) |
+-------------------+
| 128640            |
+-------------------+

bit_length(str)

> select bit_length('datafusion');
+--------------------------------+
| bit_length(Utf8("datafusion")) |
+--------------------------------+
| 80                             |
+--------------------------------+

btrim(str[, trim_str])

> select btrim('__datafusion____', '_');
+-------------------------------------------+
| btrim(Utf8("__datafusion____"),Utf8("_")) |
+-------------------------------------------+
| datafusion                                |
+-------------------------------------------+

character_length(str)

> select character_length('Ångström');
+------------------------------------+
| character_length(Utf8("Ångström")) |
+------------------------------------+
| 8                                  |
+------------------------------------+

chr(expression)

> select chr(128640);
+--------------------+
| chr(Int64(128640)) |
+--------------------+
| 🚀                 |
+--------------------+
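
The ascii and chr examples above are inverses of each other for a single character. The Python sketch below shows the round trip using Unicode code points; `ascii_code` is an illustrative helper, and returning 0 for an empty string is an assumption.

```python
def ascii_code(s: str) -> int:
    # Code point of the first character; 0 for an empty string (assumed).
    return ord(s[0]) if s else 0


rocket = "\U0001F680"             # the rocket emoji used in the examples
print(ascii_code("abc"))          # 97
print(ascii_code(rocket))         # 128640
print(chr(128640) == rocket)      # True
```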

concat(str[, ..., str_n])

> select concat('data', 'f', 'us', 'ion');
+-------------------------------------------------------+
| concat(Utf8("data"),Utf8("f"),Utf8("us"),Utf8("ion")) |
+-------------------------------------------------------+
| datafusion                                            |
+-------------------------------------------------------+

concat_ws(separator, str[, ..., str_n])

> select concat_ws('_', 'data', 'fusion');
+--------------------------------------------------+
| concat_ws(Utf8("_"),Utf8("data"),Utf8("fusion")) |
+--------------------------------------------------+
| data_fusion                                      |
+--------------------------------------------------+

contains(str, search_str)

> select contains('the quick brown fox', 'row');
+---------------------------------------------------+
| contains(Utf8("the quick brown fox"),Utf8("row")) |
+---------------------------------------------------+
| true                                              |
+---------------------------------------------------+

like(str, pattern)

> select like('spice.ai', 'spice%');
+---------------------------------------+
| like(Utf8("spice.ai"),Utf8("spice%")) |
+---------------------------------------+
| true                                  |
+---------------------------------------+

ilike(str, pattern)

> select ilike('Spice.AI', 'spice%');
+----------------------------------------+
| ilike(Utf8("Spice.AI"),Utf8("spice%")) |
+----------------------------------------+
| true                                   |
+----------------------------------------+

ends_with(str, substr)

> select ends_with('datafusion', 'soin');
+--------------------------------------------+
| ends_with(Utf8("datafusion"),Utf8("soin")) |
+--------------------------------------------+
| false                                      |
+--------------------------------------------+
> select ends_with('datafusion', 'sion');
+--------------------------------------------+
| ends_with(Utf8("datafusion"),Utf8("sion")) |
+--------------------------------------------+
| true                                       |
+--------------------------------------------+

find_in_set(str, strlist)

> select find_in_set('b', 'a,b,c,d');
+----------------------------------------+
| find_in_set(Utf8("b"),Utf8("a,b,c,d")) |
+----------------------------------------+
| 2                                      |
+----------------------------------------+
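
The position returned in the example above is 1-based. A minimal Python sketch of the lookup follows; returning 0 when the item is absent is an assumption based on the usual convention for this function.

```python
def find_in_set(needle: str, strlist: str) -> int:
    # Positions are 1-based; 0 means the item was not found (assumed convention).
    items = strlist.split(",")
    return items.index(needle) + 1 if needle in items else 0


print(find_in_set("b", "a,b,c,d"))  # 2
```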

initcap(str)

> select initcap('apache datafusion');
+------------------------------------+
| initcap(Utf8("apache datafusion")) |
+------------------------------------+
| Apache Datafusion                  |
+------------------------------------+

left(str, n)

> select left('datafusion', 4);
+-----------------------------------+
| left(Utf8("datafusion"),Int64(4)) |
+-----------------------------------+
| data                              |
+-----------------------------------+

levenshtein(str1, str2)

> select levenshtein('kitten', 'sitting');
+---------------------------------------------+
| levenshtein(Utf8("kitten"),Utf8("sitting")) |
+---------------------------------------------+
| 3                                           |
+---------------------------------------------+
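
The distance of 3 in the example above counts single-character insertions, deletions, and substitutions. The following Python sketch computes the same metric with the standard dynamic-programming recurrence; it is a reference model, not the engine's implementation.

```python
def levenshtein(a: str, b: str) -> int:
    # prev[j] holds the edit distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3
```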

lower(str)

> select lower('Ångström');
+-------------------------+
| lower(Utf8("Ångström")) |
+-------------------------+
| ångström                |
+-------------------------+

luhn_check(str)

> select luhn_check('79927398713');
+--------------------------------------+
| luhn_check(Utf8("79927398713"))      |
+--------------------------------------+
| true                                 |
+--------------------------------------+
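
The check in the example above is the Luhn checksum. A short Python sketch of the algorithm is given below; treating empty or non-digit input as invalid is an assumption of the sketch rather than documented Spice behavior.

```python
def luhn_check(s: str) -> bool:
    # Treat empty or non-digit input as invalid (an assumption of this sketch).
    if not (s.isascii() and s.isdigit()):
        return False
    total = 0
    for i, ch in enumerate(reversed(s)):
        d = int(ch)
        if i % 2 == 1:          # every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9          # same as adding the two digits of the product
        total += d
    return total % 10 == 0


print(luhn_check("79927398713"))  # True
```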

lpad(str, n[, padding_str])

> select lpad('Dolly', 10, 'hello');
+---------------------------------------------+
| lpad(Utf8("Dolly"),Int64(10),Utf8("hello")) |
+---------------------------------------------+
| helloDolly                                  |
+---------------------------------------------+

ltrim(str[, trim_str])

> select ltrim('  datafusion  ');
+-------------------------------+
| ltrim(Utf8("  datafusion  ")) |
+-------------------------------+
| datafusion                    |
+-------------------------------+
> select ltrim('___datafusion___', '_');
+-------------------------------------------+
| ltrim(Utf8("___datafusion___"),Utf8("_")) |
+-------------------------------------------+
| datafusion___                             |
+-------------------------------------------+

octet_length(str)

> select octet_length('Ångström');
+--------------------------------+
| octet_length(Utf8("Ångström")) |
+--------------------------------+
| 10                             |
+--------------------------------+
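
The same string gives 8 for character_length and 10 for octet_length because two of its characters need two bytes each in UTF-8. This Python sketch reproduces the three length measures for that string; it assumes the precomposed forms of Å and ö.

```python
s = "\u00c5ngstr\u00f6m"   # 'Ångström' with precomposed Å and ö

char_len = len(s)                      # code points  -> character_length
octet_len = len(s.encode("utf-8"))     # UTF-8 bytes  -> octet_length
bit_len = octet_len * 8                # bits         -> bit_length

print(char_len, octet_len, bit_len)    # 8 10 80
```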

overlay(str PLACING substr FROM pos [FOR count])

> select overlay('Txxxxas' placing 'hom' from 2 for 4);
+--------------------------------------------------------+
| overlay(Utf8("Txxxxas"),Utf8("hom"),Int64(2),Int64(4)) |
+--------------------------------------------------------+
| Thomas                                                 |
+--------------------------------------------------------+
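
In the example above, four characters starting at position 2 are replaced by a three-character string, so the result is shorter than the input. The Python sketch below models that splice; defaulting count to the length of substr when FOR is omitted is an assumption based on the PostgreSQL convention.

```python
from typing import Optional


def overlay(s: str, substr: str, pos: int, count: Optional[int] = None) -> str:
    # pos is 1-based; when count is omitted, len(substr) characters are replaced.
    if count is None:
        count = len(substr)
    start = pos - 1
    return s[:start] + substr + s[start + count:]


print(overlay("Txxxxas", "hom", 2, 4))  # Thomas
```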

parse_url(url, part_to_extract[, key])

> select parse_url('https://spice.ai/blog?id=42', 'HOST');
+-------------------------------------------------------------+
| parse_url(Utf8("https://spice.ai/blog?id=42"),Utf8("HOST")) |
+-------------------------------------------------------------+
| spice.ai                                                    |
+-------------------------------------------------------------+
> select parse_url('https://spice.ai/blog?id=42', 'QUERY', 'id');
+-------------------------------------------------------------------------+
| parse_url(Utf8("https://spice.ai/blog?id=42"),Utf8("QUERY"),Utf8("id")) |
+-------------------------------------------------------------------------+
| 42                                                                      |
+-------------------------------------------------------------------------+

repeat(str, n)

> select repeat('data', 3);
+-------------------------------+
| repeat(Utf8("data"),Int64(3)) |
+-------------------------------+
| datadatadata                  |
+-------------------------------+

replace(str, substr, replacement)

> select replace('ABabbaBA', 'ab', 'cd');
+-------------------------------------------------+
| replace(Utf8("ABabbaBA"),Utf8("ab"),Utf8("cd")) |
+-------------------------------------------------+
| ABcdbaBA                                        |
+-------------------------------------------------+

reverse(str)

> select reverse('datafusion');
+-----------------------------+
| reverse(Utf8("datafusion")) |
+-----------------------------+
| noisufatad                  |
+-----------------------------+

right(str, n)

> select right('datafusion', 6);
+------------------------------------+
| right(Utf8("datafusion"),Int64(6)) |
+------------------------------------+
| fusion                             |
+------------------------------------+

rpad(str, n[, padding_str])

> select rpad('datafusion', 20, '_-');
+-----------------------------------------------+
| rpad(Utf8("datafusion"),Int64(20),Utf8("_-")) |
+-----------------------------------------------+
| datafusion_-_-_-_-_-                          |
+-----------------------------------------------+

rtrim(str[, trim_str])

> select rtrim('  datafusion  ');
+-------------------------------+
| rtrim(Utf8("  datafusion  ")) |
+-------------------------------+
|   datafusion                  |
+-------------------------------+
> select rtrim('___datafusion___', '_');
+-------------------------------------------+
| rtrim(Utf8("___datafusion___"),Utf8("_")) |
+-------------------------------------------+
| ___datafusion                             |
+-------------------------------------------+

split_part(str, delimiter, pos)

> select split_part('1.2.3.4.5', '.', 3);
+--------------------------------------------------+
| split_part(Utf8("1.2.3.4.5"),Utf8("."),Int64(3)) |
+--------------------------------------------------+
| 3                                                |
+--------------------------------------------------+

starts_with(str, substr)

> select starts_with('datafusion','data');
+----------------------------------------------+
| starts_with(Utf8("datafusion"),Utf8("data")) |
+----------------------------------------------+
| true                                         |
+----------------------------------------------+

strpos(str, substr)

> select strpos('datafusion', 'fus');
+----------------------------------------+
| strpos(Utf8("datafusion"),Utf8("fus")) |
+----------------------------------------+
| 5                                      |
+----------------------------------------+

substr(str, start_pos[, length])

> select substr('datafusion', 5, 3);
+----------------------------------------------+
| substr(Utf8("datafusion"),Int64(5),Int64(3)) |
+----------------------------------------------+
| fus                                          |
+----------------------------------------------+

substr_index(str, delim, count)

> select substr_index('www.apache.org', '.', 1);
+---------------------------------------------------------+
| substr_index(Utf8("www.apache.org"),Utf8("."),Int64(1)) |
+---------------------------------------------------------+
| www                                                     |
+---------------------------------------------------------+
> select substr_index('www.apache.org', '.', -1);
+----------------------------------------------------------+
| substr_index(Utf8("www.apache.org"),Utf8("."),Int64(-1)) |
+----------------------------------------------------------+
| org                                                      |
+----------------------------------------------------------+
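
The sign of count in the examples above decides which side of the string is kept. The Python sketch below models that behavior; returning an empty string for a count of 0 is an assumption.

```python
def substr_index(s: str, delim: str, count: int) -> str:
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])    # everything left of the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])    # everything right of it, counting from the end
    return ""                               # count = 0 (assumed to yield an empty string)


print(substr_index("www.apache.org", ".", 1))   # www
print(substr_index("www.apache.org", ".", -1))  # org
print(substr_index("www.apache.org", ".", 2))   # www.apache
```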

to_hex(int)

> select to_hex(12345689);
+-------------------------+
| to_hex(Int64(12345689)) |
+-------------------------+
| bc6159                  |
+-------------------------+

translate(str, chars, translation)

> select translate('twice', 'wic', 'her');
+--------------------------------------------------+
| translate(Utf8("twice"),Utf8("wic"),Utf8("her")) |
+--------------------------------------------------+
| there                                            |
+--------------------------------------------------+
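
The example above replaces characters one by one: w becomes h, i becomes e, and c becomes r. A minimal Python sketch of this positional mapping follows; dropping characters that have no counterpart in translation is an assumption based on the PostgreSQL convention.

```python
def translate(s: str, chars: str, translation: str) -> str:
    # Map each character in `chars` to the character at the same position in
    # `translation`; characters with no counterpart are dropped (assumed behavior).
    table = {}
    for i, c in enumerate(chars):
        if c not in table:
            table[c] = translation[i] if i < len(translation) else ""
    return "".join(table.get(c, c) for c in s)


print(translate("twice", "wic", "her"))  # there
```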

upper(str)

> select upper('dataFusion');
+---------------------------+
| upper(Utf8("dataFusion")) |
+---------------------------+
| DATAFUSION                |
+---------------------------+

> select uuid();
+--------------------------------------+
| uuid()                               |
+--------------------------------------+
| 6ec17ef8-1934-41cc-8d59-d0c8f9eea1f0 |
+--------------------------------------+

bit_get(value, position)

> select bit_get(11, 2) as bit;
+-----+
| bit |
+-----+
| 0   |
+-----+

bit_count(value)

> select bit_count(255) as popcnt;
+--------+
| popcnt |
+--------+
| 8      |
+--------+

bitmap_count(bitmap)

> select bitmap_count(x'0F') as popcnt;
+--------+
| popcnt |
+--------+
| 4      |
+--------+
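
The three bit functions above can be cross-checked with a small Python sketch. It is an illustration rather than the engine's implementation, and it assumes that `position` counts from 0 at the least significant bit, which is consistent with the `bit_get(11, 2)` example.

```python
def bit_get(value: int, position: int) -> int:
    """Return the bit at `position`, counting from 0 at the least significant bit."""
    return (value >> position) & 1


def bit_count(value: int) -> int:
    """Count the set bits of a non-negative integer."""
    return bin(value).count("1")


def bitmap_count(bitmap: bytes) -> int:
    """Count the set bits across every byte of a binary value."""
    return sum(bin(byte).count("1") for byte in bitmap)


print(bit_get(11, 2))                     # 11 is 0b1011, so bit 2 is 0
print(bit_count(255))                     # 255 is 0b11111111
print(bitmap_count(bytes.fromhex("0F")))  # 0x0F is 0b00001111
```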

regexp_like(str, regexp[, flags])

> select regexp_like('Köln', '[a-zA-Z]ö[a-zA-Z]{2}');
+--------------------------------------------------------+
| regexp_like(Utf8("Köln"),Utf8("[a-zA-Z]ö[a-zA-Z]{2}")) |
+--------------------------------------------------------+
| true                                                   |
+--------------------------------------------------------+
> SELECT regexp_like('aBc', '(b|d)', 'i');
+--------------------------------------------------+
| regexp_like(Utf8("aBc"),Utf8("(b|d)"),Utf8("i")) |
+--------------------------------------------------+
| true                                             |
+--------------------------------------------------+

regexp_match(str, regexp[, flags])

> select regexp_match('Köln', '[a-zA-Z]ö[a-zA-Z]{2}');
+---------------------------------------------------------+
| regexp_match(Utf8("Köln"),Utf8("[a-zA-Z]ö[a-zA-Z]{2}")) |
+---------------------------------------------------------+
| [Köln]                                                  |
+---------------------------------------------------------+
> SELECT regexp_match('aBc', '(b|d)', 'i');
+---------------------------------------------------+
| regexp_match(Utf8("aBc"),Utf8("(b|d)"),Utf8("i")) |
+---------------------------------------------------+
| [B]                                               |
+---------------------------------------------------+

regexp_replace(str, regexp, replacement[, flags])

> select regexp_replace('foobarbaz', 'b(..)', 'X\\1Y', 'g');
+------------------------------------------------------------------------+
| regexp_replace(Utf8("foobarbaz"),Utf8("b(..)"),Utf8("X\1Y"),Utf8("g")) |
+------------------------------------------------------------------------+
| fooXarYXazY                                                            |
+------------------------------------------------------------------------+
> SELECT regexp_replace('aBc', '(b|d)', 'Ab\\1a', 'i');
+-------------------------------------------------------------------+
| regexp_replace(Utf8("aBc"),Utf8("(b|d)"),Utf8("Ab\1a"),Utf8("i")) |
+-------------------------------------------------------------------+
| aAbBac                                                            |
+-------------------------------------------------------------------+

regexp_count(str, regexp[, start, flags])

> select regexp_count('abcAbAbc', 'abc', 2, 'i');
+---------------------------------------------------------------+
| regexp_count(Utf8("abcAbAbc"),Utf8("abc"),Int64(2),Utf8("i")) |
+---------------------------------------------------------------+
| 1                                                             |
+---------------------------------------------------------------+

regexp_instr(str, regexp[, start[, N[, flags[, subexpr]]]])

> SELECT regexp_instr('ABCDEF', 'C(.)(..)');
+---------------------------------------------------------------+
| regexp_instr(Utf8("ABCDEF"),Utf8("C(.)(..)"))                 |
+---------------------------------------------------------------+
| 3                                                             |
+---------------------------------------------------------------+

current_date()

current_time()

date_bin(interval, expression[, origin-timestamp])

-- Bin the timestamp into 1-day intervals
> SELECT date_bin(interval '1 day', time) as bin
FROM VALUES ('2023-01-01T18:18:18Z'), ('2023-01-03T19:00:03Z') t(time);
+---------------------+
| bin                 |
+---------------------+
| 2023-01-01T00:00:00 |
| 2023-01-03T00:00:00 |
+---------------------+
2 row(s) fetched.

-- Bin the timestamp into 1-day intervals starting at 3 AM on 2023-01-01
> SELECT date_bin(interval '1 day', time, '2023-01-01T03:00:00') as bin
FROM VALUES ('2023-01-01T18:18:18Z'), ('2023-01-03T19:00:03Z') t(time);
+---------------------+
| bin                 |
+---------------------+
| 2023-01-01T03:00:00 |
| 2023-01-03T03:00:00 |
+---------------------+
2 row(s) fetched.
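
To make the binning arithmetic concrete, here is a minimal Python sketch of what the two queries above compute. It is an illustration, not the engine's implementation. The helper name and signature are hypothetical, and it assumes the origin defaults to the Unix epoch when omitted.

```python
from datetime import datetime, timedelta

# Assumption: when no origin is given, bins are aligned to 1970-01-01T00:00:00.
EPOCH = datetime(1970, 1, 1)


def date_bin(stride: timedelta, ts: datetime, origin: datetime = EPOCH) -> datetime:
    """Return the start of the stride-wide bin, aligned to origin, that contains ts."""
    # Floor division also places timestamps before the origin in the earlier bin.
    bins_since_origin = (ts - origin) // stride
    return origin + bins_since_origin * stride


day = timedelta(days=1)
samples = [datetime(2023, 1, 1, 18, 18, 18), datetime(2023, 1, 3, 19, 0, 3)]

default_bins = [date_bin(day, ts) for ts in samples]
shifted_bins = [date_bin(day, ts, datetime(2023, 1, 1, 3)) for ts in samples]

print([b.isoformat() for b in default_bins])  # ['2023-01-01T00:00:00', '2023-01-03T00:00:00']
print([b.isoformat() for b in shifted_bins])  # ['2023-01-01T03:00:00', '2023-01-03T03:00:00']
```

Moving the origin to 03:00 shifts every bin boundary by three hours, which is why both rows in the second result start at `03:00:00`.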

date_add(start_date, num_days)

> select date_add(date '2024-02-27', 3);
+-----------------------------------------+
| date_add(Date32("2024-02-27"),Int64(3)) |
+-----------------------------------------+
| 2024-03-01                              |
+-----------------------------------------+

date_sub(start_date, num_days)

> select date_sub(date '2024-03-05', 7);
+-----------------------------------------+
| date_sub(Date32("2024-03-05"),Int64(7)) |
+-----------------------------------------+
| 2024-02-27                              |
+-----------------------------------------+

last_day(expression)

> select last_day(date '2024-02-14');
+--------------------------------+
| last_day(Date32("2024-02-14")) |
+--------------------------------+
| 2024-02-29                     |
+--------------------------------+

next_day(start_date, day_of_week)

> select next_day(date '2024-02-14', 'FRI');
+--------------------------------------------+
| next_day(Date32("2024-02-14"),Utf8("FRI")) |
+--------------------------------------------+
| 2024-02-16                                 |
+--------------------------------------------+

date_part(part, expression)

extract(field FROM source)

date_trunc(precision, expression)

from_unixtime(expression[, timezone])

> select from_unixtime(1599572549, 'America/New_York');
+-----------------------------------------------------------+
| from_unixtime(Int64(1599572549),Utf8("America/New_York")) |
+-----------------------------------------------------------+
| 2020-09-08T09:42:29-04:00                                 |
+-----------------------------------------------------------+

make_date(year, month, day)

> select make_date(2023, 1, 31);
+-------------------------------------------+
| make_date(Int64(2023),Int64(1),Int64(31)) |
+-------------------------------------------+
| 2023-01-31                                |
+-------------------------------------------+
> select make_date('2023', '01', '31');
+-----------------------------------------------+
| make_date(Utf8("2023"),Utf8("01"),Utf8("31")) |
+-----------------------------------------------+
| 2023-01-31                                    |
+-----------------------------------------------+

to_char(expression, format)

> select to_char('2023-03-01'::date, '%d-%m-%Y');
+----------------------------------------------+
| to_char(Utf8("2023-03-01"),Utf8("%d-%m-%Y")) |
+----------------------------------------------+
| 01-03-2023                                   |
+----------------------------------------------+

to_date(expression[, ..., format_n])

> select to_date('2023-01-31');
+-----------------------------+
| to_date(Utf8("2023-01-31")) |
+-----------------------------+
| 2023-01-31                  |
+-----------------------------+
> select to_date('2023/01/31', '%Y-%m-%d', '%Y/%m/%d');
+---------------------------------------------------------------+
| to_date(Utf8("2023/01/31"),Utf8("%Y-%m-%d"),Utf8("%Y/%m/%d")) |
+---------------------------------------------------------------+
| 2023-01-31                                                    |
+---------------------------------------------------------------+

to_local_time(expression)

> SELECT to_local_time('2024-04-01T00:00:20Z'::timestamp);
+---------------------------------------------+
| to_local_time(Utf8("2024-04-01T00:00:20Z")) |
+---------------------------------------------+
| 2024-04-01T00:00:20                         |
+---------------------------------------------+

> SELECT to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels');
+---------------------------------------------+
| to_local_time(Utf8("2024-04-01T00:00:20Z")) |
+---------------------------------------------+
| 2024-04-01T00:00:20                         |
+---------------------------------------------+

> SELECT
  time,
  arrow_typeof(time) as type,
  to_local_time(time) as to_local_time,
  arrow_typeof(to_local_time(time)) as to_local_time_type
FROM (
  SELECT '2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels' AS time
);
+---------------------------+------------------------------------------------+---------------------+-----------------------------+
| time                      | type                                           | to_local_time       | to_local_time_type          |
+---------------------------+------------------------------------------------+---------------------+-----------------------------+
| 2024-04-01T00:00:20+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) | 2024-04-01T00:00:20 | Timestamp(Nanosecond, None) |
+---------------------------+------------------------------------------------+---------------------+-----------------------------+

-- Combine to_local_time() with date_bin() to bin on day boundaries in the local
-- time zone rather than on UTC boundaries

> SELECT date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AS date_bin;
+---------------------+
| date_bin            |
+---------------------+
| 2024-04-01T00:00:00 |
+---------------------+

> SELECT date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AT TIME ZONE 'Europe/Brussels' AS date_bin_with_timezone;
+---------------------------+
| date_bin_with_timezone    |
+---------------------------+
| 2024-04-01T00:00:00+02:00 |
+---------------------------+
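
The effect of stripping the zone before binning can be sketched in Python. This is an illustration under stated assumptions, not the engine's implementation. The helper names are hypothetical, and a fixed `+02:00` offset stands in for `Europe/Brussels`, which matches the offset shown in the output above for this date.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +02:00 offset stands in for Europe/Brussels on 2024-04-01.
BRUSSELS = timezone(timedelta(hours=2))


def to_local_time(ts: datetime) -> datetime:
    """Keep the wall-clock reading and drop the zone."""
    return ts.replace(tzinfo=None)


def bin_to_day(ts: datetime) -> datetime:
    """Truncate a timestamp to midnight of its own calendar day."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


ts = datetime(2024, 4, 1, 0, 0, 20, tzinfo=BRUSSELS)

# Binning the UTC instant lands on the previous calendar day.
utc_bin = bin_to_day(ts.astimezone(timezone.utc))
# Binning the local wall-clock time lands on the local day, as in the queries above.
local_bin = bin_to_day(to_local_time(ts))
local_bin_with_zone = local_bin.replace(tzinfo=BRUSSELS)

print(utc_bin.isoformat())              # 2024-03-31T00:00:00+00:00
print(local_bin.isoformat())            # 2024-04-01T00:00:00
print(local_bin_with_zone.isoformat())  # 2024-04-01T00:00:00+02:00
```

Twenty seconds past local midnight is still 22:00 on the previous day in UTC, so the two approaches disagree by a full day for this timestamp.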

to_timestamp(expression[, ..., format_n])

> select to_timestamp('2023-01-31T09:26:56.123456789-05:00');
+-----------------------------------------------------------+
| to_timestamp(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-----------------------------------------------------------+
| 2023-01-31T14:26:56.123456789                             |
+-----------------------------------------------------------+
> select to_timestamp('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+--------------------------------------------------------------------------------------------------------+
| to_timestamp(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+--------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456789                                                                          |
+--------------------------------------------------------------------------------------------------------+

to_timestamp_micros(expression[, ..., format_n])

> select to_timestamp_micros('2023-01-31T09:26:56.123456789-05:00');
+------------------------------------------------------------------+
| to_timestamp_micros(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+------------------------------------------------------------------+
| 2023-01-31T14:26:56.123456                                       |
+------------------------------------------------------------------+
> select to_timestamp_micros('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+---------------------------------------------------------------------------------------------------------------+
| to_timestamp_micros(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+---------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456                                                                                    |
+---------------------------------------------------------------------------------------------------------------+

to_timestamp_millis(expression[, ..., format_n])

> select to_timestamp_millis('2023-01-31T09:26:56.123456789-05:00');
+------------------------------------------------------------------+
| to_timestamp_millis(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+------------------------------------------------------------------+
| 2023-01-31T14:26:56.123                                          |
+------------------------------------------------------------------+
> select to_timestamp_millis('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+---------------------------------------------------------------------------------------------------------------+
| to_timestamp_millis(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+---------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123                                                                                       |
+---------------------------------------------------------------------------------------------------------------+

to_timestamp_nanos(expression[, ..., format_n])

> select to_timestamp_nanos('2023-01-31T09:26:56.123456789-05:00');
+-----------------------------------------------------------------+
| to_timestamp_nanos(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-----------------------------------------------------------------+
| 2023-01-31T14:26:56.123456789                                   |
+-----------------------------------------------------------------+
> select to_timestamp_nanos('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+--------------------------------------------------------------------------------------------------------------+
| to_timestamp_nanos(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+--------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456789                                                                                |
+--------------------------------------------------------------------------------------------------------------+

to_timestamp_seconds(expression[, ..., format_n])

> select to_timestamp_seconds('2023-01-31T09:26:56.123456789-05:00');
+-------------------------------------------------------------------+
| to_timestamp_seconds(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-------------------------------------------------------------------+
| 2023-01-31T14:26:56                                               |
+-------------------------------------------------------------------+
> select to_timestamp_seconds('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+----------------------------------------------------------------------------------------------------------------+
| to_timestamp_seconds(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+----------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00                                                                                            |
+----------------------------------------------------------------------------------------------------------------+

to_unixtime(expression[, ..., format_n])

> select to_unixtime('2020-09-08T12:00:00+00:00');
+------------------------------------------------+
| to_unixtime(Utf8("2020-09-08T12:00:00+00:00")) |
+------------------------------------------------+
| 1599566400                                     |
+------------------------------------------------+
> select to_unixtime('01-14-2023 01:01:30+05:30', '%q', '%d-%m-%Y %H/%M/%S', '%+', '%m-%d-%Y %H:%M:%S%#z');
+-----------------------------------------------------------------------------------------------------------------------------+
| to_unixtime(Utf8("01-14-2023 01:01:30+05:30"),Utf8("%q"),Utf8("%d-%m-%Y %H/%M/%S"),Utf8("%+"),Utf8("%m-%d-%Y %H:%M:%S%#z")) |
+-----------------------------------------------------------------------------------------------------------------------------+
| 1673638290                                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------------+

array(expression[, ..., expression_n])

> select array(1, 2, 3);
+-----------------------------------+
| array(Int64(1),Int64(2),Int64(3)) |
+-----------------------------------+
| [1, 2, 3]                         |
+-----------------------------------+

array_any_value(array)

> select array_any_value([NULL, 1, 2, 3]);
+-------------------------------------+
| array_any_value(List([NULL,1,2,3])) |
+-------------------------------------+
| 1                                   |
+-------------------------------------+

array_append(array, element)

> select array_append([1, 2, 3], 4);
+--------------------------------------+
| array_append(List([1,2,3]),Int64(4)) |
+--------------------------------------+
| [1, 2, 3, 4]                         |
+--------------------------------------+

array_concat(array[, ..., array_n])

> select array_concat([1, 2], [3, 4], [5, 6]);
+---------------------------------------------------+
| array_concat(List([1,2]),List([3,4]),List([5,6])) |
+---------------------------------------------------+
| [1, 2, 3, 4, 5, 6]                                |
+---------------------------------------------------+

array_contains(array, element)

> select array_contains([1, 2, 3], 2);
+----------------------------------------+
| array_contains(List([1,2,3]),Int64(2)) |
+----------------------------------------+
| true                                   |
+----------------------------------------+

array_dims(array)

> select array_dims([[1, 2, 3], [4, 5, 6]]);
+---------------------------------+
| array_dims(List([1,2,3,4,5,6])) |
+---------------------------------+
| [2, 3]                          |
+---------------------------------+

array_distance(array1, array2)

> select array_distance([1, 2], [1, 4]);
+------------------------------------+
| array_distance(List([1,2], [1,4])) |
+------------------------------------+
| 2.0                                |
+------------------------------------+

array_distinct(array)

> select array_distinct([1, 3, 2, 3, 1, 2, 4]);
+---------------------------------+
| array_distinct(List([1,2,3,4])) |
+---------------------------------+
| [1, 3, 2, 4]                    |
+---------------------------------+

array_element(array, index)

> select array_element([1, 2, 3, 4], 3);
+-----------------------------------------+
| array_element(List([1,2,3,4]),Int64(3)) |
+-----------------------------------------+
| 3                                       |
+-----------------------------------------+

array_except(array1, array2)

array_has(array, element)

array_has_all(array, sub_array)

array_has_any(array, sub_array)
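
A minimal sketch of the three membership checks (`array_has`, `array_has_all`, `array_has_any`), assuming Spice follows upstream Apache DataFusion semantics for these functions. The literal arrays are illustrative only.

```sql
-- array_has: true when the element occurs in the array
SELECT array_has([1, 2, 3], 2);              -- true

-- array_has_all: true when every element of sub_array occurs in array
SELECT array_has_all([1, 2, 3, 4], [2, 3]);  -- true

-- array_has_any: true when at least one element of sub_array occurs in array
SELECT array_has_any([1, 2, 3], [3, 4]);     -- true
```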

array_intersect(array1, array2)

array_length(array[, dimension])

array_max(array)

array_min(array)

array_ndims(array)

array_pop_back(array)

array_pop_front(array)

array_position(array, element[, from_index])

array_positions(array, element)

array_prepend(element, array)

array_remove(array, element)

array_remove_n(array, element, max)

array_remove_all(array, element)

array_repeat(element, count)

array_replace(array, from, to)

array_replace_n(array, from, to, max)

array_replace_all(array, from, to)

array_resize(array, size[, value])

array_reverse(array)

array_slice(array, begin, end[, stride])
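
A short illustration of slicing, assuming upstream DataFusion behavior in which `begin` and `end` are 1-based and both inclusive (consistent with the 1-based `array_element` example in this reference). The optional `stride` argument is omitted here.

```sql
-- Elements at positions 3 through 6, inclusive
SELECT array_slice([1, 2, 3, 4, 5, 6, 7, 8], 3, 6);  -- [3, 4, 5, 6]
```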

array_sort(array[, desc[, nulls_first]])

array_to_string(array, delimiter[, null_string])

array_union(array1, array2)

arrays_zip(array1[, array2, ...])

cardinality(array_or_map)

empty(array)

flatten(array)

make_array(expression1[, ..., expression_n])

range(start, stop[, step])
range(stop)

generate_series(start, stop[, step])
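
`range` and `generate_series` share the same argument shape, so a side-by-side sketch may help. It assumes upstream DataFusion semantics, where the two differ only in whether `stop` is included.

```sql
-- range excludes the stop value
SELECT range(1, 5);            -- [1, 2, 3, 4]

-- generate_series includes the stop value
SELECT generate_series(1, 5);  -- [1, 2, 3, 4, 5]
```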

string_to_array(str, delimiter[, null_string])

struct(expression1[, ..., expression_n])

> SELECT struct(1, 'spice', true);
+-----------------------------------------------------+
| struct(Int64(1),Utf8("spice"),Boolean(true))        |
+-----------------------------------------------------+
| {c0: 1, c1: spice, c2: true}                        |
+-----------------------------------------------------+

named_struct(name1, expression1[, name2, expression2, ...])

> SELECT named_struct('id', 1, 'label', 'spice');
+---------------------------------------------------------------+
| named_struct(Utf8("id"),Int64(1),Utf8("label"),Utf8("spice")) |
+---------------------------------------------------------------+
| {id: 1, label: spice}                                         |
+---------------------------------------------------------------+

get_field(expression, field_name)
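
A minimal sketch of field access, reusing the `named_struct` value from the example above and assuming upstream DataFusion behavior for `get_field`:

```sql
-- Build a struct, then read one field back by name
SELECT get_field(named_struct('id', 1, 'label', 'spice'), 'label');  -- spice
```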

map(key1, value1[, key2, value2, ...])
map(keys_array, values_array)

> SELECT map('a', 1, 'b', 2);
+-------------------------------------------------------+
| map(Utf8("a"),Int64(1),Utf8("b"),Int64(2))            |
+-------------------------------------------------------+
| {a: 1, b: 2}                                          |
+-------------------------------------------------------+

map_keys(map)

map_values(map)

map_entries(map)

map_extract(map, key)
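
The map accessors can be sketched against the `map('a', 1, 'b', 2)` value used earlier. The results in the comments assume upstream DataFusion semantics, where `map_extract` returns a list holding the matching value rather than a bare scalar.

```sql
SELECT map_keys(map('a', 1, 'b', 2));          -- [a, b]
SELECT map_values(map('a', 1, 'b', 2));        -- [1, 2]
SELECT map_extract(map('a', 1, 'b', 2), 'a');  -- [1]
```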

digest(expression, algorithm)

> SELECT encode(digest('spice.ai', 'sha256'), 'hex');

md5(expression)

sha224(expression)

sha256(expression)

sha384(expression)

sha512(expression)

encode(expression, encoding)

> SELECT encode('spice', 'base64');
+--------------------------------------+
| encode(Utf8("spice"),Utf8("base64")) |
+--------------------------------------+
| c3BpY2U=                             |
+--------------------------------------+

decode(expression, encoding)
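
As a hedged sketch, `decode` can reverse the `encode` example above. The result of `decode` is binary, so this query casts it back to text with `arrow_cast`. Support for the binary-to-`Utf8` cast is an assumption here; if it holds, the value should round-trip to `spice`.

```sql
-- 'c3BpY2U=' is the base64 output of encode('spice', 'base64') shown above
SELECT arrow_cast(decode('c3BpY2U=', 'base64'), 'Utf8') AS decoded;
```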

union_extract(expression, field_name)

union_tag(expression)

obj_description(table_identifier)
obj_description(table_identifier, catalog_name)
obj_description(schema_name, table_name)
obj_description(catalog_name, schema_name, table_name)

> SELECT obj_description('public.taxi_trips');
+--------------------------------------------+
| obj_description(Utf8("public.taxi_trips")) |
+--------------------------------------------+
| NYC yellow-cab trip records                |
+--------------------------------------------+

col_description(table_identifier, column)
col_description(catalog_name, schema_name, table_name, column)

> SELECT col_description('public.taxi_trips', 'fare_amount');
+----------------------------------------------------------------+
| col_description(Utf8("public.taxi_trips"),Utf8("fare_amount")) |
+----------------------------------------------------------------+
| Total fare in USD, excluding tip                               |
+----------------------------------------------------------------+

> SELECT col_description('public.taxi_trips', 3);
+-----------------------------------------------------+
| col_description(Utf8("public.taxi_trips"),Int64(3)) |
+-----------------------------------------------------+
| Pickup datetime in source-local time                |
+-----------------------------------------------------+

arrow_cast(expression, arrow_type)

> SELECT arrow_cast(now(), 'Timestamp(Second, None)') AS now_seconds;
+---------------------+
| now_seconds         |
+---------------------+
| 2024-01-15T10:30:45 |
+---------------------+

> SELECT arrow_cast('123', 'Int64') AS num;
+-----+
| num |
+-----+
| 123 |
+-----+

arrow_try_cast(expression, arrow_type)

arrow_typeof(expression)

> SELECT arrow_typeof(1);
+------------------------+
| arrow_typeof(Int64(1)) |
+------------------------+
| Int64                  |
+------------------------+

> SELECT arrow_typeof(now());
+-------------------------------+
| arrow_typeof(now())           |
+-------------------------------+
| Timestamp(Nanosecond, None)   |
+-------------------------------+

> SELECT arrow_typeof(interval '1 month');
+------------------------------+
| arrow_typeof(...)            |
+------------------------------+
| Interval(MonthDayNano)       |
+------------------------------+

arrow_metadata(expression)

version()

bucket(num_buckets, value)

-- Partition account IDs into 100 stable buckets
SELECT account_id, bucket(100, account_id) AS account_bucket
FROM accounts;

datasets:
  - name: my_table
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by:
        - bucket(100, account_id)

truncate(width, value)

-- Numeric: floor-bucket integers into ranges of 10
SELECT truncate(10, 101) AS truncated_id;  -- returns 100

-- Truncate event timestamps to the start of each hour (3600 seconds)
SELECT truncate(3600, extract(epoch FROM event_time)) AS hour_start
FROM events;

-- String: keep the first 2 characters (e.g., country prefix)
SELECT truncate(2, 'United Kingdom');  -- returns 'Un'
