Built-in functions
Applies to:
Databricks SQL
Databricks Runtime
This article presents links to and descriptions of built-in operators and functions for strings and binary types, numeric scalars, aggregations, windows, arrays, maps, dates and timestamps, casting, CSV data, JSON data, XPath manipulation, and other miscellaneous functions.
Also see:
Operators and predicates
For information on how operators are parsed with respect to each other, see Operator precedence.
| Operator | Syntax | Description |
|---|---|---|
| & | expr1 & expr2 |
Returns the bitwise AND of expr1 and expr2. |
| and | expr1 and expr2 |
Returns the logical AND of expr1 and expr2. |
| * | multiplier * multiplicand |
Returns multiplier multiplied by multiplicand. |
| != | expr1 != expr2 |
Returns true if expr1 does not equal expr2, or false otherwise. |
| ! | !expr |
Returns the logical NOT of a Boolean expression. |
| between | expr1 [not] between expr2 and expr2 |
Tests whether expr1 is greater or equal than expr2 and less than or equal to expr3. |
| [ ] | arrayExpr [ indexExpr ] |
Returns indexExprnd element of ARRAY arrayExpr |
| [ ] | mapExpr [ keyExpr ] |
Returns value at keyExpr of MAP mapExpr |
| ^ | expr1 ^ expr2 |
Returns the bitwise exclusive OR (XOR) of expr1 and expr2. |
| : | jsonStr : jsonPath |
Returns fields extracted from the jsonStr. |
| :: | expr :: type |
Casts the value expr to the target data type type. |
| div | divisor div dividend |
Returns the integral part of the division of divisor by dividend. |
| . | mapExpr . keyIdentifier |
Returns a MAP value by keyIdentifier. |
| . | structExpr . fieldIdentifier |
Returns a STRUCT field by fieldIdentifier. |
| == | expr1 == expr2 |
Returns true if expr1 equals expr2, or false otherwise. |
| = | expr1 = expr2 |
Returns true if expr1 equals expr2, or false otherwise. |
| >= | expr1 >= expr2 |
Returns true if expr1 is greater than or equal to expr2, or false otherwise. |
| > | expr1 > expr2 |
Returns true if expr1 is greater than expr2, or false otherwise. |
| exists | exists(query) |
Returns true if query returns at least one row, or false otherwise. |
| ilike | str [not] ilike (pattern[ESCAPE escape]) |
Returns true if str does (not) match pattern with escape case-insensitively. |
| ilike | str [not] ilike {ANY|SOME|ALL}([pattern[, ...]]) |
Returns true if str does (not) match any/all patterns case-insensitively. |
| in | elem [not] in (expr1[, ...]) |
Returns true if elem does (not) equal any exprN. |
| in | elem [not] in (query) |
Returns true if elem does (not) equal any row in query. |
| is distinct | expr1 is [not] distinct from expr2 |
Tests whether the arguments do (not) have different values where NULLs are considered as comparable values. |
| is false | expr is [not] false |
Tests whether expr is (not) false. |
| is null | expr is [not] null |
Returns true if expr is (not) NULL. |
| is true | expr is [not] true |
Tests whether expr is (not) true. |
| like | str [not] like (pattern[ESCAPE escape]) |
Returns true if str does (not) match pattern with escape. |
| like | str [not] like {ANY|SOME|ALL}([pattern[, ...]]) |
Returns true if str does (not) match any/all patterns. |
| <=> | expr1 <=> expr2 |
Returns the same result as the EQUAL(=) for non-null operands, but returns true if both are NULL, false if one of the them is NULL. |
| <= | expr1 <= expr2 |
Returns true if expr1 is less than or equal to expr2, or false otherwise. |
| <> | expr1 <> expr2 |
Returns true if expr1 does not equal expr2, or false otherwise. |
| < | expr1 < expr2 |
Returns true if expr1 is less than expr2, or false otherwise. |
| - | expr1 - expr2 |
Returns the subtraction of expr2 from expr1. |
| not | not expr |
Returns the logical NOT of a Boolean expression. |
| or | expr1 or expr2 |
Returns the logical OR of expr1 and expr2. |
| % | dividend % divisor |
Returns the remainder after dividend / divisor. |
| || | expr1 || expr2 |
Returns the concatenation of expr1 and expr2. |
| | | expr1 | expr2 |
Returns the bitwise OR of expr1 and expr2. |
| + | expr1 + expr2 |
Returns the sum of expr1 and expr2. |
| regexp | str [not] regexp regex |
Returns true if str does (not) match regex. |
| regexp_like | str [not] regexp_like regex |
Returns true if str does (not) match regex. |
| rlike | str [not] rlike regex |
Returns true if str does (not) match regex. |
| / | dividend / divisor |
Returns dividend divided by divisor. |
| ~ | ~ expr |
Returns the bitwise NOT of expr. |
Operator precedence
| Precedence | Operator |
|---|---|
| 1 | :, ::, [ ] |
| 2 | -(unary), +(unary), ~ |
| 3 | *, /, %, div |
| 4 | +, -, || |
| 5 | & |
| 6 | ^ |
| 7 | | |
| 8 | =, ==, <=>, <>, !=, <, <=, >, >= |
| 9 | not, exists |
| 10 | between, in, rlike, regexp, ilike, like, is [not] [NULL, true, false], is [not] distinct from |
| 11 | and |
| 12 | or |
String and binary functions
| Function | Description |
|---|---|
| expr1 || expr2 | Returns the concatenation of expr1 and expr2. |
| aes_decrypt(expr, key[, mode[, padding]]) | Decrypts a binary expr using AES encryption. |
| aes_encrypt(expr, key[, mode[, padding]]) | Encrypts a binary expr using AES encryption. |
| ascii(str) | Returns the ASCII code point of the first character of str. |
| base64(expr) | Converts expr to a base 64 string. |
| bin(expr) | Returns the binary representation of expr. |
| binary(expr) | Casts the value of expr to BINARY. |
| bit_length(expr) | Returns the bit length of string data or number of bits of binary data. |
| btrim(str [, trimStr]) | Returns str with leading and trailing characters removed. |
| char(expr) | Returns the character at the supplied UTF-16 code point. |
| char_length(expr) | Returns the character length of string data or number of bytes of binary data. |
| character_length(expr) | Returns the character length of string data or number of bytes of binary data. |
| charindex(substr, str[, pos]) | Returns the position of the first occurrence of substr in str after position pos. |
| chr(expr) | Returns the character at the supplied UTF-16 code point. |
| concat(expr1, expr2[, …]) | Returns the concatenation of the arguments. |
| concat_ws(sep[, expr1[, …]]) | Returns the concatenation strings separated by sep. |
| contains(expr, subExpr) | Returns true if expr STRING or BINARY contains subExpr. |
| crc32(expr) | Returns a cyclic redundancy check value of expr. |
| decode(expr, charSet) | Translates binary expr to a string using the character set encoding charSet. |
| encode(expr, charSet) | Returns the binary representation of a string using the charSet character encoding. |
| endswith(expr, endExpr) | Returns true if expr STRING or BINARY ends with endExpr. |
| find_in_set(searchExpr, sourceExpr) | Returns the position of a string within a comma-separated list of strings. |
| format_number(expr, scale) | Formats expr like #,###,###.##, rounded to scale decimal places. |
| format_number(expr, fmt) | Formats expr like fmt. |
| format_string(strfmt[, obj1 [, …]]) | Returns a formatted string from printf-style format strings. |
| hex(expr) | Converts expr to hexadecimal. |
| str ilike (pattern[ESCAPE escape]) | Returns true if str matches pattern with escape case insensitively. |
| initcap(expr) | Returns expr with the first letter of each word in uppercase. |
| instr(str, substr) | Returns the (1-based) index of the first occurrence of substr in str. |
| lcase(expr) | Returns expr with all characters changed to lowercase. |
| left(str, len) | Returns the leftmost len characters from str. |
| len(expr) | Returns the character length of string data or number of bytes of binary data. |
| length(expr) | Returns the character length of string data or number of bytes of binary data. |
| levenshtein(str1, str2) | Returns the Levenshtein distance between the strings str1 and str2. |
| str like (pattern[ESCAPE escape]) | Returns true if str matches pattern with escape. |
| locate(substr, str[, pos]) | Returns the position of the first occurrence of substr in str after position pos. |
| lower(expr) | Returns expr with all characters changed to lowercase. |
| lpad(expr, len[, pad]) | Returns expr, left-padded with pad to a length of len. |
| ltrim([trimstr,] str) | Returns str with leading characters within trimStr removed. |
| mask(str[, uChar[, lChar[, dChar[, oChar]]]]) | Returns a masked version of the input str. |
| md5(expr) | Returns an MD5 128-bit checksum of expr as a hex string. |
| octet_length(expr) | Returns the byte length of string data or number of bytes of binary data. |
| overlay(input PLACING replace FROM pos [FOR len]) | Replaces input with replace that starts at pos and is of length len. |
| parse_url(url, partToExtract[, key]) | Extracts a part from url. |
| position(substr, str[, pos]) | Returns the position of the first occurrence of substr in str after position pos. |
| position(subtr IN str) | Returns the position of the first occurrence of substr in str after position pos. |
| printf(strfmt[, obj1 [, …]]) | Returns a formatted string from printf-style format strings. |
| str regexp regex | Returns true if str matches regex. |
| str regexp_like regex | Returns true if str matches regex. |
| regexp_count(str, regexp) | Returns the number of times str matches the regexp pattern. |
| regexp_extract(str, regexp[, idx]) | Extracts the first string in str that matches the regexp expression and corresponds to the regex group index. |
| regexp_extract_all(str, regexp[, idx]) | Extracts the all strings in str that matches the regexp expression and corresponds to the regex group index. |
| regexp_instr(str, regexp) | Returns the position of the first substring in str that matches regexp. |
| regexp_replace(str, regexp, rep[, position]) | Replaces all substrings of str that match regexp with rep. |
| regexp_substr(str, regexp) | Returns the first substring in str that matches regexp. |
| repeat(expr, n) | Returns the string that repeats expr n times. |
| replace(str, search [, replace]) | Replaces all occurrences of search with replace. |
| reverse(expr) | Returns a reversed string or an array with reverse order of elements. |
| right(str, len) | Returns the rightmost len characters from the string str. |
| str rlike regex | Returns true if str matches regex. |
| rpad(expr, len[, pad]) | Returns expr, right-padded with pad to a length of len. |
| rtrim([trimStr,] str) | Returns str with trailing characters removed. |
| sentences(str[, lang, country]) | Splits str into an array of array of words. |
| sha(expr) | Returns a sha1 hash value as a hex string of expr. |
| sha1(expr) | Returns a sha1 hash value as a hex string of expr. |
| sha2(expr, bitLength) | Returns a checksum of the SHA-2 family as a hex string of expr. |
| soundex(expr) | Returns the soundex code of the string. |
| space(n) | Returns a string consisting of n spaces. |
| split(str, regex[, limit]) | Splits str around occurrences that match regex and returns an array with a length of at most limit. |
| split_part(str, delim, partNum) | Splits str around occurrences of delim and returns the partNum part. |
| startswith(expr, startExpr) | Returns true if expr STRING or BINARY starts with startExpr. |
| string(expr) | Casts the value expr to STRING. |
| substr(expr, pos[, len]) | Returns the substring of expr that starts at pos and is of length len. |
| substr(expr FROM pos[ FOR len]) | Returns the substring of expr that starts at pos and is of length len. |
| substring(expr, pos[, len]) | Returns the substring of expr that starts at pos and is of length len. |
| substring(expr FROM pos[ FOR len]) | Returns the substring of expr that starts at pos and is of length len. |
| substring_index(expr, delim, count) | Returns the substring of expr before count occurrences of the delimiter delim. |
| to_binary(expr[, fmt]) | Returns expr cast to a Binary based on fmt. |
| to_char(numExpr, fmt) | Returns numExpr cast to STRING using formatting fmt.” |
| to_varchar(numExpr, fmt) | Returns numExpr cast to STRING using formatting fmt.” |
| translate(expr, from, to) | Returns an expr where all characters in from have been replaced with those in to. |
| trim([[BOTH | LEADING | TRAILING] [trimStr] FROM] str) | Trim characters from a string. |
| try_aes_decrypt(expr, key[, mode[, padding]]) | Decrypts a binary expr using AES encryption, and return NULL in case of error. |
| try_to_binary(expr [, fmt]) | Returns expr cast to BINARY based on fmt, or NULL if the input is invalid. |
| ucase(expr) | Returns expr with all characters changed to uppercase. |
| unbase64(expr) | Returns a decoded base64 string as binary. |
| unhex(expr) | Converts hexadecimal expr to BINARY. |
| upper(expr) | Returns expr with all characters changed to uppercase. |
| url_decode(str) | Translates a string back from application/x-www-form-urlencoded format. |
| url_encode(str) | Translates a string into application/x-www-form-urlencoded format. |
Numeric scalar functions
| Function | Description |
|---|---|
| ~ expr | Returns the bitwise NOT of expr. |
| dividend / divisor | Returns dividend divided by divisor. |
| expr1 | expr2 | Returns the bitwise OR of expr1 and expr2. |
| - expr | Returns the negated value of expr. |
| expr1 - expr2 | Returns the subtraction of expr2 from expr1. |
| + expr | Returns the value of expr. |
| expr1 + expr2 | Returns the sum of expr1 and expr2. |
| dividend % divisor | Returns the remainder after dividend / divisor. |
| expr1 ^ expr2 | Returns the bitwise exclusive OR (XOR) of expr1 and expr2. |
| expr1 & expr2 | Returns the bitwise AND of expr1 and expr2. |
| multiplier * multiplicand | Returns multiplier multiplied by multiplicand. |
| abs(expr) | Returns the absolute value of the numeric value in expr. |
| acos(expr) | Returns the inverse cosine (arccosine) of expr. |
| acosh(expr) | Returns the inverse hyperbolic cosine of expr. |
| asin(expr) | Returns the inverse sine (arcsine) of expr. |
| asinh(expr) | Returns the inverse hyperbolic sine of expr. |
| atan(expr) | Returns the inverse tangent (arctangent) of expr. |
| atan2(exprY, exprX) | Returns the angle in radians between the positive x-axis of a plane and the point specified by the coordinates (exprX, exprY). |
| atanh(expr) | Returns inverse hyperbolic tangent of expr. |
| bigint(expr) | Casts the value expr to BIGINT. |
| bit_count(expr) | Returns the number of bits set in the argument. |
| bit_get(expr, pos) | Returns the value of a bit in a binary representation of an integral numeric. |
| bit_reverse(expr) | Returns the value obtained by reversing the order of the bits in the argument. |
| bround(expr[,targetScale]) | Returns the rounded expr using HALF_EVEN rounding mode. |
| cbrt(expr) | Returns the cube root of expr. |
| ceil(expr[,targetScale]) | Returns the smallest number not smaller than expr rounded up to targetScale digits relative to the decimal point. |
| ceiling(expr[,targetScale]) | Returns the smallest number not smaller than expr rounded up to targetScale digits relative to the decimal point. |
| conv(num, fromBase, toBase) | Converts num from fromBase to toBase. |
| convert_timezone([sourceTz, ]targetTz, sourceTs) | Converts the TIMESTAMP_NTZ sourceTs from the sourceTz time zone to targetTz. |
| cos(expr) | Returns the cosine of expr. |
| cosh(expr) | Returns the hyperbolic cosine of expr. |
| cot(expr) | Returns the cotangent of expr. |
| csc(expr) | Returns the cosecant of expr. |
| decimal(expr) | Casts the value expr to DECIMAL. |
| degrees(expr) | Converts radians to degrees. |
| divisor div dividend | Returns the integral part of the division of divisor by dividend. |
| double(expr) | Casts the value expr to DOUBLE. |
| e() | Returns the constant e. |
| exp(expr) | Returns e to the power of expr. |
| expm1(expr) | Returns exp(expr) - 1. |
| factorial(expr) | Returns the factorial of expr. |
| float(expr) | Casts the value expr to FLOAT. |
| floor(expr[,targetScale]) | Returns the largest number not smaller than expr rounded down to targetScale digits relative to the decimal point. |
| getbit(expr, pos) | Returns the value of a bit in a binary representation of an integral numeric. |
| hypot(expr1, expr2) | Returns sqrt(expr1 * expr1 + expr2 * expr2). |
| int(expr) | Casts the value expr to INTEGER. |
| isnan(expr) | Returns true if expr is NaN. |
| ln(expr) | Returns the natural logarithm (base e) of expr. |
| log([base,] expr) | Returns the logarithm of expr with base. |
| log1p(expr) | Returns log(1 + expr). |
| log2(expr) | Returns the logarithm of expr with base 2. |
| log10(expr) | Returns the logarithm of expr with base 10. |
| mod(dividend, divisor) | Returns the remainder after dividend / divisor. |
| nanvl(expr1, expr2) | Returns expr1 if it’s not NaN, or expr2 otherwise. |
| negative(expr) | Returns the negated value of expr. |
| pi() | Returns pi. |
| pmod(dividend, divisor) | Returns the positive remainder after dividend / divisor. |
| positive(expr) | Returns the value of expr. |
| pow(expr1, expr2) | Raises expr1 to the power of expr2. |
| power(expr1, expr2) | Raises expr1 to the power of expr2. |
| radians(expr) | Converts expr in degrees to radians. |
| rand([seed]) | Returns a random value between 0 and 1. |
| randn([seed]) | Returns a random value from a standard normal distribution. |
| random([seed]) | Returns a random value between 0 and 1. |
| rint(expr) | Returns expr rounded to a whole number as a DOUBLE. |
| round(expr[,targetScale]) | Returns the rounded expr using HALF_UP rounding mode. |
| sec(expr) | Returns the secant of expr. |
| sin(expr) | Returns the sine of expr. |
| shiftleft(expr, n) | Returns a bitwise left shifted by n bits. |
| shiftright(expr, n) | Returns a bitwise signed signed integral number right shifted by n bits. |
| shiftrightunsigned(expr, n) | Returns a bitwise unsigned signed integral number right shifted by n bits. |
| sign(expr) | Returns -1.0, 0.0, or 1.0 as expr is negative, 0, or positive. |
| signum(expr) | Returns -1.0, 0.0, or 1.0 as expr is negative, 0, or positive. |
| sinh(expr) | Returns the hyperbolic sine of expr. |
| smallint(expr) | Casts the value expr to SMALLINT. |
| sqrt(expr) | Returns the square root of expr. |
| tan(expr) | Returns the tangent of expr. |
| tanh(expr) | Returns the hyperbolic tangent of expr. |
| tinyint(expr) | Casts expr to TINYINT. |
| to_number(expr, fmt ) | Returns expr cast to DECIMAL using formatting fmt. |
| try_add(expr1, expr2) | Returns the sum of expr1 and expr2, or NULL in case of error. |
| try_divide(dividend, divisor) | Returns dividend divided by divisor, or NULL if divisor is 0. |
| try_multiply(multiplier, multiplicand) | Returns multiplier multiplied by multiplicand, or NULL on overflow. |
| try_subtract(expr1, expr2) | Returns the subtraction of expr2 from expr1, or NULL on overflow. |
| try_to_number(expr, fmt ) | Returns expr cast to DECIMAL using formatting fmt, or NULL if expr does not match the format. |
| width_bucket(expr, minExpr, maxExpr, numBuckets) | Returns the bucket number for a value in an equi-width histogram. |
Aggregate functions
| Function | Description |
|---|---|
| any(expr) | Returns true if at least one value of expr in the group is true. |
| any_value(expr[,ignoreNull]) | Returns any random value of expr for a group of rows. |
| approx_count_distinct(expr[,relativeSD]) | Returns the estimated number of distinct values in expr within the group. |
| approx_percentile(expr,percentage[,accuracy]) | Returns the approximate percentile of the expr within the group. |
| approx_top_k(expr[,k[,maxItemsTracked]]) | Returns the top k most frequently occurring item values in an expr along with their approximate counts. |
| array_agg(expr) | Returns an array consisting of all values in expr within the group. |
| avg(expr) | Returns the mean calculated from values of a group. |
| bit_and(expr) | Returns the bitwise AND of all input values in the group. |
| bit_or(expr) | Returns the bitwise OR of all input values in the group. |
| bit_xor(expr) | Returns the bitwise XOR of all input values in the group. |
| bool_and(expr) | Returns true if all values in expr are true within the group. |
| bool_or(expr) | Returns true if at least one value in expr is true within the group. |
| collect_list(expr) | Returns an array consisting of all values in expr within the group. |
| collect_set(expr) | Returns an array consisting of all unique values in expr within the group. |
| corr(expr1,expr2) | Returns Pearson coefficient of correlation between a group of number pairs. |
| count(*) | Returns the total number of retrieved rows in a group, including rows containing null. |
| count(expr[, …]) | Returns the number of rows in a group for which the supplied expressions are all non-null. |
| count_if(expr) | Returns the number of true values for the group in expr. |
| count_min_sketch(expr, epsilon, confidence, seed) | Returns a count-min sketch of all values in the group in expr with the epsilon, confidence and seed. |
| covar_pop(expr1,expr2) | Returns the population covariance of number pairs in a group. |
| covar_samp(expr1,expr2) | Returns the sample covariance of number pairs in a group. |
| every(expr) | Returns true if all values of expr in the group are true. |
| first(expr[,ignoreNull]) | Returns the first value of expr for a group of rows. |
| first_value(expr[,ignoreNull]) | Returns the first value of expr for a group of rows. |
| hll_sketch_agg(expr[,lgConfigK]) | Returns a HyperLogLog sketch used to approximate a distinct values count. |
| hll_union_agg(expr[,allowDifferentLgConfigK]) | Aggregates HyperLogLog sketches for a group of rows. |
| kurtosis(expr) | Returns the kurtosis value calculated from values of a group. |
| last(expr[,ignoreNull]) | Returns the last value of expr for the group of rows. |
| last_value(expr[,ignoreNull]) | Returns the last value of expr for the group of rows. |
| max(expr) | Returns the maximum value of expr in a group. |
| max_by(expr1,expr2) | Returns the value of an expr1 associated with the maximum value of expr2 in a group. |
| mean(expr) | Returns the mean calculated from values of a group. |
| median(expr) | Returns the median calculated from values of a group. |
| min(expr) | Returns the minimum value of expr in a group. |
| min_by(expr1, expr2) | Returns the value of an expr1 associated with the minimum value of expr2 in a group. |
| mode(expr) | Returns the most frequent, not NULL, value of expr in a group. |
| percentile(expr, percentage [,frequency]) | Returns the exact percentile value of expr at the specified percentage. |
| percentile_approx(expr,percentage[,accuracy]) | Returns the approximate percentile of the expr within the group. |
| percentile_cont(pct) WITHIN GROUP (ORDER BY key) | Returns the interpolated percentile of the key within the group. |
| percentile_disc(pct) WITHIN GROUP (ORDER BY key) | Returns the discrete percentile of the key within the group. |
| regr_avgx(yExpr, xExpr) | Returns the mean of xExpr calculated from values of a group where xExpr and yExpr are NOT NULL. |
| regr_avgy(yExpr, xExpr) | Returns the mean of yExpr calculated from values of a group where xExpr and yExpr are NOT NULL. |
| regr_count(yExpr, xExpr) | Returns the number of non-null value pairs yExpr, xExpr in the group. |
| regr_intercept(yExpr, xExpr) | Returns the intercept of the uni-variate linear regression line in a group where xExpr and yExpr are NOT NULL. |
| regr_r2(yExpr, xExpr) | Returns the coefficient of determination from values of a group where xExpr and yExpr are NOT NULL. |
| regr_slope(yExpr, xExpr) | Returns the slope of the linear regression line of non-null value pairs yExpr, xExpr in the group. |
| regr_sxx(yExpr, xExpr) | Returns the sum of squares of the xExpr values of a group where xExpr and yExpr are NOT NULL. |
| regr_sxy(yExpr, xExpr) | Returns the sum of products of yExpr and xExpr calculated from values of a group where xExpr and yExpr are NOT NULL. |
| regr_syy(yExpr, xExpr) | Returns the sum of squares of the yExpr values of a group where xExpr and yExpr are NOT NULL. |
| schema_of_json_agg(json[, options]) | Returns the combined schema of JSON strings in a group in DDL format. |
| skewness(expr) | Returns the skewness value calculated from values of a group. |
| some(expr) | Returns true if at least one value of expr in a group is true. |
| std(expr) | Returns the sample standard deviation calculated from the values within the group. |
| stddev(expr) | Returns the sample standard deviation calculated from the values within the group. |
| stddev_pop(expr) | Returns the population standard deviation calculated from values of a group. |
| stddev_samp(expr) | Returns the sample standard deviation calculated from values of a group. |
| sum(expr) | Returns the sum calculated from values of a group. |
| try_avg(expr) | Returns the mean calculated from values of a group, NULL if there is an overflow. |
| try_sum(expr) | Returns the sum calculated from values of a group, NULL if there is an overflow. |
| var_pop(expr) | Returns the population variance calculated from values of a group. |
| var_samp(expr) | Returns the sample variance calculated from values of a group. |
| variance(expr) | Returns the sample variance calculated from values of a group. |
Ranking window functions
| Function | Description |
|---|---|
| dense_rank() | Returns the rank of a value compared to all values in the partition. |
| ntile(n) | Divides the rows for each window partition into n buckets ranging from 1 to at most n. |
| percent_rank() | Computes the percentage ranking of a value within the partition. |
| rank() | Returns the rank of a value compared to all values in the partition. |
| row_number() | Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the window partition. |
Analytic window functions
| Function | Description |
|---|---|
| cume_dist() | Returns the position of a value relative to all values in the partition. |
| lag(expr[,offset[,default]]) | Returns the value of expr from a preceding row within the partition. |
| lead(expr[,offset[,default]]) | Returns the value of expr from a subsequent row within the partition. |
| nth_value(expr, offset[, ignoreNulls]) | Returns the value of expr at a specific offset in the window. |
Array functions
| Function | Description |
|---|---|
| arrayExpr[indexExpr] | Returns element at position indexExpr of ARRAY arrayExpr. |
| aggregate(expr,start,merge[,finish]) | Aggregates elements in an array using a custom aggregator. |
| array([expr [, …]]) | Returns an array with the elements in expr. |
| array_append(array, elem) | Returns array appended by elem. |
| array_compact(array) | Removes NULL values from array. |
| array_contains(array,value) | Returns true if array contains value. |
| array_distinct(array) | Removes duplicate values from array. |
| array_except(array1,array2) | Returns an array of the elements in array1 but not in array2. |
| array_insert(array, index, elem) | Returns an expanded array where elem is inserted at the index position. |
| array_intersect(array1,array2) | Returns an array of the elements in the intersection of array1 and array2. |
| array_join(array,delimiter[,nullReplacement]) | Concatenates the elements of array. |
| array_max(array) | Returns the maximum value in array. |
| array_min(array) | Returns the minimum value in array. |
| array_position(array,element) | Returns the position of the first occurrence of element in array. |
| array_prepend(array, elem) | Returns array prepended by elem. |
| array_remove(array,element) | Removes all occurrences of element from array. |
| array_repeat(element,count) | Returns an array containing element count times. |
| array_size(array) | Returns the number of elements in array. |
| array_sort(array,func) | Returns array sorted according to func. |
| array_union(array1,array2) | Returns an array of the elements in the union of array1 and array2 without duplicates. |
| arrays_overlap(array1, array2) | Returns true if the intersection of array1 and array2 is not empty. |
| arrays_zip(array1 [, …]) | Returns a merged array of structs in which the nth struct contains all Nth values of input arrays. |
| cardinality(expr) | Returns the size of expr. |
| concat(expr1, expr2 [, …]) | Returns the concatenation of the arguments. |
| element_at(arrayExpr, index) | Returns the element of an arrayExpr at index. |
| exists(expr, pred) | Returns true if pred is true for any element in expr. |
| explode(expr) | Returns rows by un-nesting expr. |
| explode_outer(expr) | Returns rows by un-nesting expr using outer semantics. |
| filter(expr,func) | Filters the array in expr using the function func. |
| flatten(arrayOfArrays) | Transforms an array of arrays into a single array. |
| forall(expr, predFunc) | Tests whether predFunc holds for all elements in the array. |
| get(arrayExpr, index) | Returns the element of an arrayExpr at index, starting at 0. |
| inline(expr) | Explodes an array of structs into a table. |
| inline_outer(expr) | Explodes an array of structs into a table with outer semantics. |
| posexplode(expr) | Returns rows by un-nesting the array with numbering of positions. |
| posexplode_outer(expr) | Returns rows by un-nesting the array with numbering of positions using OUTER semantics. |
| reduce(expr,start,merge[,finish]) | Aggregates elements in an array using a custom aggregator. |
| reverse(array) | Returns a reversed string or an array with reverse order of elements. |
| sequence(start,stop,step) | Generates an array of elements from start to stop (inclusive), incrementing by step. |
| shuffle(array) | Returns a random permutation of the array in expr. |
| size(expr) | Returns the cardinality of expr. |
| slice(expr,start,length) | Returns a subset of an array. |
| sort_array(expr[,ascendingOrder]) | Returns the array in expr in sorted order. |
| transform(expr, func) | Transforms elements in an array in expr using the function func. |
| try_element_at(arrayExpr, index) | Returns the element of an arrayExpr at index, or NULL if index is out of bound. |
| zip_with(expr1, expr2, func) | Merges the arrays in expr1 and expr2, element-wise, into a single array using func. |
Map functions
| Function | Description |
|---|---|
| mapExpr[keyExpr] | Returns value at keyExpr of MAP mapExpr. |
| cardinality(expr) | Returns the size of expr. |
| element_at(mapExpr, key) | Returns the value of mapExpr for key. |
| explode(expr) | Returns rows by un-nesting expr. |
| explode_outer(expr) | Returns rows by un-nesting expr using outer semantics. |
| map([{key1, value1}[, …]]) | Creates a map with the specified key-value pairs. |
| map_concat([expr1 [, …]]) | Returns the union of all expr map expressions. |
| map_contains_key(map, key) | Returns true if map contains key, false otherwise. |
| map_entries(map) | Returns an unordered array of all entries in map. |
| map_filter(expr, func) | Filters entries in the map in expr using the function func. |
| map_from_arrays(keys, values) | Creates a map with a pair of the keys and values arrays. |
| map_from_entries(expr) | Creates a map created from the specified array of entries. |
| map_keys(map) | Returns an unordered array containing the keys of map. |
| map_values(map) | Returns an unordered array containing the values of map. |
| map_zip_with(map1, map2, func) | Merges map1 and map2 into a single map. |
| size(expr) | Returns the cardinality of expr. |
| str_to_map(expr[,pairDelim[,keyValueDelim]]) | Returns a map after splitting expr into key-value pairs using delimiters. |
| transform_keys(expr, func) | Transforms keys in a map in expr using the function func. |
| transform_values(expr, func) | Transforms values in a map in expr using the function func. |
| try_element_at(mapExpr, key) | Returns the value of mapExpr for key, or NULL if key does not exist. |
Date, timestamp, and interval functions
For information on date and timestamp formats, see Datetime patterns.
| Function | Description |
|---|---|
| intervalExpr / divisor | Returns interval divided by divisor. |
| - intervalExpr | Returns the negated value of intervalExpr. |
| intervalExpr1 - intervalExpr2 | Returns the subtraction of intervalExpr2 from intervalExpr1. |
| datetimeExpr1 - datetimeExpr2 | Returns the subtraction of datetimeExpr2 from datetimeExpr1. |
| + intervalExpr | Returns the value of intervalExpr. |
| intervalExpr1 + intervalExpr2 | Returns the sum of intervalExpr1 and intervalExpr2. |
| intervalExpr * multiplicand | Returns intervalExpr multiplied by multiplicand. |
| abs(expr) | Returns the absolute value of the interval value in expr. |
| add_months(startDate,numMonths) | Returns the date that is numMonths after startDate. |
| curdate() | Returns the current date at the start of query evaluation. |
| current_date() | Returns the current date at the start of query evaluation. |
| current_timestamp() | Returns the current timestamp at the start of query evaluation. |
| current_timezone() | Returns the current session local timezone. |
| date(expr) | Casts the value expr to DATE. |
| date_add(startDate,numDays) | Returns the date numDays after startDate. |
| date_add(unit, value, expr) | Adds value units to a timestamp expr. |
| date_diff(unit, start, stop) | Returns the difference between two timestamps measured in units. |
| date_format(expr,fmt) | Converts a timestamp to a string in the format fmt. |
| date_from_unix_date(days) | Creates a date from the number of days since 1970-01-01. |
| date_part(field,expr) | Extracts a part of the date, timestamp, or interval. |
| date_sub(startDate,numDays) | Returns the date numDays before startDate. |
| date_trunc(unit,expr) | Returns timestamp truncated to the unit specified in unit. |
| dateadd(startDate,numDays) | Returns the date numDays after startDate. |
| dateadd(unit, value, expr) | Adds value units to a timestamp expr. |
| datediff(endDate,startDate) | Returns the number of days from startDate to endDate. |
| datediff(unit, start, stop) | Returns the difference between two timestamps measured in units. |
| day(expr) | Returns the day of month of the date or timestamp. |
| dayofmonth(expr) | Returns the day of month of the date or timestamp. |
| dayofweek(expr) | Returns the day of week of the date or timestamp. |
| dayofyear(expr) | Returns the day of year of the date or timestamp. |
| divisor div dividend | Returns the integral part of the division of interval divisor by interval dividend. |
| extract(field FROM source) | Returns field of source. |
| from_unixtime(unixTime,fmt) | Returns unixTime in fmt. |
| from_utc_timestamp(expr,timezone) | Returns a timestamp in expr specified in UTC in the timezone timeZone. |
| hour(expr) | Returns the hour component of a timestamp. |
| last_day(expr) | Returns the last day of the month that the date belongs to. |
| make_date(year,month,day) | Creates a date from year, month, and day fields. |
| make_dt_interval([days[, hours[, mins[, secs]]]]) | Creates an day-time interval from days, hours, mins and secs. |
| make_interval(years, months, weeks, days, hours, mins, secs) | Deprecated: Creates an interval from years, months, weeks, days, hours, mins and secs. |
| make_timestamp(year,month,day,hour,min,sec[,timezone]) | Creates a timestamp from year, month, day, hour, min, sec, and timezone fields. |
| make_ym_interval([years[, months]]) | Creates a year-month interval from years, and months. |
| minute(expr) | Returns the minute component of the timestamp in expr. |
| month(expr) | Returns the month component of the timestamp in expr. |
| months_between(expr1,expr2[,roundOff]) | Returns the number of months elapsed between dates or timestamps in expr1 and expr2. |
| next_day(expr,dayOfWeek) | Returns the first date which is later than expr and named as in dayOfWeek. |
| now() | Returns the current timestamp at the start of query evaluation. |
| quarter(expr) | Returns the quarter of the year for expr in the range 1 to 4. |
| second(expr) | Returns the second component of the timestamp in expr. |
| session_window(expr, gpDuration) | Creates a session-window over a timestamp expression. |
| sign(expr) | Returns -1.0, 0.0, or 1.0 as interval expr is negative, 0, or positive. |
| signum(expr) | Returns -1.0, 0.0, or 1.0 as interval expr is negative, 0, or positive. |
| timestamp(expr) | Casts expr to TIMESTAMP. |
| timestamp_micros(expr) | Creates a timestamp expr microseconds since UTC epoch. |
| timestamp_millis(expr) | Creates a timestamp expr milliseconds since UTC epoch. |
| timestamp_seconds(expr) | Creates timestamp expr seconds since UTC epoch. |
| timestampadd(unit, value, expr) | Adds value units to a timestamp expr. |
| timestampdiff(unit, start, stop) | Returns the difference between two timestamps measured in units. |
| to_date(expr[,fmt]) | Returns expr cast to a date using an optional formatting. |
| to_timestamp(expr[,fmt]) | Returns expr cast to a timestamp using an optional formatting. |
| to_unix_timestamp(expr[,fmt]) | Returns the timestamp in expr as a UNIX timestamp. |
| to_utc_timestamp(expr,timezone) | Returns the timestamp in expr in a different timezone as UTC. |
| trunc(expr, fmt) | Returns a date with the a portion of the date truncated to the unit specified by the format model fmt. |
| try_add(expr1, expr2) | Returns the sum of expr1 and expr2, or NULL in case of error. |
| try_divide(dividend, divisor) | Returns dividend divided by divisor, or NULL if divisor is 0. |
| try_multiply(multiplier, multiplicand) | Returns multiplier multiplied by multiplicand, or NULL on overflow. |
| try_subtract(expr1, expr2) | Returns the subtraction of expr2 from expr1, or NULL on overflow. |
| try_to_timestamp(expr[,fmt]) | Returns expr cast to a timestamp using an optional formatting, or NULL if the cast fails. |
| unix_date(expr) | Returns the number of days since 1970-01-01. |
| unix_micros(expr) | Returns the number of microseconds since 1970-01-01 00:00:00 UTC. |
| unix_millis(expr) | Returns the number of milliseconds since 1970-01-01 00:00:00 UTC. |
| unix_seconds(expr) | Returns the number of seconds since 1970-01-01 00:00:00 UTC. |
| unix_timestamp([expr[, fmt]]) | eturns the UNIX timestamp of current or specified time. |
| weekday(expr) | Returns the day of the week of expr. |
| weekofyear(expr) | Returns the week of the year of expr. |
| year(expr) | Returns the year component of expr. |
| window(expr, width[, step[, start]]) | Creates a hopping based sliding-window over a timestamp expression. |
| window_time(window) | Returns the inclusive end time of a sliding-window produced by the window or session_window functions. |
H3 geospatial functions
For information about H3 geospatial functions, see H3 geospatial functions.
Cast functions and constructors
For information on casting between types, see cast function and try_cast function.
| Function | Description |
|---|---|
| array([expr [, …]]) | Returns an array with the elements in expr. |
| bigint(expr) | Casts the value expr to BIGINT. |
| binary(expr) | Casts the value of expr to BINARY. |
| boolean(expr) | Casts expr to Boolean. |
| cast(expr AS type) | Casts the value expr to the target data type type. |
| expr :: type | Casts the value expr to the target data type type. |
| date(expr) | Casts the value expr to DATE. |
| decimal(expr) | Casts the value expr to DECIMAL. |
| double(expr) | Casts the value expr to DOUBLE. |
| float(expr) | Casts the value expr to FLOAT. |
| int(expr) | Casts the value expr to INTEGER. |
| make_date(year,month,day) | Creates a date from year, month, and day fields. |
| make_dt_interval([days[, hours[, mins[, secs]]]]) | Creates an day-time interval from days, hours, mins and secs. |
| make_interval(years, months, weeks, days, hours, mins, secs) | Creates an interval from years, months, weeks, days, hours, mins and secs. |
| make_timestamp(year,month,day,hour,min,sec[,timezone]) | Creates a timestamp from year, month, day, hour, min, sec, and timezone fields. |
| make_ym_interval([years[, months]]) | Creates a year-month interval from years, and months. |
| map([{key1, value1} [, …]]) | Creates a map with the specified key-value pairs. |
| named_struct({name1, val1} [, …]) | Creates a struct with the specified field names and values. |
| smallint(expr) | Casts the value expr to SMALLINT. |
| string(expr) | Casts the value expr to STRING. |
| struct(expr1 [, …]) | Creates a STRUCT with the specified field values. |
| tinyint(expr) | Casts expr to TINYINT. |
| timestamp(expr) | Casts expr to TIMESTAMP. |
| to_char(numExpr, fmt) | Returns numExpr cast to STRING using formatting fmt.” |
| to_date(expr[,fmt]) | Returns expr cast to a date using an optional formatting. |
| to_number(expr, fmt) | Returns expr cast to DECIMAL using formatting fmt. |
| to_timestamp(expr[,fmt]) | Returns expr cast to a timestamp using an optional formatting. |
| to_varchar(numExpr, fmt) | Returns numExpr cast to STRING using formatting fmt.” |
| try_cast(expr AS type) | Casts the value expr to the target data type type safely. |
| try_to_number(expr, fmt) | Returns expr cast to DECIMAL using formatting fmt, or NULL if expr is not a valid. |
CSV functions
| Function | Description |
|---|---|
| from_csv(csvStr, schema[, options]) | Returns a struct value with the csvStr and schema. |
| schema_of_csv(csv[, options]) | Returns the schema of a CSV string in DDL format. |
| to_csv(expr[, options]) | Returns a CSV string with the specified struct value. |
JSON functions
| Function | Description |
|---|---|
| jsonStr : jsonPath | Returns fields extracted from the jsonStr. |
| from_json(jsonStr, schema[, options]) | Returns a struct value with the jsonStr and schema. |
| get_json_object(expr, path) | Extracts a JSON object from path. |
| json_array_length(jsonArray) | Returns the number of elements in the outermost JSON array. |
| json_object_keys(jsonObject) | Returns all the keys of the outermost JSON object as an array. |
| json_tuple(jsonStr, path1 [, …]) | Returns multiple JSON objects as a tuple. |
| schema_of_json(jsonStr[, options]) | Returns the schema of a JSON string in DDL format. |
| schema_of_json_agg(jsonStr[, options]) | Returns the combined schema of JSON strings in a group in DDL format. |
| to_json(expr[, options]) | Returns a JSON string with the struct specified in expr. |
XPath functions
| Function | Description |
|---|---|
| xpath(xml, xpath) | Returns values within the nodes of xml that match xpath. |
| xpath_boolean(xml, xpath) | Returns true if the xpath expression evaluates to true, or if a matching node in xml is found. |
| xpath_double(xml, xpath) | Returns a DOUBLE value from an XML document. |
| xpath_float(xml, xpath) | Returns a FLOAT value from an XML document. |
| xpath_int(xml, xpath) | Returns a INTEGER value from an XML document. |
| xpath_long(xml, xpath) | Returns a BIGINT value from an XML document. |
| xpath_number(xml, xpath) | Returns a DOUBLE value from an XML document. |
| xpath_short(xml, xpath) | Returns a SHORT value from an XML document. |
| xpath_string(xml, xpath) | Returns the contents of the first XML node that matches the XPath expression. |
AI functions
| Function | Description |
|---|---|
| ai_generate_text(prompt, modelName[, param1, value1] […]) | Returns text generated by a selected large language model (LLM) given the prompt. |
| ai_query(endpointName, request, [, param1, value1] […]) | Invokes an existing Databricks Model Serving endpoint and parses and returns its response. |
Read functions
| Function | Description |
|---|---|
| read_files(path, [optionKey => optionValue] [, …]) | Reads data files on cloud storage and returns it in tabular format. |
| read_kafka([optionKey => optionValue] [, …]) | Reads records from an Apache Kafka cluster and returns it in tabular format. |
Miscellaneous functions
| Function | Description |
|---|---|
| assert_true(expr) | Returns an error if expr is not true. |
| CASE expr { WHEN opt1 THEN res1 } […] [ELSE def] END | Returns resN for the first optN that equals expr or def if none matches. |
| CASE { WHEN cond1 THEN res1 } […] [ELSE def] END | Returns resN for the first condN that evaluates to true, or def if none found. |
| cloud_files_state( { TABLE(table) | checkpoint } ) | |
| coalesce(expr1, expr2 [, …]) | Returns the first non-null argument. |
| cube (expr1 [, …]) | Creates a multi-dimensional cube using the specified expression columns. |
| current_catalog() | Returns the current catalog. |
| current_database() | Returns the current schema. |
| current_metastore() | Returns the current Unity Catalog Metastore id. |
| current_schema() | Returns the current schema. |
| current_user() | Returns the current user. |
| current_version() | Returns the current version of Azure Databricks. |
| decode(expr, { key, value } [, …] [,defValue]) | Returns the value matching the key. |
| elt(index, expr1 [, …] ) | Returns the nth expression. |
| equal_null(expr1, expr2) | Returns true if expr1 equals expr2 or both expressions are NULL, or false otherwise. |
| event_log( { TABLE(table) | pipeline_id } ) | |
| greatest(expr1, expr2 [, …]) | Returns the largest value of all arguments, skipping null values. |
| grouping(col) | Indicates whether a specified column in a GROUPING SET, ROLLUP, or CUBE represents a subtotal. |
| grouping_id([col1 [, …]]) | Returns the level of grouping for a set of columns. |
| hash(expr1 [, …]) | Returns a hashed value of the arguments. |
| hll_sketch_estimate(expr) | Etimates number of distinct values collected in a HyperLogLog sketch. |
| hll_union(expr1, expr2 [,allowDifferentLgConfigK]) | Combines two HyperLogLog sketches. |
| java_method(class, method[, arg1 [, …]]) | Calls a method with reflection. |
| if(cond, expr1, expr2) | Returns expr1 if cond is true, or expr2 otherwise. |
| iff(cond, expr1, expr2) | Returns expr1 if cond is true, or expr2 otherwise. |
| ifnull(expr1, expr2) | Returns expr2 if expr1 is NULL, or expr1 otherwise. |
| input_file_block_length() | Returns the length in bytes of the block being read. |
| input_file_block_start() | Returns the start offset in bytes of the block being read. |
| input_file_name() | Returns the name of the file being read, or empty string if not available. |
| is_account_group_member(group) | Returns true if the current user is a member of group at the account level. |
| is_member(group) | Returns true if the current user is a member of group at the workspace level. |
| isnull(expr) | Returns true if expr is NULL. |
| isnotnull(expr) | Returns true if expr is not NULL. |
| least(expr1 [, …]) | Returns the smallest value of all arguments, skipping null values. |
| list_secrets() | Returns the keys which the user is authorized to see from Databricks secret service. |
| luhn_check(numStr) | Returns true if numStr passes the Luhn algorithm check. |
| monotonically_increasing_id() | Returns monotonically increasing 64-bit integers. |
| nullif(expr1, expr2) | Returns NULL if expr1 equals expr2, or expr1 otherwise. |
| nvl(expr1, expr2) | Returns expr2 if expr1 is NULL, or expr1 otherwise. |
| nvl2(expr1, expr2, expr3) | Returns expr2 if expr1 is not NULL, or expr3 otherwise. |
| raise_error(expr) | Throws an exception with expr as the message. |
| range(end) | Returns a table of values within a specified range. |
| range(start, end [, step [, numParts]]) | Returns a table of values within a specified range. |
| reflect(class, method[, arg1 [, …]]) | Calls a method with reflection. |
| secret(scope, key) | Extracts a secret value with the given scope and key from Databricks secret service. |
| spark_partition_id() | Returns the current partition ID. |
| sql_keywords() | Returns the set of SQL keywords in Azure Databricks. |
| stack(numRows, expr1 [, …]) | Separates expr1, …, exprN into numRows rows. |
| table_changes(table_str, start [, end]) | Returns a log of changes to a Delta Lake table with Change Data Feed enabled. |
| typeof(expr) | Return a DDL-formatted type string for the data type of expr. |
| user() | Returns the current user. |
| uuid() | Returns an universally unique identifier (UUID) string. |
| window(expr, width[, step [, start]]) | Creates a hopping based sliding-window over a timestamp expression. |
| xxhash64(expr1 [, …]) | Returns a 64-bit hashed value of the arguments. |
| version() | Returns the Apache Spark version. |
Feedback
Submit and view feedback for

Formed in 2009, the Archive Team (not to be confused with the archive.org Archive-It Team) is a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the sake of history and digital heritage. The group is 100% composed of volunteers and interested parties, and has expanded into a large amount of related projects for saving online and digital history.
