Spark SQL provides a comprehensive set of built-in functions for data manipulation and analysis. Functions are organized into the following categories:

Agg Functions (86)

any any_value approx_count_distinct approx_percentile approx_top_k approx_top_k_accumulate approx_top_k_combine array_agg avg bit_and bit_or bit_xor bitmap_and_agg bitmap_construct_agg bitmap_or_agg bool_and bool_or collect_list collect_set corr count count_if count_min_sketch covar_pop covar_samp every first first_value grouping grouping_id histogram_numeric hll_sketch_agg hll_union_agg kll_merge_agg_bigint kll_merge_agg_double kll_merge_agg_float kll_sketch_agg_bigint kll_sketch_agg_double kll_sketch_agg_float kurtosis last last_value listagg max max_by mean measure median min min_by mode percentile percentile_approx percentile_cont percentile_disc regr_avgx regr_avgy regr_count regr_intercept regr_r2 regr_slope regr_sxx regr_sxy regr_syy skewness some std stddev stddev_pop stddev_samp string_agg sum theta_intersection_agg theta_sketch_agg theta_union_agg try_avg try_sum tuple_intersection_agg_double tuple_intersection_agg_integer tuple_sketch_agg_double tuple_sketch_agg_integer tuple_union_agg_double tuple_union_agg_integer var_pop var_samp variance

Array Functions (25)

array array_append array_compact array_contains array_distinct array_except array_insert array_intersect array_join array_max array_min array_position array_prepend array_remove array_repeat array_size array_union arrays_overlap arrays_zip flatten get sequence shuffle slice sort_array

Avro Functions (3)

from_avro schema_of_avro to_avro

Bitwise Functions (13)

& << >> >>> ^ bit_count bit_get getbit shiftleft shiftright shiftrightunsigned | ~

Collection Functions (18)

aggregate array_sort cardinality concat element_at exists filter forall map_filter map_zip_with reduce reverse size transform transform_keys transform_values try_element_at zip_with

Conditional Functions (12)

between case coalesce if ifnull nanvl nullif nullifzero nvl nvl2 when zeroifnull

Conversion Functions (14)

bigint binary boolean cast date decimal double float int smallint string time timestamp tinyint

Csv Functions (3)

from_csv schema_of_csv to_csv

Datetime Functions (81)

add_months convert_timezone curdate current_date current_time current_timestamp current_timezone date_add date_diff date_format date_from_unix_date date_part date_sub date_trunc dateadd datediff datepart day dayname dayofmonth dayofweek dayofyear extract from_unixtime from_utc_timestamp hour last_day localtimestamp make_date make_dt_interval make_interval make_time make_timestamp make_timestamp_ltz make_timestamp_ntz make_ym_interval minute month monthname months_between next_day now quarter second session_window time_diff time_from_micros time_from_millis time_from_seconds time_to_micros time_to_millis time_to_seconds time_trunc timestamp_micros timestamp_millis timestamp_seconds to_date to_time to_timestamp to_timestamp_ltz to_timestamp_ntz to_unix_timestamp to_utc_timestamp trunc try_make_interval try_make_timestamp try_make_timestamp_ltz try_make_timestamp_ntz try_to_date try_to_time try_to_timestamp unix_date unix_micros unix_millis unix_seconds unix_timestamp weekday weekofyear window window_time year

Generator Functions (9)

collations explode explode_outer inline inline_outer posexplode posexplode_outer sql_keywords stack

Hash Functions (7)

crc32 hash md5 sha sha1 sha2 xxhash64

Json Functions (7)

from_json get_json_object json_array_length json_object_keys json_tuple schema_of_json to_json

Map Functions (9)

map map_concat map_contains_key map_entries map_from_arrays map_from_entries map_keys map_values str_to_map

Math Functions (68)

% * + - / abs acos acosh asin asinh atan atan2 atanh bin bround cbrt ceil ceiling conv cos cosh cot csc degrees div e exp expm1 factorial floor greatest hex hypot least ln log log10 log1p log2 mod negative pi pmod positive pow power radians rand randn random rint round sec sign signum sin sinh sqrt tan tanh try_add try_divide try_mod try_multiply try_subtract unhex uniform width_bucket

Misc Functions (25)

aes_decrypt aes_encrypt assert_true bitmap_bit_position bitmap_bucket_number bitmap_count current_catalog current_database current_schema current_user input_file_block_length input_file_block_start input_file_name java_method monotonically_increasing_id raise_error reflect session_user spark_partition_id try_aes_decrypt try_reflect typeof user uuid version

Predicate Functions (23)

! != < <= <=> <> = == > >= and equal_null ilike in isnan isnotnull isnull like not or regexp regexp_like rlike

Protobuf Functions (2)

from_protobuf to_protobuf

Sketch Functions (40)

approx_top_k_estimate hll_sketch_estimate hll_union kll_sketch_get_n_bigint kll_sketch_get_n_double kll_sketch_get_n_float kll_sketch_get_quantile_bigint kll_sketch_get_quantile_double kll_sketch_get_quantile_float kll_sketch_get_rank_bigint kll_sketch_get_rank_double kll_sketch_get_rank_float kll_sketch_merge_bigint kll_sketch_merge_double kll_sketch_merge_float kll_sketch_to_string_bigint kll_sketch_to_string_double kll_sketch_to_string_float theta_difference theta_intersection theta_sketch_estimate theta_union tuple_difference_double tuple_difference_integer tuple_difference_theta_double tuple_difference_theta_integer tuple_intersection_double tuple_intersection_integer tuple_intersection_theta_double tuple_intersection_theta_integer tuple_sketch_estimate_double tuple_sketch_estimate_integer tuple_sketch_summary_double tuple_sketch_summary_integer tuple_sketch_theta_double tuple_sketch_theta_integer tuple_union_double tuple_union_integer tuple_union_theta_double tuple_union_theta_integer

St Functions (5)

st_asbinary st_geogfromwkb st_geomfromwkb st_setsrid st_srid

String Functions (74)

ascii base64 bit_length btrim char char_length character_length chr collate collation concat_ws contains decode elt encode endswith find_in_set format_number format_string initcap instr is_valid_utf8 lcase left len length levenshtein locate lower lpad ltrim luhn_check make_valid_utf8 mask octet_length overlay position printf quote randstr regexp_count regexp_extract regexp_extract_all regexp_instr regexp_replace regexp_substr repeat replace right rpad rtrim sentences soundex space split split_part startswith substr substring substring_index to_binary to_char to_number to_varchar translate trim try_to_binary try_to_number try_validate_utf8 ucase unbase64 upper validate_utf8 ||

Struct Functions (2)

named_struct struct

Table Functions (2)

python_worker_logs range

Url Functions (5)

parse_url try_parse_url try_url_decode url_decode url_encode

Variant Functions (10)

is_variant_null parse_json schema_of_variant schema_of_variant_agg to_variant_object try_parse_json try_variant_get variant_explode variant_explode_outer variant_get

Vector Functions (7)

vector_avg vector_cosine_similarity vector_inner_product vector_l2_distance vector_norm vector_normalize vector_sum

Window Functions (9)

cume_dist dense_rank lag lead nth_value ntile percent_rank rank row_number

Xml Functions (12)

from_xml schema_of_xml to_xml xpath xpath_boolean xpath_double xpath_float xpath_int xpath_long xpath_number xpath_short xpath_string