Spark SQL, Built-in Functions
Spark SQL provides a comprehensive set of built-in functions for data manipulation and analysis. Functions are organized into the following categories:
Agg Functions (86)¶
any any_value approx_count_distinct approx_percentile approx_top_k approx_top_k_accumulate approx_top_k_combine array_agg avg bit_and bit_or bit_xor bitmap_and_agg bitmap_construct_agg bitmap_or_agg bool_and bool_or collect_list collect_set corr count count_if count_min_sketch covar_pop covar_samp every first first_value grouping grouping_id histogram_numeric hll_sketch_agg hll_union_agg kll_merge_agg_bigint kll_merge_agg_double kll_merge_agg_float kll_sketch_agg_bigint kll_sketch_agg_double kll_sketch_agg_float kurtosis last last_value listagg max max_by mean measure median min min_by mode percentile percentile_approx percentile_cont percentile_disc regr_avgx regr_avgy regr_count regr_intercept regr_r2 regr_slope regr_sxx regr_sxy regr_syy skewness some std stddev stddev_pop stddev_samp string_agg sum theta_intersection_agg theta_sketch_agg theta_union_agg try_avg try_sum tuple_intersection_agg_double tuple_intersection_agg_integer tuple_sketch_agg_double tuple_sketch_agg_integer tuple_union_agg_double tuple_union_agg_integer var_pop var_samp variance
Array Functions (25)¶
array array_append array_compact array_contains array_distinct array_except array_insert array_intersect array_join array_max array_min array_position array_prepend array_remove array_repeat array_size array_union arrays_overlap arrays_zip flatten get sequence shuffle slice sort_array
Avro Functions (3)¶
Bitwise Functions (13)¶
Collection Functions (18)¶
aggregate array_sort cardinality concat element_at exists filter forall map_filter map_zip_with reduce reverse size transform transform_keys transform_values try_element_at zip_with
Conditional Functions (12)¶
Conversion Functions (14)¶
Csv Functions (3)¶
Datetime Functions (81)¶
add_months convert_timezone curdate current_date current_time current_timestamp current_timezone date_add date_diff date_format date_from_unix_date date_part date_sub date_trunc dateadd datediff datepart day dayname dayofmonth dayofweek dayofyear extract from_unixtime from_utc_timestamp hour last_day localtimestamp make_date make_dt_interval make_interval make_time make_timestamp make_timestamp_ltz make_timestamp_ntz make_ym_interval minute month monthname months_between next_day now quarter second session_window time_diff time_from_micros time_from_millis time_from_seconds time_to_micros time_to_millis time_to_seconds time_trunc timestamp_micros timestamp_millis timestamp_seconds to_date to_time to_timestamp to_timestamp_ltz to_timestamp_ntz to_unix_timestamp to_utc_timestamp trunc try_make_interval try_make_timestamp try_make_timestamp_ltz try_make_timestamp_ntz try_to_date try_to_time try_to_timestamp unix_date unix_micros unix_millis unix_seconds unix_timestamp weekday weekofyear window window_time year
Generator Functions (9)¶
Hash Functions (7)¶
Json Functions (7)¶
Map Functions (9)¶
map map_concat map_contains_key map_entries map_from_arrays map_from_entries map_keys map_values str_to_map
Math Functions (68)¶
% * + - / abs acos acosh asin asinh atan atan2 atanh bin bround cbrt ceil ceiling conv cos cosh cot csc degrees div e exp expm1 factorial floor greatest hex hypot least ln log log10 log1p log2 mod negative pi pmod positive pow power radians rand randn random rint round sec sign signum sin sinh sqrt tan tanh try_add try_divide try_mod try_multiply try_subtract unhex uniform width_bucket
Misc Functions (25)¶
aes_decrypt aes_encrypt assert_true bitmap_bit_position bitmap_bucket_number bitmap_count current_catalog current_database current_schema current_user input_file_block_length input_file_block_start input_file_name java_method monotonically_increasing_id raise_error reflect session_user spark_partition_id try_aes_decrypt try_reflect typeof user uuid version
Predicate Functions (23)¶
! != < <= <=> <> = == > >= and equal_null ilike in isnan isnotnull isnull like not or regexp regexp_like rlike
Protobuf Functions (2)¶
Sketch Functions (40)¶
approx_top_k_estimate hll_sketch_estimate hll_union kll_sketch_get_n_bigint kll_sketch_get_n_double kll_sketch_get_n_float kll_sketch_get_quantile_bigint kll_sketch_get_quantile_double kll_sketch_get_quantile_float kll_sketch_get_rank_bigint kll_sketch_get_rank_double kll_sketch_get_rank_float kll_sketch_merge_bigint kll_sketch_merge_double kll_sketch_merge_float kll_sketch_to_string_bigint kll_sketch_to_string_double kll_sketch_to_string_float theta_difference theta_intersection theta_sketch_estimate theta_union tuple_difference_double tuple_difference_integer tuple_difference_theta_double tuple_difference_theta_integer tuple_intersection_double tuple_intersection_integer tuple_intersection_theta_double tuple_intersection_theta_integer tuple_sketch_estimate_double tuple_sketch_estimate_integer tuple_sketch_summary_double tuple_sketch_summary_integer tuple_sketch_theta_double tuple_sketch_theta_integer tuple_union_double tuple_union_integer tuple_union_theta_double tuple_union_theta_integer
St Functions (5)¶
String Functions (74)¶
ascii base64 bit_length btrim char char_length character_length chr collate collation concat_ws contains decode elt encode endswith find_in_set format_number format_string initcap instr is_valid_utf8 lcase left len length levenshtein locate lower lpad ltrim luhn_check make_valid_utf8 mask octet_length overlay position printf quote randstr regexp_count regexp_extract regexp_extract_all regexp_instr regexp_replace regexp_substr repeat replace right rpad rtrim sentences soundex space split split_part startswith substr substring substring_index to_binary to_char to_number to_varchar translate trim try_to_binary try_to_number try_validate_utf8 ucase unbase64 upper validate_utf8 ||
Struct Functions (2)¶
Table Functions (2)¶
Url Functions (5)¶
Variant Functions (10)¶
is_variant_null parse_json schema_of_variant schema_of_variant_agg to_variant_object try_parse_json try_variant_get variant_explode variant_explode_outer variant_get
Vector Functions (7)¶
vector_avg vector_cosine_similarity vector_inner_product vector_l2_distance vector_norm vector_normalize vector_sum