RDD.filter(f)

Return a new RDD containing only the elements that satisfy a predicate.

New in version 0.7.0.

Parameters
    f : function
        a predicate to run on each element of the RDD; an element is kept when f returns a true value for it

Returns
    RDD
        a new RDD containing only the elements for which f returns a true value

Examples
>>> rdd = sc.parallelize([1, 2, 3, 4, 5])
>>> rdd.filter(lambda x: x % 2 == 0).collect()
[2, 4]
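The following is a minimal pure-Python sketch of the filtering semantics. It is a model for illustration, not Spark's implementation. It assumes an RDD can be pictured as a list of partitions, and `filter_partitions` is a hypothetical helper name that does not exist in PySpark. The point it illustrates is that the predicate is applied inside each partition, so the number of partitions stays the same even when some of them end up empty.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def filter_partitions(
    partitions: Iterable[List[T]], f: Callable[[T], bool]
) -> List[List[T]]:
    """Hypothetical helper: apply predicate f inside each partition independently."""
    # Each partition is filtered on its own; elements never move between partitions.
    return [[x for x in part if f(x)] for part in partitions]


# Stand-in for sc.parallelize([1, 2, 3, 4, 5], 3). This particular split is an
# assumption chosen for illustration.
partitions = [[1], [2, 3], [4, 5]]

filtered = filter_partitions(partitions, lambda x: x % 2 == 0)

# Flattening the partitions plays the role of collect().
collected = [x for part in filtered for x in part]

print(filtered)   # [[], [2], [4]]
print(collected)  # [2, 4]
```

In Spark itself, `filter` is a lazy transformation: nothing is computed until an action such as `collect()` runs. If a selective predicate leaves many partitions empty or nearly empty, calling `coalesce()` on the result may be worth considering.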