NEWBIE QUESTIION: Comparing Lists in Python
Alex Martelli
aleax at aleax.it
Thu May 2 05:28:58 EDT 2002
More information about the Python-list mailing list
Thu May 2 05:28:58 EDT 2002
- Previous message (by thread): NEWBIE QUESTIION: Comparing Lists in Python
- Next message (by thread): NEWBIE QUESTIION: Comparing Lists in Python
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
logistix wrote: > The most general way would be to just copy the list instead of using a > dict and remove the elements: > >>>> diff = list1[:] >>>> for item in diff: > ... if item in list2: > ... diff.remove(item) > ... >>>> diff > [1, 2, 3] If the length of both lists are proportional to some number N, this algorithm will take a time of O(N*N*N) -- *cubic*. One factor of N comes from the for statement, one from operator 'in' when applied to a list on the RHS, and one from the remove operation. > List comprehensions would be my favorite (although many will disagree): > >>>> list1 = [1,2,3,4,5] >>>> list2 = [4,5,6,7,8] >>>> returnList1 = [val for val in list1 if val not in list2] >>>> returnList2 = [val for val in list2 if val not in list1] This is O(N*N), since you don't have the N factor from the remove. The way to get O(N) behavior is to construct auxiliary dictionaries: dict1 = dict(zip(list1,list1)) returnList2 = [x for x in list2 if x not in dict1] "x not in C" takes O(N) time if container C is a sequence with N elements, but (roughly) O(1) time if C is a dictionary. Building a dictionary of N items is roughly O(N). Premature optimization is the root of all evil (Knuth) -- but that basically means MICRO-optimization, trying to shave 20% or 80% off some algorithm for some fixed big-O behavior. Getting the right big-O behavior is generally not a wasted effort, unless you can guarantee that all containers around will be quite small. Python generally makes getting sane big-O behavior quite easy, thanks to the superb performance of dictionaries above all (and, in other contexts, the great performance of list's sort method, and a few other typical hotspots). You do have to know the big tricks and pitfalls, such as "x in sequence", somelist.remove, building strings out of pieces with + or += -- these are the big three typical timesinks of hastily written Python code. Alex
- Previous message (by thread): NEWBIE QUESTIION: Comparing Lists in Python
- Next message (by thread): NEWBIE QUESTIION: Comparing Lists in Python
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Python-list mailing list