mbox despamming script
Michael Hudson
mwh at python.net
Thu Nov 27 07:21:05 EST 2003
More information about the Python-list mailing list
Paul Rubin <http://phr.cx@NOSPAM.invalid> writes:

> I was surprised there was no obvious way with spamassassin (maybe I
> shoulda looked at spambayes) to split an existing mbox file into its
> spam and non-spam messages. So I wrote one. It's pretty slow, taking
> around 1.5 seconds per message on a 2 ghz Athlon, making me wonder how
> serious ISP's getting thousands of incoming messages per hour can run
> anything like spamassassin on all of them. But for my purposes it's ok.
> Comments and improvements are welcome.

It's my experience that mailbox is pretty slow at reading mbox files.
I have memories of speeding up some mail-statistics gathering stuff by
a large amount by implementing my own mbox "parser" (basically
s.find('\n\nFrom ') or similar, I forget).

I'm not sure I'd like to use this approach on something less forgiving
than stats, though :-)

Cheers,
mwh

--
  59. In English every word can be verbed. Would that it were so in
      our programming languages.
        -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html
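
For illustration, the hand-rolled mbox "parser" mentioned above (searching for '\n\nFrom ') might look roughly like the sketch below. This is not the code from the post, which was not included; the function name split_mbox is made up here, and the whole file is assumed to fit in memory as one string.

```python
def split_mbox(text):
    """Split the contents of an mbox file into a list of raw message strings.

    Hypothetical helper: a new message is assumed to begin wherever a blank
    line is followed by a line starting with "From ".
    """
    sep = "\n\nFrom "
    messages = []
    start = 0
    while True:
        idx = text.find(sep, start)
        if idx == -1:
            break
        # Keep one trailing newline on the message just ended.
        messages.append(text[start:idx + 1])
        # Skip both newlines so the next message starts at its "From " line.
        start = idx + 2
    # Whatever is left after the last separator is the final message.
    if text[start:].strip():
        messages.append(text[start:])
    return messages
```

Usage would be something like split_mbox(open(path).read()). The lack of forgiveness is easy to see: a body paragraph that begins with an unescaped "From " after a blank line gets treated as the start of a new message, which may be tolerable for statistics but not for sorting real mail.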