removing the header from a gzip'd string
debarchana.ghosh at gmail.com
Sat Dec 23 12:44:50 EST 2006
More information about the Python-list mailing list
Bjoern Schliessmann wrote:
> Rajarshi wrote:
>
> > Does anybody know how I can remove the header portion of the
> > compressed bytes, such that I only have the compressed data
> > remaining? (Obviously I do not intend to perform the
> > decompression!)
>
> Just curious: What's your goal? :) A home made hash function?

Actually I was implementing the normalized compression distance (NCD) to evaluate molecular similarity, as described in an article in J. Chem. Inf. Model. (http://dx.doi.org/10.1021/ci600384z, subscriber access only, unfortunately).

Essentially, they note that the NCD does not always behave like a metric. One reason they put forward is that the header portion (they were using the command line gzip and bzip2 programs) may be large compared to the strings being compressed, which are on average 48 bytes long.

So I was interested to see whether the NCD behaves like a metric if I remove everything that is not the compressed string. And since I only need to calculate the similarity between two strings, I do not need to do any decompression.
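For what it's worth, here is a minimal sketch of one way to try this without slicing bytes off gzip output. It rests on two assumptions that are not from the article: it uses the zlib module with a negative wbits value, which asks for a raw deflate stream with no header or checksum trailer, and it uses the usual definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The function names and the example strings are made up for illustration, and the results will not match the command line gzip numbers byte for byte.

```python
import zlib


def raw_deflate(data: bytes, level: int = 9) -> bytes:
    """Compress data to a raw deflate stream.

    A negative wbits value (-15 here) tells zlib to emit no header
    and no checksum trailer, so only the compressed data remains.
    """
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance using headerless deflate sizes.

    Assumes the usual definition:
    (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    """
    cx = len(raw_deflate(x))
    cy = len(raw_deflate(y))
    cxy = len(raw_deflate(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Hypothetical inputs: two short SMILES-like strings.
    a = b"CC(=O)OC1=CC=CC=C1C(=O)O"
    b = b"CN1C=NC2=C1C(=O)N(C(=O)N2C)C"
    print("header + trailer bytes saved vs zlib format:",
          len(zlib.compress(a, 9)) - len(raw_deflate(a)))
    print("NCD(a, a) =", ncd(a, a))
    print("NCD(a, b) =", ncd(a, b))
```

Since only the lengths matter for the NCD, nothing ever has to be decompressed; the raw stream can still be decoded with zlib.decompress(data, -15) if you want to check it.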