Issue5036
Created on 2009-01-23 03:52 by tksmashiw, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Messages (10) | |||
|---|---|---|---|
| msg80398 - (view) | Author: Takeshi Matsuyama (tksmashiw) | Date: 2009-01-23 03:52 | |
When I make a dictionary by parsing "legacy-icon-mapping.xml" (which is a part of icon-naming-utils [http://tango.freedesktop.org/Tango_Icon_Library]) with the following script, three keys of the dictionary are truncated if the "buffer_text" attribute is False.
=====================
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import with_statement
import sys
from xml.parsers.expat import ParserCreate
import codecs

class Database:
    """Make a dictionary which is accessible by Database.dict"""
    def __init__(self, buffer_text):
        self.cnt = None
        self.name = None
        self.data = None
        self.dict = {}
        p = ParserCreate()
        p.buffer_text = buffer_text
        p.StartElementHandler = self.start_element
        p.EndElementHandler = self.end_element
        p.CharacterDataHandler = self.char_data
        with open("/usr/share/icon-naming-utils/legacy-icon-mapping.xml", 'r') as f:
            p.ParseFile(f)

    def start_element(self, name, attrs):
        if name == 'context':
            self.cnt = attrs["dir"]
        if name == 'icon':
            self.name = attrs["name"]

    def end_element(self, name):
        if name == 'link':
            self.dict[self.data] = (self.cnt, self.name)

    def char_data(self, data):
        self.data = data.strip()

def print_set(aset):
    for e in aset:
        print '\t' + e

if __name__ == '__main__':
    sys.stdout = codecs.getwriter('utf_8')(sys.stdout)
    map_false_dict = Database(False).dict
    map_true_dict = Database(True).dict
    print "The keys which exist if buffer_text=False but don't exist if buffer_text=True are"
    print_set(set(map_false_dict.keys()) - set(map_true_dict.keys()))
    print "The keys which exist if buffer_text=True but don't exist if buffer_text=False are"
    print_set(set(map_true_dict.keys()) - set(map_false_dict.keys()))
=====================
The result of running this script is
======================
The keys which exist if buffer_text=False but don't exist if buffer_text=True are
    rt-descending
    ock_text_right
    lc
The keys which exist if buffer_text=True but don't exist if buffer_text=False are
    stock_text_right
    gnome-mime-application-vnd.stardivision.calc
    gtk-sort-descending
======================
I confirmed it in Python-2.5.2 on Fedora 10. |
|||
| msg80432 - (view) | Author: Gabriel Genellina (ggenellina) | Date: 2009-01-24 01:31 | |
If the xml file is small enough, could you attach it to the issue? Or provide a download location? I could not find it myself (without downloading the whole package). (Note that Python 2.5 only gets security fixes now, so unless this still fails with 2.6 or later, this issue is likely to be closed.) |
|||
| msg80435 - (view) | Author: Takeshi Matsuyama (tksmashiw) | Date: 2009-01-24 04:10 | |
Thanks for the reply!
>If the xml file is small enough, could you attach it to the issue? Or
>provide a download location?
Sorry, I found it here.
http://webcvs.freedesktop.org/icon-theme/icon-naming-utils/legacy-icon-mapping.xml?revision=1.75&content-type=text%2Fplain&pathrev=1.75
>(Note that Python 2.5 only gets security fixes now, so unless this
>still fails with 2.6 or later, this issue is likely to be closed)
I roughly confirmed the same problem on python-3.0 on MS Windows 2 weeks ago, but I need to verify it more strictly... |
|||
| msg80438 - (view) | Author: HiroakiKawai (kawai) | Date: 2009-01-24 08:48 | |
The sample code has a bug. expat is OK.
The char_data method must append the incoming characters, because the
character sequence is buffered input and may be delivered in several pieces.
    def char_data(self, data):
        self.data += data
You should reset it by self.data = '' at end_element().
|
|||
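The point in msg80438 can be illustrated with a minimal sketch. It assumes Python 3 (the original script is Python 2), and the helper `collect_chunks` and the sample string are hypothetical names made up for this illustration. An entity reference is used here only because it is a simple way to make expat split one text node into several callbacks; in the original report the split most likely came from the parser's read buffer boundaries instead.

```python
from xml.parsers.expat import ParserCreate


def collect_chunks(xml_text, buffer_text):
    """Return every string passed to CharacterDataHandler, in call order."""
    chunks = []
    parser = ParserCreate()
    parser.buffer_text = buffer_text
    # Record each piece instead of overwriting the previous one.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_text, True)  # True: this is the final piece of input
    return chunks


# Hypothetical sample: a single text node containing an entity reference.
SAMPLE = "<link>gtk-sort&amp;descending</link>"

unbuffered = collect_chunks(SAMPLE, False)
buffered = collect_chunks(SAMPLE, True)

print("buffer_text=False:", unbuffered)
print("buffer_text=True: ", buffered)

# Joining the pieces recovers the full text in both cases; keeping only
# the last piece (as the original char_data did) would lose the rest.
print("".join(unbuffered) == "".join(buffered))
```

With `buffer_text` set to False the handler may be called several times for one text node, so assigning `self.data = data` keeps only the last fragment. That matches the truncated keys in the original report.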
| msg80449 - (view) | Author: Takeshi Matsuyama (tksmashiw) | Date: 2009-01-24 14:10 | |
Hi kawai. I got the correct output by modifying the code as you said, but I still cannot understand why this happens. Could you tell me more briefly, or point me to any documents about it? I can't find any note in the official documents saying that CharacterDataHandler should append the strings rather than just assign them. Does everyone know/understand it already? Am I the only one who is so stupid? (;;) |
|||
| msg80451 - (view) | Author: HiroakiKawai (kawai) | Date: 2009-01-24 14:25 | |
That's the spec of the XML SAX interface. |
|||
| msg80453 - (view) | Author: HiroakiKawai (kawai) | Date: 2009-01-24 14:54 | |
Please read "The ContentHandler.characters() callback is missing data!" http://www.saxproject.org/faq.html and close this issue :) |
|||
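The same rule applies to Python's own SAX binding, where `ContentHandler.characters()` may also be called more than once per text node. The following is a minimal sketch assuming Python 3; the class name `LinkCollector` and the sample document are hypothetical and not taken from the issue.

```python
import xml.sax


class LinkCollector(xml.sax.ContentHandler):
    """Collect the full text of every <link> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._parts = None  # None means "not inside a <link> element"

    def startElement(self, name, attrs):
        if name == "link":
            self._parts = []  # start a fresh accumulator

    def characters(self, content):
        # May be called several times for one text node, so append.
        if self._parts is not None:
            self._parts.append(content)

    def endElement(self, name):
        if name == "link":
            self.links.append("".join(self._parts))
            self._parts = None


handler = LinkCollector()
# Hypothetical sample document, not the real legacy-icon-mapping.xml.
xml.sax.parseString(b"<icon><link>gtk-sort&amp;descending</link></icon>", handler)
print(handler.links)
```

Collecting the pieces in a list and joining them once at the end of the element avoids depending on how the parser happens to split the text.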
| msg80454 - (view) | Author: Takeshi Matsuyama (tksmashiw) | Date: 2009-01-24 15:21 | |
A correction to my previous message: briefly -> in detail
>Please read "The ContentHandler.characters() callback is missing data!"
>http://www.saxproject.org/faq.html
I was just reading the above site. It is now very clear to me. Thanks kawai, and I'm sorry to have taken up your time, gagenellina. |
|||
| msg80638 - (view) | Author: Takeshi Matsuyama (tksmashiw) | Date: 2009-01-27 09:57 | |
From msg80438
>You should reset it by self.data = '' at end_element().
It seems that we should reset it at start_element() like this,
============================
def start_element(self, name, attrs):
    ...abbr...
    if name == 'link':
        self.data = ''
=============================
or unwanted \s, \t, and \n get mixed into "self.data". That's all, thanks. |
|||
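Putting msg80438 and msg80638 together, a corrected version of the handlers might look like the sketch below. It assumes Python 3 and parses an inline string rather than the real file; the function `build_mapping`, the `state` dictionary, and the sample XML (including its icon and link names) are hypothetical, chosen only to match the element and attribute names the original script reads.

```python
from xml.parsers.expat import ParserCreate

# Hypothetical sample in the shape the original script expects:
# <context dir=...> contains <icon name=...> contains <link>text</link>.
SAMPLE = """\
<mapping>
  <context dir="actions">
    <icon name="view-sort-descending">
      <link>gtk-sort-descending</link>
      <link>stock_sort-descending</link>
    </icon>
  </context>
</mapping>
"""


def build_mapping(xml_text, buffer_text=False):
    """Map each <link> text to its (context dir, icon name) pair."""
    mapping = {}
    state = {"context": None, "icon": None, "parts": None}

    def start_element(name, attrs):
        if name == "context":
            state["context"] = attrs["dir"]
        elif name == "icon":
            state["icon"] = attrs["name"]
        elif name == "link":
            # Reset at the start of <link>, so whitespace seen between
            # elements never leaks into the key.
            state["parts"] = []

    def char_data(data):
        # Append: expat may deliver one text node in several calls.
        if state["parts"] is not None:
            state["parts"].append(data)

    def end_element(name):
        if name == "link":
            key = "".join(state["parts"]).strip()
            mapping[key] = (state["context"], state["icon"])
            state["parts"] = None  # ignore text outside <link>

    parser = ParserCreate()
    parser.buffer_text = buffer_text
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(xml_text, True)
    return mapping


# The result should no longer depend on buffer_text.
print(build_mapping(SAMPLE, False) == build_mapping(SAMPLE, True))
```

Setting the accumulator to None outside `<link>` is one possible design choice: it makes `char_data` ignore indentation and newlines between elements, which is the problem msg80638 describes.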
| msg80851 - (view) | Author: Takeshi Matsuyama (tksmashiw) | Date: 2009-01-31 02:56 | |
Could someone close this? |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:44 | admin | set | github: 49286 |
| 2009-01-31 03:16:24 | benjamin.peterson | set | status: open -> closed resolution: not a bug |
| 2009-01-31 02:56:03 | tksmashiw | set | messages: + msg80851 |
| 2009-01-27 09:57:58 | tksmashiw | set | messages: + msg80638 |
| 2009-01-24 15:21:18 | tksmashiw | set | messages: + msg80454 |
| 2009-01-24 14:54:30 | kawai | set | messages: + msg80453 |
| 2009-01-24 14:25:26 | kawai | set | messages: + msg80451 |
| 2009-01-24 14:10:16 | tksmashiw | set | messages: + msg80449 |
| 2009-01-24 08:48:18 | kawai | set | nosy: + kawai messages: + msg80438 |
| 2009-01-24 04:10:26 | tksmashiw | set | messages: + msg80435 |
| 2009-01-24 01:31:03 | ggenellina | set | nosy: + ggenellina messages: + msg80432 |
| 2009-01-23 03:52:20 | tksmashiw | create | |