htmlparser — Python-Markdown 3.10.2 documentation
This module imports a copy of html.parser.HTMLParser and modifies it heavily through monkey-patches.
A copy is imported, rather than the module itself, so that users can still import
and use the unmodified library for their own needs.
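The idea of patching a private copy can be sketched with the standard library alone. The snippet below is a minimal illustration, not necessarily the exact code this module uses; the names parser_copy and patched_by_us are made up for the example.

```python
import html.parser
import importlib.util

# Locate html.parser and execute its source into a brand-new module object.
# The module cached in sys.modules is left untouched.
spec = importlib.util.find_spec("html.parser")
parser_copy = importlib.util.module_from_spec(spec)
spec.loader.exec_module(parser_copy)

# Hypothetical monkey-patch, applied only to the private copy.
parser_copy.HTMLParser.patched_by_us = True

print(parser_copy.HTMLParser is html.parser.HTMLParser)  # False
print(hasattr(html.parser.HTMLParser, "patched_by_us"))  # False
```

Because the copy has its own HTMLParser class object, any patch applied to it is invisible to code that imports html.parser in the usual way.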
Classes:
- HTMLExtractor – Extract raw HTML from text.
Bases: HTMLParser
Extract raw HTML from text.
The raw HTML is stored in the htmlStash of the
Markdown instance passed to md and the remaining text
is stored in cleandoc as a list of strings.
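The split between stashed HTML and remaining text can be pictured with a toy parser built on the unmodified html.parser. This is only a sketch of the concept: TinyExtractor, its stash list, and the [[HTML:n]] placeholder format are invented for illustration, and unlike HTMLExtractor it stashes every tag individually and does not use a Markdown instance.

```python
from html.parser import HTMLParser


class TinyExtractor(HTMLParser):
    """Toy stand-in: moves tags to a stash, keeps text in cleandoc."""

    def __init__(self):
        super().__init__()
        self.stash = []     # raw HTML fragments (hypothetical stand-in for htmlStash)
        self.cleandoc = []  # remaining text as a list of strings

    def _store(self, raw):
        # Leave a made-up placeholder in the text and stash the raw HTML.
        self.cleandoc.append(f"[[HTML:{len(self.stash)}]]")
        self.stash.append(raw)

    def handle_starttag(self, tag, attrs):
        self._store(self.get_starttag_text())

    def handle_endtag(self, tag):
        self._store(f"</{tag}>")

    def handle_data(self, data):
        self.cleandoc.append(data)


parser = TinyExtractor()
parser.feed("Hello <b>world</b>!")
parser.close()  # flush any buffered data

print("".join(parser.cleandoc))  # Hello [[HTML:0]]world[[HTML:1]]!
print(parser.stash)              # ['<b>', '</b>']
```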
Methods:
- reset – Reset this instance. Loses all unprocessed data.
- close – Handle any buffered data.
- at_line_start – Returns True if current position is at start of line.
- get_endtag_text – Returns the text of the end tag.
- handle_empty_tag – Handle empty tags (<data>).
- get_starttag_text – Return full source of start tag: <...>.
Attributes:
- line_offset (int) – Returns char index in self.rawdata for the start of the current line.
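markdown.htmlparser.HTMLExtractor.reset
Reset this instance. Loses all unprocessed data.
markdown.htmlparser.HTMLExtractor.close
Handle any buffered data.
markdown.htmlparser.HTMLExtractor.line_offset
Returns char index in self.rawdata for the start of the current line.
markdown.htmlparser.HTMLExtractor.at_line_start
Returns True if the current position is at the start of a line.
Up to three leading spaces are allowed.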
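The relationship between the line offset and the start-of-line check can be sketched as two standalone functions. This is an illustration under assumptions, not the class's actual code: it takes the raw text plus a 1-based line number and 0-based column (the convention of HTMLParser.getpos()) as explicit arguments, and the function names simply mirror the members described above.

```python
def line_offset(rawdata: str, lineno: int) -> int:
    """Char index in rawdata where line `lineno` (1-based) starts."""
    pos = 0
    for _ in range(lineno - 1):
        newline = rawdata.find("\n", pos)
        if newline == -1:
            return len(rawdata)  # fewer lines than requested
        pos = newline + 1
    return pos


def at_line_start(rawdata: str, lineno: int, offset: int) -> bool:
    """True if column `offset` is preceded only by up to three whitespace chars."""
    if offset == 0:
        return True
    if offset > 3:
        return False
    start = line_offset(rawdata, lineno)
    return rawdata[start:start + offset].strip() == ""


text = "para\n   <div>\n    <p>"
print(line_offset(text, 2))       # 5
print(at_line_start(text, 2, 3))  # True: three leading spaces
print(at_line_start(text, 3, 4))  # False: four leading spaces
```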
markdown.htmlparser.HTMLExtractor.get_endtag_text
Returns the text of the end tag.
If the actual text cannot be extracted from the raw data, a closing tag is built from tag instead.
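The extract-or-rebuild behaviour can be sketched as a standalone function. This is a hedged illustration only: end_tag_text, its explicit rawdata and start parameters, and the simple ">" pattern are assumptions made for the example, not the method's real signature.

```python
import re

# Assumed pattern: the end tag finishes at the next ">" character.
END_OF_TAG = re.compile(">")


def end_tag_text(rawdata: str, start: int, tag: str) -> str:
    """Return the end tag as written in rawdata, or rebuild it from `tag`."""
    match = END_OF_TAG.search(rawdata, start)
    if match:
        # Preserve the source exactly, including case and whitespace.
        return rawdata[start:match.end()]
    # Nothing to extract: assume a well-formed closing tag.
    return "</{}>".format(tag)


print(end_tag_text("<p>x</P >y", 4, "p"))  # </P >
print(end_tag_text("</p", 0, "p"))         # </p>
```

Returning the source text where possible keeps the stashed HTML byte-for-byte identical to the input; the rebuilt tag is only a fallback.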
markdown.htmlparser.HTMLExtractor.handle_empty_tag(data: str, is_block: bool)
Handle empty tags (<data>).
markdown.htmlparser.HTMLExtractor.get_starttag_text
Return full source of start tag: <...>.