htmlparser — Python-Markdown 3.10.2 documentation

This module imports a copy of html.parser.HTMLParser and modifies it heavily through monkey-patches. A copy is imported, rather than the module itself, so that users can still import and use the unmodified standard library module for their own needs.
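As a rough illustration of the copy-then-patch idea, the sketch below loads a second, independent copy of `html.parser` with `importlib` and modifies only that copy. The helper name `import_module_copy` and the `PATCHED` attribute are hypothetical and are not taken from this module's source. It shows one possible approach, not necessarily the one used here.

```python
import importlib.util
import html.parser  # the regular, unmodified module


def import_module_copy(name):
    """Return a fresh, private copy of the module called `name`.

    Hypothetical helper: the copy is not registered in `sys.modules`,
    so ordinary `import` statements keep returning the original module.
    """
    spec = importlib.util.find_spec(name)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


htmlparser_copy = import_module_copy("html.parser")

# Monkey-patch the copy only (illustrative attribute, an assumption).
htmlparser_copy.PATCHED = True
```

Because the module code is executed again for the copy, `htmlparser_copy.HTMLParser` is a distinct class object, so patches applied to it do not leak into `html.parser.HTMLParser`.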

Classes:

HTMLExtractor –
Extract raw HTML from text.

`markdown.htmlparser.HTMLExtractor`

Bases: HTMLParser

Extract raw HTML from text.

The raw HTML is stored in the htmlStash of the Markdown instance passed to md and the remaining text is stored in cleandoc as a list of strings.

Methods:

reset –
Reset this instance. Loses all unprocessed data.
close –
Handle any buffered data.
at_line_start –
Returns True if current position is at start of line.
get_endtag_text –
Returns the text of the end tag.
handle_empty_tag –
Handle empty tags (<data>).
get_starttag_text –
Return full source of start tag: <...>.

Attributes:

line_offset (int) –
Returns char index in self.rawdata for the start of the current line.

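`markdown.htmlparser.HTMLExtractor.reset`

Reset this instance. Loses all unprocessed data.

`markdown.htmlparser.HTMLExtractor.close`

Handle any buffered data.

`markdown.htmlparser.HTMLExtractor.line_offset`

Returns char index in self.rawdata for the start of the current line.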
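To make the idea of a line offset concrete, here is a minimal standalone sketch. The function `line_start_index` is hypothetical and is not this class's implementation. It assumes line numbers are 1-based, as reported by `HTMLParser.getpos()`.

```python
def line_start_index(rawdata: str, lineno: int) -> int:
    """Return the index in `rawdata` where 1-based line `lineno` begins.

    Hypothetical helper for illustration only.
    """
    index = 0
    for _ in range(lineno - 1):
        newline = rawdata.find("\n", index)
        if newline == -1:
            # Fewer lines than requested: stay at the last line found.
            break
        index = newline + 1
    return index


rawdata = "abc\n  <div>\nxyz"
second_line = line_start_index(rawdata, 2)  # index just after the first "\n"
```

Adding the column offset reported by the parser to this index gives the absolute position of the current token in the buffer.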

`markdown.htmlparser.HTMLExtractor.at_line_start`

Returns True if current position is at start of line.

Allows for up to three blank spaces at start of line.
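The three-space allowance can be sketched as a small standalone function. The name `is_at_line_start` and its parameters are assumptions made for this example: `line_start` is the index where the current line begins and `offset` is the column of the current position within that line.

```python
def is_at_line_start(rawdata: str, line_start: int, offset: int) -> bool:
    """Return True if `offset` is at the start of the line.

    Hypothetical helper: up to three leading characters are tolerated,
    provided they are all whitespace (this sketch does not distinguish
    spaces from other whitespace).
    """
    if offset == 0:
        return True
    if offset > 3:
        return False
    # Everything between the line start and the position must be blank.
    return rawdata[line_start:line_start + offset].strip() == ""


indented = is_at_line_start("abc\n  <div>", 4, 2)   # two spaces before the tag
after_text = is_at_line_start("a <b>", 0, 2)        # "a " precedes the tag
too_deep = is_at_line_start("    <div>", 0, 4)      # four spaces of indent
```

The limit of three mirrors Markdown's convention that four or more leading spaces start an indented code block.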

`markdown.htmlparser.HTMLExtractor.get_endtag_text`

Returns the text of the end tag.

If it fails to extract the actual text from the raw data, it builds a closing tag with tag.
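The extract-or-rebuild behaviour described above might look roughly like the following. This is a simplified sketch under stated assumptions: `endtag_text`, the `start` parameter, and the `END_OF_TAG` pattern are hypothetical, and the real method works from the parser's own position rather than an explicit index.

```python
import re

# Assumed pattern: the end tag finishes at the next ">".
END_OF_TAG = re.compile(">")


def endtag_text(rawdata: str, start: int, tag: str) -> str:
    """Return the end tag's source text beginning at `start`.

    Falls back to a synthesized `</tag>` when no closing ">" is found.
    """
    match = END_OF_TAG.search(rawdata, start)
    if match:
        # Preserve the original spelling, e.g. case and inner whitespace.
        return rawdata[start:match.end()]
    return f"</{tag}>"


verbatim = endtag_text("<p>x</P >y", 4, "p")
rebuilt = endtag_text("<p>x</p", 4, "p")
```

Returning the verbatim source where possible keeps the author's original markup intact in the stashed HTML.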

`markdown.htmlparser.HTMLExtractor.handle_empty_tag(data: str, is_block: bool)`

Handle empty tags (<data>).

`markdown.htmlparser.HTMLExtractor.get_starttag_text`

Return full source of start tag: <...>.
