Krang::Markup - Base class for browser-specific WYSIWYG HTML filtering
Different browsers use different HTML tags to manage basic markup like bold, italic, underline, strike-through, subscript and superscript. To keep the database content normalized regardless of which browser was used for editing, the HTML must be filtered on its way to and from the browser's WYSIWYG areas.
A "normalized tag" is the HTML tag stored in the database and published on the net.
Let's take BOLD text as an example. The normalized tag is STRONG.
Making text bold in IE inserts the STRONG tag directly.
Gecko, however, inserts the B tag, and WebKit wraps the text with a
SPAN tag having its style attribute set to font-weight: bold.
When going to Gecko or WebKit the STRONG tag therefore has to be
replaced with what the WYSIWYG commands of those browsers understand.
And when coming from them, the normalized version has to be restored.
This module provides stubs for the methods that accomplish this task.
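
The round trip for bold text can be sketched in a few lines of Perl. This is a simplified illustration based on regular expressions, not the module's own tree-based filtering, and the function names strong_to_gecko, gecko_to_strong and webkit_to_strong are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only (not part of Krang::Markup).

# Going to Gecko: replace the normalized STRONG tag with B.
sub strong_to_gecko {
    my ($html) = @_;
    $html =~ s{<(/?)strong>}{<${1}b>}gi;
    return $html;
}

# Coming from Gecko: restore the normalized STRONG tag.
# "<b>" must be followed directly by ">", so "<br>" is left alone.
sub gecko_to_strong {
    my ($html) = @_;
    $html =~ s{<(/?)b>}{<${1}strong>}gi;
    return $html;
}

# Coming from WebKit: turn a bold-styled SPAN into STRONG.
# Assumes the SPAN carries only this style and contains no nested SPAN.
sub webkit_to_strong {
    my ($html) = @_;
    $html =~ s{<span\s+style="font-weight:\s*bold;?"\s*>(.*?)</span>}
              {<strong>$1</strong>}gis;
    return $html;
}

my $from_db  = 'A <strong>bold</strong> word';
my $to_gecko = strong_to_gecko($from_db);
my $restored = gecko_to_strong($to_gecko);
```

A regular expression is enough for this narrow case; markup with nesting or extra attributes is better handled with a parsed tree.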
Subclasses must implement the following class methods:

db2browser_map()
Provides the tag mappings used by db2browser().

browser2db_map()
Provides the tag mappings used by browser2db().

db2browser()
Uses db2browser_map(). It is passed the normalized HTML and returns a string with those mappings applied to it.
Because of quirks in the internally used HTML::Element module, this method must not return the modified HTML directly. Instead, it must return the HTML returned by $pkg->tidy_up_after_treebuilder(tree => $tree), where $tree is an HTML::TreeBuilder object.

browser2db()
Uses browser2db_map(). It is passed the HTML coming from the browser and returns a string with those mappings applied.
Because of quirks in the internally used HTML::Element module, this method must not return the modified HTML directly. Instead, it must return the HTML returned by $pkg->tidy_up_after_treebuilder(tree => $tree), where $tree is an HTML::TreeBuilder object.
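
The shape of this interface can be sketched as follows. Everything here is an assumption made for illustration: the package name My::Markup::GeckoSketch, the helper _apply_map, the idea that the map methods return a plain hash of tag names, the html => $html argument, and the EM/I pair. The sketch rewrites tags with a regular expression instead of building an HTML::TreeBuilder tree, so the tidy_up_after_treebuilder() step does not apply to it.

```perl
package My::Markup::GeckoSketch;    # hypothetical subclass name
use strict;
use warnings;

# Assumption: each map is a hash of (source tag => target tag).
sub db2browser_map { return (strong => 'b', em => 'i') }
sub browser2db_map { return (b => 'strong', i => 'em') }

# Hypothetical helper: rename opening and closing tags according to a map.
# The lookahead requires whitespace or ">" after the tag name, so "<br>"
# is not mistaken for "<b>".
sub _apply_map {
    my ($pkg, $html, %map) = @_;
    my $tags = join '|', map { quotemeta } sort keys %map;
    $html =~ s{<(/?)($tags)(?=[\s>])}{'<' . $1 . $map{lc $2}}gie;
    return $html;
}

# Normalized HTML in, browser-specific HTML out.
sub db2browser {
    my ($pkg, %args) = @_;
    return $pkg->_apply_map($args{html}, $pkg->db2browser_map);
}

# Browser-specific HTML in, normalized HTML out.
sub browser2db {
    my ($pkg, %args) = @_;
    return $pkg->_apply_map($args{html}, $pkg->browser2db_map);
}

package main;

my $html    = '<strong>bold</strong> and <em>italic</em>';
my $browser = My::Markup::GeckoSketch->db2browser(html => $html);
my $back    = My::Markup::GeckoSketch->browser2db(html => $browser);
```

Keeping the two maps as separate methods lets each direction be adjusted on its own, which matters when a browser's output has no one-to-one counterpart among the normalized tags.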

tidy_up_after_treebuilder($html_tree_object)
Called at the end of db2browser() and browser2db(). It strips the BODY tag that HTML::TreeBuilder wraps around the passed-in HTML, removes the trailing newline added by HTML::Element's as_HTML(), destroys the tree object, and finally returns the HTML.
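
The string handling described above can be illustrated without a tree. The function strip_treebuilder_wrapper below is a hypothetical stand-in, and the sample input is only an assumed example of what serialized tree output might look like. The real method also destroys the tree object, which has no equivalent here.

```perl
use strict;
use warnings;

# Hypothetical stand-in that works on an already serialized string.
sub strip_treebuilder_wrapper {
    my ($html) = @_;
    $html =~ s{\A.*?<body[^>]*>}{}is;    # drop everything up to and including <body>
    $html =~ s{</body>.*\z}{}is;         # drop </body> and everything after it
    $html =~ s{\n\z}{};                  # drop a trailing newline if one is still left
    return $html;
}

# Assumed sample input: a fragment wrapped in a full document skeleton.
my $wrapped = "<html><head></head><body><strong>bold</strong></body></html>\n";
my $clean   = strip_treebuilder_wrapper($wrapped);
```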

remove_junk(\$html)
Cleans up the HTML in the following ways:
* remove tags without content inside
* remove adjacent closing/opening tags while preserving whitespace in between
* remove excess whitespace
* remove leading whitespace
* remove trailing whitespace
* remove leading BR tags
* remove trailing BR tags
It must be passed a scalar reference to a string containing the HTML to be cleaned.
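
The cleanup steps can be approximated with a handful of substitutions. The function remove_junk_sketch below is a hypothetical re-implementation written for illustration, not the module's code. It assumes that "adjacent closing/opening tags" means a closing tag immediately followed by an opening tag of the same name, and that "excess whitespace" means runs of two or more whitespace characters.

```perl
use strict;
use warnings;

# Hypothetical sketch; modifies the string in place through the reference.
sub remove_junk_sketch {
    my ($html_ref) = @_;
    for ($$html_ref) {
        # tags without content, e.g. <b></b>; repeat for nested empties
        1 while s{<(\w+)\b[^>]*></\1>}{}gi;
        # </x> directly followed by <x>, keeping the whitespace in between
        s{</(\w+)>(\s*)<\1>}{$2}gi;
        # collapse runs of whitespace to a single space
        s{\s{2,}}{ }g;
        # leading and trailing whitespace
        s{\A\s+}{};
        s{\s+\z}{};
        # leading and trailing BR tags, with any whitespace next to them
        s{\A(?:<br\s*/?>\s*)+}{}i;
        s{(?:\s*<br\s*/?>)+\z}{}i;
    }
    return;
}

my $html = " <b></b><br><strong>foo</strong>  <strong>bar</strong><br/> ";
remove_junk_sketch(\$html);
# $html is now '<strong>foo bar</strong>'
```

Passing a scalar reference, as the method does, avoids copying the HTML and lets the caller's string be cleaned in place.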