The Chilkat HTML-to-XML API transforms HTML into well-formed XML. Once HTML has been converted to XHTML (i.e. well-formed XML), any existing XML parsing API can be used to extract data from it.
The Chilkat HTML-to-Text API converts HTML to a readable plain-text representation.
Main Features:
- File-to-file HTML to XML conversion.
- Memory-to-memory HTML to XML conversion.
- File-to-file HTML to plain-text conversion.
- Memory-to-memory HTML to plain-text conversion.
- Convert the character encoding as part of the conversion.
- Flexibility in controlling how HTML entities are handled.
- Automatically convert HTML entities to corresponding 8-bit characters.
- Optionally drop all text formatting tags from the output.
- Drop/undrop specific tags from the output.
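The idea behind the workflow above (convert loosely written HTML to well-formed XML, then hand it to an ordinary XML parser) can be sketched with the Python standard library. This is only an illustration of the concept, not the Chilkat API: the names `HtmlToXmlSketch` and `html_to_xml` are hypothetical, the `<root>` wrapper element is an assumption made so the output always has a single root, and the sketch skips many cases a real converter handles (implicit tag closing rules, character encodings, invalid or duplicate attribute names).

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Elements that never have a closing tag in HTML.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class HtmlToXmlSketch(HTMLParser):
    """Hypothetical helper: emits well-formed XML from loosely written HTML."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters
        # before handle_data sees them.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "checked") get their own name as value.
        attr_text = "".join(
            f" {name}={quoteattr(name if value is None else value)}"
            for name, value in attrs
        )
        if tag in VOID_TAGS:
            self.parts.append(f"<{tag}{attr_text}/>")
        else:
            self.parts.append(f"<{tag}{attr_text}>")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: ignore it
        # Close everything opened since the matching start tag.
        while self.open_tags:
            top = self.open_tags.pop()
            self.parts.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        # Re-escape &, < and > so the output stays well-formed.
        self.parts.append(escape(data))


def html_to_xml(html):
    """Return the HTML as a well-formed XML string wrapped in <root>."""
    converter = HtmlToXmlSketch()
    converter.feed(html)
    converter.close()
    # Close any tags the HTML left open.
    while converter.open_tags:
        converter.parts.append(f"</{converter.open_tags.pop()}>")
    return "<root>" + "".join(converter.parts) + "</root>"


if __name__ == "__main__":
    xml_text = html_to_xml("<ul><li>One<li>Two &amp; three<br></ul>")
    root = ET.fromstring(xml_text)  # parses because the XML is well-formed
    print([li.text for li in root.iter("li")])  # ['One', 'Two & three']
    print("".join(root.itertext()))             # rough plain-text extraction
```

Note that the unclosed `<li>` elements end up nested rather than as siblings, because this sketch closes open tags only when an enclosing end tag arrives. A dedicated converter applies HTML's implicit-closing rules to produce a more faithful tree.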