PYTHON3-HTML2TEXT

Section: User Commands (1)
Updated: January 2021
 

NAME

python3-html2text - manual page for python3-html2text 2020.1.16  

DESCRIPTION

usage: python3-html2text [-h] [--default-image-alt DEFAULT_IMAGE_ALT]
[--pad-tables] [--no-wrap-links] [--wrap-list-items]
[--ignore-emphasis] [--reference-links] [--ignore-links]
[--protect-links] [--ignore-images] [--images-as-html]
[--images-to-alt] [--images-with-size] [-g] [-d] [-e]
[-b BODY_WIDTH] [-i LIST_INDENT] [-s] [--escape-all]
[--bypass-tables] [--ignore-tables] [--single-line-break]
[--unicode-snob] [--no-automatic-links] [--no-skip-internal-links]
[--links-after-para] [--mark-code] [--decode-errors DECODE_ERRORS]
[--open-quote OPEN_QUOTE] [--close-quote CLOSE_QUOTE] [--version]
[filename] [encoding]
 

positional arguments:

filename
encoding
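
The usage line shows both positional arguments as optional, with filename first and encoding second. A minimal Python sketch of assembling such a command line follows. The helper name build_command and the file name page.html are hypothetical, and the sketch only builds and prints the argument list rather than running the program.

```python
import shlex

# Command name as given on the usage line of this page.
COMMAND = "python3-html2text"


def build_command(filename=None, encoding=None, options=()):
    """Return an argv list: options first, then the optional positionals.

    build_command is a hypothetical helper written for illustration only.
    """
    if encoding is not None and filename is None:
        # encoding is the second positional, so it cannot be passed alone.
        raise ValueError("encoding can only be given together with a filename")
    argv = [COMMAND, *options]
    if filename is not None:
        argv.append(filename)
    if encoding is not None:
        argv.append(encoding)
    return argv


if __name__ == "__main__":
    # "page.html" is a placeholder file name, not something this page defines.
    print(shlex.join(build_command("page.html", "utf-8")))
```

The resulting list can be handed to subprocess.run once the command is known to be installed.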
 

optional arguments:

-h, --help
show this help message and exit
--default-image-alt DEFAULT_IMAGE_ALT
The default alt string for images with missing ones
--pad-tables
pad the cells to equal column width in tables
--no-wrap-links
don't wrap links during conversion
--wrap-list-items
wrap list items during conversion
--ignore-emphasis
don't include any formatting for emphasis
--reference-links
use reference style links instead of inline links
--ignore-links
don't include any formatting for links
--protect-links
protect links from line breaks by surrounding them with angle brackets
--ignore-images
don't include any formatting for images
--images-as-html
Always write image tags as raw html; preserves `height`, `width` and `alt` if possible.
--images-to-alt
Discard image data, only keep alt text
--images-with-size
Write image tags with height and width attrs as raw html to retain dimensions
-g, --google-doc
convert an html-exported Google Document
-d, --dash-unordered-list
use a dash rather than a star for unordered list items
-e, --asterisk-emphasis
use an asterisk rather than an underscore for emphasized text
-b BODY_WIDTH, --body-width BODY_WIDTH
number of characters per output line, 0 for no wrap
-i LIST_INDENT, --google-list-indent LIST_INDENT
number of pixels Google indents nested lists
-s, --hide-strikethrough
hide strike-through text; only relevant when -g is specified as well
--escape-all
Escape all special characters. Output is less readable, but avoids corner case formatting issues.
--bypass-tables
Format tables in HTML rather than Markdown syntax.
--ignore-tables
Ignore table-related tags (table, th, td, tr) while keeping rows.
--single-line-break
Use a single line break after a block element rather than two line breaks. NOTE: Requires --body-width=0
--unicode-snob
Use Unicode throughout the document
--no-automatic-links
Do not use automatic links wherever applicable
--no-skip-internal-links
Do not skip internal links
--links-after-para
Put links after each paragraph instead of at the end of the document
--mark-code
Mark program code blocks with [code]...[/code]
--decode-errors DECODE_ERRORS
What to do in case of decode errors. 'ignore', 'strict' and 'replace' are acceptable values
--open-quote OPEN_QUOTE
The character used to open quotes
--close-quote CLOSE_QUOTE
The character used to close quotes
--version
show program's version number and exit
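
Two of the entries above carry constraints: --single-line-break requires --body-width=0, and --decode-errors accepts only 'ignore', 'strict' and 'replace'. The Python sketch below shows one way a wrapper script might check these before building the option list. The function build_options and the file name page.html are hypothetical; only the option names come from this page.

```python
import shlex

# Values the --decode-errors entry lists as acceptable.
DECODE_ERRORS = ("ignore", "strict", "replace")


def build_options(body_width=None, single_line_break=False,
                  ignore_links=False, decode_errors=None):
    """Translate keyword settings into option strings.

    build_options is a hypothetical helper; it covers only a few of the
    options listed above.
    """
    # The --single-line-break entry notes that it requires --body-width=0.
    if single_line_break and body_width != 0:
        raise ValueError("--single-line-break requires --body-width=0")
    if decode_errors is not None and decode_errors not in DECODE_ERRORS:
        raise ValueError("decode_errors must be one of %r" % (DECODE_ERRORS,))

    options = []
    if body_width is not None:
        options.append(f"--body-width={body_width}")
    if single_line_break:
        options.append("--single-line-break")
    if ignore_links:
        options.append("--ignore-links")
    if decode_errors is not None:
        options += ["--decode-errors", decode_errors]
    return options


if __name__ == "__main__":
    # "page.html" is a placeholder input file name.
    argv = ["python3-html2text",
            *build_options(body_width=0, single_line_break=True),
            "page.html"]
    print(shlex.join(argv))
```

Checking the constraints up front gives a clear error in the calling script instead of relying on how the command itself reacts to an inconsistent combination.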


 
