PYTHON3-HTML2TEXT
Section: User Commands (1)
Updated: January 2021
 
NAME
python3-html2text - manual page for python3-html2text 2020.1.16
 
DESCRIPTION
usage: python3-html2text [-h] [--default-image-alt DEFAULT_IMAGE_ALT]
[--pad-tables] [--no-wrap-links] [--wrap-list-items]
[--ignore-emphasis] [--reference-links]
[--ignore-links] [--protect-links] [--ignore-images]
[--images-as-html] [--images-to-alt]
[--images-with-size] [-g] [-d] [-e] [-b BODY_WIDTH]
[-i LIST_INDENT] [-s] [--escape-all]
[--bypass-tables] [--ignore-tables]
[--single-line-break] [--unicode-snob]
[--no-automatic-links] [--no-skip-internal-links]
[--links-after-para] [--mark-code]
[--decode-errors DECODE_ERRORS]
[--open-quote OPEN_QUOTE] [--close-quote CLOSE_QUOTE]
[--version]
[filename] [encoding]
positional arguments:
- 
filename
encoding
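
A minimal sketch of passing the two positional arguments from Python. It assumes the `python3-html2text` executable is on PATH; `build_command` and `convert_file` are hypothetical helper names used only for illustration and are not part of the tool.

```python
import shutil
import subprocess


def build_command(filename, encoding=None, program="python3-html2text"):
    """Return the argv list for converting `filename`.

    `encoding` is the optional second positional argument. It is only
    appended when given, so the tool's own default applies otherwise.
    """
    argv = [program, filename]
    if encoding is not None:
        argv.append(encoding)
    return argv


def convert_file(filename, encoding=None):
    """Run the converter and return whatever it wrote to standard output."""
    argv = build_command(filename, encoding)
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"{argv[0]} not found on PATH")
    # text=True decodes the output with the locale's preferred encoding.
    result = subprocess.run(argv, capture_output=True, check=True, text=True)
    return result.stdout
```

For example, `convert_file("page.html", "latin-1")` would run `python3-html2text page.html latin-1`, where `page.html` is a placeholder file name.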
optional arguments:
- -h, --help
- 
show this help message and exit
- --default-image-alt DEFAULT_IMAGE_ALT
- 
The default alt string for images that lack one
- --pad-tables
- 
pad the cells to equal column width in tables
- --no-wrap-links
- 
don't wrap links during conversion
- --wrap-list-items
- 
wrap list items during conversion
- --ignore-emphasis
- 
don't include any formatting for emphasis
- --reference-links
- 
use reference style links instead of inline links
- --ignore-links
- 
don't include any formatting for links
- --protect-links
- 
protect links from line breaks by surrounding them with
angle brackets
- --ignore-images
- 
don't include any formatting for images
- --images-as-html
- 
Always write image tags as raw html; preserves
`height`, `width` and `alt` if possible.
- --images-to-alt
- 
Discard image data, only keep alt text
- --images-with-size
- 
Write image tags with height and width attrs as raw
html to retain dimensions
- -g, --google-doc
- 
convert an html-exported Google Document
- -d, --dash-unordered-list
- 
use a dash rather than a star for unordered list items
- -e, --asterisk-emphasis
- 
use an asterisk rather than an underscore for
emphasized text
- -b BODY_WIDTH, --body-width BODY_WIDTH
- 
number of characters per output line, 0 for no wrap
- -i LIST_INDENT, --google-list-indent LIST_INDENT
- 
number of pixels Google indents nested lists
- -s, --hide-strikethrough
- 
hide strike-through text; only relevant when -g is
also specified
- --escape-all
- 
Escape all special characters. Output is less
readable, but avoids corner case formatting issues.
- --bypass-tables
- 
Format tables in HTML rather than Markdown syntax.
- --ignore-tables
- 
Ignore table-related tags (table, th, td, tr) while
keeping rows.
- --single-line-break
- 
Use a single line break after a block element rather
than two line breaks. NOTE: Requires --body-width=0
- --unicode-snob
- 
Use Unicode throughout the document
- --no-automatic-links
- 
Do not use automatic links wherever applicable
- --no-skip-internal-links
- 
Do not skip internal links
- --links-after-para
- 
Put links after each paragraph instead of at the end of
the document
- --mark-code
- 
Mark program code blocks with [code]...[/code]
- --decode-errors DECODE_ERRORS
- 
What to do in case of decode errors. 'ignore', 'strict'
and 'replace' are acceptable values
- --open-quote OPEN_QUOTE
- 
The character used to open quotes
- --close-quote CLOSE_QUOTE
- 
The character used to close quotes
- --version
- 
show program's version number and exit
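
Some of the options above interact: --single-line-break requires --body-width=0, -s is only relevant together with -g, and --decode-errors accepts three values. The sketch below shows one way to assemble such flags while checking those constraints up front. `build_options` is a hypothetical helper, and rejecting the combinations with an exception is a choice of this sketch; the tool itself may simply accept them.

```python
DECODE_ERROR_MODES = ("ignore", "strict", "replace")


def build_options(body_width=None, single_line_break=False,
                  google_doc=False, hide_strikethrough=False,
                  decode_errors=None):
    """Translate keyword settings into a list of command-line flags."""
    if decode_errors is not None and decode_errors not in DECODE_ERROR_MODES:
        raise ValueError(f"decode_errors must be one of {DECODE_ERROR_MODES}")
    if single_line_break and body_width != 0:
        # The help text notes that this option requires --body-width=0.
        raise ValueError("--single-line-break requires --body-width=0")
    if hide_strikethrough and not google_doc:
        # The help text says -s is only relevant when -g is given.
        raise ValueError("--hide-strikethrough is only relevant with --google-doc")

    flags = []
    if body_width is not None:
        flags.append(f"--body-width={body_width}")
    if single_line_break:
        flags.append("--single-line-break")
    if google_doc:
        flags.append("--google-doc")
    if hide_strikethrough:
        flags.append("--hide-strikethrough")
    if decode_errors is not None:
        flags.append(f"--decode-errors={decode_errors}")
    return flags
```

The returned list can be placed between the program name and the positional arguments, for example `["python3-html2text", *build_options(body_width=0, single_line_break=True), "page.html"]`, where `page.html` is a placeholder.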