markitdown version: 0.1.6
OS: Windows 11 x64
Command (run in a PowerShell terminal): markitdown file.docx -o file.md --keep-data-uris
If a line in the docx text starts with hash marks (##), they are kept unescaped, so the line is recognized as a section header in the Markdown output.
If a line starts with a pipe character (|), it is also kept unescaped and is recognized as a table. This gets even worse when the text is already inside a table.
Only a few special characters are escaped, such as the asterisk and the underscore.
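To illustrate the escaping I would expect, here is a minimal sketch in Python. The helper name `escape_markdown_line` is hypothetical and is not part of markitdown; it only shows one possible way to backslash-escape a leading heading marker and pipe characters in literal text, and it assumes the input is a single line of plain text taken from the document.

```python
import re


def escape_markdown_line(line: str) -> str:
    """Backslash-escape characters that Markdown would otherwise treat as syntax.

    Hypothetical helper for illustration only; not markitdown's API.
    """
    # Escape backslash, asterisk, underscore and pipe wherever they occur.
    # The pipe matters both at line start and inside table cells.
    escaped = re.sub(r"([\\*_|])", r"\\\1", line)
    # A run of 1-6 hashes at line start, followed by whitespace or end of
    # line, would open an ATX heading; escaping the first hash prevents that.
    escaped = re.sub(r"^(\s{0,3})(#{1,6})(?=\s|$)", r"\1\\\2", escaped)
    return escaped


if __name__ == "__main__":
    for sample in ["## Not a heading", "| a | b", "#hashtag", "a*b_c"]:
        print(escape_markdown_line(sample))
```

A hash that is directly followed by text (such as `#hashtag`) is left alone in this sketch, because it does not start a heading.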
Also, when it converts a table in a docx file, the generated Markdown gets an empty header row, and the real header is put in the first content row.
The converted result looks something like:
| | | |
| --- | --- | --- |
| Header 1 | Header 2 | Header 3 |
But it should be:
| Header 1 | Header 2 | Header 3 |
| --- | --- | --- |
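Until this is fixed, the output could be post-processed. The following is only a rough sketch under my own assumptions: `promote_first_row` is a hypothetical helper (not a markitdown function), and it assumes the table is passed in as an isolated block of Markdown lines whose first row is the empty header and whose second row is the separator.

```python
def promote_first_row(markdown_table: str) -> str:
    """Drop an all-empty header row and use the first body row as the header.

    Hypothetical post-processing helper; returns the input unchanged when
    the table does not match the empty-header pattern.
    """
    lines = markdown_table.strip().splitlines()
    if len(lines) < 3:
        return markdown_table

    # Split the first row into cells and check whether all of them are blank.
    header_cells = [c.strip() for c in lines[0].strip().strip("|").split("|")]
    if any(header_cells):
        return markdown_table  # the header already has content

    # The second row must be a separator row made of pipes, dashes, colons.
    if "-" not in lines[1] or not set(lines[1]) <= set("|-: "):
        return markdown_table

    # First body row becomes the header, followed by the separator and the rest.
    return "\n".join([lines[2], lines[1], *lines[3:]])


if __name__ == "__main__":
    broken = "| | | |\n| --- | --- | --- |\n| Header 1 | Header 2 | Header 3 |"
    print(promote_first_row(broken))
```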