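Since this is an HTML parser, send an Accept: text/html header when fetching the page you intend to parse. The header only states a preference, so it makes an HTML response more likely but does not guarantee one; the server can still reply with another type.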
Accept: text/html
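
A minimal sketch of the suggestion in Python, using only the standard library's urllib.request. The function names (build_html_request, is_html, fetch_html) are hypothetical and not taken from the code under discussion. Because a server may ignore the Accept header, the sketch also checks the response's Content-Type before handing the body to the parser.

```python
import urllib.request

HTML_ACCEPT = "text/html"


def build_html_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks the server for HTML."""
    return urllib.request.Request(url, headers={"Accept": HTML_ACCEPT})


def is_html(content_type: str) -> bool:
    """Return True if a Content-Type header value denotes text/html."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"


def fetch_html(url: str) -> bytes:
    """Fetch url and return the raw body, refusing non-HTML responses."""
    request = build_html_request(url)
    with urllib.request.urlopen(request) as response:
        content_type = response.headers.get("Content-Type", "")
        if not is_html(content_type):
            raise ValueError(f"expected text/html, got {content_type!r}")
        return response.read()
```

Calling fetch_html with the page URL would return the bytes to pass to the parser, or raise ValueError if the server answered with something other than text/html. Whether to fail hard or fall back in that case is a design choice left to the caller.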