Serializing a CArray results in a broken document. #300

Description

@m-gallesio

CArray serialization simply concatenates the array's elements, with no whitespace between them.

This means a document gets broken when a page's content is replaced via PdfPage.Contents.ReplaceContent(CSequence) and the new content contains any CArray.
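To illustrate the effect, here is a minimal self-contained sketch. It does not use PDFsharp's actual code: `SerializeConcatenated`, `SerializeSpaced` and `Format` are hypothetical helpers that only mimic the two strategies, and the dash values are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical stand-ins for the two serialization strategies (not PDFsharp code).
var dash = new double[] { 1.5, 0.5 };

// Elements glued together: the two numbers become one malformed token.
Console.WriteLine(SerializeConcatenated(dash) + " 0 d"); // [1.50.5] 0 d

// Elements separated by a space: the array keeps its two entries.
Console.WriteLine(SerializeSpaced(dash) + " 0 d");       // [1.5 0.5] 0 d

static string Format(double value) =>
    value.ToString(CultureInfo.InvariantCulture);

static string SerializeConcatenated(IEnumerable<double> items) =>
    "[" + string.Concat(items.Select(v => Format(v))) + "]";

static string SerializeSpaced(IEnumerable<double> items) =>
    "[" + string.Join(" ", items.Select(v => Format(v))) + "]";
```

In the first form a parser sees a single token, `1.50.5`, where the dash array should hold two numbers; separating elements with whitespace keeps them distinct.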

The reproduction code sample is included here:

using PdfSharp.Pdf.Content;
using PdfSharp.Pdf.Content.Objects;
using PdfSharp.Pdf.IO;
using System.IO;

using var inputStream = File.OpenRead(args[0]);
using var document = PdfReader.Open(inputStream, PdfDocumentOpenMode.Modify);

foreach (var page in document.Pages)
{
    var newContent = new CSequence();
    foreach (var item in ContentReader.ReadContent(page))
        newContent.Add(item);
    page.Contents.ReplaceContent(newContent);
}

document.Save(Path.Combine(Path.GetDirectoryName(args[0]), Path.GetFileNameWithoutExtension(args[0]) + "_EDITED.pdf"));

This sample reads each page via ContentReader and re-creates it by just concatenating said contents.

The sample files included in the /files folder are:

  • A DOCX document containing a box with dashed borders
  • Its PDF version converted by Microsoft Word. The dashed borders are rendered via a d operator with a CArray of CReals as its operand
  • The result of processing said PDF document with the sample code.

Exactly how broken the document appears depends on the viewer; in the sample case:

  • Microsoft Edge's embedded viewer renders the box with dashed borders but ignores the original dash spacing
  • Firefox's embedded viewer renders the box correctly
  • Adobe Acrobat Reader completely stops rendering the document once it reaches the dashed box
  • The original document I discovered this in breaks PDFBox's parser because it tries to read the concatenated floats as a single float
