What's New in Document Solutions v8

We are back with new features in Document Solutions v8! Key highlights of this release include a new direct API to import and export data from various data sources in Document Solutions for Excel (DsExcel), options to optimize PDF documents in Document Solutions for PDF (DsPdf), the ability to replace text in Document Solutions PDF Viewer (DsPdfViewer), API support for working with and updating fields, which enables adding a table of contents (TOC) to Word documents programmatically in Document Solutions for Word (DsWord), and support for Apache Arrow and Parquet files in Document Solutions Data Viewer (DsDataViewer). Let's delve into the details of the new release!

Ready to check out the release? Download Document Solutions today!

Document Solutions for PDF (DsPdf)

Optimize PDF Documents

DsPdf v8 includes several enhancements that speed up loading and saving PDF files and reduce the size of the generated PDF. One of them improves the way DsPdf works with object streams. Saving a PDF with object streams (introduced in PDF 1.5) consolidates objects, which reduces file size, improves load times, and makes compression more efficient.

DsPdf also introduces an API for loading and saving PDFs with optimization options. The new SavePdfOptions class gives you precise control over how a PDF is saved; an instance of it can be passed to the GcPdfDocument.Save(), Sign(), and TimeStamp() methods. Its optimization settings are defined by the following enumerations:

  • UseObjectStreams - Defines how object streams are used when saving a PDF document.
  • PdfStreamHandling - Defines how existing PDF streams in a loaded document are processed.

Using the new API, you can now re-save an existing PDF with the desired compression settings applied to all streams. The following code minimizes the size of a PDF document:

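using System.IO;
using System.IO.Compression;
using GrapeCity.Documents.Pdf;

GcPdfDocument doc = new GcPdfDocument();
// Keep the source stream open while the document is in use; it is disposed at the end of the scope.
using FileStream fs = new FileStream("Original_doc.pdf", FileMode.Open, FileAccess.Read);
doc.Load(fs);
doc.CompressionLevel = CompressionLevel.Optimal;
// Reprocess existing streams for minimal size and save using object streams.
SavePdfOptions so = new SavePdfOptions(SaveMode.Default, PdfStreamHandling.MinimizeSize, UseObjectStreams.Multiple);
doc.Save("Optimized_doc.pdf", so);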

Optimize PDF Documents using .NET, C#, VB.NET

Optimize Font Format

In the v8 release, DsPdf also adds the PdfFontFormat property to the GcPdfDocument and FontHandler classes, allowing users to choose how a font is encoded when it is saved in a PDF document.

The PdfFontFormat enumeration provides the following options that define the encoding type:

  • Type0AutoOneByteEncoding - Saves the font as one or more Type0 PDF fonts, where each character is encoded by one byte.
  • Type0IdentityEncoding - Saves the font as a single Type0 font with Identity encoding, where each character is encoded with two bytes.

DsPdf uses a one-byte encoding format, i.e., Type0AutoOneByteEncoding, by default, producing smaller PDF content in most cases.
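
To see why one-byte encoding usually produces smaller content, the rough JavaScript sketch below compares the two options for a piece of text. It is an illustration only, not DsPdf's actual algorithm: the function name `estimateEncoding`, the treatment of each code point as one glyph, and the limit of 256 codes per one-byte font are all assumptions, and the estimate ignores the size of the embedded font data itself.

```javascript
// Rough, hypothetical estimate of text bytes and font count per encoding style.
// Assumes one glyph per code point and 256 codes per one-byte font (an assumption,
// not a documented DsPdf limit).
function estimateEncoding(text, codesPerOneByteFont = 256) {
  const chars = Array.from(text); // iterate by code point, not UTF-16 unit
  const distinct = new Set(chars).size;
  return {
    // One byte per character; more distinct characters than one font can
    // address means the text has to be split across several fonts.
    oneByte: {
      textBytes: chars.length,
      fonts: Math.max(1, Math.ceil(distinct / codesPerOneByteFont)),
    },
    // Two bytes per character, always a single font.
    identity: { textBytes: chars.length * 2, fonts: 1 },
  };
}

console.log(estimateEncoding("Hello, PDF!"));
```

Under these assumptions, text with a small character set needs half the bytes with one-byte encoding, while text with a very large character set trades that saving for additional font objects.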

Include/Exclude Annotations or Form Fields on Image Export

You now have precise control over which annotations and form fields are included when exporting a PDF to images. The new DrawAnnotationFilter property of SaveAsImageOptions accepts a delegate that is called to decide whether a particular annotation or form field is rendered, for example based on its type. The following code renders only check box fields:

GcPdfDocument doc = new GcPdfDocument();
// "doc.pdf" is a placeholder for the PDF you want to export.
using var fs = File.OpenRead("doc.pdf");
doc.Load(fs);

SaveAsImageOptions options = new SaveAsImageOptions();
options.DrawAnnotationFilter = (GcPdfDocument d, Page p, AnnotationBase a, ref bool drawAnnotation) =>
{
  // Render only check box form fields; skip every other annotation.
  drawAnnotation = a is WidgetAnnotation wa && wa.Field is CheckBoxField;
};
doc.SaveAsPng("doc.png", null, options);

Preserve Images on Redaction

If part of an image is redacted, the image is replaced with the redacted version. However, if the same image appears in multiple locations, you may want to apply the redaction either to all instances of the image or only to the instance at the redact location. You can now control this behavior with the new CopyImagesOnRedact property of the RedactOptions class, which indicates whether images in the redacted area that also appear in other locations are copied before the redaction is applied. When set to true, only the instance of the image in the redacted area is modified, and the image stays unchanged in other locations. When set to false, all instances of the image across the document are affected by the redaction. The default value is false.

The following code redacts an image at one location only and preserves the same image in other locations:

RedactOptions ro = new RedactOptions();
ro.CopyImagesOnRedact = true;
doc.Redact(ro);
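
The difference between the two settings can be pictured as copy-before-modify on a shared object. The JavaScript sketch below is a conceptual model only, not DsPdf code: the `pages` structure, the `redact` function, and blanking the pixel array as a stand-in for redaction are all hypothetical.

```javascript
// Conceptual model: several pages reference the same shared image object.
// redact() is a hypothetical stand-in for applying a redaction over that image.
function redact(pages, pageIndex, imageKey, copyImagesOnRedact) {
  const page = pages[pageIndex];
  let image = page.images[imageKey];
  if (copyImagesOnRedact) {
    // Give this placement its own copy so other placements keep the original pixels.
    image = { ...image, pixels: [...image.pixels] };
    page.images[imageKey] = image;
  }
  // "Redact" by blanking the pixel data of whichever object this page now references.
  image.pixels.fill(0);
}

const logo = { pixels: [1, 2, 3] };
const pages = [{ images: { logo } }, { images: { logo } }];
redact(pages, 0, "logo", true);
console.log(pages[0].images.logo.pixels, pages[1].images.logo.pixels);
```

With the flag set to true, only the first page's copy is blanked; with false, both pages still point at the same object, so the change shows up everywhere the image is used.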

Preserve images on redaction


Document Solutions PDF Viewer (DsPdfViewer)

Replace Text in PDF Documents 

DsPdfViewer adds new Replace options to the UI for replacing text in PDF documents. The expand icon in the Search bar activates the text replacement mode, where you enter the text to search for in the 'Find in document' textbox and the replacement text in the 'Replace' textbox. Pressing Enter replaces the text throughout the document. Text can also be replaced with the 'Replace Current' and 'Replace All' buttons in the Search bar. Additionally, a new keyboard shortcut, Ctrl+H, opens the Search bar with the text replacement mode already enabled.

The text replacement mode is only available when the viewer is configured with SupportApi, which is used for editing functionality. This feature is accessible to users with a Professional license and is currently not available in Wasm mode.

Allow End Users to Find and Replace Text in PDFs with a JavaScript PDF Viewer Control

Timestamp in PDF Viewer Comments

In PDF document annotations, a timestamp records the exact date and time an annotation was created or modified. Timestamps provide a historical record of edits, helping collaborators track when specific feedback or changes were made. 

The v8 release adds a timestamp to every annotation added to a PDF document. The timestamp is stored in the modificationDate property, which can also be set through code. The property is a string that is either empty (the date and time are not specified) or contains the date and time in the internal PDF date format, D:YYYYMMDDHHmmSSOHH'mm', where O is the relationship of local time to Universal Time (+, -, or Z) and HH'mm' is the offset in hours and minutes.
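
As an illustration of that string format, here is a minimal JavaScript sketch that builds a PDF date string from a Date object. The helper name `toPdfDate` and its `offsetMinutes` parameter are hypothetical and not part of the DsPdfViewer API; writing a zero offset as Z00'00' is also an assumption about what the viewer accepts.

```javascript
// Hypothetical helper: format a Date as D:YYYYMMDDHHmmSSOHH'mm'.
// offsetMinutes is the local offset from UT (for example -300 for UTC-05:00);
// it defaults to the offset of the environment running the code.
function toPdfDate(date, offsetMinutes = -date.getTimezoneOffset()) {
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  // Shift the instant so the UTC getters return the wall-clock time at the requested offset.
  const shifted = new Date(date.getTime() + offsetMinutes * 60000);
  const stamp =
    pad(shifted.getUTCFullYear(), 4) +
    pad(shifted.getUTCMonth() + 1) +
    pad(shifted.getUTCDate()) +
    pad(shifted.getUTCHours()) +
    pad(shifted.getUTCMinutes()) +
    pad(shifted.getUTCSeconds());
  // O is "+", "-", or "Z" (local time equal to UT).
  const sign = offsetMinutes === 0 ? "Z" : offsetMinutes > 0 ? "+" : "-";
  const abs = Math.abs(offsetMinutes);
  return `D:${stamp}${sign}${pad(Math.floor(abs / 60))}'${pad(abs % 60)}'`;
}

console.log(toPdfDate(new Date(Date.UTC(2024, 11, 5, 14, 30, 0)), -300));
// prints D:20241205093000-05'00'
```

A string produced this way has the shape described for modificationDate; how the annotation object is obtained and updated depends on the viewer API and is not shown here.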

In addition, a new property called "Modified" has been added to the properties panel of each annotation. This property automatically updates as you make changes to the annotation. 

View time stamps on comments within the JS PDF Viewer

Enhanced Proximity Search

In the new DsPdfViewer v8 release, we’ve enhanced proximity search functionality with the introduction of two new operators: NEAR and ONEAR. These operators provide more flexibility in querying phrases, offering an improved search experience. 

The NEAR operator matches results where specified search terms are within close proximity, ignoring the order of the terms. The syntax for NEAR is as follows:

<expression> NEAR(n) <expression>

The ONEAR operator functions similarly to NEAR but preserves the order of the specified terms. The syntax for ONEAR is as follows:

<expression> ONEAR(n) <expression>

With the updated proximity search, users can also use phrases in proximity queries by enclosing the words of each phrase in double quotes. In the example below, the two search terms are separated by 4 words, so the search expression is written as "Originally there were in excess" NEAR(4) "(2.3)"
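
The difference between the two operators can be sketched in a few lines of JavaScript. This is a simplified model, not the viewer's search engine: it assumes that n is the maximum number of words allowed between the two terms, splits text on whitespace only, and uses the hypothetical names `tokenize`, `occurrences`, and `proximityMatch`.

```javascript
// Split text into lowercase words on whitespace (punctuation stays attached to its word).
const tokenize = (s) => s.toLowerCase().split(/\s+/).filter(Boolean);

// Return [start, end] word-index ranges where the phrase occurs in the token list.
function occurrences(tokens, phrase) {
  const p = tokenize(phrase);
  const hits = [];
  if (p.length === 0) return hits;
  for (let i = 0; i + p.length <= tokens.length; i++) {
    if (p.every((w, k) => tokens[i + k] === w)) hits.push([i, i + p.length - 1]);
  }
  return hits;
}

// True when left and right occur with at most n words between them.
// ordered = false models NEAR; ordered = true models ONEAR (left must come first).
function proximityMatch(text, left, right, n, ordered = false) {
  const tokens = tokenize(text);
  const a = occurrences(tokens, left);
  const b = occurrences(tokens, right);
  return a.some(([aStart, aEnd]) =>
    b.some(([bStart, bEnd]) => {
      const gapForward = bStart - aEnd - 1; // words between when left precedes right
      const gapBackward = aStart - bEnd - 1; // words between when right precedes left
      if (gapForward >= 0 && gapForward <= n) return true;
      return !ordered && gapBackward >= 0 && gapBackward <= n;
    })
  );
}

const text = "the quick brown fox jumps over the lazy dog";
console.log(proximityMatch(text, "jumps", "quick", 2)); // NEAR(2): true, order ignored
console.log(proximityMatch(text, "jumps", "quick", 2, true)); // ONEAR(2): false, wrong order
```

In this model, "quick" and "jumps" have two words between them, so both operators match with n = 2 when the terms are given in document order, and only NEAR matches when they are reversed.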

Enhanced Proximity Search in PDFs


Document Solutions for Word (DsWord)

Enhancements When Working with Fields

Fields in Microsoft Word act as dynamic placeholders for data that can automatically update based on certain conditions, eliminating the need for manual updates and helping create consistently formatted, professional documents. Fields are not only useful for displaying dynamic data but can also be customized and controlled by adjusting their arguments (parameters) and switches (modifiers) to modify their behavior and output.

In the v8.0 release, DsWord has added support for working with and updating the following fields:

  • PAGE - The PAGE field retrieves the number of the current page.
  • PAGEREF - The PAGEREF field inserts the number of the page containing the bookmark for a cross-reference.
  • SECTION - The SECTION field retrieves the number of the current section.
  • SECTIONPAGES - The SECTIONPAGES field retrieves the total number of pages in the current section.
  • SEQ - The SEQ field sequentially numbers chapters, tables, figures, and other user-defined lists of items in a document.
  • TC - The TC field defines the text and page number for a table of contents (including a table of figures) entry, which is used by a TOC field.
  • TOC - The TOC field builds a table of contents (which can also be a table of figures) using the entries specified by TC fields, their heading levels, and specified styles and inserts that table at this location in the document.

Each of the fields listed above has a corresponding ...FieldOptions class (in the GrapeCity.Documents.Word.Fields namespace) that provides strongly typed access for reading and writing the arguments and switches of that field type. The FieldFormatOptions class is the base class for the ...FieldOptions classes that support formatting properties.

To update (recalculate) the fields, use the new GcWordDocument.UpdateFields() method or the Update() method on a specific field. This allows you to include the updated field results in the saved DOCX or in exports to PDF or images.

Have a look at the detailed API for each Field type in the links for each field above. 

The following code adds a TOC on the second page of a Word document using the TocFieldOptions class:

var doc = new GcWordDocument();
doc.Load("Annual Financial Report.docx");
var para6 = doc.Body.Paragraphs[6];
var newSection = para6.AddSectionBreak();
var tocPara = newSection.AddParagraph();
var options = new TocFieldOptions(doc);
options.EntryFormatting.CreateHyperlink = true;
// build the TOC only from paragraphs formatted with the 'Heading 1', 'Heading 2', or 'Heading 3' styles
foreach (TocStyleLevel style in options.Styles)
{
    switch (style.Level)
    {
        case OutlineLevel.Level1:
        case OutlineLevel.Level2:
        case OutlineLevel.Level3:
            style.Collect = true;
            break;
        default:
            style.Collect = false;
            break;
    }
}
var toc = tocPara.AddComplexField(options);
toc.Update();
doc.Save("ReportWithTOC.docx");
using (var layout = new GcWordLayout(doc))
{
    // save the whole document as PDF
    layout.SaveAsPdf("ReportWithTOC.pdf", null, new PdfOutputSettings() { CompressionLevel = CompressionLevel.Fastest });
}

Add Table of Contents to PDF that are Converted from DOCX using C#

DsWordLayout Package is Now Merged into DsWord Package

From v8 onwards, the DsWordLayout package (that enables saving Word documents to PDF or images) has been merged with the DsWord package. The changes are backward compatible, so any user code will continue to work in v8.0. The only change is the need to remove references to DsWordLayout from user projects as it is no longer required.
