What's New in Document Solutions v8
We are back with new features in Document Solutions v8! Key highlights of this release include a new direct API to import and export data from various data sources in Document Solutions for Excel (DsExcel), options to optimize PDF documents, the ability to replace text in Document Solutions PDF Viewer (DsPdfViewer), API support for working with and updating fields, which makes it possible to add a TOC to Word documents programmatically in Document Solutions for Word (DsWord), and support for Apache Arrow and Parquet files in Document Solutions Data Viewer (DsDataViewer). Let's delve into the details of the new release!
- Document Solutions for PDF (DsPdf)
- Document Solutions PDF Viewer (DsPdfViewer)
- Document Solutions for Excel (DsExcel)
- Document Solutions Data Viewer (DsDataViewer)
- Document Solutions for Word (DsWord)
Ready to check out the release? Download Document Solutions today!
Document Solutions for PDF (DsPdf)
Optimize PDF Documents
DsPdf v8 includes multiple enhancements that improve the performance of loading and saving PDF files and reduce the size of the generated PDF. One of these enhancements optimizes the way DsPdf works with object streams. Saving a PDF with object streams (PDF 1.5) reduces file size, improves load times, and enhances compression efficiency by consolidating objects, making document handling faster and more efficient.
DsPdf also introduces an API to load and save PDFs with optimization options. The new SavePdfOptions class gives you precise control over how a PDF is saved in your application. An instance of this class can be passed to the GcPdfDocument.Save(), Sign(), and TimeStamp() methods. Two of its settings are described below; each is an enumeration whose values let you tune PDF loading and saving to your needs:
- UseObjectStreams - Defines how object streams are used when saving a PDF document.
- PdfStreamHandling - Defines how existing PDF streams in a loaded document are processed.
Using the new API, you can now re-save an existing PDF with the desired compression settings applied to all streams. The following code minimizes the size of a PDF document:
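// Open the source file; keep the stream open until the document has been saved.
using (FileStream fs = new FileStream("Original_doc.pdf", FileMode.Open))
{
    GcPdfDocument doc = new GcPdfDocument();
    doc.Load(fs);
    doc.CompressionLevel = CompressionLevel.Optimal;
    // PdfStreamHandling controls how existing streams are processed; UseObjectStreams controls object stream usage.
    SavePdfOptions so = new SavePdfOptions(SaveMode.Default, PdfStreamHandling.MinimizeSize, UseObjectStreams.Multiple);
    doc.Save("Optimized_doc.pdf", so);
}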
Optimize Font Format
In the v8 release, DsPdf also adds the PdfFontFormat property to the GcPdfDocument and FontHandler classes, allowing users to set the encoding type used to represent a font in a PDF document.
The PdfFontFormat enumeration provides the following options that define the encoding type:
- Type0AutoOneByteEncoding - Saves the font as one or more Type0 PDF fonts, where each character is encoded by one byte.
- Type0IdentityEncoding - Saves the font as a single Type0 font with Identity encoding, where each character is encoded with two bytes.
DsPdf uses a one-byte encoding format, i.e., Type0AutoOneByteEncoding, by default, producing smaller PDF content in most cases.
Include/Exclude Annotations or Form Fields on Image Export
You now have precise control over which annotations and form fields are included when exporting a PDF to images. The new DrawAnnotationFilter property of SaveAsImageOptions takes a delegate that decides whether a given annotation or form field is rendered, for example based on its type. The following code renders only check boxes:
GcPdfDocument doc = new GcPdfDocument();
SaveAsImageOptions options = new SaveAsImageOptions();
options.DrawAnnotationFilter = (GcPdfDocument d, Page p, AnnotationBase a, ref bool drawAnnotation) =>
{
    // render only CheckBoxes
    if (a is WidgetAnnotation wa)
    {
        if (wa.Field is CheckBoxField)
        {
            drawAnnotation = true;
            return;
        }
    }
    drawAnnotation = false;
};
doc.SaveAsPng("doc.png", null, options);
Preserve Images on Redaction
If part of an image is redacted, the image is replaced with the redacted version. However, if the same image appears at multiple locations, you may want to apply the redaction either to all instances of the image or only to the instance at the redaction location. You can now control this behavior with the new CopyImagesOnRedact property in the RedactOptions class, which indicates whether images within the redacted area that also appear in other locations are copied before the redaction is applied. When set to true, only the instance of the image in the redacted area is modified, leaving the image unchanged in other locations. When set to false, all instances of the image across the document are affected by the redaction. The default value is false.
The following code redacts an image at one location only and preserves the same image in other locations:
RedactOptions ro = new RedactOptions();
ro.CopyImagesOnRedact = true;
doc.Redact(ro);
Document Solutions PDF Viewer (DsPdfViewer)
Replace Text in PDF Documents
DsPdfViewer adds new Replace options to the UI for replacing text in PDF documents. The expand icon in the Search bar activates the text replacement mode, where you enter the text to search for in the 'Find in document' textbox and the replacement text in the 'Replace' textbox. Pressing Enter replaces the text throughout the document. Text can also be replaced with the 'Replace Current' and 'Replace All' buttons in the Search bar. Additionally, a new keyboard shortcut, Ctrl+H, opens the Search bar with the text replacement mode already enabled.
The text replacement mode is only available when the viewer is configured with the SupportApi, which is utilized for editing functionalities. This feature is accessible to users with a Professional license. The feature is currently not available in Wasm mode.
Timestamp in PDF Viewer Comments
In PDF document annotations, a timestamp records the exact date and time an annotation was created or modified. Timestamps provide a historical record of edits, helping collaborators track when specific feedback or changes were made.
The v8 release adds a timestamp to any annotation added to the PDF document. This timestamp is stored in the modificationDate property, which can be modified through code. It is a string type that can either be empty (indicating that the date and time are not specified) or contain the date and time in the internal PDF format, which is 'D:YYYYMMDDHHmmSSOHH'mm'.
In addition, a new property called "Modified" has been added to the properties panel of each annotation. This property automatically updates as you make changes to the annotation.
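The date string format described above can be produced with a small helper. The following is only a sketch, written in TypeScript because the viewer is a client-side component: the function name toPdfDateString and its offsetMinutes parameter are hypothetical and not part of the DsPdfViewer API. It also assumes the string ends right after the offset minutes, without a trailing apostrophe, so adjust it if your documents use the older form with a closing apostrophe.

```typescript
// Hypothetical helper (not part of DsPdfViewer): formats a Date as a PDF date
// string of the form D:YYYYMMDDHHmmSSOHH'mm, where O is '+' or '-'.
// offsetMinutes is the local offset east of UTC, e.g. -480 for UTC-08:00.
function toPdfDateString(date: Date, offsetMinutes: number = 0): string {
  // Left-pad a number with zeros to the requested width.
  const pad = (n: number, width: number = 2): string =>
    ("0000" + String(n)).slice(-width);

  // Shift the instant so that the UTC getters return local wall-clock fields.
  const local = new Date(date.getTime() + offsetMinutes * 60000);
  const sign = offsetMinutes < 0 ? "-" : "+";
  const abs = Math.abs(offsetMinutes);

  return (
    "D:" +
    pad(local.getUTCFullYear(), 4) +
    pad(local.getUTCMonth() + 1) +
    pad(local.getUTCDate()) +
    pad(local.getUTCHours()) +
    pad(local.getUTCMinutes()) +
    pad(local.getUTCSeconds()) +
    sign +
    pad(Math.floor(abs / 60)) +
    "'" +
    pad(abs % 60)
  );
}

// 14:03:09 UTC on 5 September 2024, expressed in UTC-08:00.
const stamp = toPdfDateString(new Date(Date.UTC(2024, 8, 5, 14, 3, 9)), -480);
console.log(stamp); // D:20240905060309-08'00
```

A string built this way could then be assigned to an annotation's modificationDate property; an empty string leaves the date unspecified.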
Enhanced Proximity Search
In the new DsPdfViewer v8 release, we’ve enhanced proximity search functionality with the introduction of two new operators: NEAR and ONEAR. These operators provide more flexibility in querying phrases, offering an improved search experience.
The NEAR operator matches results where specified search terms are within close proximity, ignoring the order of the terms. The syntax for NEAR is as follows:
<expression> NEAR(n) <expression>
The ONEAR operator functions similarly to NEAR but preserves the order of the specified terms. The syntax for ONEAR is as follows:
<expression> ONEAR(n) <expression>
With the updated proximity search, users can also specify phrases within proximity queries by enclosing the words of each phrase in double quotes. For example, when the two search terms are separated by 4 words, the search expression is written as "Originally there were in excess" NEAR(4) "(2.3)"
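The difference between the two operators can be illustrated with a short, self-contained sketch. It is not the viewer's search implementation: the functions near, onear, and proximityMatch are hypothetical, words are split on whitespace only, matching is case-insensitive, and n is assumed to be the maximum number of words allowed between the two terms.

```typescript
// Splits text into lowercase words on whitespace (an assumption of this sketch).
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
}

// Returns every index in `words` at which the whole `phrase` starts.
function findPhrase(words: string[], phrase: string[]): number[] {
  const hits: number[] = [];
  for (let i = 0; i + phrase.length <= words.length; i++) {
    let match = true;
    for (let j = 0; j < phrase.length; j++) {
      if (words[i + j] !== phrase[j]) { match = false; break; }
    }
    if (match) hits.push(i);
  }
  return hits;
}

// True when `left` and `right` occur with at most n words between them.
// With ordered = true, `left` must come before `right`.
function proximityMatch(text: string, left: string, right: string, n: number, ordered: boolean): boolean {
  const words = tokenize(text);
  const a = tokenize(left);
  const b = tokenize(right);
  if (a.length === 0 || b.length === 0) return false;
  const aHits = findPhrase(words, a);
  const bHits = findPhrase(words, b);
  for (const i of aHits) {
    for (const j of bHits) {
      const gapLeftFirst = j - (i + a.length);  // words between left and right
      const gapRightFirst = i - (j + b.length); // words between right and left
      if (gapLeftFirst >= 0 && gapLeftFirst <= n) return true;
      if (!ordered && gapRightFirst >= 0 && gapRightFirst <= n) return true;
    }
  }
  return false;
}

// NEAR ignores the order of the terms; ONEAR preserves it.
const near = (text: string, l: string, r: string, n: number): boolean => proximityMatch(text, l, r, n, false);
const onear = (text: string, l: string, r: string, n: number): boolean => proximityMatch(text, l, r, n, true);

const sample = "the quick brown fox jumps over the lazy dog";
console.log(near(sample, "lazy dog", "quick brown", 4));  // true
console.log(onear(sample, "lazy dog", "quick brown", 4)); // false
console.log(onear(sample, "quick brown", "lazy dog", 4)); // true
```

In the sample sentence, four words separate "quick brown" from "lazy dog", so NEAR(4) matches in either order while ONEAR(4) matches only when "quick brown" is written first.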
Document Solutions for Excel (DsExcel)
- Import Data from Object Collections and Data Tables
- Add and Manage Scenarios in What-If Analysis
- Bind Pivot Table Directly with Excel Table as Data Source
- Support Page Number Calculation Operators
- New APIs to Manage PivotTable
- [DsExcel Java] Support for Pattern Fill When Rendering to PDF
- Support Pivot Table Timeline Slicer
- Features for SpreadJS Compatibility
- AutoMerge Cells
- Image Sparkline Export
- Support Cell Decoration API
- Option to Include/Exclude Binding Data
- Multiple Features Supported for Lossless I/O of SpreadJS
Document Solutions Data Viewer (DsDataViewer)
For more details, read this blog on DsExcel and DsDataViewer new features in v8.
Document Solutions for Word (DsWord)
Enhancements When Working with Fields
Fields in Microsoft Word act as dynamic placeholders for data that can automatically update based on certain conditions, eliminating the need for manual updates and helping create consistently formatted, professional documents. Fields are not only useful for displaying dynamic data but can also be customized and controlled by adjusting their arguments (parameters) and switches (modifiers) to modify their behavior and output.
In the v8.0 release, DsWord has added support for working with and updating the following fields:
- PAGE - The PAGE field retrieves the number of the current page.
- PAGEREF - The PAGEREF field inserts the number of the page containing the bookmark for a cross-reference.
- SECTION - The SECTION field retrieves the number of the current section.
- SECTIONPAGES - The SECTIONPAGES field retrieves the total number of pages in the current section.
- SEQ - The SEQ field sequentially numbers chapters, tables, figures, and other user-defined lists of items in a document.
- TC - The TC field defines the text and page number for a table of contents (including a table of figures) entry, which is used by a TOC field.
- TOC - The TOC field builds a table of contents (which can also be a table of figures) using the entries specified by TC fields, their heading levels, and specified styles and inserts that table at this location in the document.
Each of the fields listed above has a corresponding ...FieldOptions class (in the GrapeCity.Documents.Word.Fields namespace) that provides strongly typed access for reading and writing the arguments and switches of that field type. The FieldFormatOptions class is the base class for the ...FieldOptions classes that support formatting properties.
To update (recalculate) the fields, use the new GcWordDocument.UpdateFields() method or the Update() method on a specific field. This allows you to include the updated field results in the DOCX or in export to PDF or images.
Have a look at the detailed API for each Field type in the links for each field above.
The following code adds a TOC on the second page of a Word document using the TocFieldOptions class.
var doc = new GcWordDocument();
doc.Load("Annual Financial Report.docx");
var para6 = doc.Body.Paragraphs[6];
var newSection = para6.AddSectionBreak();
var tocPara = newSection.AddParagraph();
var options = new TocFieldOptions(doc);
options.EntryFormatting.CreateHyperlink = true;
// build the TOC only from paragraphs formatted with the 'Heading 1', 'Heading 2', or 'Heading 3' styles
foreach (TocStyleLevel style in options.Styles)
{
    switch (style.Level)
    {
        case OutlineLevel.Level1:
        case OutlineLevel.Level2:
        case OutlineLevel.Level3:
            style.Collect = true;
            break;
        default:
            style.Collect = false;
            break;
    }
}
var toc = tocPara.AddComplexField(options);
toc.Update();
doc.Save("ReportWithTOC.docx");
using (var layout = new GcWordLayout(doc))
{
    // save the whole document as PDF
    layout.SaveAsPdf("ReportWithTOC.pdf", null, new PdfOutputSettings() { CompressionLevel = CompressionLevel.Fastest });
}
DsWordLayout Package is Now Merged into DsWord Package
From v8 onwards, the DsWordLayout package (that enables saving Word documents to PDF or images) has been merged with the DsWord package. The changes are backward compatible, so any user code will continue to work in v8.0. The only change is the need to remove references to DsWordLayout from user projects as it is no longer required.