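How to Detect Documents and Save Them as PDF with HTML5 and JavaScript

Capturing documents such as receipts, invoices, and contracts and saving them as PDF files is a common business requirement. In this article, we extend a web document editor project built with Dynamsoft Document Viewer by adding the ability to detect documents and save them as PDF. Document detection is powered by Dynamsoft Capture Vision.

Demo Video: Detect Documents and Save Them as PDF

Online Demo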

https://yushulx.me/web-document-annotation/

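Prerequisites

Dynamsoft Capture Vision trial license: Get a 30-day free trial license to unlock the full functionality of Dynamsoft products.

Dynamsoft Document Viewer: This JavaScript SDK supports viewing PDF, JPEG, PNG, TIFF, and BMP files, and can render and save PDF annotations. Download: https://www.npmjs.com/package/dynamsoft-document-viewer

Dynamsoft Capture Vision Bundle: This JavaScript SDK provides document detection, cropping, and image enhancement. Download: https://www.npmjs.com/package/dynamsoft-capture-vision-bundle

Implementing Document Detection and Rectification in HTML5 and JavaScript

The following sections walk through implementing document detection and rectification in HTML5 and JavaScript. If you have already downloaded the source code, you can skip to Step 2.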

Step 1: Get the Source Code

Clone the source code from the GitHub repository: git clone https://github.com/yushulx/web-twain-document-scan-management.git

Navigate to the document_annotation directory: cd web-twain-document-scan-management/examples/document_annotation

Open the project in Visual Studio Code.

Step 2: Add a Document Detection Button

In main.css, add a Material icon for the document detection button:

.icon-document_scanner::before {
    content: "crop_free";
}

.icon-document_scanner {
    display: flex;
    font-size: 1.5em;
}

In main.js, define the document detection button and add it to the toolbar:

const documentButton = {
    type: Dynamsoft.DDV.Elements.Button,
    className: "material-icons icon-document_scanner",
    tooltip: "Detect document",
    events: {
        click: "detectDocument",
    }
};

const pcEditViewerUiConfig = {
    type: Dynamsoft.DDV.Elements.Layout,
    flexDirection: "column",
    className: "ddv-edit-viewer-desktop",
    children: [
        {
            type: Dynamsoft.DDV.Elements.Layout,
            className: "ddv-edit-viewer-header-desktop",
            children: [
                {
                    type: Dynamsoft.DDV.Elements.Layout,
                    children: [
                        Dynamsoft.DDV.Elements.ThumbnailSwitch,
                        Dynamsoft.DDV.Elements.Zoom,
                        Dynamsoft.DDV.Elements.FitMode,
                        Dynamsoft.DDV.Elements.Crop,
                        Dynamsoft.DDV.Elements.Filter,
                        Dynamsoft.DDV.Elements.Undo,
                        Dynamsoft.DDV.Elements.Redo,
                        Dynamsoft.DDV.Elements.DeleteCurrent,
                        Dynamsoft.DDV.Elements.DeleteAll,
                        Dynamsoft.DDV.Elements.Pan,
                        Dynamsoft.DDV.Elements.AnnotationSet,
                        qrButton,
                        checkButton,
                        scanButton,
                        clearButton,
                        signatureButton,
                        documentButton,
                    ],
                },
                {
                    type: Dynamsoft.DDV.Elements.Layout,
                    children: [
                        {
                            type: Dynamsoft.DDV.Elements.Pagination,
                            className: "ddv-edit-viewer-pagination-desktop",
                        },
                        loadButton,
                        downloadButton,
                    ],
                },
            ],
        },
        Dynamsoft.DDV.Elements.MainView,
    ],
};

Add the click event handler for the document detection button:

editViewer.on("detectDocument", detectDocument);

async function detectDocument() {
    ...
}

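Step 3: Create a Pop-up Dialog for Document Detection and Normalization

The pop-up dialog for document detection and normalization contains three buttons: **Detect**, **Normalize**, and **Cancel**.

Detect: detects the document boundaries.

Normalize: rectifies the document image using the detected boundaries.

Cancel: closes the dialog.

**HTML Code**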

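The dialog element has the id document-detection and the title "Document Detection"; its three buttons have the ids detectDocument, normalizeDocument, and cancelDocument.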

**JavaScript Code**

let detectDocumentButton = document.getElementById("detectDocument");
let cancelDocumentButton = document.getElementById("cancelDocument");
let normalizeDocumentButton = document.getElementById("normalizeDocument");

cancelDocumentButton.addEventListener('click', () => {
    document.getElementById("document-detection").style.display = "none";
});

normalizeDocumentButton.addEventListener('click', async () => {
    document.getElementById("document-detection").style.display = "none";

    ...
});

detectDocumentButton.addEventListener('click', async () => {
    document.getElementById("document-detection").style.display = "none";

    ...
});

Step 4: Edit the Document Corners and Rectify the Document

Detect the document and draw its outline in the edit viewer from the four corner points:

detectDocumentButton.addEventListener('click', async () => {
    document.getElementById("document-detection").style.display = "none";

    // Export the current page as a JPEG without annotations
    const settings = {
        quality: 100,
        saveAnnotation: false,
    };
    const image = await editViewer.currentDocument.saveToJpeg(editViewer.getCurrentPageIndex(), settings);

    const result = await cvRouter.capture(image, "DetectDocumentBoundaries_Default");
    for (let item of result.items) {
        if (item.type !== Dynamsoft.Core.EnumCapturedResultItemType.CRIT_DETECTED_QUAD) {
            continue;
        }

        let points = item.location.points;
        let currentPageId = currentDoc.pages[editViewer.getCurrentPageIndex()];
        let pageData = await currentDoc.getPageData(currentPageId);
        documentPoints = points;

        // Scale the detected points from the display image size to the mediaBox size
        const polyOptions = {
            points: points.map(p => {
                return {
                    x: p.x / pageData.display.width * pageData.mediaBox.width,
                    y: p.y / pageData.display.height * pageData.mediaBox.height
                }
            }),
            borderColor: "rgb(0,0,255)",
            flags: {
                print: false,
                noView: false,
                readOnly: false,
            }
        }

        let poly = Dynamsoft.DDV.annotationManager.createAnnotation(currentPageId, "polygon", polyOptions);
        poly['name'] = 'document';
        break;
    }
});
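The coordinate scaling applied to the detected points can be pulled out into a small helper, which makes the mapping easier to check in isolation. The sketch below assumes, as the handler does, that the page's display and mediaBox objects each expose width and height. The function names toPageCoordinates and toImageCoordinates are hypothetical and not part of the Dynamsoft SDK, and the sizes in the example are made up. The inverse mapping is included on the assumption that corner points adjusted in the viewer would need converting back to image pixels before normalization.

```javascript
// Hypothetical helpers (not part of the Dynamsoft SDK).
// `display` and `mediaBox` are assumed to be { width, height } objects,
// mirroring pageData.display and pageData.mediaBox in the handler.

// Scale points from image pixels to page (mediaBox) units.
function toPageCoordinates(points, display, mediaBox) {
    return points.map(p => ({
        x: p.x / display.width * mediaBox.width,
        y: p.y / display.height * mediaBox.height,
    }));
}

// Inverse mapping: scale points from page (mediaBox) units back to image pixels.
function toImageCoordinates(points, display, mediaBox) {
    return points.map(p => ({
        x: p.x / mediaBox.width * display.width,
        y: p.y / mediaBox.height * display.height,
    }));
}

// Example with made-up sizes: a 2000x3000 px image shown on a 600x900 page.
const display = { width: 2000, height: 3000 };
const mediaBox = { width: 600, height: 900 };
const quad = [
    { x: 500, y: 750 },
    { x: 1500, y: 750 },
    { x: 1500, y: 2250 },
    { x: 500, y: 2250 },
];

const pagePoints = toPageCoordinates(quad, display, mediaBox);
console.log(pagePoints[0]); // { x: 150, y: 225 }

const imagePoints = toImageCoordinates(pagePoints, display, mediaBox);
console.log(imagePoints[0]); // { x: 500, y: 750 }
```

Keeping both directions next to each other makes it harder to mix up the two coordinate systems when polygon points and region-of-interest points are passed around.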

Normalize the document image:

normalizeDocumentButton.addEventListener('click', async () => {
    document.getElementById("document-detection").style.display = "none";

    let currentPageId = currentDoc.pages[editViewer.getCurrentPageIndex()];
    let blob = await normalizeImage();
    if (blob) {
        // Replace the current page with the rectified image
        await currentDoc.updatePage(currentPageId, blob);
        documentPoints = null;
    }
});

async function normalizeImage() {
    if (!documentPoints) {
        return null;
    }

    // Use the detected corner points (in pixels, not percentages) as the region of interest
    let params = await cvRouter.getSimplifiedSettings("NormalizeDocument_Default");
    params.roi.points = documentPoints;
    params.roiMeasuredInPercentage = 0;
    await cvRouter.updateSettings("NormalizeDocument_Default", params);

    const settings = {
        quality: 100,
        saveAnnotation: false,
    };
    const image = await editViewer.currentDocument.saveToJpeg(editViewer.getCurrentPageIndex(), settings);

    cvRouter.maxCvsSideLength = 9999;
    const result = await cvRouter.capture(image, "NormalizeDocument_Default");
    for (let item of result.items) {
        if (item.type !== Dynamsoft.Core.EnumCapturedResultItemType.CRIT_NORMALIZED_IMAGE) {
            continue;
        }

        let blob = await item.toBlob();
        return blob;
    }
}
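Both the detection and normalization code walk result.items and act on the first item of a given type. As a small illustration, that pattern can be factored into one helper. The name firstItemOfType is hypothetical, and the sample data uses plain objects with made-up numeric type values rather than real SDK results or the Dynamsoft.Core.EnumCapturedResultItemType constants.

```javascript
// Hypothetical helper (not part of the Dynamsoft SDK):
// return the first item whose `type` matches, or null if none does.
function firstItemOfType(items, type) {
    for (const item of items) {
        if (item.type === type) {
            return item;
        }
    }
    return null;
}

// Stand-in data: the type values and labels below are made up for illustration.
const FAKE_QUAD = 1;
const FAKE_IMAGE = 2;
const sampleItems = [
    { type: FAKE_IMAGE, label: "original" },
    { type: FAKE_QUAD, label: "quad-a" },
    { type: FAKE_QUAD, label: "quad-b" },
];

const firstQuad = firstItemOfType(sampleItems, FAKE_QUAD);
console.log(firstQuad.label); // quad-a

const missing = firstItemOfType(sampleItems, 99);
console.log(missing); // null
```

With real results it would be called as firstItemOfType(result.items, Dynamsoft.Core.EnumCapturedResultItemType.CRIT_NORMALIZED_IMAGE). Returning null makes the case where nothing was detected explicit.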

Source Code

https://github.com/yushulx/web-twain-document-scan-management/tree/main/examples/document_annotation
