How to Detect Documents and Save Them as PDF with HTML5 and JavaScript

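Capturing documents such as receipts, invoices, and contracts and saving them as PDF files is a common business requirement. In this article, we extend the web document editor project built with Dynamsoft Document Viewer by adding the ability to detect a document and save it as a PDF. Document detection is powered by Dynamsoft Capture Vision.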

Demo Video: Detect Documents and Save Them as PDF

Online Demo

https://yushulx.me/web-document-annotation/

Prerequisites

  • Dynamsoft Capture Vision trial license: Get a 30-day free trial license to unlock the full functionality of Dynamsoft products.
  • Dynamsoft Document Viewer: This JavaScript SDK supports viewing PDF, JPEG, PNG, TIFF, and BMP files, and can render and save PDF annotations. Download: https://www.npmjs.com/package/dynamsoft-document-viewer
  • Dynamsoft Capture Vision Bundle: This JavaScript SDK provides document detection, cropping, and image enhancement. Download: https://www.npmjs.com/package/dynamsoft-capture-vision-bundle
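
Implementing Document Detection and Rectification in HTML5 and JavaScript

The following sections walk through implementing document detection and rectification with HTML5 and JavaScript. If you have already downloaded the source code, skip to Step 2.

Step 1: Get the Source Code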

  • Clone the source code from the GitHub repository: git clone https://github.com/yushulx/web-twain-document-scan-management.git
  • Navigate to the document_annotation directory: cd web-twain-document-scan-management/examples/document_annotation
  • Open the project in Visual Studio Code.

Step 2: Add a Document Detection Button

  • In main.css, add a Material icon for the document detection button:

    .icon-document_scanner::before {
        content: "crop_free";
    }

    .icon-document_scanner {
        display: flex;
        font-size: 1.5em;
    }

  • In main.js, define the document detection button and add it to the toolbar:

    const documentButton = {
        type: Dynamsoft.DDV.Elements.Button,
        className: "material-icons icon-document_scanner",
        tooltip: "Detect document",
        events: {
            click: "detectDocument",
        }
    }

    const pcEditViewerUiConfig = {
        type: Dynamsoft.DDV.Elements.Layout,
        flexDirection: "column",
        className: "ddv-edit-viewer-desktop",
        children: [
            {
                type: Dynamsoft.DDV.Elements.Layout,
                className: "ddv-edit-viewer-header-desktop",
                children: [
                    {
                        type: Dynamsoft.DDV.Elements.Layout,
                        children: [
                            Dynamsoft.DDV.Elements.ThumbnailSwitch,
                            Dynamsoft.DDV.Elements.Zoom,
                            Dynamsoft.DDV.Elements.FitMode,
                            Dynamsoft.DDV.Elements.Crop,
                            Dynamsoft.DDV.Elements.Filter,
                            Dynamsoft.DDV.Elements.Undo,
                            Dynamsoft.DDV.Elements.Redo,
                            Dynamsoft.DDV.Elements.DeleteCurrent,
                            Dynamsoft.DDV.Elements.DeleteAll,
                            Dynamsoft.DDV.Elements.Pan,
                            Dynamsoft.DDV.Elements.AnnotationSet,
                            qrButton,
                            checkButton,
                            scanButton,
                            clearButton,
                            signatureButton,
                            documentButton,
                        ],
                    },
                    {
                        type: Dynamsoft.DDV.Elements.Layout,
                        children: [
                            {
                                type: Dynamsoft.DDV.Elements.Pagination,
                                className: "ddv-edit-viewer-pagination-desktop",
                            },
                            loadButton,
                            downloadButton,
                        ],
                    },
                ],
            },
            Dynamsoft.DDV.Elements.MainView,
        ],
    };

  • Add the click event handler for the document detection button:

    editViewer.on("detectDocument", detectDocument);

    async function detectDocument() {
        ...
    }
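
The toolbar already holds several custom buttons (qrButton, checkButton, and so on) that presumably share the shape of documentButton. As a minimal sketch, the repetition could be factored into a small factory. The helper name makeToolbarButton is hypothetical and not part of Dynamsoft Document Viewer; the button type is passed in so the function has no dependency on the SDK being loaded.

```javascript
// Hypothetical helper (not part of Dynamsoft Document Viewer): builds a
// toolbar button config with the same shape as documentButton.
function makeToolbarButton(buttonType, iconClass, tooltip, eventName) {
    return {
        type: buttonType,                          // e.g. Dynamsoft.DDV.Elements.Button
        className: `material-icons ${iconClass}`,  // Material icon font plus the icon-specific class
        tooltip,
        events: { click: eventName },              // event name later registered with editViewer.on()
    };
}

// Stand-alone demonstration; the string "Button" is only a placeholder for
// the SDK's button type so the snippet runs without the SDK.
const demoButton = makeToolbarButton(
    "Button", "icon-document_scanner", "Detect document", "detectDocument");
console.log(demoButton);
```

In the browser, the first argument would be Dynamsoft.DDV.Elements.Button, and the returned object could be placed in the children array in the same position as documentButton.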
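
Step 3: Create a Pop-up Dialog for Document Detection and Normalization

The pop-up dialog for document detection and normalization contains three buttons: **Detect**, **Normalize**, and **Cancel**.

  • Detect: Detect the document boundaries.
  • Normalize: Rectify the document based on its corner points.
  • Cancel: Close the dialog.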
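
Normalize depends on the outcome of Detect: the Step 4 code returns early when documentPoints is empty. A slightly stricter guard could also confirm that the stored value really is a quadrilateral before any normalization is attempted. The sketch below is an assumption about what such a check might look like; isValidQuad is a hypothetical name and does not come from the project.

```javascript
// Hypothetical guard: true only for an array of exactly four {x, y} points
// whose coordinates are finite numbers.
function isValidQuad(points) {
    return Array.isArray(points)
        && points.length === 4
        && points.every(p =>
            p !== null
            && typeof p === "object"
            && Number.isFinite(p.x)
            && Number.isFinite(p.y));
}

// Example inputs: a plausible detection result and an empty state.
const detected = [
    { x: 10, y: 12 }, { x: 410, y: 8 }, { x: 420, y: 590 }, { x: 5, y: 600 },
];
console.log(isValidQuad(detected), isValidQuad(null));
```

A check like this could replace the plain truthiness test, or be used to enable the Normalize button only after a successful detection.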

**HTML code**

    Document Detection

    document detection operations

**JavaScript code**

    let detectDocumentButton = document.getElementById("detectDocument");
    let cancelDocumentButton = document.getElementById("cancelDocument");
    let normalizeDocumentButton = document.getElementById("normalizeDocument");
    
    cancelDocumentButton.addEventListener('click', () => {
        document.getElementById("document-detection").style.display = "none";
    });
    
    normalizeDocumentButton.addEventListener('click', async () => {
        document.getElementById("document-detection").style.display = "none";
    
        ...
    });
    
    detectDocumentButton.addEventListener('click', async () => {
        document.getElementById("document-detection").style.display = "none";
    
        ...
    });
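
All three handlers start by hiding the dialog, so the wiring can be expressed once. The following is a sketch under stated assumptions, not the project's code: wireDialog, onDetect, and onNormalize are hypothetical names, and the document object is passed in as a parameter so the function can be exercised outside a browser with a stub.

```javascript
// Hypothetical helper: wires the three dialog buttons. `doc` is anything that
// offers getElementById (the browser's `document`, or a stub in tests).
function wireDialog(doc, handlers) {
    const dialog = doc.getElementById("document-detection");

    const bind = (buttonId, handler) => {
        doc.getElementById(buttonId).addEventListener("click", async () => {
            dialog.style.display = "none";   // every button closes the dialog first
            if (handler) {
                await handler();             // then runs its own action, if any
            }
        });
    };

    bind("detectDocument", handlers.onDetect);
    bind("normalizeDocument", handlers.onNormalize);
    bind("cancelDocument", null);            // Cancel only closes the dialog
}
```

In the page, this might be called as wireDialog(document, { onDetect, onNormalize }), where the two callbacks hold the detection and normalization logic from Step 4.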

Step 4: Edit the Document Corner Points and Rectify the Document

  • Detect the document and draw its outline in the edit viewer from the four corner points:

    detectDocumentButton.addEventListener('click', async () => {
        document.getElementById("document-detection").style.display = "none";

        // Export the current page as a JPEG, without annotations, for detection.
        const settings = {
            quality: 100,
            saveAnnotation: false,
        };
        const image = await editViewer.currentDocument.saveToJpeg(editViewer.getCurrentPageIndex(), settings);
        const result = await cvRouter.capture(image, "DetectDocumentBoundaries_Default");

        for (let item of result.items) {
            if (item.type !== Dynamsoft.Core.EnumCapturedResultItemType.CRIT_DETECTED_QUAD) {
                continue;
            }

            let points = item.location.points;
            let currentPageId = currentDoc.pages[editViewer.getCurrentPageIndex()];
            let pageData = await currentDoc.getPageData(currentPageId);
            documentPoints = points;

            // Map the detected coordinates from the display size to the mediaBox size.
            const polyOptions = {
                points: points.map(p => {
                    return {
                        x: p.x / pageData.display.width * pageData.mediaBox.width,
                        y: p.y / pageData.display.height * pageData.mediaBox.height
                    }
                }),
                borderColor: "rgb(0,0,255)",
                flags: {
                    print: false,
                    noView: false,
                    readOnly: false,
                }
            }

            let poly = Dynamsoft.DDV.annotationManager.createAnnotation(currentPageId, "polygon", polyOptions);
            poly['name'] = 'document';
            break;
        }
    });

  • Normalize the document image:

    normalizeDocumentButton.addEventListener('click', async () => {
        document.getElementById("document-detection").style.display = "none";

        let currentPageId = currentDoc.pages[editViewer.getCurrentPageIndex()];
        let blob = await normalizeImage();

        if (blob) {
            // Replace the current page with the normalized image.
            await currentDoc.updatePage(currentPageId, blob);
            documentPoints = null;
        }
    });

    async function normalizeImage() {
        if (!documentPoints) {
            return null;
        }

        // Use the detected corner points as the region of interest.
        let params = await cvRouter.getSimplifiedSettings("NormalizeDocument_Default");
        params.roi.points = documentPoints;
        params.roiMeasuredInPercentage = 0;
        await cvRouter.updateSettings("NormalizeDocument_Default", params);

        const settings = {
            quality: 100,
            saveAnnotation: false,
        };
        const image = await editViewer.currentDocument.saveToJpeg(editViewer.getCurrentPageIndex(), settings);

        cvRouter.maxCvsSideLength = 9999;
        const result = await cvRouter.capture(image, "NormalizeDocument_Default");

        for (let item of result.items) {
            if (item.type !== Dynamsoft.Core.EnumCapturedResultItemType.CRIT_NORMALIZED_IMAGE) {
                continue;
            }

            let blob = await item.toBlob();
            return blob;
        }
    }
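
The detection handler divides each coordinate by the display size and multiplies by the mediaBox size before creating the polygon annotation. If the corner points are later edited on the annotation, the inverse mapping would presumably be needed to bring them back into image coordinates for normalization. The sketch below isolates both directions as pure functions. The function names are hypothetical, and the display and mediaBox arguments only mirror the width and height fields used in the handler.

```javascript
// Hypothetical helpers. `display` and `mediaBox` are objects with numeric
// width and height, mirroring the page data fields used in the handler.

// Image (display) coordinates -> page (mediaBox) coordinates.
function displayToMediaBox(points, display, mediaBox) {
    return points.map(p => ({
        x: p.x / display.width * mediaBox.width,
        y: p.y / display.height * mediaBox.height,
    }));
}

// Page (mediaBox) coordinates -> image (display) coordinates.
function mediaBoxToDisplay(points, display, mediaBox) {
    return points.map(p => ({
        x: p.x / mediaBox.width * display.width,
        y: p.y / mediaBox.height * display.height,
    }));
}

// Example values chosen for illustration only.
const display = { width: 1000, height: 2000 };
const mediaBox = { width: 500, height: 1000 };
const onPage = displayToMediaBox([{ x: 100, y: 200 }], display, mediaBox);
console.log(onPage, mediaBoxToDisplay(onPage, display, mediaBox));
```

Keeping the two mappings side by side makes it harder to mix up the coordinate spaces when points travel between the annotation layer and the capture router.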

Source Code

https://github.com/yushulx/web-twain-document-scan-management/tree/main/examples/document_annotation