DocXMLater

A production-ready TypeScript/JavaScript library for creating, reading, and manipulating Microsoft Word (.docx) documents programmatically. It targets full OpenXML compliance, with extensive API coverage and a robust test suite. I got tired of half-implemented or expensive DOCX/XML frameworks, so I built one that does everything I needed.

README

docXMLater

A comprehensive, production-ready TypeScript/JavaScript framework for creating, reading, and manipulating Microsoft Word (.docx) documents programmatically.

Features

Core Document Operations

  • Create DOCX files from scratch
  • Read and modify existing DOCX files
  • Buffer-based operations (load/save from memory)
  • Document properties (core, extended, custom)
  • Memory management with dispose pattern
  • Bookmark pair validation and auto-repair (validateBookmarkPairs())
  • App.xml metadata preservation (HeadingPairs, TotalTime, etc.)
  • Document background color/theme support
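
Bookmark pair validation rests on a simple WordprocessingML rule: each w:bookmarkStart should be closed by a w:bookmarkEnd carrying the same w:id. The sketch below is not the library's validateBookmarkPairs() implementation. It is a minimal illustration over a hypothetical flat list of markers in document order; the BookmarkMarker type and findUnpairedBookmarks function are invented for this example.

```typescript
// Hypothetical, simplified marker model; not a docxmlater type.
type BookmarkMarker =
  | { kind: 'start'; id: string; name: string }
  | { kind: 'end'; id: string };

interface UnpairedReport {
  startsWithoutEnd: string[]; // ids that were opened but never closed
  endsWithoutStart: string[]; // ids that were closed without a preceding start
}

// Walks the markers in document order and reports ids that lack a partner.
function findUnpairedBookmarks(markers: BookmarkMarker[]): UnpairedReport {
  const open = new Set<string>();
  const endsWithoutStart: string[] = [];
  for (const marker of markers) {
    if (marker.kind === 'start') {
      open.add(marker.id);
    } else if (open.has(marker.id)) {
      open.delete(marker.id);
    } else {
      endsWithoutStart.push(marker.id);
    }
  }
  return { startsWithoutEnd: Array.from(open), endsWithoutStart };
}
```

A repair step could then drop the orphaned ends and append the missing ones, though how the library repairs them is not stated here.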

Text & Paragraph Formatting

  • Character formatting: bold, italic, underline, strikethrough, subscript, superscript
  • Font properties: family, size, color (RGB and theme colors), highlight
  • Text effects: small caps, all caps, shadow, emboss, engrave
  • Paragraph alignment, indentation, spacing, borders, shading
  • Text search and replace with regex support
  • Custom styles (paragraph, character, table)
  • CJK/East Asian paragraph properties (kinsoku, wordWrap, overflowPunct, topLinePunct)
  • Underline color and theme color attributes
  • Theme font references (asciiTheme, hAnsiTheme, eastAsiaTheme, csTheme)

Lists & Tables

  • Numbered lists (decimal, roman, alpha)
  • Bulleted lists with various bullet styles
  • Multi-level lists with custom numbering and restart control
  • Tables with formatting, borders, shading
  • Cell spanning (merge cells horizontally and vertically)
  • Advanced table properties (margins, widths, alignment)
  • Table navigation helpers (getFirstParagraph(), getLastParagraph())
  • Legacy horizontal merge (hMerge) support
  • Table layout parsing (fixed/auto)
  • Table style shading updates (modify styles.xml colors)
  • Cell content management (trailing blank removal with structure preservation)

Rich Content

  • Images (PNG, JPEG, GIF, SVG, EMF, WMF) with positioning, text wrapping, and full ECMA-376 DrawingML attribute coverage
  • Headers & footers (different first page, odd/even pages)
  • Hyperlinks (external URLs, internal bookmarks)
  • Hyperlink defragmentation utility (fixes fragmented links from Google Docs)
  • Hyperlink URL sanitization (strips browser extension prefixes from corrupted URLs)
  • Bookmarks and cross-references
  • Body-level bookmark support (bookmarks between block elements)
  • Shapes and text boxes

Advanced Features

  • Track changes (revisions for insertions, deletions, formatting)
  • Granular character-level tracked changes (text diff-based)
  • Comments and annotations
  • Compatibility mode detection and upgrade (Word 2003/2007/2010/2013+ modes)
  • Table of contents generation with customizable heading levels and relative indentation
  • Fields: merge fields, date/time, page numbers, TOC fields
  • Footnotes and endnotes (full round-trip with save pipeline, parsing, and clear API)
  • Content controls (Structured Document Tags)
  • Form field data preservation (text input, checkbox, dropdown per ECMA-376 §17.16)
  • w14 run effects passthrough (Word 2010+ ligatures, numForm, textOutline, etc.)
  • Expanded document settings (evenAndOddHeaders, mirrorMargins, autoHyphenation, decimalSymbol)
  • People.xml auto-registration for tracked changes authors
  • Style default attribute preservation (w:default="1")
  • Namespace order preservation in generated XML
  • Multiple sections with different page layouts
  • Page orientation, size, and margins
  • Preserved element round-trip (math equations, alternate content, custom XML)
  • Unified shading model with theme color support and inheritance resolution
  • Lossless image optimization (PNG re-compression, BMP-to-PNG conversion)
  • Run property change tracking (w:rPrChange) with direct API access
  • Paragraph mark revision tracking (w:del/w:ins in w:pPr/w:rPr) for full tracked-changes fidelity
  • Normal/NormalWeb style linking with preservation flags
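
The character-level tracked changes listed above are described as text diff-based. As a rough illustration of the idea, and not the library's algorithm, the sketch below trims the common prefix and suffix of two strings and reports the remaining middle as one deletion plus one insertion. The CharChange type and diffByAffix function are hypothetical names.

```typescript
interface CharChange {
  offset: number;   // index in the old text where the change starts
  deleted: string;  // characters removed from the old text
  inserted: string; // characters added in the new text
}

// Returns null when the texts are identical, otherwise a single change span.
function diffByAffix(oldText: string, newText: string): CharChange | null {
  if (oldText === newText) return null;
  const limit = Math.min(oldText.length, newText.length);

  // Length of the shared prefix.
  let prefix = 0;
  while (prefix < limit && oldText[prefix] === newText[prefix]) prefix++;

  // Length of the shared suffix, never overlapping the prefix.
  let suffix = 0;
  while (
    suffix < limit - prefix &&
    oldText[oldText.length - 1 - suffix] === newText[newText.length - 1 - suffix]
  ) {
    suffix++;
  }

  return {
    offset: prefix,
    deleted: oldText.slice(prefix, oldText.length - suffix),
    inserted: newText.slice(prefix, newText.length - suffix),
  };
}
```

For 'The quick fox' and 'The slow fox' this yields offset 4, deleted 'quick', inserted 'slow', so only the changed word would need revision markup rather than the whole run.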

Developer Tools

  • Complete XML generation and parsing (ReDoS-safe, position-based parser)
  • 40+ unit conversion functions (twips, EMUs, points, pixels, inches, cm)
  • Validation utilities and corruption detection
  • Text diff utility for character-level comparisons
  • webSettings.xml auto-generation
  • Safe OOXML parsing helpers (zero-value handling, boolean parsing)
  • Full TypeScript support with comprehensive type definitions
  • Error handling utilities
  • Logging infrastructure with multiple log levels
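
For readers new to OOXML units, the conversions behind helpers like these come from fixed ratios: 1 inch = 1440 twips = 72 points = 914400 EMUs, and 1 cm = 360000 EMUs. The sketch below shows a few of them. The function names are hypothetical and are not the library's exported names, and the pixel conversion assumes 96 DPI unless told otherwise.

```typescript
const TWIPS_PER_INCH = 1440;
const TWIPS_PER_POINT = 20;   // 1440 twips / 72 points
const EMUS_PER_INCH = 914400;
const EMUS_PER_CM = 360000;

function pointsToTwips(points: number): number {
  return Math.round(points * TWIPS_PER_POINT);
}

function twipsToPoints(twips: number): number {
  return twips / TWIPS_PER_POINT;
}

function inchesToTwips(inches: number): number {
  return Math.round(inches * TWIPS_PER_INCH);
}

function cmToEmus(cm: number): number {
  return Math.round(cm * EMUS_PER_CM);
}

// Pixels only have a physical size relative to a resolution; 96 DPI is assumed by default.
function pixelsToEmus(pixels: number, dpi: number = 96): number {
  return Math.round((pixels * EMUS_PER_INCH) / dpi);
}
```

Rounding to integers matters because twip and EMU attributes in the XML are whole numbers.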

Installation

npm install docxmlater

Quick Start

Creating a New Document

import { Document } from 'docxmlater';

// Create a new document
const doc = Document.create();

// Add a paragraph
const para = doc.createParagraph();
para.addText('Hello, World!', { bold: true, fontSize: 24 });

// Save to file
await doc.save('hello.docx');

// Don't forget to dispose
doc.dispose();
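
If save() throws, the dispose() call at the end of the example above never runs. One way to guard against that is a try/finally wrapper. The helper below is a hypothetical sketch, not part of docxmlater: withDisposable and HasDispose are invented names, and it works with any object that exposes a dispose() method.

```typescript
// Anything with a dispose() method qualifies; this is not a docxmlater interface.
interface HasDispose {
  dispose(): void;
}

// Runs `work` with the resource and always disposes it, even if `work` throws.
async function withDisposable<T extends HasDispose, R>(
  resource: T,
  work: (resource: T) => Promise<R> | R,
): Promise<R> {
  try {
    return await work(resource);
  } finally {
    resource.dispose();
  }
}

// Usage with the calls shown in this README (left commented out so the sketch
// stays self-contained):
//
// await withDisposable(Document.create(), async (doc) => {
//   doc.createParagraph().addText('Hello, World!', { bold: true, fontSize: 24 });
//   await doc.save('hello.docx');
// });
```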

Loading and Modifying Documents

import { Document } from 'docxmlater';

// Load existing document
const doc = await Document.load('input.docx');

// Find and replace text
doc.replaceText(/old text/g, 'new text');

// Add a new paragraph
const para = doc.createParagraph();
para.addText('Added paragraph', { italic: true });

// Save modifications
await doc.save('output.docx');
doc.dispose();

Working with Tables

import { Document } from 'docxmlater';

const doc = Document.create();

// Create a 3x4 table
const table = doc.createTable(3, 4);

// Set header row
const headerRow = table.getRow(0);
headerRow.getCell(0).addParagraph().addText('Column 1', { bold: true });
headerRow.getCell(1).addParagraph().addText('Column 2', { bold: true });
headerRow.getCell(2).addParagraph().addText('Column 3', { bold: true });
headerRow.getCell(3).addParagraph().addText('Column 4', { bold: true });

// Add data
table.getRow(1).getCell(0).addParagraph().addText('Data 1');
table.getRow(1).getCell(1).addParagraph().addText('Data 2');

// Apply borders
table.setBorders({
  top: { style: 'single', size: 4, color: '000000' },
  bottom: { style: 'single', size: 4, color: '000000' },
  left: { style: 'single', size: 4, color: '000000' },
  right: { style: 'single', size: 4, color: '000000' },
  insideH: { style: 'single', size: 4, color: '000000' },
  insideV: { style: 'single', size: 4, color: '000000' },
});

await doc.save('table.docx');
doc.dispose();
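
The setBorders() call above repeats the same border on all six sides. A small helper can build that object once. This is a sketch under assumptions: BorderSpec, BorderSide, and uniformBorders are hypothetical names, and the object shape is copied from the example rather than from the library's type definitions.

```typescript
// Shape taken from the setBorders() example; the type names are invented here.
interface BorderSpec {
  style: string;
  size: number;
  color: string; // hex RGB without '#', as in the example
}

type BorderSide = 'top' | 'bottom' | 'left' | 'right' | 'insideH' | 'insideV';

// Builds an object with an independent copy of `spec` for every side.
function uniformBorders(spec: BorderSpec): Record<BorderSide, BorderSpec> {
  const sides: BorderSide[] = ['top', 'bottom', 'left', 'right', 'insideH', 'insideV'];
  const result = {} as Record<BorderSide, BorderSpec>;
  for (const side of sides) {
    result[side] = { ...spec };
  }
  return result;
}

// Intended use, assuming setBorders() accepts a plain object of this shape:
// table.setBorders(uniformBorders({ style: 'single', size: 4, color: '000000' }));
```

In WordprocessingML, border width (w:sz) is measured in eighths of a point, so a value of 4 would be half a point. Whether the library's size field maps directly onto that attribute is an assumption.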

Adding Images

import { Document } from 'docxmlater';
import { readFileSync } from 'fs';

const doc = Document.create();

// Load image from file
const imageBuffer = readFileSync('photo.jpg');

// Add image to document
const para = doc.createParagraph();
await para.addImage(imageBuffer, {
  width: 400,
  height: 300,
  format: 'jpg',
});

await doc.save('with-image.docx');
doc.dispose();

Hyperlink Management

import { Document } from 'docxmlater';

const doc = await Document.load('document.docx');

// Get all hyperlinks
const hyperlinks = doc.getHyperlinks();
console.log(`Found ${hyperlinks.length} hyperlinks`);

// Update URLs in batch (30-50% faster than manual iteration)
doc.updateHyperlinkUrls('http://old-domain.com', 'https://new-domain.com');

// Fix fragmented hyperlinks from Google Docs
const mergedCount = doc.defragmentHyperlinks({
  resetFormatting: true, // Fix corrupted fonts
});
console.log(`Merged ${mergedCount} fragmented hyperlinks`);

await doc.save('updated.docx');
doc.dispose();
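
To show what defragmentation means conceptually, the sketch below merges adjacent link fragments that point at the same URL. It is a toy model, not the library's defragmentHyperlinks() implementation: LinkFragment and mergeAdjacentLinks are hypothetical, the input is assumed to contain only consecutive hyperlinks, and the count here is the number of fragments folded into a predecessor, which may differ from what the library reports.

```typescript
interface LinkFragment {
  url: string;
  text: string;
}

// Folds each fragment into the previous one when both share the same URL.
function mergeAdjacentLinks(fragments: LinkFragment[]): {
  merged: LinkFragment[];
  mergedCount: number;
} {
  const merged: LinkFragment[] = [];
  let mergedCount = 0;
  for (const fragment of fragments) {
    const last = merged[merged.length - 1];
    if (last !== undefined && last.url === fragment.url) {
      last.text += fragment.text; // extend the previous link instead of adding a new one
      mergedCount++;
    } else {
      merged.push({ ...fragment }); // copy so the input array is left untouched
    }
  }
  return { merged, mergedCount };
}
```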

Custom Styles

import { Document, Style } from 'docxmlater';

const doc = Document.create();

// Create custom paragraph style
const customStyle = new Style('CustomHeading', 'paragraph');
customStyle.setName('Custom Heading');
customStyle.setRunFormatting({
  bold: true,
  fontSize: 32,
  color: '0070C0',
});
customStyle.setParagraphFormatting({
  alignment: 'center',
  spacingAfter: 240,
});

// Add style to document
doc.getStylesManager().addStyle(customStyle);

// Apply style to paragraph
const para = doc.createParagraph();
para.addText('Styled Heading');
para.applyStyle('CustomHeading');

await doc.save('styled.docx');
doc.dispose();

Compatibility Mode Detection and Upgrade

import { Document, CompatibilityMode } from 'docxmlater';

const doc = await Document.load('legacy.docx');

// Check compatibility mode
console.log(`Mode: ${doc.getCompatibilityMode()}`); // e.g., 12 (Word 2007)

if (doc.isCompatibilityMode()) {
  // Get detailed compatibility info
  const info = doc.getCompatibilityInfo();
  console.log(`Legacy flags: ${info.legacyFlags.length}`);

  // Upgrade to Word 2013+ mode (equivalent to File > Info > Convert)
  const report = doc.upgradeToModernFormat();
  console.log(`Removed ${report.removedFlags.length} legacy flags`);
  console.log(`Added ${report.addedSettings.length} modern settings`);
}

await doc.save('modern.docx');
doc.dispose();
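
The numeric mode in the example above corresponds to the compatibilityMode setting that Word writes to settings.xml: 11 for Word 2003, 12 for Word 2007, 14 for Word 2010, and 15 for Word 2013 and later. A small lookup makes log output easier to read. The function below is a hypothetical helper, and it assumes getCompatibilityMode() returns that number, as the comment in the example suggests.

```typescript
// Maps a compatibilityMode value to a human-readable Word version label.
function describeCompatibilityMode(mode: number): string {
  switch (mode) {
    case 11:
      return 'Word 2003';
    case 12:
      return 'Word 2007';
    case 14:
      return 'Word 2010';
    case 15:
      return 'Word 2013+';
    default:
      return `Unknown (${mode})`;
  }
}

// Intended use, assuming a numeric return value:
// console.log(describeCompatibilityMode(doc.getCompatibilityMode()));
```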

API Overview

Document Class

Creation & Loading:

  • Document.create(options?) - Create new document
  • Document.load(filepath, options?) - Load from file
  • Document.loadFromBuffer(buffer, options?) - Load from memory

Handling Tracked Changes:

By default, docXMLater accepts all tracked changes during document loading to prevent corruption:

// Default: accepts all changes (recommended)
const doc = await Document.load('document.docx');

// Explicit control: set revisionHandling to exactly one of the three values
const explicitDoc = await Document.load('document.docx', {
  revisionHandling: 'accept', // Accept all changes (default)
  // revisionHandling: 'strip',    // Remove all revision markup
  // revisionHandling: 'preserve', // Keep tracked changes; this should not corrupt the file, but it can. Report any errors you find.
});

Revision Handling Options:

  • 'accept' (default): Removes revision markup, keeps inserted content, removes deleted content
  • 'strip': Removes all revision markup completely
  • 'preserve': Keeps tracked changes as-is (may cause Word "unreadable content" errors)
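
The 'accept' behavior described above can be pictured on a toy model of runs: inserted text stays, deleted text goes, and the revision metadata disappears. The sketch below only illustrates that rule. The Run type and acceptAll function are hypothetical, and the library operates on the real XML rather than on a structure like this.

```typescript
// Toy model of a paragraph's runs; not a docxmlater type.
type Run =
  | { kind: 'plain'; text: string }
  | { kind: 'inserted'; text: string; author: string }
  | { kind: 'deleted'; text: string; author: string };

// Accepting all changes keeps plain and inserted text and drops deleted text.
function acceptAll(runs: Run[]): string {
  return runs
    .filter((run) => run.kind !== 'deleted')
    .map((run) => run.text)
    .join('');
}
```

With the runs 'Hello ', a deleted 'cruel ', an inserted 'kind ', and 'world', acceptAll returns 'Hello kind world'.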

Why Accept By Default?

Documents with tracked changes can cause Word corruption errors during round-trip processing due to r
