docXMLater

A comprehensive, production-ready TypeScript/JavaScript framework for creating, reading, and manipulating Microsoft Word (.docx) documents programmatically.

Features

Core Document Operations

Create DOCX files from scratch
Read and modify existing DOCX files
Buffer-based operations (load/save from memory)
Document properties (core, extended, custom)
Memory management with dispose pattern
Bookmark pair validation and auto-repair (validateBookmarkPairs())
App.xml metadata preservation (HeadingPairs, TotalTime, etc.)
Document background color/theme support

Text & Paragraph Formatting

Character formatting: bold, italic, underline, strikethrough, subscript, superscript
Font properties: family, size, color (RGB and theme colors), highlight
Text effects: small caps, all caps, shadow, emboss, engrave
Paragraph alignment, indentation, spacing, borders, shading
Text search and replace with regex support
Custom styles (paragraph, character, table)
CJK/East Asian paragraph properties (kinsoku, wordWrap, overflowPunct, topLinePunct)
Underline color and theme color attributes
Theme font references (asciiTheme, hAnsiTheme, eastAsiaTheme, csTheme)

Lists & Tables

Numbered lists (decimal, roman, alpha)
Bulleted lists with various bullet styles
Multi-level lists with custom numbering and restart control
Tables with formatting, borders, shading
Cell spanning (merge cells horizontally and vertically)
Advanced table properties (margins, widths, alignment)
Table navigation helpers (getFirstParagraph(), getLastParagraph())
Legacy horizontal merge (hMerge) support
Table layout parsing (fixed/auto)
Table style shading updates (modify styles.xml colors)
Cell content management (trailing blank removal with structure preservation)

Rich Content

Images (PNG, JPEG, GIF, SVG, EMF, WMF) with positioning, text wrapping, and full ECMA-376 DrawingML attribute coverage
Headers & footers (different first page, odd/even pages)
Hyperlinks (external URLs, internal bookmarks)
Hyperlink defragmentation utility (fixes fragmented links from Google Docs)
Hyperlink URL sanitization (strips browser extension prefixes from corrupted URLs)
Bookmarks and cross-references
Body-level bookmark support (bookmarks between block elements)
Shapes and text boxes

Advanced Features

Track changes (revisions for insertions, deletions, formatting)
Granular character-level tracked changes (text diff-based)
Comments and annotations
Compatibility mode detection and upgrade (Word 2003/2007/2010/2013+ modes)
Table of contents generation with customizable heading levels and relative indentation
Fields: merge fields, date/time, page numbers, TOC fields
Footnotes and endnotes (full round-trip with save pipeline, parsing, and clear API)
Content controls (Structured Document Tags)
Form field data preservation (text input, checkbox, dropdown per ECMA-376 §17.16)
w14 run effects passthrough (Word 2010+ ligatures, numForm, textOutline, etc.)
Expanded document settings (evenAndOddHeaders, mirrorMargins, autoHyphenation, decimalSymbol)
People.xml auto-registration for tracked changes authors
Style default attribute preservation (w:default="1")
Namespace order preservation in generated XML
Multiple sections with different page layouts
Page orientation, size, and margins
Preserved element round-trip (math equations, alternate content, custom XML)
Unified shading model with theme color support and inheritance resolution
Lossless image optimization (PNG re-compression, BMP-to-PNG conversion)
Run property change tracking (w:rPrChange) with direct API access
Paragraph mark revision tracking (w:del/w:ins in w:pPr/w:rPr) for full tracked-changes fidelity
Normal/NormalWeb style linking with preservation flags

Developer Tools

Complete XML generation and parsing (ReDoS-safe, position-based parser)
40+ unit conversion functions (twips, EMUs, points, pixels, inches, cm)
Validation utilities and corruption detection
Text diff utility for character-level comparisons
webSettings.xml auto-generation
Safe OOXML parsing helpers (zero-value handling, boolean parsing)
Full TypeScript support with comprehensive type definitions
Error handling utilities
Logging infrastructure with multiple log levels
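
The unit conversions listed above all derive from a few fixed ratios: 1 inch equals 72 points, 1,440 twips, 914,400 EMUs, or 2.54 cm. As a rough sketch of the arithmetic only (the function names below are hypothetical and are not docxmlater's exports; the 96 DPI default for pixels is also an assumption):

```typescript
// Hypothetical helpers for illustration; docxmlater's own function names may differ.
const TWIPS_PER_INCH = 1440;
const TWIPS_PER_POINT = 20;
const EMUS_PER_INCH = 914400;
const CM_PER_INCH = 2.54;

function pointsToTwips(points: number): number {
  return Math.round(points * TWIPS_PER_POINT);
}

function twipsToPoints(twips: number): number {
  return twips / TWIPS_PER_POINT;
}

function inchesToEmus(inches: number): number {
  return Math.round(inches * EMUS_PER_INCH);
}

function cmToTwips(cm: number): number {
  return Math.round((cm / CM_PER_INCH) * TWIPS_PER_INCH);
}

// Pixels have no fixed physical size; 96 DPI is assumed unless the caller says otherwise.
function pixelsToEmus(pixels: number, dpi = 96): number {
  return Math.round((pixels * EMUS_PER_INCH) / dpi);
}
```

For example, pointsToTwips(12) gives 240 and pixelsToEmus(96) gives 914400, which is one inch.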

Installation

npm install docxmlater

Quick Start

Creating a New Document

import { Document } from 'docxmlater';

// Create a new document
const doc = Document.create();

// Add a paragraph
const para = doc.createParagraph();
para.addText('Hello, World!', { bold: true, fontSize: 24 });

// Save to file
await doc.save('hello.docx');

// Don't forget to dispose
doc.dispose();
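
If save() throws, the dispose() call above is never reached. One way to guard against that is a try/finally wrapper. The helper below is a hypothetical sketch, not part of docxmlater; it assumes only that the object exposes a dispose() method, as Document does in the example above.

```typescript
// Hypothetical helper, not part of docxmlater. It relies only on a dispose()
// method, which is the one assumption taken from the example above.
interface HasDispose {
  dispose(): void;
}

async function withDisposable<T extends HasDispose, R>(
  resource: T,
  work: (resource: T) => Promise<R> | R,
): Promise<R> {
  try {
    return await work(resource);
  } finally {
    // Runs whether `work` succeeded or threw, so the resource is always released.
    resource.dispose();
  }
}
```

With this in place, the example could be written as withDisposable(Document.create(), async (doc) => { ...; await doc.save('hello.docx'); }), and the document is disposed even if saving fails.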

Loading and Modifying Documents

import { Document } from 'docxmlater';

// Load existing document
const doc = await Document.load('input.docx');

// Find and replace text
doc.replaceText(/old text/g, 'new text');

// Add a new paragraph
const para = doc.createParagraph();
para.addText('Added paragraph', { italic: true });

// Save modifications
await doc.save('output.docx');
doc.dispose();

Working with Tables

import { Document } from 'docxmlater';

const doc = Document.create();

// Create a 3x4 table
const table = doc.createTable(3, 4);

// Set header row
const headerRow = table.getRow(0);
headerRow.getCell(0).addParagraph().addText('Column 1', { bold: true });
headerRow.getCell(1).addParagraph().addText('Column 2', { bold: true });
headerRow.getCell(2).addParagraph().addText('Column 3', { bold: true });
headerRow.getCell(3).addParagraph().addText('Column 4', { bold: true });

// Add data
table.getRow(1).getCell(0).addParagraph().addText('Data 1');
table.getRow(1).getCell(1).addParagraph().addText('Data 2');

// Apply borders
table.setBorders({
  top: { style: 'single', size: 4, color: '000000' },
  bottom: { style: 'single', size: 4, color: '000000' },
  left: { style: 'single', size: 4, color: '000000' },
  right: { style: 'single', size: 4, color: '000000' },
  insideH: { style: 'single', size: 4, color: '000000' },
  insideV: { style: 'single', size: 4, color: '000000' },
});

await doc.save('table.docx');
doc.dispose();
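
The four header-cell lines above repeat the same call chain, so a small loop helper can shorten them. The sketch below is hypothetical and not part of docxmlater. It assumes only the getCell(), addParagraph(), and addText() calls used in the example, modeled here as minimal interfaces rather than the library's real types.

```typescript
// Minimal structural types covering only the calls used in the table example.
// These are assumptions for this sketch, not docxmlater's real type definitions.
interface TextFormatting {
  bold?: boolean;
}
interface ParagraphLike {
  addText(text: string, formatting?: TextFormatting): unknown;
}
interface CellLike {
  addParagraph(): ParagraphLike;
}
interface RowLike {
  getCell(index: number): CellLike;
}

// Hypothetical helper: writes one value per cell, left to right.
function fillRow(row: RowLike, values: string[], formatting?: TextFormatting): void {
  values.forEach((value, index) => {
    row.getCell(index).addParagraph().addText(value, formatting);
  });
}
```

Provided the library's row type is compatible with RowLike, the header row could then be filled with fillRow(table.getRow(0), ['Column 1', 'Column 2', 'Column 3', 'Column 4'], { bold: true }).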

Adding Images

import { Document } from 'docxmlater';
import { readFileSync } from 'fs';

const doc = Document.create();

// Load image from file
const imageBuffer = readFileSync('photo.jpg');

// Add image to document
const para = doc.createParagraph();
await para.addImage(imageBuffer, {
  width: 400,
  height: 300,
  format: 'jpg',
});

await doc.save('with-image.docx');
doc.dispose();
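
The fixed width and height above suit a 4:3 image; other aspect ratios would be stretched. One option is to compute the target size from the image's intrinsic dimensions first. This is a sketch in plain arithmetic: fitWithin is a hypothetical helper, not a docxmlater function, and it assumes you already know the source dimensions.

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: scales `original` down to fit inside `bounds` while
// keeping its aspect ratio. Images that already fit are returned unscaled.
function fitWithin(original: Size, bounds: Size): Size {
  const scale = Math.min(
    bounds.width / original.width,
    bounds.height / original.height,
    1,
  );
  return {
    width: Math.round(original.width * scale),
    height: Math.round(original.height * scale),
  };
}
```

The resulting width and height can then be passed in the options object given to addImage().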

Hyperlink Management

import { Document } from 'docxmlater';

const doc = await Document.load('document.docx');

// Get all hyperlinks
const hyperlinks = doc.getHyperlinks();
console.log(`Found ${hyperlinks.length} hyperlinks`);

// Update URLs in batch (30-50% faster than manual iteration)
doc.updateHyperlinkUrls('http://old-domain.com', 'https://new-domain.com');

// Fix fragmented hyperlinks from Google Docs
const mergedCount = doc.defragmentHyperlinks({
  resetFormatting: true, // Fix corrupted fonts
});
console.log(`Merged ${mergedCount} fragmented hyperlinks`);

await doc.save('updated.docx');
doc.dispose();
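
To illustrate what defragmentation means, the toy model below merges adjacent link fragments that point at the same URL. It is only a conceptual sketch: LinkFragment and mergeAdjacentLinks are hypothetical names, and docxmlater's actual implementation works on the document XML and may behave differently.

```typescript
// Hypothetical, simplified stand-in for one hyperlink fragment in a paragraph.
interface LinkFragment {
  url: string;
  text: string;
}

// Merges consecutive fragments that share a URL and reports how many
// fragments were folded into their predecessor.
function mergeAdjacentLinks(
  fragments: LinkFragment[],
): { merged: LinkFragment[]; mergedCount: number } {
  const merged: LinkFragment[] = [];
  let mergedCount = 0;
  for (const fragment of fragments) {
    const last = merged[merged.length - 1];
    if (last !== undefined && last.url === fragment.url) {
      last.text += fragment.text; // extend the previous link instead of starting a new one
      mergedCount += 1;
    } else {
      merged.push({ ...fragment }); // copy so the input array is not mutated
    }
  }
  return { merged, mergedCount };
}
```

In this model, a link split into "Exam" and "ple" with the same URL collapses into a single "Example" link, and mergedCount is 1.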

Custom Styles

import { Document, Style } from 'docxmlater';

const doc = Document.create();

// Create custom paragraph style
const customStyle = new Style('CustomHeading', 'paragraph');
customStyle.setName('Custom Heading');
customStyle.setRunFormatting({
  bold: true,
  fontSize: 32,
  color: '0070C0',
});
customStyle.setParagraphFormatting({
  alignment: 'center',
  spacingAfter: 240,
});

// Add style to document
doc.getStylesManager().addStyle(customStyle);

// Apply style to paragraph
const para = doc.createParagraph();
para.addText('Styled Heading');
para.applyStyle('CustomHeading');

await doc.save('styled.docx');
doc.dispose();

Compatibility Mode Detection and Upgrade

import { Document, CompatibilityMode } from 'docxmlater';

const doc = await Document.load('legacy.docx');

// Check compatibility mode
console.log(`Mode: ${doc.getCompatibilityMode()}`); // e.g., 12 (Word 2007)

if (doc.isCompatibilityMode()) {
  // Get detailed compatibility info
  const info = doc.getCompatibilityInfo();
  console.log(`Legacy flags: ${info.legacyFlags.length}`);

  // Upgrade to Word 2013+ mode (equivalent to File > Info > Convert)
  const report = doc.upgradeToModernFormat();
  console.log(`Removed ${report.removedFlags.length} legacy flags`);
  console.log(`Added ${report.addedSettings.length} modern settings`);
}

await doc.save('modern.docx');
doc.dispose();
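
In settings.xml, the compatibilityMode setting uses 11, 12, 14, and 15 for Word 2003, 2007, 2010, and 2013+ respectively. A small lookup can make log output more readable. describeCompatibilityMode is a hypothetical helper, not part of docxmlater, and it assumes getCompatibilityMode() returns the raw number, as the comment in the example suggests.

```typescript
// Hypothetical helper: maps a compatibilityMode number to a readable label.
function describeCompatibilityMode(mode: number): string {
  switch (mode) {
    case 11:
      return 'Word 2003';
    case 12:
      return 'Word 2007';
    case 14:
      return 'Word 2010';
    case 15:
      return 'Word 2013+';
    default:
      // Values outside the known set are reported rather than guessed.
      return `Unknown (${mode})`;
  }
}
```

Under that assumption, the log line could read console.log(`Mode: ${describeCompatibilityMode(doc.getCompatibilityMode())}`).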

API Overview

Document Class

Creation & Loading:

Document.create(options?) - Create new document
Document.load(filepath, options?) - Load from file
Document.loadFromBuffer(buffer, options?) - Load from memory

Handling Tracked Changes:

By default, docXMLater accepts all tracked changes during document loading to prevent corruption:

// Default: accepts all changes (recommended)
const doc = await Document.load('document.docx');

// Explicit control: pass exactly one revisionHandling value
const explicitDoc = await Document.load('document.docx', {
  revisionHandling: 'accept', // Accept all changes (default)
  // revisionHandling: 'strip',    // Remove all revision markup
  // revisionHandling: 'preserve', // Keep tracked changes as-is; this should be safe, but please report any corruption you find
});

Revision Handling Options:

'accept' (default): Removes revision markup, keeps inserted content, removes deleted content
'strip': Removes all revision markup completely
'preserve': Keeps tracked changes as-is (may cause Word "unreadable content" errors)
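
As a conceptual illustration of the 'accept' behavior described above, the toy model below represents a paragraph as plain, inserted, and deleted segments. Segment and acceptAll are hypothetical names for this sketch only; the library itself operates on revision markup in the document XML.

```typescript
// Hypothetical, simplified model of a paragraph containing tracked changes.
type Segment = {
  kind: 'plain' | 'inserted' | 'deleted';
  text: string;
};

// Accepting all changes keeps plain and inserted text and drops deleted text.
function acceptAll(segments: Segment[]): string {
  return segments
    .filter((segment) => segment.kind !== 'deleted')
    .map((segment) => segment.text)
    .join('');
}
```

For instance, the segments "The " (plain), "slow " (deleted), "quick " (inserted), and "fox" (plain) resolve to "The quick fox".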

Why Accept By Default?

Documents with tracked changes can cause Word corruption errors during round-trip processing due to r
