DocXMLater
A production-ready TypeScript/JavaScript library for creating, reading, and manipulating Microsoft Word (.docx) documents programmatically, with full OpenXML compliance, extensive API coverage, and a robust test suite. I got tired of half-implemented or expensive DOCX/XML frameworks, so I built one that does everything I needed.
docXMLater
A comprehensive, production-ready TypeScript/JavaScript framework for creating, reading, and manipulating Microsoft Word (.docx) documents programmatically.
Features
Core Document Operations
- Create DOCX files from scratch
- Read and modify existing DOCX files
- Buffer-based operations (load/save from memory)
- Document properties (core, extended, custom)
- Memory management with dispose pattern
- Bookmark pair validation and auto-repair (validateBookmarkPairs())
- App.xml metadata preservation (HeadingPairs, TotalTime, etc.)
- Document background color/theme support
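Bookmark pair validation can be pictured as a matching problem: in WordprocessingML, every w:bookmarkStart is expected to have a w:bookmarkEnd carrying the same w:id. The sketch below is only an illustration of that idea on a flattened list of markers. The names BookmarkMarker and findUnpairedBookmarks are hypothetical and are not part of the docxmlater API; how validateBookmarkPairs() works internally is not described in this README.

```typescript
// Hypothetical model: one entry per bookmark marker, in document order.
type BookmarkMarker = { kind: 'start' | 'end'; id: string };

interface PairReport {
  missingEnd: string[];   // ids that have a start marker but no end marker
  missingStart: string[]; // ids that have an end marker but no start marker
}

function findUnpairedBookmarks(markers: BookmarkMarker[]): PairReport {
  const open: string[] = [];         // start ids still waiting for their end
  const missingStart: string[] = [];
  for (const m of markers) {
    if (m.kind === 'start') {
      open.push(m.id);
      continue;
    }
    const idx = open.indexOf(m.id);
    if (idx >= 0) {
      open.splice(idx, 1);           // matched pair, stop tracking it
    } else {
      missingStart.push(m.id);       // end marker without a preceding start
    }
  }
  return { missingEnd: open, missingStart };
}

const report = findUnpairedBookmarks([
  { kind: 'start', id: '0' },
  { kind: 'end', id: '0' },
  { kind: 'start', id: '1' },
  { kind: 'end', id: '2' },
]);
console.log(report);
```

An auto-repair step could then add the missing markers or drop the orphans; which of the two is appropriate depends on the document, so the sketch stops at detection.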
Text & Paragraph Formatting
- Character formatting: bold, italic, underline, strikethrough, subscript, superscript
- Font properties: family, size, color (RGB and theme colors), highlight
- Text effects: small caps, all caps, shadow, emboss, engrave
- Paragraph alignment, indentation, spacing, borders, shading
- Text search and replace with regex support
- Custom styles (paragraph, character, table)
- CJK/East Asian paragraph properties (kinsoku, wordWrap, overflowPunct, topLinePunct)
- Underline color and theme color attributes
- Theme font references (asciiTheme, hAnsiTheme, eastAsiaTheme, csTheme)
Lists & Tables
- Numbered lists (decimal, roman, alpha)
- Bulleted lists with various bullet styles
- Multi-level lists with custom numbering and restart control
- Tables with formatting, borders, shading
- Cell spanning (merge cells horizontally and vertically)
- Advanced table properties (margins, widths, alignment)
- Table navigation helpers (getFirstParagraph(), getLastParagraph())
- Legacy horizontal merge (hMerge) support
- Table layout parsing (fixed/auto)
- Table style shading updates (modify styles.xml colors)
- Cell content management (trailing blank removal with structure preservation)
Rich Content
- Images (PNG, JPEG, GIF, SVG, EMF, WMF) with positioning, text wrapping, and full ECMA-376 DrawingML attribute coverage
- Headers & footers (different first page, odd/even pages)
- Hyperlinks (external URLs, internal bookmarks)
- Hyperlink defragmentation utility (fixes fragmented links from Google Docs)
- Hyperlink URL sanitization (strips browser extension prefixes from corrupted URLs)
- Bookmarks and cross-references
- Body-level bookmark support (bookmarks between block elements)
- Shapes and text boxes
Advanced Features
- Track changes (revisions for insertions, deletions, formatting)
- Granular character-level tracked changes (text diff-based)
- Comments and annotations
- Compatibility mode detection and upgrade (Word 2003/2007/2010/2013+ modes)
- Table of contents generation with customizable heading levels and relative indentation
- Fields: merge fields, date/time, page numbers, TOC fields
- Footnotes and endnotes (full round-trip with save pipeline, parsing, and clear API)
- Content controls (Structured Document Tags)
- Form field data preservation (text input, checkbox, dropdown per ECMA-376 §17.16)
- w14 run effects passthrough (Word 2010+ ligatures, numForm, textOutline, etc.)
- Expanded document settings (evenAndOddHeaders, mirrorMargins, autoHyphenation, decimalSymbol)
- People.xml auto-registration for tracked changes authors
- Style default attribute preservation (w:default="1")
- Namespace order preservation in generated XML
- Multiple sections with different page layouts
- Page orientation, size, and margins
- Preserved element round-trip (math equations, alternate content, custom XML)
- Unified shading model with theme color support and inheritance resolution
- Lossless image optimization (PNG re-compression, BMP-to-PNG conversion)
- Run property change tracking (w:rPrChange) with direct API access
- Paragraph mark revision tracking (w:del/w:ins in w:pPr/w:rPr) for full tracked-changes fidelity
- Normal/NormalWeb style linking with preservation flags
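To give a feel for what "text diff-based" character-level tracked changes means, here is a minimal sketch that splits an edit into an unchanged prefix, a deleted middle, an inserted middle, and an unchanged suffix. It uses common prefix/suffix trimming, which is a much simpler stand-in for a full diff algorithm and is not claimed to be what docxmlater does. The names CharDiff and diffByAffix are hypothetical.

```typescript
interface CharDiff {
  prefix: string;   // unchanged leading text
  deleted: string;  // text present only in the old version
  inserted: string; // text present only in the new version
  suffix: string;   // unchanged trailing text
}

// Compares UTF-16 code units, so a surrogate pair could in principle be split.
function diffByAffix(oldText: string, newText: string): CharDiff {
  const maxPrefix = Math.min(oldText.length, newText.length);
  let p = 0;
  while (p < maxPrefix && oldText[p] === newText[p]) p++;

  // The suffix may not overlap the prefix in the shorter string.
  const maxSuffix = maxPrefix - p;
  let s = 0;
  while (
    s < maxSuffix &&
    oldText[oldText.length - 1 - s] === newText[newText.length - 1 - s]
  ) {
    s++;
  }

  return {
    prefix: oldText.slice(0, p),
    deleted: oldText.slice(p, oldText.length - s),
    inserted: newText.slice(p, newText.length - s),
    suffix: oldText.slice(oldText.length - s),
  };
}

console.log(diffByAffix('The quick fox', 'The slow fox'));
```

In a tracked-changes setting, the deleted part would map to a deletion revision and the inserted part to an insertion revision, while the prefix and suffix stay as ordinary runs.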
Developer Tools
- Complete XML generation and parsing (ReDoS-safe, position-based parser)
- 40+ unit conversion functions (twips, EMUs, points, pixels, inches, cm)
- Validation utilities and corruption detection
- Text diff utility for character-level comparisons
- webSettings.xml auto-generation
- Safe OOXML parsing helpers (zero-value handling, boolean parsing)
- Full TypeScript support with comprehensive type definitions
- Error handling utilities
- Logging infrastructure with multiple log levels
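The unit conversions mentioned in the list above rest on a few fixed ratios used throughout OOXML: 1 inch = 72 points = 1440 twips = 914400 EMUs, and 1 cm = 360000 EMUs. The sketch below shows those relationships directly. The function names are hypothetical, since this README does not list the names of the library's own conversion functions, and the 96 DPI default for pixels is an assumption.

```typescript
const TWIPS_PER_POINT = 20;
const TWIPS_PER_INCH = 1440;
const EMUS_PER_INCH = 914400;
const EMUS_PER_POINT = 12700;
const EMUS_PER_CM = 360000;

function pointsToTwips(points: number): number {
  return Math.round(points * TWIPS_PER_POINT);
}

function twipsToPoints(twips: number): number {
  return twips / TWIPS_PER_POINT;
}

function inchesToTwips(inches: number): number {
  return Math.round(inches * TWIPS_PER_INCH);
}

function pointsToEmus(points: number): number {
  return Math.round(points * EMUS_PER_POINT);
}

function inchesToEmus(inches: number): number {
  return Math.round(inches * EMUS_PER_INCH);
}

function cmToEmus(cm: number): number {
  return Math.round(cm * EMUS_PER_CM);
}

// Pixels only have a physical size once a resolution is chosen (assumed 96 DPI).
function pixelsToEmus(pixels: number, dpi: number = 96): number {
  return Math.round((pixels / dpi) * EMUS_PER_INCH);
}

console.log(pointsToTwips(12)); // 240
console.log(inchesToEmus(1));   // 914400
console.log(pixelsToEmus(1));   // 9525
```

Rounding to whole numbers matters because these values are stored as integers in the document XML.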
Installation
npm install docxmlater
Quick Start
Creating a New Document
import { Document } from 'docxmlater';
// Create a new document
const doc = Document.create();
// Add a paragraph
const para = doc.createParagraph();
para.addText('Hello, World!', { bold: true, fontSize: 24 });
// Save to file
await doc.save('hello.docx');
// Don't forget to dispose
doc.dispose();
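In the sample above, dispose() is skipped if save() throws. One way to guard against that is a small try/finally wrapper. The following is a generic sketch that works with any object exposing a dispose() method; HasDispose and withDisposable are hypothetical names and are not provided by docxmlater.

```typescript
// Anything with a dispose() method, such as the Document in the sample above.
interface HasDispose {
  dispose(): void;
}

// Runs `work` and always disposes the resource afterwards,
// whether `work` resolves, rejects, or throws synchronously.
async function withDisposable<T extends HasDispose, R>(
  resource: T,
  work: (resource: T) => Promise<R> | R,
): Promise<R> {
  try {
    return await work(resource);
  } finally {
    resource.dispose();
  }
}

// Stand-in resource so the sketch runs without any library installed.
const demo = {
  disposed: false,
  dispose() {
    this.disposed = true;
  },
};

withDisposable(demo, () => 'done').then((result) => {
  console.log(result, demo.disposed);
});
```

With docxmlater, the resource would be the value returned by Document.create(), and the createParagraph() and save() calls would go inside the callback.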
Loading and Modifying Documents
import { Document } from 'docxmlater';
// Load existing document
const doc = await Document.load('input.docx');
// Find and replace text
doc.replaceText(/old text/g, 'new text');
// Add a new paragraph
const para = doc.createParagraph();
para.addText('Added paragraph', { italic: true });
// Save modifications
await doc.save('output.docx');
doc.dispose();
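The replaceText call above takes a regular expression. When the search text comes from user input or contains characters such as "$", "." or "(", it needs to be escaped before it is turned into a pattern. The helpers below are a plain TypeScript sketch of that step; escapeRegExp and literalPattern are hypothetical names, and passing the result to replaceText assumes the method accepts any global RegExp, as the example suggests.

```typescript
// Escapes every character that has a special meaning in a regular expression.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Builds a global pattern that matches the literal text exactly.
function literalPattern(literal: string, caseInsensitive: boolean = false): RegExp {
  return new RegExp(escapeRegExp(literal), caseInsensitive ? 'gi' : 'g');
}

const pattern = literalPattern('$5.00 (USD)');
console.log('Price: $5.00 (USD)'.replace(pattern, '4.50 EUR'));
```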
Working with Tables
import { Document } from 'docxmlater';
const doc = Document.create();
// Create a 3x4 table
const table = doc.createTable(3, 4);
// Set header row
const headerRow = table.getRow(0);
headerRow.getCell(0).addParagraph().addText('Column 1', { bold: true });
headerRow.getCell(1).addParagraph().addText('Column 2', { bold: true });
headerRow.getCell(2).addParagraph().addText('Column 3', { bold: true });
headerRow.getCell(3).addParagraph().addText('Column 4', { bold: true });
// Add data
table.getRow(1).getCell(0).addParagraph().addText('Data 1');
table.getRow(1).getCell(1).addParagraph().addText('Data 2');
// Apply borders
table.setBorders({
top: { style: 'single', size: 4, color: '000000' },
bottom: { style: 'single', size: 4, color: '000000' },
left: { style: 'single', size: 4, color: '000000' },
right: { style: 'single', size: 4, color: '000000' },
insideH: { style: 'single', size: 4, color: '000000' },
insideV: { style: 'single', size: 4, color: '000000' },
});
await doc.save('table.docx');
doc.dispose();
Adding Images
import { Document } from 'docxmlater';
import { readFileSync } from 'fs';
const doc = Document.create();
// Load image from file
const imageBuffer = readFileSync('photo.jpg');
// Add image to document
const para = doc.createParagraph();
await para.addImage(imageBuffer, {
width: 400,
height: 300,
format: 'jpg',
});
await doc.save('with-image.docx');
doc.dispose();
Hyperlink Management
import { Document } from 'docxmlater';
const doc = await Document.load('document.docx');
// Get all hyperlinks
const hyperlinks = doc.getHyperlinks();
console.log(`Found ${hyperlinks.length} hyperlinks`);
// Update URLs in batch (30-50% faster than manual iteration)
doc.updateHyperlinkUrls('http://old-domain.com', 'https://new-domain.com');
// Fix fragmented hyperlinks from Google Docs
const mergedCount = doc.defragmentHyperlinks({
resetFormatting: true, // Fix corrupted fonts
});
console.log(`Merged ${mergedCount} fragmented hyperlinks`);
await doc.save('updated.docx');
doc.dispose();
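Hyperlink defragmentation can be thought of as merging neighbouring link fragments that point at the same URL into a single link. The sketch below shows that idea on a simplified in-memory model. It assumes the fragments are directly adjacent with nothing between them, and the names LinkFragment and mergeAdjacentFragments are hypothetical. It is not the algorithm used by defragmentHyperlinks(), and the count it returns (fragments folded into a predecessor) may be defined differently from the library's mergedCount.

```typescript
// Hypothetical model: one entry per hyperlink element, in document order.
interface LinkFragment {
  url: string;
  text: string;
}

function mergeAdjacentFragments(
  fragments: LinkFragment[],
): { merged: LinkFragment[]; mergedCount: number } {
  const merged: LinkFragment[] = [];
  let mergedCount = 0;
  for (const frag of fragments) {
    const last = merged[merged.length - 1]; // undefined while `merged` is empty
    if (last !== undefined && last.url === frag.url) {
      last.text += frag.text;               // fold into the previous link
      mergedCount++;
    } else {
      merged.push({ url: frag.url, text: frag.text }); // copy, keep input intact
    }
  }
  return { merged, mergedCount };
}

const result = mergeAdjacentFragments([
  { url: 'https://example.com', text: 'Click ' },
  { url: 'https://example.com', text: 'here' },
  { url: 'https://example.org', text: 'Docs' },
]);
console.log(result.merged, result.mergedCount);
```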
Custom Styles
import { Document, Style } from 'docxmlater';
const doc = Document.create();
// Create custom paragraph style
const customStyle = new Style('CustomHeading', 'paragraph');
customStyle.setName('Custom Heading');
customStyle.setRunFormatting({
bold: true,
fontSize: 32,
color: '0070C0',
});
customStyle.setParagraphFormatting({
alignment: 'center',
spacingAfter: 240,
});
// Add style to document
doc.getStylesManager().addStyle(customStyle);
// Apply style to paragraph
const para = doc.createParagraph();
para.addText('Styled Heading');
para.applyStyle('CustomHeading');
await doc.save('styled.docx');
doc.dispose();
Compatibility Mode Detection and Upgrade
import { Document, CompatibilityMode } from 'docxmlater';
const doc = await Document.load('legacy.docx');
// Check compatibility mode
console.log(`Mode: ${doc.getCompatibilityMode()}`); // e.g., 12 (Word 2007)
if (doc.isCompatibilityMode()) {
// Get detailed compatibility info
const info = doc.getCompatibilityInfo();
console.log(`Legacy flags: ${info.legacyFlags.length}`);
// Upgrade to Word 2013+ mode (equivalent to File > Info > Convert)
const report = doc.upgradeToModernFormat();
console.log(`Removed ${report.removedFlags.length} legacy flags`);
console.log(`Added ${report.addedSettings.length} modern settings`);
}
await doc.save('modern.docx');
doc.dispose();
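The number printed by getCompatibilityMode() corresponds to the compatibilityMode setting that Word stores in settings.xml, where 11, 12, 14, and 15 stand for Word 2003, 2007, 2010, and 2013 or later. A small lookup like the one below can make log output easier to read. The helper names are hypothetical, and treating every value below 15 as "legacy" is an assumption about what isCompatibilityMode() checks, not something this README states.

```typescript
// Maps the numeric compatibilityMode value to a readable label.
function describeCompatibilityMode(mode: number): string {
  switch (mode) {
    case 11:
      return 'Word 2003';
    case 12:
      return 'Word 2007';
    case 14:
      return 'Word 2010';
    case 15:
      return 'Word 2013 or later';
    default:
      return `Unknown (${mode})`;
  }
}

// Assumption: anything below 15 counts as a legacy mode.
function isLegacyMode(mode: number): boolean {
  return mode < 15;
}

console.log(describeCompatibilityMode(12), isLegacyMode(12));
```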
API Overview
Document Class
Creation & Loading:
- Document.create(options?) - Create new document
- Document.load(filepath, options?) - Load from file
- Document.loadFromBuffer(buffer, options?) - Load from memory
Handling Tracked Changes:
By default, docXMLater accepts all tracked changes during document loading to prevent corruption:
// Default: accepts all changes (recommended)
const doc = await Document.load('document.docx');
// Explicit control: pass exactly one revisionHandling value
const docExplicit = await Document.load('document.docx', {
revisionHandling: 'accept', // Accept all changes (default)
// revisionHandling: 'strip', // Remove all revision markup
// revisionHandling: 'preserve', // Keep tracked changes (should not cause corruption, but may; please report any errors you find)
});
Revision Handling Options:
- 'accept' (default): Removes revision markup, keeps inserted content, removes deleted content
- 'strip': Removes all revision markup completely
- 'preserve': Keeps tracked changes as-is (may cause Word "unreadable content" errors)
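The effect of 'accept' as described above (inserted content is kept, deleted content is removed, revision markup disappears) can be illustrated on a toy model of text runs. This is only a sketch of the semantics. RevisionKind, Run, acceptAll, and plainText are hypothetical names, and the library operates on the document XML rather than on a structure like this.

```typescript
type RevisionKind = 'none' | 'insertion' | 'deletion';

// Hypothetical model: a run of text plus the revision it belongs to, if any.
interface Run {
  text: string;
  revision: RevisionKind;
}

// Accepting all changes: drop deleted runs, keep everything else unmarked.
function acceptAll(runs: Run[]): Run[] {
  const accepted: Run[] = [];
  for (const run of runs) {
    if (run.revision === 'deletion') continue;
    accepted.push({ text: run.text, revision: 'none' });
  }
  return accepted;
}

function plainText(runs: Run[]): string {
  return runs.map((run) => run.text).join('');
}

const tracked: Run[] = [
  { text: 'Hello ', revision: 'none' },
  { text: 'cruel ', revision: 'deletion' },
  { text: 'kind ', revision: 'insertion' },
  { text: 'world', revision: 'none' },
];
console.log(plainText(acceptAll(tracked))); // Hello kind world
```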
Why Accept By Default?
Documents with tracked changes can cause Word corruption errors during round-trip processing due to r
