```csharp
// 3. Main document part - STREAMING XML (no DOM)
const string wNs = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
var docEntry = archive.CreateEntry("word/document.xml");
using (var docStream = docEntry.Open())
using (var xmlWriter = XmlWriter.Create(docStream, new XmlWriterSettings { Indent = true }))
{
    xmlWriter.WriteStartDocument();
    // Prefix, local name, and namespace are passed separately;
    // a qualified name such as "w:document" is not a valid local name.
    xmlWriter.WriteStartElement("w", "document", wNs);
    xmlWriter.WriteStartElement("w", "body", wNs);

    // Title paragraph
    xmlWriter.WriteStartElement("w", "p", wNs);
    xmlWriter.WriteStartElement("w", "r", wNs);
    xmlWriter.WriteStartElement("w", "t", wNs);
    xmlWriter.WriteString(title);
    xmlWriter.WriteEndElement(); // t
    xmlWriter.WriteEndElement(); // r
    xmlWriter.WriteEndElement(); // p

    // Content paragraph (sanitized): WriteString already escapes <, > and &,
    // so the raw text is passed through. Pre-escaping it with
    // SecurityElement.Escape would double-encode it (e.g. "&" -> "&amp;amp;").
    xmlWriter.WriteStartElement("w", "p", wNs);
    xmlWriter.WriteStartElement("w", "r", wNs);
    xmlWriter.WriteStartElement("w", "t", wNs);
    xmlWriter.WriteString(content);
    xmlWriter.WriteEndElement(); // t
    xmlWriter.WriteEndElement(); // r
    xmlWriter.WriteEndElement(); // p

    xmlWriter.WriteEndElement(); // body
    xmlWriter.WriteEndElement(); // document
    xmlWriter.WriteEndDocument();
}
```
Critical note: Microsoft Office defaults to Transitional, whereas some open-source parsers prefer Strict. For maximum compatibility in download scenarios, target Transitional.

3. The "Download" Problem: Generation and Delivery

3.1 Two Primary Strategies

| Strategy | Description | Pros | Cons |
| :--- | :--- | :--- | :--- |
| Template-based | Load a pre-created .docx template and replace placeholders (e.g., `name`). | Preserves complex formatting. | Requires template management; high memory use if a DOM is used. |
| Programmatic build | Build XML trees (e.g., using DocumentBuilder libraries). | Full control; scalable. | Steeper learning curve for complex layouts. |

3.2 Performance Bottleneck: DOM vs. Streaming

Most naive implementations load the entire document.xml into an XML DOM (Document Object Model). For a 50-page report, this may be ~10 MB; for a 500,000-row Excel sheet, it can exceed 2 GB of RAM.
Office Open XML, OOXML, Document Generation, File Download, XML Security, ZIP Compression, REST API.

1. Introduction

In enterprise web applications, generating downloadable office documents from structured data (e.g., invoices, reports, spreadsheets) is a ubiquitous requirement. Prior to OOXML, server-side generation often relied on binary formats (.doc, .xls) via COM interop (unreliable and non-scalable) or on HTML-to-PDF converters (which lose semantic fidelity). The introduction of OOXML addressed this by providing an open, royalty-free, XML-based standard.
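Set a maximum decompression ratio (e.g., in .NET, by comparing each entry's `ZipArchiveEntry.Length` with its `CompressedLength` and capping the number of bytes actually read, since declared sizes can be forged). For generation, do not decompress untrusted archives.

4.3 Path Traversal in ZIP Entries

Malicious entries such as `../../config/secret.xml` inside a ZIP can overwrite files outside the intended extraction directory when the archive is extracted naively.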
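The path traversal risk can be checked before any entry is written to disk. The following is a minimal sketch, not a complete extraction routine: `ZipPathGuard` and `ResolveSafely` are hypothetical names introduced here for illustration, and the approach assumed is the common one of resolving the entry name against the extraction root and rejecting any result that falls outside it.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of any library): validates a ZIP entry name
// against the directory the archive is being extracted into.
static class ZipPathGuard
{
    public static string ResolveSafely(string extractionRoot, string entryName)
    {
        // Normalise the root and make sure it ends with a separator, so that
        // "/data/out" does not accidentally match "/data/output-evil".
        string root = Path.GetFullPath(extractionRoot);
        string separator = Path.DirectorySeparatorChar.ToString();
        if (!root.EndsWith(separator, StringComparison.Ordinal))
        {
            root += separator;
        }

        // GetFullPath collapses "." and ".." segments, so an entry such as
        // "../../config/secret.xml" resolves to its real destination.
        string target = Path.GetFullPath(Path.Combine(root, entryName));

        if (!target.StartsWith(root, StringComparison.Ordinal))
        {
            throw new IOException("ZIP entry escapes the extraction root: " + entryName);
        }
        return target;
    }
}
```

A caller would pass each `ZipArchiveEntry.FullName` through `ResolveSafely` and write only to the returned path. Absolute entry names are rejected as well, because `Path.Combine` returns a rooted second argument unchanged and the prefix check then fails.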
– Write XML directly to the ZIP entry's output stream using an `XmlWriter` (or equivalent) without retaining the entire tree.
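As a sketch of what writing directly to the entry stream can look like when the number of paragraphs is not known in advance, the example below pulls text from an `IEnumerable<string>` and emits one paragraph at a time. The class and method names (`StreamingDocumentPart`, `Write`) are assumptions made for illustration. It also assumes the archive was opened with `ZipArchiveMode.Create`, since `ZipArchiveMode.Update` holds archive content in memory and would undo the benefit.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Xml;

// Hypothetical helper: streams word/document.xml into an open archive.
static class StreamingDocumentPart
{
    const string W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static void Write(ZipArchive archive, IEnumerable<string> paragraphs)
    {
        ZipArchiveEntry entry = archive.CreateEntry("word/document.xml", CompressionLevel.Optimal);
        using (Stream entryStream = entry.Open())
        using (XmlWriter writer = XmlWriter.Create(entryStream))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("w", "document", W);
            writer.WriteStartElement("w", "body", W);

            // Each paragraph is written and forgotten; no tree is built,
            // so the source sequence can be lazily produced (e.g. a data reader).
            foreach (string text in paragraphs)
            {
                writer.WriteStartElement("w", "p", W);
                writer.WriteStartElement("w", "r", W);
                writer.WriteStartElement("w", "t", W);
                writer.WriteString(text); // escapes <, > and & automatically
                writer.WriteEndElement(); // w:t
                writer.WriteEndElement(); // w:r
                writer.WriteEndElement(); // w:p
            }

            writer.WriteEndElement(); // w:body
            writer.WriteEndElement(); // w:document
            writer.WriteEndDocument();
        }
    }
}
```

This covers only the main document part. A package that Word will open also needs the `[Content_Types].xml` and `_rels/.rels` parts, which are small and can be written the same way.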
| Method | Peak Memory (MB) | Time (s) | Max Concurrent Requests |
| :--- | :--- | :--- | :--- |
| (deprecated) | 1,200 | 62 | 2 (serialized) |
| Open XML SDK + DOM | 890 | 28 | 8 |
| Open XML SDK + Streaming (our method) | 230 | 22 | 35 |