idconvert/BRIEF.md


IDconvert — Project Brief

Last updated: April 2026
Status: Pre-development — Phase 1 MVP ready to build


The Problem

Designers build documents in Adobe InDesign. Their clients (NGOs, professional associations, corporates) need editable Word versions so they can update content themselves. Rebuilding a document from scratch in Word is time-consuming, usually unbillable, and a poor use of a professional designer's time.

The only serious existing solution is ID2Office by Recosoft — a native InDesign plugin at $229/year that requires InDesign to be installed. There is no standalone web tool that accepts an exported IDML file and returns a clean DOCX without InDesign.


The Solution

IDconvert — a web-based IDML to DOCX conversion tool. No InDesign required. Upload the IDML file, receive an editable Word document that preserves layout, typography, images, tables and reading flow.

IDtag — a free InDesign ExtendScript that prepares complex multi-column documents for more accurate conversion. Phase 2 only.


Products

IDtag (Phase 2 — Free)

  • Plain .jsx ExtendScript file
  • Designer runs it inside InDesign before exporting IDML
  • Tags threaded text frames with metadata for accurate multi-column reconstruction
  • No licensing, no activation, distributed freely from the website
  • Serves as a top-of-funnel entry point into IDconvert

IDconvert (Phase 1 — Paid)

  • Web SaaS, no software installation required
  • Upload IDML → scan → confirm → download DOCX
  • Credit-based pricing
  • Pre-conversion scan report with font warnings and layout notices

How It Works (User Flow)

1. Designer runs IDtag in InDesign (Phase 2, optional for single column)
2. Designer exports IDML from InDesign normally
3. Designer or client uploads IDML to IDconvert
4. IDconvert scans the file — shows font report and warnings
5. User reviews notices, confirms conversion
6. DOCX downloads — ready to edit in Word

What the DOCX Contains

  • All text, fully editable, with paragraph and character styles mapped to Word equivalents
  • Layout preserved via anchored linked text boxes — single and multi-column
  • Images embedded and positioned to match original layout
  • Tables with structure and formatting intact
  • Hyperlinks active and clickable
  • Page numbers as native Word footer fields, matched font/size/colour
  • Clear warnings for anything that could not be perfectly replicated

What It Does Not Do

  • Pixel-perfect PDF replication (not possible in Word)
  • Master page headers and footers (Phase 1 exclusion)
  • Complex contour text wrap (simplified to square wrap with warning)
  • Keynote export
  • Require InDesign to be installed

Target Users

Maya — Graphic Designer

Freelance, 6 years' experience. Designs annual reports, brand guides, cookbooks, and association publications in InDesign. Needs to hand off editable Word documents to clients as a standard project deliverable. Builds IDconvert into her production workflow and bills clients for the conversion as part of her production fee.

Pain: Client always asks for Word version after the PDF is approved. Rebuilding it is unpaid time.

Richard — Operations Manager

Regional engineering firm. Receives designed documents from an external studio. Needs to update figures, swap names, and edit body copy before board submissions — in Word, because that is what his team uses.

Pain: Designer sends a PDF. Richard cannot edit it. He needs a Word version that still looks professional.


Competitive Positioning

Feature                    ID2Office                   IDconvert
Requires InDesign          Yes                         No
Price                      $229/year                   From $19 (credit pack)
Delivery model             Native plugin               Web tool
Multi-column support       Yes (years of iteration)    Phase 2
Pre-conversion warnings    No                          Yes
Font report                No                          Yes
Caribbean market focus     No                          Yes

Pricing

Credit Packs

Pack       Credits    Price    Per Conversion
Starter    5          $19      $3.80
Studio     20         $59      $2.95
Agency     60         $149     $2.48

Free tier: 1 conversion, gated by email verification, browser fingerprint, and file hash. No ads.

Premium positioning: ID2Office charges $229/year with no scan report and no font intelligence, and requires InDesign to be installed.

No subscription at MVP. Introduce one only after observing real usage frequency.


Technical Architecture

IDML Structure

IDML is a ZIP archive of XML files:

  • designmap.xml — master manifest
  • Spreads/*.xml — page geometry, frame positions
  • Stories/*.xml — text content with formatting
  • Resources/Fonts.xml — all fonts used
  • Resources/Styles.xml — paragraph and character styles

Content (Stories) and layout (Spreads) are separate data sources joined by a ParentStory attribute on each text frame.
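The join described above can be sketched in a few lines. This is a minimal illustration, not the project's parser: the function name `join_frames_to_stories` is hypothetical, the standard-library `xml.etree.ElementTree` stands in for lxml so the snippet has no dependencies, and it assumes each `TextFrame` in a spread carries a `ParentStory` attribute that matches the `Self` ID of a `Story` element.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def join_frames_to_stories(idml_bytes: bytes) -> list[dict]:
    """Return one record per text frame, paired with its story's plain text."""
    with zipfile.ZipFile(io.BytesIO(idml_bytes)) as archive:
        names = archive.namelist()

        # Index story text by the Story element's Self ID.
        story_text = {}
        for name in names:
            if name.startswith("Stories/") and name.endswith(".xml"):
                root = ET.fromstring(archive.read(name))
                for story in root.iter("Story"):
                    parts = [c.text or "" for c in story.iter("Content")]
                    story_text[story.get("Self")] = "".join(parts)

        # Walk every spread and look up each frame's ParentStory.
        joined = []
        for name in names:
            if name.startswith("Spreads/") and name.endswith(".xml"):
                root = ET.fromstring(archive.read(name))
                for frame in root.iter("TextFrame"):
                    story_id = frame.get("ParentStory")
                    joined.append({
                        "frame": frame.get("Self"),
                        "story": story_id,
                        "text": story_text.get(story_id, ""),
                    })
        return joined
```

Several frames can point at the same story when the story is threaded, so the text is repeated per frame here. A real converter would carry formatting ranges rather than flattened text.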

Conversion Pipeline

Upload IDML
    ↓
SCAN: lightweight XML parse
  - Count pages, stories, images, tables
  - Classify fonts (safe / professional / unknown)
  - Detect text wrap types
  - Detect threading complexity
  - Collect warnings array
    ↓
Display scan report to user
    ↓
User confirms → CONVERT
  - Parse Spreads for frame geometry
  - Parse Stories for text content
  - Join on ParentStory ID
  - Build DOCX:
      Every frame → anchored text box
      Threaded frames → linked text box chain
      Images → anchored DrawingML
      Page numbers → native Word footer field
      Styles → mapped Word paragraph styles
    ↓
Download DOCX
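The SCAN step of the pipeline above might look roughly like the following. It is a sketch under assumptions: `scan_idml` and the report keys are hypothetical names, `xml.etree.ElementTree` is used in place of lxml to keep the snippet dependency-free, and it assumes the IDML element and attribute names `Page`, `Image`, `Table`, `TextFrame/@NextTextFrame`, and `TextWrapPreference/@TextWrapMode`. Font classification is left out.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def scan_idml(idml_bytes: bytes) -> dict:
    """Count basic document features and collect warnings without converting."""
    report = {"pages": 0, "stories": 0, "images": 0, "tables": 0,
              "threaded_frames": 0, "warnings": []}
    with zipfile.ZipFile(io.BytesIO(idml_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".xml"):
                continue
            if name.startswith("Stories/"):
                root = ET.fromstring(archive.read(name))
                report["stories"] += 1
                report["tables"] += len(list(root.iter("Table")))
            elif name.startswith("Spreads/"):
                root = ET.fromstring(archive.read(name))
                report["pages"] += len(list(root.iter("Page")))
                report["images"] += len(list(root.iter("Image")))
                for frame in root.iter("TextFrame"):
                    # "n" is assumed to be IDML's marker for "no next frame".
                    if frame.get("NextTextFrame", "n") != "n":
                        report["threaded_frames"] += 1
                for wrap in root.iter("TextWrapPreference"):
                    if wrap.get("TextWrapMode") == "Contour":
                        report["warnings"].append(
                            f"{name}: contour text wrap will be simplified to square wrap"
                        )
    return report
```

The returned dict is the kind of payload the scan report screen could render directly, and the warnings list maps onto the notices the user confirms before converting.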

DOCX Layout Model

All content uses anchored text boxes — unified model, no strategy switching:

  • Single frame → anchored text box, no linking
  • Threaded story → linked text box chain via wps:linkedTxbx
  • Images → anchored DrawingML at matching coordinates
  • Text wrap → wrapSquare for all detected wraps (downgraded, user warned)
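Before a threaded story can be written as a linked text box chain, its frames have to be put in reading order. The sketch below shows one way to do that, assuming each frame records its neighbours the way IDML's `PreviousTextFrame` and `NextTextFrame` attributes do, with `"n"` meaning "none". The names `order_thread` and `link_plan` are hypothetical, and the sequence numbers are only a plan for the DOCX builder, not generated XML.

```python
def order_thread(frames: dict[str, dict]) -> list[str]:
    """Order the frames of ONE story from head to tail.

    `frames` maps a frame ID to {"prev": id or "n", "next": id or "n"}.
    """
    heads = [fid for fid, f in frames.items() if f["prev"] == "n"]
    if len(heads) != 1:
        raise ValueError("expected exactly one head frame")

    chain, seen = [], set()
    current = heads[0]
    while current != "n":
        if current in seen or current not in frames:
            raise ValueError("broken or cyclic thread")
        seen.add(current)
        chain.append(current)
        current = frames[current]["next"]
    return chain


def link_plan(chain: list[str]) -> list[tuple[str, int]]:
    """Pair each frame with its position: 0 holds the text, 1+ are continuations."""
    return list(enumerate_positions(chain))


def enumerate_positions(chain: list[str]):
    for position, frame_id in enumerate(chain):
        yield frame_id, position
```

A single unthreaded frame comes back as a one-element chain, which fits the unified model: the builder can treat every story the same way and only emit links when the chain is longer than one.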

Unit Conversion

  • IDML: points
  • DOCX positions: EMUs (1 inch = 914400 EMUs)
  • DOCX font sizes: half-points (w:sz); DXA (twentieths of a point) is a separate unit used for page sizes, margins and indents
  • Formula: emu = points * 914400 / 72
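  • IDML frame positions are relative to the spread center, not the page corner: offset each coordinate by the page's own origin (for example, one page width for a left-hand page) to get page-relative positions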
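The conversions above reduce to a few small helpers. This is a minimal sketch with hypothetical function names. `spread_to_page` assumes the page's top-left corner is already known in spread coordinates, which in practice means resolving the page's bounds and transforms first.

```python
EMU_PER_INCH = 914_400
POINTS_PER_INCH = 72


def pt_to_emu(points: float) -> int:
    """Points to EMUs: emu = points * 914400 / 72."""
    return round(points * EMU_PER_INCH / POINTS_PER_INCH)


def pt_to_half_points(points: float) -> int:
    """Font size in points to the half-point value Word stores."""
    return round(points * 2)


def spread_to_page(x_pt: float, y_pt: float,
                   page_left_pt: float, page_top_pt: float) -> tuple[float, float]:
    """Translate a spread-relative point into page-relative coordinates.

    page_left_pt / page_top_pt are the page's top-left corner expressed in
    spread coordinates (an input this sketch assumes is already resolved).
    """
    return x_pt - page_left_pt, y_pt - page_top_pt


print(pt_to_emu(72))          # 914400 (one inch)
print(pt_to_emu(1))           # 12700
print(pt_to_half_points(11))  # 22
```

Since 914400 / 72 is exactly 12700, whole-point values convert without rounding error. Rounding only matters for fractional point positions.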

Two API Endpoints

POST /scan      costs 0 credits — returns scan report JSON
POST /convert   costs 1 credit  — returns DOCX file

Scan result cached by session. Convert reuses cached parse.
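The relationship between the two endpoints can be sketched without any web framework. Everything here is illustrative: `ConversionService`, its method names, and the injected `parse` and `build_docx` callables are hypothetical stand-ins for the FastAPI handlers, parser, and DOCX builder, and the in-memory dict stands in for whatever session store is used.

```python
class ConversionService:
    """Scan is free and caches the parse; convert reuses it and costs 1 credit."""

    def __init__(self, parse, build_docx):
        self._parse = parse            # bytes -> parsed document (hypothetical)
        self._build_docx = build_docx  # parsed document -> DOCX bytes (hypothetical)
        self._cache = {}               # session_id -> parsed document

    def scan(self, session_id: str, idml_bytes: bytes) -> dict:
        parsed = self._parse(idml_bytes)
        self._cache[session_id] = parsed
        return parsed["report"]        # costs 0 credits

    def convert(self, session_id: str, credits: int) -> tuple[bytes, int]:
        if session_id not in self._cache:
            raise KeyError("scan the file before converting")
        if credits < 1:
            raise PermissionError("not enough credits")
        docx = self._build_docx(self._cache[session_id])
        # The credit is deducted only after the build succeeds.
        return docx, credits - 1
```

Deducting the credit after the build means a failed conversion does not cost the user anything, which seems in keeping with showing all warnings before a credit is spent.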


Tech Stack

Layer Technology
InDesign script (Phase 2) ExtendScript .jsx
Web frontend Vue 3 + Vite + Tailwind CSS
Backend API FastAPI (Python)
IDML parsing Python zipfile + lxml
DOCX generation python-docx
Font registry Custom Python classification dict
Auth + credits PocketBase
File storage S3-compatible object storage
Payments Stripe
Deployment Dokploy
Phase 2 pipeline n8n

Pre-Conversion Font Report

Every font in the document is classified and reported before the user spends a credit:

FONTS
─────────────────────────────────────────
✓ Arial              Available in Word — no action needed
✓ Georgia            Available in Word — no action needed

⚠ Freight Text Pro   Not a Word system font
  Used for:  Body text (pages 1–8)
  Action:    Install on client machine
  If missing: Word substitutes Georgia — minor reflow possible

⚠ Proxima Nova       Not a Word system font
  Used for:  Headings, captions (all pages)
  Action:    Install on client machine
  If missing: Word substitutes Calibri — spacing may differ

◎ DM Mono            Unknown font
  Used for:  Pull quotes (pages 2, 8)
  Action:    Verify availability or supply to client
  If missing: Word substitutes Courier New
─────────────────────────────────────────
💡 Supply a /Fonts folder alongside the DOCX
   and ask your client to install before opening.
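A classification dict behind a report like this could be as small as the sketch below. The registry contents are illustrative only: the safe list is a guess at common Word fonts, the substitutes are the ones shown in the sample report, and `classify_font` is a hypothetical name. Choosing a substitute for an unknown font is left out because the brief does not say how that is decided.

```python
# Fonts assumed to be available wherever Word is installed (illustrative list).
SAFE_FONTS = {"Arial", "Georgia", "Calibri", "Courier New", "Times New Roman"}

# Known professional fonts mapped to the substitute Word is expected to use.
PROFESSIONAL_FONTS = {
    "Freight Text Pro": "Georgia",
    "Proxima Nova": "Calibri",
}


def classify_font(name: str) -> dict:
    """Classify one font family as safe, professional, or unknown."""
    if name in SAFE_FONTS:
        return {"font": name, "status": "safe", "substitute": None}
    if name in PROFESSIONAL_FONTS:
        return {"font": name, "status": "professional",
                "substitute": PROFESSIONAL_FONTS[name]}
    return {"font": name, "status": "unknown", "substitute": None}


def font_report(names: list[str]) -> list[dict]:
    """Classify each distinct font once, keeping first-seen order."""
    return [classify_font(n) for n in dict.fromkeys(names)]
```

Usage context such as "Body text (pages 1–8)" would come from the parsed stories and spreads rather than from the registry itself.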

File Validation and Security

Every uploaded file passes a strict validation gate before any processing occurs. This protects against malicious uploads, renamed files, raw .indd files, and ZIP bombs.

Validation sequence (in order):

  1. File size check — reject above 50MB before touching the file
  2. Magic byte check — read the actual file signature, not the extension. Must be a ZIP signature (PK\x03\x04), i.e. detected as application/zip
  3. Open as ZIP — reject corrupted or fake archives
  4. ZIP bomb protection — sum uncompressed sizes before extracting anything. Reject above 200MB uncompressed
  5. Entry count limit — reject archives with more than 500 entries
  6. Path traversal check — reject any entry with ../ in the path or with an absolute path
  7. IDML structure check — must contain designmap.xml, Spreads/, Stories/, Resources/
  8. XML validity check — parse designmap.xml to confirm it is real XML, not injected content
  9. Executable content check — reject if any entry has an executable extension (.exe, .sh, .py, .js etc.)
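The nine checks above can be sketched with the standard library alone. This is an illustration under assumptions, not the production gate: `validate_idml` and `ValidationError` are hypothetical names, the reason strings are internal labels for logging, the executable extension list is only the four examples given above, and sizes are read from the archive's own directory, which a hostile file could misreport. A real service would likely also enforce limits while extracting and might prefer a hardened XML parser.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

MAX_UPLOAD_BYTES = 50 * 1024 * 1024
MAX_UNCOMPRESSED_BYTES = 200 * 1024 * 1024
MAX_ENTRIES = 500
ZIP_MAGIC = b"PK\x03\x04"
EXECUTABLE_SUFFIXES = {".exe", ".sh", ".py", ".js"}
REQUIRED_PREFIXES = ("Spreads/", "Stories/", "Resources/")


class ValidationError(Exception):
    """Carries an internal reason for logging; not meant for the user."""


def validate_idml(data: bytes) -> None:
    """Run the checks in order; raise ValidationError on the first failure."""
    if len(data) > MAX_UPLOAD_BYTES:                       # 1. size
        raise ValidationError("too_large")
    if not data.startswith(ZIP_MAGIC):                     # 2. magic bytes
        raise ValidationError("bad_magic")
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))        # 3. opens as ZIP
    except zipfile.BadZipFile:
        raise ValidationError("bad_zip") from None
    with archive:
        infos = archive.infolist()
        if sum(i.file_size for i in infos) > MAX_UNCOMPRESSED_BYTES:  # 4. ZIP bomb
            raise ValidationError("zip_bomb")
        if len(infos) > MAX_ENTRIES:                       # 5. entry count
            raise ValidationError("too_many_entries")
        names = [i.filename for i in infos]
        for name in names:                                 # 6. path traversal
            parts = PurePosixPath(name.replace("\\", "/")).parts
            if name.startswith("/") or ".." in parts:
                raise ValidationError("path_traversal")
        has_dirs = all(any(n.startswith(p) for n in names) for p in REQUIRED_PREFIXES)
        if "designmap.xml" not in names or not has_dirs:   # 7. IDML structure
            raise ValidationError("not_idml")
        try:
            ET.fromstring(archive.read("designmap.xml"))   # 8. XML validity
        except ET.ParseError:
            raise ValidationError("bad_xml") from None
        for name in names:                                 # 9. executable content
            if PurePosixPath(name).suffix.lower() in EXECUTABLE_SUFFIXES:
                raise ValidationError("executable_content")
```

Nothing is extracted to disk at any point: every check works from the archive's directory listing or reads a single known entry into memory.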

Common innocent mistake — .indd instead of .idml

The most frequent user error is uploading a raw InDesign .indd file instead of an IDML export. This gets a specific, helpful error message:

"This appears to be an InDesign document file. Please export it as IDML first: File → Export → InDesign Markup (IDML)"

All other rejections get a clear but non-specific error — never tell a malicious actor which check they failed.
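One way to keep that split between internal reasons and user-facing messages is a single mapping function. This sketch is hypothetical throughout: `rejection_message`, the logger name, and the generic wording are assumptions, and the .indd case is detected from the filename extension only, which is a simple heuristic rather than a signature check.

```python
import logging

logger = logging.getLogger("idconvert.upload")

INDD_MESSAGE = (
    "This appears to be an InDesign document file. Please export it as IDML "
    "first: File → Export → InDesign Markup (IDML)"
)
# Assumed wording; the brief only requires it to be clear but non-specific.
GENERIC_MESSAGE = "This file could not be processed. Please upload a valid IDML file."


def rejection_message(filename: str, internal_reason: str) -> str:
    """Log the real reason server-side and return only a safe message."""
    logger.warning("upload rejected: reason=%s filename=%r", internal_reason, filename)
    if filename.lower().endswith(".indd"):
        return INDD_MESSAGE
    return GENERIC_MESSAGE
```

The internal reason never reaches the response, so the specific failed check stays visible only in the server logs.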


Free Tier Abuse Prevention

Layer 1 — Email verification with disposable domain blocking (Abstract API or Kickbox)
Layer 2 — Browser fingerprinting via FingerprintJS (free tier) — persists across sessions
Layer 3 — File hashing — same IDML file cannot be converted free across multiple accounts
Layer 4 — IP rate limiting via slowapi — 1 free conversion per IP per day

Start with layers 1, 2, and 3 at MVP. Layer 3 is especially effective for this product since IDML files are project-specific assets.
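Layer 3 (file hashing) is simple enough to sketch directly. The class and method names below are hypothetical, the in-memory dict stands in for a database table, and SHA-256 is an assumed choice of hash. The sketch covers only the cross-account check, not the one-free-conversion limit per account.

```python
import hashlib


class FreeTierLedger:
    """Remember which account first used each file for a free conversion."""

    def __init__(self):
        self._owner_by_hash = {}  # file digest -> account ID

    def try_claim(self, account_id: str, idml_bytes: bytes) -> bool:
        """Return True if this account may convert this file for free."""
        digest = hashlib.sha256(idml_bytes).hexdigest()
        # setdefault records the first claimant and returns the existing owner.
        owner = self._owner_by_hash.setdefault(digest, account_id)
        return owner == account_id
```

A byte-level hash only matches identical uploads, so it targets the case where one exported file is passed between throwaway accounts.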


UI Design Direction

Reference: ilovepdf.com and tools.pdf24.org

Core principle: upload zone is the entire hero. No marketing content above the fold on the tool page.

┌───────────────────────────────────────┐
│  IDconvert logo            Login/Signup│
├───────────────────────────────────────┤
│  Convert InDesign to Word.            │
│  Upload your IDML file to begin.      │
│                                       │
│  ┌─────────────────────────────────┐  │
│  │   Drop IDML file here  or       │  │
│  │      [ Browse files ]           │  │
│  └─────────────────────────────────┘  │
│  🔒 Files deleted after 1 hour        │
├───────────────────────────────────────┤
│  SCAN REPORT (appears after upload)   │
│  ┌──────────┐ ┌──────────┐           │
│  │ 12 pages │ │ 8 stories│  ...      │
│  └──────────┘ └──────────┘           │
│  FONTS                                │
│  ✓ Arial              Safe            │
│  ⚠ Freight Text Pro   Install needed  │
│  NOTICES                              │
│  ⚠ Text wrap simplified — pages 4, 7 │
│  1 credit · 4 remaining               │
│  [ Cancel ]    [ Convert → ]          │
├───────────────────────────────────────┤
│  Footer — minimal, links only         │
└───────────────────────────────────────┘

Palette:

  • Background: #F8F9FA
  • Primary CTA: #1a56db (Convert button only)
  • Warning: #F59E0B
  • Success: #10B981
  • UI type: Inter or DM Sans
  • Technical/filenames: DM Mono

Development Timeline

Phase 1 — IDconvert MVP (Weeks 1–5)

Week 1 — IDML Parser

  • Unzip and read IDML structure
  • Extract tagged text frames in correct order
  • Extract paragraph styles and map to Word styles
  • Extract inline images
  • Font extraction and classification

Week 2 — DOCX Builder

  • Anchored text box generation from frame geometry
  • Linked text box chains for threaded stories
  • Paragraph and character style mapping
  • Image embedding as anchored DrawingML
  • Page number footer generation

Week 3 — Scan Endpoint + Warning System

  • Lightweight pre-conversion parse
  • Font report with substitute mapping and usage context
  • Wrap detection and downgrade warnings
  • Warnings array with page-level detail
  • Session-based scan cache

Week 4 — Web Tool UI + Credits

  • Vue 3 frontend — upload zone, scan report, font report, warning list
  • PocketBase auth and credit system
  • Stripe credit pack checkout
  • Convert endpoint wiring
  • File deletion after 1 hour

Week 5 — Testing and Launch

  • End-to-end test with real annual report, brand guide, and cookbook files
  • Edge case cleanup
  • Soft launch

Phase 2 — IDtag + Multi-Column (Weeks 6–9)

Week 6 — IDtag ExtendScript (frame tagging, thread metadata, panel UI)
Week 7 — Parser update (read IDtag metadata from IDML labels)
Week 8 — DOCX builder update (enhanced multi-column reconstruction)
Week 9 — Testing with magazine layouts and multi-column reports, then release


MVP Feature List (User-Facing Language)

  • Layout preserved — your document opens in Word with the same page structure, columns and content positioning as the original design
  • Text is fully editable — all text can be selected, edited and reformatted directly in Word without any special software
  • Styles carried over — headings, body text, captions and other text styles are mapped to equivalent Word styles so formatting stays consistent when you edit
  • Images included — all images from the original design are embedded in the Word document and positioned to match the layout
  • Tables converted — tables come across with their structure, content and basic formatting intact and ready to edit
  • Clickable links preserved — any hyperlinks in the original document remain active and clickable in the Word version
  • Page numbers matched — page numbers are reproduced in Word using the same font, size and colour as the original design
  • Font report included — before converting, you are told exactly which fonts need to be installed on your computer for the document to display correctly
  • Honest warnings upfront — any design features that cannot be perfectly replicated in Word are clearly explained before you convert, so there are no surprises

Accepted Limitations (Document Clearly in UI)

  • Text wrap inside anchored text boxes is unreliable — simplified to square wrap, user warned
  • Font reflow when client lacks designer's fonts — user warned with install instructions
  • Heavy text editing may cause text box overflow in Word — include how-to-edit guide with download
  • Master page headers and footers excluded at MVP
  • Pixel-perfect PDF match is not the goal — structurally identical and editable is the goal