1205 lines
39 KiB
Markdown
1205 lines
39 KiB
Markdown
# CLAUDE.md — IDconvert Project Instructions
|
|
|
|
This file is the persistent context for Claude Code sessions on the IDconvert project.
|
|
Read this fully before writing any code or making any architectural decisions.
|
|
This file supersedes all previous versions.
|
|
|
|
---
|
|
|
|
## What We Are Building
|
|
|
|
Two products that work together as a complete system:
|
|
|
|
### 1. IDexport (Free InDesign ExtendScript)
|
|
A free `.jsx` script the designer runs inside InDesign before converting.
|
|
It reads the live InDesign document through Adobe's DOM — the fully resolved
|
|
document state — and exports a structured JSON file containing everything the
|
|
web converter needs to produce an accurate DOCX.
|
|
|
|
This is the core architectural decision: the script runs where the data is clean.
|
|
InDesign has already resolved coordinates, applied master pages, calculated font
|
|
metrics, and resolved colour swatches. We capture that resolved state as JSON
|
|
rather than trying to reconstruct it from IDML.
|
|
|
|
**What IDexport does:**
|
|
- Reads every page, frame, story, table, and image through InDesign's DOM
|
|
- Exports resolved coordinates — no offset errors, no spread geometry math
|
|
- Exports text with exact character-level style application per run
|
|
- Exports font names as InDesign has resolved them
|
|
- Exports colour values resolved from swatches to hex
|
|
- Exports thread order as InDesign knows it
|
|
- Exports images with actual placed dimensions
|
|
- Exports table structure with exact column widths
|
|
- Exports paragraph and character style definitions
|
|
- Writes `idconvert_export.json` to the same folder as the INDD file
|
|
|
|
**What IDexport does NOT do:**
|
|
- No licensing, no phone home, no activation
|
|
- Does not require internet connection
|
|
- Does not modify the InDesign document
|
|
- Distributed as a plain `.jsx` file from the website, completely free
|
|
|
|
### 2. IDconvert (Paid Web SaaS)
|
|
A web tool that accepts the `idconvert_export.json` produced by IDexport
|
|
and outputs a clean, brand-consistent, fully editable DOCX.
|
|
|
|
Because the input JSON contains clean resolved data from InDesign's own
|
|
layout engine, the converter can produce accurate output without coordinate
|
|
reconstruction, IDML parsing complexity, or guessing.
|
|
|
|
**The value delivered:**
|
|
- Paragraph styles mapped to real Word styles — not manually formatted text
|
|
- Character styles (bold, italic, colour) as Word character styles
|
|
- Tables as real editable Word tables with correct column widths
|
|
- Images embedded at correct proportions in the right position in the flow
|
|
- Fonts applied correctly per run — brand-consistent in Word
|
|
- Hyperlinks active and clickable
|
|
- Document hierarchy preserved — headings, body, captions, tables of contents work
|
|
- Page numbers as native Word footer fields matched to InDesign styling
|
|
- Pre-conversion scan report with font warnings before credits are spent
|
|
|
|
**What IDconvert explicitly does NOT promise:**
|
|
- Pixel-perfect visual layout replication — this is not possible in Word
|
|
- Column layouts side by side — these flow sequentially for editability
|
|
- Master page decorative elements
|
|
|
|
---
|
|
|
|
## The User Flow
|
|
|
|
```
|
|
1. Designer opens document in InDesign
|
|
2. Designer runs IDexport script (Scripts panel → IDexport.jsx)
|
|
3. Script exports idconvert_export.json to same folder as INDD
|
|
4. Designer uploads idconvert_export.json to IDconvert web tool
|
|
5. IDconvert scans the JSON — shows font report and warnings
|
|
6. Designer confirms → conversion runs
|
|
7. Designer downloads DOCX
|
|
8. Designer sends DOCX to client (Richard)
|
|
9. Richard opens in Word — edits text, updates figures, re-exports PDF
|
|
```
|
|
|
|
---
|
|
|
|
## IDexport Script — Complete Specification
|
|
|
|
### Output Format: idconvert_export.json
|
|
|
|
```json
|
|
{
|
|
"version": "1.0",
|
|
"generator": "IDexport",
|
|
"document": {
|
|
"name": "Annual Report 2025",
|
|
"page_width_pt": 595.28,
|
|
"page_height_pt": 841.89,
|
|
"page_count": 12,
|
|
"facing_pages": true
|
|
},
|
|
"styles": {
|
|
"paragraph": [
|
|
{
|
|
"name": "Body Text",
|
|
"font": "Freight Text Pro",
|
|
"size_pt": 10.5,
|
|
"leading_pt": 14,
|
|
"space_before_pt": 0,
|
|
"space_after_pt": 6,
|
|
"alignment": "left",
|
|
"color_hex": "1A1A1A",
|
|
"bold": false,
|
|
"italic": false
|
|
}
|
|
],
|
|
"character": [
|
|
{
|
|
"name": "Bold Run",
|
|
"font": "Freight Text Pro",
|
|
"bold": true,
|
|
"color_hex": null
|
|
}
|
|
]
|
|
},
|
|
"colors": {
|
|
"Brand Blue": "1a56db",
|
|
"Black": "1A1A1A",
|
|
"White": "FFFFFF"
|
|
},
|
|
"fonts_used": [
|
|
{ "name": "Freight Text Pro", "postscript": "FreightTextPro-Book" },
|
|
{ "name": "Proxima Nova", "postscript": "ProximaNova-Regular" }
|
|
],
|
|
"pages": [
|
|
{
|
|
"page_number": 1,
|
|
"width_pt": 595.28,
|
|
"height_pt": 841.89,
|
|
"items": [
|
|
{
|
|
"type": "text_frame",
|
|
"id": "frame_001",
|
|
"thread_id": "story_A",
|
|
"thread_position": 1,
|
|
"x_pt": 56.69,
|
|
"y_pt": 70.87,
|
|
"width_pt": 481.89,
|
|
"height_pt": 200,
|
|
"column_count": 1,
|
|
"paragraphs": [
|
|
{
|
|
"style": "Heading 1",
|
|
"text": "Annual Report 2025",
|
|
"runs": [
|
|
{
|
|
"text": "Annual Report 2025",
|
|
"style": null,
|
|
"bold": false,
|
|
"italic": false,
|
|
"color_hex": null,
|
|
"font": null,
|
|
"size_pt": null
|
|
}
|
|
]
|
|
},
|
|
{
|
|
"style": "Body Text",
|
|
"text": "This report covers the financial year ending December 2025.",
|
|
"runs": [
|
|
{
|
|
"text": "This report covers the ",
|
|
"style": null,
|
|
"bold": false,
|
|
"italic": false,
|
|
"color_hex": null,
|
|
"font": null,
|
|
"size_pt": null
|
|
},
|
|
{
|
|
"text": "financial year",
|
|
"style": "Bold Run",
|
|
"bold": true,
|
|
"italic": false,
|
|
"color_hex": null,
|
|
"font": null,
|
|
"size_pt": null
|
|
},
|
|
{
|
|
"text": " ending December 2025.",
|
|
"style": null,
|
|
"bold": false,
|
|
"italic": false,
|
|
"color_hex": null,
|
|
"font": null,
|
|
"size_pt": null
|
|
}
|
|
]
|
|
}
|
|
]
|
|
},
|
|
{
|
|
"type": "image_frame",
|
|
"id": "frame_002",
|
|
"x_pt": 56.69,
|
|
"y_pt": 280,
|
|
"width_pt": 481.89,
|
|
"height_pt": 300,
|
|
"image_name": "hero.jpg",
|
|
"image_data_b64": "...",
|
|
"fit_mode": "fill_proportionally"
|
|
},
|
|
{
|
|
"type": "table",
|
|
"id": "frame_003",
|
|
"x_pt": 56.69,
|
|
"y_pt": 600,
|
|
"width_pt": 481.89,
|
|
"column_widths_pt": [120, 120, 120, 121.89],
|
|
"rows": [
|
|
{
|
|
"is_header": true,
|
|
"cells": [
|
|
{
|
|
"text": "Region",
|
|
"style": "Table Header",
|
|
"col_span": 1,
|
|
"row_span": 1,
|
|
"background_hex": "1a56db",
|
|
"text_color_hex": "FFFFFF"
|
|
}
|
|
]
|
|
}
|
|
]
|
|
},
|
|
{
|
|
"type": "page_number",
|
|
"id": "frame_004",
|
|
"x_pt": 56.69,
|
|
"y_pt": 800,
|
|
"font": "Proxima Nova",
|
|
"size_pt": 9,
|
|
"color_hex": "666666",
|
|
"alignment": "center"
|
|
}
|
|
]
|
|
}
|
|
]
|
|
}
|
|
```
|
|
|
|
### IDexport Script Implementation
|
|
|
|
```javascript
|
|
// IDexport.jsx — runs inside Adobe InDesign
|
|
|
|
#target indesign
|
|
|
|
function exportDocument() {
|
|
if (app.documents.length === 0) {
|
|
alert('No document is open. Please open an InDesign document first.');
|
|
return;
|
|
}
|
|
|
|
var doc = app.activeDocument;
|
|
|
|
if (!doc.saved) {
|
|
alert('Please save your document before exporting.');
|
|
return;
|
|
}
|
|
|
|
var output = {
|
|
version: '1.0',
|
|
generator: 'IDexport',
|
|
document: extractDocumentInfo(doc),
|
|
styles: extractStyles(doc),
|
|
colors: extractColors(doc),
|
|
fonts_used: extractFonts(doc),
|
|
pages: extractPages(doc)
|
|
};
|
|
|
|
var exportPath = doc.filePath + '/idconvert_export.json';
|
|
var file = new File(exportPath);
|
|
file.encoding = 'UTF-8';
|
|
file.open('w');
|
|
file.write(JSON.stringify(output, null, 2));
|
|
file.close();
|
|
|
|
alert('IDexport complete.\nFile saved to:\n' + exportPath +
|
|
'\n\nUpload idconvert_export.json to IDconvert to generate your Word document.');
|
|
}
|
|
|
|
function extractDocumentInfo(doc) {
|
|
var page = doc.pages[0];
|
|
return {
|
|
name: doc.name.replace('.indd', ''),
|
|
page_width_pt: page.bounds[3] - page.bounds[1],
|
|
page_height_pt: page.bounds[2] - page.bounds[0],
|
|
page_count: doc.pages.length,
|
|
facing_pages: doc.documentPreferences.facingPages
|
|
};
|
|
}
|
|
|
|
function extractStyles(doc) {
|
|
var paragraphStyles = [];
|
|
for (var i = 0; i < doc.paragraphStyles.length; i++) {
|
|
var s = doc.paragraphStyles[i];
|
|
if (s.name === '[No Paragraph Style]') continue;
|
|
paragraphStyles.push({
|
|
name: s.name,
|
|
font: safeGet(s, 'appliedFont', null) ?
|
|
s.appliedFont.name : null,
|
|
size_pt: safeGet(s, 'pointSize', null),
|
|
leading_pt: safeGet(s, 'leading', null),
|
|
space_before_pt: safeGet(s, 'spaceBefore', 0),
|
|
space_after_pt: safeGet(s, 'spaceAfter', 0),
|
|
alignment: alignmentToString(safeGet(s, 'justification', null)),
|
|
color_hex: resolveColor(safeGet(s, 'fillColor', null)),
|
|
bold: safeGet(s, 'fontStyle', '').toLowerCase().indexOf('bold') >= 0,
|
|
italic: safeGet(s, 'fontStyle', '').toLowerCase().indexOf('italic') >= 0
|
|
});
|
|
}
|
|
|
|
var characterStyles = [];
|
|
for (var j = 0; j < doc.characterStyles.length; j++) {
|
|
var cs = doc.characterStyles[j];
|
|
if (cs.name === '[No Character Style]') continue;
|
|
characterStyles.push({
|
|
name: cs.name,
|
|
font: safeGet(cs, 'appliedFont', null) ?
|
|
cs.appliedFont.name : null,
|
|
bold: safeGet(cs, 'fontStyle', '').toLowerCase().indexOf('bold') >= 0,
|
|
italic: safeGet(cs, 'fontStyle', '').toLowerCase().indexOf('italic') >= 0,
|
|
color_hex: resolveColor(safeGet(cs, 'fillColor', null))
|
|
});
|
|
}
|
|
|
|
return { paragraph: paragraphStyles, character: characterStyles };
|
|
}
|
|
|
|
function extractColors(doc) {
|
|
var colors = {};
|
|
for (var i = 0; i < doc.colors.length; i++) {
|
|
var c = doc.colors[i];
|
|
try {
|
|
colors[c.name] = colorToHex(c);
|
|
} catch(e) {}
|
|
}
|
|
return colors;
|
|
}
|
|
|
|
function extractFonts(doc) {
|
|
var fonts = [];
|
|
var seen = {};
|
|
for (var i = 0; i < doc.fonts.length; i++) {
|
|
var f = doc.fonts[i];
|
|
if (!seen[f.name]) {
|
|
seen[f.name] = true;
|
|
fonts.push({
|
|
name: f.name,
|
|
postscript: safeGet(f, 'postscriptName', f.name)
|
|
});
|
|
}
|
|
}
|
|
return fonts;
|
|
}
|
|
|
|
function extractPages(doc) {
|
|
var pages = [];
|
|
for (var i = 0; i < doc.pages.length; i++) {
|
|
var page = doc.pages[i];
|
|
pages.push({
|
|
page_number: page.name,
|
|
width_pt: page.bounds[3] - page.bounds[1],
|
|
height_pt: page.bounds[2] - page.bounds[0],
|
|
items: extractPageItems(page, doc)
|
|
});
|
|
}
|
|
return pages;
|
|
}
|
|
|
|
function extractPageItems(page, doc) {
|
|
var items = [];
|
|
var allItems = page.allPageItems;
|
|
|
|
for (var i = 0; i < allItems.length; i++) {
|
|
var item = allItems[i];
|
|
try {
|
|
var extracted = extractItem(item, page, doc);
|
|
if (extracted) items.push(extracted);
|
|
} catch(e) {
|
|
// skip items that can't be extracted
|
|
}
|
|
}
|
|
|
|
// Sort by Y then X — reading order top-to-bottom, left-to-right
|
|
items.sort(function(a, b) {
|
|
if (Math.abs(a.y_pt - b.y_pt) < 5) return a.x_pt - b.x_pt;
|
|
return a.y_pt - b.y_pt;
|
|
});
|
|
|
|
return items;
|
|
}
|
|
|
|
function extractItem(item, page, doc) {
|
|
var bounds = item.geometricBounds;
|
|
// bounds = [top, left, bottom, right] relative to page
|
|
var pageBounds = page.bounds;
|
|
var x = bounds[1] - pageBounds[1];
|
|
var y = bounds[0] - pageBounds[0];
|
|
var width = bounds[3] - bounds[1];
|
|
var height = bounds[2] - bounds[0];
|
|
|
|
// Text frame
|
|
if (item.constructor.name === 'TextFrame') {
|
|
return extractTextFrame(item, x, y, width, height);
|
|
}
|
|
|
|
// Image / graphic frame
|
|
if (item.constructor.name === 'Rectangle' ||
|
|
item.constructor.name === 'Oval' ||
|
|
item.constructor.name === 'GraphicFrame') {
|
|
if (item.graphics.length > 0) {
|
|
return extractImageFrame(item, x, y, width, height);
|
|
}
|
|
// Rectangle with no image — extract fill colour if significant
|
|
return extractShapeFrame(item, x, y, width, height);
|
|
}
|
|
|
|
return null;
|
|
}
|
|
|
|
function extractTextFrame(frame, x, y, width, height) {
|
|
// Detect page number marker
|
|
if (hasAutoPageNumber(frame)) {
|
|
return extractPageNumber(frame, x, y);
|
|
}
|
|
|
|
var paragraphs = [];
|
|
try {
|
|
var story = frame.parentStory;
|
|
for (var i = 0; i < story.paragraphs.length; i++) {
|
|
var para = story.paragraphs[i];
|
|
paragraphs.push(extractParagraph(para));
|
|
}
|
|
} catch(e) {}
|
|
|
|
// Thread info
|
|
var threadId = frame.parentStory.id;
|
|
var threadPos = 1;
|
|
try {
|
|
var prev = frame.previousTextFrame;
|
|
while (prev !== null) {
|
|
threadPos++;
|
|
prev = prev.previousTextFrame;
|
|
}
|
|
} catch(e) {}
|
|
|
|
return {
|
|
type: 'text_frame',
|
|
id: frame.id.toString(),
|
|
thread_id: threadId.toString(),
|
|
thread_position: threadPos,
|
|
x_pt: x,
|
|
y_pt: y,
|
|
width_pt: width,
|
|
height_pt: height,
|
|
column_count: frame.textFramePreferences.textColumnCount || 1,
|
|
paragraphs: paragraphs
|
|
};
|
|
}
|
|
|
|
function extractParagraph(para) {
|
|
var styleName = '[No Paragraph Style]';
|
|
try { styleName = para.appliedParagraphStyle.name; } catch(e) {}
|
|
|
|
var runs = [];
|
|
for (var i = 0; i < para.characters.length; i++) {
|
|
var char = para.characters[i];
|
|
// Group into runs by style change
|
|
var run = buildRun(char);
|
|
if (runs.length > 0 && runsMatch(runs[runs.length-1], run)) {
|
|
runs[runs.length-1].text += run.text;
|
|
} else {
|
|
runs.push(run);
|
|
}
|
|
}
|
|
|
|
return {
|
|
style: styleName,
|
|
text: para.contents,
|
|
runs: runs
|
|
};
|
|
}
|
|
|
|
function buildRun(char) {
|
|
var styleName = null;
|
|
try {
|
|
if (char.appliedCharacterStyle.name !== '[No Character Style]') {
|
|
styleName = char.appliedCharacterStyle.name;
|
|
}
|
|
} catch(e) {}
|
|
|
|
return {
|
|
text: char.contents,
|
|
style: styleName,
|
|
bold: safeGet(char, 'fontStyle', '').toLowerCase().indexOf('bold') >= 0,
|
|
italic: safeGet(char, 'fontStyle', '').toLowerCase().indexOf('italic') >= 0,
|
|
color_hex: resolveColor(safeGet(char, 'fillColor', null)),
|
|
font: safeGet(char, 'appliedFont', null) ? char.appliedFont.name : null,
|
|
size_pt: safeGet(char, 'pointSize', null)
|
|
};
|
|
}
|
|
|
|
function runsMatch(a, b) {
|
|
return a.style === b.style &&
|
|
a.bold === b.bold &&
|
|
a.italic === b.italic &&
|
|
a.color_hex === b.color_hex &&
|
|
a.font === b.font &&
|
|
a.size_pt === b.size_pt;
|
|
}
|
|
|
|
function extractImageFrame(frame, x, y, width, height) {
|
|
var imageData = null;
|
|
var imageName = 'image';
|
|
try {
|
|
var graphic = frame.graphics[0];
|
|
imageName = graphic.itemLink.name;
|
|
// Export image as base64 JPEG at screen resolution
|
|
var tempFile = new File(Folder.temp + '/idconvert_temp.jpg');
|
|
frame.exportFile(ExportFormat.JPG, tempFile, false);
|
|
imageData = fileToBase64(tempFile);
|
|
tempFile.remove();
|
|
} catch(e) {}
|
|
|
|
return {
|
|
type: 'image_frame',
|
|
id: frame.id.toString(),
|
|
x_pt: x,
|
|
y_pt: y,
|
|
width_pt: width,
|
|
height_pt: height,
|
|
image_name: imageName,
|
|
image_data_b64: imageData,
|
|
fit_mode: 'fill_proportionally'
|
|
};
|
|
}
|
|
|
|
function extractShapeFrame(frame, x, y, width, height) {
|
|
// Only export shapes that have a visible fill — skip empty rectangles
|
|
var fillColor = null;
|
|
try {
|
|
if (frame.fillColor.name !== 'None' &&
|
|
frame.fillColor.name !== '[None]') {
|
|
fillColor = colorToHex(frame.fillColor);
|
|
}
|
|
} catch(e) {}
|
|
|
|
if (!fillColor) return null;
|
|
|
|
return {
|
|
type: 'shape',
|
|
id: frame.id.toString(),
|
|
x_pt: x,
|
|
y_pt: y,
|
|
width_pt: width,
|
|
height_pt: height,
|
|
fill_hex: fillColor
|
|
};
|
|
}
|
|
|
|
function extractPageNumber(frame, x, y) {
|
|
var run = null;
|
|
try {
|
|
run = frame.parentStory.characters[0];
|
|
} catch(e) {}
|
|
|
|
return {
|
|
type: 'page_number',
|
|
id: frame.id.toString(),
|
|
x_pt: x,
|
|
y_pt: y,
|
|
font: run ? (run.appliedFont ? run.appliedFont.name : null) : null,
|
|
size_pt: run ? safeGet(run, 'pointSize', 9) : 9,
|
|
color_hex: run ? resolveColor(safeGet(run, 'fillColor', null)) : '000000',
|
|
alignment: run ?
|
|
alignmentToString(safeGet(run, 'justification', null)) : 'center'
|
|
};
|
|
}
|
|
|
|
function hasAutoPageNumber(frame) {
|
|
try {
|
|
for (var i = 0; i < frame.parentStory.texts[0].characters.length; i++) {
|
|
var char = frame.parentStory.texts[0].characters[i];
|
|
if (char.contents === SpecialCharacters.AUTO_PAGE_NUMBER) {
|
|
return true;
|
|
}
|
|
}
|
|
} catch(e) {}
|
|
return false;
|
|
}
|
|
|
|
// --- Utility functions ---
|
|
|
|
function resolveColor(colorObj) {
|
|
if (!colorObj) return null;
|
|
try {
|
|
if (colorObj.name === 'None' || colorObj.name === '[None]') return null;
|
|
return colorToHex(colorObj);
|
|
} catch(e) { return null; }
|
|
}
|
|
|
|
function colorToHex(colorObj) {
|
|
try {
|
|
var vals = colorObj.colorValue;
|
|
if (colorObj.space === ColorSpace.CMYK) {
|
|
// Convert CMYK to RGB
|
|
var c = vals[0]/100, m = vals[1]/100,
|
|
y = vals[2]/100, k = vals[3]/100;
|
|
var r = Math.round(255 * (1-c) * (1-k));
|
|
var g = Math.round(255 * (1-m) * (1-k));
|
|
var b = Math.round(255 * (1-y) * (1-k));
|
|
return toHex(r) + toHex(g) + toHex(b);
|
|
}
|
|
if (colorObj.space === ColorSpace.RGB) {
|
|
return toHex(vals[0]) + toHex(vals[1]) + toHex(vals[2]);
|
|
}
|
|
} catch(e) {}
|
|
return '000000';
|
|
}
|
|
|
|
function toHex(n) {
|
|
var h = Math.max(0, Math.min(255, Math.round(n))).toString(16);
|
|
return h.length === 1 ? '0' + h : h;
|
|
}
|
|
|
|
function alignmentToString(justification) {
|
|
if (!justification) return 'left';
|
|
var map = {
|
|
1: 'left', 2: 'center', 3: 'right', 4: 'justify',
|
|
1514227313: 'left', 1514731619: 'center',
|
|
1514731618: 'right', 1514599026: 'justify'
|
|
};
|
|
return map[justification] || 'left';
|
|
}
|
|
|
|
function fileToBase64(file) {
|
|
file.open('r');
|
|
file.encoding = 'BINARY';
|
|
var content = file.read();
|
|
file.close();
|
|
// InDesign ExtendScript doesn't have btoa — use manual base64
|
|
return btoa(content);
|
|
}
|
|
|
|
function safeGet(obj, prop, fallback) {
|
|
try { return obj[prop]; } catch(e) { return fallback; }
|
|
}
|
|
|
|
// Run
|
|
exportDocument();
|
|
```
|
|
|
|
---
|
|
|
|
## Web Tool — JSON to DOCX Conversion Pipeline
|
|
|
|
### Input Validation Gate
|
|
|
|
The upload endpoint validates `idconvert_export.json` before processing:
|
|
|
|
```python
|
|
# idml/validator.py — renamed to export/validator.py
|
|
def validate_export_json(content: bytes) -> dict:
|
|
# 1. Size check — JSON should not exceed 100MB
|
|
if len(content) > 100 * 1024 * 1024:
|
|
raise FileValidationError('File too large', 'FILE_TOO_LARGE')
|
|
|
|
# 2. Valid JSON
|
|
try:
|
|
data = json.loads(content)
|
|
except json.JSONDecodeError:
|
|
raise FileValidationError(
|
|
'This does not appear to be a valid IDexport file. '
|
|
'Please run IDexport.jsx in InDesign and upload the '
|
|
'idconvert_export.json file it produces.',
|
|
'INVALID_JSON'
|
|
)
|
|
|
|
# 3. IDexport signature check
|
|
if data.get('generator') != 'IDexport':
|
|
raise FileValidationError(
|
|
'This JSON file was not produced by IDexport. '
|
|
'Please run IDexport.jsx in InDesign to generate '
|
|
'a compatible export file.',
|
|
'WRONG_GENERATOR'
|
|
)
|
|
|
|
# 4. Version check
|
|
if data.get('version') not in ['1.0']:
|
|
raise FileValidationError(
|
|
'This IDexport file was created with an incompatible version. '
|
|
'Please download the latest IDexport.jsx from IDconvert.',
|
|
'VERSION_MISMATCH'
|
|
)
|
|
|
|
# 5. Required fields
|
|
required = ['document', 'styles', 'pages', 'fonts_used']
|
|
for field in required:
|
|
if field not in data:
|
|
raise FileValidationError(
|
|
f'IDexport file is missing required field: {field}. '
|
|
'Please re-run IDexport.jsx and try again.',
|
|
'MISSING_FIELD'
|
|
)
|
|
|
|
return data
|
|
```
|
|
|
|
### Conversion Pipeline
|
|
|
|
```python
|
|
# services/conversion_service.py
|
|
|
|
class ConversionService:
|
|
def convert(self, export_data: dict) -> bytes:
|
|
|
|
# 1. Build document model from JSON
|
|
document = IDExportParser.parse(export_data)
|
|
|
|
# 2. Create Word document
|
|
doc = Document()
|
|
self._apply_page_setup(doc, document)
|
|
self._register_styles(doc, document.styles)
|
|
|
|
# 3. Process pages in order
|
|
for page in document.pages:
|
|
self._process_page(doc, page, document)
|
|
|
|
# 4. Add footer with page numbers
|
|
if document.page_number_style:
|
|
FooterFactory.apply(doc, document.page_number_style)
|
|
|
|
# 5. Return DOCX bytes
|
|
buffer = BytesIO()
|
|
doc.save(buffer)
|
|
return buffer.getvalue()
|
|
|
|
def _process_page(self, doc, page, document):
|
|
# Items are already sorted in reading order by IDexport
|
|
for item in page.items:
|
|
if item.type == 'text_frame':
|
|
self._add_text_frame(doc, item, document)
|
|
elif item.type == 'image_frame':
|
|
self._add_image(doc, item)
|
|
elif item.type == 'table':
|
|
self._add_table(doc, item, document)
|
|
elif item.type == 'page_number':
|
|
pass # handled by footer
|
|
elif item.type == 'shape':
|
|
pass # shapes not converted at MVP
|
|
|
|
# Page break between pages (except last)
|
|
doc.add_page_break()
|
|
|
|
def _add_text_frame(self, doc, item, document):
|
|
for para_data in item.paragraphs:
|
|
p = doc.add_paragraph()
|
|
style = StyleFactory.get_word_style(
|
|
para_data.style, document.styles
|
|
)
|
|
if style:
|
|
p.style = style
|
|
|
|
for run_data in para_data.runs:
|
|
run = p.add_run(run_data.text)
|
|
StyleFactory.apply_run_formatting(run, run_data)
|
|
```
|
|
|
|
### Style Mapping — The Core Value
|
|
|
|
```python
|
|
# docx/factories.py
|
|
|
|
HEADING_STYLE_MAP = {
|
|
# InDesign style name patterns → Word built-in style
|
|
'heading 1': 'Heading 1',
|
|
'h1': 'Heading 1',
|
|
'title': 'Heading 1',
|
|
'heading 2': 'Heading 2',
|
|
'h2': 'Heading 2',
|
|
'subheading': 'Heading 2',
|
|
'heading 3': 'Heading 3',
|
|
'h3': 'Heading 3',
|
|
'caption': 'Caption',
|
|
'table header': 'Table Grid',
|
|
}
|
|
|
|
class StyleFactory:
|
|
@staticmethod
|
|
def register_styles(doc: Document, styles: IDExportStyles) -> None:
|
|
"""Register all InDesign paragraph styles as Word custom styles"""
|
|
for style_data in styles.paragraph:
|
|
word_style_name = StyleFactory._map_to_builtin(style_data.name)
|
|
if word_style_name:
|
|
# Map to Word built-in — modify it to match InDesign values
|
|
style = doc.styles[word_style_name]
|
|
else:
|
|
# Create custom style
|
|
try:
|
|
style = doc.styles.add_style(
|
|
style_data.name, WD_STYLE_TYPE.PARAGRAPH
|
|
)
|
|
except:
|
|
style = doc.styles[style_data.name]
|
|
|
|
# Apply InDesign style properties
|
|
if style_data.font:
|
|
style.font.name = style_data.font
|
|
if style_data.size_pt:
|
|
style.font.size = Pt(style_data.size_pt)
|
|
if style_data.color_hex:
|
|
style.font.color.rgb = RGBColor.from_string(style_data.color_hex)
|
|
if style_data.bold is not None:
|
|
style.font.bold = style_data.bold
|
|
if style_data.italic is not None:
|
|
style.font.italic = style_data.italic
|
|
if style_data.space_before_pt:
|
|
style.paragraph_format.space_before = Pt(style_data.space_before_pt)
|
|
if style_data.space_after_pt:
|
|
style.paragraph_format.space_after = Pt(style_data.space_after_pt)
|
|
if style_data.leading_pt:
|
|
style.paragraph_format.line_spacing = Pt(style_data.leading_pt)
|
|
if style_data.alignment:
|
|
style.paragraph_format.alignment = ALIGNMENT_MAP[style_data.alignment]
|
|
|
|
@staticmethod
|
|
def apply_run_formatting(run, run_data) -> None:
|
|
"""Apply character-level formatting to a Word run"""
|
|
if run_data.bold:
|
|
run.bold = True
|
|
if run_data.italic:
|
|
run.italic = True
|
|
if run_data.font:
|
|
run.font.name = run_data.font
|
|
if run_data.size_pt:
|
|
run.font.size = Pt(run_data.size_pt)
|
|
if run_data.color_hex:
|
|
run.font.color.rgb = RGBColor.from_string(run_data.color_hex)
|
|
|
|
@staticmethod
|
|
def _map_to_builtin(style_name: str) -> str | None:
|
|
return HEADING_STYLE_MAP.get(style_name.lower().strip())
|
|
```
|
|
|
|
---
|
|
|
|
## Architecture Patterns
|
|
|
|
Three patterns throughout. No exceptions without discussion.
|
|
|
|
```
|
|
FACTORIES → build objects from raw data
|
|
SERVICES → orchestrate business logic
|
|
REPOSITORIES → own all data access
|
|
ROUTERS → handle HTTP only
|
|
```
|
|
|
|
### Routers — HTTP only
|
|
```python
|
|
@router.post('/scan', response_model=ScanReport)
|
|
async def scan(
|
|
file: UploadFile,
|
|
user=Depends(get_current_user),
|
|
service: ScanService = Depends(get_scan_service)
|
|
):
|
|
return await service.scan(file, user.id)
|
|
```
|
|
|
|
### Services — business logic only
|
|
```python
|
|
class ScanService:
|
|
async def scan(self, file: UploadFile, user_id: str) -> ScanReport:
|
|
content = await file.read()
|
|
data = self.validator.validate(content)
|
|
report = self.scanner.scan(data)
|
|
await self.repo.cache_scan(user_id, report)
|
|
return report
|
|
```
|
|
|
|
### Repositories — data access only
|
|
```python
|
|
class ConversionRepository:
|
|
async def cache_scan(self, user_id: str, report: dict) -> str:
|
|
record = await self.pb.collection('scan_cache').create({
|
|
'user': user_id, 'report': report
|
|
})
|
|
return record.id
|
|
```
|
|
|
|
### Factories — object construction only
|
|
```python
|
|
class StyleFactory:
|
|
@staticmethod
|
|
def from_export_style(data: dict) -> IDExportStyle:
|
|
return IDExportStyle(
|
|
name=data['name'],
|
|
font=data.get('font'),
|
|
size_pt=data.get('size_pt'),
|
|
# ...
|
|
)
|
|
```
|
|
|
|
---
|
|
|
|
## Repository Structure
|
|
|
|
```
|
|
idconvert/
|
|
├── CLAUDE.md
|
|
├── BRIEF.md
|
|
│
|
|
├── IDexport.jsx ← InDesign script — single file, distributed free
|
|
│
|
|
├── pb_hooks/
|
|
│ └── credits.pb.js ← atomic credit logic
|
|
│
|
|
├── backend/
|
|
│ ├── main.py ← FastAPI app init, middleware, router registration
|
|
│ ├── config.py ← pydantic-settings, env vars only
|
|
│ ├── dependencies.py ← FastAPI DI wiring
|
|
│ │
|
|
│ ├── routers/
|
|
│ │ ├── __init__.py
|
|
│ │ ├── upload.py ← POST /scan, POST /convert
|
|
│ │ ├── payments.py ← POST /payments/create-intent, /webhook
|
|
│ │ └── users.py ← GET /users/me
|
|
│ │
|
|
│ ├── services/
|
|
│ │ ├── __init__.py
|
|
│ │ ├── scan_service.py ← scan JSON, build report
|
|
│ │ ├── conversion_service.py ← parse JSON, build DOCX
|
|
│ │ ├── payment_service.py ← Stripe handling
|
|
│ │ └── user_service.py ← user reads
|
|
│ │
|
|
│ ├── repositories/
|
|
│ │ ├── __init__.py
|
|
│ │ ├── user_repository.py
|
|
│ │ ├── conversion_repository.py
|
|
│ │ └── transaction_repository.py
|
|
│ │
|
|
│ ├── export/ ← IDexport JSON domain
|
|
│ │ ├── __init__.py
|
|
│ │ ├── validator.py ← JSON validation gate
|
|
│ │ ├── scanner.py ← lightweight scan for report
|
|
│ │ ├── parser.py ← full parse into domain models
|
|
│ │ ├── factories.py ← IDExportDocumentFactory, IDExportPageFactory
|
|
│ │ ├── fonts.py ← font classification and substitution map
|
|
│ │ └── models.py ← IDExportDocument, IDExportPage,
|
|
│ │ IDExportFrame, IDExportParagraph,
|
|
│ │ IDExportRun, IDExportStyle
|
|
│ │
|
|
│ ├── docx/ ← DOCX output domain
|
|
│ │ ├── __init__.py
|
|
│ │ ├── builder.py ← orchestrates DOCX construction
|
|
│ │ ├── factories.py ← StyleFactory, TableFactory,
|
|
│ │ │ ImageFactory, FooterFactory
|
|
│ │ ├── units.py ← Pt(), Inches() helpers
|
|
│ │ └── models.py ← internal DOCX domain models
|
|
│ │
|
|
│ ├── models/ ← Pydantic request/response schemas
|
|
│ │ ├── __init__.py
|
|
│ │ ├── scan.py ← ScanReport, FontEntry, Warning
|
|
│ │ ├── conversion.py ← ConversionRequest, ConversionResult
|
|
│ │ ├── payment.py
|
|
│ │ └── user.py
|
|
│ │
|
|
│ └── core/
|
|
│ ├── __init__.py
|
|
│ ├── exceptions.py
|
|
│ ├── logging.py
|
|
│ └── storage.py
|
|
│
|
|
├── frontend/
|
|
│ ├── src/
|
|
│ │ ├── main.js
|
|
│ │ ├── App.vue
|
|
│ │ ├── router/index.js
|
|
│ │ ├── stores/
|
|
│ │ │ ├── auth.js
|
|
│ │ │ ├── upload.js
|
|
│ │ │ └── ui.js
|
|
│ │ ├── services/
|
|
│ │ │ ├── api.js
|
|
│ │ │ ├── uploadService.js
|
|
│ │ │ ├── paymentService.js
|
|
│ │ │ ├── userService.js
|
|
│ │ │ └── factories/
|
|
│ │ │ ├── scanFactory.js
|
|
│ │ │ └── userFactory.js
|
|
│ │ ├── components/
|
|
│ │ │ ├── common/
|
|
│ │ │ │ ├── BaseButton.vue
|
|
│ │ │ │ ├── BaseCard.vue
|
|
│ │ │ │ ├── BaseBadge.vue
|
|
│ │ │ │ ├── BaseSpinner.vue
|
|
│ │ │ │ └── BaseAlert.vue
|
|
│ │ │ ├── upload/
|
|
│ │ │ │ ├── UploadZone.vue
|
|
│ │ │ │ └── FilePreview.vue
|
|
│ │ │ ├── scan/
|
|
│ │ │ │ ├── ScanReport.vue
|
|
│ │ │ │ ├── StatCards.vue
|
|
│ │ │ │ ├── FontReport.vue
|
|
│ │ │ │ └── WarningList.vue
|
|
│ │ │ ├── conversion/
|
|
│ │ │ │ ├── ProcessingState.vue
|
|
│ │ │ │ └── DownloadState.vue
|
|
│ │ │ ├── pricing/
|
|
│ │ │ │ └── PricingCard.vue
|
|
│ │ │ └── layout/
|
|
│ │ │ ├── AppNav.vue
|
|
│ │ │ └── AppFooter.vue
|
|
│ │ └── views/
|
|
│ │ ├── HomeView.vue
|
|
│ │ ├── PricingView.vue
|
|
│ │ ├── DashboardView.vue
|
|
│ │ └── LoginView.vue
|
|
│ ├── vite.config.js
|
|
│ ├── tailwind.config.js
|
|
│ └── package.json
|
|
│
|
|
├── tests/
|
|
│ ├── backend/
|
|
│ │ ├── export/
|
|
│ │ │ ├── test_validator.py
|
|
│ │ │ ├── test_factories.py
|
|
│ │ │ └── test_scanner.py
|
|
│ │ ├── docx/
|
|
│ │ │ ├── test_factories.py
|
|
│ │ │ └── test_builder.py
|
|
│ │ └── services/
|
|
│ │ ├── test_scan_service.py
|
|
│ │ └── test_conversion_service.py
|
|
│ └── fixtures/
|
|
│ └── sample_export.json ← real IDexport output for integration tests
|
|
│
|
|
├── docker-compose.yml
|
|
└── .env.example
|
|
```
|
|
|
|
---
|
|
|
|
## Tech Stack
|
|
|
|
| Layer | Technology |
|
|
|---|---|
|
|
| InDesign script | ExtendScript (.jsx) — IDexport.jsx |
|
|
| Web frontend | Vue 3 + Vite + Tailwind CSS |
|
|
| Backend API | FastAPI (Python) |
|
|
| Export JSON parsing | Python `json` + Pydantic |
|
|
| DOCX generation | `python-docx` |
|
|
| Font registry | Custom Python classification dict |
|
|
| Auth + credits | PocketBase + JS hooks |
|
|
| File storage | S3-compatible object storage |
|
|
| Payments | Stripe |
|
|
| Deployment | Dokploy |
|
|
|
|
Note: `zipfile` and `lxml` are no longer needed — IDML parsing is replaced
|
|
by JSON parsing. Remove these from requirements.txt.
|
|
|
|
---
|
|
|
|
## Pre-Conversion Scan Report
|
|
|
|
Same scan report UX as before — runs on the uploaded JSON before credits are spent.
|
|
The scanner reads the JSON and collects:
|
|
|
|
- Page count, frame count, image count, table count
|
|
- Font classification (safe / professional / unknown) with substitute mapping
|
|
- Warnings for any items that cannot be converted (shapes, unsupported elements)
|
|
- IDexport version compatibility check
|
|
|
|
Font classification and substitution map — unchanged from previous version.
|
|
|
|
---
|
|
|
|
## Credit Pack Pricing
|
|
|
|
**Every registered user gets 1 free conversion automatically.**
|
|
Tracked via `free_used: bool` on the user record. Not a tier.
|
|
|
|
| Pack | Price | Credits | Per Conversion |
|
|
|---|---|---|---|
|
|
| Starter | $19 | 5 credits | $3.80 |
|
|
| Studio | $59 | 20 credits | $2.95 |
|
|
| Agency | $149 | 60 credits | $2.48 |
|
|
|
|
---
|
|
|
|
## Credits — PocketBase Hooks
|
|
|
|
All credit logic in `pb_hooks/credits.pb.js` using SQLite transactions.
|
|
Unchanged from previous version — hook fires on conversions collection create,
|
|
deducts atomically, refunds on failed status update.
|
|
|
|
---
|
|
|
|
## File Validation
|
|
|
|
Upload accepts only `idconvert_export.json` files produced by IDexport.jsx.
|
|
Validation checks: size limit, valid JSON, generator signature, version
|
|
compatibility, required fields present.
|
|
Reject with specific helpful message if generator is not IDexport.
|
|
|
|
---
|
|
|
|
## Anti-Abuse
|
|
|
|
- Email verification with disposable domain blocking
|
|
- Browser fingerprint via FingerprintJS
|
|
- File hash — same JSON file cannot be converted free twice
|
|
- IP rate limiting via slowapi (Redis-backed for multi-instance)
|
|
|
|
File hash is still effective: each `idconvert_export.json` is unique per
|
|
InDesign document and per export run (includes timestamp).
|
|
|
|
---
|
|
|
|
## UI Design
|
|
|
|
Exact ilovepdf.com replication — all specs unchanged from previous version.
|
|
Upload zone accepts `.json` files only.
|
|
Update file type label: "Drop idconvert_export.json here"
|
|
Update trust line: "Run IDexport.jsx in InDesign to generate your export file"
|
|
|
|
---
|
|
|
|
## User-Facing Feature List
|
|
|
|
- **Text and styles preserved** — every paragraph style from InDesign is recreated as a real Word style your team can edit globally
|
|
- **Character formatting carried over** — bold, italic, colour and font applied at the character level, not lost in conversion
|
|
- **Images included** — all placed images are embedded at their correct proportions
|
|
- **Tables converted** — tables come across as real Word tables with correct column widths, ready to edit
|
|
- **Hyperlinks active** — all links survive the conversion and remain clickable
|
|
- **Page numbers matched** — font, size and colour of page numbers are matched as live Word page numbers
|
|
- **Font report included** — before converting, you are told exactly which fonts need to be installed on your client's machine
|
|
- **Honest warnings upfront** — anything that cannot be converted is flagged before you spend a credit
|
|
|
|
---
|
|
|
|
## Accepted Limitations (Communicate Clearly in UI)
|
|
|
|
- Column layouts flow sequentially — side-by-side columns become top-to-bottom flow
|
|
- Decorative shapes and rules are not converted
|
|
- Master page decorative elements are excluded
|
|
- Pixel-perfect visual replication is not the goal — accurate, editable content is
|
|
|
|
---
|
|
|
|
## Key Changes From Previous Architecture
|
|
|
|
| Previous | Now |
|
|
|---|---|
|
|
| Parses IDML ZIP + XML | Parses IDexport JSON |
|
|
| Reconstructs coordinates from IDML | Uses InDesign-resolved coordinates from JSON |
|
|
| IDML threading detection | Thread order resolved by InDesign, exported in JSON |
|
|
| Anchored text boxes for layout | Flowing document — sequential paragraphs |
|
|
| `zipfile` + `lxml` dependencies | `json` + Pydantic only |
|
|
| IDtag (Phase 2 companion script) | IDexport (Phase 1 core script) |
|
|
| Upload .idml file | Upload idconvert_export.json |
|
|
| IDML domain (`idml/`) | Export domain (`export/`) |
|
|
|
|
---
|
|
|
|
## Competitive Positioning
|
|
|
|
| | ID2Office | IDconvert |
|
|
|---|---|---|
|
|
| Requires InDesign on designer machine | Yes | Yes (to run IDexport) |
|
|
| Requires InDesign on client machine | No | No |
|
|
| Price | $229/year | From $19 |
|
|
| Pre-conversion warnings | No | Yes |
|
|
| Font report | No | Yes |
|
|
| Style fidelity | High (native plugin) | High (resolved JSON) |
|
|
| Layout fidelity | High | Low (intentionally flows) |
|
|
| Best for | Designers needing visual layout | Designers needing editable content handoff |
|
|
|
|
---
|
|
|
|
## Notes for Claude Code
|
|
|
|
- Always check this file before making architectural decisions
|
|
- The input is `idconvert_export.json` — not IDML, not .indd
|
|
- The output is a flowing Word document — not positioned text boxes
|
|
- `python-docx` is the DOCX library — use `doc.add_paragraph()`, `doc.add_picture()`, `doc.add_table()`
|
|
- Style registration happens once at document creation — all styles registered before any content is added
|
|
- All PocketBase calls go through repository classes only
|
|
- All frontend API calls go through service files only
|
|
- Use dependency injection for all services and repositories
|
|
- All exceptions use the custom hierarchy in `core/exceptions.py`
|
|
- Use structlog for all logging — never print()
|
|
- Ask before adding dependencies not listed in the tech stack
|
|
- Phase 2 items stubbed with TODO comments only — not implemented
|
|
- Write docstrings on all public methods in services and repositories
|
|
- Every new module gets a corresponding stub test file in tests/
|