📄 XML Renderer: Accessible PDFs from XML

ObviousPDF.Xml renders XML documents into tagged, accessible PDFs. Set accessible="true" for automatic PDF/UA-1 compliance, with no manual structure tree needed.

📦 Installation

The XML renderer ships as a separate NuGet package:

dotnet add package ObviousPDF
dotnet add package ObviousPDF.Xml

🆚 Three Input Formats, Same Accessible Output

ObviousPDF offers three input pipelines that all produce identical PDF output. Choose the format that fits your workflow:

| Pipeline | Package | Best For |
|---|---|---|
| JSON | ObviousPDF.Json | Web APIs, JavaScript tooling, LLM/AI generation |
| XML | ObviousPDF.Xml | Enterprise workflows, XSLT transforms, schema validation |
| CSV | ObviousPDF.Csv | Spreadsheet authoring (Excel, Google Sheets), bulk data |

C#: Render Accessible PDF from XML
using System;
using System.IO;
using ObviousPDF;
using ObviousPDF.Accessibility;
using ObviousPDF.Xml;

// Render from an XML file to a PDF file
PdfXmlRenderer.RenderFromFile(
    xmlPath:    "form-1040ea.xml",
    outputPath: "form-1040ea.pdf");

// Or render from an XML string in memory
string xml = File.ReadAllText("form-1040ea.xml");
PdfXmlRenderer.Render(xml, "form-1040ea.pdf");

// Or build a PdfDocument for further manipulation.
// (C# requires using directives before any statement, so they are grouped at the top.)

PdfDocument doc = PdfXmlRenderer.BuildDocument(xml);

// Run the built-in accessibility checker
var checker = new PdfAccessibilityChecker();
PdfAccessibilityReport report = checker.Check(doc);

foreach (var result in report.Results)
    Console.WriteLine($"[{result.Feature}] {result.Description}");

doc.Save("form-1040ea.pdf");

PowerShell: Render Accessible PDF from XML
Add-Type -Path "ObviousPDF.dll"
Add-Type -Path "ObviousPDF.Xml.dll"

# Render from XML file to PDF
[ObviousPDF.Xml.PdfXmlRenderer]::RenderFromFile(
    "form-1040ea.xml",
    "form-1040ea.pdf")

# Or render from an XML string
$xml = Get-Content "form-1040ea.xml" -Raw
[ObviousPDF.Xml.PdfXmlRenderer]::Render($xml, "form-1040ea.pdf")

The form-1040ea.xml file is a complete 3-page IRS Form 1040 (2024) with Schedule 1. Its visual output is identical to the JSON and CSV versions, and it uses accessible="true" for automatic PDF/UA-1 conformance.

Form 1040 (Easy Accessibility): XML

The same IRS Form 1040 from the JSON example, expressed as XML. Setting accessible="true" on the root <Document> element enables the same automatic accessibility features: tagged PDF, PDF/UA-1, auto-tagging, font auto-upgrade, and display of the document title.

form-1040ea.xml: Document Root & First Page (Excerpt)
<?xml version="1.0" encoding="utf-8"?>
<Document xmlns="https://obviouspdf.com/schemas/xml/1.0"
          coordinateOrigin="topLeft"
          accessible="true">

  <DocumentSettings language="en-US" displayDocTitle="true" pdfVersion="2.0">
    <Info title="Form 1040 - U.S. Individual Income Tax Return (2024)"
          author="Department of the Treasury - Internal Revenue Service"
          subject="U.S. Individual Income Tax Return"
          keywords="IRS, 1040, tax, income, federal, 2024"
          creator="ObviousPDF.Xml" />
  </DocumentSettings>

  <Fonts>
    <Font id="title-font" mode="standardFont" standardFont="HelveticaBold" />
    <Font id="body-font"  mode="standardFont" standardFont="Helvetica" />
    <Font id="section-font" mode="standardFont" standardFont="HelveticaBold" />
  </Fonts>

  <Pages>
    <Page size="Letter">
      <Content>
        <!-- Background: artifact, excluded from tag tree -->
        <Rectangle x="0" y="0" width="612" height="792"
                   mode="fill" artifact="true" artifactType="background">
          <DrawOptions><FillColor name="white" /></DrawOptions>
        </Rectangle>

        <!-- Title: tagged as H1 via structureTag -->
        <Text x="160" y="28" structureTag="H1">
          U.S. Individual Income Tax Return
          <TextOptions fontRef="title-font" fontSize="14">
            <TextColor name="black" />
          </TextOptions>
        </Text>

        <!-- Section header -->
        <Text x="36" y="90" structureTag="H2">
          Filing Status
          <TextOptions fontRef="section-font" fontSize="10">
            <TextColor name="white" />
          </TextOptions>
        </Text>

        <!-- Form field: auto-tagged as Form element -->
        <CheckboxField name="filing_single" x="40" y="110"
                       width="12" height="12"
                       tooltip="Filing status: Single" />

        <!-- Text field with tooltip for accessibility -->
        <TextField name="first_name" x="36" y="185"
                   width="200" height="18" required="true"
                   tooltip="Your first name (required)" />
      </Content>
    </Page>
  </Pages>
</Document>

What accessible="true" Does

Setting accessible="true" on the <Document> element automatically enables:

| Feature | What Happens | Standard Reference |
|---|---|---|
| Tagged PDF | Structure tree root created; /MarkInfo << /Marked true >> | ISO 32000-2 §14.8 |
| PDF/UA-1 conformance | XMP metadata declares pdfuaid:part = 1 | ISO 14289-1 §7.1 |
| Auto-tag text | Every non-artifact <Text> → <P> structure element | Matterhorn 01-003 |
| Auto-tag images | Every non-artifact <Image> → <Figure> with /Alt | ISO 14289-1 §7.3 |
| Auto-tag form fields | Every form field → <P> → <Form> | ISO 14289-1 §7.18.4 |
| Font auto-upgrade | Standard 14 fonts → embedded with Unicode CMap | ISO 14289-1 §7.21 |
| Display document title | /ViewerPreferences << /DisplayDocTitle true >> | Matterhorn 07-001 |
| Tab order | Pages with annotations get /Tabs /S | Matterhorn 28-008 |

XML vs JSON: Same Form, Different Syntax

Compare how the same form field is expressed in XML and JSON:

XML
<TextField name="first_name"
           x="36" y="185"
           width="200" height="18"
           required="true"
           tooltip="Your first name (required)" />
JSON
{
  "type": "textField",
  "name": "first_name",
  "x": 36, "y": 185,
  "width": 200, "height": 18,
  "required": true,
  "tooltip": "Your first name (required)"
}
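
The correspondence between the two syntaxes can be sketched in C# using only System.Xml.Linq and System.Text.Json (.NET 6 or later). This is an illustration, not part of ObviousPDF: the FieldConverter class is hypothetical, and it assumes that the JSON "type" value is the XML element name with its first letter lower-cased and that attribute names carry over unchanged, which is inferred from the single TextField example above.

```csharp
using System.Globalization;
using System.Text.Json.Nodes;
using System.Xml.Linq;

// Hypothetical helper, not an ObviousPDF API.
public static class FieldConverter
{
    public static JsonObject ToJson(XElement element)
    {
        string local = element.Name.LocalName;
        var json = new JsonObject
        {
            // Assumption: TextField -> "textField" (lower-case first letter).
            ["type"] = char.ToLowerInvariant(local[0]) + local.Substring(1)
        };

        foreach (XAttribute attr in element.Attributes())
        {
            if (attr.IsNamespaceDeclaration)
                continue; // skip xmlns declarations

            string key = attr.Name.LocalName;
            string value = attr.Value;

            // Heuristic typing: XML attributes are always strings, while the
            // JSON example uses real booleans and numbers.
            if (bool.TryParse(value, out bool flag))
                json[key] = flag;
            else if (double.TryParse(value, NumberStyles.Float,
                                     CultureInfo.InvariantCulture, out double number))
                json[key] = number;
            else
                json[key] = value;
        }

        return json;
    }
}
```

Calling FieldConverter.ToJson(XElement.Parse(fieldXml)).ToJsonString() on the TextField element above should yield an object with the same keys as the JSON listing. The type inference is a heuristic: a tooltip whose text happens to be "123" or "true" would be converted too, so a real converter would need the schema's attribute types.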

Figure: Form 1040 Page 1, rendered from XML with accessible="true", showing the Filing Status, Name and Address, Income, and Adjustments to Income sections.

♿ Accessibility Notes

  • PAC validated: PDFs rendered from the XML pipeline pass PAC (PDF Accessibility Checker) for both PDF/UA and WCAG 2.2.
  • Form fields: always include a tooltip attribute, which becomes the /TU entry (Matterhorn checkpoint 28).
  • Artifacts: decorative elements must use artifact="true" to be excluded from the tag tree (ISO 14289-1 §7.1).
  • Language: set language on <DocumentSettings> to a valid BCP-47 tag (ISO 14289-1 §7.2).
  • Document title: set title on <Info> so the viewer displays the title (Matterhorn 07-001).
  • Use structureTag="H1" / "H2" etc. on <Text> elements to override the default <P> auto-tag for richer semantics.
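
Several of these notes can be checked mechanically before rendering. The sketch below is a hypothetical pre-flight check written against System.Xml.Linq only; it does not use or replace PdfAccessibilityChecker. It assumes the element and attribute names shown in the excerpt above, covers only the two field elements that appear there (TextField and CheckboxField), and tests only that language is present, not that it is a valid BCP-47 tag.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical pre-flight lint, not an ObviousPDF API.
public static class AccessibilityLint
{
    // Namespace taken from the sample document's root element.
    private static readonly XNamespace Ns = "https://obviouspdf.com/schemas/xml/1.0";

    // Assumption: only the field elements shown in the excerpt; extend as needed.
    private static readonly string[] FieldElements = { "TextField", "CheckboxField" };

    public static List<string> Check(string xml)
    {
        var problems = new List<string>();
        XDocument doc = XDocument.Parse(xml);

        // Language: <DocumentSettings language="..."> must be present and non-empty.
        var settings = doc.Descendants(Ns + "DocumentSettings").FirstOrDefault();
        if (string.IsNullOrWhiteSpace(settings?.Attribute("language")?.Value))
            problems.Add("DocumentSettings has no language attribute.");

        // Document title: <Info title="..."> must be present and non-empty.
        var info = doc.Descendants(Ns + "Info").FirstOrDefault();
        if (string.IsNullOrWhiteSpace(info?.Attribute("title")?.Value))
            problems.Add("Info has no title attribute.");

        // Form fields: every field needs a tooltip (it becomes the /TU entry).
        var fields = doc.Descendants()
            .Where(e => e.Name.Namespace == Ns && FieldElements.Contains(e.Name.LocalName));
        foreach (XElement field in fields)
        {
            if (string.IsNullOrWhiteSpace(field.Attribute("tooltip")?.Value))
                problems.Add($"{field.Name.LocalName} '{field.Attribute("name")?.Value}' has no tooltip.");
        }

        return problems;
    }
}
```

An empty list from AccessibilityLint.Check(File.ReadAllText("form-1040ea.xml")) means none of these three omissions was found; it says nothing about the remaining notes, such as whether decorative elements are marked as artifacts.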

Override structureTag for Semantic Structure

By default, accessible="true" auto-tags every non-artifact text element as <P>. Use the structureTag attribute to apply richer semantics:

XML: structureTag Examples
<Page size="Letter">
  <Content>
    <!-- H1 heading -->
    <Text x="160" y="28" structureTag="H1">
      Form Title
      <TextOptions fontRef="title-font" fontSize="14" />
    </Text>

    <!-- H2 heading -->
    <Text x="36" y="90" structureTag="H2">
      Section Header
      <TextOptions fontRef="section-font" fontSize="10" />
    </Text>

    <!-- Default paragraph (same as structureTag="P") -->
    <Text x="36" y="130">
      Regular body text...
      <TextOptions fontRef="body-font" fontSize="9" />
    </Text>

    <!-- BlockQuote -->
    <Text x="36" y="500" structureTag="BlockQuote">
      * See instructions on page 2.
      <TextOptions fontRef="body-font" fontSize="8" />
    </Text>

    <!-- Artifact: excluded from tag tree -->
    <Rectangle x="0" y="780" width="612" height="12"
               mode="fill" artifact="true" artifactType="background">
      <DrawOptions><FillColor name="white" /></DrawOptions>
    </Rectangle>
  </Content>
</Page>

📋 Available Structure Tags

| Tag | Usage | Standard |
|---|---|---|
| H1–H6 | Headings (must start at H1, no skipped levels) | ISO 14289-1 §7.4 |
| P | Paragraph (default for auto-tagged text) | ISO 32000-2 §14.8.4 |
| BlockQuote | Block-level quotation | ISO 32000-2 §14.8.4 |
| Caption | Caption for a figure or table | ISO 32000-2 §14.8.4 |
| Span | Inline span of text | ISO 32000-2 §14.8.4 |
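
The heading rule in the table (start at H1, skip no levels) is easy to violate when structureTag values are assigned by hand. A minimal sketch of a check for it follows. HeadingOrderCheck is a hypothetical helper, not an ObviousPDF API, and it assumes that the order of elements in the XML file is the order in which headings end up in the tag tree.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper, not an ObviousPDF API.
public static class HeadingOrderCheck
{
    public static List<string> Check(string xml)
    {
        var problems = new List<string>();
        int previous = 0; // 0 means no heading has been seen yet

        // Descendants() walks the elements in document order.
        foreach (XElement element in XDocument.Parse(xml).Descendants())
        {
            var tag = element.Attribute("structureTag")?.Value;

            // Only H1..H6 are headings; P, Span, BlockQuote and so on are ignored.
            if (tag == null || tag.Length != 2 || tag[0] != 'H' || tag[1] < '1' || tag[1] > '6')
                continue;

            int level = tag[1] - '0';

            if (previous == 0 && level != 1)
                problems.Add($"First heading is H{level}; expected H1.");
            else if (level > previous + 1)
                problems.Add($"H{level} follows H{previous}; level H{previous + 1} was skipped.");

            previous = level;
        }

        return problems;
    }
}
```

Moving back up the hierarchy (for example H3 followed by H1) is accepted; only a jump of more than one level downwards, or a document that does not open with H1, is reported.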

Built-in Accessibility Report

Add generateAccessibilityReport="true" to the <Document> element, or set it via PdfXmlRendererOptions, to receive a detailed accessibility report after rendering:

XML: Enable Accessibility Report
<Document xmlns="https://obviouspdf.com/schemas/xml/1.0"
          accessible="true"
          generateAccessibilityReport="true">
  ...
</Document>

C#: Access the Accessibility Report
using System;
using System.IO;
using ObviousPDF.Xml;

string xml = File.ReadAllText("form-1040ea.xml");

// Option 1: RenderWithReport returns the report
PdfXmlRenderResult result = PdfXmlRenderer.RenderWithReport(xml, "output.pdf");

if (result.AccessibilityReport != null)
{
    foreach (var item in result.AccessibilityReport.Results)
        Console.WriteLine($"[{item.Feature}] {item.Description}");
}

// Option 2: Set via options object
var options = new PdfXmlRendererOptions
{
    GenerateAccessibilityReport = true
};
PdfXmlRenderer.RenderFromFile("form-1040ea.xml", "output.pdf", options);

⚠️ Disclaimer

The automated accessibility report assists in assessment but is not a comprehensive audit. To confirm WCAG or PDF/UA conformance, human assessment by an accessibility specialist is recommended.

📋 XML Schema

The ObviousPDF XML format uses the namespace https://obviouspdf.com/schemas/xml/1.0. All 26 content element types are supported: text, images, shapes, lines, annotations, form fields, FormXObjects, shadings, patterns, layers, and more.
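
When the XML is generated in code rather than written by hand, every element has to be created in this namespace. The following sketch builds a small document with LINQ to XML, using only element and attribute names that appear in the excerpt earlier on this page. MinimalDocument is a hypothetical helper, and whether so small a document satisfies the full schema is an assumption, not something confirmed here.

```csharp
using System.Xml.Linq;

// Hypothetical helper, not an ObviousPDF API.
public static class MinimalDocument
{
    // Namespace quoted in the text above.
    private static readonly XNamespace Ns = "https://obviouspdf.com/schemas/xml/1.0";

    public static XDocument Build(string title, string heading)
    {
        var info = new XElement(Ns + "Info", new XAttribute("title", title));

        var settings = new XElement(Ns + "DocumentSettings",
            new XAttribute("language", "en-US"),
            new XAttribute("displayDocTitle", "true"),
            info);

        var fonts = new XElement(Ns + "Fonts",
            new XElement(Ns + "Font",
                new XAttribute("id", "title-font"),
                new XAttribute("mode", "standardFont"),
                new XAttribute("standardFont", "HelveticaBold")));

        // Mixed content: the heading text first, then the TextOptions child.
        var text = new XElement(Ns + "Text",
            new XAttribute("x", "160"),
            new XAttribute("y", "28"),
            new XAttribute("structureTag", "H1"),
            heading,
            new XElement(Ns + "TextOptions",
                new XAttribute("fontRef", "title-font"),
                new XAttribute("fontSize", "14")));

        var pages = new XElement(Ns + "Pages",
            new XElement(Ns + "Page",
                new XAttribute("size", "Letter"),
                new XElement(Ns + "Content", text)));

        return new XDocument(
            new XElement(Ns + "Document",
                new XAttribute("coordinateOrigin", "topLeft"),
                new XAttribute("accessible", "true"),
                settings, fonts, pages));
    }
}
```

The string from MinimalDocument.Build("Demo", "Hello").ToString() can then be passed to PdfXmlRenderer.Render as shown earlier. Prefixing every name with Ns matters: an element created from a bare string such as "Page" lands in the empty namespace and is serialized with xmlns="", so it no longer belongs to the ObviousPDF namespace.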