A controversial proposal is gaining traction in tech circles: reformatting all business documents into a machine-friendly language, dubbed 'DocLang', to make them more palatable to artificial intelligence systems. The idea, which has been floated by a coalition of AI developers and data architects, suggests that converting PDFs, Word files, and spreadsheets into a standardised, AI-readable format could drastically reduce the time and cost of training large language models (LLMs) on proprietary data. For UK businesses, this could mean a fundamental shift in how they manage information.
The proposal envisions a universal markup language that embeds metadata, context, and structure directly into documents, allowing AI to parse them without complex preprocessing. Currently, firms spend millions on data cleaning and extraction before feeding documents into AI tools. 'DocLang would be like giving AI a map instead of a jigsaw puzzle,' said Dr. Eleanor Marsh, a data scientist at the University of Cambridge. 'But the transition cost for small and medium enterprises could be prohibitive.'
For UK consumers, the implications are mixed. On one hand, AI systems trained on well-structured data could provide faster, more accurate services — from automated tax returns to personalised healthcare advice. On the other, privacy advocates warn that standardised formats could make personal data easier to scrape and analyse, raising concerns under the UK's Data Protection Act 2018. The Information Commissioner's Office (ICO) has yet to issue formal guidance, but a spokesperson told UKPulse Media: 'Any new data standard must respect individuals' rights and ensure transparency.'
The regulatory landscape is further complicated by the EU AI Act, which imposes strict transparency and risk management requirements on AI systems. If DocLang becomes a de facto standard in Europe, UK firms trading with the bloc may need to adopt it, even if the UK pursues a lighter-touch approach. 'This could create a two-speed data economy,' warned Mark Henderson, a technology policy analyst at the Institute for Public Policy Research. 'Businesses that adopt early may gain a competitive edge, but those that lag could face trade barriers.'
Opportunities abound for sectors like legal, finance, and healthcare, where document-heavy workflows dominate. Law firms could automate contract reviews; banks could speed up loan approvals; the NHS could streamline patient records. However, the upfront cost of reformatting decades of archives is daunting. 'We're talking about hundreds of billions of pages across the UK economy,' said Sarah Whitmore, CEO of London-based data consultancy DataBridge. 'The question is whether the long-term savings justify the short-term pain.'
As the debate heats up, the UK government is expected to consult on the proposal later this year. The Department for Science, Innovation and Technology has not confirmed a timeline, but insiders suggest a pilot programme with public sector bodies could be announced. For now, UK businesses and consumers alike are left to weigh the promise of AI-readiness against the reality of legacy systems and regulatory uncertainty.