Skip to main content
  1. Home

Investigating data issues

To help identify the root cause of a data issue, a utiliy function is provided, stream_read_xbrl_debug.

To use it, pass it 4 values from the single row of data in question:

  • zip_url
  • run_code
  • company_id
  • date

The function then:

  • If necessary, downloads the source ZIP file from Companies House, and stores in a local cache.
  • Finds the maching file of original member XML or HTML file that matches the 4 values.
  • Prints out this original data from which this data was derived.

For example, save the below Python code to a file debug.py.

from stream_read_xbrl import stream_read_xbrl_debug

stream_read_xbrl_debug(
    zip_url='https://download.companieshouse.gov.uk/archive/Accounts_Monthly_Data-February2019.zip',
    run_code='Prod224_0063',
    company_id='00024001',
    date=datetime.date(2018, 6, 30),
)

And run it using

python debug.py

This should print the untransformed source data to the console.

To save the source data to a file the following command can be used:

python debug.py > out.html

This allows you to use another program to view the source data. For example a web browser can be used to investigate .html files.