Data Usage Guide

Overview

All data downloads are fixed-length text files, in which each column of data occupies a specific number of characters. The start position and length of each column are provided in the file definitions.
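As a rough illustration of how a start position and length translate into a column value, here is a minimal Python sketch. The column names, positions, and lengths in `LAYOUT` are hypothetical and are not taken from the real file definitions. The sketch also assumes start positions are counted from 1; check the file definitions for the actual convention.

```python
# Hypothetical layout: (column name, start position, length).
# These values are placeholders, NOT the real file definitions.
# Start positions are assumed to be 1-based.
LAYOUT = [
    ("document_number", 1, 12),
    ("entity_name", 13, 20),
    ("status", 33, 1),
]


def parse_row(line, layout=LAYOUT):
    """Slice one fixed-length row into a dict of raw (untrimmed) values."""
    row = {}
    for name, start, length in layout:
        begin = start - 1  # convert the 1-based start to a 0-based index
        row[name] = line[begin:begin + length]
    return row


# A made-up 33-character row that matches the hypothetical layout.
sample = "P12345678901" + "ACME WIDGETS INC".ljust(20) + "A"
record = parse_row(sample)
print(record)
```

Note that the values come back padded with spaces to the full column width, so `entity_name` here still ends with four spaces.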

Required Software

To open the data files, you will need a computer with suitable software installed. Data downloads are not intended to be opened on a mobile device.

The Division of Corporations does not endorse specific software. Any software named is an example only. You are responsible for identifying and evaluating software that meets your specific requirements. The Division cannot help with this process or answer software-specific questions.

The type of software you use to open the file depends on what you are trying to do.

If you want to… → Open files in…

View the data and/or do basic keyword searches

  • A standard text editor
  • A specialized text editor

Do more complex searches (such as a wildcard search) or get additional features

  • A specialized text editor
  • A spreadsheet program

Do basic data manipulation (sort, filter, etc.)

  • A spreadsheet program
  • A database

Do complex data manipulation (extraction, lookups, summaries, merge two files, etc.)

  • A database
  • A custom program

Software Type Definitions

A standard text editor is usually included in your operating system. It will open text files but may not have many additional features. Notepad is the default for Windows and TextEdit is the default for macOS.

A specialized text editor is a text editor with specific features such as line numbers, wildcard searching, or the ability to open large files. You typically need to find and install an editor that has the features you are looking for. A popular one for Windows is Notepad++.

A spreadsheet program is designed for working with rows and columns of data. Examples include Microsoft Excel, Apple Numbers, and Google Sheets.

A database is a general term for software designed to host related data and allow it to be searched (queried). Although most database software is aimed at technical users with server experience, Microsoft Access is a common database program for personal computers.

A custom program is something purpose built to meet your needs. A custom program is not required to use our data, but it is an option for users with programming expertise or staff.

Working With Large Files

Files that are 1GB or larger may not open at all in a standard text editor or spreadsheet program. The quarterly corporate data files fall into this category.

To open large files in a text editor, you may need a text editor specifically designed to handle large files. You can find one by doing an internet search for a “program to open large text files”.

Even if you have a program that can open large files, opening one can still take significant time and computer resources.

Make sure your computer has sufficient memory and processing power when working with large files, or the software may respond slowly or stop responding. The exact amounts needed vary. You can use the recommended system requirements for your chosen software as a guide.
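For users writing a custom program, one common way to limit memory use is to read a large file one row at a time instead of loading it all at once. The Python sketch below shows that approach. The sample rows, the `utf-8` encoding, and the function names are assumptions for illustration only.

```python
def count_rows(lines, keyword):
    """Count all rows, and the rows containing keyword, one row at a time."""
    total = 0
    matches = 0
    for line in lines:
        total += 1
        if keyword in line:
            matches += 1
    return total, matches


def count_rows_in_file(path, keyword, encoding="utf-8"):
    # Iterating over an open file yields one row at a time, so the whole
    # file is never held in memory. The encoding is an assumption;
    # errors="replace" keeps the scan going past an unexpected byte.
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        return count_rows(handle, keyword)


# Made-up rows standing in for a real data file.
sample_rows = [
    "P00000000001ACME WIDGETS\n",
    "P00000000002BETA SUPPLY\n",
    "P00000000003ACME TOOLS\n",
]
total, matches = count_rows(sample_rows, "ACME")
print(total, matches)
```

With a downloaded file, the call would look like `count_rows_in_file("your_file.txt", "ACME")`, where the file name is a placeholder.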

Importing Data Files

To get the data into your chosen spreadsheet program or database, you will need to import it. This process converts the data from a fixed-width text file into the format of your chosen software.

The best way to find instructions to import into your chosen software is to do an internet search for the phrase “import fixed width text files into <Program Name and version>”.

  • For example: import fixed width text files into Excel Online
  • You may have to try several results to find one that matches, but this method should work for almost any software.
  • It also has the advantage of giving video options which may be easier to follow than a plain text explanation.
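If your software imports comma-separated (CSV) files more easily than fixed-width files, a custom program can do the conversion first. The following is a minimal Python sketch of that idea using only the standard library. The layout is hypothetical and must be replaced with values from the file definitions. Since the data files have no headers, the sketch writes the column names as the first CSV row.

```python
import csv
import io

# Hypothetical layout: (column name, 1-based start position, length).
# Replace with the values from the real file definitions.
LAYOUT = [
    ("document_number", 1, 12),
    ("entity_name", 13, 20),
]


def fixed_width_to_csv(lines, out, layout=LAYOUT):
    """Write fixed-width rows to out as CSV with a header row.

    Returns the number of data rows written.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([name for name, _, _ in layout])
    count = 0
    for line in lines:
        line = line.rstrip("\r\n")  # drop the line ending only
        if not line:
            continue  # skip blank lines
        writer.writerow(
            [line[start - 1:start - 1 + length].strip()
             for _, start, length in layout]
        )
        count += 1
    return count


# Made-up rows that match the hypothetical layout.
rows = [
    "P12345678901" + "ACME WIDGETS INC".ljust(20) + "\n",
    "P12345678902" + "BETA SUPPLY, LLC".ljust(20) + "\n",
]
buffer = io.StringIO()
written = fixed_width_to_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The `csv` module quotes values that contain commas, which is why the second name is wrapped in double quotes in the output. When writing to a real file rather than an in-memory buffer, open it with `newline=""`, as the Python `csv` documentation recommends.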

Data Tips

  • Our data files do not include headers. To know the column names, you need to use the file definitions.

  • The document number (also called record number, registration number, or entity ID) is the unique identifier for a record.
    • Each data row will contain a document number that is 6 or 12 characters long.
    • Records in different files that have the same document number can be considered the same record.
    • Some files may contain multiple rows for the same document number.

  • The file definitions also detail where the columns start and end. This is important for the import process because it is not always easy to tell where a column begins or ends just by looking at the data.

  • Defining the columns can be a time-consuming process, especially when you are importing multiple files. To limit this, most tools will let you define only the columns you plan to use and set the others to be skipped or ignored.

  • Fixed-width data includes a character in every position. Be sure to trim whitespace (remove spaces from the beginning and end) of any data you are trying to compare or sort to avoid unexpected results.
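Several of the tips above can be combined in a custom program: read only the columns you need, trim the padding, and collect rows that share a document number. The Python sketch below is one possible way to do this. The column positions are hypothetical, and it assumes a 6-character document number is padded with spaces to fill a 12-character column; confirm both against the file definitions.

```python
from collections import defaultdict

# Only the columns we plan to use: (name, 1-based start, length).
# Hypothetical values; the unused entity name column (13-32) is skipped.
WANTED = [
    ("document_number", 1, 12),
    ("status", 33, 1),
]


def extract(line, wanted=WANTED):
    """Return the wanted columns with surrounding whitespace trimmed."""
    return {
        name: line[start - 1:start - 1 + length].strip()
        for name, start, length in wanted
    }


def group_by_document_number(lines):
    """Collect rows under their trimmed document number."""
    groups = defaultdict(list)
    for line in lines:
        row = extract(line.rstrip("\r\n"))
        groups[row["document_number"]].append(row)
    return groups


# Made-up rows: two share a 12-character document number, and one has
# a 6-character number padded to the assumed column width.
rows = [
    "P12345678901" + "ACME WIDGETS INC".ljust(20) + "A",
    "P12345678901" + "ACME WIDGETS INC".ljust(20) + "B",
    "123456".ljust(12) + "BETA SUPPLY".ljust(20) + "A",
]
groups = group_by_document_number(rows)

raw_value = rows[2][0:12]            # still padded with spaces
trimmed_value = raw_value.strip()    # padding removed
print(raw_value == "123456", trimmed_value == "123456")
```

The final line shows why trimming matters: the padded value does not compare equal to `"123456"`, while the trimmed value does.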

Troubleshooting

We provide our data as-is. That means a file may occasionally include one or more rows that don’t line up properly. This is usually caused by a special character (such as a curly quote from Word) or a line break within the data.

  • If you are writing a custom program, you should validate that each row is the expected length (available in the file definitions).

  • For other uses, you may need to adjust your import process or “clean” the file before importing to get the desired results.

  • If you report specific records to us, we can correct them in the source. This will fix the record in any future data files, but it will not correct the already created file you are working with.
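The row-length check suggested above for custom programs could look something like the Python sketch below. The expected length of 33 is a placeholder; use the row length given in the file definitions for the file you are working with.

```python
# Placeholder value; take the real row length from the file definitions.
EXPECTED_LENGTH = 33


def find_bad_rows(lines, expected_length=EXPECTED_LENGTH):
    """Return (line number, actual length) for rows of unexpected length."""
    bad = []
    for number, line in enumerate(lines, start=1):
        actual = len(line.rstrip("\r\n"))  # ignore the line ending
        if actual != expected_length:
            bad.append((number, actual))
    return bad


# Made-up rows: one good row, one row split in two by a stray line
# break, and one row with an extra character.
good = "P12345678901" + "ACME WIDGETS INC".ljust(20) + "A"
rows = [good + "\n", good[:10] + "\n", good[10:] + "\n", good + "X\n"]
bad_rows = find_bad_rows(rows)
print(bad_rows)
```

A row split by a stray line break typically shows up as two consecutive short rows whose lengths add up to the expected length, as lines 2 and 3 do here. The reported line numbers can be used to locate the rows in a text editor or to report the affected records.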

Our office does not have the technical expertise to assist with manipulation of the provided data. If you need more help than is offered in this guide, you will need to contact someone with experience importing and manipulating large datasets or do an internet search for your specific issue.