Free Money: How to easily text-mine financial reports.-ChillznDay

Hey guys,

This post is the first in a series on methodologies that you can easily apply in the market. I will explain every aspect of the approach and how you can use it within your portfolio to look for Alpha. Today we are looking at text-mining and how it can be applied to your portfolio. I will demystify this technique, which firms use to read through financial reports at near-instant speed.

Definition of terms.

Before we begin, we have to define what we are talking about so that everyone is on the same page. If you see a word or term you already know, feel free to skip ahead.


The first term, methodology. The Oxford English dictionary defines methodology as “a system of methods used in a particular area of study or activity.”

For the purposes of this demonstration we can think of the term “methodology” as the way in which we will look at the market, or in this case financial reports.


Text-Mining can be defined loosely as “using a computer program to look through a corpus of text and demonstrate underlying connections between words and other data points.”

For the purposes of this post, think of text-mining as using a computer program to look through multiple financial reports near instantly and surface the links between them.
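To make that definition concrete, here is a minimal sketch of the simplest kind of text-mining: counting which words show up most often. The stopword list, the `top_terms` helper, and the sample sentence are all made up for illustration; real tools use much longer stopword lists and smarter tokenizing.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "our", "we",
             "is", "are", "for", "on", "that"}

def top_terms(text, n=5):
    """Return the n most frequent non-stopword terms in text."""
    words = re.findall(r"[a-z][a-z'-]*", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Made-up sample text standing in for a passage of a financial report.
sample = (
    "Our services revenue grew. Services margins improved, "
    "and supply constraints eased. Supply chain risk remains."
)
print(top_terms(sample, 2))
```

Even on three short sentences the repeated themes ("services" and "supply") float to the top, which is the same idea a text-mining tool applies to a whole report.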

Financial Reports.

For the purposes of this post we will be looking at U.S. financial reports. A financial report can be defined as “a document that outlines, defines, and makes public key financial data.”

Think of a financial report as a big book that a publicly traded company publishes online. The company files this corporate ‘book’ with the U.S. Securities and Exchange Commission (SEC), which makes it publicly available through its EDGAR system so that investors can make informed decisions.


The 10-K form is an annual filing that U.S. publicly traded companies must submit to the SEC. A 10-K outlines what the company did over the previous fiscal year and its intentions going forward. We will be using this form for this post because it is generally the easiest to understand and access. Once you understand how to look at a 10-K, the same methodology can be applied to any other form.

Starting the Text-Mining.

Getting AAPL’s 10-K financial document.

First, we are going to pull a financial report for a U.S. publicly traded company.

To get the 10-K financial document you need to go to the Securities and Exchange Commission site.

In this case we will look at Apple Inc. first. To access Apple’s 10-K we must go to the website and navigate to Apple’s filings. (hint: Look for EDGAR’s Full Text Search)

As we can see, you can enter a company’s stock ticker in the search field. Go ahead and enter Apple Inc.’s stock ticker, AAPL, to pull up Apple’s financial reports.

On the left side of the screen you will see the form dropdown. Click that.

Next, navigate to the 10-K form in the dropdown and click it. After you have done that, find the latest non-EX 10-K entry. (EX entries are exhibits attached to a filing, and amended filings are marked 10-K/A; neither will help you for now.) A screen should pop up; click the open document tab.
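If you would rather script this selection step than click through it, here is a rough sketch of the same logic: keep only original 10-K filings, skip amendments, and take the newest one. It assumes the filing list looks like the parallel lists under "filings" and "recent" in EDGAR's submissions JSON, and that documents live under the www.sec.gov/Archives/edgar/data path; treat both as assumptions to verify against the SEC's own documentation. The `latest_10k_url` helper and the sample rows are hypothetical placeholders, not real Apple filings.

```python
def latest_10k_url(cik, recent):
    """Return the URL of the newest original 10-K, or None if there is none.

    `recent` mimics the parallel lists assumed to sit under
    filings -> recent in EDGAR's submissions JSON.
    """
    rows = zip(recent["form"], recent["filingDate"],
               recent["accessionNumber"], recent["primaryDocument"])
    # Keep only original annual reports; "10-K/A" amendments are skipped.
    annual = [r for r in rows if r[0] == "10-K"]
    if not annual:
        return None
    # ISO dates (YYYY-MM-DD) sort correctly as plain text.
    _, _, accession, document = max(annual, key=lambda r: r[1])
    folder = accession.replace("-", "")
    return f"https://www.sec.gov/Archives/edgar/data/{int(cik)}/{folder}/{document}"

# Made-up placeholder rows, NOT real filings.
sample_recent = {
    "form": ["8-K", "10-K/A", "10-K", "10-Q", "10-K"],
    "filingDate": ["2024-02-01", "2023-12-15", "2023-11-03",
                   "2023-08-04", "2022-10-28"],
    "accessionNumber": ["0000000000-24-000005", "0000000000-23-000004",
                        "0000000000-23-000003", "0000000000-23-000002",
                        "0000000000-22-000001"],
    "primaryDocument": ["a.htm", "b.htm", "annual-2023.htm",
                        "c.htm", "annual-2022.htm"],
}
# 320193 is Apple's CIK (its company identifier on EDGAR).
print(latest_10k_url("0000320193", sample_recent))
```

Note that the amended 10-K/A from December is ignored even though it is newer, which mirrors the manual advice to skip anything that is not the original 10-K.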

Once you click the open document tab you will be greeted with a very large document. Copy the URL for this document.
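That very large document is an HTML page, so if you ever want to mine it yourself rather than hand the URL to a tool, the first step is to strip the markup down to plain text. Here is a minimal sketch using only Python's standard library; the `TextExtractor` class, the `html_to_text` helper, and the sample HTML are made up for illustration, and a real 10-K will need more cleanup than this.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside script/style tags

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Ignore whitespace-only chunks and anything inside script/style.
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

# Made-up snippet standing in for a slice of a filing.
sample_html = ("<html><style>p{color:red}</style><body>"
               "<p>Net sales <b>increased</b> 2%.</p></body></html>")
print(html_to_text(sample_html))
```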

Now that we have the $AAPL 10-K form, let’s begin the process of text-mining. To do this we will need a text-mining tool.

Using Voyant-Tools to mine through your text.

For this I recommend the free text-mining program Voyant-Tools.

Voyant-Tools is a web-based text analysis tool that surfaces word frequencies and links between terms within a corpus of texts. You can put in several financial documents and see ‘who is talking to whom’ before the analysts on the street have had time to decipher them, which can give you a trading advantage, or Alpha.

Go ahead and paste that link for $AAPL’s 10-K into the “Add Texts” box.

Once you click Reveal, you will see a large screen with several boxes that won’t make much sense initially. Play around with these boxes and you will begin to understand how this can help you parse through a 10-K faster than a normal analyst.

The best news is that you can plug multiple 10-Ks into the corpus text box. For example, you can add the 10-K of an $AAPL competitor, such as $MSFT, and see what links are present across both 10-Ks.
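To get a feel for what comparing two filings means under the hood, here is a minimal sketch that splits two texts into shared terms and terms unique to each. This is not how Voyant-Tools itself works internally; it is just one simple way to express the idea. The `compare` helper and both sample sentences are made up and are not quotes from any real 10-K.

```python
import re

def terms(text):
    """Return the set of lowercase words with four or more letters."""
    return set(re.findall(r"[a-z]{4,}", text.lower()))

def compare(doc_a, doc_b):
    """Return (shared, only_in_a, only_in_b) as sorted lists of terms."""
    a, b = terms(doc_a), terms(doc_b)
    return sorted(a & b), sorted(a - b), sorted(b - a)

# Made-up stand-ins for passages from two competitors' filings.
aapl_like = "Cloud services and wearables drove growth; supply risk persists."
msft_like = "Cloud services and gaming drove growth; licensing risk persists."

shared, only_a, only_b = compare(aapl_like, msft_like)
print("shared:", shared)
print("only first:", only_a)
print("only second:", only_b)
```

The shared list tells you what both companies are talking about, and the two "only" lists point at where their stories diverge, which is usually the more interesting part.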


This tool will allow you to go through 10-Ks and other SEC forms lightning fast. Like any tool, it’s going to take some time to get used to, but with a little practice you can learn to read these forms far more quickly.

Further, you can input an entire sector’s worth of 10-Ks and see what the sector is doing almost instantly. This alone can save you hours of calling and researching, freeing you up to plan a trade that capitalizes on evolving trends.
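One simple way to picture the sector-wide view is to count how many filings mention each term at least once. Here is a rough sketch of that idea; the `document_frequency` helper, the tickers AAA, BBB, and CCC, and their one-line "filings" are all invented for illustration.

```python
import re
from collections import Counter

def document_frequency(filings):
    """Map each term to the number of filings that mention it at least once."""
    df = Counter()
    for text in filings.values():
        # Use a set so a term counts once per filing, however often it repeats.
        df.update(set(re.findall(r"[a-z]{4,}", text.lower())))
    return df

# Invented tickers and text, standing in for a sector's worth of 10-Ks.
sector = {
    "AAA": "Tariffs and inflation pressured margins.",
    "BBB": "Inflation eased while tariffs rose.",
    "CCC": "Inflation remained a headwind for demand.",
}

df = document_frequency(sector)
# Terms that appear in every filing hint at a sector-wide theme.
widespread = sorted(t for t, n in df.items() if n == len(sector))
print(widespread)
```

Terms that show up in every filing suggest a theme the whole sector is dealing with, while terms that appear in only one filing may be company-specific.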

As usual if you like content like this you should like, comment, share, and subscribe to our newsletter! I only send it out on a weekly basis and it’s just to keep you updated on the market.

Further, you can check out some of the other articles below.

Until next time, best of luck in your trading!