Sunday, August 16, 2015

Downloading Stock Market News for Specific Symbols

Grabbing the data.

How do you grab the latest news on your favorite ticker symbol?

It all starts with the following URL.
You'll want to change "q=SPY" to whatever symbol you're interested in.
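If you'd rather swap the symbol in code than by hand, here is a minimal sketch using only Python's standard library. The URL in it is a made-up placeholder on example.com, not the real feed address, and the helper name with_symbol is my own invention; substitute the actual URL from above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def with_symbol(feed_url, symbol):
    """Return feed_url with its "q" query parameter set to symbol."""
    parts = urlsplit(feed_url)
    # Keep any other query parameters and only overwrite "q".
    params = dict(parse_qsl(parts.query))
    params["q"] = symbol
    return urlunsplit(parts._replace(query=urlencode(params)))

# Placeholder URL for illustration only; it is NOT the real feed address.
EXAMPLE_FEED_URL = "https://example.com/feed?lang=en&q=SPY"

print(with_symbol(EXAMPLE_FEED_URL, "AAPL"))
```

Going through urlencode also takes care of escaping symbols that contain characters such as "^" or spaces.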

You can add something like the following to the end if you'd like more articles returned. There is, however, a limit on the number returned.

What the returned data means.

When you download the above link, you're given an RSS feed, which is in XML. If you don't know XML, it's basically a text document with fields marked by tags that look like "<tag>info</tag>" (e.g. <title>SPY Stock News</title>, <link></link>, etc.).

The tags you'll be interested in start with <item>. Each <item> contains info on a single news article. Within the <item></item> tags, you'll find four tags that are probably of interest to you. They are:
<title> This gives you the title of the article. (e.g. <title>S&amp;P 500: Should You Buy the SPY ETF Now?</title>)
<link> This gives you the link to the article. (e.g. <link></link>)
<pubDate> This is the date the article was published online. (e.g. <pubDate>Tue, 30 Jun 2015 12:00:10 GMT</pubDate>)
<description> This is the description Google gives for the article. It is short but gives you an idea of the content without requiring you to download the entire article. (e.g. <description>SPDR S&amp;P 500 ETF Trust Sees Large Drop in Short Interest (SPY)</description>)
What you'll likely want to do is download this data and extract it somewhere you can store and analyze it.
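To make the four tags concrete, here is a small sketch that pulls them out of a hand-written sample feed. It uses the standard library's xml.etree.ElementTree rather than lxml so it runs without extra installs. The sample XML is assembled from the examples above; the link value is a placeholder I made up, and parse_items is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample feed; the <link> value is a placeholder.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>SPY Stock News</title>
    <item>
      <title>S&amp;P 500: Should You Buy the SPY ETF Now?</title>
      <link>https://example.com/article</link>
      <pubDate>Tue, 30 Jun 2015 12:00:10 GMT</pubDate>
      <description>SPDR S&amp;P 500 ETF Trust Sees Large Drop in Short Interest (SPY)</description>
    </item>
  </channel>
</rss>"""

def parse_items(rss_text):
    """Return one dict per <item>, holding the four tags of interest."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "pubDate": item.findtext("pubDate", default=""),
            "description": item.findtext("description", default=""),
        })
    return items

for article in parse_items(SAMPLE_RSS):
    print(article["pubDate"], "|", article["title"])
```

Note that the parser turns "&amp;" back into a plain "&", so the stored title reads "S&P 500: ...".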

Storing the data.

I'd store it in a database. You can also store it in flat files (.txt or .xml) if you prefer.

Create a table in your MySQL database (make sure you have MySQL installed! On Ubuntu Linux: sudo apt-get install mysql-server). This can be modified for other databases fairly easily.
CREATE TABLE stocknews (symbol VARCHAR(5), pubDate DATETIME, title TEXT, link TEXT, description TEXT);
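As an illustration of adapting this to another database, here is a rough sketch that creates the same table in SQLite, which ships with Python and needs no server. This is my own variation, not the setup used in the rest of the post; the helper names open_store and save_article are hypothetical, and the link value in the usage lines is a placeholder.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open a SQLite database and make sure the stocknews table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stocknews ("
        "symbol VARCHAR(5), pubDate DATETIME, title TEXT, "
        "link TEXT, description TEXT)"
    )
    return conn

def save_article(conn, symbol, pub_date, title, link, description):
    """Insert one article; SQLite uses ? placeholders instead of %s."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO stocknews (symbol, pubDate, title, link, description) "
            "VALUES (?, ?, ?, ?, ?)",
            (symbol, pub_date, title, link, description),
        )

conn = open_store()  # in-memory database, handy for experimenting
save_article(conn, "SPY", "2015-06-30 12:00:10",
             "S&P 500: Should You Buy the SPY ETF Now?",
             "https://example.com/article",  # placeholder link
             "SPDR S&P 500 ETF Trust Sees Large Drop in Short Interest (SPY)")
print(conn.execute("SELECT symbol, pubDate, title FROM stocknews").fetchall())
```

Pass a file name such as "stocknews.db" to open_store if you want the data to persist between runs.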

The Code.

Next we want to extract and insert. We'll use Python.

First, if you don't have the "pymysql" module installed you'll need to install it by typing: pip install pymysql. The code also uses "lxml" for XML parsing (run: pip install lxml). I also really like "timestring" for date/time parsing (run: pip install timestring).
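If you'd prefer not to install an extra module for dates, here is a sketch of one alternative: the standard library can parse the pubDate format directly. This swaps in email.utils.parsedate_to_datetime in place of timestring, and the helper name pubdate_to_mysql is my own. It keeps the clock time exactly as the feed gives it and does no timezone conversion.

```python
from email.utils import parsedate_to_datetime

def pubdate_to_mysql(pub_date_text):
    """Convert an RSS pubDate string to 'YYYY-MM-DD HH:MM:SS' for a DATETIME column."""
    parsed = parsedate_to_datetime(pub_date_text)
    # Format the time as given by the feed; no timezone conversion is done.
    return parsed.strftime("%Y-%m-%d %H:%M:%S")

print(pubdate_to_mysql("Tue, 30 Jun 2015 12:00:10 GMT"))
# prints: 2015-06-30 12:00:10
```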

Now for the lovely code :) It's easier to grab from the "godelsmarket" GitHub repo.

import urllib.request
from lxml import etree
import pymysql
import timestring

#connect to the database
#(user, password and database below are placeholders; use your own settings)
connection = pymysql.connect(host='localhost',
                             user='your_user',
                             password='your_password',
                             database='your_database')

#stock symbol you want to download news for
symbol = "SPY"

#this is the url where we grab the data
#(paste the feed URL from above here, up to and including "q=")
url_stub = ""

#use urllib.request to download the data
response = urllib.request.urlopen(url_stub + symbol)
xml = response.read()

#turn into an xml doc
doc = etree.fromstring(xml)
#we're only interested in tags under <item>
item_tags = doc.xpath('//channel/item')
for item in item_tags:
    #split up by the four tags
    date_tag = item.xpath('pubDate')
    title_tag = item.xpath('title')
    link_tag = item.xpath('link')
    description_tag = item.xpath('description')

    date_text = date_tag[0].text
    title_text = title_tag[0].text
    link_text = link_tag[0].text
    description_text = description_tag[0].text

    print('date:', date_text)
    print('title:', title_text)
    print('link:', link_text)
    print('description:', description_text)

    #insert into the database
    with connection.cursor() as cursor:
        sql = "INSERT INTO `stocknews` (`symbol`, `pubDate`, `title`, `link`, `description`) VALUES (%s, %s, %s, %s, %s)"
        cursor.execute(sql, (symbol, str(timestring.Date(date_text)), title_text, link_text, description_text))

#save the inserted rows (pymysql does not autocommit by default)
connection.commit()
connection.close()

As always, if you have any questions feel free to comment! Hope you enjoyed!
