How StockTwits Uses Machine Learning

A fascinating behind-the-scenes interview with StockTwits Senior Data
Scientist Garrett Hoffman.


He shares great tidbits on how StockTwits uses machine learning for
sentiment analysis. I’ve summarized the highlights below:

  • Idea generation is a huge barrier for active trading
  • Next gen of traders uses social media to make decisions
  • Garrett solves data problems and builds features for the StockTwits
    platform
  • This includes: production data science, product analytics, and
    insights research
  • Understanding social dynamics makes for a better user experience
  • Focus is to understand social dynamics of StockTwits (ST) community
  • Focuses on what’s happening inside the ST community
  • ST’s market sentiment model helps users with decision making
  • Users ‘tag’ content for bullish or bearish classes
  • Only 20 to 30% of content is tagged
  • Using ST’s market sentiment model increases coverage to 100%
  • For Data Science work, Python Stack is used
  • Use: Numpy, SciPy, Pandas, Scikit-Learn
  • Jupyter Notebooks for research and prototyping
  • Flask for API deployment
  • For Deep Learning, uses TensorFlow with AWS EC2 instances
  • Can spin up GPUs as needed
  • Deep Learning methods used are Recurrent Neural Nets, Word2Vec, and
    Autoencoders
  • Stays abreast of new machine learning techniques from blogs,
    conferences and Twitter
  • Follows Twitter accounts from Google, Spotify, Apple, and small tech
    companies
  • One area ST wants to improve on is DevOps around Data Science
  • Bridge the gap between research/prototype phase and embedding it into
    tech stack for deployment
  • Misconception that complex solutions are best
  • Complexity ONLY ok if it leads to deeper insight
  • Simple solutions are best
  • Future long-term ideas: use AI around natural language

Auto Support Resistance Lines in Forex

On the heels of my last post, I’ve extended those functions to the EURUSD pair. The data starts from this year (2019) and goes through to yesterday. It’s actually a pretty neat script: it takes data from Oanda and then generates the support and resistance lines for that particular pair. The next step would be to create a buy/sell order in the Oanda Practice Account. Once I do that, it’s a matter of writing a trading strategy and testing it in real time.

Everything I’m doing is completely academic and modular right now. I have no idea how to really build a Forex Trading Bot or even what strategies to use here. It’s more of a “can I do it” endeavor.

I’m fully convinced that retail traders who can learn Python can automate their entire trading strategies. The flip side is whether their strategies are worth anything. Even though we can automate trading setups, we must always ask “why do I think I’m right?”

Update

This method has been pretty successful for me in automatically eyeballing support and resistance lines for various currency pairs. I built on top of Hootnot’s great Python API integration and the work of some generous chap who created the basic function for detecting maximums and minimums in a time series. I found that code on the Internet somewhere and adapted it to what I needed.
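The function I adapted isn’t reproduced here, but the general idea of finding local maximums and minimums can be sketched with pandas alone. This is a minimal illustration, not the original code: the find_levels name, the order parameter, and the synthetic sine-wave prices are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

def find_levels(close, order=5):
    """Return (support, resistance) points from local minima and maxima.

    A point counts as a local extreme when it is the lowest or highest
    value within `order` bars on each side. `order` is a hypothetical
    tuning knob for this sketch.
    """
    window = 2 * order + 1
    roll_max = close.rolling(window, center=True).max()
    roll_min = close.rolling(window, center=True).min()
    resistance = close[close == roll_max]   # local maximums
    support = close[close == roll_min]      # local minimums
    return support, resistance

# Synthetic, made-up closing prices: a sine wave around 1.10
t = np.linspace(0, 4 * np.pi, 200)
closes = pd.Series(1.10 + 0.01 * np.sin(t))

support, resistance = find_levels(closes)
print("support levels:", support.round(4).tolist())
print("resistance levels:", resistance.round(4).tolist())
```

Each returned level can then be drawn as a horizontal line on the chart. A larger order gives fewer, more significant levels.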

The first incarnation of this charting system used Matplotlib but I soon ported this over to Bokeh and later Plotly. Feel free to take a look at my Github and ask me any questions that you might have!

H2O.ai Empowers MarketAxess to Innovate and Inform Trading Strategies – Bloomberg

“H2O is an integral part of Composite+ and provides some of the fundamental machine learning tools and support that make our algorithms run as well as they do,” said David Krein, Global Head of Research at MarketAxess. “The Composite+ pricing engine is helping fulfill our clients’ critical liquidity needs with more accurate and timely pricing data, which we make available within the MarketAxess electronic trading workflow. H2O.ai has been a great partner which has contributed to our recent success.”


Python Forex Trading Bot

I have had some time to continue on my Python Forex Trading Bot (code borrowed from here and tweaked by me) now that we’re all self-isolating. This is purely for educational purposes because when I run this sucker, it loses money. Not so much anymore but it’s not profitable. The reason why I say ‘educational purposes’ is that coding is not my first choice of career and I teach myself as I go along. Coding’s been very profitable in other parts of my life and I use it to get $hit done.

I now understand the concept of Classes, which is great because it makes pieces of code very ‘pluggable.’ Originally I thought I could write a set of functions in the MomentumTrader class that would serve as my Stop and Trailing Stop orders. I did do that, only to find out that I was creating those orders AFTER the trade, and when the Bot would try to close out or add to the position (as it does because it’s a mean reversion strategy) it would sometimes crash. This led me to find a set of classes in the API called onFill. This eliminated the need for me to create the order first and THEN add in a stop or trailing stop. I was able to do it once the trade was filled. The moral of the story: you should really understand your API classes.

Overall the API extended by Feite is quite robust and powerful, but it’s still very hard to make any money with this thing. Although I’ve been whining about getting active again, the reality is that the long term wins.

I’ll continue to test this over the course of the next few weeks using an Oanda Practice Account but I think I’m going to write a new class that best mimics my current Forex trading style instead. I use the Daily chart, trade pairs where I make $ from the carry trade, and do a long term trend play. That’s the beauty of Forex, you can see some great long trends if you zoom out.

My discretionary trading system does have some flaws. I usually get the entry wrong and have to place a second trade to ‘scale in.’ It’s something I don’t like doing because it means more risk. I also need to work on proper risk management as well. Right now I don’t use stops and I routinely take on 200 pip swings. This has worked out for me because 99% of the time I trade the EURUSD pair, which has been in a long downward trend. I usually make a short entry, then the price turns against me and goes higher, then I place another short entry where the price stabilizes. I think I’ve been very lucky until now and my trading metrics and expectancy are positive. Still, I feel like I leave a lot to chance and I’d like to size my position accordingly, make better entries, and use better risk management.
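To make the position sizing idea concrete, here is a minimal sketch of fixed-fractional sizing. It is not part of my bot. The function name, the 1% risk figure, and the account size are hypothetical, and it assumes a USD account trading a pair quoted in USD (like EURUSD), so one pip on one unit is worth 0.0001 dollars.

```python
import math

def position_size(equity, risk_fraction, stop_pips, pip_size=0.0001):
    """Units to trade so that a stop-out loses about equity * risk_fraction.

    Assumes a USD account and a USD-quoted pair, so the loss per unit is
    simply stop_pips * pip_size dollars. All inputs are example values.
    """
    risk_amount = equity * risk_fraction      # money at risk on this trade
    loss_per_unit = stop_pips * pip_size      # dollars lost per unit if stopped out
    # Round down so the position never risks more than intended
    return math.floor(risk_amount / loss_per_unit + 1e-9)

# Hypothetical account: $10,000 equity, 1% risk, 200 pip stop
units = position_size(equity=10_000, risk_fraction=0.01, stop_pips=200)
print(units)
```

The point of the exercise: a wide 200 pip swing is survivable only if the position is sized small enough up front.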

Current Python Forex Trading Bot

So here’s the latest incarnation of the Bot. I spent some time cleaning it up and adding in a trailing stop on-fill function. Just copy all the code into a single Python file (some_name.py) and create a subfolder called ‘oanda_account’ (the path the code reads from). In that folder you will need to create account.txt and token.txt. Those two files hold your account number and your dev token from Oanda.

Note: the import statements below pull in the standard Python libraries and Feite’s Oanda API, which are needed to run the Bot.

#Install Py Package from: https://github.com/hootnot/oanda-api-v20
#https://oanda-api-v20.readthedocs.io/en/latest/oanda-api-v20.html

import json
import oandapyV20 as opy
import oandapyV20.endpoints.instruments as instruments
from oandapyV20.contrib.factories import InstrumentsCandlesFactory

import pandas as pd

from oandapyV20.exceptions import V20Error, StreamTerminated
from oandapyV20.endpoints.transactions import TransactionsStream
from oandapyV20.endpoints.pricing import PricingStream
from oandapyV20.contrib.requests import TrailingStopLossOrderRequest

import datetime
from dateutil import parser

import numpy as np

def exampleAuth():
    accountID, token = None, None
    with open("./oanda_account/account.txt") as I:
        accountID = I.read().strip()
    with open("./oanda_account/token.txt") as I:
        token = I.read().strip()
    return accountID, token

instrument = "EUR_USD"

#Set time functions to offset chart
today = datetime.datetime.today()
two_years_ago = today - datetime.timedelta(days=720)

# Oanda timestamps are UTC (the trailing 'Z'), so build the window in UTC
current_time = datetime.datetime.now(datetime.timezone.utc)

# Look back 12 hours from now
twelve_hours_ago = current_time - datetime.timedelta(hours=12)
print (current_time)
print (twelve_hours_ago)


#Create time parameter for Oanda call
ct = current_time.strftime("%Y-%m-%dT%H:%M:%SZ")
tf = twelve_hours_ago.strftime("%Y-%m-%dT%H:%M:%SZ")

#Connect to tokens
accountID, access_token = exampleAuth()
client = opy.API(access_token=access_token)


params={"from": tf,
        "to": ct,
        "granularity":'M1',
        "price":'A'}
r = instruments.InstrumentsCandles(instrument=instrument,params=params)
#Do not use client from above
data = client.request(r)
results= [{"time":x['time'],"closeAsk":float(x['ask']['c'])} for x in data['candles']]
df = pd.DataFrame(results).set_index('time')

df.index = pd.DatetimeIndex(df.index)

from oandapyV20.endpoints.pricing import PricingStream
import oandapyV20.endpoints.orders as orders
from oandapyV20.contrib.requests import MarketOrderRequest, TrailingStopLossDetails, TakeProfitDetails
from oandapyV20.exceptions import V20Error, StreamTerminated
import oandapyV20.endpoints.trades as trades

class MomentumTrader(PricingStream):
    def __init__(self, momentum, *args, **kwargs):
        PricingStream.__init__(self, *args, **kwargs)
        self.ticks = 0
        self.position = 0
        self.df = pd.DataFrame()
        self.momentum = momentum
        self.units = 1000
        self.connected = False
        self.client = opy.API(access_token=access_token)

    def create_order(self, units):
        #You can write a custom distance value here, so distance = some calculation

        trailingStopLossOnFill = TrailingStopLossDetails(distance=0.0005)

        order = orders.OrderCreate(accountID=accountID,
                                   data=MarketOrderRequest(instrument=instrument,
                                                           units=units,
                                                           trailingStopLossOnFill=trailingStopLossOnFill.data).data)
        response = self.client.request(order)
        print('\t', response)

    def on_success(self, data):
        self.ticks += 1
        print("ticks=",self.ticks)
        # print(self.ticks, end=', ')

        # appends the new tick data to the DataFrame object
        # (pd.concat is used because DataFrame.append was removed in pandas 2.0)
        self.df = pd.concat([self.df,
                             pd.DataFrame([{'time': data['time'],'closeoutAsk':data['closeoutAsk']}],
                                          index=[data["time"]])])

        #transforms the time information to a DatetimeIndex object
        self.df.index = pd.DatetimeIndex(self.df["time"])

        # Convert items back to numeric (Why, OANDA, why are you returning strings?)
        self.df['closeoutAsk'] = pd.to_numeric(self.df["closeoutAsk"],errors='ignore')

        # resamples the data set to a new, homogeneous interval, set this from '5s' to '1m'
        dfr = self.df.resample('60s').last().bfill()

        # calculates the log returns
        dfr['returns'] = np.log(dfr['closeoutAsk'] / dfr['closeoutAsk'].shift(1))

        # derives the positioning according to the momentum strategy
        dfr['position'] = np.sign(dfr['returns'].rolling(self.momentum).mean())

        print("position=",dfr['position'].iloc[-1])

        if dfr['position'].iloc[-1] == 1:
            print("go long")
            if self.position == 0:
                self.create_order(self.units)

            elif self.position == -1:
                self.create_order(self.units * 2)
            self.position = 1

        elif dfr['position'].iloc[-1] == -1:
            print("go short")
            if self.position == 0:
                self.create_order(-self.units)


            elif self.position == 1:
                self.create_order(-self.units * 2)

            self.position = -1

        if self.ticks == 25000:
            print("close out the position")
            if self.position == 1:
                self.create_order(-self.units)
            elif self.position == -1:
                self.create_order(self.units)
            self.disconnect()

    def disconnect(self):
        self.connected=False

    def rates(self, account_id, instruments, **params):
        self.connected = True
        params = params or {}
        ignore_heartbeat = None
        if "ignore_heartbeat" in params:
            ignore_heartbeat = params['ignore_heartbeat']
        while self.connected:
            response = self.client.request(self)
            for tick in response:
                if not self.connected:
                    break
                if not (ignore_heartbeat and tick["type"]=="HEARTBEAT"):
                    print(tick)
                    self.on_success(tick)


# Set momentum to be the number of previous resampled intervals (60 seconds each in on_success) to calculate against

mt = MomentumTrader(momentum=60,accountID=accountID,params={'instruments': instrument})
print (mt)

mt.rates(account_id=accountID, instruments=instrument, ignore_heartbeat=True)

Grab and Download Tick Data (Updated)

Sometimes you just want to extract tick data for your Forex trading bots. The way to do this is by simply modifying a sample script from the API examples and saving it to a JSON file for later manipulation.

Here’s how you do it:

import json
import pandas as pd
from oandapyV20 import API
from oandapyV20.exceptions import V20Error
from oandapyV20.endpoints.pricing import PricingStream

def exampleAuth():
    accountID, token = None, None
    with open("./oanda_account/account.txt") as I:
        accountID = I.read().strip()
    with open("./oanda_account/token.txt") as I:
        token = I.read().strip()
    return accountID, token

#Connect to tokens
accountID, access_token = exampleAuth()
api = API(access_token=access_token, environment="practice")

#instruments = "DE30_EUR,EUR_USD,EUR_JPY"
instruments = "EUR_USD"
s = PricingStream(accountID=accountID, params={"instruments":instruments})

# Collect each streamed message as a one-row DataFrame
rows = []

try:
    # The stream runs until interrupted; press Ctrl+C to stop and save
    for R in api.request(s):
        rows.append(pd.json_normalize(R))
except KeyboardInterrupt:
    pass

# Combine the rows once at the end (DataFrame.append was removed in pandas 2.0)
out = pd.concat(rows) if rows else pd.DataFrame()
out.to_csv('tickdata.csv')

I modified the above code to write the tick data to a pandas DataFrame instead. This way you can save this data to a CSV file for later backtesting and strategy evaluation.

Other Forex Strategies to Try

This mean reversion trading approach can work on very long or short time frames IMHO. As I wrote about, this sucker loses money but has been a great help in learning and understanding how Feite’s API works and how you can codify your ideas into plain code to (hopefully) make money.

This led me to think about other Forex Strategies I could code together and try. I did a quick Google search and came across this article on different Forex Strategies.

They list the following Forex Strategies:

  1. Carry Trading (did this)
  2. Position Trading (did this)
  3. Swing Trading (did this)
  4. Trend Trading (did this)
  5. Range Trading (Mean Reversion like / did this)
  6. Day Trading (never really did this)
  7. Scalp Trading (never did this)

While this is a lot of work, I find the scalping strategies to be of interest to me. All you have to do is look at smaller time frames (5, 10, and 15 minutes) and enter a trade when some price-volume indicator crosses above a certain level. Then when it crosses back below that level you sell.

You can of course flip to short strategies if the indicator drops below a threshold and then close out the trade when it reaches your close-out point.

How would I build that? I would create another class and name it ‘Scalper.’ I would keep the initialization, create_order, disconnect, and rates functions AS IS. I wouldn’t change them, except for the create_order trailing stop loss part. I might comment it out or adjust it to a wider/tighter value.

The trick to the strategy is in the on_success function. Here the streamed tick data comes into a Pandas DataFrame and gets resampled into a 60-second frame. From there I would need to build a Money Flow Index (MFI) indicator and then write the logic to do something like: if MFI > 50, then Buy. Sell if MFI > 70 and go Short. Then Buy when MFI < 50. Close all trades.

Something like that. I need to think about it and then of course test it in my play account. See below.
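For reference, the MFI calculation itself can be sketched as follows. This is an illustration and not code from the bot. The pricing stream carries no traded volume, so the sketch assumes the number of ticks per bar can stand in for volume. The money_flow_index name, the 14-bar period, and the random-walk tick data are made up for the example.

```python
import numpy as np
import pandas as pd

def money_flow_index(bars, period=14):
    """MFI from a DataFrame with high, low, close and volume columns."""
    typical = (bars["high"] + bars["low"] + bars["close"]) / 3
    raw_flow = typical * bars["volume"]
    change = typical.diff()
    # Split the flow by whether the typical price rose or fell; the first
    # bar has no previous bar, so it counts as zero flow on both sides
    positive = raw_flow.where(change > 0, 0.0)
    negative = raw_flow.where(change < 0, 0.0)
    ratio = positive.rolling(period).sum() / negative.rolling(period).sum()
    return 100 - 100 / (1 + ratio)

# Made-up ticks: a random walk with irregular arrival times
rng = np.random.default_rng(0)
n = 3600
times = pd.Timestamp("2020-04-01") + pd.to_timedelta(
    np.cumsum(rng.exponential(1.0, n)), unit="s")
ticks = pd.Series(1.10 + np.cumsum(rng.normal(0, 1e-5, n)), index=times)

# Resample to 60-second bars; tick count stands in for volume (assumption)
bars = ticks.resample("60s").ohlc()
bars["volume"] = ticks.resample("60s").count()

bars["MFI"] = money_flow_index(bars)
print(bars[["close", "volume", "MFI"]].tail())
```

The buy and sell thresholds would then be applied to the last value of the MFI column, the same way the position column is used in on_success.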

Price Scalper Stochastic Class

This is a work in progress and standard disclaimers of financial & trading risk apply, but this is a bastardized version of the MomentumTrader Class called the ScalpTrader Class. It’s hot off the presses here and it needs a ton of clean-up, especially fine-tuning the BUY and SELL signals. On the surface, this works pretty well so far so I’m happy about that. Right now the BUY and SELL signals are based only on the %D values: you BUY when %D is between 0 and 20, and SELL when it is between 80 and 100. I’ll run this over the next week to see if it makes any profit or not.

from oandapyV20.endpoints.pricing import PricingStream
import oandapyV20.endpoints.orders as orders
from oandapyV20.contrib.requests import MarketOrderRequest, TrailingStopLossDetails, TakeProfitDetails
from oandapyV20.exceptions import V20Error, StreamTerminated
import oandapyV20.endpoints.trades as trades

class ScalpTrader(PricingStream):
    def __init__(self, momentum, *args, **kwargs):
        PricingStream.__init__(self, *args, **kwargs)
        self.ticks = 0
        self.position = 0
        self.df = pd.DataFrame()
        self.momentum = momentum
        self.units = 1000
        self.connected = False
        self.client = opy.API(access_token=access_token)

    def create_order(self, units):
        #You can write a custom distance value here, so distance = some calculation

        trailingStopLossOnFill = TrailingStopLossDetails(distance=0.0005)

        order = orders.OrderCreate(accountID=accountID,
                                   data=MarketOrderRequest(instrument=instrument,
                                                           units=units,
                                                           trailingStopLossOnFill=trailingStopLossOnFill.data).data)
        response = self.client.request(order)
        print('\t', response)

    def on_success(self, data):
        self.ticks += 1
        print("ticks=",self.ticks)
        # print(self.ticks, end=', ')

        # appends the new tick data to the DataFrame object
        # (pd.concat is used because DataFrame.append was removed in pandas 2.0)
        self.df = pd.concat([self.df,
                             pd.DataFrame([{'time': data['time'],'closeoutAsk':data['closeoutAsk']}],
                                          index=[data["time"]])])

        #transforms the time information to a DatetimeIndex object
        self.df.index = pd.DatetimeIndex(self.df["time"])

        # Convert items back to numeric (Why, OANDA, why are you returning strings?)
        self.df['closeoutAsk'] = pd.to_numeric(self.df["closeoutAsk"],errors='ignore')

        # resamples the data set to a new, homogeneous interval, set this from '5s' to '1m'
        dfr = self.df.resample('60s').last().bfill()

        #Calculate K and D
        dfr['14-high'] = dfr['closeoutAsk'].rolling(14).max()
        dfr['14-low'] = dfr['closeoutAsk'].rolling(14).min()
        dfr['K'] = (dfr['closeoutAsk'] - dfr['14-low'])*100/(dfr['14-high'] - dfr['14-low'])
        dfr['D'] = dfr['K'].rolling(3).mean()

        # creates position column, fill all with zeros
        dfr['position'] = 0

        # derives the positioning according to the scalping strategy below
        dfr['position'] = np.where(((dfr['D'] > 0) & (dfr['D'] < 20)), 1, dfr.position)
        dfr['position'] = np.where(((dfr['D'] > 80) & (dfr['D'] < 100)), -1, dfr.position)
        print (dfr)

        print("position=",dfr['position'].iloc[-1])
        print ("%K=", dfr['K'].iloc[-1])
        print ("%D=", dfr['D'].iloc[-1])

        if dfr['position'].iloc[-1] == 1:
            print("go long")
            if self.position == 0:
                self.create_order(self.units)

            elif self.position == -1:
                self.create_order(self.units * 2)
            self.position = 1

        elif dfr['position'].iloc[-1] == -1:
            print("go short")
            if self.position == 0:
                self.create_order(-self.units)


            elif self.position == 1:
                self.create_order(-self.units * 2)

            self.position = -1

        if self.ticks == 25000:
            print("close out the position")
            if self.position == 1:
                self.create_order(-self.units)
            elif self.position == -1:
                self.create_order(self.units)
            self.disconnect()

    def disconnect(self):
        self.connected=False

    def rates(self, account_id, instruments, **params):
        self.connected = True
        params = params or {}
        ignore_heartbeat = None
        if "ignore_heartbeat" in params:
            ignore_heartbeat = params['ignore_heartbeat']
        while self.connected:
            response = self.client.request(self)
            for tick in response:
                if not self.connected:
                    break
                if not (ignore_heartbeat and tick["type"]=="HEARTBEAT"):
                    print(tick)
                    self.on_success(tick)


This seems to work ok and I lose less money with this, but it’s not profitable.

Price Scalper RSI Class

Another update, this time using an RSI indicator to make trades. None of this stuff really makes money but it’s an exercise that I’m working on. Hopefully one day I’ll get it right. Use at your own risk and there’s code clean up I need to do here.

from oandapyV20.endpoints.pricing import PricingStream
import oandapyV20.endpoints.orders as orders
from oandapyV20.contrib.requests import MarketOrderRequest, TrailingStopLossDetails, TakeProfitDetails
from oandapyV20.exceptions import V20Error, StreamTerminated
import oandapyV20.endpoints.trades as trades

class ScalpTraderRSI(PricingStream):
    def __init__(self, momentum, *args, **kwargs):
        PricingStream.__init__(self, *args, **kwargs)
        self.ticks = 0
        self.position = 0
        self.df = pd.DataFrame()
        self.momentum = momentum
        self.units = 10000
        self.connected = False
        self.client = opy.API(access_token=access_token)

    def create_order(self, units):
        #You can write a custom distance value here, so distance = some calculation

        trailingStopLossOnFill = TrailingStopLossDetails(distance=0.05)

        order = orders.OrderCreate(accountID=accountID,
                                   data=MarketOrderRequest(instrument=instrument,
                                                           units=units,
                                                           trailingStopLossOnFill=trailingStopLossOnFill.data).data)
        response = self.client.request(order)
        print('\t', response)

    def on_success(self, data):
        self.ticks += 1
        print("ticks=",self.ticks)
        # print(self.ticks, end=', ')

        # appends the new tick data to the DataFrame object
        # (pd.concat is used because DataFrame.append was removed in pandas 2.0)
        self.df = pd.concat([self.df,
                             pd.DataFrame([{'time': data['time'],'closeoutAsk':data['closeoutAsk']}],
                                          index=[data["time"]])])

        #transforms the time information to a DatetimeIndex object
        self.df.index = pd.DatetimeIndex(self.df["time"])

        # Convert items back to numeric (Why, OANDA, why are you returning strings?)
        self.df['closeoutAsk'] = pd.to_numeric(self.df["closeoutAsk"],errors='ignore')

        # resamples the data set to a new, homogeneous interval, set this from '5s' to '1m'
        dfr = self.df.resample('300s').last().bfill()

        #Calculate RSI from exponentially weighted average gains and losses
        delta = dfr['closeoutAsk'].diff()
        up = delta.clip(lower=0)
        down = -1*delta.clip(upper=0)
        ema_up = up.ewm(com=0.5, min_periods=13).mean()
        ema_down = down.ewm(com=0.5, min_periods=13).mean()
        rs = ema_up/ema_down

        dfr['RSI'] = 100 - (100/(1 + rs))

        # creates position column, fill all with zeros
        dfr['position'] = 0

        # derives the positioning according to the scalping strategy below
        dfr['position'] = np.where(((dfr['RSI'] > 0) & (dfr['RSI'] < 10)), 1, dfr.position)
        dfr['position'] = np.where(((dfr['RSI'] > 80) & (dfr['RSI'] < 100)), -1, dfr.position)
        print (dfr)

        print("position=",dfr['position'].iloc[-1])
        print ("RSI=", dfr['RSI'].iloc[-1])


        if dfr['position'].iloc[-1] == 1:
            print("go long")
            if self.position == 0:
                self.create_order(self.units)

            elif self.position == -1:
                self.create_order(self.units * 2)
            self.position = 1

        elif dfr['position'].iloc[-1] == -1:
            print("go short")
            if self.position == 0:
                self.create_order(-self.units)


            elif self.position == 1:
                self.create_order(-self.units * 2)

            self.position = -1

        if self.ticks == 25000:
            print("close out the position")
            if self.position == 1:
                self.create_order(-self.units)
            elif self.position == -1:
                self.create_order(self.units)
            self.disconnect()

    def disconnect(self):
        self.connected=False

    def rates(self, account_id, instruments, **params):
        self.connected = True
        params = params or {}
        ignore_heartbeat = None
        if "ignore_heartbeat" in params:
            ignore_heartbeat = params['ignore_heartbeat']
        while self.connected:
            response = self.client.request(self)
            for tick in response:
                if not self.connected:
                    break
                if not (ignore_heartbeat and tick["type"]=="HEARTBEAT"):
                    print(tick)
                    self.on_success(tick)


Information Shocks

As I build more of these classes I’m beginning to realize that trading with technical indicators is terrible. It confirms my suspicions that technicals really don’t work well in the long run or on a daily or sub-15-minute time frame.

What worked for me was discretionary trading, where I would enter a trade based on the fundamentals and news around me and then sit on the trade for days and weeks. I made sick money (on a relative percentage basis) that way but when the market sentiment changed I also lost ‘sick money’ too. I believe that the happy answer is a switch between short and long-term holding periods but when and how? That’s the question.

I recently came across an interesting post about Chaos Theory in r/AlgoTrading and the top response is what resonated with me. It made me think back to UglyChart’s trading bot, W0nk0’s trading scripts, and Maoxian’s trading style. You trade according to some volatility or market event per asset. This is loosely known as information shock.

Why hadn’t I thought of this before? Coding in some sort of volatility trading class in the Forex Bot? After all, I stream in the tick data and from there I can calculate how many ticks per time period (buying or selling) I get. Perhaps I can write a simple directional bot that goes long for a few pips when the buying pressure exceeds the selling pressure by some amount, and then closes out. The same idea holds true if I were selling short.
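A rough sketch of what that buying-versus-selling count might look like, assuming an uptick (price above the previous tick) counts as buying pressure and a downtick as selling pressure. That classification, the function names, the 0.2 threshold, and the hand-made prices are all assumptions for illustration, not a tested strategy.

```python
import numpy as np
import pandas as pd

def tick_imbalance(prices, freq="60s"):
    """Per-period imbalance between upticks and downticks, from -1 to 1."""
    change = np.sign(prices.diff()).fillna(0)
    up = (change > 0).astype(int).resample(freq).sum()
    down = (change < 0).astype(int).resample(freq).sum()
    total = (up + down).replace(0, np.nan)   # avoid dividing by zero
    return (up - down) / total

def imbalance_signal(imbalance, threshold=0.2):
    """1 = go long, -1 = go short, 0 = stay flat (threshold is hypothetical)."""
    values = np.where(imbalance > threshold, 1,
                      np.where(imbalance < -threshold, -1, 0))
    return pd.Series(values, index=imbalance.index)

# Hand-made ticks: one minute of rising prices, then one minute of falling
times = pd.date_range("2020-04-01 00:00:00", periods=12, freq="10s")
prices = pd.Series([1.1000, 1.1001, 1.1002, 1.1003, 1.1004, 1.1005,
                    1.1004, 1.1003, 1.1002, 1.1001, 1.1000, 1.0999],
                   index=times)

imbalance = tick_imbalance(prices)
signal = imbalance_signal(imbalance)
print(pd.DataFrame({"imbalance": imbalance, "signal": signal}))
```

On real tick data the imbalance will be far noisier than this, so the threshold and the bucket size are the two things to experiment with.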

This way I don’t care about the direction of the trade, just what the short-term market is telling me and I can get in and out of trades quickly. I’d have to keep in mind the spread costs and only take trades that are statistically proven to provide me with a 2R (2 times the reward of what I risk).
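The spread matters more than it looks on trades of only a few pips. Here is a small worked example, assuming the target and stop distances are measured on the mid price, so the spread is paid once per round trip and both shrinks a winner and enlarges a loser. The function names and pip values are hypothetical.

```python
def net_reward_to_risk(target_pips, stop_pips, spread_pips):
    """Reward-to-risk after paying the spread once per round trip.

    Assumes target and stop distances are measured on the mid price, so
    the spread shrinks a winner and enlarges a loser by the same amount.
    """
    return (target_pips - spread_pips) / (stop_pips + spread_pips)

def target_for_multiple(stop_pips, spread_pips, multiple=2.0):
    """Target distance (in pips) needed for the net ratio to reach `multiple`."""
    return multiple * (stop_pips + spread_pips) + spread_pips

# Hypothetical numbers: 10-pip target, 5-pip stop, 1-pip spread
print(net_reward_to_risk(10, 5, 1))    # 9 / 6 = 1.5, short of 2R
print(target_for_multiple(5, 1))       # 2 * 6 + 1 = 13.0 pips for a true 2R
```

So a trade that looks like 2R on the chart (10 pips against 5) is really 1.5R after a 1 pip spread.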

The first step is to capture tick data again and manipulate the dataframes to build the logic for tick compression. Then come up with a buy and sell strategy and backtest it. Then run the bot in multiple time frames in the Oanda practice environment. Then, PROFIT!!?!?!
