H2O.ai Empowers MarketAxess to Innovate and Inform Trading Strategies – Bloomberg

“H2O is an integral part of Composite+ and provides some of the fundamental machine learning tools and support that make our algorithms run as well as they do,” said David Krein, Global Head of Research at MarketAxess. “The Composite+ pricing engine is helping fulfill our clients’ critical liquidity needs with more accurate and timely pricing data, which we make available within the MarketAxess electronic trading workflow. H2O.ai has been a great partner which has contributed to our recent success.”


Isolation Forests in H2O.ai

A new feature has been added to open-source H2O-3: isolation forests. I’ve always been a fan of understanding outliers and love using One-Class SVMs as a method, but isolation forests appear to be better at finding outliers in most cases.

From the H2O.ai blog:

There are multiple approaches to an unsupervised anomaly detection problem that try to exploit the differences between the properties of common and unique observations. The idea behind the Isolation Forest is as follows.

We start by building multiple decision trees such that the trees isolate the observations in their leaves. Ideally, each leaf of the tree isolates exactly one observation from your data set. The trees are being split randomly. We assume that if one observation is similar to others in our data set, it will take more random splits to perfectly isolate this observation, as opposed to isolating an outlier.

For an outlier that has some feature values significantly different from the other observations, randomly finding the split isolating it should not be too hard. As we build multiple isolation trees, hence the isolation forest, for each observation we can calculate the average number of splits across all the trees that isolate the observation. The average number of splits is then used as a score, where the less splits the observation needs, the more likely it is to be anomalous.

There are other methods of outlier detection, like LOF (local outlier factor), but Isolation Forests appear to find outliers better than One-Class SVMs in most cases.
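To make the "fewer splits means more anomalous" idea concrete, here is a minimal pure-Python sketch. It is not H2O's or Scikit-Learn's implementation: instead of building whole trees it follows only the branch that contains the point being scored, and it skips the subsampling and score normalization that real libraries use. The function names, the toy data, and the parameter values are my own assumptions.

```python
import random


def isolation_depth(point, data, rng, depth=0, max_depth=50):
    """Count the random splits needed to isolate `point` within `data`."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    # Pick a random feature and a random split value between its min and max.
    feature = rng.randrange(len(point))
    values = [row[feature] for row in data]
    lo, hi = min(values), max(values)
    if lo == hi:
        return depth  # simplification: stop if this feature cannot be split
    split = rng.uniform(lo, hi)
    # Keep only the rows on the same side of the split as `point`.
    same_side = [row for row in data
                 if (row[feature] < split) == (point[feature] < split)]
    return isolation_depth(point, same_side, rng, depth + 1, max_depth)


def average_depth(point, data, n_trees=200, seed=0):
    """Average isolation depth over many random 'trees' (lower = more anomalous)."""
    rng = random.Random(seed)
    total = sum(isolation_depth(point, data, rng) for _ in range(n_trees))
    return total / n_trees


# Toy data (assumed for illustration): one tight cluster plus one far-away point.
rng = random.Random(42)
inliers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
outlier = (8.0, 8.0)
data = inliers + [outlier]

outlier_score = average_depth(outlier, data)
inlier_scores = [average_depth(p, data) for p in inliers[:20]]
mean_inlier_score = sum(inlier_scores) / len(inlier_scores)

print("outlier average depth:", outlier_score)
print("mean inlier average depth:", mean_inlier_score)
```

The far-away point should need noticeably fewer splits on average than the points inside the cluster, which is exactly the signal the isolation forest uses as its anomaly score.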

See this handy image from the Scikit-Learn site:

Anomaly Detection Comparison

Interesting indeed. I plan on using this new feature on some work I’m doing for customers.

Install H2O’s Wave on AWS Lightsail or EC2

I recently had to set up H2O’s Wave Server on AWS Lightsail and build a simple Wave App as a Proof of Concept.

If you’ve never heard of H2O Wave, then you have been missing out on a cool new app development framework. We use it at H2O to build AI-based apps and you can too, or use it for all other kinds of interactive applications. It’s like Django or Streamlit, but better and faster to build and deploy.

AWS Lightsail Instances & Plans

The first step is to go to AWS Lightsail and select the OS-based blueprint. I selected my favorite distro of Linux: Ubuntu 20.

Then I picked an instance size and opted for the $5 a month plan. The instance is considered tiny: it has 1 GB of memory, 1 vCPU, and 40 GB of storage.

Also, make sure that you’re in the right AWS region for your needs.

NOTE: YOU WILL NEED TO OPEN PORT 10101! Just navigate to Manage > Networking on your instance and add a custom TCP rule for port 10101.

Connect with SSH and install required Python libraries

Ubuntu 20 comes preloaded with a lot of stuff, like Python 3, but you’ll need to update & upgrade the instance first before installing the required libraries.

On AWS Lightsail you can click on the ‘Connect with SSH’ button on your instance card. That will open up a nice terminal where you can run the following commands.

First, run these commands:

sudo apt-get update
sudo apt-get upgrade -y

Then install pip and virtualenv by doing:

sudo apt-get install python3-pip -y
sudo pip3 install virtualenv

Once you’ve done that, your instance should be ready to download the latest Wave Server release!

Download the Wave Server and Install it

Grab the latest version of Wave (as of this post, it’s v0.13). Make sure you grab the right install package. Since we are on Linux, download the ‘wave-0.13.0-linux-amd64.tar.gz’ package.

I did a simple ‘wget’ to the /home/ubuntu directory like so:

wget https://github.com/h2oai/wave/releases/download/v0.13.0/wave-0.13.0-linux-amd64.tar.gz

Then I unzipped it using the following command:

tar -xzf wave-0.13.0-linux-amd64.tar.gz

Once I did that I had a directory in my /home/ubuntu directory called ‘wave-0.13.0-linux-amd64’.
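If you would rather script the download than type the wget and tar commands by hand, here is a rough Python sketch of the same two steps. The helper names are my own, and I am assuming that other releases follow the same URL and file naming pattern as v0.13.0, which I have not checked for every version.

```python
import tarfile
import urllib.request
from pathlib import Path

# Base of the release URL used in the wget command above.
RELEASES = "https://github.com/h2oai/wave/releases/download"


def wave_archive_name(version, platform="linux-amd64"):
    """File name of the release archive, e.g. wave-0.13.0-linux-amd64.tar.gz."""
    return f"wave-{version}-{platform}.tar.gz"


def wave_release_url(version, platform="linux-amd64"):
    """Download URL, assuming the naming pattern of the v0.13.0 release."""
    return f"{RELEASES}/v{version}/{wave_archive_name(version, platform)}"


def download_and_extract(version, dest=".", platform="linux-amd64"):
    """Download the archive into `dest`, extract it, and return the new directory."""
    dest = Path(dest)
    archive = dest / wave_archive_name(version, platform)
    urllib.request.urlretrieve(wave_release_url(version, platform), str(archive))
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return dest / f"wave-{version}-{platform}"
```

Calling download_and_extract("0.13.0", "/home/ubuntu") should leave you with the same wave-0.13.0-linux-amd64 directory as the manual steps.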

Running the Wave Server

CD into the wave-0.13.0-linux-amd64 directory after you extract the package. Once you’re in there, you just need to run the following command to start the Wave Server:

./waved

The Wave Server should spin up and print its startup output to the terminal.

Set Up Your Apps Directory

The neat thing about Wave is building apps. As a best practice, create an apps directory. This is where you’ll store all your apps and create a virtualenv to run them in.

First, you’ll have to CD back to your home directory, make an apps directory, and then CD to the apps directory.

cd $HOME

mkdir apps

cd apps

Once you’re in the /home/ubuntu/apps directory you’re going to set up a virtual environment and write your app!

First, set up a virtual environment called venv:

python3 -m venv venv

Next, activate the virtual environment by running:

source venv/bin/activate

You will install any Python libraries or dependencies into the virtual environment. My suggestion is to put those dependencies into a requirements.txt file and then do a simple pip3 install -r requirements.txt.

The next step is to create a src directory where you will store the source (src) code of your app. This base app directory is a great place to put all sorts of things for your app, like images and requirements.txt.

/home/ubuntu
| – apps
|     | – requirements.txt
|     | – src
|           | – app.py
| – wave-0.13.0-linux-amd64

Running your Wave App

Once your apps directory and virtual environment are set up, it’s time to run your Wave App.

It’s as simple as this command:

wave run src.app

If it spins up without any errors, navigate to port 10101 on your instance’s IP address in your browser and see your app!
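If the page does not load, the usual suspect is that port 10101 is not reachable from outside. Here is a small sketch, using only the Python standard library, that you can run from your own machine to check. The function name and the default timeout are my own choices.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, port_is_open("your-instance-ip", 10101) returning False while the Wave Server is running usually means the Lightsail networking rule for port 10101 is missing.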

Automating a Wave Server and App

Two things have to happen when you run a Wave App: the Wave Server must be running, and the Wave App must run inside an active virtual environment.

That’s a lot of moving parts, but it can be automated with a script that runs when your instance boots up.

Here’s a simple bash script that I wrote to run the Wave Server in the background and make sure my stock app is running inside the virtual environment and is reachable.


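#!/bin/bash
# Start the Wave Server in the background (absolute path so the script works from any directory)
cd /home/ubuntu/wave-0.13.0-linux-amd64
nohup ./waved &

# Start the stock app inside its virtual environment
cd /home/ubuntu/apps/stock
source venv/bin/activate
nohup wave run --no-reload src.app &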

Save the script as mystartupscript.sh and make it executable with chmod +x mystartupscript.sh.

Note the --no-reload part. When Wave is in dev mode, it scans your app directories for code changes. In production, you want to disable that because it can use up CPU power!
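If you prefer Python over bash for this, here is a rough equivalent using subprocess. Treat it as a sketch: the paths assume the same /home/ubuntu layout as my script, and I am assuming the wave command lives at venv/bin/wave inside the app's virtual environment, which is where pip normally puts console scripts. Calling that executable directly stands in for activating the environment first.

```python
import subprocess
from pathlib import Path


def start_background(cmd, cwd, log_name="nohup.out"):
    """Start `cmd` in `cwd`, detached from the terminal, appending output to a log file."""
    log_file = open(Path(cwd) / log_name, "ab")
    return subprocess.Popen(
        cmd,
        cwd=cwd,
        stdout=log_file,
        stderr=subprocess.STDOUT,
        stdin=subprocess.DEVNULL,
        start_new_session=True,  # keep running after the SSH session ends
    )


def launch_wave(home="/home/ubuntu"):
    """Start the Wave Server and the stock app (paths are assumptions, adjust to yours)."""
    wave_dir = f"{home}/wave-0.13.0-linux-amd64"
    app_dir = f"{home}/apps/stock"
    server = start_background([f"{wave_dir}/waved"], wave_dir)
    app = start_background(
        [f"{app_dir}/venv/bin/wave", "run", "--no-reload", "src.app"], app_dir
    )
    return server, app
```

Calling launch_wave() starts both processes and returns their Popen handles, so you can check later whether either one has exited.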

That’s it!

H2O Wave App Tutorials

Just like what I did with my general H2O-3 Tutorials, I’m starting a separate H2O Wave Tutorial post for you all here. I will add to this as I go along and build more apps.

Stock Chart Generator Wave App

This is a simple candlestick chart generator that defaults to JNJ when you load it. All you need to do is add your Alphavantage API key where it says:

"apikey": "XXXXXXX" } # ENTER YOUR ALPHAVANTAGE KEY HERE

The app is pretty simple and looks like this:

I need to refactor the code and write it better, but it works!
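One small refactor I would suggest: read the Alphavantage key from an environment variable instead of pasting it into the source. Here is a minimal sketch. The variable name ALPHAVANTAGE_API_KEY and the helper function are my own inventions, not something the Alphavantage API requires.

```python
import os


def alphavantage_params(symbol, env_var="ALPHAVANTAGE_API_KEY"):
    """Build the query parameters for a daily time series request.

    The API key is read from the environment variable named by `env_var`
    (a name chosen for this sketch) so it never lands in the source code.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "function": "TIME_SERIES_DAILY",
        "symbol": symbol,
        "outputsize": "compact",
        "datatype": "json",
        "apikey": api_key,
    }
```

You would export the variable in your startup script before launching the app, then pass alphavantage_params(symbol) to requests.get in place of the hard-coded dictionary.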

from h2o_wave import site, ui, app, Q, main

import numpy as np

from bokeh.models import HoverTool
from bokeh.plotting import figure
from bokeh.resources import CDN
from bokeh.embed import file_html
from bokeh.palettes import Spectral6
from bokeh.models import Span, Label, Title, Legend, LegendItem, ColumnDataSource

from bokeh.io import curdoc, show
from bokeh.models import ColumnDataSource, Grid, Line, LinearAxis, Plot
from bokeh.resources import INLINE

import bokeh.io

from math import pi

import pandas as pd

import requests

import datetime as DT
from datetime import datetime
from datetime import date
import sys, os

import time as tt

@app('/stock')  # assumption: route taken from the commented-out site['/stock'] line below
async def serve(q: Q):


    # Grab a reference to the page at route '/stock'
    #page = site['/stock']

    q.page['title'] = ui.header_card(box='1 1 8 1', title='Stock Chart Generator', icon='shoppingcart', icon_color='yellow', subtitle='')

    # Add a markdown card to the page.
    #q.page['title'] = ui.markdown_card(
    #    box='1 1 8 1',
    #    title='Stock Charts',
    #    content='For the Investors that want to lose Money!',

    #This does nothing yet.
    #q.page['alpha'] = ui.form_card(box='1 2 2 4', items=[
    #    ui.text_l(content='Enter Your Alphavantage API Key'),
    #    ui.textbox(name='junk', label='Enter API'),
    #    ui.button(name='show_inputs', label='Submit', primary=True),])

    if q.args.textbox:
        # A symbol was submitted: redraw the form
        q.page['stockform'] = ui.form_card(box='1 2 2 4', items=[
            ui.text_l(content='Enter a Stock Symbol.'),
            ui.textbox(name='textbox', label='Enter a Stock Symbol'),
            ui.button(name='show_inputs', label='Submit', primary=True),])
    else:
        # First load: draw the form and default to JNJ
        q.page['stockform'] = ui.form_card(box='1 2 2 4', items=[
            ui.text_l(content='Enter a Stock Symbol.'),
            ui.textbox(name='textbox', label='Enter a Stock Symbol'),
            ui.button(name='show_inputs', label='Submit', primary=True),])
        q.args.textbox = 'JNJ'
# Get Data

    instrument = q.args.textbox

    API_URL = "https://www.alphavantage.co/query"

    symbol = instrument

    data = { "function": "TIME_SERIES_DAILY",
         "symbol": symbol,
         "outputsize" : "compact",
         "datatype": "json",
         "apikey": "XXXXXXX" }  # ENTER YOUR ALPHAVANTAGE KEY HERE

# https://www.alphavantage.co/query?function=SMA&symbol=IBM&interval=weekly&time_period=10&series_type=open&apikey=demo

    response = requests.get(API_URL, data)
    response_json_stock = response.json() # maybe redundant

# Test Chart Building

#Manipulate OHLC Frame
    data = pd.DataFrame.from_dict(response_json_stock['Time Series (Daily)'], orient= 'index').sort_index(axis=1)
    data = data.rename(columns={ '1. open': 'Open', '2. high': 'High', '3. low': 'Low', '4. close': 'Close', '5. volume':'Volume'})
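    # Alphavantage returns the values as strings; convert them to numbers before comparing and plotting
    data = data[['Open', 'High', 'Low', 'Close','Volume']].astype(float)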
    data['Date'] = data.index

    data['Date'] = pd.to_datetime(data['Date'])

# Set up Bokeh Frame
    inc = data.Close > data.Open
    dec = data.Open > data.Close
    w = 12*60*60*1000 # half day in ms

    p = figure(x_axis_type="datetime", sizing_mode='stretch_both', plot_width=800, title = str(instrument) + " Stock")

    p.segment(data.Date, data.High, data.Date, data.Low, color="black")

    p.vbar(data.Date[inc], w, data.Open[inc], data.Close[inc], fill_color="green", line_color="black")
    p.vbar(data.Date[dec], w, data.Open[dec], data.Close[dec], fill_color="red", line_color="black")

    p.xaxis.major_label_orientation = pi/4

    p.add_layout(Title(text="Date", align="center"), "below")
    p.add_layout(Title(text=str(instrument) + ' Price', align="center"), "left")

# Export html for our frame card
    html = file_html(p, CDN, "plot")

    q.page['chart'] = ui.frame_card(
        box='3 2 6 8',
        title='Stock Chart',
        content=html,  # the Bokeh HTML exported above
    )

# Finally, save the page.
    await q.page.save()
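If you want to check the OHLC parsing step without spending an API call, here is a small standalone sketch of that part of the app. The sample payload is made up by me to mimic the shape of a TIME_SERIES_DAILY response, so the numbers are not real quotes, and the function name is my own.

```python
import pandas as pd

# Made-up sample shaped like a TIME_SERIES_DAILY response (values are illustrative only).
SAMPLE = {
    "Time Series (Daily)": {
        "2021-03-02": {"1. open": "10.0", "2. high": "12.0", "3. low": "9.5",
                       "4. close": "11.0", "5. volume": "1000"},
        "2021-03-01": {"1. open": "11.0", "2. high": "11.5", "3. low": "9.0",
                       "4. close": "10.0", "5. volume": "1500"},
    }
}

COLUMNS = {"1. open": "Open", "2. high": "High", "3. low": "Low",
           "4. close": "Close", "5. volume": "Volume"}


def ohlc_frame(payload):
    """Turn the JSON payload into a numeric OHLC frame sorted by date."""
    frame = pd.DataFrame.from_dict(payload["Time Series (Daily)"], orient="index")
    # The API delivers strings, so rename the columns and convert to floats.
    frame = frame.rename(columns=COLUMNS)[list(COLUMNS.values())].astype(float)
    frame["Date"] = pd.to_datetime(frame.index)
    return frame.sort_values("Date")


frame = ohlc_frame(SAMPLE)
rising = frame.Close > frame.Open  # these days are drawn as green candles
print(frame)
print(rising.tolist())
```

Keeping the parsing in its own function like this also makes the serve function shorter once the chart code is refactored.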
