Fascinating behind-the-scenes interview with StockTwits’ Senior Data Scientist, Garrett Hoffman. He shares great tidbits on how StockTwits uses machine learning for sentiment analysis. I’ve summarized the highlights below:

  • Idea generation is a huge barrier for active trading
  • Next gen of traders uses social media to make decisions
  • Garrett solves data problems and builds features for the StockTwits platform
  • This includes: production data science, product analytics, and insights research
  • Understanding social dynamics makes for a better user experience
  • His focus is understanding the social dynamics of the StockTwits (ST) community, i.e. what’s happening inside it
  • ST’s market sentiment model helps users with decision making
  • Users ‘tag’ their content as bullish or bearish
  • Only 20 to 30% of content is tagged
  • Using ST’s market sentiment model increases coverage to 100%
  • For data science work, a Python stack is used
  • Libraries used: NumPy, SciPy, pandas, scikit-learn
  • Jupyter Notebooks for research and prototyping
  • Flask for API deployment
  • For deep learning, uses TensorFlow on AWS EC2 instances
  • Can spin up GPUs as needed
  • Deep Learning methods used are Recurrent Neural Nets, Word2Vec, and Autoencoders
  • Stays abreast of new machine learning techniques via blogs, conferences, and Twitter
  • Follows Twitter accounts from Google, Spotify, Apple, and small tech companies
  • One area ST wants to improve on is DevOps around Data Science
  • Goal: bridge the gap between the research/prototype phase and embedding models into the tech stack for deployment
  • Misconception that complex solutions are best
  • Complexity is ONLY OK if it leads to deeper insight
  • Simple solutions are best
  • Future long-term ideas: use AI around natural language
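
To make the tagging idea from the highlights concrete, here is a minimal sketch of how user-tagged messages could train a classifier that then labels the untagged ones, which is how coverage could go from 20 to 30% up to 100%. This is not StockTwits' actual model (the interview mentions deep learning methods such as recurrent neural nets); I've swapped in a tiny Naive Bayes classifier written in plain Python so it runs anywhere. The messages, the `tokenize`, `train`, and `predict` helpers, and all variable names are hypothetical, made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy messages; the interview shows no real StockTwits data.
tagged = [
    ("buying more shares, breakout coming", "bullish"),
    ("great earnings, going long", "bullish"),
    ("strong breakout, adding to my long", "bullish"),
    ("selling everything, crash coming", "bearish"),
    ("weak earnings, going short", "bearish"),
    ("ugly chart, adding to my short", "bearish"),
]
untagged = ["going long before earnings", "crash incoming, going short"]


def tokenize(text):
    """Lowercase the text, drop commas, and split on whitespace."""
    return text.lower().replace(",", " ").split()


def train(examples):
    """Count words per label and messages per label from the tagged examples."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Log prior: share of tagged messages carrying this label.
        score = math.log(label_counts[label] / total_messages)
        # Add-one (Laplace) smoothing so unseen label/word pairs are not zero.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(tagged)
labels = [predict(model, text) for text in untagged]
for text, label in zip(untagged, labels):
    print(f"{label}: {text}")
```

The point of the sketch is the workflow rather than the algorithm: the tagged minority acts as free training data, and the trained model fills in a label for everything else. It also fits the "simple solutions are best" theme, since a simple baseline like this is a natural starting point before reaching for deep learning.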