First off is Julia Language for Data Engineering (Medium link). Author Logan Kilpatrick shows how to use the Julia packages DataFrames.jl and CSV.jl for some basic data engineering. He doesn’t stop there: he also shows you how to work with databases in Julia and shares a lot of links to videos and additional information.
Meta AI, formerly Facebook AI, has released a new open-source package called data2vec. Data2vec operates in the self-supervised area of machine learning. Meta AI describes self-supervised learning like this:
Self-supervision enables computers to learn about the world just by observing it and then figuring out the structure of images, speech, or text. Having machines that don’t need to be explicitly taught to classify images or understand spoken language is simply much more scalable. - via Meta AI.
Are you interested in running machine learning on small and low-powered chips? How about doing deep learning on tiny devices? That’s the goal of TinyML.
TinyML seeks to bring the power of deep learning to small microcontrollers and chips that are cheap to produce, use little power, and can run on a battery. I really like this idea since I have several Raspberry Pi computers and a Jetson Nano.
This is a treasure trove of data science cheat sheets. Bookmark this Kaggle page; you will thank me for it.
In other news, it looks like Alteryx is trying to stay relevant by snapping up “holy shit are they still around” Big Data profiling company Trifacta.
Last but not least, how bad was Zillow’s Zestimate? It looks like it was way off the market, and the company is shedding more than 20% of its workforce. OUCH! They incurred a $420 million loss. More OUCH!