Cleaning up my bookshelf a bit, I came upon the book Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, and Jeff Ullman. It covers all kinds of ways to handle big data sets, data streams, link structures between documents, social network analysis, and other kinds of data which occur in large amounts. There … Continue reading When Big Data became Scalable Databases
There is one book I've read in the past months that keeps coming back to me. Currently in the form of a series of Medium posts by Simon Wardley, it is already a few hundred pages in length. Created as a means to map out the different parts of a company and to help developing … Continue reading Wardley Maps and the Democratization of AI
I recently did a podcast with Ben Lorica for our new project The Data Exchange on key AI trends for 2020. I am particularly happy to be part of this project so we can continue to collaborate now that he has left O'Reilly and joined Databricks. So I checked, and actually I met him for the first … Continue reading Key AI Trends for 2020
In a few days on August 1st, I will have completed my fourth year at Zalando. It is my first job out of university and I was fortunate enough to have their trust to make the switch from supervising a bunch of Ph.D. students to managing teams and leading people. Later, I switched to an … Continue reading My 2019 Recap of Machine Learning From Academia To Industry
When it comes to new technologies like Artificial Intelligence, the pure technology is only a small part of what is required to put it to use. Still, given the hype that exists currently, one can easily lose sight of the big picture as announcements of new algorithms, toolboxes, or cloud services fight to grab our attention as the … Continue reading The Levels of Doing AI
One discussion I find myself in more often recently is people asking me whether something is "really AI" or not. Often, what people seem to mean by that is whether someone is already using deep learning, or still "just" machine learning. I mentioned this to a friend in the industry and he just rolled his … Continue reading But is it AI?
Two and a half years ago I jumped off the AI bandwagon, left a permanent position in academia, and joined "the industry," namely Zalando, a big European fashion ecommerce retailer. On the other hand, ever since I left, AI really did explode, with NIPS 2017 selling out in 15 days, and absurdities happening like Intel AI … Continue reading That Post-Academia Thing I Needed to Write.
(Repost from 2014) I don’t know whether this word exists, but mainstreamification is what’s happening to data analysis right now. Projects like Pandas or scikit-learn are open source, free, and allow anyone with some Python skills to do some serious data analysis. Projects like MLbase or Apache Mahout work to make data analysis scalable such … Continue reading Data Analysis: The Hard Parts
(Repost) In case you haven’t heard yet, Data Science is all the rage. Courses, posts, and schools are springing up everywhere. However, every time I take a look at one of those offerings, I see that a lot of emphasis is put on specific learning algorithms. Of course, understanding how logistic regression or deep learning … Continue reading Three Things About Data Science You Won’t Find in the Books
Nowadays Python is probably the programming language of choice (besides R) for data scientists for prototyping, visualization, and running data analyses on small and medium-sized data sets. And rightly so, I think, given the large number of available tools (just look at the list at the top of this article). However, it wasn’t always … Continue reading How Python Became the Language of Choice for Data Science