Deep Learning is a powerful technology, but you might want to try some “shallow” approaches before you dive in.

It’s unquestionable that over the last decade, deep learning has changed the machine learning landscape for the better. Deep Neural Networks (DNNs), first popularised by Yann LeCun, Yoshua Bengio and Geoffrey Hinton, are a family of machine learning models that are capable of learning to see and categorise objects, predict stock market trends, understand written text and even play video games.
Buzzwords like “LSTM” and “GAN” sound very cool, but are they the right fit for your business problem?

Deep Learning or Classical Learning?
Most data scientists will prefer to KISS (keep it simple, stupid) rather than charge in with a deep learning model. A classical approach is usually the better starting point when:
- You have an experienced data science team who understand feature engineering and the data they’re being asked to model, or who can at the very least get hold of people who understand the data.
- You don’t have access to GPUs and large amounts of compute, or hardware and computing power are at a premium.
- You don’t have lots of data (e.g. you have 100 or 1,000 examples rather than 100k or 1 million).
1. Data Scientists, Feature Engineering and Understanding the Data
…a deep learning model may be able to learn features of the data that data scientists can’t but if a hand-engineered model gets you to 90% accuracy, is the extra data gathering and compute power worth it…?

Conversely, one of the most exciting things about “deep learning” is that these models are able to learn complex features for themselves over time. Just as the human brain slowly assigns meaning to the seemingly random photons that hit the retina, deep networks are able to take in a series of pixels from images and slowly learn which patterns of pixels are interesting or predictive. The caveat is that automatically deriving these features requires huge volumes of data to learn from (see point 3). Ultimately, a deep learning model may be able to implicitly learn features of the data that human data scientists are unable to isolate, but if a classical, hand-engineered model gets you to 90% accuracy, is the extra data gathering and compute power worth it for that 5-7% boost?
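
As a rough illustration of the classical, hand-engineered route, here is a minimal sketch using scikit-learn. The data, the “overnight transaction” flag and the log-scaled amount are invented placeholders rather than a real dataset; the point is that a domain expert encodes the features up front instead of asking a network to discover them from raw inputs.

```python
# A minimal sketch of a classical baseline built on hand-engineered features.
# All data below is synthetic and purely illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)

n = 1_000
amount = rng.exponential(scale=50.0, size=n)      # stand-in for transaction amounts
hour_of_day = rng.integers(0, 24, size=n)         # stand-in for timestamps

# Toy target purely for illustration: "suspicious" = large overnight spend.
y = ((amount > 120) & (hour_of_day < 6)).astype(int)

# Hand-engineered features: the analyst encodes what they believe matters.
X = np.column_stack([
    np.log1p(amount),               # skewed amounts compress nicely on a log scale
    (hour_of_day < 6).astype(int),  # simple "overnight transaction" flag
])

model = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(model, X, y, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.2f}")
```

A baseline like this trains in seconds on a laptop, and its features can be inspected and explained, which is often exactly the yardstick you want before committing to a deep model.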
2. Compute Power Requirements

It often makes sense to prefer simpler models where compute resource is at a premium, or even unavailable, and where classical models give “good enough” accuracy: for example, in an edge computing environment in a factory, or in an anti-fraud solution at a retail bank where millions of transactions must be examined in real time. It would either be impossible or obscenely expensive to run a complex deep learning model on millions of data records in real time, and it might not be practical to install a cluster of whirring servers into your working environment. On the other hand, if accuracy is what you need and you have lots of data, then maybe it’s time to buy those GPUs…
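
To make the latency argument concrete, here is a rough sketch showing how cheaply a small linear model scores a large batch of records on a plain CPU. The feature count and transaction data are entirely synthetic assumptions; a comparable deep model would typically need dedicated hardware to approach the same throughput.

```python
# A rough sketch of scoring throughput for a lightweight classical model.
# The "transactions" are synthetic; the point is the per-record scoring cost.
import time
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for historical transactions with 20 engineered features (hypothetical).
X_train = rng.normal(size=(10_000, 20))
y_train = (X_train[:, 0] + X_train[:, 1] > 1).astype(int)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Score a large batch of "incoming" records and measure CPU throughput.
X_live = rng.normal(size=(1_000_000, 20))
start = time.perf_counter()
clf.predict_proba(X_live)
elapsed = time.perf_counter() - start
print(f"Scored {len(X_live):,} records in {elapsed:.2f}s on a single CPU")
```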