What is deep learning?
Deep learning is a branch of machine learning that builds models using deep neural networks. The terms “deep learning” and “deep neural networks” have become somewhat interchangeable. In both phrases, “deep” indicates that the models are composed of a large number of sequential data transformations and processing steps.
Is deep learning a new concept?
Yes and no. The first neural network model was proposed in 1943, but for a long time such networks remained largely theoretical: we lacked the compute power to train them effectively and the underlying methods to make backpropagation work across many layers. A series of milestones that overcame these problems, together with improvements in processing chips, allowed researchers to build functional deep networks in the 2010s. After that, training techniques such as dropout and batch normalization were developed, which helped deep learning models converge faster and generalize better.
Are these neural networks based upon the way the human brain works?
The first artificial neural networks (ANNs) were inspired by the connections between neurons in the brain. Modern neural networks have moved beyond that original biological inspiration, adopting techniques chosen for performance rather than biological fidelity. There are, however, other approaches, like Biological Neural Networks (BNNs) and Hierarchical Temporal Memory (HTM), that attempt to match the structure of the brain more closely.
We often hear the buzzword “deep neural networks”. Is there such a thing as a “shallow neural network”? Is one better than another?
There are both deep neural networks and shallow neural networks. You can think of a neural network as being built of layers, where each layer represents a transformation of the input data. The more layers you stack, the more transformations you apply to the data, and the deeper the neural network.
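To make the idea of stacked layers concrete, here is a minimal sketch in plain Python. It is an illustration only: the layer sizes, the random weights, and the helper names `make_network` and `forward` are assumptions made for this example, and the networks are untrained. Each layer is a linear map followed by a ReLU, and depth is simply the number of layers the input passes through.

```python
import random

def make_layer(n_in, n_out, rng):
    """Return one dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def make_network(sizes, seed=0):
    """Build one layer per consecutive pair of sizes (hypothetical sizes)."""
    rng = random.Random(seed)
    return [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x, network):
    """Apply every layer in order: weighted sum plus bias, then ReLU."""
    for weights, biases in network:
        x = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x

shallow = make_network([4, 8, 2])            # 2 stacked transformations
deep = make_network([4, 8, 8, 8, 8, 2])      # 5 stacked transformations

x = [1.0, 1.0, 1.0, 1.0]
print(len(shallow), len(forward(x, shallow)))  # prints: 2 2
print(len(deep), len(forward(x, deep)))        # prints: 5 2
```

Both networks map the same 4-value input to a 2-value output; the only difference is how many transformations the data goes through on the way.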
There are both pros and cons with respect to the depth of a neural network. Deeper networks offer greater flexibility to model complex functions, whereas shallower networks typically compute faster. The decision to employ a deep or a shallow neural network often comes down to the underlying function and the optimization considerations. For example, for our Nova system, we found that shallow neural networks are often preferable because we are looking to assess data in real time. The faster compute of shallower networks outweighed the more thorough solution provided by deeper neural networks. It is worth noting, though, that speed is only one of many factors to consider in determining the depth of a network, and depth is only one hyperparameter in the design of a network.
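As a rough illustration of the speed side of this trade-off, the sketch below counts the multiply-add operations in one forward pass through fully connected layers. The layer sizes are hypothetical and do not describe the Nova system, and the helper `multiply_adds` is introduced only for this example. Operation count is only a proxy: real latency also depends on hardware, batching, and implementation.

```python
def multiply_adds(sizes):
    """Count multiply-adds for one forward pass through dense layers.

    A dense layer with n_in inputs and n_out outputs needs n_in * n_out
    multiply-adds (bias additions and activations are ignored here).
    """
    return sum(n_in * n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

# Hypothetical architectures with the same input and output sizes.
shallow_sizes = [64, 128, 10]
deep_sizes = [64, 128, 128, 128, 128, 10]

shallow_cost = multiply_adds(shallow_sizes)
deep_cost = multiply_adds(deep_sizes)

print(shallow_cost, deep_cost)  # prints: 9472 58624
```

In this made-up comparison the deeper network needs roughly six times as many operations per prediction, which is the kind of cost that can matter when data has to be assessed in real time.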