
Types of artificial neural networks


Instead of just adjusting the weights in a network of fixed topology, Cascade-Correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creating a multi-layer structure. Once a new hidden unit has been added to the network, its input-side weights are frozen. This unit then becomes a permanent feature detector in the network, available for producing outputs or for creating other, more complex feature detectors. The Cascade-Correlation architecture has several advantages: it learns quickly, determines its own size and topology, retains the structures it has built even if the training set changes, and requires no backpropagation of error signals through the connections of the network.
from which it receives connections. The system can explicitly activate (independent of incoming signals) some output units at certain time steps. For example, if the input sequence is a speech signal corresponding to a spoken digit, the final target output at the end of the sequence may be a label classifying the digit. For each sequence, its error is the sum of the deviations of all activations computed by the network from the corresponding target signals. For a training set of numerous sequences, the total error is the sum of the errors of all individual sequences.
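A minimal sketch of that error definition, with invented activations and targets; time steps without a teacher signal are marked NaN and ignored, so a final class label at the end of a sequence is just a target at the last step:

```python
import numpy as np

def sequence_error(activations, targets):
    # targets may be NaN at time steps with no teacher signal
    mask = ~np.isnan(targets)
    return np.sum((activations[mask] - targets[mask]) ** 2)

rng = np.random.default_rng(0)
training_set = []
for _ in range(3):                    # three toy sequences
    acts = rng.random(10)             # stand-in for output-unit activations
    targs = np.full(10, np.nan)
    targs[-1] = 1.0                   # e.g. a class label only at the final step
    training_set.append((acts, targs))

# total error of the training set = sum of the per-sequence errors
total_error = sum(sequence_error(a, t) for a, t in training_set)
print(total_error)
```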
Bias of the neural network ensemble. An associative neural network has a memory that can coincide with the training set. If new data become available, the network instantly improves its predictive ability and provides data approximation (self-learns) without retraining. Another important feature of ASNN is the possibility to interpret neural network results by analysis of correlations between data cases in the space of models.
CPPNs can include both types of functions and many others. Furthermore, unlike typical artificial neural networks, CPPNs are applied across the entire space of possible inputs so that they can represent a complete image. Since they are compositions of functions, CPPNs in effect encode images at infinite resolution and can be sampled for a particular display at whatever resolution is optimal.

All three approaches use a non-linear kernel function to project the input data into a space where the learning problem can be solved using a linear model. Like Gaussian processes, and unlike SVMs, RBF networks are typically trained in a maximum likelihood framework by maximizing the probability (minimizing the error). SVMs avoid overfitting by maximizing instead a margin.

Instantaneously trained neural networks (ITNN) were inspired by the phenomenon of short-term learning that seems to occur instantaneously. In these networks the weights of the hidden and the output layers are mapped directly from the training vector data. Ordinarily, they work on binary data, but versions for continuous data that require small additional processing exist.
are determined by training. When presented with the x vector of input values from the input layer, a hidden neuron computes the Euclidean distance of the test case from the neuron's center point and then applies the RBF kernel function to this distance using the spread values. The resulting value is passed to the summation layer.
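As a rough illustration of that hidden-layer computation (the centers, spreads and output weights below are invented for the example, and a Gaussian is assumed as the RBF kernel):

```python
import numpy as np

def rbf_hidden_layer(x, centers, spreads):
    """x: (d,) input; centers: (m, d); spreads: (m, d) per-dimension radii."""
    diff = (x - centers) / spreads           # scale each dimension by its spread
    sq_dist = np.sum(diff ** 2, axis=1)      # squared scaled Euclidean distance to each center
    return np.exp(-0.5 * sq_dist)            # Gaussian RBF value of each hidden neuron

x = np.array([6.0, 5.1])                                   # input vector
centers = np.array([[5.0, 5.0], [8.0, 2.0], [1.0, 7.0]])   # hypothetical neuron centers
spreads = np.ones_like(centers)                            # hypothetical spreads
phi = rbf_hidden_layer(x, centers, spreads)                # hidden activations

weights = np.array([0.4, -0.2, 0.1])                       # learned output weights (assumed)
output = phi @ weights                                     # value passed out of the summation layer
print(phi, output)
```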
The Hopfield network (like similar attractor-based networks) is of historic interest although it is not a general RNN, as it is not designed to process sequences of patterns. Instead it requires stationary inputs. It is an RNN in which all connections are symmetric, which guarantees that it will converge. If the connections are trained using Hebbian learning, the Hopfield network can perform as robust content-addressable memory, resistant to connection alteration.
This type of network can add new patterns without re-training. It is done by creating a specific memory structure, which assigns each new pattern to an orthogonal plane using adjacently connected hierarchical arrays. The network offers real-time pattern recognition and high scalability; this requires
A committee of machines (CoM) is a collection of different neural networks that together "vote" on a given example. This generally gives a much better result than individual networks. Because neural networks suffer from local minima, starting with the same architecture and training but using randomly
An RNN (often an LSTM) where a series is decomposed into a number of scales, where every scale informs the primary length between two consecutive points. A first-order scale consists of a normal RNN, a second-order scale consists of all points separated by two indices, and so on. The Nth-order RNN connects the
in discrete time settings, training sequences of real-valued input vectors become sequences of activations of the input nodes, one input vector at a time. At each time step, each non-input unit computes its current activation as a nonlinear function of the weighted sum of the activations of all units
found universally in sensory recognition. A mechanism to perform optimization during recognition is created using inhibitory feedback connections back to the same inputs that activate them. This reduces requirements during learning and allows learning and updating to be easier while still being able
Quote: "...McCormick said future investigations and models of neuronal operation in the brain will need to take into account the mixed analog-digital nature of communication. Only with a thorough understanding of this mixed mode of signal transmission will a truly in depth understanding of the brain
This architecture was developed in the 1980s. Its network creates a directed connection between every pair of units. Each has a time-varying, real-valued (more than just zero or one) activation (output). Each connection has a modifiable real-valued weight. Some of the nodes are called labeled nodes,
RBF networks have the disadvantage of requiring good coverage of the input space by radial basis functions. RBF centres are determined with reference to the distribution of the input data, but without reference to the prediction task. As a result, representational resources may be wasted on areas of
the word-count vectors obtained from a large set of documents. Documents are mapped to memory addresses in such a way that semantically similar documents are located at nearby addresses. Documents similar to a query document can then be found by accessing all the addresses that differ by only a few
DTREG uses a training algorithm that uses an evolutionary approach to determine the optimal center points and spreads for each neuron. It determines when to stop adding neurons to the network by monitoring the estimated leave-one-out (LOO) error and terminating when the LOO error begins to increase
Radial basis functions are functions that have a distance criterion with respect to a center. Radial basis functions have been applied as a replacement for the sigmoidal hidden layer transfer characteristic in multi-layer perceptrons. RBF networks have two layers: In the first, input is mapped onto
and the electrical signals they convey between input (such as from the eyes or nerve endings in the hand), processing, and output from the brain (such as reacting to light, touch, or heat). The way neurons semantically communicate is an area of ongoing research. Most artificial neural networks bear
The associative neural network (ASNN) is an extension of committee of machines that combines multiple feedforward neural networks and the k-nearest neighbor technique. It uses the correlation between ensemble responses as a measure of distance amid the analyzed cases for the kNN. This corrects the
Bi-directional RNN, or BRNN, use a finite sequence to predict or label each element of a sequence based on both the past and future context of the element. This is done by adding the outputs of two RNNs: one processing the sequence from left to right, the other one from right to left. The combined
This layer has a variable number of neurons (determined by the training process). Each neuron consists of a radial basis function centered on a point with as many dimensions as predictor variables. The spread (radius) of the RBF function may be different for each dimension. The centers and spreads
The value coming out of a neuron in the hidden layer is multiplied by a weight associated with the neuron and adds to the weighted values of other neurons. This sum becomes the output. For classification problems, one output is produced (with a separate set of weights and summation unit) for each
cells), as a cascading model for use in pattern recognition tasks. Local features are extracted by S-cells whose deformation is tolerated by C-cells. Local features in the input are integrated gradually and classified at higher layers. Among the various kinds of neocognitron are systems that can
tasks, generalization and pattern recognition with changeable attention. Dynamic search localization is central to biological memory. In visual perception, humans focus on specific objects in a pattern. Humans can change focus from object to object without learning. HAM can mimic this ability by
The nearest neighbor classification performed for this example depends on how many neighboring points are considered. If 1-NN is used and the closest point is negative, then the new point should be classified as negative. Alternatively, if 9-NN classification is used and the closest 9 points are
A DBN can be used to generatively pre-train a deep neural network (DNN) by using the learned DBN weights as the initial DNN weights. Various discriminative algorithms can then tune these weights. This is particularly helpful when training data are limited, because poorly initialized weights can
An RBF network positions neurons in the space described by the predictor variables (x,y in this example). This space has as many dimensions as predictor variables. The Euclidean distance is computed from the new point to the center of each neuron, and a radial basis function (RBF, also called a
An online hybrid between BPTT and RTRL with intermediate complexity exists, with variants for continuous time. A major problem with gradient descent for standard RNN architectures is that error gradients vanish exponentially quickly with the size of the time lag between important events.
RBF networks have the advantage of avoiding local minima in the same way as multi-layer perceptrons. This is because the only parameters that are adjusted in the learning process are the linear mapping from hidden layer to output layer. Linearity ensures that the error surface is quadratic and
CNNs are suitable for processing visual and other two-dimensional data. They have shown superior results in both image and speech applications. They can be trained with standard backpropagation. CNNs are easier to train than other regular, deep, feed-forward neural networks and have many fewer
Unlike static neural networks, dynamic neural networks adapt their structure and/or parameters to the input during inference showing time-dependent behaviour, such as transient phenomena and delay effects. Dynamic neural networks in which the parameters may change over time are related to the
Simple recurrent networks have three layers, with the addition of a set of "context units" in the input layer. These units connect from the hidden layer or the output layer with a fixed weight of one. At each time step, the input is propagated in a standard feedforward fashion, and then a
The feedforward neural network was the first and simplest type. In this network the information moves only from the input layer directly through any hidden layers to the output layer without cycles/loops. Feedforward networks can be constructed with various types of units, such as binary
creating explicit representations for focus. It uses a bi-modal representation of pattern and a hologram-like complex spherical weight state-space. HAMs are useful for optical realization because the underlying hyper-spherical computations can be implemented with optical computation.
to find cluster centers which are then used as the centers for the RBF functions. However, K-means clustering is computationally intensive and it often does not generate the optimal number of centers. Another approach is to use a random subset of the training points as the centers.
This provides a better representation, allowing faster learning and more accurate classification with high-dimensional data. However, these architectures are poor at learning novel classes with few examples, because all network units are involved in representing the input (a distributed representation).

An autoencoder, autoassociator or Diabolo network is similar to the multilayer perceptron (MLP) – with an input layer, an output layer and one or more hidden layers connecting them. However, the output layer has the same number of units as the input layer. Its purpose is to reconstruct its own inputs (instead of emitting a target value). Therefore, autoencoders are unsupervised learning models.
The computation of the optimal weights between the neurons in the hidden layer and the summation layer is done using ridge regression. An iterative procedure computes the optimal regularization Lambda parameter that minimizes the generalized cross-validation (GCV) error.
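The following sketch is not that tool's actual routine; it only shows, under assumed stand-in activations and targets, how a ridge-regression fit of the hidden-to-output weights with a GCV-based choice of the regularization parameter could look (a simple grid search replaces the iterative procedure):

```python
import numpy as np

rng = np.random.default_rng(0)
Phi = rng.normal(size=(100, 10))                             # hidden-layer activations (n x m), invented
y = Phi @ rng.normal(size=10) + 0.1 * rng.normal(size=100)   # invented targets

def gcv(lmbda, Phi, y):
    """Generalized cross-validation error of the ridge fit for one lambda."""
    n, m = Phi.shape
    A = Phi.T @ Phi + lmbda * np.eye(m)
    H = Phi @ np.linalg.solve(A, Phi.T)      # hat matrix mapping y to fitted values
    resid = y - H @ y
    return n * np.sum(resid ** 2) / (n - np.trace(H)) ** 2

lambdas = 10.0 ** np.arange(-6, 3)
best = min(lambdas, key=lambda l: gcv(l, Phi, y))            # lambda minimizing GCV
w = np.linalg.solve(Phi.T @ Phi + best * np.eye(Phi.shape[1]), Phi.T @ y)
print("lambda:", best, "weights:", w)
```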
vectors stored in memory cells and registers. Thus, the model is fully differentiable and trains end-to-end. The key characteristic of these models is that their depth, the size of their short-term memory, and the number of parameters can be altered independently.
Quote: "..."Since the 1980s, many neuroscientists believed they possessed the key for finally beginning to understand the workings of the brain. But we have provided strong evidence to suggest that the brain may not encode information using precise patterns of
each RBF in the 'hidden' layer. The RBF chosen is usually a Gaussian. In regression problems the output layer is a linear combination of hidden layer values representing mean predicted output. The interpretation of this output layer value is the same as a
A probabilistic neural network (PNN) is a four-layer feedforward neural network. The layers are input, pattern, summation, and output. In the PNN algorithm, the parent probability distribution function (PDF) of each class is approximated by a Parzen window.
A more straightforward way to use kernel machines for deep learning was developed for spoken language understanding. The main idea is to use a kernel machine to approximate a shallow neural net with an infinite number of hidden units, then use a
therefore has a single easily found minimum. In regression problems this can be found in one matrix operation. In classification problems the fixed non-linearity introduced by the sigmoid output function is most efficiently dealt with using
Y. Han, G. Huang, S. Song, L. Yang, H. Wang and Y. Wang, "Dynamic Neural Networks: A Survey," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 11, pp. 7436-7456, 1 Nov. 2022, doi: 10.1109/TPAMI.2021.3117837.
the input space that are irrelevant to the task. A common solution is to associate each data point with its own centre, although this can expand the linear system to be solved in the final layer and requires shrinkage techniques to avoid
For a DBM with three hidden layers:

\[ P(\nu, h^{1}, h^{2} \mid h^{3}) = \frac{1}{Z(\psi, h^{3})} \exp\!\left( \sum_{ij} W_{ij}^{(1)} \nu_i h_j^{1} + \sum_{j\ell} W_{j\ell}^{(2)} h_j^{1} h_\ell^{2} + \sum_{\ell m} W_{\ell m}^{(3)} h_\ell^{2} h_m^{3} \right), \]

\[ p(\boldsymbol{\nu}, \psi) = \frac{1}{Z} \sum_{h} \exp\!\left( \sum_{ij} W_{ij}^{(1)} \nu_i h_j^{1} + \sum_{j\ell} W_{j\ell}^{(2)} h_j^{1} h_\ell^{2} + \sum_{\ell m} W_{\ell m}^{(3)} h_\ell^{2} h_m^{3} \right). \]
Assume that each case in a training set has two predictor variables, x and y, and the target variable has two categories, positive and negative. Given a new case with predictor values x=6, y=5.1, how is the target variable computed?
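An illustrative sketch of that decision with made-up training points; only the query at x=6, y=5.1 comes from the example, while the clusters, counts and labels are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
pos = rng.normal([7.0, 4.5], 1.0, size=(20, 2))     # invented "positive" cases
neg = rng.normal([5.5, 5.5], 1.0, size=(20, 2))     # invented "negative" cases
X = np.vstack([pos, neg])
labels = np.array(["positive"] * 20 + ["negative"] * 20)

query = np.array([6.0, 5.1])                        # the new case from the example
dist = np.linalg.norm(X - query, axis=1)            # Euclidean distance to every training case

for k in (1, 9):
    nearest = labels[np.argsort(dist)[:k]]          # labels of the k closest points
    vals, counts = np.unique(nearest, return_counts=True)
    print(f"{k}-NN vote:", vals[np.argmax(counts)]) # majority label among the k neighbours
```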
and a non-parametric function. Then, using PDF of each class, the class probability of a new input is estimated and Bayes’ rule is employed to allocate it to the class with the highest posterior probability. It was derived from the
significantly hinder learning. These pre-trained weights end up in a region of the weight space that is closer to the optimal weights than random choices. This allows for both improved modeling and faster ultimate convergence.
to form the expanded input for the next block. Thus, the input to the first block contains the original data only, while downstream blocks' input adds the output of preceding blocks. Then learning the upper-layer weight matrix
Quote: "..."Our work implies that the brain mechanisms for forming these kinds of associations might be extremely similar in snails and higher organisms...We don't fully understand even very simple kinds of learning in these
The radial basis function for a neuron has a center and a radius (also called a spread). The radius may be different for each neuron, and, in RBF networks generated by DTREG, the radius may be different in each dimension.
generalized from abstract concepts flowing through the model layers, which is able to synthesize new examples in novel classes that look "reasonably" natural. All the levels are learned jointly by maximizing a joint log-probability score.
Williams, R. J. (1989). Complexity of exact gradient computation algorithms for recurrent neural networks. Technical Report Technical Report NU-CCS-89-27 (Report). Boston: Northeastern University, College of Computer
where the input and output are written sentences in two natural languages. In that work, an LSTM RNN or CNN was used as an encoder to summarize a source sentence, and the summary was decoded using a conditional RNN language model.
Deep neural networks can be potentially improved by deepening and parameter reduction, while maintaining trainability. While training extremely deep (e.g., 1 million layers) neural networks might not be practical,
A set of neurons learn to map points in an input space to coordinates in an output space. The input space can have different dimensions and topology from the output space, and SOM attempts to preserve these.
method, except that the necessary variety of machines in the committee is obtained by training from different starting weights rather than training on different randomly selected subsets of the training data.
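A toy sketch of such a committee: identical single-layer networks are trained from different random starting weights on the same data and combined by majority vote. The data, model size and training loop here are invented for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)        # invented binary targets

def train_member(seed, epochs=200, lr=0.1):
    """Same architecture and training for every member; only the init differs."""
    r = np.random.default_rng(seed)
    w, b = r.normal(size=2), 0.0                 # different starting weights per member
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # logistic output
        grad_w, grad_b = X.T @ (p - y) / len(y), np.mean(p - y)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

members = [train_member(s) for s in range(5)]
x_new = np.array([0.3, -0.1])
votes = [int(x_new @ w + b > 0) for w, b in members]
print("votes:", votes, "committee decision:", int(np.mean(votes) > 0.5))
```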
from time-varying observations using a linear dynamical model. Then, a pooling strategy is used to learn invariant feature representations. These units compose to form a deep architecture and are trained by
Learning vector quantization (LVQ) can be interpreted as a neural network architecture. Prototypical representatives of the classes, together with an appropriate distance measure, parameterize a distance-based classification scheme.
Holographic Associative Memory (HAM) is an analog, correlation-based, associative, stimulus-response system. Information is mapped onto the phase orientation of complex numbers. The memory is effective for
A deep stacking network (DSN) (deep convex network) is based on a hierarchy of blocks of simplified neural network modules. It was introduced in 2011 by Deng and Yu. It formulates the learning as a
Szegedy, Christian; Liu, Wei; Jia, Yangqing; Sermanet, Pierre; Reed, Scott E.; Anguelov, Dragomir; Erhan, Dumitru; Vanhoucke, Vincent; Rabinovich, Andrew (2015). "Going deeper with convolutions".
It works even with long delays between inputs and can handle signals that mix low- and high-frequency components. LSTM RNN outperformed other RNNs and other sequence learning methods such as HMM.
The echo state network (ESN) employs a sparsely connected random hidden layer. The weights of output neurons are the only part of the network that are trained. ESN are good at reproducing certain
kernel function) is applied to the distance to compute the weight (influence) for each neuron. The radial basis function is so named because the radius distance is the argument to the function.
The fixed back connections leave a copy of the previous values of the hidden units in the context units (since they propagate over the connections before the learning rule is applied).
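A minimal sketch of one such simple-recurrent (Elman-style) time step, with arbitrary sizes and random weights assumed: the context units receive a copy of the previous hidden state and feed it back with fixed weight one on the next step:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
W_in = rng.normal(size=(n_hidden, n_in))      # input-to-hidden weights (trainable)
W_ctx = rng.normal(size=(n_hidden, n_hidden)) # context-to-hidden weights (trainable)
context = np.zeros(n_hidden)                  # context units start empty

for t, x in enumerate(rng.normal(size=(5, n_in))):   # a short invented input sequence
    hidden = np.tanh(W_in @ x + W_ctx @ context)     # standard feedforward pass
    context = hidden.copy()                          # copied back with fixed weight one
    print(f"t={t}", np.round(hidden, 3))
```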
Larochelle, Hugo; Erhan, Dumitru; Courville, Aaron; Bergstra, James; Bengio, Yoshua (2007). "An empirical evaluation of deep architectures on problems with many factors of variation".
Quote: "... "It's amazing that after a hundred years of modern neuroscience research, we still don't know the basic information processing functions of a neuron," said Bartlett Mel..."
Graves, Alex; Wayne, Greg; Reynolds, Malcolm; Harley, Tim; Danihelka, Ivo; Grabska-Barwińska, Agnieszka; Colmenarejo, Sergio Gómez; Grefenstette, Edward; Ramalho, Tiago (2016-10-12).
726:. SVMs outperform RBF networks in most classification applications. In regression applications they can be competitive when the dimensionality of the input space is relatively small. 7702: 2193: 1492:
theory. HTM is a method for discovering and inferring the high-level causes of observed input patterns and sequences, thus building an increasingly complex model of the world.
independent of sequence position. In order to achieve time-shift invariance, delays are added to the input so that multiple data points (points in time) are analyzed together.
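A small sketch of the delay idea: the current sample is stacked with a few delayed copies so the feedforward part sees several points in time at once, which is what makes the learned features insensitive to small time shifts. The signal and window length are arbitrary:

```python
import numpy as np

signal = np.sin(np.linspace(0.0, 6.0, 50))        # a toy 1-D input sequence
delays = 4                                        # number of delayed copies (assumed)

# each row is one multi-point input presented to the feedforward network
windows = np.stack([signal[t - delays:t + 1] for t in range(delays, len(signal))])
print(windows.shape)                              # (46, 5): 46 positions, 5 time points each
```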
Neural Turing machines (NTM) couple LSTM networks to external memory resources, with which they can interact by attentional processes. The combined system is analogous to a
to splice the output of the kernel machine and the raw input in building the next, higher level of the kernel machine. The number of levels in the deep convex network is a
3648:"Multi-layered GMDH-type neural network self-selecting optimum neural network architecture and its application to 3-dimensional medical image recognition of blood vessels" 1577:. Preliminary results demonstrate that neural Turing machines can infer simple algorithms such as copying, sorting and associative recall from input and output examples. 477: 3318: 3181: 2553: 2284: 3072: 290:(CapsNet) add structures called capsules to a CNN and reuse output from several capsules to form more stable (with respect to various perturbations) representations. 993:
is occasionally used to evaluate performance, which influences its input stream through output units connected to actuators that affect the environment. Variants of
574:
representation. The structure of the hierarchy of this kind of architecture makes parallel learning straightforward, as a batch-mode optimization problem. In purely
3284: 3119: 3092: 2651: 1724:
SNN and the temporal correlations of neural assemblies in such networks—have been used to model figure/ground separation and region linking in the visual system.
3022:
DPCNs predict the representation of the layer, by using a top-down approach using the information in upper layer and temporal dependencies from previous states.
1045:(hidden units). Boltzmann machine learning was at first slow to simulate, but the contrastive divergence algorithm speeds up training for Boltzmann machines and 6929: 3142: 4351:
van den Oord, Aaron; Dieleman, Sander; Schrauwen, Benjamin (2013-01-01). Burges, C. J. C.; Bottou, L.; Welling, M.; Ghahramani, Z.; Weinberger, K. Q. (eds.).
7099: 5880: 5764: 5182: 5174: 53:
only some resemblance to their more complex biological counterparts, but are very effective at their intended tasks (e.g. classification or segmentation).
Self-referential RNNs with special output units for addressing and rapidly manipulating the RNN's own weights in differentiable fashion (internal storage)
5178: 3405: 396: 5203:; Nachtschlaeger, T.; Markram, H. (2002). "Real-time computing without stable states: A new framework for neural computation based on perturbations". 1380:. Embedding an FIS in a general structure of an ANN has the benefit of using available ANN training methods to find the parameters of a fuzzy system. 7154: 4030:
Fukushima, K. (1980). "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position".
HTM combines existing ideas to mimic the neocortex with a simple design that provides many capabilities. HTM combines and extends approaches used in
Recurrent neural networks (RNN) propagate data forward, but also backwards, from later processing stages to earlier stages. RNNs can be used as general sequence processors.
The Group Method of Data Handling (GMDH) features fully automatic structural and parametric model optimization. The node activation functions are
5200: 3037:
Multilayer kernel machines (MKM) are a way of learning highly nonlinear functions by iterative application of weakly nonlinear kernels. They use
in the body of an artificial neural network. Depending on the FIS type, several layers simulate the processes involved in a fuzzy inference-like
Morer I, Cardillo A, Díaz-Guilera A, Prignano L, Lozano S (2020). "Comparing spatial networks: a one-size-fits-all efficiency-driven approach".
Schmidhuber, Juergen; Courville, Aaron; Bengio, Yoshua (2015). "Describing Multimedia Content using Attention-based Encoder—Decoder Networks".
The long-term memory can be read and written to, with the goal of using it for prediction. These models have been applied in the context of question answering.
can be used to change each weight in proportion to its derivative with respect to the error, provided the non-linear activation functions are differentiable.
Lee, Honglak; Grosse, Roger (2009). "Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations".
Biological studies have shown that the human brain operates as a collection of small networks. This realization gave birth to the concept of
s is done in batch mode, to allow parallelization. Parallelization allows scaling the design to larger (deeper) architectures and data sets.
Dahl, G.; Yu, D.; Deng, L.; Acero, A. (2012). "Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition".
3914: 245:
A convolutional neural network (CNN, or ConvNet or shift invariant or space invariant) is a class of deep network, composed of one or more
Fukushima, Kunihiko (1987). "A hierarchical neural network model for selective attention". In Eckmiller, R.; Von der Malsburg, C. (eds.).
Schmidhuber, J. (1992). "A fixed size storage O(n3) time complexity learning algorithm for fully recurrent continually running networks".
outputs are the predictions of the teacher-given target signals. This technique proved to be especially useful when combined with LSTM.
Graves, A.; Schmidhuber, J. (2005). "Framewise phoneme classification with bidirectional LSTM and other neural network architectures".
The value for the new point is found by summing the output values of the RBF functions multiplied by weights computed for each neuron.
Limiting the degree of freedom reduces the number of parameters to learn, facilitating learning of new classes from few examples.
that operates on 1000-bit addresses, semantic hashing works on 32 or 64-bit addresses found in a conventional computer architecture.
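A sketch of the retrieval step only, assuming documents have already been mapped to short binary codes (random 32-bit stand-ins here): the neighbours of a query document are the stored addresses within a small Hamming distance of its code:

```python
import numpy as np

rng = np.random.default_rng(0)
addresses = rng.integers(0, 2, size=(1000, 32))     # one invented 32-bit code per document
query = addresses[42] ^ (rng.random(32) < 0.05)     # a slightly perturbed copy of one code

hamming = np.sum(addresses != query, axis=1)        # number of differing bits per document
neighbours = np.where(hamming <= 2)[0]              # documents whose address differs by at most 2 bits
print(neighbours)
```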
A physical neural network includes electrically adjustable resistance material to simulate artificial synapses. Examples include the
or BPTT, a generalization of back-propagation for feedforward networks. A more computationally expensive online variant is called "Real-Time Recurrent Learning" or RTRL.
of a linear combination of hidden layer values, representing a posterior probability. Performance in both cases is often improved by
Gupta J, Molnar C, Xie Y, Knight J, Shekhar S (2021). "Spatial variability aware deep neural networks (SVANN): a general approach".
Spiking neural networks with axonal conduction delays exhibit polychronization, and hence could have a very large memory capacity.
in classical statistics. This corresponds to a prior belief in small parameter values (and therefore smooth output functions) in a
As distinct from conventional neural networks, a stochastic artificial neural network is used as an approximation to random functions.
1692: 1389: 1873:
Compound HD architectures aim to integrate characteristics of both HB and deep networks. The compound HDP-DBM architecture is a
Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. Learning Internal Representations by Error Propagation (Report).
3434: 1213: 1184: 796:, N-1 neurons are used where N is the number of categories. The input neurons standardizes the value ranges by subtracting the 6388:
Schmidhuber, Juergen (2014). "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation".
5604: 7776: 7422: 6692: 4441: 4388: 4265: 700: 5664: 1706:(SNN) explicitly consider the timing of inputs. The network input and output are usually represented as a series of spikes ( 1394:
Compositional pattern-producing networks (CPPNs) are a variation of artificial neural networks which differ in their set of
Regulatory feedback networks started as a model to explain brain phenomena found during recognition including network-wide
3197: 1629:-like architectures such as pointer networks and neural random-access machines overcome this limitation by using external 7791: 4095: 3038: 203: 253:. In particular, max-pooling. It is often structured via Fukushima's convolutional architecture. They are variations of 7786: 1679:
to produce the translation. These systems share building blocks: gated RNNs and CNNs and trained attention mechanisms.
represents a conditional DBM model, which can be viewed as a two-layer DBM but with bias terms given by the states of
7781: 7292: 7074: 6940: 5614: 4810: 4215: 3333: 1545:
Memory networks where the control network's external differentiable storage is in the fast weights of another network
question answering (QA), where the long-term memory effectively acts as a (dynamic) knowledge base and the output is a textual response.
Kemp, Charles; Perfors, Amy; Tenenbaum, Joshua (2007). "Learning overhypotheses with hierarchical Bayesian models".
Nasution, B.B.; Khan, A.I. (February 2008). "A Hierarchical Graph Neuron Scheme for Real-Time Pattern Recognition".
restricted Boltzmann machines (RBM) with fully connected visible and hidden units. Note there are no hidden-hidden or visible-visible connections.
target category. The value output for a category is the probability that the case being evaluated has that category.
Vincent, Pascal; Larochelle, Hugo (2008). "Extracting and composing robust features with denoising autoencoders".
mechanism is trained to map the reservoir to the desired output. Training is performed only at the readout stage.
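A minimal echo-state-style sketch under arbitrary choices of size, sparsity and scaling: the reservoir weights are fixed and random, and only the linear readout is fitted, here by ordinary least squares on a toy next-value prediction task:

```python
import numpy as np

rng = np.random.default_rng(0)
n_res, T = 100, 500
W_in = rng.normal(scale=0.5, size=(n_res, 1))                 # fixed random input weights
W = rng.normal(size=(n_res, n_res)) * (rng.random((n_res, n_res)) < 0.1)  # sparse reservoir
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))               # keep spectral radius below 1

u = np.sin(np.linspace(0, 20, T))                             # invented input signal
target = np.roll(u, -1)                                       # predict the next value
states = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    x = np.tanh(W_in[:, 0] * u[t] + W @ x)                    # reservoir update (never trained)
    states[t] = x

# only the readout weights are learned
W_out, *_ = np.linalg.lstsq(states[:-1], target[:-1], rcond=None)
print("train MSE:", np.mean((states[:-1] @ W_out - target[:-1]) ** 2))
```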
This architecture is a DSN extension. It offers two important improvements: it uses higher-order information from
5768: 3370: 1751: 1041:
The Boltzmann machine can be thought of as a noisy Hopfield network. It is one of the first neural networks to demonstrate learning of latent variables (hidden units).
considered, then the effect of the surrounding 8 positive points may outweigh the closest 9-th (negative) point.
42: 3992: 3946:"Parallel distributed processing model with local space-invariant interconnections and its optical architecture" 3945: 7205: 6078: 4352: 2379: 1875: 1646: 1580: 1512: 1204:
Hierarchical RNN connects elements in various ways to decompose hierarchical behavior into useful subprograms.
4682:
Mohamed, Abdel-rahman; Dahl, George; Hinton, Geoffrey (2012). "Acoustic Modeling Using Deep Belief Networks".
3589: 3585: 3461: 3337: 2274:{\displaystyle {\boldsymbol {h}}=\{{\boldsymbol {h}}^{(1)},{\boldsymbol {h}}^{(2)},{\boldsymbol {h}}^{(3)}\}} 1604: 1499:, spatial and temporal clustering algorithms, while using a tree-shaped hierarchy of nodes that is common in 735: 669: 111: 5982:
Hochreiter, Sepp; Younger, A. Steven; Conwell, Peter R. (2001). "Learning to Learn Using Gradient Descent".
4586:
Hinton, Geoffrey; Salakhutdinov, Ruslan (2006). "Reducing the Dimensionality of Data with Neural Networks".
4170: 6875: 5513:
Schmidhuber, J. (1992). "Learning complex, extended sequences using the principle of history compression".
3410: 3121: 3026: 1800: 1600: 1471: 1430: 1081: 1076: 963: 959: 894: 872: 357:, and the output layer has linear units. Connections between these layers are represented by weight matrix 240: 187: 137: 7169: 7241: 6128: 5287:
Jaeger, H.; Harnessing (2004). "Predicting chaotic systems and saving energy in wireless communication".
3451: 3400: 3380: 1852: 1489: 322: 310: 5396:. Advances in Neural Information Processing Systems 22, NIPS'22. Vancouver: MIT Press. pp. 545–552. 4371:
Collobert, Ronan; Weston, Jason (2008-01-01). "A unified architecture for natural language processing".
3320:
with which the classifier has reached the lowest error rate determines the number of features to retain.
1714:(signals that vary over time). They are often implemented as recurrent networks. SNN are also a form of 5854:
Learning Context Free Grammars: Limitations of a Recurrent Neural Network with an External Stack Memory
5671: 4240:
IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7–12, 2015
3003: 1816: 1739: 1434: 1168: 1020: 621: 571: 221: 87: 80: 7613:
Scholkopf, B; Smola, Alexander (1998). "Nonlinear component analysis as a kernel eigenvalue problem".
6054: 5966: 5901:
Schmidhuber, J. (1992). "Learning to control fast-weight memories: An alternative to recurrent nets".
5030: 220:
A time delay neural network (TDNN) is a feedforward architecture for sequential data that recognizes
7009: 4932:
Schmidhuber, J. (1989). "A local learning algorithm for dynamic feedforward and recurrent networks".
3890: 3476: 3350: 2994:
coding scheme that uses top-down information to empirically adjust the priors needed for a bottom-up
2370:
are the model parameters, representing visible-hidden and hidden-hidden symmetric interaction terms.
1613: 1426: 1312: 1264:
different initial weights often gives vastly different results. A CoM tends to stabilize the result.
723: 249:
layers with fully connected layers (matching those in typical ANNs) on top. It uses tied weights and
215: 194: 38: 30: 7683: 7452: 7330: 7275: 7057: 7033: 6971: 5474: 5429: 5309: 4696: 4651: 4520: 4198: 3693: 553:{\displaystyle \min _{U^{T}}f=\|{\boldsymbol {U}}^{T}{\boldsymbol {H}}-{\boldsymbol {T}}\|_{F}^{2},} 7627: 7550: 7505: 7375: 6889: 6215:
Atkeson, Christopher G.; Schaal, Stefan (1995). "Memory-based neural networks for robot learning".
4793: 4418: 3857: 3145: 1832: 1650: 1519: 994: 927: 922: 598:
from each of two distinct sets of hidden units in the same layer to predictions, via a third-order
171: 5555: 1742:
for representing and predicting geographic phenomena. They generally improve both the statistical
5996: 4282: 3918: 3565: 3544: 3523: 3503: 2363:{\displaystyle \psi =\{{\boldsymbol {W}}^{(1)},{\boldsymbol {W}}^{(2)},{\boldsymbol {W}}^{(3)}\}} 1747: 1596: 1451: 1295: 1246: 955: 594:
of a lower-layer to a convex sub-problem of an upper-layer. TDSNs use covariance statistics in a
68: 7415:
Proceedings of the 28th International Conference on International Conference on Machine Learning
5825:
Sutherland, John G. (1 January 1990). "A holographic model of memory, learning and expression".
4011: 1542:
Differentiable push and pop actions for alternative memory networks called neural stack machines
7622: 7545: 7500: 7447: 7370: 7325: 7270: 7052: 6884: 6243: 5991: 5469: 5424: 5304: 4788: 4691: 4646: 4515: 4193: 3688: 3446: 3385: 3360: 3296: 3159: 1824: 1743: 1733: 1703: 1584: 1538:(LSTM), other approaches also added differentiable memory to recurrent functions. For example: 1535: 1308: 1240: 1164: 1158: 978: 971: 715: 684: 653: 341: 326: 287: 45: 7703:"Use of Kernel Deep Convex Networks and End-To-End Learning for Spoken Language Understanding" 7659: 6682: 6276:
Le, Quoc V.; Mikolov, Tomas (2014). "Distributed representations of sentences and documents".
4904:"Gradient-based learning algorithms for recurrent networks and their computational complexity" 3733: 2522: 227:
It usually forms part of a larger pattern recognition system. It has been implemented using a
7670: 7020: 7010:"Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning" 6958: 6190: 3420: 3329: 3051: 3042: 3012: 1836: 1668: 1642: 1634: 1564: 1332:
architecture (1987), where one neural network outputs the weights of another neural network.
1272: 1064: 254: 163: 159: 155: 5962: 7745: 7536:
Ruslan, Salakhutdinov; Joshua, Tenenbaum (2012). "Learning with Hierarchical-Deep Models".
6821: 6645: 6545: 6428: 6140: 5869:"A connectionist symbol manipulator that discovers the structure of context-free languages" 5742:
Schmidhuber, Juergen (2015). "Large-scale Simple Question Answering with Memory Networks".
5416: 5296: 5259: 5155: 4755: 4595: 4297: 4188:
Hinton, Geoffrey E.; Krizhevsky, Alex; Wang, Sida D. (2011), "Transforming Auto-Encoders",
3957: 3482: 3456: 3262: 3097: 3077: 2991: 2629: 2445: 1630: 1128: 906: 793: 575: 231:
network whose connection weights were trained with back propagation (supervised learning).
7698: 5965:(1993). "An introspective network that can learn to run its own weight change algorithm". 4125:
LeCun, et al. (1989). "Backpropagation Applied to Handwritten Zip Code Recognition".
652:. The feedback is used to find the optimal activation of units. It is most similar to a 8: 6104:"DeepMind's differentiable neural computer helps you navigate the subway with its memory" 6055:"DeepMind's AI learned to ride the London Underground using human-like reason and memory" 4886:
The utility driven dynamic error propagation network. Technical Report CUED/F-INFENG/TR.1
4283:"Convolutional Neural Network-Based Robot Navigation Using Uncalibrated Spherical Images" 3673: 3466: 3415: 3149: 1820: 1796: 1792: 1672: 1395: 1341: 1172: 1106: 1058: 943: 884: 804:
range. The input neurons then feed the values to each of the neurons in the hidden layer.
801: 692: 676: 625: 591: 567: 334: 258: 129: 125: 95: 34: 7749: 7135: 6825: 6649: 6549: 6432: 6144: 5420: 5300: 5263: 4759: 4599: 4301: 3961: 3152:
selects the best informative features among features extracted by KPCA. The process is:
7640: 7593: 7571: 7518: 7473: 7298: 7080: 6910: 6845: 6734: 6709: 6663: 6614: 6579: 6518: 6444: 6418: 6389: 6368: 6319: 6298: 6277: 6172: 6033: 6012: 5918: 5807: 5743: 5722: 5701: 5530: 5495: 5442: 5330: 5228: 5136: 5053: 4999: 4949: 4914: 4866: 4816: 4709: 4664: 4619: 4568: 4488: 4447: 4394: 4328: 4243: 4221: 4055: 3794: 3714: 3628: 3590:"The group method of data handling – a rival of the method of stochastic approximation" 3254: 3184: 3127: 2516: 2374: 1888: 1828: 1638: 1419: 1140: 1046: 845: 657: 471:
given other weights in the network can be formulated as a convex optimization problem:
306: 6806: 5549:"Dynamic Representation of Movement Primitives in an Evolved Recurrent Neural Network" 5407:
Schuster, Mike; Paliwal, Kuldip K. (1997). "Bidirectional recurrent neural networks".
4529: 4426:
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
3019:
such that the states at any layer depend only on the preceding and succeeding layers.
1398:
and how they are applied. While typical artificial neural networks often contain only
7563: 7465: 7418: 7388: 7343: 7339: 7302: 7288: 7070: 6902: 6867: 6837: 6739: 6688: 6667: 6618: 6571: 6510: 6228: 6176: 6164: 6156: 5937: 5799: 5610: 5487: 5369: 5322: 5220: 5128: 4851: 4806: 4611: 4560: 4437: 4384: 4333: 4315: 4261: 4211: 4059: 4047: 3973: 3829: 3718: 3706: 3647: 3395: 1867: 1848: 1755: 1496: 1403: 1258: 1228: 1111:
Reservoir computing is a computation framework that may be viewed as an extension of
1038: 1032: 649: 451: 354: 330: 261:. This architecture allows CNNs to take advantage of the 2D structure of input data. 67:
Neural networks can be hardware- (neurons are represented by physical components) or
7522: 7084: 6583: 6522: 5922: 5811: 5534: 5446: 5088:"Gradient flow in recurrent nets: the difficulty of learning long-term dependencies" 5057: 5003: 4953: 4918: 4903: 4885: 4870: 4836:"Generalization of backpropagation with application to a recurrent gas market model" 4820: 4668: 4451: 3632: 7753: 7644: 7632: 7575: 7555: 7510: 7477: 7457: 7408:"The Hierarchical Beta Process for Convolutional Factor Analysis and Deep Learning" 7380: 7335: 7280: 7062: 6914: 6894: 6849: 6829: 6729: 6725: 6721: 6653: 6606: 6561: 6553: 6502: 6448: 6436: 6258: 6224: 6148: 5968:
Proceedings of the International Conference on Artificial Neural Networks, Brighton
5910: 5834: 5791: 5522: 5499: 5479: 5434: 5388: 5361: 5350:"LSTM recurrent networks learn simple context free and context sensitive languages" 5334: 5314: 5267: 5232: 5212: 5140: 5120: 5045: 4991: 4941: 4847: 4798: 4763: 4713: 4701: 4656: 4623: 4603: 4552: 4525: 4480: 4429: 4398: 4376: 4323: 4305: 4253: 4225: 4203: 4134: 4039: 3965: 3775: 3748: 3698: 3620: 3008: 2999: 1881: 1812: 1763: 1574: 1415: 1399: 1369: 1348: 1268: 1116: 1094: 1016: 1012: 1006: 990: 982: 951: 902: 840:
The weights applied to the RBF function outputs as they pass to the summation layer
714:
Associating each input datum with an RBF leads naturally to kernel methods such as
688: 680: 605:
While parallelization and scalability are not considered seriously in conventional
351: 298: 199: 175: 167: 61: 6658: 6633: 4572: 4484: 792:
One neuron appears in the input layer for each predictor variable. In the case of
6863: 6103: 5675: 5483: 4492: 3993:"Shift-invariant pattern recognition neural network and its optical architecture" 3779: 3752: 3355: 1885: 1859: 1608: 1500: 1377: 1175:
in applications such as language learning and connected handwriting recognition.
1042: 986: 777:
With larger spread, neurons at a distance from a point have a greater influence.
719: 437:{\displaystyle {\boldsymbol {H}}=\sigma ({\boldsymbol {W}}^{T}{\boldsymbol {X}})} 302: 269: 99: 57: 7384: 6898: 6557: 6471: 4433: 4207: 3545:"UCLA Neuroscientist Gains Insights Into Human Brain From Study Of Marine Snail" 1791:
detect multiple patterns in the same input by using back propagation to achieve
71:(computer models), and can use a variety of topologies and learning algorithms. 48:
that are generally unknown. Particularly, they are inspired by the behaviour of
7636: 7514: 7267:
Proceedings of the 25th international conference on Machine learning - ICML '08
6506: 6262: 5390:
Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks
5216: 5124: 5087: 4543:
Hutchinson, Brian; Deng, Li; Yu, Dong (2012). "Tensor deep stacking networks".
4373:
Proceedings of the 25th international conference on Machine learning - ICML '08
4192:, Lecture Notes in Computer Science, vol. 6791, Springer, pp. 44–51, 3524:"It's Only A Game Of Chance: Leading Theory Of Perception Called Into Question" 3471: 3430: 3287: 1715: 1707: 1676: 1570: 1455: 1438: 1112: 889: 133: 7758: 7733: 5914: 5838: 5526: 5272: 5247: 5075:(Diploma thesis) (in German). Munich: Institut f. Informatik, Technische Univ. 5049: 4995: 4945: 4768: 4743: 4705: 4660: 4257: 4138: 4103: 3624: 1227:
first and last node. The outputs from all the various scales are treated as a
844:
Various methods have been used to train RBF networks. One approach first uses
738:(k-NN) models. The basic idea is that similar inputs produce similar outputs. 7770: 7592:
Chalasani, Rakesh; Principe, Jose (2013). "Deep Predictive Coding Networks".
6985:
Larochelle, Hugo; Bengio, Yoshua; Louradour, Jerdme; Lamblin, Pascal (2009).
6440: 6160: 5349: 5187:
An overview of reservoir computing: theory, applications, and implementations
4319: 4019:. 4th International Conf. Computer Vision. Berlin, Germany. pp. 121–128. 3710: 3390: 1779: 1373: 641: 265: 250: 117: 7407: 7284: 7195:"Parsing Natural Scenes and Natural Language with Recursive Neural Networks" 7066: 6986: 6493:
Izhikevich EM (February 2006). "Polychronization: computation with spikes".
6011:
Schmidhuber, Juergen (2015). "Learning to Transduce with Unbounded Memory".
5795: 5641: 5318: 4802: 4607: 4470:"Deep Convex Net: A Scalable Architecture for Speech Pattern Classification" 4380: 1811:
Compound hierarchical-deep models compose deep networks with non-parametric
875:
but it is used for regression and approximation rather than classification.
7567: 7559: 7469: 7438:
Fei-Fei, Li; Fergus, Rob (2006). "One-shot learning of object categories".
7392: 7347: 7049:
Proceedings of the 26th Annual International Conference on Machine Learning
6906: 6841: 6743: 6575: 6514: 6340: 6168: 5803: 5491: 5373: 5326: 5224: 5189:. European Symposium on Artificial Neural Networks ESANN. pp. 471–482. 4615: 4564: 4556: 4337: 3977: 3815: 3793:
Diederik P Kingma; Welling, Max (2013). "Auto-Encoding Variational Bayes".
3016: 1787: 1775: 1766:. Examples of SNNs are the OSFA spatial neural networks, SVANNs and GWNNs. 1573:
but is differentiable end-to-end, allowing it to be efficiently trained by
1441:. However, the early controllers of such memories were not differentiable. 595: 273: 7461: 7361:
Xu, Fei; Tenenbaum, Joshua (2007). "Word learning as Bayesian inference".
5132: 4051: 1249:, in which several small networks cooperate or compete to solve problems. 679:
in statistics. In classification problems the output layer is typically a
3969: 3365: 1783: 1711: 1366: 1360: 1146: 708: 333:. Each DSN block is a simple module that is easy to train by itself in a 277: 246: 149: 6833: 6152: 4096:"Convolutional Neural Networks (LeNet) – DeepLearning 0.1 documentation" 3997:
Proceedings of Annual Conference of the Japan Society of Applied Physics
1894:
In a DBM with three hidden layers, the probability of a visible input ''
1671:
input to highly structured output. The approach arose in the context of
1667:
Encoder–decoder frameworks are based on neural networks that map highly
268:. Units respond to stimuli in a restricted region of space known as the 7491:
Rodriguez, Abel; Dunson, David (2008). "The Nested Dirichlet Process".
6566: 4156: 4152: 4043: 3702: 3425: 3375: 1863: 1583:(DNC) are an NTM extension. They out-performed Neural turing machines, 1231:
and the associated scores are used genetically for the next iteration.
587: 228: 121: 91: 6129:"Hybrid computing using a neural network with dynamic external memory" 5438: 5365: 4310: 3652:
International Journal of Innovative Computing, Information and Control
3542: 1880:
as a hierarchical model, incorporating DBM architecture. It is a full
1738:
Spatial neural networks (SNNs) constitute a supercategory of tailored
871:
A GRNN is an associative memory neural network that is similar to the
2995: 1481: 1477: 1476:
Hierarchical temporal memory (HTM) models some of the structural and
1344: 1316: 1304: 1200:
Recurrent neural network § Hierarchical recurrent neural network
294: 264:
Its unit connectivity pattern is inspired by the organization of the
120:
that permit additions and multiplications. It uses a deep multilayer
7202:
Proceedings of the 26th International Conference on Machine Learning
7107:
Proceedings of the 28th International Conference on Machine Learning
6680: 6610: 4835: 4785:
Proceedings of the 24th international conference on Machine learning
4281:
Ran, Lingyan; Zhang, Yanning; Zhang, Qilin; Yang, Tao (2017-06-12).
4074: 4013:
Learning recognition and segmentation of 3-D objects from 2-D images
1433:, the patterns encoded by neural networks are used as addresses for 154:
An autoencoder, autoassociator or Diabolo network is similar to the
6423: 6324: 6303: 6017: 5748: 5727: 5085: 3830:"Competitive probabilistic neural network (PDF Download Available)" 1778:
is a hierarchical, multilayered network that was modeled after the
1759: 637: 140:. The size and depth of the resulting network depends on the task. 7598: 6394: 6373: 6282: 6038: 5706: 5646:
Proceedings of the Annual Meeting of the Cognitive Science Society
5111:
Hochreiter, S.; Schmidhuber, J. (1997). "Long short-term memory".
5086:
Hochreiter, S.; Bengio, Y.; Frasconi, P.; Schmidhuber, J. (2001).
4248: 3799: 3611:
Ivakhnenko, A. G. (1971). "Polynomial Theory of Complex Systems".
3144:
output in the feature domain induced by the kernel. To reduce the
1607:
methods. Deep learning is useful in semantic hashing where a deep
1450:
parallel processing and is thus best suited for platforms such as
128:
network that grows layer by layer, where each layer is trained by
6710:"Receptive fields of single neurones in the cat's striate cortex" 6684:
Brain and visual perception: the story of a 25-year collaboration
6535: 6365:
Twenty-eighth Conference on Neural Information Processing Systems
5579: 3501: 1485: 1301: 827:
The following parameters are determined by the training process:
7140:
Advances in Neural Information Processing Systems 23 (NIPS 2010)
6984: 6464:"Spiking Neuron Models: Single Neurons, Populations, Plasticity" 5031:"Learning state space trajectories in recurrent neural networks" 4782: 4419:"Scalable stacking and learning for building deep architectures" 1529: 450:
are known at each stage. The function performs the element-wise
272:. Receptive fields partially overlap, over-covering the entire 3566:"Brain Communicates In Analog And Digital Modes Simultaneously" 3440: 1522: 834:
The coordinates of the center of each hidden-layer RBF function
797: 599: 49: 7538:
IEEE Transactions on Pattern Analysis and Machine Intelligence
7440:
IEEE Transactions on Pattern Analysis and Machine Intelligence
6318:
Schmidhuber, Juergen (2015). "Neural Random-Access Machines".
5173: 4545:
IEEE Transactions on Pattern Analysis and Machine Intelligence
3521: 3504:"Gray Matters: New Clues Into How Neurons Process Information" 1093:
backpropagation-like learning rule is applied (not performing
5860: 4350: 1459: 1437:, with "neurons" essentially serving as address encoders and 620:
The basic architecture is suitable for diverse tasks such as
6786: 5955: 5929: 4969:
Neural and Adaptive Systems: Fundamentals through Simulation
4190:
Artificial Neural Networks and Machine Learning – ICANN 2011
2373:
A learned DBM model is an undirected model that defines the
1710:
or more complex shapes). SNN can process information in the
1595:
Approaches that represent previous experiences directly and
7134:
Lin, Yuanqing; Zhang, Tong; Zhu, Shenghuo; Yu, Kai (2010).
5721:
Schmidhuber, Juergen (2015). "End-To-End Memory Networks".
4684:
IEEE Transactions on Audio, Speech, and Language Processing
4639:
IEEE Transactions on Audio, Speech, and Language Processing
3931: 1831:, deep coding networks, DBNs with sparse feature learning, 1782:. It uses multiple types of units, (originally two, called 454:
operation. Each block estimates the same final label class
7098:
Courville, Aaron; Bergstra, James; Bengio, Yoshua (2011).
6408: 5935: 4171:"Unsupervised Feature Learning and Deep Learning Tutorial" 1758:(e.g. spatial regression models) whenever the geo-spatial 1587:
systems and memory networks on sequence-processing tasks.
1214:
Artificial neural network § Stochastic neural network
905:
made up of multiple hidden layers. It can be considered a
837:
The radius (spread) of each RBF function in each dimension
6638:
International Journal of Geographical Information Science
6126: 6004: 5663:
Fahlman, Scott E.; Lebiere, Christian (August 29, 1991).
5199: 1626: 1115:. Typically an input signal is fed into a fixed (random) 660:
in that it mathematically emulates feedforward networks.
276:. Unit response can be approximated mathematically by a 206:. It is used for classification and pattern recognition. 6987:"Exploring Strategies for Training Deep Neural Networks" 6762: 6750: 6625: 5981: 5845: 5756: 4911:
Back-propagation: Theory, Architectures and Applications
3792: 981:
settings, no teacher provides target signals. Instead a
7097: 6032:
Schmidhuber, Juergen (2014). "Neural Turing Machines".
458:, and its estimate is concatenated with original input 444:. Modules are trained in order, so lower-layer weights 337:
fashion without backpropagation for the entire blocks.
7100:"Unsupervised Models of Images by Spike-and-Slab RBMs" 6805:
LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey (2015).
6774: 6599:
ACM Transactions on Intelligent Systems and Technology
5938:"Learning precise timing with LSTM recurrent networks" 4891:(Report). Cambridge University Engineering Department. 4787:. ICML '07. New York, NY, USA: ACM. pp. 473–480. 3045:
greedy layer-wise pre-training step of deep learning.
1123:
whose dynamics map the input to a higher dimension. A
7231:"Modeling Human Motion Using Binary Latent Variables" 6634:"A geographically weighted artificial neural network" 5975: 5856:. 14th Annual Conf. of the Cog. Sci. Soc. p. 79. 5110: 4237: 3543:
University Of California – Los Angeles (2004-12-14).
3299: 3265: 3200: 3162: 3130: 3100: 3080: 3054: 2662: 2632: 2564: 2525: 2453: 2382: 2287: 2196: 1912: 648:
A regulatory feedback network makes inferences using
480: 399: 363:
input-to-hidden-layer connections have weight matrix
7696: 7315: 6596: 6590: 6358:"Sequence to sequence learning with neural networks" 5095:
A Field Guide to Dynamical Recurrent Neural Networks
4731:. International Joint Conference on Neural Networks. 4187: 3813: 3245:{\displaystyle m_{\ell }\in \{1,\ldots ,n_{\ell }\}} 1806: 1612:
bits from the address of the query document. Unlike
860: 614: 293:
Examples of applications in computer vision include


