What are the default values for the number of nodes and internal (hidden) layers in a neural network model?
What is the default number of internal (hidden) layers and internal nodes when training a neural network?
My data has 62 observations and roughly 200 predictors, and the target variable has two classes. I have trained one neural network with a single internal layer containing one node, without repeats, and another with two internal layers containing 5 and 2 nodes respectively. I want to measure accuracy with the default values first and then optimise the model's performance.
What is the criterion for choosing the number of layers and internal nodes in a neural network? For a random forest, for example, we can set mtry to roughly the square root of the number of predictors.
machine-learning neural-network
asked Aug 12 '17 at 1:01 by KHAN irfan (edited Aug 12 '17 at 1:22)
I would worry about overfitting, given there are only 62 observations and 200 predictors. I suggest regularizing the network with an L1 or L2 penalty on the weights and dropout with a keep probability of 0.5.
– Vadim Smolyakov, Aug 15 '17 at 18:29
You are heading in the right direction. Please read this paper, which tries to answer your question. Welcome to the world of Neural Architecture Search (NAS).
– iDeepVision, yesterday
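To put the overfitting concern from the comments into numbers, here is a minimal sketch that counts the trainable parameters of the two architectures described in the question. It assumes plain fully connected layers with one bias per unit and a single output unit for the two-class target; the function name `count_parameters` is only illustrative, and none of this comes from the original posts.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected feedforward network.

    layer_sizes lists the width of every layer, inputs first and outputs last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # weights between two consecutive layers
        total += fan_out           # one bias per unit in the receiving layer
    return total


N_OBSERVATIONS = 62   # from the question
N_PREDICTORS = 200    # "roughly 200" in the question

# Assumption: a single output unit for the two-class target.
architectures = {
    "one hidden layer, 1 node": [N_PREDICTORS, 1, 1],
    "two hidden layers, 5 and 2 nodes": [N_PREDICTORS, 5, 2, 1],
}

for name, sizes in architectures.items():
    n_params = count_parameters(sizes)
    print(f"{name}: {n_params} parameters, "
          f"{n_params / N_OBSERVATIONS:.1f} per observation")
```

Under these assumptions the two networks have 203 and 1020 parameters, so even the smallest one has more parameters than the 62 observations available, which is why regularization is suggested.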
2 Answers
One potential approach is to design the neural network architecture, such as a Multi-Layer Perceptron (MLP), iteratively, as described in the following post:
https://stats.stackexchange.com/questions/238637/deep-neural-network-tuning-hyperparameters
We can restrict the search to 4-8 layers with 8-128 neurons per layer (powers of 2). In addition, we can assume the commonly recommended choices: ReLU activations with He normal weight initialization, and either Adam or SGD with Nesterov momentum as the optimizer.
To avoid overfitting on a small dataset, it is important to add L1 or L2 regularization (an L2 penalty is often called weight decay) and a dropout layer (e.g. with a keep probability of 0.5).
We can then use cross-validation with random search or Bayesian optimization to choose the best architecture, as described in the Cross Validated post above.
answered Aug 15 '17 at 19:46 by Vadim Smolyakov
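The random search over the restricted space described in this answer could look roughly like the sketch below. It only samples configurations and keeps the best one; the training step is left to a hypothetical `evaluate` callback that is assumed to return a cross-validated score (higher is better). The L2 penalty values and the names `sample_architecture` and `random_search` are illustrative assumptions, not part of the answer or of any library.

```python
import random

# Search space taken from the answer's suggestions.
LAYER_RANGE = (4, 8)                    # 4-8 hidden layers (inclusive)
NEURON_CHOICES = [8, 16, 32, 64, 128]   # powers of 2 between 8 and 128
OPTIMIZERS = ["adam", "sgd_nesterov"]
# Assumption: the answer gives no penalty strengths, these are placeholders.
L2_CHOICES = [1e-4, 1e-3, 1e-2]


def sample_architecture(rng):
    """Draw one random MLP configuration from the search space."""
    n_layers = rng.randint(*LAYER_RANGE)
    return {
        "hidden_layers": [rng.choice(NEURON_CHOICES) for _ in range(n_layers)],
        "activation": "relu",
        "initializer": "he_normal",
        "optimizer": rng.choice(OPTIMIZERS),
        "l2_penalty": rng.choice(L2_CHOICES),
        "dropout_keep_probability": 0.5,
    }


def random_search(evaluate, n_trials, seed=0):
    """Return the best configuration and score found in n_trials samples.

    evaluate is a user-supplied (hypothetical) function that trains a model
    for the given configuration and returns its cross-validated score.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_architecture(rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    # Print a few sampled configurations to inspect the search space.
    rng = random.Random(0)
    for _ in range(3):
        print(sample_architecture(rng))
```

With only 62 observations, `evaluate` would probably need to use repeated or stratified k-fold cross-validation so that the score is not dominated by a single small validation split.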
There are websites that explain this pretty well.
Deciding on the number of neurons in the hidden layer(s)
From https://www.r-bloggers.com/selecting-the-number-of-neurons-in-the-hidden-layer-of-a-neural-network/:
The most common rule of thumb is to choose a number of hidden neurons between 1 and the number of input variables.
Deciding on the number of hidden layers
From https://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw:
For most problems, one could probably get decent performance (even without a second optimization step) by setting the hidden layer configuration using just two rules: (i) number of hidden layers equals one; and (ii) the number of neurons in that layer is the mean of the neurons in the input and output layers.
Hope that answers your question!
answered Aug 12 '17 at 17:54 by IronEdward
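Applied to the data in the question (roughly 200 predictors, two classes), the two rules of thumb quoted in this answer work out as in the small sketch below. The function names are illustrative, and the output layer is assumed to have either 1 unit (a single probability) or 2 units (one per class), since the question does not say which encoding is used.

```python
def hidden_size_bounds(n_inputs):
    """First rule: between 1 and the number of input variables."""
    return 1, n_inputs


def mean_rule_hidden_size(n_inputs, n_outputs):
    """Second rule: mean of the input and output layer sizes."""
    return (n_inputs + n_outputs) / 2


N_PREDICTORS = 200  # "roughly 200" in the question

low, high = hidden_size_bounds(N_PREDICTORS)
print(f"Range rule: between {low} and {high} hidden neurons")

# Assumption: 1 output unit (sigmoid) or 2 output units (softmax).
for n_outputs in (1, 2):
    size = mean_rule_hidden_size(N_PREDICTORS, n_outputs)
    print(f"Mean rule with {n_outputs} output unit(s): {size} hidden neurons")
```

Both encodings give about 100 neurons in a single hidden layer. That implies about 20,000 weights in the first layer alone for 62 observations, so the regularization advice from the comments and the other answer still applies.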