What is the best architecture for an auto-encoder for image reconstruction?












I am trying to use a convolutional auto-encoder for its latent space (embedding layer). Specifically, I want to use the embeddings for k-nearest-neighbor search in the latent space (similar in idea to word2vec).

My input is 224x224 (ImageNet). I could not find any article that spells out a specific architecture (number of filters, number of conv layers, etc.).
I tried some arbitrary architectures, such as:

Encoder:

  • Conv(channels=3,filters=16,kernel=3)

  • Conv(channels=16,filters=32,kernel=3)

  • Conv(channels=32,filters=64,kernel=3)

Decoder:

  • Conv(channels=64,filters=32,kernel=3)

  • Conv(channels=32,filters=16,kernel=3)

  • Conv(channels=16,filters=3,kernel=3)

But I would like to start my hyperparameter search from a setup that has proved itself on a similar task.
Can you refer me to a source, or suggest an architecture that worked for you for this purpose?
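The layer list above gives only channels, filters and kernel size, so the size of the resulting latent space depends on stride and padding, which the question does not state. The sketch below is an illustration rather than the asker's actual setup: it applies the standard convolution output-size formula to the three encoder layers under two assumed settings. The helper names `conv_out`, `latent_shape` and `numel` are hypothetical.

```python
# Sketch only: stride and padding are NOT given in the question, so both
# settings below are assumptions chosen for illustration.


def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after one convolution (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def latent_shape(input_size, filters, kernel, stride, padding):
    """Follow the encoder layer by layer; return (channels, height, width)."""
    size = input_size
    for _ in filters:
        size = conv_out(size, kernel, stride, padding)
    return (filters[-1], size, size)


def numel(shape):
    """Number of values in a (channels, height, width) tensor."""
    channels, height, width = shape
    return channels * height * width


ENCODER_FILTERS = (16, 32, 64)  # filter counts taken from the question
INPUT_VALUES = 3 * 224 * 224    # values in one 224x224 RGB image

# Assumption A: stride 1, no padding (the defaults of PyTorch's Conv2d
# and Keras's Conv2D).
plain = latent_shape(224, ENCODER_FILTERS, kernel=3, stride=1, padding=0)

# Assumption B: stride 2, padding 1, so each layer halves the resolution.
strided = latent_shape(224, ENCODER_FILTERS, kernel=3, stride=2, padding=1)

for name, shape in (("stride 1, padding 0", plain),
                    ("stride 2, padding 1", strided)):
    print(f"{name}: latent {shape}, "
          f"{numel(shape):,} values vs {INPUT_VALUES:,} in the input")
```

Under assumption A the encoder output is 64x218x218, roughly twenty times more values than the input image, so it would not act as a bottleneck. Under assumption B it is 64x28x28, a third of the input. Some form of downsampling (strides or pooling) therefore seems necessary before the output is useful as a compact embedding.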












  • There is none. You should always optimize your network through an "ad hoc" hyperparameter search that depends on the problem at hand.
    – pcko1, yesterday












  • @pcko1 I disagree. In many cases it is very helpful to start from an architecture used for a similar problem and then fine-tune it. Moreover, my dataset is ImageNet, which has been studied extensively. Finally, unless you have covered every article on arXiv, you can't say "there is none"...
    – Idan azuri, yesterday
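Whichever architecture the search ends up with, the retrieval step described in the question stays the same: flatten each encoder output to a vector and look up the closest vectors. A minimal brute-force sketch follows, using only the standard library. The embedding values, the keys and the function name `k_nearest` are made up for illustration, and Euclidean distance is an assumption, since the question does not name a metric.

```python
import heapq
import math


def k_nearest(query, embeddings, k=3):
    """Return the k (distance, key) pairs closest to `query` (Euclidean)."""
    scored = ((math.dist(query, vec), key) for key, vec in embeddings.items())
    return heapq.nsmallest(k, scored)


# Hypothetical 4-dimensional embeddings; a real encoder output would be
# flattened into a much longer vector.
embeddings = {
    "cat_01": [0.9, 0.1, 0.0, 0.2],
    "cat_02": [0.8, 0.2, 0.1, 0.2],
    "dog_01": [0.1, 0.9, 0.3, 0.0],
    "car_01": [0.0, 0.1, 0.9, 0.8],
}

query = [0.9, 0.1, 0.05, 0.2]
for distance, key in k_nearest(query, embeddings, k=2):
    print(f"{key}: {distance:.3f}")
```

A linear scan like this costs one distance computation per stored image for every query, which is fine for a small collection. At ImageNet scale an approximate nearest-neighbour index would probably be needed instead.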


















neural-network deep-learning convnet autoencoder














edited 16 mins ago by Stephen Rauch

asked yesterday by Idan azuri














































