What is the correct way to perform normalization on data in an autoencoder?

I am working on an anomaly detection problem, using an autoencoder to denoise the input. I trained the network on normal (anomaly-free) data, so the model predicts the normal state of a given input. Normalization of the input is essential for my dataset.



The problem with normalization arises when the noise is very large compared to the rest of the dataset: the prediction then follows the noise. For example, I add noise (delta = 300) to 80% of the data and normalize the resulting dataset, whose mean is 250 and standard deviation is 79; the noisy points (80% of the dataset) are all greater than 300. When I feed this normalized dataset to the model, the prediction follows the noise and gives the wrong output. This happens because of the feature scaling: once noise has been added to most of the data points, the model treats those points as normal and the remaining points as anomalous.
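To make this concrete, here is a minimal sketch of the effect (NumPy only; the numbers are synthetic stand-ins, not my real sensor data): when the scaling statistics are computed from the noisy input itself, the shifted majority lands near the centre of the scaled range, whereas statistics frozen from the clean training data expose the shift as anomalous.

    import numpy as np

    rng = np.random.default_rng(0)
    clean = rng.normal(loc=250.0, scale=79.0, size=1000)  # anomaly-free sensor values
    noisy = clean.copy()
    noisy[:800] += 300.0                                  # delta = 300 on 80% of points

    # Statistics refit on the noisy input: the shifted majority defines the norm.
    z_refit = (noisy - noisy.mean()) / noisy.std()

    # Statistics frozen from the clean training data: the shift stands out.
    z_frozen = (noisy - clean.mean()) / clean.std()

    print(np.abs(z_refit[:800]).mean())   # ~0.5 -> noisy points look normal
    print(np.abs(z_frozen[:800]).mean())  # ~3.8 -> noisy points look anomalous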



In the inverse-scaling step, I cannot use the min-max values of the noisy input to inverse-scale the prediction, because the output would then follow the noise in the dataset.



So what is the right way to perform feature scaling in this kind of denoising problem?










deep-learning preprocessing autoencoder feature-scaling noise

asked 2 days ago by Milan_Harkhani
  • It would help if you gave an exact example of noise addition and normalization for a 1D dataset like (1, 1, 2, 3, 3, 4, 5, 6, 6, 7).
    – Esmailian, yesterday










  • I have a multivariate dataset (16 features); each feature is sensor data (temperature, pressure, etc.) from industrial machinery. The training dataset represents the machinery's normal condition, and the goal is to detect abnormal behaviour in the sensor data. I added a constant noise value to some data points, for example delta = 300 to the first 1000 points of one of the 16 features. When I feed this perturbed input to the model, the prediction follows the noise instead of giving the normal output.
    – Milan_Harkhani, yesterday










  • This odd behavior makes sense. Suppose 1000 points are around 0 with standard deviation 1. If you add 300 to 800 of them, you now have 200 points around 0 and 800 points around 300. Those 800 points are the new normal, and the 200 points become the new outliers!
    – Esmailian, yesterday










  • Exactly, that is what I found. I also have a test dataset of 1400 points, recorded while the machine was in a faulty condition. One sensor's output is under 20 for the first 800 points, which is considered normal behaviour, and then goes up to 50, which is a deviation from normal. When I applied min-max scaling and then inverse-transformed the predicted values with the same min-max values, the output followed the noise and rose above 50 for the anomalous points. So what is the right way to perform normalization and denormalization?
    – Milan_Harkhani, yesterday












  • The problem is the amount of noise added; no normalization can make this right, since the majority of the data (80%) will always stick together under any normalization and will always be the norm.
    – Esmailian, yesterday
1 Answer

Min-max scaling will not perform well for your problem, as you already said. For noisy data, scaling according to the quantile range should perform better.



After scaling, you could also try to clip your data to $[-1,1]$, as is often done in adversarial learning methods.
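A minimal sketch of this suggestion, using scikit-learn's RobustScaler (which centres on the median and scales by the interquartile range); fitting the scaler on the clean training data only and reusing it for the inverse transform is an assumption added here, not something stated above:

    import numpy as np
    from sklearn.preprocessing import RobustScaler

    rng = np.random.default_rng(0)
    x_train = rng.normal(250.0, 79.0, size=(1000, 1))  # clean (anomaly-free) data
    x_test = x_train.copy()
    x_test[:800] += 300.0                              # heavily perturbed test input

    # Fit on clean data only; the median and IQR are far less sensitive to
    # large deltas than min-max statistics would be.
    scaler = RobustScaler(quantile_range=(25.0, 75.0)).fit(x_train)

    x_scaled = np.clip(scaler.transform(x_test), -1.0, 1.0)  # optional [-1, 1] clip

    # reconstruction = autoencoder.predict(x_scaled)  # hypothetical model call
    reconstruction = x_scaled                          # placeholder for model output

    # Inverse-transform with the same frozen scaler, never with statistics
    # of the noisy input, so the output cannot follow the noise.
    x_denorm = scaler.inverse_transform(reconstruction)

Because the scaling statistics come from the anomaly-free data, a large constant shift saturates at the clip boundary instead of redefining the scale.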






answered 2 days ago by Andreas Look