Data scaling before or after PCA

I have seen senior data scientists doing data scaling either before or after applying PCA.

What is more right to do and why?

asked Jul 25 '18 at 13:50

Poete Maudit

386314

Closely related: stats.stackexchange.com/questions/53/…
– Sycorax
Jul 25 '18 at 15:33

machine-learning feature-scaling

2 Answers

I once heard a data scientist state at a conference talk: "Basically, you can do what you want, as long as you know what you are doing."

This also applies here. The more statistically sound way is to transform (scale) all variables before additional steps such as PCA or factor analysis. PCA is driven by variance, so without scaling, the variables measured on the largest scales dominate the leading components. Scaling first also means you still know the scale of your variables and can interpret the rescaling in the context of your application. If you have no such interpretation but have good reasons to rescale your principal components, for example computational issues that arise when some values are too close to zero while others are quite large, rescaling the components makes sense. However, reversing this process while still being able to interpret the effect of the rescaling in your context becomes almost impossible.
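To make the effect of scaling before PCA concrete, here is a minimal sketch using only the Python standard library. The data (heights in metres, weights in grams) and the helper names `standardize` and `first_pc_2d` are made up for illustration, and the closed-form eigendecomposition only works for two variables; in practice one would more likely use a library such as scikit-learn (`StandardScaler` followed by `PCA`).

```python
import math
import statistics

def standardize(column):
    """Center a column to mean 0 and scale it to unit sample standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / sd for v in column]

def first_pc_2d(x, y):
    """First principal component of two variables.

    Returns (unit loading vector, explained-variance ratio), computed from the
    closed-form eigendecomposition of the 2x2 sample covariance matrix.
    """
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

    # Largest eigenvalue of [[sxx, sxy], [sxy, syy]].
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)

    # An eigenvector for lam is (sxy, lam - sxx); it degenerates to (0, 0)
    # when the variables are uncorrelated and x has the larger variance.
    vx, vy = sxy, lam - sxx
    norm = math.hypot(vx, vy)
    if norm == 0.0:
        return (1.0, 0.0), lam / (sxx + syy)
    return (vx / norm, vy / norm), lam / (sxx + syy)

# Hypothetical, deliberately mismatched units: metres vs. grams.
height_m = [1.60, 1.70, 1.75, 1.80, 1.90]
weight_g = [55000.0, 68000.0, 72000.0, 80000.0, 95000.0]

# PCA on the raw data: the large-variance variable (weight) dominates.
raw_loadings, raw_ratio = first_pc_2d(height_m, weight_g)

# PCA after standardizing: both variables contribute equally to the first PC.
scaled_loadings, scaled_ratio = first_pc_2d(
    standardize(height_m), standardize(weight_g)
)

print("raw loadings:   ", raw_loadings)
print("scaled loadings:", scaled_loadings)
```

On the raw data the first component points almost entirely along the weight axis, simply because grams produce far larger numbers than metres. After standardization the two loadings are equal in magnitude, so the component reflects the shared variation rather than the choice of units.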

answered Jul 25 '18 at 13:59

Alex2006

25118

Thank you for your answer, @alex. As I can see from the upvotes that you got, your answer is right, and actually this was what I had in mind.
– Poete Maudit
Jul 27 '18 at 12:01

"It results more important to balance the classes rather than reduce the dimensionality, at least in terms of accuracy; (ii) The best choice seems to be the application of SMOTE followed by PCA.."

Link: https://core.ac.uk/download/pdf/61408511.pdf

answered 14 hours ago

tsumaranaina

4510


draft saved

draft discarded

Thanks for contributing an answer to Data Science Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Htydjtk