Feature selection vs Feature extraction. Which to use when?












Feature extraction and feature selection both reduce the dimensionality of the data, but feature extraction also makes the data more separable, if I am right.

Which technique would be preferred over the other, and when?

I was thinking that, since feature selection does not modify the original data and its properties, you would use feature selection when it is important that the features you train on remain unchanged. But I can't imagine why you would want something like this.
Tags: feature-selection, feature-extraction, dimensionality-reduction
asked Mar 13 '18 at 5:32 – Sid
4 Answers
Adding to the answer given by Toros:

These three are quite similar, with subtle differences (concise and easy to remember):

• Feature extraction and feature engineering: transformation of raw data into features suitable for modeling.
• Feature transformation: transformation of data to improve the accuracy of the algorithm.
• Feature selection: removing unnecessary features.

To add some examples of each:

Feature extraction and engineering (we can extract something from them):

• Text (n-grams, word2vec, tf-idf, etc.)
• Images (CNNs, texts, Q&A)
• Geospatial data (latitude, longitude, etc.)
• Date and time (day, month, week, year, ...)
• Time series, web data, etc.
• Dimensionality reduction techniques
• ... and many others

Feature transformation (transforming features so they make sense):

• Normalization and changing the distribution (scaling)
• Interactions
• Filling in missing values (median filling, etc.)
• ... and many others

Feature selection (building your model on the selected features):

• Statistical approaches
• Selection by modeling
• Grid search
• Cross-validation
• ... and many others

Hope this helps. Do look at the links shared by others; they are quite nice.

answered Mar 13 '18 at 10:00, edited Oct 25 '18 at 9:38 – Aditya
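As a rough illustration of how the extraction, transformation, and selection steps above can fit together, here is a minimal sketch using scikit-learn (the toy corpus, the labels, and `k=3` are made up for the example):

```python
# Toy pipeline: feature extraction -> transformation -> selection.
from sklearn.feature_extraction.text import TfidfVectorizer  # extraction: raw text -> tf-idf features
from sklearn.preprocessing import StandardScaler             # transformation: rescaling
from sklearn.feature_selection import SelectKBest, chi2      # selection: keep the most relevant columns

corpus = ["cheap pills now", "meeting at noon", "cheap meds now", "lunch at noon"]
y = [1, 0, 1, 0]  # made-up labels: 1 = spam, 0 = ham

# Feature extraction: turn the raw text into a numeric tf-idf matrix.
X = TfidfVectorizer().fit_transform(corpus)

# Feature transformation: rescale the columns (with_mean=False keeps the matrix sparse).
X_scaled = StandardScaler(with_mean=False).fit_transform(X)

# Feature selection: keep only the 3 columns most associated with the target.
X_sel = SelectKBest(chi2, k=3).fit_transform(X_scaled, y)

print(X.shape, X_sel.shape)  # selection shrinks the column count to 3
```

Note that selection runs last here, on features that were first extracted and transformed.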
















• What's the order in which these should be processed, in addition to data cleaning and data splitting? Which of the five is the first step?
  – technazi, Oct 20 '18 at 19:39










• Data splitting is done at the very end, once you have made sure the data is ready for modelling. And IMHO there is no fixed ordering for the things mentioned above, because they overlap quite often (feature extraction, feature engineering, feature transformation). But feature selection is surely done after splitting the data into train and validation, provided you are measuring your model's metric (or something equivalent) on a validation dataset; with cross-validation or the like, you can iteratively drop columns and see which ones are important.
  – Aditya, Oct 21 '18 at 2:00
I think they are two different things.

Let's start with feature selection:

This technique is used for selecting the features that explain most of the target variable (i.e., have a correlation with the target variable). This test is run just before the model is applied to the data.

To explain it better, let us go through an example: there are 10 features and 1 target variable; 9 features explain 90% of the target variable, while all 10 features together explain 91% of it. The extra variable is not making much of a difference, so you tend to remove it before modelling (it is subjective to the business as well). This can also be called predictor importance.

Now let's talk about feature extraction,

which is used in unsupervised learning: extraction of contours from images, extraction of bi-grams from a text, extraction of phonemes from recordings of spoken text. When you don't know anything about the data (no data dictionary, too many features, i.e. the data is not in an understandable format), you try applying this technique to get some features that explain most of the data. Feature extraction involves a transformation of the features, which often is not reversible because some information is lost in the process of dimensionality reduction.

You can apply feature extraction on the given data to extract features, and then apply feature selection with respect to the target variable to select the subset that can help in making a good model with good results.

You can go through these Link-1, Link-2 for a better understanding.

We can implement them in R, Python, or SPSS.

Let me know if you need any more clarification.
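A hypothetical sketch of this kind of correlation-based selection (the feature names and data are made up; pandas and NumPy are assumed):

```python
# Rank features by absolute correlation with the target, then drop the weakest.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["f1", "f2", "noise"])

# The target depends on f1 and f2 only; "noise" contributes nothing.
target = 2 * df["f1"] - df["f2"] + rng.normal(scale=0.1, size=200)

corr = df.corrwith(target).abs().sort_values(ascending=False)
selected = corr.index[:-1].tolist()  # drop the least correlated feature

print(corr)
print("selected:", selected)
```

With data like this, the uninformative "noise" column ends up last in the ranking and is the one removed before modelling.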






The two are very different: feature selection indeed reduces dimensions, but feature extraction adds dimensions which are computed from other features.

For panel or time series data, one usually has a datetime variable, and one does not want to train the dependent variable on the date itself, as those exact dates do not occur in the future. So you should eliminate the datetime: feature elimination.

On the other hand, whether a day is a weekday or a weekend day may be very relevant, so we need to compute the weekday status from the datetime: feature extraction.
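A small sketch of this datetime example with pandas (the column names and dates are made up):

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime(["2018-03-10", "2018-03-12", "2018-03-17"]),
    "sales": [120, 80, 140],
})

# Feature extraction: derive a weekend indicator from the datetime
# (dayofweek is 0 for Monday through 6 for Sunday).
df["is_weekend"] = df["date"].dt.dayofweek >= 5

# Feature elimination: drop the raw datetime before training,
# since those exact dates never occur in the future.
df = df.drop(columns=["date"])

print(df)
```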






As Aditya said, there are three feature-related terms that are sometimes confused with each other. I will try to give a summary explanation of each of them:

• Feature extraction: generation of features from data that are in a format that is difficult to analyse directly or not directly comparable (e.g. images, time series, etc.). In the example of a time series, some simple features could be: length of the series, period, mean value, standard deviation, etc.

• Feature transformation: transformation of existing features in order to create new ones based on the old ones. A very popular technique for dimensionality reduction is Principal Component Analysis (PCA), which uses an orthogonal transformation to produce a set of linearly uncorrelated variables from the initial set of variables.

• Feature selection: selection of the features with the highest "importance" or influence on the target variable, from a set of existing features. This can be done with various techniques: e.g. linear regression, decision trees, or calculation of "importance" weights (e.g. Fisher score, ReliefF).


If the only thing you want to achieve is dimensionality reduction in an existing dataset, you can use either feature transformation or feature selection methods. But if you need a physical interpretation of the features you identify as "important", or you are trying to limit the amount of data that needs to be collected for your analysis (feature transformation requires all of the initial features), then only feature selection will work.
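This transformation-vs-selection contrast can be sketched with scikit-learn on toy data (the dataset, `n_components`, and `k` are made up for the example):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # target driven by the first two columns

# Feature transformation: PCA mixes all 5 columns into 2 new, uncorrelated ones,
# so it still needs every original feature to be collected.
X_pca = PCA(n_components=2).fit_transform(X)

# Feature selection: keep 2 of the original columns, chosen by an F-test
# against the target; the surviving features keep their physical meaning.
X_sel = SelectKBest(f_classif, k=2).fit_transform(X, y)

print(X_pca.shape, X_sel.shape)  # same reduced shape, obtained very differently
```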



You can find more details on feature selection and dimensionality reduction in the following links:

• A summary of Dimension Reduction methods
• Classification and Feature Selection: A Review
• Relevant question and answers in Stack Overflow







              share|improve this answer











              $endgroup$













                Your Answer





                StackExchange.ifUsing("editor", function () {
                return StackExchange.using("mathjaxEditing", function () {
                StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
                StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
                });
                });
                }, "mathjax-editing");

                StackExchange.ready(function() {
                var channelOptions = {
                tags: "".split(" "),
                id: "557"
                };
                initTagRenderer("".split(" "), "".split(" "), channelOptions);

                StackExchange.using("externalEditor", function() {
                // Have to fire editor after snippets, if snippets enabled
                if (StackExchange.settings.snippets.snippetsEnabled) {
                StackExchange.using("snippets", function() {
                createEditor();
                });
                }
                else {
                createEditor();
                }
                });

                function createEditor() {
                StackExchange.prepareEditor({
                heartbeatType: 'answer',
                autoActivateHeartbeat: false,
                convertImagesToLinks: false,
                noModals: true,
                showLowRepImageUploadWarning: true,
                reputationToPostImages: null,
                bindNavPrevention: true,
                postfix: "",
                imageUploader: {
                brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
                contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
                allowUrls: true
                },
                onDemand: true,
                discardSelector: ".discard-answer"
                ,immediatelyShowMarkdownHelp:true
                });


                }
                });














                draft saved

                draft discarded


















                StackExchange.ready(
                function () {
                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f29006%2ffeature-selection-vs-feature-extraction-which-to-use-when%23new-answer', 'question_page');
                }
                );

                Post as a guest















                Required, but never shown

























                4 Answers
                4






                active

                oldest

                votes








                4 Answers
                4






                active

                oldest

                votes









                active

                oldest

                votes






                active

                oldest

                votes









                12












                $begingroup$

                Adding to The answer given by Toros,



                These(see below bullets) three are quite similar but with a subtle differences-:(concise and easy to remember)




                • feature extraction and feature engineering: transformation of raw data into features suitable for modeling;


                • feature transformation: transformation of data to improve the accuracy of the algorithm;


                • feature selection: removing unnecessary features.



                Just to add an Example of the same,




                Feature Extraction and Engineering(we can extract something from them)





                • Texts(ngrams, word2vec, tf-idf etc)

                • Images(CNN'S, texts, q&a)

                • Geospatial data(lat, long etc)

                • Date and time(day,month,week,year..)

                • Time series, web, etc...

                • Dimensional Reduction Techniques..

                • .....(And Many Others)



                Feature transformations(transforming them to make sense)





                • Normalization and changing distribution(Scaling)

                • Interactions

                • Filling in the missing values(median filling etc)

                • .....(And Many Others)



                Feature selection(building your model on these selected features)





                • Statistical approaches

                • Selection by modeling

                • Grid search

                • Cross Validation

                • .....(And Many Others)


                Hope this helps...



                Do look at the links shared by others.
                They are Quite Nice...






                share|improve this answer











                $endgroup$













                • $begingroup$
                  nice way of answering +1 for that.
                  $endgroup$
                  – Toros91
                  Mar 14 '18 at 1:48










                • $begingroup$
                  Kudos to this community.. Learning a lot from it..
                  $endgroup$
                  – Aditya
                  Mar 14 '18 at 2:11






                • 1




                  $begingroup$
                  True that man, I've been a member since October, 2017. I've learned a lot of things. Hope it be the same for you as well. I've been reading your answers, they are good .BTW sorry for the thing which you had gone through on SO. I couldn't see the whole thing but as Neil Slater said good that you kept your cool all the way till the end. Keep it up! We still have a long way to go. :)
                  $endgroup$
                  – Toros91
                  Mar 14 '18 at 2:19










                • $begingroup$
                  What's the order in which these should be processed? In addition to data cleaning and data splitting. Which out of the 5 is the first step?
                  $endgroup$
                  – technazi
                  Oct 20 '18 at 19:39










                • $begingroup$
                  Data splitting is done at the very end when you make sure that the data is ready to be sent for Modelling...And imho there's no such ordering for the above mentioned things because they overlap quite a few times(feature extraction, feature engineering, Feature transformation.) but Feature Selection is surely done after splitting the data into train as validation provided that you are using your models metric or something equivalent on a validation dataset (to measure it's performance)for Cross Validation or something equivalent,You can iteratively start dropping columns and see imp colsorimp
                  $endgroup$
                  – Aditya
                  Oct 21 '18 at 2:00


















                12












                $begingroup$

                Adding to The answer given by Toros,



                These(see below bullets) three are quite similar but with a subtle differences-:(concise and easy to remember)




                • feature extraction and feature engineering: transformation of raw data into features suitable for modeling;


                • feature transformation: transformation of data to improve the accuracy of the algorithm;


                • feature selection: removing unnecessary features.



                Just to add an Example of the same,




                Feature Extraction and Engineering(we can extract something from them)





                • Texts(ngrams, word2vec, tf-idf etc)

                • Images(CNN'S, texts, q&a)

                • Geospatial data(lat, long etc)

                • Date and time(day,month,week,year..)

                • Time series, web, etc...

                • Dimensional Reduction Techniques..

                • .....(And Many Others)



                Feature transformations(transforming them to make sense)





                • Normalization and changing distribution(Scaling)

                • Interactions

                • Filling in the missing values(median filling etc)

                • .....(And Many Others)



                Feature selection(building your model on these selected features)





                • Statistical approaches

                • Selection by modeling

                • Grid search

                • Cross Validation

                • .....(And Many Others)


                Hope this helps...



                Do look at the links shared by others.
                They are Quite Nice...






                share|improve this answer











                $endgroup$













                • $begingroup$
                  nice way of answering +1 for that.
                  $endgroup$
                  – Toros91
                  Mar 14 '18 at 1:48










                • $begingroup$
                  Kudos to this community.. Learning a lot from it..
                  $endgroup$
                  – Aditya
                  Mar 14 '18 at 2:11






                • 1




                  $begingroup$
                  True that man, I've been a member since October, 2017. I've learned a lot of things. Hope it be the same for you as well. I've been reading your answers, they are good .BTW sorry for the thing which you had gone through on SO. I couldn't see the whole thing but as Neil Slater said good that you kept your cool all the way till the end. Keep it up! We still have a long way to go. :)
                  $endgroup$
                  – Toros91
                  Mar 14 '18 at 2:19










                • $begingroup$
                  What's the order in which these should be processed? In addition to data cleaning and data splitting. Which out of the 5 is the first step?
                  $endgroup$
                  – technazi
                  Oct 20 '18 at 19:39










                • $begingroup$
                  Data splitting is done at the very end when you make sure that the data is ready to be sent for Modelling...And imho there's no such ordering for the above mentioned things because they overlap quite a few times(feature extraction, feature engineering, Feature transformation.) but Feature Selection is surely done after splitting the data into train as validation provided that you are using your models metric or something equivalent on a validation dataset (to measure it's performance)for Cross Validation or something equivalent,You can iteratively start dropping columns and see imp colsorimp
                  $endgroup$
                  – Aditya
                  Oct 21 '18 at 2:00
















                12












                12








                12





                $begingroup$

                Adding to The answer given by Toros,



                These(see below bullets) three are quite similar but with a subtle differences-:(concise and easy to remember)




                • feature extraction and feature engineering: transformation of raw data into features suitable for modeling;


                • feature transformation: transformation of data to improve the accuracy of the algorithm;


                • feature selection: removing unnecessary features.



                Just to add an Example of the same,




                Feature Extraction and Engineering(we can extract something from them)





                • Texts(ngrams, word2vec, tf-idf etc)

                • Images(CNN'S, texts, q&a)

                • Geospatial data(lat, long etc)

                • Date and time(day,month,week,year..)

                • Time series, web, etc...

                • Dimensional Reduction Techniques..

                • .....(And Many Others)



                Feature transformations(transforming them to make sense)





                • Normalization and changing distribution(Scaling)

                • Interactions

                • Filling in the missing values(median filling etc)

                • .....(And Many Others)



                Feature selection(building your model on these selected features)





                • Statistical approaches

                • Selection by modeling

                • Grid search

                • Cross Validation

                • .....(And Many Others)


                Hope this helps...



                Do look at the links shared by others.
                They are Quite Nice...






                share|improve this answer











                $endgroup$



                Adding to The answer given by Toros,



                These(see below bullets) three are quite similar but with a subtle differences-:(concise and easy to remember)




                • feature extraction and feature engineering: transformation of raw data into features suitable for modeling;


                • feature transformation: transformation of data to improve the accuracy of the algorithm;


                • feature selection: removing unnecessary features.



                Just to add an Example of the same,




                Feature Extraction and Engineering(we can extract something from them)





                • Texts(ngrams, word2vec, tf-idf etc)

                • Images(CNN'S, texts, q&a)

                • Geospatial data(lat, long etc)

                • Date and time(day,month,week,year..)

                • Time series, web, etc...

                • Dimensional Reduction Techniques..

                • .....(And Many Others)



                Feature transformations(transforming them to make sense)





                • Normalization and changing distribution(Scaling)

                • Interactions

                • Filling in the missing values(median filling etc)

                • .....(And Many Others)



                Feature selection(building your model on these selected features)





                • Statistical approaches

                • Selection by modeling

                • Grid search

                • Cross Validation

                • .....(And Many Others)


                Hope this helps...



                Do look at the links shared by others.
                They are Quite Nice...







                share|improve this answer














                share|improve this answer



                share|improve this answer








                edited Oct 25 '18 at 9:38

























                answered Mar 13 '18 at 10:00









                AdityaAditya

                1,4101525




                1,4101525












                • $begingroup$
                  nice way of answering +1 for that.
                  $endgroup$
                  – Toros91
                  Mar 14 '18 at 1:48










                • $begingroup$
                  Kudos to this community.. Learning a lot from it..
                  $endgroup$
                  – Aditya
                  Mar 14 '18 at 2:11






                • 1




                  $begingroup$
                  True that man, I've been a member since October, 2017. I've learned a lot of things. Hope it be the same for you as well. I've been reading your answers, they are good .BTW sorry for the thing which you had gone through on SO. I couldn't see the whole thing but as Neil Slater said good that you kept your cool all the way till the end. Keep it up! We still have a long way to go. :)
                  $endgroup$
                  – Toros91
                  Mar 14 '18 at 2:19










                • $begingroup$
                  What's the order in which these should be processed? In addition to data cleaning and data splitting. Which out of the 5 is the first step?
                  $endgroup$
                  – technazi
                  Oct 20 '18 at 19:39










                • $begingroup$
                  Data splitting is done at the very end when you make sure that the data is ready to be sent for Modelling...And imho there's no such ordering for the above mentioned things because they overlap quite a few times(feature extraction, feature engineering, Feature transformation.) but Feature Selection is surely done after splitting the data into train as validation provided that you are using your models metric or something equivalent on a validation dataset (to measure it's performance)for Cross Validation or something equivalent,You can iteratively start dropping columns and see imp colsorimp
                  $endgroup$
                  – Aditya
                  Oct 21 '18 at 2:00




















                • $begingroup$
                  nice way of answering +1 for that.
                  $endgroup$
                  – Toros91
                  Mar 14 '18 at 1:48










                • $begingroup$
                  Kudos to this community.. Learning a lot from it..
                  $endgroup$
                  – Aditya
                  Mar 14 '18 at 2:11






                • 1




                  $begingroup$
                  True that man, I've been a member since October, 2017. I've learned a lot of things. Hope it be the same for you as well. I've been reading your answers, they are good .BTW sorry for the thing which you had gone through on SO. I couldn't see the whole thing but as Neil Slater said good that you kept your cool all the way till the end. Keep it up! We still have a long way to go. :)
                  $endgroup$
                  – Toros91
                  Mar 14 '18 at 2:19










                • $begingroup$
                  What's the order in which these should be processed? In addition to data cleaning and data splitting. Which out of the 5 is the first step?
                  $endgroup$
                  – technazi
                  Oct 20 '18 at 19:39










                • $begingroup$
                  Data splitting is done at the very end when you make sure that the data is ready to be sent for Modelling...And imho there's no such ordering for the above mentioned things because they overlap quite a few times(feature extraction, feature engineering, Feature transformation.) but Feature Selection is surely done after splitting the data into train as validation provided that you are using your models metric or something equivalent on a validation dataset (to measure it's performance)for Cross Validation or something equivalent,You can iteratively start dropping columns and see imp colsorimp
                  $endgroup$
                  – Aditya
                  Oct 21 '18 at 2:00


















                $begingroup$
                nice way of answering +1 for that.
                $endgroup$
                – Toros91
                Mar 14 '18 at 1:48




                $begingroup$
                nice way of answering +1 for that.
                $endgroup$
                – Toros91
                Mar 14 '18 at 1:48












                $begingroup$
                Kudos to this community.. Learning a lot from it..
                $endgroup$
                – Aditya
                Mar 14 '18 at 2:11




                $begingroup$
                Kudos to this community.. Learning a lot from it..
                $endgroup$
                – Aditya
                Mar 14 '18 at 2:11




                1




                1




                $begingroup$
                True that man, I've been a member since October, 2017. I've learned a lot of things. Hope it be the same for you as well. I've been reading your answers, they are good .BTW sorry for the thing which you had gone through on SO. I couldn't see the whole thing but as Neil Slater said good that you kept your cool all the way till the end. Keep it up! We still have a long way to go. :)
                $endgroup$
                – Toros91
                Mar 14 '18 at 2:19




                $begingroup$
                What's the order in which these should be processed? In addition to data cleaning and data splitting. Which out of the 5 is the first step?
                $endgroup$
                – technazi
                Oct 20 '18 at 19:39




                3












                $begingroup$

                I think they are two different things.

                Let's start with feature selection:

                This technique is used for selecting the features that explain most of the target variable (i.e. have a correlation with the target variable). This test is run just before the model is applied to the data.

                To explain it better, let us go through an example: there are 10 features and 1 target variable; 9 features explain 90% of the target variable, and all 10 features together explain 91%. So the one extra variable is not making much of a difference, and you tend to remove it before modelling (it is subjective to the business as well). This can also be called predictor importance.

                Now let's talk about feature extraction, which is used in unsupervised learning: extraction of contours from images, extraction of bi-grams from a text, extraction of phonemes from recordings of spoken text. When you don't know anything about the data (no data dictionary, or too many features, so the data is not in an understandable format), you try applying this technique to get some features that explain most of the data. Feature extraction involves a transformation of the features, which is often not reversible because some information is lost in the process of dimensionality reduction.

                You can apply feature extraction on the given data to extract features, and then apply feature selection with respect to the target variable to select a subset that can help in making a good model with good results.

                You can go through these Link-1, Link-2 for better understanding.

                We can implement them in R, Python, or SPSS.

                Let me know if you need any more clarification.






                $endgroup$


















                    answered Mar 13 '18 at 6:15, edited Mar 13 '18 at 6:45
                    – Toros91
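A hedged illustration of the predictor-importance idea from this answer, in Python (the answer notes R and SPSS work too). The synthetic dataset and the random-forest ranking are my choices for the sketch, not the answer's prescribed method: rank the 10 features by importance and drop the weakest one before modelling.

```python
# Sketch: rank predictors by importance and drop the least useful one.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: 10 features, 1 target, only 3 informative features.
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=3, random_state=0)

# A tree ensemble gives one common notion of "predictor importance".
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
ranking = np.argsort(forest.feature_importances_)   # weakest feature first

# Drop the single weakest predictor, keeping the other 9 unchanged.
X_reduced = np.delete(X, ranking[0], axis=1)
print(X_reduced.shape)      # (200, 9)
```

Whether the dropped feature is truly dispensable is still a business call, as the answer says; importance scores only suggest candidates for removal.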























                        3












                        $begingroup$

                        The two are very different: feature selection indeed reduces dimensions, but feature extraction adds dimensions that are computed from other features.

                        For panel or time-series data, one usually has a datetime variable, and one does not want to train the model on the dates themselves, as those exact dates do not occur in the future. So you should eliminate the datetime column: feature elimination.

                        On the other hand, whether a day is a weekday or a weekend day may be very relevant, so we need to compute the weekday status from the datetime: feature extraction.






                        $endgroup$


















                            answered Mar 13 '18 at 14:41
                            – vinnief
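A minimal pandas sketch of this answer's datetime example (the column names and sample dates are made up for illustration): extract a weekend flag from the timestamp, then eliminate the raw datetime before modelling.

```python
# Sketch: feature extraction (weekend flag) + feature elimination (datetime).
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2018-03-10", "2018-03-12", "2018-03-17"]),
    "sales": [120, 80, 150],
})

# Feature extraction: compute weekday status from the datetime
# (dayofweek: Monday=0 ... Sunday=6, so >= 5 means weekend).
df["is_weekend"] = df["timestamp"].dt.dayofweek >= 5

# Feature elimination: drop the raw datetime before training.
df = df.drop(columns=["timestamp"])

print(df)   # sales and is_weekend columns only
```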























                                3












                                $begingroup$

                                As Aditya said, there are three feature-related terms that are sometimes confused with each other. I will try to give a summary explanation of each:


                                • Feature extraction: generation of features from data that is in a format that is difficult to analyse directly or not directly comparable (e.g. images, time series). For a time series, some simple features could be, for example: length of the series, period, mean value, standard deviation, etc.

                                • Feature transformation: transformation of existing features in order to create new ones based on the old ones. A very popular technique for dimensionality reduction is Principal Component Analysis (PCA), which uses an orthogonal transformation to produce a set of linearly uncorrelated variables from the initial set of variables.

                                • Feature selection: selection of the features with the highest "importance"/influence on the target variable, from a set of existing features. This can be done with various techniques: e.g. linear regression, decision trees, or calculation of "importance" weights (e.g. Fisher score, ReliefF).

                                If the only thing you want to achieve is dimensionality reduction in an existing dataset, you can use either feature transformation or feature selection methods. But if you need the physical interpretation of the features you identify as "important", or you are trying to limit the amount of data that needs to be collected for your analysis (feature transformation requires the full initial set of features), then only feature selection will work.

                                You can find more details on feature selection and dimensionality reduction in the following links:


                                • A summary of Dimension Reduction methods

                                • Classification and Feature Selection: A Review

                                • Relevant question and answers in Stack Overflow







                                $endgroup$


















                                    answered Mar 13 '18 at 14:05, edited Mar 13 '18 at 14:41 by Aditya
                                    – missrg
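A short illustration of the feature-transformation bullet above, assuming scikit-learn's PCA and the Iris dataset (both my choices for the sketch): the four correlated measurements are transformed into two linearly uncorrelated components.

```python
# Sketch: feature transformation via PCA for dimensionality reduction.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data            # 150 samples, 4 original features

# Orthogonal transformation to 2 uncorrelated components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                      # (150, 2)
print(pca.explained_variance_ratio_.sum())  # fraction of variance kept
```

Note that the new components are linear combinations of all four inputs, which is why, as the answer says, transformation still requires collecting every original feature, and the components lose the physical interpretation of the originals.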





























