Technical name for this data wrangling process? Multiple columns into multi-factor single column

What is the technical name for the following data wrangling process? I want to collapse Table A into Table B. (To make the data suitable for ANOVA.)

Table A:

ArmyVet_ID  Served_WW2  Served_KoreanWar  Served_VietnamWar
110001      1           0                 0
110002      1           0                 0
110004      0           1                 0
110005      0           1                 0
110009      0           0                 1
110010      0           0                 1

Table B:

ArmyVet_ID  Served
110001      WW2
110002      WW2
110004      KoreanWar
110005      KoreanWar
110009      VietnamWar
110010      VietnamWar

Also, the question of how to do this conversion in R has been asked to death on Stack Overflow, but there seem to be far too many ways to do it. If anyone has figured out the best way (quickest, easiest), I'd appreciate pointers.

Update after correct answer marked below: It turns out that Table A is called "wide format" and B is called "long format".
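
The collapse from Table A to Table B can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the method from any answer: the helper name `collapse_indicators` and the reliance on a `Served_` column prefix are hypothetical, and the rows are copied from Table A. Note that a full wide-to-long reshape would emit one row per veteran and war; Table B keeps only the pairs whose flag is 1.

```python
# Table A from the question: one 0/1 indicator column per war (wide format).
wide_rows = [
    {"ArmyVet_ID": 110001, "Served_WW2": 1, "Served_KoreanWar": 0, "Served_VietnamWar": 0},
    {"ArmyVet_ID": 110002, "Served_WW2": 1, "Served_KoreanWar": 0, "Served_VietnamWar": 0},
    {"ArmyVet_ID": 110004, "Served_WW2": 0, "Served_KoreanWar": 1, "Served_VietnamWar": 0},
    {"ArmyVet_ID": 110005, "Served_WW2": 0, "Served_KoreanWar": 1, "Served_VietnamWar": 0},
    {"ArmyVet_ID": 110009, "Served_WW2": 0, "Served_KoreanWar": 0, "Served_VietnamWar": 1},
    {"ArmyVet_ID": 110010, "Served_WW2": 0, "Served_KoreanWar": 0, "Served_VietnamWar": 1},
]


def collapse_indicators(rows, id_col="ArmyVet_ID", prefix="Served_"):
    """Hypothetical helper: emit one (id, level) record per indicator column set to 1."""
    long_rows = []
    for row in rows:
        for col, flag in row.items():
            # Only columns named like "Served_<war>" are treated as indicators (assumption).
            if col.startswith(prefix) and flag == 1:
                long_rows.append({id_col: row[id_col], "Served": col[len(prefix):]})
    return long_rows


long_rows = collapse_indicators(wide_rows)
for rec in long_rows:
    print(rec["ArmyVet_ID"], rec["Served"])
```

If a veteran had served in more than one war, this sketch would produce one row per war for that veteran rather than picking a single value.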







Tags: r, dataset, data-cleaning, data-wrangling

asked Jun 7 '16 at 1:42 by thanks_in_advance (edited Jun 7 '16 at 3:04)

• The answer given is right, but "wide format to long format" might be even more specific. Also, I recommend reshape2 instead of reshape; reshape is, as I understand it, underdocumented.
  – Shape, Jun 8 '16 at 22:42

• @Shape Do you know of any more synonyms for "reshape"? It shocks me that when I search Google for "reshape" together with Weka or RapidMiner, I get nothing.
  – thanks_in_advance, Jun 9 '16 at 6:08

• It's "normalization" in database theory.
  – Diego, Jun 9 '16 at 8:55

2 Answers

It is usually called reshaping. For a great description of the process, see this walkthrough, or read Hadley Wickham's documentation for the reshape package.

answered Jun 7 '16 at 2:13 by Kyle.

    # pandas: for each row, take the label of the indicator column that equals 1
    df['Served'] = (df.iloc[:, 1:] == 1).idxmax(axis=1)

answered 2 days ago by Gene Xu (edited 2 days ago by I_Play_With_Data)

• Welcome to the site! I have submitted an edit to your answer so that it displays as properly formatted code using Markdown.
  – I_Play_With_Data, 2 days ago
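
For readers who want to try the pandas one-liner from the answer above, here is a small self-contained sketch. It rebuilds Table A by hand and adds two things that are assumptions introduced here rather than part of the answer: a check that every row has exactly one flag set (otherwise `idxmax` quietly returns the first matching column), and stripping the `Served_` prefix so the values match Table B. The names `indicator_cols`, `flags` and `table_b` are illustrative only.

```python
import pandas as pd

# Table A from the question (wide format, one 0/1 column per war).
df = pd.DataFrame({
    "ArmyVet_ID": [110001, 110002, 110004, 110005, 110009, 110010],
    "Served_WW2": [1, 1, 0, 0, 0, 0],
    "Served_KoreanWar": [0, 0, 1, 1, 0, 0],
    "Served_VietnamWar": [0, 0, 0, 0, 1, 1],
})

# Assumption: indicator columns are exactly those prefixed with "Served_".
indicator_cols = [c for c in df.columns if c.startswith("Served_")]
flags = df[indicator_cols] == 1

# Guard (not in the original answer): idxmax only gives a meaningful
# result when each row has exactly one True value.
if not (flags.sum(axis=1) == 1).all():
    raise ValueError("each row must have exactly one indicator set to 1")

# idxmax(axis=1) returns, per row, the label of the first column holding the maximum (True).
served = flags.idxmax(axis=1).str.replace("Served_", "", regex=False)

table_b = pd.DataFrame({"ArmyVet_ID": df["ArmyVet_ID"], "Served": served})
print(table_b)
```

Without the prefix-stripping step, the `Served` column would hold the full column names such as `Served_WW2`.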



