Map list to bin numbers












Does the Wolfram Language (WL) have an equivalent of MATLAB's discretize or NumPy's digitize? That is, a function that takes a length-N list and a list of bin edges and returns a length-N list of bin numbers, mapping each list item to its bin number?







list-manipulation data






– Alan, asked yesterday (edited 23 hours ago by user64494)












  • HistogramList seems similar. This could also be done efficiently with GroupBy and a small Compile-d selection function. Or sort the data first, then write something that only checks the next bin up. Again, that can easily be Compile-d. – b3m2a1, yesterday

  • I need it to work like a map (in terms of the order of the items in the resulting list). Of course it is possible to write something ... – Alan, yesterday

  • Related: 140577 – Carl Woll, yesterday

  • Did you try BinCounts? I guess it is what you need. – Rom38, yesterday

  • @Rom38 You probably meant BinLists, right? – Henrik Schumacher, 23 hours ago
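
To make the distinction raised in these comments concrete, here is a small sketch. The helper name binIndex is made up for illustration and is not a built-in. BinLists groups the values by bin, so the original order is lost, whereas the question asks for one bin number per input item. The convention assumed here is the one NumPy's digitize uses by default for increasing edges: the bin number is the count of edges that are less than or equal to the item.

```mathematica
data = {1, 5, 3};
edges = {0, 2, 4, 6};

(* BinLists groups the values by bin, so the input order is lost *)
BinLists[data, {edges}]
(* {{1}, {3}, {5}} *)

(* hypothetical helper: for each item, count the edges that are <= the item *)
binIndex[list_List, e_List] := Map[Function[x, Count[e, b_ /; b <= x]], list]

binIndex[data, edges]
(* {1, 3, 2} *)
```

This counting version scans every edge for every item, so it is only meant to pin down the expected result, not to be fast.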




























3 Answers

















Here's a version based on Nearest:

    (* digitize[edges] builds a reusable function; the NearestFunction is computed once *)
    digitize[edges_] := DigitizeFunction[edges, Nearest[edges -> "Index"]]
    digitize[data_, edges_] := digitize[edges][data]

    (* init is the index of the nearest edge; UnitStep corrects it depending on
       whether the data point lies at or above that edge *)
    DigitizeFunction[edges_, nf_NearestFunction][data_] := With[{init = nf[data][[All, 1]]},
      init + UnitStep[data - edges[[init]]] - 1
    ]

For example:

    SeedRandom[1]
    data = RandomReal[10, 10]
    digitize[data, {2, 4, 5, 7, 8}]

{8.17389, 1.1142, 7.89526, 1.87803, 2.41361, 0.657388, 5.42247, 2.31155, 3.96006, 7.00474}

{5, 0, 4, 0, 1, 0, 3, 1, 1, 4}

Note that I split the definition of digitize into two pieces, so that if you do this for multiple data sets with the same edges list, you only need to compute the nearest function once.







– Carl Woll, answered yesterday (edited yesterday)
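
As an illustration of the closing note in the Nearest-based answer, the following sketch builds the digitizing function once and applies it to two data sets. The definitions are repeated from that answer so the snippet is self-contained. The name df and the sample values are assumptions chosen for this example.

```mathematica
(* definitions repeated from the answer above *)
digitize[edges_] := DigitizeFunction[edges, Nearest[edges -> "Index"]]
digitize[data_, edges_] := digitize[edges][data]
DigitizeFunction[edges_, nf_NearestFunction][data_] :=
  With[{init = nf[data][[All, 1]]},
    init + UnitStep[data - edges[[init]]] - 1]

edges = {2, 4, 5, 7, 8};
df = digitize[edges];      (* the NearestFunction is built once, here *)

df[{1.5, 2., 6.3}]         (* {0, 1, 3} *)
df[{9., 4.5}]              (* {5, 2} *)
```

A value that sits exactly on an edge (2. above) is placed in the bin that starts at that edge, and values below the first edge get bin number 0.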


























This is very quick-and-dirty, but may serve as a simple example.

This creates a piecewise function following the first definition in MATLAB's discretize documentation, then applies it to the data.

    disc[data_, edges_] := Module[{e = Partition[edges, 2, 1], p, l, x},
      l = Length@e;
      (* bins 1 .. l-1 include their left edge only; the last bin includes both edges *)
      p = Piecewise[
        Append[
          Table[{i, e[[i, 1]] <= x < e[[i, 2]]}, {i, l - 1}],
          {l, e[[l, 1]] <= x <= e[[l, 2]]}],
        "NaN"];  (* values outside every bin map to "NaN" *)
      Table[p, {x, data}]];

Using the first example from that documentation:

    data = {1, 1, 2, 3, 6, 5, 8, 10, 4, 4};
    edges = {2, 4, 6, 8, 10};

    disc[data, edges]

{NaN, NaN, 1, 1, 3, 2, 4, 4, 2, 2}

I'm sure there are more efficient or elegant solutions, and I will revisit this as time permits.







– ciao, answered yesterday (edited 20 hours ago)
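
The bin rule described in the piecewise answer can be probed at the boundaries with a few hand-picked values. This is only a sketch: the definition is repeated so the snippet runs on its own, x is kept local to the Module (an adjustment made here so that a global value of x cannot interfere), and the test values are assumptions chosen to sit on or just beside the edges.

```mathematica
(* piecewise discretizer, repeated from the answer above with x kept local *)
disc[data_, edges_] := Module[{e = Partition[edges, 2, 1], p, l, x},
  l = Length@e;
  p = Piecewise[
    Append[
      Table[{i, e[[i, 1]] <= x < e[[i, 2]]}, {i, l - 1}],
      {l, e[[l, 1]] <= x <= e[[l, 2]]}],
    "NaN"];
  Table[p, {x, data}]];

(* 2 and 4 sit on left edges, 10 is the right edge of the last bin, 10.5 is out of range *)
disc[{2, 3.999, 4, 10, 10.5}, {2, 4, 6, 8, 10}]
(* {1, 1, 2, 4, "NaN"} *)
```

Unlike the Nearest-based version, out-of-range values are reported as the string "NaN" rather than as bin 0 or a bin number past the last edge.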


























You may also use Interpolation with InterpolationOrder -> 0. However, employing Nearest as Carl Woll did will usually be much faster.

First, we prepare the interpolating function.

    m = 20;
    (* m + 1 sorted bin boundaries on [-1, 1] *)
    binboundaries = Join[{-1.}, Sort[RandomReal[{-1, 1}, m - 1]], {1.}];

    (* piecewise-constant interpolant that maps positions to bin numbers *)
    f = Interpolation[Transpose[{binboundaries, Range[0, m]}], InterpolationOrder -> 0];

Now you can apply it to lists of values as follows:

    vals = RandomReal[{-1, 1}, 1000];
    Round[f[vals]]






– Henrik Schumacher, answered 22 hours ago
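
The remark that Nearest will usually be much faster can be checked on your own machine with a rough timing sketch like the one below. The sample size 10^6, the seed, and the names nf, tInterp and tNearest are assumptions made for this example. The Nearest-based expression is the one from Carl Woll's answer, written inline. Timings depend on the machine and version, so no figures are quoted here.

```mathematica
SeedRandom[1];
m = 20;
binboundaries = Join[{-1.}, Sort[RandomReal[{-1, 1}, m - 1]], {1.}];
vals = RandomReal[{-1, 1}, 10^6];

(* zero-order interpolation approach *)
f = Interpolation[Transpose[{binboundaries, Range[0, m]}], InterpolationOrder -> 0];
tInterp = First@AbsoluteTiming[Round[f[vals]];];

(* Nearest-based approach, same logic as DigitizeFunction above *)
nf = Nearest[binboundaries -> "Index"];
tNearest = First@AbsoluteTiming[
    With[{init = nf[vals][[All, 1]]},
      init + UnitStep[vals - binboundaries[[init]]] - 1];];

{tInterp, tNearest}   (* seconds; compare the two entries *)
```

Both timings exclude the one-time setup of f and nf, which matters mainly when the same edges are reused for many data sets.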





























