Multiple filtering pandas columns based on values in another column

I have a pandas DataFrame df1:

[image: df1]

Now I want to filter the rows of df1 based on the unique combinations of (Campaign, Merchant) in another DataFrame, df2, which looks like this:

[image: df2]

I tried using .isin, with code similar to this:

df1.loc[df1['Campaign'].isin(df2['Campaign']) &
        df1['Merchant'].isin(df2['Merchant'])]

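The problem is that the two conditions are independent. For example, I want to check whether the pair (A, 1) from df2 occurs in df1, but the condition above tests each column against the whole list of values rather than row by row. It therefore returns every row of df1 whose Campaign appears anywhere in df2 and whose Merchant appears anywhere in df2, even when that particular (Campaign, Merchant) pair never occurs together in df2.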
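To make the problem concrete, here is a minimal sketch. The original frames were posted as images, so the values below are made up for illustration; only the column names Campaign and Merchant come from the question.

```python
import pandas as pd

# Hypothetical toy data (not the frames from the question).
df1 = pd.DataFrame({"Campaign": ["A", "A", "B", "B"],
                    "Merchant": [1, 2, 1, 2]})
df2 = pd.DataFrame({"Campaign": ["A", "B"],
                    "Merchant": [1, 2]})

# Each column is tested on its own against all values in df2.
mask = (df1["Campaign"].isin(df2["Campaign"])
        & df1["Merchant"].isin(df2["Merchant"]))
independent = df1.loc[mask]

# df2 only contains the pairs (A, 1) and (B, 2), yet (A, 2) and (B, 1)
# also pass, because "A", "B", 1 and 2 each occur somewhere in df2.
print(independent)
```

All four rows of this toy df1 pass the filter, although only two of them match a pair that actually exists in df2.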

Do you have any suggestions for filtering on several columns jointly in pandas?

asked yesterday by Remus Raphael; edited 18 hours ago by tuomastik

python pandas

1 Answer

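import numpy as np
import pandas as pd

# Example data: df1 holds the rows to filter, df2 the reference combinations.
# (pd.np was removed in pandas 2.0, so NumPy is imported directly.)
df1 = pd.DataFrame({"Random numbers 1": np.random.randn(6),
                    "Campaign": ["A"] * 5 + ["B"],
                    "Merchant": [1, 1, 1, 2, 3, 1]})

df2 = pd.DataFrame({"Random numbers 2": np.random.randn(6),
                    "Campaign": ["A"] * 2 + ["B"] * 2 + ["C"] * 2,
                    "Merchant": [1, 2, 1, 2, 1, 2]})

# Stack the unique (Campaign, Merchant) pairs of both frames.
columns_consider = ["Campaign", "Merchant"]
combined = pd.concat((df1[columns_consider].drop_duplicates(),
                      df2[columns_consider].drop_duplicates()), ignore_index=True)

# A pair that appears twice in the stack is present in both df1 and df2.
identical = combined[combined.duplicated()]

print(identical)

Output:

  Campaign  Merchant
4        A         1
5        A         2
6        B         1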
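The snippet above yields the (Campaign, Merchant) pairs shared by both frames rather than the matching rows of df1. If the rows themselves are wanted, one possible sketch, not part of the original answer, is to match on both columns jointly, either with an inner merge or with a boolean mask built from tuples. The names keys, pairs, wanted, filtered_merge and filtered_mask are illustrative, and a seeded generator is used only to make the run repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability

df1 = pd.DataFrame({"Random numbers 1": rng.standard_normal(6),
                    "Campaign": ["A"] * 5 + ["B"],
                    "Merchant": [1, 1, 1, 2, 3, 1]})
df2 = pd.DataFrame({"Random numbers 2": rng.standard_normal(6),
                    "Campaign": ["A"] * 2 + ["B"] * 2 + ["C"] * 2,
                    "Merchant": [1, 2, 1, 2, 1, 2]})

keys = ["Campaign", "Merchant"]
pairs = df2[keys].drop_duplicates()  # unique combinations in df2

# Option 1: inner merge on both key columns at once.
# The result gets a fresh 0..n-1 index.
filtered_merge = df1.merge(pairs, on=keys, how="inner")

# Option 2: boolean mask over (Campaign, Merchant) tuples.
# This keeps df1's original index labels.
wanted = list(pairs.itertuples(index=False, name=None))
mask = df1.set_index(keys).index.isin(wanted)
filtered_mask = df1[mask]

print(filtered_mask)
```

With this data only the row holding (A, 3) is dropped, since that pair does not occur in df2. The merge variant is convenient when columns from df2 should be carried along; the mask variant is handier when df1's index has to stay intact.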

answered 23 hours ago by tuomastik
