Finding P value

Finding P value - Explain

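from scipy import stats

def get_pvalue(con_conv, test_conv, con_size, test_size):
    # Negative absolute difference in conversion rates (a point in the lower tail)
    lift = -abs(test_conv - con_conv)

    # Variance of each group's sample proportion: p * (1 - p) / n
    scale_one = con_conv * (1 - con_conv) * (1 / con_size)
    scale_two = test_conv * (1 - test_conv) * (1 / test_size)

    # Standard deviation (standard error) of the difference between the two rates
    scale_val = (scale_one + scale_two) ** 0.5

    # Two-sided p-value from a normal distribution centered at 0
    p_value = 2 * stats.norm.cdf(lift, loc=0, scale=scale_val)

    return p_value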

I have this function and I would like to know what it is actually doing and how it is actually calculating the p-value.

It is used to test the difference between the conversion rates of the control and test groups in an A/B test.

con_conv --> Conversion rate for control group

test_conv --> Conversion rate for test group

con_size --> sample size of the control group

test_size --> sample size of the test group

I understand that scale_one and scale_two calculate the variance for each group, but I don't understand why the two are added together to get the standard deviation, or why the cdf is multiplied by 2 to get the p_value.

edited 18 hours ago

Stephen Rauch♦

1,52551330

asked 20 hours ago

Kartikeya Sharma

101

New contributor

python statistics


1 Answer

p_value = 2 * stats.norm.cdf(lift, loc = 0, scale = scale_val )

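This line is the key to your question. The p-value is the probability, assuming the null hypothesis is true, of observing a difference at least as extreme as the one you measured. It is not the probability that the null hypothesis is true.

Null hypothesis: the two groups have the same conversion rate.
Alternative hypothesis: the conversion rates of the two groups differ.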

Because the test assumes (among other things) that the difference in conversion rates is approximately normally distributed, the observed difference is evaluated against a normal distribution.

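The function stats.norm.cdf(lift, loc=0, scale=scale_val) returns the probability that a normal variable with mean 0 and standard deviation scale_val takes a value less than or equal to lift. Since lift is the negative absolute difference, this is the lower-tail probability of seeing a difference at least that far below zero. Under the null hypothesis the expected difference is zero, so a p-value below 0.01 tells us that a difference this large would be very unlikely if the groups were equal, which is evidence that the groups differ.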
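To make the role of stats.norm.cdf more concrete, here is a minimal sketch that computes the same quantity in two ways: with the scaled normal from the question, and with a standardized z-score on the standard normal. The function name get_pvalue_z and the input numbers are hypothetical, chosen only for illustration.

```python
from scipy import stats

def get_pvalue(con_conv, test_conv, con_size, test_size):
    # Same calculation as in the question
    lift = -abs(test_conv - con_conv)
    scale_one = con_conv * (1 - con_conv) / con_size
    scale_two = test_conv * (1 - test_conv) / test_size
    scale_val = (scale_one + scale_two) ** 0.5
    return 2 * stats.norm.cdf(lift, loc=0, scale=scale_val)

def get_pvalue_z(con_conv, test_conv, con_size, test_size):
    # Hypothetical alternative: standardize first, then use the standard normal
    se = (con_conv * (1 - con_conv) / con_size
          + test_conv * (1 - test_conv) / test_size) ** 0.5
    z = abs(test_conv - con_conv) / se
    # sf(z) is the upper-tail probability 1 - cdf(z)
    return 2 * stats.norm.sf(z)

# Made-up example: 10% vs 12% conversion, 5000 users per group
p_scaled = get_pvalue(0.10, 0.12, 5000, 5000)
p_zscore = get_pvalue_z(0.10, 0.12, 5000, 5000)
print(p_scaled, p_zscore)
```

Both calls should print the same value up to floating-point error, since dividing the difference by the standard error and using a standard normal is equivalent to passing scale=scale_val.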

The factor of 2 comes from the test being two-tailed: the difference between groups can be A greater than B or B greater than A, so you count the probability in both tails. Because the normal distribution is symmetric, the two tails are equal and you can simply double one of them.
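As a small illustration of the two-tailed idea, the sketch below computes the lower and upper tail probabilities separately and compares their sum with twice the lower tail. The values of se and diff are made up for the example.

```python
from scipy import stats

se = 0.005    # hypothetical standard error of the difference
diff = 0.012  # hypothetical observed difference in conversion rates

# Probability of a difference at least this far below zero
lower_tail = stats.norm.cdf(-abs(diff), loc=0, scale=se)

# Probability of a difference at least this far above zero
upper_tail = stats.norm.sf(abs(diff), loc=0, scale=se)

# Two-sided p-value: both directions count as "extreme"
two_sided = lower_tail + upper_tail

print(lower_tail, upper_tail, two_sided, 2 * lower_tail)
```

By symmetry of the normal distribution the two tails match, which is why the function in the question only evaluates the lower tail and multiplies by 2.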

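The variances (not the standard deviations) are added because of the rule:
$Var(X - Y) = Var(X) + Var(Y)$ if $X$ and $Y$ are independent.
The square root of that sum is the standard deviation of the difference, which is what scale_val holds.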
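The variance rule can be checked with a quick simulation. This is only a sketch under assumed values: the true rates, sample sizes, and number of simulated experiments are invented, and the two groups are drawn independently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true conversion rates and sample sizes
p_con, p_test = 0.10, 0.12
n_con, n_test = 2000, 3000
n_sims = 200_000

# Simulate many independent A/B tests and record each group's observed rate
con_rates = rng.binomial(n_con, p_con, size=n_sims) / n_con
test_rates = rng.binomial(n_test, p_test, size=n_sims) / n_test

# Variance of the observed difference across simulated experiments
empirical_var = np.var(test_rates - con_rates)

# Sum of the two per-group variances, as in scale_one + scale_two
theoretical_var = p_con * (1 - p_con) / n_con + p_test * (1 - p_test) / n_test

print(empirical_var, theoretical_var)
```

The two printed numbers should agree closely, which is the reason scale_val is the square root of scale_one + scale_two rather than a sum of standard deviations.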

edited 19 hours ago

answered 20 hours ago

Juan Esteban de la Calle

938

New contributor
