Ambiguity in Perceptron loss function



























Bishop's Perceptron loss




On one hand, equation 4.54 of Chris Bishop's book (Pattern Recognition and Machine Learning) states that the loss function of the perceptron algorithm is given by:



$$E_P(\mathbf{w}) = -\sum\limits_{n \in M} \mathbf{w}^T \phi_n t_n$$



where $M$ denotes the set of all misclassified data points.
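To make this concrete, here is a minimal numpy sketch of Bishop's form; the function name, the `Phi` matrix, and the assumption that the targets are coded as $t_n \in \{-1, +1\}$ are my own conventions for illustration, not quoted from the book:

import numpy as np

def bishop_perceptron_loss(w, Phi, t):
    # Phi: (N, D) array whose rows are the feature vectors phi_n
    # t:   (N,) array of targets coded as -1 or +1
    scores = Phi @ w           # w^T phi_n for every sample
    wrong = scores * t <= 0    # misclassified: w^T phi_n t_n is not positive
    return -np.sum(scores[wrong] * t[wrong])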




Original Perceptron loss




On the other hand, the loss function used in the original perceptron paper by Frank Rosenblatt is given (according to Wikipedia) by:



$$\frac{1}{s}\sum\limits_{j=1}^{s} \left| d_j - y_j(t) \right|$$
which, when translated into the notation of Bishop's book, becomes:



$$\frac{1}{N}\sum\limits_{n=1}^{N} \left| t_n - \mathbf{w}^T \phi_n \right|$$
where $N$ denotes the number of data points.
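For comparison, a literal numpy reading of this translated formula could look like the sketch below (again, the names follow the conventions I used above; note that Rosenblatt's $y_j(t)$ is the unit's thresholded output, so one might also pass the scores through a sign/step function before subtracting):

import numpy as np

def rosenblatt_loss(w, Phi, t):
    # Mean absolute difference between targets and raw scores, read literally
    # from the formula above; np.sign(Phi @ w) would give the thresholded output.
    return np.mean(np.abs(t - Phi @ w))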




My question




My question is: why is Bishop's version of the perceptron loss different from the one in the original paper? Considering that Bishop's book is highly recognized in the machine learning field, can we call it Bishop's perceptron?!




Scikit-learn's implementation




By the way, it seems that scikit-learn uses Bishop's version of the perceptron loss (scikit-learn documentation). This is apparent from the following formula and figure:



-np.minimum(xx, 0)


which for one sample reduces to:
$$-\min\left(0,\ \mathbf{w}^T \phi_n t_n\right)$$



[Figure: plot of loss functions from the scikit-learn documentation, including the perceptron loss.]
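Assuming the same conventions as the sketches above, summing this per-sample term reproduces Bishop's loss, since correctly classified points contribute zero and misclassified ones contribute $-\mathbf{w}^T\phi_n t_n$:

import numpy as np

def sklearn_style_perceptron_loss(w, Phi, t):
    # -min(0, w^T phi_n t_n) for every sample
    margins = (Phi @ w) * t
    return -np.minimum(0.0, margins)

# The sum over samples equals Bishop's E_P(w) from the earlier sketch:
# np.sum(sklearn_style_perceptron_loss(w, Phi, t)) == bishop_perceptron_loss(w, Phi, t)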



























      machine-learning classification perceptron






asked yesterday by pythinker (edited 15 hours ago)






















          1 Answer



















I managed to find Bishop's version by unearthing Rosenblatt's 1962 book Principles of Neurodynamics (page 110), so Wikipedia's version must be the alternative one.







          It is worth noting that the book also has a chapter on error back-propagation (page 292), which resembles the Wikipedia version, but I think it is not exactly the same.










answered 7 hours ago by Esmailian









• pythinker (6 hours ago): Thanks a bunch for your detailed and painstaking response. I'm really surprised by the amount of searching you did to resolve the issue.

• Esmailian (6 hours ago): @pythinker My pleasure! I'm glad I could help. This detective stuff is fun ;)

• pythinker (6 hours ago): I'm pleased to hear that you are interested in this type of question, so I will try to pose more questions like this :)











