Delete all lines which don't have n characters before delimiter


I have a very long text file (from here) in which each line should contain 6 hexadecimal characters, then a 'break' (which appears as one character and doesn't seem to show up properly in the code markdown below), followed by a few words:

00107B  Cisco Systems, Inc
00906D  Cisco Systems, Inc
0090BF  Cisco Systems, Inc
5080    Cisco Systems, Inc
0E+00   ASUSTek COMPUTER INC.
000C6E  ASUSTek COMPUTER INC.
001BFC  ASUSTek COMPUTER INC.
001E8C  ASUSTek COMPUTER INC.
0015F2  ASUSTek COMPUTER INC.
2354    ASUSTek COMPUTER INC.
001FC6  ASUSTek COMPUTER INC.
60182E  ShenZhen Protruly Electronic Ltd co.
F4CFE2  Cisco Systems, Inc
501CBF  Cisco Systems, Inc

I've done some looking around and can't see something which would work in this situation. My question is, how can I use grep/sed/awk/perl to delete all lines of this text file which do not start with exactly 6 hexadecimal characters and then a 'break'?

P.S. For bonus points, what's the best way of sorting the file alphabetically and numerically according to the hex characters (i.e. 000000 -> FFFFFF)? Should I just use sort?

asked 15 hours ago by Rocco; edited 13 hours ago by codeforester

text-processing sed grep text-formatting

2 Answers

$ awk '$1 ~ /^[[:xdigit:]]{6}$/' file
00107B  Cisco Systems, Inc
00906D  Cisco Systems, Inc
0090BF  Cisco Systems, Inc
000C6E  ASUSTek COMPUTER INC.
001BFC  ASUSTek COMPUTER INC.
001E8C  ASUSTek COMPUTER INC.
0015F2  ASUSTek COMPUTER INC.
001FC6  ASUSTek COMPUTER INC.
60182E  ShenZhen Protruly Electronic Ltd co.
F4CFE2  Cisco Systems, Inc
501CBF  Cisco Systems, Inc

This uses awk to extract the lines whose first field consists of exactly six hexadecimal digits. The [[:xdigit:]] pattern matches a single hexadecimal digit, and {6} requires six of them. Because the pattern is anchored to the start and end of the field with ^ and $ respectively, it only matches the wanted lines.

Redirect to some file to save it under a new name.

Note that this seems to work with GNU awk (commonly found on Linux), but not with awk on e.g. OpenBSD, or mawk.
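If the awk at hand rejects the command above, one possible workaround is to avoid both the {6} interval expression and the [[:xdigit:]] class. Which of the two the other implementations lack is an assumption here, not something verified. The sketch below checks the length of the first field and then rejects any field containing a non-hex character. The sample file and the tab delimiter are made up for illustration.

```shell
# Build a small sample file; the tab delimiter is an assumption about the real data.
sample=$(mktemp)
printf '00107B\tCisco Systems, Inc\n5080\tCisco Systems, Inc\n0E+00\tASUSTek COMPUTER INC.\n000C6E\tASUSTek COMPUTER INC.\n' > "$sample"

# Keep a line only if its first field is 6 characters long and
# contains no character outside 0-9, A-F, a-f.
kept=$(awk 'length($1) == 6 && $1 !~ /[^0-9A-Fa-f]/' "$sample")
printf '%s\n' "$kept"
rm -f "$sample"
```

Only the 00107B and 000C6E lines should be printed, since 5080 and 0E+00 fail the length test.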

A similar approach with sed:

$ sed -n '/^[[:xdigit:]]\{6\}\>/p' file
00107B  Cisco Systems, Inc
00906D  Cisco Systems, Inc
0090BF  Cisco Systems, Inc
000C6E  ASUSTek COMPUTER INC.
001BFC  ASUSTek COMPUTER INC.
001E8C  ASUSTek COMPUTER INC.
0015F2  ASUSTek COMPUTER INC.
001FC6  ASUSTek COMPUTER INC.
60182E  ShenZhen Protruly Electronic Ltd co.
F4CFE2  Cisco Systems, Inc
501CBF  Cisco Systems, Inc

In this expression, \> is used to match the end of the hexadecimal number. This ensures that longer numbers are not matched. The \> pattern matches the end of a word, i.e. the zero-width position between a word character and a following non-word character.
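Since the question asks to delete the unwanted lines, the same filter can also be written with sed's d command. Word-boundary operators are not part of POSIX regular expressions, so this sketch matches the delimiter explicitly with [[:space:]] instead. That choice assumes the 'break' in the real file is a tab or a space. The sample data, including the over-long 001FC6AA entry, is invented for illustration.

```shell
# Sample data; tab delimiter and the over-long third entry are made up.
sample=$(mktemp)
printf '00107B\tCisco Systems, Inc\n5080\tCisco Systems, Inc\n001FC6AA\tToo long\n60182E\tShenZhen Protruly Electronic Ltd co.\n' > "$sample"

# Delete (d) every line that does NOT (!) start with six hex digits
# followed by a whitespace character.
cleaned=$(sed '/^[[:xdigit:]]\{6\}[[:space:]]/!d' "$sample")
printf '%s\n' "$cleaned"
rm -f "$sample"
```

Only the 00107B and 60182E lines survive; the eight-digit entry is dropped because its seventh character is not whitespace.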

For sorting the resulting data, just pipe the result through sort, or sort -f if your hexadecimal numbers use both upper and lower case letters.
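As a rough sketch of the sorting step: once every prefix is exactly six uppercase hex digits, plain byte-order sorting already yields numeric order, because 0-9 precede A-F in ASCII. Setting LC_ALL=C is an extra precaution added here (not part of the answer) so the result does not depend on the locale's collation rules. The sample file is made up.

```shell
# Unsorted sample data; the tab delimiter is an assumption.
sample=$(mktemp)
printf 'F4CFE2\tCisco Systems, Inc\n00906D\tCisco Systems, Inc\n60182E\tShenZhen Protruly Electronic Ltd co.\n000C6E\tASUSTek COMPUTER INC.\n' > "$sample"

# LC_ALL=C makes sort compare raw bytes, so fixed-width uppercase hex
# sorts in numeric order (0-9 come before A-F in ASCII).
sorted=$(LC_ALL=C sort "$sample")
printf '%s\n' "$sorted"
rm -f "$sample"
```

The output starts with 000C6E and ends with F4CFE2. With mixed-case prefixes, sort -f folds lower case to upper case before comparing.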

answered 15 hours ago by Kusalananda♦; edited 14 hours ago

Perfect, thank you very much. Exactly what I was looking for!

– Rocco
14 hours ago


And for completeness, you can do this with grep too:

$ grep -E '^[[:xdigit:]]{6}\b' oui.txt
00107B  Cisco Systems, Inc
00906D  Cisco Systems, Inc
0090BF  Cisco Systems, Inc
000C6E  ASUSTek COMPUTER INC.
001BFC  ASUSTek COMPUTER INC.
001E8C  ASUSTek COMPUTER INC.
0015F2  ASUSTek COMPUTER INC.
001FC6  ASUSTek COMPUTER INC.
60182E  ShenZhen Protruly Electronic Ltd co.
F4CFE2  Cisco Systems, Inc
501CBF  Cisco Systems, Inc

This extended grep expression searches for exactly 6 hex digits at the beginning of each line, followed immediately by a word boundary (\b), here the transition from the last hex digit to the whitespace delimiter.
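Before overwriting anything, it may be worth looking at what the filter would throw away. The sketch below uses grep -v to list the rejected lines and grep -c to count the kept ones. It matches the delimiter with [[:space:]] rather than a word boundary, which assumes the 'break' is a tab or space; the sample file is invented.

```shell
# Sample data; the tab delimiter is an assumption about the real file.
sample=$(mktemp)
printf '0090BF\tCisco Systems, Inc\n2354\tASUSTek COMPUTER INC.\n0E+00\tASUSTek COMPUTER INC.\n501CBF\tCisco Systems, Inc\n' > "$sample"

# -v inverts the match: print the lines that would be dropped.
rejected=$(grep -vE '^[[:xdigit:]]{6}[[:space:]]' "$sample")
# -c counts matching lines instead of printing them.
kept_count=$(grep -cE '^[[:xdigit:]]{6}[[:space:]]' "$sample")
printf 'rejected:\n%s\nkept: %s\n' "$rejected" "$kept_count"
rm -f "$sample"
```

Here the 2354 and 0E+00 lines are reported as rejected and two lines are counted as kept.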

answered 9 hours ago by Digital Trauma
