Update: sortcanon.py Version 0.0.3
2023-8-28 01:44:56 Author: blog.didierstevens.com

This update adds some new options to my tool sortcanon.py, so that it can handle more kinds of input.

A bit of context: when a list of IPv4 addresses is sorted as text, the result is not in numeric order. Take this list:

Sorting it as plain text gives this result:

The IPv4 address starting with 185 comes first because, by default, sorting is string-based and the digit 1 comes before the digit 3.

With sortcanon, one can provide a Python function that is used to interpret the input and achieve the desired sort order. There are a couple of built-in functions, like ipv4. Sorting with ipv4 gives this result:

This time, the IPv4 address starting with 185 comes last, because its most significant byte is the largest.
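
The difference between the two orderings can be reproduced in a few lines of plain Python. This is only a minimal sketch of the idea, not sortcanon's own code: the sample addresses are made up (they are not the list from the original screenshots), and the helper name ipv4_key is hypothetical. It uses the standard library's ipaddress module to turn each address into an integer sort key.

```python
import ipaddress

# Hypothetical sample addresses, chosen only so that one starts with 185
# and the others start with a higher first digit.
addresses = ["185.0.2.10", "34.0.2.7", "53.0.2.1"]


def ipv4_key(text):
    """Convert a dotted-quad string to an integer so it sorts numerically."""
    return int(ipaddress.IPv4Address(text.strip()))


# Default sort: compares the strings character by character.
text_sorted = sorted(addresses)

# Numeric sort: compares the addresses as 32-bit integers.
numeric_sorted = sorted(addresses, key=ipv4_key)

print(text_sorted)     # ['185.0.2.10', '34.0.2.7', '53.0.2.1']
print(numeric_sorted)  # ['34.0.2.7', '53.0.2.1', '185.0.2.10']
```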

Recently, I had to sort some files with extra data, like IPv4 addresses with port numbers. Something like this list:

But this did not work:

This is because the function that parses IPv4 addresses does not expect a port number.

I could create a custom function to handle this, but I pursued another solution. I added an option that uses a regular expression to select the part of the line that will be used for sorting. This is done with option -s (select). Like this:

Regular expression "^([^ ]+) " selects all characters from the beginning of the line (^) up to, but not including, the first space character. This selection is stored in a capture group (), and the ipv4 sorting function takes this capture group as input, instead of the complete line.
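
To make the capture group idea concrete, here is a minimal Python sketch of sorting on a regex selection. It is an illustration under assumptions, not the tool's implementation: the input lines are invented, the names SELECT and selected_ipv4_key are hypothetical, and raising an error for a line without a match is my own choice (how sortcanon treats such lines is not covered here).

```python
import ipaddress
import re

# Hypothetical "address port" lines, not the data from the original post.
lines = ["185.0.2.10 443", "53.0.2.1 8080", "34.0.2.7 22", "53.0.2.1 53"]

# Same regular expression as in the text: everything up to the first space.
SELECT = re.compile(r"^([^ ]+) ")


def selected_ipv4_key(line):
    """Sort key: the captured address, converted to an integer."""
    match = SELECT.search(line)
    if match is None:
        # Assumption: treat a line without a selection as an error.
        raise ValueError(f"no selection in line: {line!r}")
    return int(ipaddress.IPv4Address(match.group(1)))


# The complete line is kept in the output; only the key uses the selection.
sorted_lines = sorted(lines, key=selected_ipv4_key)

for line in sorted_lines:
    print(line)
```

Because Python's sort is stable, the two lines with the same address keep their original relative order.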

The list I selected as an example has some duplicate IPv4 addresses:

If I use option -u (unique), duplicate lines are removed:

But of course the lines with identical IPv4 address 53… remain, because the lines themselves are different (different port number).

This is the desired result, most of the time. But I had an exceptional case where I had to drop duplicate IPv4 addresses, but still keep one port number. This can be done with option --selectoptions u:
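
The two kinds of uniqueness can be contrasted with a short Python sketch. Again, this is an assumption-laden illustration rather than sortcanon's code: the lines are invented, the helper names are hypothetical, and keeping the first line seen for each address is my own choice (which port number the tool keeps is not stated here).

```python
import ipaddress
import re

# Hypothetical input with one exact duplicate line and one repeated address.
lines = [
    "53.0.2.1 8080",
    "185.0.2.10 443",
    "53.0.2.1 8080",  # exact duplicate of the first line
    "53.0.2.1 53",    # same address, different port
]

SELECT = re.compile(r"^([^ ]+) ")


def select(line):
    """Return the part of the line captured by the regular expression."""
    return SELECT.search(line).group(1)


def sort_key(line):
    return int(ipaddress.IPv4Address(select(line)))


# Line-level uniqueness: only exact duplicate lines are dropped.
unique_lines = sorted(dict.fromkeys(lines), key=sort_key)

# Selection-level uniqueness: one line per selected address is kept.
# Assumption: the first line encountered for each address wins.
seen = set()
unique_selections = []
for line in sorted(lines, key=sort_key):
    address = select(line)
    if address not in seen:
        seen.add(address)
        unique_selections.append(line)

print(unique_lines)       # both port numbers for 53.0.2.1 remain
print(unique_selections)  # a single line for 53.0.2.1 remains
```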

sortcanon_V0_0_3.zip (http)
MD5: CF742211DCF5AD893B882658980E6998
SHA256: 44DECFCDCA4966F8A8A2B1172EFA6B706294935C20D6A12C5A68F5D395396A77
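
The published hashes can be compared against a downloaded copy with Python's hashlib. This is a minimal sketch; the function names file_digests and matches_published are hypothetical, and the path passed in is whatever local filename the download was saved under.

```python
import hashlib

# Hashes as published above.
PUBLISHED_MD5 = "CF742211DCF5AD893B882658980E6998"
PUBLISHED_SHA256 = "44DECFCDCA4966F8A8A2B1172EFA6B706294935C20D6A12C5A68F5D395396A77"


def file_digests(path, chunk_size=65536):
    """Return (MD5, SHA256) of a file as uppercase hexadecimal strings."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest().upper(), sha256.hexdigest().upper()


def matches_published(path):
    """True if both digests of the file equal the published values."""
    return file_digests(path) == (PUBLISHED_MD5, PUBLISHED_SHA256)
```

Calling matches_published("sortcanon_V0_0_3.zip") on the downloaded file should return True if the copy is intact.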

Source: https://blog.didierstevens.com/2023/08/27/update-sortcanon-py-version-0-0-3/