In recent exchanges, members of the faculty have tried in vain to attack other Computer Scientists and disparage their work. Quite frankly, I find the results embarrassing -- instead of cutting the opponent down, many of the remarks have been laughably innocuous. Something must be done about it because any outsider who hears such blather will think less of our department: no group can hold the respect of others unless its members can deal a devastating verbal blow at will.
This short essay is an effort to help faculty make their remarks more pointed, and help avoid wimpy vindictives. It explains how to insult CS research, shows where to find the Achilles' heel in any project, and illustrates how one can attack a researcher.
I wanted to parse long strings of ASCII-encoded hex digits and convert them into the numbers they represent.
That is, turn "0123456789ABCDEF" into the numeric value 0x0123456789ABCDEF.
Of course, the problem is that the number might be larger than fits in a single integer: however wide the type is (8, 16, 32, 64 or more bits), the string may be longer. So I decided to use repeated calls to `strtoul(const char *str, char **end, int base)`. The idea was simple: `end` would be set to point to the end of the parsed number, and the subsequent call to `strtoul` would then parse the next n hex characters (usually 16 on a modern system) or stop at the end of the string.
But here's why I think `strtoul` does something really stupid: `end` gets set to the end of the whole digit sequence (the first character that is not a valid digit in the given base, or the end of the string), not to the end of what `strtoul` was able to fit in its return value. In other words, for strings longer than the standard 16 (or whatever) hex characters, the call overflows, returning `ULONG_MAX` with `errno` set to `ERANGE`, and `end` does not get set to anything useful for resuming the parse.
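To make the complaint concrete, here is a minimal sketch of a probe for that behavior. The helper name `hex_chars_consumed` is made up for illustration; only `strtoul`, `errno` and `ERANGE` come from the standard library.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of any library): reports how many
 * characters of s a single base-16 strtoul call consumes, and sets
 * *overflowed to 1 if the value did not fit in an unsigned long.
 */
size_t hex_chars_consumed(const char *s, int *overflowed)
{
    char *end = NULL;

    assert(s != NULL && overflowed != NULL);

    errno = 0;                      /* strtoul only sets errno on failure */
    unsigned long value = strtoul(s, &end, 16);
    (void)value;                    /* ULONG_MAX on overflow; unused here */

    *overflowed = (errno == ERANGE);
    return (size_t)(end - s);       /* end sits past the whole digit run */
}
```

Handed a 40-digit hex string, this reports all 40 characters as consumed and flags the overflow, so there is no position from which a second call could pick up the remaining digits.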
Add to that the endianness issues (x86 is little-endian, but humans almost always write big-endian) and the need, in some cases, to pad the last chunk of input (on the left or the right, depending on system endianness). I decided it was easier to just consume two chars at a time, check that each is a valid hex digit, use character subtraction to get each digit's value, and stick the resulting bytes into a byte array. That array is essentially an endian-less, arbitrarily large integer, already in a perfectly serialized format for transmission over a network or a hardware communication protocol, which was the end consumer of this input anyway.
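The two-chars-at-a-time approach might look roughly like the sketch below. It is not the original code: the names `hex_nibble` and `hex_to_bytes` are invented for illustration, the character subtraction assumes an ASCII-compatible character set, and padding an odd-length string with a leading zero nibble is my assumption about what the right behavior would be.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one.
 * Relies on 'a'..'f' and 'A'..'F' being contiguous, as in ASCII. */
int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper: converts hex text into bytes, most significant
 * byte first, exactly as written by a human. Returns the number of
 * bytes stored, or -1 on an invalid character or a too-small buffer.
 * Assumption: an odd-length string gets a leading zero nibble.
 */
long hex_to_bytes(const char *hex, unsigned char *out, size_t out_cap)
{
    assert(hex != NULL && out != NULL);

    size_t len = strlen(hex);
    size_t needed = (len + 1) / 2;
    if (needed > out_cap) return -1;

    size_t i = 0, o = 0;
    if (len % 2 != 0) {             /* lone leading digit: pad on the left */
        int lo = hex_nibble(hex[i++]);
        if (lo < 0) return -1;
        out[o++] = (unsigned char)lo;
    }
    while (i < len) {               /* two chars in, one byte out */
        int hi = hex_nibble(hex[i]);
        int lo = hex_nibble(hex[i + 1]);
        if (hi < 0 || lo < 0) return -1;
        out[o++] = (unsigned char)((hi << 4) | lo);
        i += 2;
    }
    return (long)o;
}
```

Because each pair of characters maps to exactly one byte in the order written, the result needs no byte swapping on any host, which is the point of treating it as a byte array rather than an integer.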