Other People's Technical Stuff

Stuff about Internet standards

ISO 8859-1 Latin 1 and Unicode characters in ampersand entities

How do you portably get characters like ‘≠’ and ‘™’ into your HTML?

Things are much better now, but the non-standard Windows CP-1252 encoding used to make things difficult on the Internet. People published pages using it, and Microsoft browsers incorrectly displayed those pages as CP-1252 instead of ISO 8859-1, while everybody else saw odd characters in strange places. Conversely, Microsoft browsers would display perfectly reasonable ISO 8859-1 characters as the wrong thing. HTML ampersand entities were a way of specifying unambiguously which character you really meant, so that your page would look as intended no matter which encoding the browser used.

With UTF-8 nearly universal now, this is much less of an issue.
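
As a small illustration of the entity approach, here is a minimal Python sketch that rewrites any non-ASCII character as a numeric ampersand entity, so the resulting markup is plain ASCII and reads the same under any of the encodings mentioned above. The helper name `to_ascii_html` is made up for this example; the `html` module and the `xmlcharrefreplace` error handler are part of the Python standard library.

```python
import html


def to_ascii_html(text: str) -> str:
    """Escape markup characters, then turn each non-ASCII character
    into a numeric entity such as &#8800;."""
    escaped = html.escape(text, quote=False)
    # "xmlcharrefreplace" replaces anything ASCII cannot encode
    # with a decimal character reference.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


sample = "a ≠ b ™"
encoded = to_ascii_html(sample)
print(encoded)                 # a &#8800; b &#8482;
print(html.unescape(encoded))  # a ≠ b ™
```

Named entities such as `&ne;` and `&trade;` refer to the same characters; the numeric form is used here only because it can be generated mechanically.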

rfc2279 - UTF-8, a transformation format of ISO 10646
How does UTF-8 encode Unicode into bytes that a non-Unicode-aware, 8-bit clean program will still handle? UTF-8 is probably the most popular Unicode encoding format because it doesn't take any extra space to encode straight ASCII text, and can sort of be handled by old programs that don't know about Unicode as long as those programs are 8-bit clean.
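
A quick way to see both properties is to look at the bytes. This is a minimal Python sketch (the helper name `utf8_bytes` is made up for this example): ASCII characters stay a single unchanged byte, and every byte of a multi-byte character has its high bit set, so it can never be mistaken for ASCII.

```python
def utf8_bytes(ch: str) -> str:
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


for ch in ("A", "é", "≠", "😀"):
    print(f"U+{ord(ch):04X} -> {utf8_bytes(ch)}")

# U+0041 -> 41
# U+00E9 -> c3 a9
# U+2260 -> e2 89 a0
# U+1F600 -> f0 9f 98 80
```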
Descriptions of Base32, Base64, and Base16 encoding.
These encodings are used for encoding arbitrary data in some form of ASCII. Each encoding has its own advantages and disadvantages. The lower the base, the more characters it takes to represent any given piece of data. But case matters in Base64 encoding, and for things that humans might type in or say to one another, case shouldn't matter.
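
The size trade-off is easy to check with Python's standard `base64` module. This sketch encodes the same 15 bytes three ways (15 is chosen only because it needs no padding in any of the three) and shows that Base16 and Base32 survive a change of case while Base64 relies on it.

```python
import base64

data = bytes(range(15))  # 15 bytes of sample data

b16 = base64.b16encode(data)
b32 = base64.b32encode(data)
b64 = base64.b64encode(data)

print(len(b16), len(b32), len(b64))  # 30 24 20

# Base16 and Base32 use a single-case alphabet, so lowercased input
# can still be decoded when case folding is requested.
assert base64.b16decode(b16.lower(), casefold=True) == data
assert base64.b32decode(b32.lower(), casefold=True) == data

# Base64 uses both upper and lower case as distinct digits.
assert base64.b64decode(b64) == data
```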
Description of Base85 encoding
This started as a joke, an April 1st RFC, but some people have actually started using it on purpose. It is very slightly more efficient than Base64. You can encode 48 bytes of data using 64 Base64 characters, or 60 Base85 characters. That is a savings of about 6% (6.25%). But it is much more complex to encode.
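
The arithmetic can be checked with Python's standard `base64` module. Note the assumption: `base64.b85encode` is one general-purpose Base85 variant for arbitrary bytes and may differ in detail from the encoding in the linked description; it is used here only to confirm the character counts.

```python
import base64

data = bytes(range(48))  # 48 bytes of sample data

b64 = base64.b64encode(data)  # 3 bytes -> 4 characters
b85 = base64.b85encode(data)  # 4 bytes -> 5 characters

savings = 1 - len(b85) / len(b64)
print(len(b64), len(b85), f"{savings:.2%}")  # 64 60 6.25%

# Both encodings round-trip back to the original bytes.
assert base64.b64decode(b64) == data
assert base64.b85decode(b85) == data
```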
HMAC: Keyed-Hashing for Message Authentication
A general method for using hashes for authenticating the source and verifying the contents of a message.
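
A minimal sketch of the idea using Python's standard `hmac` and `hashlib` modules. The function names `sign` and `verify`, the key, and the messages are made up for this example, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for message under a shared secret key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"shared secret"
tag = sign(key, b"pay Alice 10")

print(verify(key, b"pay Alice 10", tag))       # True
print(verify(key, b"pay Alice 1000", tag))     # False: contents changed
print(verify(b"wrong key", b"pay Alice 10", tag))  # False: wrong sender key
```

Only someone holding the key can produce a tag that verifies, which is what ties the message to its source as well as to its contents.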

Stuff about nitty-gritty programming

IEEE-754 Floating-Point Conversion from Floating-Point to Hexadecimal
More than you ever wanted to know about exactly how floating-point numbers (most numbers with decimal points in them) are represented.
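
For a quick look at the same conversion without a calculator page, here is a minimal Python sketch using the standard `struct` module. The helper names `double_to_hex` and `fields` are made up for this example; it assumes 64-bit IEEE-754 doubles, which is what `struct`'s `d` format packs.

```python
import struct


def double_to_hex(x: float) -> str:
    """Return the 64-bit IEEE-754 representation of x as 16 hex digits."""
    return struct.pack(">d", x).hex()


def fields(x: float) -> tuple[int, int, int]:
    """Split a double into (sign bit, biased exponent, 52-bit fraction)."""
    bits = int.from_bytes(struct.pack(">d", x), "big")
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent, fraction


print(double_to_hex(1.0))   # 3ff0000000000000
print(double_to_hex(-2.0))  # c000000000000000
print(double_to_hex(0.1))   # 3fb999999999999a
print(fields(1.0))          # (0, 1023, 0)
```

The trailing `a` in the encoding of 0.1 is the rounding that makes 0.1 inexact in binary floating point.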
libevent
libevent is a way to do event-driven programming without having to worry about whether the underlying system is using venerable old poll, kqueue, real-time signals, epoll, or some other mechanism. I need to base StreamModule on this.
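
To show the shape of the idea, here is a sketch that does not use libevent at all: it uses Python's standard `selectors` module, which plays a similar role by picking epoll, kqueue, poll, or select for you. The names `run_once` and `on_readable` are made up for this example. You register a callback for a socket and let the loop call it when the socket becomes readable.

```python
import selectors
import socket


def run_once(sel: selectors.BaseSelector, timeout: float = 1.0) -> int:
    """Wait for ready events once and dispatch each to its callback."""
    handled = 0
    for key, _mask in sel.select(timeout):
        callback = key.data  # the callback stored at registration time
        callback(key.fileobj)
        handled += 1
    return handled


received = []


def on_readable(sock: socket.socket) -> None:
    """Callback: read whatever is waiting on the socket."""
    received.append(sock.recv(1024))


left, right = socket.socketpair()
with selectors.DefaultSelector() as sel:  # best mechanism for this platform
    sel.register(right, selectors.EVENT_READ, on_readable)
    left.sendall(b"ping")
    run_once(sel)
left.close()
right.close()

print(received)  # [b'ping']
```

The calling code never mentions which polling mechanism is in use, which is the same separation libevent provides for C programs.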
RealTime Signals for Highly Concurrent Network I/O
A paper describing how much better things got for Squid when they jiggered it to use real-time signals instead of poll. I would recommend using libevent at this point, but this does make a good case study as well as a good explanation of how real-time signals work.
The C10K problem
It's time for servers on the Internet to handle ten thousand clients simultaneously, don't you think? After all, the Internet is a big place now.
Program Library HOWTO
Stuff about how shared libraries work under Linux. A lot of the information can be applied to any system that uses ELF format libraries and executables. This includes Solaris, newer versions of HP-UX, and UnixWare.

C and C++ stuff

Most of the links I used to have here have gone dead. And so rather than give useless information, I have just removed them. I left the original page in an HTML comment if you are really interested, and I may get around to finding better links.

C++ Reference
One of the best reference sites out there.
Compiler Explorer
See which compilers like or dislike your C++. See what assembly various compilers will produce for some C++ code. An invaluable site for various small-scale C++ experiments.
Modern C++ Design by Andrei Alexandrescu

This book was groundbreaking once. It introduced people to a lot of interesting design ideas making use of templates. It is one of the first places that explicitly recognized that templates were Turing complete and explored the implications in detail.

Now, it is kind of old hat. And some of the tricks have been superseded by C++ language features that are much more convenient to use and much more clearly express your intent.

Single Unix Specification

The above is a truly excellent manual of C and Unix functions.

Random useful tidbits



Eric Hopper eric-www@omnifarious.org My homepage