In my previous post on network sockets, I described a simple implementation of a server and a client socket in C++11. The initial version was only capable of reacting to a new client connection. What about a socket being ready for reading, though?

select() and socket sets

There are numerous ways of multiplexing sockets. Of course, I used the easiest one, which is also the worst and slowest. Anyway, after some quality time with the socket documentation, I implemented the following procedure:

  1. Create two empty socket sets, a master set and a client set. The sockets API refers to them as fd_set.
  2. Create a server socket (as in the previous post) and add it to the master set.
  3. Loop forever! Copy the master socket set to the client socket set and call select().
  4. If the call to select() finishes without an error, one of the sockets in the set is ready for something. This something might be a read operation or a new client.
  5. Handle this by creating new client sockets as appropriate, calling the corresponding callback functions using std::async, and updating the socket sets.

Sounds rather straightforward, but the corresponding code is messy and not yet very smart. For example, I am looping over all possible file descriptors, and I had to adjust the ClientSocket class to unregister from the server class once a socket is closed. I am not quite content with the implementation.
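To make steps 3 to 5 of the procedure a bit more concrete, here is a minimal sketch of a single loop iteration. It is not the code from the repository: the function names waitForReadableSockets and handleReady are made up for illustration, and the sketch assumes a POSIX system and plain file descriptors instead of the socket classes.

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#include <algorithm>
#include <vector>

// Hypothetical helper (steps 3 and 4): copy the master set, call select(),
// and report every descriptor that is ready for reading.
std::vector<int> waitForReadableSockets( const fd_set& master, int maxFd, timeval timeout )
{
  // select() overwrites the set it is given, hence the working copy.
  fd_set ready = master;

  std::vector<int> result;

  int n = select( maxFd + 1, &ready, nullptr, nullptr, &timeout );
  if( n <= 0 )
    return result; // -1 signals an error, 0 a timeout

  // The not-so-smart part: loop over all candidate descriptors.
  for( int fd = 0; fd <= maxFd; ++fd )
    if( FD_ISSET( fd, &ready ) )
      result.push_back( fd );

  return result;
}

// Hypothetical helper (step 5): a ready server socket means that a new
// client is waiting; any other ready descriptor has data to read.
void handleReady( int fd, int serverFd, fd_set& master, int& maxFd )
{
  if( fd == serverFd )
  {
    int client = accept( serverFd, nullptr, nullptr );
    if( client >= 0 )
    {
      FD_SET( client, &master );
      maxFd = std::max( maxFd, client );
    }
  }
  else
  {
    // This is where the read callback would be launched, for example
    // with std::async. Left out to keep the sketch short.
  }
}
```

The caller would run both functions inside the endless loop and remove a descriptor from the master set with FD_CLR once the corresponding socket is closed.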

How to use it?

The API is still using the callback approach introduced in the previous post. The cool thing is that this now also works whenever a client socket has something to say. For example, to implement RFC 862, the Echo protocol, we merely need the following code:

server.onRead( [&] ( std::weak_ptr<ClientSocket> socket )
{
  if( auto s = socket.lock() )
  {
    auto data = s->read();
    s->write( data );
  }
} );

Note that I have changed the std::unique_ptr to an std::weak_ptr because the server is now responsible for managing the client sockets. Since all handlers are called asynchronously, it is possible that a socket is already unavailable because it has been closed by the time a handler runs. This is why the handler calls lock() and checks the result before touching the socket.
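The following self-contained sketch shows what the lock() check buys us. FakeSocket and echoIfAlive are hypothetical stand-ins for illustration only; they are not part of the repository.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for ClientSocket; it always "receives" the same data.
struct FakeSocket
{
  std::string read() const { return "ping"; }
};

// Returns the data read from the socket, or an empty string if the owner
// has already destroyed the socket.
std::string echoIfAlive( std::weak_ptr<FakeSocket> socket )
{
  // lock() yields an empty std::shared_ptr once the socket is gone, so
  // the handler never touches a dangling object.
  if( auto s = socket.lock() )
    return s->read();

  return std::string();
}
```

As long as the owner (the server, in the real code) holds on to its std::shared_ptr, the handler sees a live socket. After the owner releases it, lock() fails and the handler simply does nothing.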

Where is the code?

I have updated the GitHub repository introduced in the last post. The repository now also contains a simple echo server implementation.

The code is released under an MIT licence.

Posted Wednesday evening, August 12th, 2015

Inspired by Sheldon’s “Fun with Flags” series, I decided to collect my thoughts about the beautiful subject of typography in a similar way. My friends and students know that I can go on and on about this topic. Maybe writing it down like this decreases the length of my rants in the real world.

For this first part, I want to explain some definitions and dispel some myths about typography.

What is typography?

Typography is the art and craft of arranging type to make written language readable and appealing.

In this definition, type refers to the composition of text by some means. I am most used to computer typesetting, so I am assuming that we are doing electronic typesetting here. Most of my examples will either refer to HTML or LaTeX, because I am most familiar with them.

Note that the definition contains the words art and craft. Typography is thus decidedly not all about making a document look “nice” or anything. Some of the rules are grounded in the reality of the perception of readers—see below for some explanations.

What makes you qualified to write about this?

To be honest, nothing specifically. I like good typography. Over the course of my career as a student and a researcher, I have always given some thought to presenting my text in an aesthetically pleasing manner.

I am not claiming that my way of doing things is the best way. I am also not claiming that my documents are the best-looking ones out there. Furthermore, I am well aware of the fact that typography for web pages leaves some things to be desired. For example, I am not quite satisfied with the way quotes or apostrophes are typeset. But this is a limitation of the web medium, not a limitation in principle.

Are you saying that looks are more important than content?

No. Typography is not about making bad content look good. You are confusing this with marketing. Of course, your content needs to be worth the effort of typesetting. The problem with electronic typesetting is that, whether we want to or not, each and every one of us has become a little typographer. Every word processor permits a myriad of ways of adjusting the look of text. So, unless you are writing everything in a text editor and only apply the design later on (which, coincidentally, might not be the worst idea), you should at least be aware of the basic principles of typography.

But this is all subjective!

As a matter of fact, it is not. At least not entirely. During the course of this series I also want to provide pointers to relevant research whenever applicable. I plan on separating the purely aesthetic aspects, which are certainly subjective or reliant on a fashion-du-jour, from the aspects that have a solid basis in perception.

Let us start with this right now. Here are some interesting research findings about typography:

  1. In Reader Preferences and Typography, Tinker and Paterson reported that there is a close agreement between the apparent legibility and the apparent pleasingness of a text. The study is slightly scant on details, though, as Tinker and Paterson only compared text set in lower case with text set in bold face or all capitals. Still, it seems to point in the direction that certain text markups should only be used sparingly. Bold face, for example, draws the attention of a reader towards a single word or phrase. The effect is lost if everything is set in bold face.

  2. In Typography and Readability, Burtt examined how the readability of text changes when certain typographical elements are changed. The selection of the font face proved to be somewhat significant, for example. Very ornate fonts, such as Cloister Black, were deemed to confuse some readers. Of course, this is not entirely surprising, but it hints at the importance of selecting the “right” font. Furthermore, Burtt reported that combining several font faces can actually decrease the reading efficiency by as much as 11%. Eye movement observations also suggested that long texts should not consist exclusively of capital letters. Burtt also observed that lines should not be too long. For 10 point type, the line length should be between 75 mm and 90 mm.

  3. In The Effects of Line Length on Children and Adults’ Online Reading Performance, Bernard et al. found that medium (65–75 characters per line) or narrow length texts (45 characters per line) are preferred for online reading by adults and children, respectively.

  4. In The Aesthetics of Reading, Larson and Picard examined the performance of participants in cognitive tasks after exposing them to examples of bad and good typography. While Larson and Picard concluded that most readers do not perceive very intricate details such as ligatures or small caps (which may thus be thought of as belonging to the aesthetics part of typography), they found that readers greatly prefer good typography to bad typography. I would take the cognitive differences with a grain of salt, though, because the sample size does not seem to be large enough: there were only 20 subjects, who were divided into two groups. Yet, subjects rated the text with good typography to be significantly easier to read.

I hope I have convinced you that some typographical standards are worth thinking about. In the forthcoming posts, I will always refer to relevant research if applicable, or explicitly state that a rule is “only” based on aesthetic preferences.

Stay tuned.

Posted Sunday afternoon, August 16th, 2015

For the first “real” episode in this series, I wanted to start very slowly. Let us take a look at very basic typographic symbols, namely dots and dashes (and some spaces). Except for the rules about hyphens, dashes, and the like, all the rules presented here are more about aesthetics and custom.

Dots

We use a simple full stop (or period for you Americans) to indicate the end of a sentence. Nothing new here so far.

When we want to indicate a multiplication in a formula, sometimes it makes sense to show a multiplication symbol. In this case, the proper symbol to use is \cdot (for LaTeX users) or &middot; (for HTML). This symbol is typeset as · and it is also used to indicate a transition between words, for example on a business card. A sample business card might read “John Doe · Tinker & Tailor”.

Never, never use \times or &times; to indicate multiplication in any mathematics text written for university students. In the classroom, you might find “5 × 4 = 20”, but the &times; symbol is customarily used to indicate things like a cross product in mathematics.

If you want to indicate that something is missing from a quotation, a typeset matrix, or anything else, use the wonderful ellipsis character. It consists of three dots that are spaced differently than three regular dots. In LaTeX, you can use \dots. In HTML, use &hellip;. It looks like this:

1,2,3,…,n

The advantage of the predefined ellipsis character is that it will not be broken up. Some style guides ask you to write an ellipsis as a sequence of dots and spaces. They are wrong. There is a difference between “. . .” and “…” in all fonts.
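For LaTeX users, the three symbols discussed in this section can be tried out with a small document such as the following. This is only a sketch; the formulas are made-up examples, and the amsmath package is assumed for the align* environment.

```latex
\documentclass{article}
\usepackage{amsmath}

\begin{document}

\begin{align*}
  5 \cdot 4 &= 20 \\                                     % ordinary multiplication
  \vec{u} \times \vec{v} &= -(\vec{v} \times \vec{u}) \\ % cross product
  \frac{n \cdot (n + 1)}{2} &= 1 + 2 + 3 + \dots + n     % ellipsis in a sum
\end{align*}

The list $1, 2, 3, \dots, n$ uses the same ellipsis command.

\end{document}
```

Note how \cdot is reserved for products of numbers, whereas \times only shows up for the cross product.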

Dashes

Now let us dash to the dashes! There are three types of dashes. First, we have the simple hyphen, which is technically not a dash but I am calling it one anyway because one usually thinks of it as a dash. Second, we have the en-dash. Third, we have the em-dash.

The hyphen is typeset using a simple - character. We use it to indicate a break in a word, when said word needs to be split because the end of a line has been reached. We also use it to spell some composite words, such as good-hearted or mother-in-law. A further use of the hyphen is to remove ambiguities in some compound adjectives. For example, small-arms fire, high-school students, and so on. XKCD 37 also has a nice example of this.

I do not know all the rules about when to use a hyphen or not. As a non-native speaker, I am slightly biased here. When in doubt, consult your dictionary of choice. At least you now know what the hyphen is meant for.

Moving on to the next dash, the en-dash. In LaTeX, we set it by typing two hyphens (--). In HTML, we write &ndash; to obtain “–”. The en-dash is customarily used to indicate ranges. For example:

The theorem is proved on pages 23–42.

1887–1895 was the period of the first construction of the Kiel Canal.

Most bibliographical services on the web are unable to get the en-dash right. I am looking at you, ACM and IEEE.

The en-dash is also used to indicate joint authorship, such as the Barnes–Hut simulation.

Last, the em-dash. It is the longest of all the dashes. In LaTeX, it is typeset by three hyphens in a row ---. In HTML, use the entity &mdash;. The em-dash is rather long, “—”. It is used to indicate a break between parts of a sentence. Most typographers consider it stronger than a comma but weaker than a semicolon or a pair of parentheses. If this comparison does make sense to you, please use the em-dash. Here is an example from my own writing:

Topologists aim to identify invariants of such spaces—properties that do not change when the space is stretched, bent, and twisted by homeomorphisms.

I typically use the em-dash without any surrounding spaces because I consider it long enough to indicate a break on its own. This again is a matter of personal style. If you want to use a dash to indicate pauses, omissions, and so on, please use the em-dash. It gives the eye something to follow and indicates a break. To bore you with a personal anecdote: When I convert some eBooks for my Kindle, em-dashes are turned into hyphens. This really confuses me when reading because I expect a compound word or a line-break but get a parenthetical remark.
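If you are stuck with plain text that follows the LaTeX input convention, the conversion to proper dashes is easy to automate. Here is a small sketch in C++; the function texDashesToUnicode is a made-up helper, and it assumes that the output is supposed to be UTF-8.

```cpp
#include <string>

// Hypothetical helper that mimics the LaTeX convention described above:
// "---" becomes an em-dash (U+2014) and "--" becomes an en-dash (U+2013).
// Single hyphens are left alone.
std::string texDashesToUnicode( const std::string& input )
{
  const std::string enDash = "\xE2\x80\x93"; // UTF-8 encoding of U+2013
  const std::string emDash = "\xE2\x80\x94"; // UTF-8 encoding of U+2014

  std::string output;
  std::size_t i = 0;

  while( i < input.size() )
  {
    // Check the longest sequence first so that "---" is not read as
    // "--" followed by a single hyphen.
    if( input.compare( i, 3, "---" ) == 0 )
    {
      output += emDash;
      i += 3;
    }
    else if( input.compare( i, 2, "--" ) == 0 )
    {
      output += enDash;
      i += 2;
    }
    else
    {
      output += input[i];
      ++i;
    }
  }

  return output;
}
```

With this, the input "pages 23--42" ends up with an en-dash between the numbers, while "mother-in-law" passes through unchanged.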

In short: Dots. Dashes. Use them properly, please.

Posted late Tuesday evening, August 18th, 2015

When writing your magnum opus, you might want to put emphasis on certain facts, names, and so on. There are three good ways and one bad way of doing this.

Let us take a look at the bad way first: Underlining. Depending on the font you use, it might look very irregular. For web content, it might confuse your readers because they expect a hyperlink. Also, it is slightly hard to read.

Underlining is a remnant from the age of typewriters, when other text decorations were impossible to do. Let this remnant enjoy its well-deserved retirement. In modern times, we can do without it. It evokes, at least for me, the distinct style of a yellow press publication.

So, what are the three good ways, then? To wit, italics, bold, or small caps. Italics are easily achieved in LaTeX with \textit{Text} (or \emph{Text}, which is preferable for a number of reasons; see below) or in HTML with <em>Text</em>. Similarly, you obtain a bold decoration with \textbf{Text} in LaTeX or <strong>Text</strong> in HTML, although <strong> only defaults to bold text; how it is actually rendered depends on the CSS. Last, small caps are capital letters set at roughly the height of the surrounding lower-case letters, making them less conspicuous than regular upper-case characters. Whether they look good or not depends very much on the font you use. In LaTeX, you can get them using \textsc{Text}; in HTML, you have to add font-variant: small-caps; to the current style.

This is pretty straightforward. The one thing you should avoid is combining different modes of emphasis. Bold and italics do not mix well in most cases.

As a parting thought, some words on \emph in LaTeX: You should rather use \emph than \textit because the former command is somewhat smart and can recognize nesting. For example, if your personal style is to typeset figure captions in italics but you need to emphasize something in the caption, \emph will detect this and render the text upright. When using \textit, nothing would happen here because, well, the text is already in italics. If this does not convince you, just think about the flexibility offered by \emph: You could redefine it to work differently for some sections in your document if that strikes your fancy.

To end on a happy note for the poor underlined text: You can use underlining to mark corrections in a text. This might be especially helpful when collaborating on a document. But do not use underlines for anything that is going to be included in the final version of a document.

TLDR: Do not use underlining for emphasis in a text. Use italics, bold, or small caps, depending on your font and your fancy.

Posted Monday evening, August 31st, 2015