In my previous instalment of this little series, I showed the most common words appearing in password hints. This time, I started fiddling around with D3.js to make a more dynamic visualization. The result is a sort of "landscape" of a random sample of the passwords. See the project page for more details and a small demonstration.

I am aware that this sort of visualization is not yet optimal. Ultimately, I would like a dissimilarity function for judging how different two passwords are. Since Adobe did not use a hash function but rather something like 3DES in ECB mode, every 8-byte block of a password is encrypted independently of the others. We can therefore at least detect when two passwords are equal or when they share identical blocks, for example the same first eight characters.
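A minimal sketch of what a first, crude dissimilarity function could look like follows. It assumes that the ciphertexts come from a cipher with 8-byte blocks in which equal plaintext blocks yield equal ciphertext blocks (as would be the case for 3DES in ECB mode). The function names and the example byte strings are made up for illustration; they are not taken from the actual data set.

```python
BLOCK_SIZE = 8  # assumption: 64-bit blocks, as used by 3DES


def blocks(ciphertext: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a ciphertext into consecutive blocks of `size` bytes."""
    return [ciphertext[i:i + size] for i in range(0, len(ciphertext), size)]


def shared_prefix_blocks(a: bytes, b: bytes) -> int:
    """Count how many leading blocks of the two ciphertexts are identical."""
    count = 0
    for x, y in zip(blocks(a), blocks(b)):
        if x != y:
            break
        count += 1
    return count


def block_dissimilarity(a: bytes, b: bytes) -> float:
    """Fraction of block positions at which the ciphertexts differ.

    0.0 means all blocks are equal (same password under the assumption
    above), 1.0 means that no block position matches. Positions that
    exist in only one ciphertext count as differing.
    """
    blocks_a, blocks_b = blocks(a), blocks(b)
    longest = max(len(blocks_a), len(blocks_b))
    if longest == 0:
        return 0.0
    same = sum(1 for x, y in zip(blocks_a, blocks_b) if x == y)
    return 1.0 - same / longest


# Made-up ciphertexts (hypothetical, for illustration only):
# `first` and `second` share their first block but differ in the second.
first = bytes.fromhex("0011223344556677" "8899aabbccddeeff")
second = bytes.fromhex("0011223344556677" "0123456789abcdef")

print(shared_prefix_blocks(first, second))
print(block_dissimilarity(first, second))
```

This only captures whether whole blocks agree, so it is far too coarse to say anything about passwords that differ within a single block. It might still serve as a starting point for grouping entries in the visualization.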

Maybe the next part of this series will show an even better way of working with these data sets.