In my previous instalment of this little series, I showed the most common words appearing in password hints. This time, I started fiddling around with D3.js in order to make a more dynamic visualization. The result is a sort of "landscape" of a random sampling of the passwords. See the project page for more details and a small demonstration.

I am aware that this sort of visualization is not yet optimal. Ultimately, I would like a dissimilarity function that measures how different two passwords are. Since Adobe did not use a hash function but rather 3DES in ECB mode, which encrypts each 8-byte block independently, we can at least detect when two passwords are equal or when they share an identical block, such as the same first eight characters.
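
To make the idea of a dissimilarity function a bit more concrete, here is a minimal sketch in Python. It is not what the visualization currently uses. It assumes the ciphertexts are available as raw bytes and that every 8-byte block was encrypted independently (ECB), so that equal blocks at the same position imply equal plaintext chunks. The names `split_blocks` and `block_dissimilarity` are made up for this illustration.

```python
from itertools import zip_longest

BLOCK_SIZE = 8  # 3DES operates on 64-bit (8-byte) blocks


def split_blocks(ciphertext: bytes, block_size: int = BLOCK_SIZE):
    """Split a ciphertext into consecutive fixed-size blocks."""
    return [ciphertext[i:i + block_size]
            for i in range(0, len(ciphertext), block_size)]


def block_dissimilarity(a: bytes, b: bytes) -> float:
    """Fraction of block positions at which two ciphertexts differ.

    0.0 means the ciphertexts are identical (and, under the ECB
    assumption, so are the passwords); 1.0 means no position matches.
    A block missing from the shorter ciphertext counts as a difference.
    """
    blocks_a, blocks_b = split_blocks(a), split_blocks(b)
    total = max(len(blocks_a), len(blocks_b))
    if total == 0:
        return 0.0  # two empty ciphertexts are trivially equal
    differing = sum(1 for x, y in zip_longest(blocks_a, blocks_b) if x != y)
    return differing / total


# Toy byte strings standing in for ciphertexts (not real data).
short = bytes.fromhex("0011223344556677")
longer = short + bytes.fromhex("8899aabbccddeeff")
other = bytes.fromhex("ffeeddccbbaa9988")

print(block_dissimilarity(short, short))   # identical
print(block_dissimilarity(short, longer))  # shared first block
print(block_dissimilarity(short, other))   # nothing in common
```

This is a very coarse measure: it only notices whole blocks that match at the same position and says nothing about how similar two differing blocks are, because the encryption hides that.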

Maybe the next part of this series will show a better way of working with these data sets.

Posted late Sunday evening, May 4th, 2014