What is the best way to go about explaining the difference between these two types of entropy? I can see the difference myself and can give all sorts of intuitive reasons for how the concepts work and how they relate. At the same time, I can see why my (undergraduate) physicist friends would be skeptical when I tell them that no, I haven't got it backwards: a string of all '1's has nearly zero entropy, while a perfectly random string is at 'maximum entropy'. After all, if your entire physical system degenerates into a structureless mush you know nothing about, then you say it is full of entropy.
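One concrete demonstration I could show them: compute the empirical per-symbol Shannon entropy of the two strings directly. Here is a minimal Python sketch (the function name and string lengths are my own choices; strictly speaking Shannon entropy belongs to a source distribution, so treating one string's symbol frequencies as that distribution is just the intuition-building shortcut):

```python
import math
import random
from collections import Counter

def shannon_entropy(s):
    """Empirical Shannon entropy in bits per symbol:
    H = -sum(p * log2(p)) over the observed symbol frequencies of s."""
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())

ones = "1" * 1000                                         # all '1's: perfectly predictable
coin = "".join(random.choice("01") for _ in range(1000))  # fair coin flips

print(shannon_entropy(ones))  # -0.0, i.e. zero: one symbol, no surprise per character
print(shannon_entropy(coin))  # ~1.0: near the maximum for a binary alphabet
```

A physicist can check the extremes by hand: a single symbol gives p = 1 and H = 0, while a 50/50 split gives H = -(1/2)log2(1/2) - (1/2)log2(1/2) = 1 bit per symbol.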
How would I make them understand the concepts before nerdy undergraduate arrogance turns off their brains? Preferably by giving them the kind of intuitive grasp that will last, rather than just persuading them via authoritative speech, charm, and appeal to authority. I would rather people comprehend me than be able to repeat my passwords. (Except where having people accept my authority and dominance will get me laid, in which case I may have to make concessions to practicality.)
Actually, I had always heard that it was Szilard, back in 1929, who came up with the original idea. So says Wikipedia.
I first heard of Szilard's thought experiment back in high school from Pierce's classic popularization of Shannon's theory, Symbols, Signals, and Noise. This book, which I strongly recommend, is now available free online. It is the best non-mathematical exposition of Shannon ever. (Well, there is some math, but it is pretty simple.)
Szilard's idea is pretty cool: a heat engine whose working fluid is a very thin gas. How thin? A single molecule.
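For anyone who wants the arithmetic behind that, here is a sketch of the standard back-of-the-envelope version, using the single-molecule ideal gas law PV = k_B T. Once you know which half of the box the molecule is in, you insert a partition and let the one-molecule 'gas' push it back out isothermally:

```latex
% Isothermal expansion of a one-molecule gas from V/2 to V,
% using P = k_B T / V:
W = \int_{V/2}^{V} P \, dV
  = \int_{V/2}^{V} \frac{k_B T}{V} \, dV
  = k_B T \ln 2
% One bit of position information buys exactly k_B T \ln 2 of work,
% i.e. k_B \ln 2 of entropy: the conversion factor between the
% information-theoretic bit and thermodynamic entropy.
```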
So, from my reading, Szilard's answer and Landauer's answer are slightly different. Then again, the description on the page you linked and the one on the Maxwell's Demon page also differ, so that may be the source of my impression. It seems that Szilard claimed that acquiring the information is where the entropy gets balanced, whereas Landauer claimed that restoring the Demon's memory to its original state is where the entropy gets balanced. Regardless of whether both are correct / deserve to be called the 'information entropy explanation', Landauer's is the one that inspired my original explanation.
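To put a number on Landauer's version (my own illustration; the room-temperature value is just a convenient choice): erasing the Demon's one recorded bit must dissipate at least k_B T ln 2 of heat, which exactly cancels the k_B T ln 2 of work the Szilard engine extracted.

```python
import math

k_B = 1.380649e-23  # Boltzmann constant in J/K (exact since the 2019 SI)
T = 300.0           # an assumed room temperature in kelvin

# Landauer's bound: resetting one bit of memory dissipates at least
# k_B * T * ln(2) of heat -- the same k_B * T * ln(2) the single-molecule
# engine extracts, so the Demon's ledger balances at erasure time.
landauer_limit = k_B * T * math.log(2)
print(f"{landauer_limit:.3e} J per bit erased")  # ~2.871e-21 J
```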