{"id":8765,"date":"2026-04-16T13:53:18","date_gmt":"2026-04-16T13:53:18","guid":{"rendered":"https:\/\/blog.dankohn.info\/?p=8765"},"modified":"2026-04-16T13:53:31","modified_gmt":"2026-04-16T13:53:31","slug":"father-of-information-theory","status":"publish","type":"post","link":"https:\/\/blog.dankohn.info\/index.php\/2026\/04\/16\/father-of-information-theory\/","title":{"rendered":"Father of Information Theory"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"876\" height=\"1024\" src=\"http:\/\/blog.dankohn.info\/wp-content\/uploads\/2026\/04\/fb_img_17763475193672190617121400936393-876x1024.jpg\" alt=\"\" class=\"wp-image-8764\" srcset=\"https:\/\/blog.dankohn.info\/wp-content\/uploads\/2026\/04\/fb_img_17763475193672190617121400936393-876x1024.jpg 876w, https:\/\/blog.dankohn.info\/wp-content\/uploads\/2026\/04\/fb_img_17763475193672190617121400936393-257x300.jpg 257w, https:\/\/blog.dankohn.info\/wp-content\/uploads\/2026\/04\/fb_img_17763475193672190617121400936393-768x898.jpg 768w, https:\/\/blog.dankohn.info\/wp-content\/uploads\/2026\/04\/fb_img_17763475193672190617121400936393.jpg 1125w\" sizes=\"auto, (max-width: 876px) 100vw, 876px\" \/><\/figure>\n\n\n\n<p>In 1948, a 32-year-old at Bell Labs published a paper nobody fully understood.<br><br>Engineers found it too mathematical. Mathematicians found it too engineering-focused. One prominent mathematician reviewed it negatively.<br><br>That paper &#8211; &#8220;A Mathematical Theory of Communication&#8221;, became the founding document of the digital age.<br><br>The man was Claude Shannon. Father of Information Theory.<br><br>At 21, he wrote the most important master&#8217;s thesis of the 20th century.<br><br>Working at MIT on an early mechanical computer, Shannon noticed its relay switches had exactly two states &#8211; open or closed. He had just taken a philosophy course introducing Boolean algebra, which also operated on two values: true and false.<br><br>Nobody had ever connected these two things.<br><br>His 1937 thesis proved that Boolean algebra and electrical circuits are mathematically identical, and that any logical operation could be built from simple switches.<br><br>Howard Gardner called it &#8220;possibly the most important, and also the most famous, master&#8217;s thesis of the century.&#8221;<br><br>Every digital computer ever built traces back to this insight.<br><br>At 29, he proved that perfect encryption exists.<br><br>During WWII, Shannon worked on classified cryptography at Bell Labs. His work contributed to SIGSALY, the secure voice system used for confidential communications between Roosevelt and Churchill.<br><br>In a classified 1945 memorandum, he mathematically proved the one-time pad provides perfect secrecy, unbreakable not just computationally, but provably, permanently, against an adversary with infinite power.<br><br>When declassified in 1949, it transformed cryptography from an art into a science. It laid the foundations for DES, AES, and every modern encryption standard.<br><br>At 32, he defined what information is.<br><br>His 1948 paper introduced one equation:<br>H = \u2212\u03a3 p(x) log p(x)<br><br>Shannon entropy. The average uncertainty in a probability distribution. The minimum bits required to encode a message.<br><br>Three things followed:<br><br>> He defined the bit &#8211; the fundamental unit of all information. 
Where his equation lives in AI today:

Cross-entropy loss, the function used to train every classifier and language model, is derived directly from H. Decision tree splits use information gain, which is H applied to data. Perplexity, the standard evaluation metric for LLMs, is just the exponential of cross-entropy.

Every time a neural network trains, Shannon's formula runs inside it.
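A minimal sketch of that relationship, with invented numbers for illustration: cross-entropy averages the negative log-probability a model assigns to the correct answers, and perplexity is the exponential of that average. Cross-entropy can never fall below the true distribution's Shannon entropy; training shrinks the gap (the KL divergence) toward zero.

```python
import math

def cross_entropy(probs_assigned_to_true_tokens):
    """Average negative log-probability the model gives the correct tokens (in nats)."""
    n = len(probs_assigned_to_true_tokens)
    return -sum(math.log(p) for p in probs_assigned_to_true_tokens) / n

# Probabilities a hypothetical language model assigned to the actual next token
# at four positions (illustrative values only).
p_true = [0.60, 0.25, 0.10, 0.85]

ce = cross_entropy(p_true)   # the training loss
ppl = math.exp(ce)           # perplexity = exp(cross-entropy); a perfect model scores 1.0

print(f"cross-entropy: {ce:.3f} nats, perplexity: {ppl:.2f}")
```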
He also built the first AI learning device.

In 1950, Shannon built Theseus, a mechanical mouse that navigated a maze by trial and error, learned the correct path, and then repeated it perfectly. Mazin Gilbert of Bell Labs said: "Theseus inspired the whole field of AI."

That same year he published the first paper on programming a computer to play chess. He later co-organized the 1956 Dartmouth workshop, the founding event of AI as a field.

The man:

He rode a unicycle through Bell Labs hallways while juggling. He built a flame-throwing trumpet, a rocket-powered Frisbee, and Styrofoam shoes for walking on the lake behind his house.

He called his home Entropy House.

When asked what motivated him: "I was motivated by curiosity. Never by the desire for financial gain. I just wondered how things were put together."

In 1985, he appeared unexpectedly at a conference in Brighton. The crowd mobbed him for autographs. Persuaded to speak at the banquet, he talked briefly, then pulled three balls from his pockets and juggled instead.

One engineer said: "It was as if Newton had shown up at a physics conference."

He died in 2001 after a decade with Alzheimer's, the cruel irony of information slowly leaving the mind of the man who defined what information was.

Claude, the AI model, is named after Claude Shannon, the mathematician who laid the foundation for the digital world we rely on today.

From: https://www.facebook.com/share/p/1BAAXrxA2j/