Twelve Kittens in a Ferrari - Wayne's explanation of how AI actually works, and why I think it's still a bit of a "magic trick"
Hi everyone. I've had this conversation with dozens of people, so I thought I would write a small essay / article on how AI works (from my understanding of it) and explain my rationale for why I think it's still a parlor trick (albeit a great illusion at that). Let me know your thoughts!

Background - I've got two degrees in engineering from MIT and started the dot-com PelicanParts.com (European automotive parts on the Internet), where I programmed all the shopping carts and back-end inventory systems and built the heuristic search engine from scratch.

Let's start off with me explaining how our current AI models work, but in a dumbed-down fashion using old tech as an explanation. About 20-30 years ago, Amazon built and developed a feature that we all take for granted these days: "Customers who bought X also bought Y." Back in 1999, this was ground-breaking technology. When I saw this, I told my computer programmer, "we need to get this on our PelicanParts.com site asap." He mentioned it would take about six months to do, at which point I said "hogwash" (or something less kid-friendly). I proceeded to go home that weekend, worked on it until 2AM all weekend, and by Monday morning we had our "Customers who bought X also bought Y" feature implemented in the shopping cart. It turned out to be easier than expected.

Here's how I did it (yes, this is applicable to the AI discussion). For those who know something about SQL databases, this is easy to follow. For those who don't, think of a SQL database like a large filing cabinet with individual folders inside. To implement the "Customers who bought X also bought Y" feature, I simply did the following:

1- I selected all unique part numbers in our part number database. Basically a master list of everything we sell.

2- For each individual part number in the master list, I took that part number and looked at our data. Let's pretend the first part number was for a rod bolt. I selected all orders (from the past two years or so) that had that rod bolt part number on them. I now had an order master list of all order numbers that contained that individual part - the rod bolt.

3- Then I selected *all* of the line items on *all* of the orders that contained that rod bolt part number. This results in a big list of items all jumbled together. But, of course, the commonality between all of these orders is that each one contained the part number I was looking for (the rod bolt).

4- Finally, I just sorted the big list and summed up how many times each individual part number appeared in it (this is easy for the computer to do - Excel will do it easily as well). Obviously, the rod bolt was tops on the list, but I was interested in the next ten part numbers that appeared on orders where a rod bolt was ordered. These I took and wrote down in the "part number record" for the rod bolt, next to other attributes like price, weight, description, etc. That way the shopping cart could simply select the part number record and read off the top ten parts associated with the rod bolt.

Not surprisingly, it works really well. The number one recommended part associated with a rod bolt is - not surprisingly - a rod nut. Literally no one orders one without the other (well, hardly ever). Interestingly enough, the system exposed some really weird correlations that I couldn't have guessed either. Something like people replacing their brake master cylinder might also order a car cover (random example - I don't remember the exact weird combos that came up). At first I thought I had errors in my code, but sure enough, there were a bunch of people who ordered brake cylinders and car covers!
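The four steps above can be sketched in a few lines of SQL. This is my own minimal illustration using SQLite, not Wayne's actual PelicanParts code - the `line_items` table and part numbers are hypothetical:

```python
# Minimal sketch of "Customers who bought X also bought Y" (my own
# illustration; the table/column names are assumptions, not the real schema).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE line_items (order_id INTEGER, part_number TEXT);
    INSERT INTO line_items VALUES
        (1, 'ROD-BOLT'), (1, 'ROD-NUT'), (1, 'ASSY-LUBE'),
        (2, 'ROD-BOLT'), (2, 'ROD-NUT'),
        (3, 'ROD-BOLT'), (3, 'CAR-COVER'),
        (4, 'OIL-FILTER');
""")

def also_bought(part, top_n=10):
    """Steps 2-4: find orders containing `part`, gather all of their
    line items, and count how often each other part appears."""
    return conn.execute("""
        SELECT part_number, COUNT(*) AS n
        FROM line_items
        WHERE order_id IN (SELECT order_id FROM line_items
                           WHERE part_number = ?)
          AND part_number != ?
        GROUP BY part_number
        ORDER BY n DESC
        LIMIT ?
    """, (part, part, top_n)).fetchall()

print(also_bought('ROD-BOLT'))
# The rod nut ranks first: it appears on 2 of the 3 rod-bolt orders.
```

In a real deployment you would precompute this offline (as the post describes, in a weekly batch run) and store the top-ten list in the part record, rather than running the query on every page load.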
Implementing this feature made for an instant 5% increase in sales, minimum, on day one. It also probably saved a bunch of time for people who would otherwise have had to go back and scroll through the catalog to find related parts. Win-win-win for everyone.

Okay, so what does this have to do with AI? Well, our very primitive "Customers who bought X also bought Y" feature runs on a very, very similar mechanism to how modern LLMs (Large Language Models, like ChatGPT and Grok) work. Indeed, comparing this feature in our shopping cart from the 2000s to a modern LLM is like comparing a Texas Instruments digital calculator from the late 1970s to a modern iPhone - they both run on solid-state transistor technology, but the advances since then are almost incomprehensible.

Getting back to AI - the modern AI models are confusing to a lot of people. They misspell words, they "lie" and have "hallucinations." If you ask one to draw twelve kittens in a Ferrari, it will give you 14 kittens in something that looks a little bit like a Ferrari, but not quite. How does all of this work? It works just like our "Customers who bought X also bought Y" feature, but with decades more data, and it's light years more advanced.

With our "Customers who bought X also bought Y" feature, we used to set the server up to run / train itself every Friday night from about 2AM to Saturday morning. It took a long time to process and run through all of the data. The trained data for our feature does not store results, nor does it store any indication or history of where the data came from - I cannot go back and trace the path - I have to just trust the data and the original learning method. Fun fact: if some customers had a weird coupon code that caused them to buy a sweatshirt or model car that was on sale, *and* they were buying other parts, then it might give off some weird associations - just like AI will hallucinate from time to time.
AI trains itself on patterns and then figures out what the "next thing" should be when processing a task. This is why the results come in slowly from AI - it's not like the old days where you had a 2400 baud modem and it was just sending stuff slowly - it's building the answer one word at a time, based upon probabilities and the training data it has. Just like if you were on our shopping cart and added a rod bolt, then a rod nut, then some assembly lube, then some rod bearings, one at a time based upon the shopping cart's recommendations - this is how AI is processing the response to you.

This is why, when you ask it to draw twelve kittens in a Ferrari, you see the image come in top-down - just like an old 2400 baud modem loading images in the 1990s. This is not because of bandwidth, though - the AI platform is figuring out in real time, pixel by pixel, what is most likely to come next based upon an analysis of your prompt. If you were a human drawing a picture of twelve kittens in a Ferrari, you would probably start by drawing a Ferrari and then drawing the kittens in and around it. AI doesn't work that way - it goes top-down, pixel by pixel. This is how hallucinations happen, how spelling errors happen, and so on. At this point in time, all LLMs have this problem.

Which brings me to my point - it *is* just a magic trick at this moment in time. Don't get me wrong, it's a really, really, really great trick - David Copperfield making the Statue of Liberty disappear, legendary-type stuff. But it's still a trick. There's no brain, there's no logic, there's no thought pattern. It's just predicting patterns and spitting everything back to you in a very, very, very advanced manner. No different than our "Customers who bought X also bought Y" feature (note how I don't call that an "algorithm," because it's not - it's just spitting back predictive patterns based upon pre-learned / pre-analyzed data).
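The "one word at a time, based upon probabilities" idea can be sketched with a toy bigram model - my own illustration, vastly simpler than any real LLM, but the same shape: count which word follows which in the training text, then generate by repeatedly picking the most likely next word.

```python
# Toy "next word" predictor (my illustration, not a real LLM): train on
# word pairs, then greedily generate the most probable continuation.
from collections import Counter, defaultdict

training = "add rod bolt add rod nut add rod bearings torque rod bolt".split()

next_word = defaultdict(Counter)
for a, b in zip(training, training[1:]):
    next_word[a][b] += 1          # count how often word b follows word a

def generate(word, steps=3):
    """Build a response one word at a time from the learned counts."""
    out = [word]
    for _ in range(steps):
        if word not in next_word:
            break                  # nothing ever followed this word in training
        word = next_word[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

print(generate("add"))  # -> "add rod bolt add"
```

Note the output reads plausibly but was never "thought" - it is purely the statistics of the training text, which is the essence of the post's magic-trick argument. Real LLMs predict tokens from thousands of words of context with billions of learned weights, not single-word counts, but the generate-one-piece-at-a-time loop is the same.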
The LLM models themselves don't take up trillions of gigabytes of storage. They are quite manageable in size - the current OpenAI model, I think, is estimated to be about 750GB. I think the new iPhone actually has more storage space than that! The LLM "database" / "training data" doesn't contain a "copy" of the Internet, just a summary of the training data - just like our "Customers who bought X also bought Y" feature, whose data takes up a tiny fraction of the space of the original order data.

So, back to the magic trick metaphor - there are people out there who are fooled by the trick and don't quite understand how LLMs work (I was at first, too). They are convinced that LLMs and other AI tech will take over the world, launch WWIII, achieve sentience, etc. - i.e., that they will come alive. If one understands the underlying technology and how it's just repeating back patterns in response to prompts, it becomes obvious that the current technology is still fairly primitive - even though it seems insanely powerful and dynamic. That's the trick part. Heck, when I first saw it work, I was like, "no way, there has got to be an army of people in India typing this stuff back to me." But after I dove into it some more and began to see the parallels between the LLMs and our "Customers who bought X also bought Y" feature, it became a bit more apparent what we were seeing.

Conclusion? The technology is amazing. The feeling I had when I ran the first search in our "Customers who bought X also bought Y" feature - that *is* exactly the same feeling I had when I first used ChatGPT more than a year ago. But, at the same time, since I understand how it works, I understand its limitations, and more importantly, I understand why it makes errors and why it's not likely to "take over the world." At least not any time in the semi-near future.
Neat trick indeed, but at the end of the day, it's still closer to a more advanced Texas Instruments calculator from the late 1970s than it is to a human being. Okay, so the words "magic trick" may be a little harsh and "click bait," but they're designed to make my point that it's more of an illusion than it is a thinking, learning being like some people believe it to be... Thoughts?

-Wayne

http://forums.pelicanparts.com/uploa...1763335260.jpg
I mean, two of those 14 kittens are OUTSIDE of the "Ferrari" so it got it right, right?
Fascinating stuff, Wayne. I'm not a "technology person" myself but always interested in what someone is working on and how it's done.
Your narrative is easy to follow for just the average person - appreciate that! That obviously comes from the sales and marketing side... something I can very much relate to! :)
Right, this is what I've been thinking (and occasionally saying) all along. "AI" aka 'literal' "artificial intelligence" still doesn't exist, or certainly not that any of us have seen. It's not smarter than the average bear. It's an impressive tool, and will get better and better, but it's certainly not what I think of when I think of "artificial intelligence."
(W)as a computer science geek and this thread is awesome!
Twelve Coopers in a Tank :D Thanks Wayne ... the wizard behind the curtain ;)
14 is correct too ;)
I still say Octal is good :)
It would be a gross error for anyone to think I understand code. I can say this through observation, though, being a customer of PP since the very early days: Wayne deserves a lot more credit than he gets, in my estimation. When Wayne and Tom launched the website and Wayne was building his own in-house system he called O(xxx), there was no Rock Axxx.
Now "O" and the catalog to me are 2 different things. One is a data base and the other is what everyone now calls an app. It's a program run along side of the data base that does things that I don't know about. I've just seen it work, that's all and that's enough for me to make the statement. Steve above put thoughts into words that make sense to me. And I'm glad Wayne shared his thoughts about AI. I only had a hunch but it turns out my hunch is in line with Wayne's explanation. Wayne, thank you. |
AI is a master plagiarizer, nothing more. No logic, no reason, no creativity. For routine, redundant, well defined tasks it has massive potential. But it’s not intelligent.
I was recently reading about Yann LeCun's work on what he terms a "World Model" as AI's real best future. It was in the WSJ on Nov 15. I certainly don't pretend to understand it enough to converse about it, but it is an interesting take on this idea vs. the LLM types.
Hi Wayne, I'm also an LLM skeptic. I worked in a FAANG AI group for a year.
I've got a different analogy - maybe a little closer to the truth, but then also maybe less accessible. There's a thing people do with data called "interpolation." If you have some points, you can do a linear regression that draws a balanced straight line through the scattered points, and then the best guess for values between the points is the value on that line. A straight line is probably not a good match, so there are all sorts of curved interpolation functions too. The point here is that without knowledge of how the points came to be, we really don't know how well the interpolation is working. For example, if we're graphing values of sine at 0, pi, 2*pi, ..., x*pi, then the scatter chart will show points all in a straight line, and the interpolations will all be straight lines, because the interpolation doesn't know the values came from the sine function. The guesses are only correct at multiples of pi. A knowledgeable person could set up the interpolation to just use sine, but an LLM starts without any knowledge, so it builds a big probability model to interpolate the values it's trained with.

LLM training creates an interpolation model for its input data. Instead of 2D scatter charts, it creates vectors (lists of numbers) and constructs a giant n-dimensional vector field. If you look into how to interpolate vector fields, you'll see there are a ton of problems. As the number of dimensions increases, the number of potential discontinuities grows exponentially. A general high-dimensional state space is very scary to interpolate, because you find yourself making stuff up at every turn.

Anyway, an LLM creates a giant model to interpolate between data vectors. The amazing thing is that this approach does appear to work for things we didn't expect. Like human handwriting. Like human languages. Playing games of probability. Human emotional behavior. This is a real reason to be amazed at these new things we made - because they really can do some powerful stuff.
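The sine example above is easy to run for yourself. Here's my own minimal sketch of it: sample sin(x) only at multiples of pi (where it's always zero), linearly interpolate between the samples, and watch the interpolation confidently return a wrong answer for everything in between.

```python
# Sketch (mine) of the interpolation failure described above: the model
# only ever saw sin(x) at multiples of pi, so it "learned" a flat line.
import math

xs = [k * math.pi for k in range(5)]   # sample points: 0, pi, 2pi, 3pi, 4pi
ys = [math.sin(x) for x in xs]         # all of these are ~0.0

def interpolate(x):
    """Linear interpolation between the sampled (xs, ys) points."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("x outside sampled range")

x = math.pi / 2
print(interpolate(x))   # ~0.0 -- the model's confident guess
print(math.sin(x))      # 1.0 -- the true value it never saw
```

The interpolator has no idea it's wrong - it produces a smooth, plausible-looking answer either way, which is exactly the hallucination behavior the analogy is pointing at.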
The future of LLMs is whatever new uses we can find for them - they really can do some unexpectedly nifty stuff. What LLMs can't "do" is the discrete stuff that cannot be interpolated. Integer math. Bitwise math. Logic. Mathematics. Rationality. Writing code. A generated LLM response is a tensor multiply, and a tensor multiply isn't Turing complete. An LLM cannot compute. And when an LLM can't "do" something, it still does its interpolation trick and generates an answer. The answer will read like it's correct, but the process that created it is strictly lexical and isn't based on any semantics.
Good discussion everyone, let's keep it going. This is good info and good feedback.
-Wayne
You need to add an explanation of what LLM stands for.
I once had an assignment at work to develop a way to predict what it would cost to manufacture a machined part. I knew nothing about "programming." I used Excel to make my program. I started off by figuring out how to get a cell to change its value based on what another cell had in it. It's a very powerful tool once you get into it. I wish I'd kept a copy of what I came up with.

While I was working on it, I came upon a problem with an if/then statement that was four layers deep. I asked the in-house computer programmer for some help. He looked at what I had come up with and offered me a job. He said he had guys who'd been working for him for 3 years who couldn't have come up with what I did. When it was finished, you could take a drawing for a part and, based on answers to the 20 or so questions I'd come up with (with dialog boxes coming up to ask the questions), it would give you an estimated cost. You could even change the type of material used and it'd recalculate the cost for you, almost instantly. It was some of the most fun I ever had while working.
Very cool. Good suggestion on the LLM (Large Language Model). I thought about that, and then forgot to add the explanation in there. I hate it when people use acronyms that others don't know. Good call.
Excel is a very, very, very powerful tool - you can write whole apps inside the spreadsheet itself, *or* you can use VBA (Visual Basic for Applications) to write whole production-level programs that run on the back end. The trouble with Excel, I think, lies with updates and version control, etc. I.e., if Excel updates itself (automatically), it may break your code. Hopefully not, but it does happen. Maintenance nightmare...

-Wayne
Eight years now after selling Pelican, and I'm still trying to figure out what to do. I bought this yesterday, I think I'm bored: https://barnfinds.com/1-of-27-built-1999-bmw-540i-touring-wagon/
Need to start a new company. Something robotics and manufacturing. Either that, or real estate. Two completely opposite things... -Wayne |
Great discussion and summary of LLM generative AI.
I tend to agree that it's often "interpolation" - stringing words or pixels together into something resembling reality. For me, it's a great tool for drafting letters and reports. It's a shortcut for research on reasonably common topics. But as the data gets thin, the interpolations get more extreme and it starts to show a lack of "thought."

But damn if I'm not impressed with what Gemini can do to help teach (or check) my kids' Algebra 2 homework. I will often just take a picture of the question and ask for an explanation, and then subsequently ask it to solve the problem showing its work. It's stunningly good, concise, and logical. It also checks homework and has noted things like "Your son did all the steps correctly, but transcribed the final answer wrong when he wrote it in the answer box," or "it appears he was on the correct track, but erased those steps in the left hand margin." Gemini's ability to make these kinds of insights seemingly goes beyond the correlation/interpolation explanation of LLM AI. I suspect the latest models are layering in more, at least for Gemini and math.
Wow, a peek up AI's skirt!
Great read, thanks Wayne :) rjp
"If x then y" is a simple trial-and-error learning method... The "intelligence" is the dataset.

I think it's best to define what intelligence is, and the limits between human intelligence (planning, prediction, metacognition, creativity, problem solving, social intelligence, innovation, etc.) and the archaic trial-and-error method used by LLMs. Post 13 touched on this... To be pragmatic, I feel the trial-and-error method is more than adequate for our current society (which we see in the displacement of industries to automation)... However, the age-old idiom "there's no replacement for displacement" will be the true equalizer to this fancy search engine.