LinkedIn on Gemini

 

Last week, I wrote a post about Google’s Gemini LLM, and its offer to analyze a naming masterlist (a moderately complicated spreadsheet). As I was about to press ‘Post,’ LinkedIn kindly offered to let me "Rewrite with AI." For the sake of science, I had to accept.

Now, my article was 500 words, and LinkedIn’s was another 250 — I’m not going to ask you to compare and contrast these articles yourself, though they’re included below if you’d like to. Fortunately for us all, there’s one short section from these mirrored works that perfectly captures everything I want to say about the summary produced by the LinkedIn LLM (which doesn’t seem to be Copilot for some reason, so I’m going to call it LiLLM).

Let’s check it out.

 

My article:

SIMPLE ANALYSIS
Still, anyone can make a foolish suggestion. I tried a softball, asking the LLM, "How many names have been approved?" The correct answer was 4—there were only 4 names with values for the last gate, and all those values were Y. It had accurately identified the purposes of these gates once, so this should be a lay-up, right?

The answer Gemini gave, for this sheet with 323 names in it, was that 1,751 of them had been approved.

LiLLM's version:

Even when prompted with a straightforward question like "How many names have been approved?" Gemini faltered, indicating an erroneous count of 1,751 approvals instead of the actual 4. This discrepancy underscores the gap between LLMs' capacity for broad analysis and their inability to deliver precise, factual answers consistently.


 

So, what do we think of this snippet? In broad strokes, what advantages might a tool like LiLLM have, and what disadvantages?

Objectively, this LLM performed as advertised. I put in some writing, and some other writing—still legible!—came out the other side. It turned 77 words into 48, just about 60% the length of mine, making it a quicker read and perhaps easier to digest. As part and parcel of that truncation, LiLLM’s output has a completely different style and tone—something that all LLM makers and many users view as a huge positive, smooshing away any messy colloquialisms in favor of erroneous counts and underscored discrepancies.

So far, this seems to be exactly what we’d want from a summarizing tool. After all, rewriting—editing, summarizing—is all about preserving and elevating important information, while eliminating noise and potential sources of confusion, and bringing in other context and ideas where relevant. But we haven’t seen that last part yet, and, as is so often the case with LLM output, LiLLM’s work starts to fall apart when you dig into the details.

Any semblance of narrative has been stripped from the original, which is fundamentally a story about my personal experience and thought process. I’ve tried to invite my readers into the experience of working with an LLM, and tried to encourage them to give it the benefit of the doubt, take it easy on Gemini—so that when the insanity of the answer is revealed, it might hit the reader as hard as it hit me in the moment.

But narrative takes repetition, and words whose purpose might not instantly be clear in the context of a single sentence. LiLLM condenses several sentences of narrative to the word “straightforward,” leads with the bland fact that Gemini fails to answer the question posed, and gives two of the three key details I included. But it misses the nuance: the point of my anecdote isn’t that Gemini gave an answer that was too big, but rather that Gemini gave an answer that was five times bigger than the entire data set. Someone reading LiLLM’s post will have to remember this fact from several paragraphs ago, and might even assume LiLLM itself has made a memory error here, to really get the full force of the idea.

The other thing LiLLM has removed is any room for the reader. I didn’t explain the anecdote I shared, trusting my readers to draw their own conclusions about the value of a tool that can make such an egregious error so casually. But LiLLM can’t leave anything to chance, here, and so it wraps things up for the reader. It states a sentiment it has already stated once and will state again before the post is through: LLMs are good at broad analysis, but they’re bad at delivering precise factual answers.

This is a little uncharitable to the reader, sure, but it also ironically makes my final point for me: that final claim is false. At least, it’s false in suggesting that I believe LLMs are good at analysis—I don’t think they are—but it’s the claim LiLLM wants to average out to. It might not rise to the level of hallucination, because it’s probably an uncontroversial, plausible statement in most settings. But no specific setting is “most settings.”

LLMs love to leap to plausible-seeming conclusions and strip away context like there’s no tomorrow, and that might be acceptable when you’re directly asking ChatGPT or DeepSeek a question that you plan to double-check. But for me, it’s unacceptable—especially at the cost of a liter of clean water—when I’m posting something that ostensibly represents my own thoughts, ideas, and experiences, and those are the things the LLM is stripping away and taking clumsy liberties with. It’s not an utter catastrophe, but it’s also far from useful.

Tune in next week as we delve ever deeper into the digital ouroboros, to discover what Squarespace AI will do to this post, assuming I let it! (Spoiler: it’s even worse with numbers).

***

If you’d like to read some more clumsy liberties, bring your fine-toothed comb below. (Maybe print this page out and break out your highlighters). Otherwise, you might be interested in other things I think about, or even in hearing it straight from the horse’s mouth.

 

My article, with LiLLM's version of each passage following it (labeled):

Today I finally took Google up on its insistent offer, and asked Gemini to "summarize" a naming masterlist (a spreadsheet with 323 names, their risk assessments, and approval status). How'd it do? What can this tell us about LLMs' applicability to naming, and even to rudimentary data analysis?

LiLLM's version:
Gemini, Google's language model, recently provided a summary of a naming masterlist spreadsheet containing 323 names, their risk assessments, and approval statuses.

A GOOD SUMMARY
Gemini offered me a summary, where it identified the role of different columns in the sheet. And it was pretty good! Most of its analysis was accurate, only ignoring a few and only describing one incorrectly (it thought SL meant Sound Like, not Shortlist). It gave descriptions like "Status/Risk Assessment: A risk assessment for the name, ranging from 'Very Low' to 'TBD' (To Be Determined)." I was honestly pleasantly surprised by the output there.

That was the only positive.


LiLLM's version:
The summary accurately identified most column roles, with only minor inaccuracies. For instance, it misinterpreted "SL" as "Sound Like" instead of "Shortlist." Notably, Gemini described the "Status/Risk Assessment" column well, detailing the risk levels from "Very Low" to "TBD" (To Be Determined), showcasing its analytical capabilities.

NON SEQUITUR SUGGESTIONS
Gemini then suggested that I ask it two other questions, so I did. The questions it suggested were, "Create a formula that calculates the percentage of females in the table." and "Create a formula that calculates the total cost of the event." Of course, this was a naming masterlist—it had nothing to do with individuals and genders, no reference to an event, and not a single cell formatted with a dollar sign. So, why these suggestions? My guess here is that Gemini has been manually equipped with some suggestions that Google's coders expect might be relevant, and it's trying to force-fit these presets onto new data. As always, there's a lot more manual programming in your "AI" than you might expect.


LiLLM's version:
However, Gemini's subsequent suggestions veered off-topic, proposing formulas unrelated to the naming data.

Out of curiosity, I still asked it to perform these tasks, and it answered... poorly. The formulas it suggested involved columns that it seemed to have identified properly at the start, containing risk assessments and approval statuses, and tried to count values including the headers of those columns, which it had identified as headers before.

LiLLM's version:
Despite the disconnect, the model attempted to execute these suggestions poorly, highlighting the limitations of pre-programmed suggestions.

This reinforces what we already knew about LLMs: they do not "know" things, or "remember" them, or "learn." It probably would have taken an intern a bit longer than Gemini to tell me what all these columns meant, but that intern would never in a million years offer to calculate the cost of the event.

LiLLM's version:
This experience underscores the inherent limitations of Language Learning Models (LLMs) like Gemini, emphasizing their inability to grasp contextual nuances or provide accurate answers beyond basic analysis.

SIMPLE ANALYSIS
Still, anyone can make a foolish suggestion. I tried a softball, asking the LLM, "How many names have been approved?" The correct answer was 4—there were only 4 names with values for the last gate, and all those values were Y. It had accurately identified the purposes of these gates once, so this should be a lay-up, right?

The answer Gemini gave, for this sheet with 323 names in it, was that 1,751 of them had been approved.


LiLLM's version:
Even when prompted with a straightforward question like "How many names have been approved?" Gemini faltered, indicating an erroneous count of 1,751 approvals instead of the actual 4. This discrepancy underscores the gap between LLMs' capacity for broad analysis and their inability to deliver precise, factual answers consistently.

THE VERDICT
LLMs can do an OK job at giving you an approximate analysis of something big and abstract, like the structure of a spreadsheet. It can help close the gap between "I have no idea what's happening here" and "I kinda get it." But they can't give you a hard fact, no matter how simple, and you can't rely on one answer being based on the last.


LiLLM's version:
In essence, while LLMs like Gemini can offer a rudimentary understanding of complex data structures, they fall short in providing definitive answers or handling specific queries accurately. Their value lies in bridging the comprehension gap between complexity and clarity, rather than delivering precise, reliable information.
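For anyone curious just how trivial the counting task really is, here’s a minimal deterministic sketch. (The column name “Final Gate” and the sample rows are my stand-ins—the post never shows the masterlist’s actual headers or data.)

```python
import csv
import io

# Toy stand-in for the naming masterlist; "Final Gate" is a
# hypothetical column name for the last approval gate.
masterlist = io.StringIO(
    "Name,Risk,Final Gate\n"
    "Aurelia,Low,Y\n"
    "Brontide,Very Low,Y\n"
    "Caldera,TBD,\n"
    "Dovetail,Low,Y\n"
    "Eventide,Medium,Y\n"
    "Fjord,High,\n"
)

rows = list(csv.DictReader(masterlist))

# A name is approved only if the last gate holds a Y.
approved = sum(1 for row in rows if row["Final Gate"] == "Y")

print(approved)               # 4
print(approved <= len(rows))  # True: a count can never exceed the row count
```

One `COUNTIF` in the spreadsheet itself would do the same job—and, unlike Gemini’s 1,751, the result is bounded by the number of rows by construction.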



 

Wow, you read all the way to the end. Good, uh, good job! I bet there’s some more stuff down to the bottom-right that might interest you, if you’re still hungry for words! Yum yum!
