84 points

I’m convinced people who can’t tell when a chat bot is hallucinating are also bad at telling whether something else they’re reading is true or not. What online are you reading that you’re not fact checking anyway? If you’re writing a report you don’t pull the first fact you find and call it good, you need to find a couple citations for it. If you’re writing code, you don’t just write the program and assume it’s correct, you test it. It’s just a tool and I think most people are coping because they’re bad at using it

9 points

Yeah. GPT models are in a good place for coding tbh, I use it every day to support my usual practice, it definitely speeds things up. It’s particularly good for things like identifying niche python packages & providing example use cases so I don’t have to learn shit loads of syntax that I’ll never use again.

31 points

In other words, it’s the new version of copying code from Stack Overflow without going to the trouble of properly understanding what it does.

6 points

Pft, you must have read that wrong, it’s clearly turning them into master programmers one query at a time.

5 points

I know how to write a tree traversal, but I don’t need to because there’s a python module that does it. This was already the case before LLMs. Now, I hardly ever need to do a tree traversal, honestly, and I don’t particularly want to go to the trouble of learning how this particular python module needs me to format the input or whatever for the one time this year I’ve needed to do one. I’d rather just have something made for me so I can move on to my primary focus, which is not tree traversals. It’s not about avoiding understanding, it’s about avoiding unnecessary extra work. And I’m not talking about saving the years of work it takes to learn how to code, I’m talking about the 30 minutes of work it would take for me to learn how to use a module I might never use again. If I do, or if there’s a problem I’ll probably do it properly the second time, but why do it now if there’s a tool that can do it for me with minimum fuss?

3 points

The usefulness of Stack Overflow or a GPT model completely depends on who is using it and how.

It also depends on who or what is answering the question, and I can’t tell you how many times someone new to SO has been scolded or castigated for needing/wanting help understanding something another user thinks is simple. For all of the faults of GPT models, at least they aren’t outright abusive to novices trying to learn something new for themselves.

69 points
*

I just tried out Gemini.

I asked it several questions of the form ‘are there any things of category x which also are in category y?’

It would often confidently reply ‘No, here’s a summary of things that meet all your conditions to fall into category x, but sadly none also fall into category y’.

Then I would reply, ‘wait, you don’t know about thing gamma, which does fall into both x and y?’

To which it would reply ‘Wow, you’re right! It turns out gamma does fall into x and y’ and then give a bit of a description of how/why that is the case.

After that, I would say ‘… so you… lied to me. ok. well anyway, please further describe thing gamma that you previously said you did not know about, but now say that you do know about.’

And that is where it gets … fun?

It always starts with an apology template.

Then, if it’s some kind of topic that it has almost certainly been manually dissuaded from talking about, it lies again and says ‘actually, I do not know about thing gamma, even though I just told you I did’.

If it is not a topic that it has been manually dissuaded from talking about, it does the apology template and then also further summarizes thing gamma.

I asked it ‘do you write code?’ and it gave a moderately lengthy explanation of how it is comprised of code, but does not write its own code.

Cool, not really what I asked. Then command ‘write an implementation of bogo sort in python 3.’

… and then it does that.
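For reference, a bogo sort in Python 3 can look something like the sketch below. This is not the output Gemini produced, just an illustrative version; the names `is_sorted` and `bogo_sort` and the optional `rng` parameter are assumptions made for the example.

```python
import random


def is_sorted(items):
    """Return True if items are in non-decreasing order."""
    return all(a <= b for a, b in zip(items, items[1:]))


def bogo_sort(items, rng=None):
    """Shuffle a copy of items until it happens to come out sorted.

    The expected work grows roughly like n * n!, so keep inputs tiny.
    """
    rng = rng or random.Random()
    result = list(items)  # work on a copy, leave the caller's list alone
    while not is_sorted(result):
        rng.shuffle(result)
    return result


if __name__ == "__main__":
    print(bogo_sort([3, 1, 2]))
```

Checking the result is the same step you would apply to any generated code: the `is_sorted` helper doubles as a cheap test of whatever comes back.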

Awesome. Hooray. Billions and billions of dollars for a shitty way to reform web search results into a conversational form, which is very often confidently wrong and misleading.

14 points

Idk why we have to keep re-hashing this debate about whether AI is a trustworthy source or summarizer of information when it’s clear that it isn’t - at least not often enough to justify this level of attention.

It’s not as valuable as the marketing suggests, but it does have some applications where it may be helpful, especially if given a conscious effort to direct it well. It’s better understood as a mild curiosity and a proof of concept for transformer-based machine learning that might eventually lead to something more profound down the road but certainly not as it exists now.

What is really un-compelling, though, is the constant stream of anecdotes about how easy it is to fool into errors. It’s like listening to an adult brag about tricking a kid into thinking chocolate milk comes from brown cows. It makes it seem like there’s some marketing battle being fought over public perception of its value as a product that’s completely detached from how anyone actually uses or understands it as a novel piece of software.

21 points
*

Probably it keeps getting rehashed because people who actually understand how computers work are extremely angry and horrified that basically every idiot executive believes the hype and then asks their underlings to implement it, and will then blame them for doing what they were asked to do when it turns out the idea was really, unimaginably stupid, but idiot executive gets golden parachute and software person gets fired.

That, and/or the widespread proliferation of this bullshit is making stupid people more stupid, and just making more people stupid in general.

Or how like all the money and energy spent on this is actively murdering the environment and dooming the vast majority of our species, when it could be put toward building affordable housing or renovating crumbling infrastructure.

Don’t worry, if we keep throwing exponentially increasing amounts of effort at the thing with exponentially diminishing returns, eventually it’ll become God!

-1 points

Then why are we talking about someone getting it to spew inaccuracies in order to prove a point, rather than the decision of marketing execs to proliferate its use for a million pointless implementations nobody wants at the expense of far higher energy usage?

Most people already know and understand that it’s bad at most of what execs are trying to push it as, it’s not a public-perception issue. We should be talking about how energy-expensive it is, and curbing its use on tasks where it isn’t anything more than an annoying gimmick. At this point, it’s not that people don’t understand its limitations, it’s that they don’t understand how much energy it’s costing and how it’s being shoved into everything we use without our noticing.

Somebody hopping onto openAI or Gemini to get help with a specific topic or task isn’t the problem. Why are we trading personal anecdotes about sporadic personal usage when the problem is systemic, not individualized?

people who actually understand how computers work

Bit idea for moderators: there should be a site or community-wide auto-mod rule that replaces this phrase with ‘eat all their vegetables’ or something that is equally un-serious and infantilizing as ‘understand how computers work’.

3 points

to fool into errors

tricking a kid

I’ve never tried to fool or trick AI with excessively complex questions. When I tried to test it (a few different models over some period of time - ChatGPT, Bing AI, Gemini) I asked stuff as simple as “what’s the etymology of this word in that language”, “what is [some phenomenon]”. The models still produced responses ranging from shoddy to absolutely ridiculous.

completely detached from how anyone actually uses

I’ve seen numerous people use it the same way I tested it, basically a Google search that you can talk with, with similarly shit results.

0 points

Why do we expect a higher degree of trustworthiness from a novel LLM than we do from any given source or forum comment on the internet?

At what point do we stop hand-wringing over llms failing to meet some perceived level of accuracy and hold the people using it responsible for verifying the response themselves?

There’s a giant disclaimer on every one of these models that responses may contain errors or hallucinations. At this point I think it’s fair to blame the user for ignoring those warnings and not the models for not meeting some arbitrary standard.

11 points

And then more money spent on adding that additional garbage filter to the beginning and the end of the process which certainly won’t improve the results.

7 points

copilot did the same with basic math. just to test it I said “let’s say I have a 10x6 rectangle. what number would I have to divide width and height by, in order to end up with a rectangle that’s half the area?”

it said “in order to make it half, you should divide them by 2. so [pointlessly lengthy steps explaining the divisions]”

I said “but that would make the area 5x3 = 15 units which is not half the area of 60”

it said “you’re right! in order to … [fixing the answer to √2 using approximation]”
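For anyone who wants to check the arithmetic: dividing both sides by k turns the area w·h into w·h / k², so halving the area needs k² = 2, i.e. k = √2. A quick sketch in Python (the function name `scale_divisor` is made up for this example):

```python
import math


def scale_divisor(area_ratio):
    """Divisor k for both sides so the new area is area_ratio times the old one.

    (w / k) * (h / k) = w * h / k**2, so k = sqrt(1 / area_ratio).
    """
    return math.sqrt(1 / area_ratio)


width, height = 10, 6
k = scale_divisor(0.5)                      # sqrt(2), roughly 1.414
new_area = (width / k) * (height / k)       # about 30, half of 60
wrong_area = (width / 2) * (height / 2)     # 15, a quarter of 60

print(k, new_area, wrong_area)
```

Dividing by 2 shrinks each side by half, so the area drops to a quarter, which is exactly the mistake described above.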

I don’t know if I said it then, or after some other fucking nonsense but when I said “you’re useless” it had the fucking audacity to take offense and end the conversation!

like fuck off, you don’t get to have fake pride if you don’t have basic fake intelligence but use it in your description.

8 points
*

It’s a perfect encapsulation of the corpo mindset:

Whatever I do is profound, meaningful, with endless possibilities for future greatness…

… even though I’m just talking out of my ass 99% of the time…

… and if you have the audacity, the nerve, to have a completely normal reaction when you determine that that is what I am doing, pshaw, how uncouth, I won’t stand for your abuse!

They’ve done it. They’ve made a talking (not thinking) machine in their own image.

And it was not good.

You start a conversation you can’t even finish it
You’re talkin’ a lot, but you’re not sayin’ anything
When I have nothing to say, my lips are sealed
Say something once, why say it again?

Psycho Killer
Qu’est-ce que c’est

2 points

please further describe thing gamma that you previously said you did not know about, but now say that you do know about.’

It’s quite amusing to ask it about conspiracy theories. There’s a huge amount in its training set (not because the theories are true, just that they are often written about) that it has been dissuaded from discussing.

1 point
*

Cool, not really what I asked. Then command ‘write an implementation of bogo sort in python 3.’

… and then it does that.

Alright, but… it did the thing. That’s a feature older search engines couldn’t reliably perform. The output is wonky and the conversational style is misleading. But it’s not materially worse than sifting through wrong answers on StackExchange or digging through a stack of physical textbooks looking for Python 3 Bogo Sort IRL.

I agree AI has annoying flaws and flubs. And it does appear we’re spending vast resources doing what a marginal improvement to Google five years ago could have done better. But this is better than previous implementations of search, because it gives you discrete applicable answers rather than a collection of dubiously associated web links.

6 points
*

But this is better than previous implementations of search, because it gives you discrete applicable answers rather than a collection of dubiously associated web links.

Except for when you ask it to determine if a thing exists by describing its properties, and then it says no such thing exists while providing a discrete response explaining in detail how there are things that have some, but not all of those properties…

… And then when you ask it specifically about a thing you already know about that has all those properties, it tells you about how it does exist and describes it in detail.

What is the point of a ‘conversational search engine’ if it cannot help you find information unless you already know about said information?!

The whole, entire point of formatting it into a conversational format is to trick people into thinking they are talking to an expert, an archivist with encyclopedic knowledge, who will give them accurate answers.

Yet it gatekeeps information that it does have access to but omits.

The format of providing a bunch of likely related links to a query is much more reminiscent of doing actual research: there is no impression that you will find what you want right away, only that this is a tool to aid you in your research process.

This is only an improvement if you want to further unteach people how to do actual research and critical thinking.

0 points

Except for when you ask it to determine if a thing exists by describing its properties

Basic search can’t answer that either. You’re describing a task neither system is well equipped to accomplish.

1 point

I don’t feel like off-the-cuff summaries by AI can replace web sites and detailed articles written by knowledgeable humans. Maybe if you’re looking for a basic summary of a topic.

1 point

I don’t feel like off-the-cuff summaries by AI can replace web sites and detailed articles written by knowledgeable humans

No. But that’s not what a typical search result returns.

There’s also no guarantee the “detailed articles” you get back are well-informed or correct. Lots of top search results are just ad copy or similar propaganda. YouTube, in particular, is rife with long winded bullshitters.

What you’re looking for is a well-edited trustworthy encyclopedia, not a search engine.

31 points

I beg someone to help me. There is this new guy at my workplace, officially a developer, who can’t write code at all. He pasted an entire project I did into ChatGPT with “optimize this” and opened a pull request with the result. I swear.

16 points

Report up the chain, if it’s safe to do so and they are likely to understand.

Also, check what your company’s rules regarding data security and LLM use are. My understanding is that at many places putting private company or customer data into an outside LLM is seen as shouting company secrets out to the open internet. At least that’s the policy where I’m at. Pasting an entire project in would definitely violate things for my workplace.

In general that’s rude as hell. New guy comes in, grabs an entire project they have no background with, and just chucks it at an LLM? No actual review of it themselves, just an assumption that your code is so shit that a general use text generator will do better? Doesn’t sound like a “team player” to me (management eats that kind of talk up).

Maybe couch it as “I want to make sure that as a team, we’re utilizing the tools available to us in the best way possible to multiply our strengths. That said, I’m concerned the approach that [LLM idiot] is using will only result in more work for the team. Using chatGPT as he has is an explosive approach, when I feel that a more scalpel-like approach to address specific areas for improvement would be the best method moving forward. We should be using these tools to address specific concerns, not chucking everything at the wall in some never ending chase of an undefined idea of ‘more optimized’.”

Perhaps frame it in terms of man hours? The immediateness of 5 minutes in chatGPT can cost the team multiple workdays in reviewing the output, whereas more focused code review up front can reduce the man hour cost significantly.

There’s also a bunch of articles out there online about how overuse of LLMs is leading to a measurable decrease in code quality and increase in security issues in code bases.

5 points

Such a great answer, thank you lots!

27 points

Because if I haven’t found anyone asking the same question on a search index, ChatGPT won’t tell me to just use Google or close my question as a duplicate when it’s not a duplicate.

20 points
*

Reminder that all these chat-formatted LLMs are just text-completion engines trained on text formatted like a chat. You’re not having a conversation with it, it’s “completing” the chat history you’re providing it, by randomly(!) choosing the next text tokens that seem like they best fit the text provided.

If you don’t directly provide, in the chat history and/or the text completion prompt, the information you’re trying to retrieve, you’re essentially fishing for text in a sea of random text tokens that seems like it fits the question.

It will always complete the text, even if the tokens it chooses only minimally fit the context. It chooses the best text it can, but it will always complete the text.
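A toy sketch of that "randomly choosing the next token" step, to make it concrete. Everything here is hypothetical: the vocabulary, the scores, and the function name `sample_next_token` are invented for illustration, and a real model scores tens of thousands of tokens with a neural network rather than a hand-written dict. The part it shows is the weighted random pick (softmax over scores, with a temperature knob).

```python
import math
import random


def sample_next_token(scores, temperature=1.0, rng=None):
    """Pick one token at random, weighted by softmax(scores / temperature)."""
    rng = rng or random.Random()
    tokens = list(scores)
    scaled = [scores[t] / temperature for t in tokens]
    peak = max(scaled)  # subtract the max before exp() for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical scores for the token after "The capital of France is"
toy_scores = {"Paris": 5.0, "Lyon": 2.0, "a": 1.0, "banana": -3.0}

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
samples = [sample_next_token(toy_scores, rng=rng) for _ in range(1000)]
counts = {t: samples.count(t) for t in toy_scores}
print(counts)
```

Note that the function has no way to say "I don't know": whatever the scores are, some token comes out. Even a vocabulary where every option scores terribly still produces an answer, which is the "it will always complete the text" behaviour described above.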

This is how they work, and anything else is usually the company putting in a bunch of guide bumpers to reformat prompts into coaxing the models to respond in a “smarter” way (see GPT-4o and “chain of reasoning”)

7 points

They were trained on reddit. How much would you trust a chatbot whose brain consists of the entirety of reddit put in a blender?

I am amazed it works as well as it does. Gemini only occasionally tells people to kill themselves.


People Twitter

!whitepeopletwitter@sh.itjust.works

People tweeting stuff. We allow tweets from anyone.

RULES:

  1. Mark NSFW content.
  2. No doxxing people.
  3. Must be a tweet or similar
  4. No bullying or international politics
  5. Be excellent to each other.
  6. Provide an archived link to the tweet (or similar) being shown if it’s a major figure or a politician.
