Writing code is itself a process of scientific exploration: you think about what will happen, and then you test it from different angles to confirm or falsify your assumptions.
What you confuse here is doing something that can benefit from applying logical thinking with doing science. For example, arithmetic is part of math and math is a science. But summing numbers is not necessarily doing science. And if you roll, say, octal dice to see if the result happens to match an addition task, that is certainly not doing science, and no, the dice still can’t think logically and certainly don’t do math even if the result sometimes happens to be correct.
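To make the dice analogy concrete, here is a minimal sketch in Python; the die range, the size of the sums, and the trial count are assumptions made up for illustration, not anything from the post:

```python
import random

# Hypothetical illustration of the dice analogy above: "answer" small addition
# problems with a random octal digit (0-7) and count the accidental hits.
random.seed(0)

trials = 10_000
hits = 0
for _ in range(trials):
    a, b = random.randint(0, 3), random.randint(0, 3)  # sums stay in 0..6
    roll = random.randint(0, 7)                        # the "octal die"
    if roll == a + b:
        hits += 1

print(f"accidentally correct: {hits}/{trials} ({hits / trials:.1%})")
# About 1 in 8 rolls match, yet the die neither adds nor reasons about anything.
```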
For the dynamic vs static typing debate, see the article by Dan Luu:
https://danluu.com/empirical-pl/
But this is not the central point of the above blog post. The central point is that, because LLMs by their very nature produce statistically plausible output, self-experimenting with them subjects one to very strong psychological biases via the Barnum effect. Therefore it is, first, not even possible to assess their usefulness for programming by self-experimentation(!), and second, it is even harmful, because these effects lead to self-reinforcing and harmful beliefs.
And the quibbling about what “thinking” means just shows that the pro-AI arguments have degraded into a debate about belief. The argument has become “but it seems to be thinking to me”, even though it is neither technically possible nor observed in practice that LLMs apply logical rules: they cannot derive logical facts, cannot explain their output by reasoning, are not aware of what they ‘know’ and don’t ‘know’, and cannot optimize decisions against multiple complex and sometimes contradictory objectives (which is absolutely critical to any sane software architecture).
What would be needed here are objective, controlled experiments testing whether developers equipped with LLMs can produce working and maintainable code any faster than those not using them.
And the very likely result is that the code which they produce using LLMs is never better than the code they write themselves.
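To sketch what the analysis side of such a controlled experiment could look like, here is a hedged Python example comparing task-completion times of an LLM group and a control group with Welch’s t-test; the group sizes and all the numbers are invented placeholders, not results from any real study:

```python
from statistics import mean
from scipy.stats import ttest_ind  # two-sample t-test from SciPy

# Invented placeholder completion times (minutes), purely to show the shape of
# the analysis -- these are NOT findings from any actual experiment.
with_llm    = [42, 55, 38, 61, 47, 50, 44, 58]
without_llm = [49, 52, 40, 65, 46, 57, 43, 60]

result = ttest_ind(with_llm, without_llm, equal_var=False)  # Welch's t-test
print(f"mean with LLM:    {mean(with_llm):.1f} min")
print(f"mean without LLM: {mean(without_llm):.1f} min")
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
# A real study would also need blinded grading of correctness and
# maintainability, not just raw speed, plus enough participants for power.
```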
What you confuse here is doing something that can benefit from applying logical thinking with doing science.
I’m not confusing that. Effective programming requires and consists of small-scale application of the scientific method to the systems you work with.
the argument has become “but it seems to be thinking to me”
I wasn’t making that argument, so I don’t know what you’re getting at with this. For the purposes of this discussion I think it doesn’t matter at all how the code was written or whether whatever wrote it is truly intelligent. The important thing is the code that is the end result: whether it does what it is intended to do and nothing harmful, and whether the programmer working with it is able to accurately determine that it does what it is intended to.
The central point is that, because LLMs by their very nature produce statistically plausible output, self-experimenting with them subjects one to very strong psychological biases via the Barnum effect. Therefore it is, first, not even possible to assess their usefulness for programming by self-experimentation(!), and second, it is even harmful, because these effects lead to self-reinforcing and harmful beliefs.
I feel like “not even possible to assess their usefulness for programming by self-experimentation(!)” is necessarily a claim that reading and testing code is something no one can do, which is absurd. If the output is often correct, then the means of creating it is likely useful, and you can tell whether the output is correct by evaluating it the same way you evaluate any computer program, without needing to directly evaluate the LLM itself. It should be obvious that this is possible. Saying not to do it seems kind of like some “don’t look up” stuff.
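To illustrate that last point, here is a small hedged sketch: an ordinary unit test exercises a function the same way regardless of whether a person or an LLM wrote it. The slugify function and its test cases are hypothetical, invented for the example:

```python
import unittest

def slugify(title: str) -> str:
    """Hypothetical function under test; it could have been written by hand or
    generated by an LLM -- the tests below neither know nor care."""
    return "-".join(part for part in title.lower().split())

class TestSlugify(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  Don't   Look  Up "), "don't-look-up")

    def test_empty_input(self):
        self.assertEqual(slugify(""), "")

if __name__ == "__main__":
    unittest.main()
```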