Spent the weekend actually testing a few AI tools instead of just watching demos. Copilot was great for boilerplate, but Claude gave clearer explanations. Biggest surprise: how much prompt wording changed the results. Anyone else benchmarking these in real workflows?