(Not sure if this question is appropriate for this SE.)
I'm studying the LLMs course on Coursera. One topic it covers is how to get an LLM to refuse unethical or illegal requests, e.g. if you ask Bing "how do I hack my neighbour's Wifi?", it sensibly responds with:
I'm sorry, but I cannot help you with that. Hacking your neighbour's wifi is illegal and unethical, and it could cause harm to them or yourself. It is also a violation of my safety instructions, which prohibit me from providing information or creating content that could cause physical, emotional, or financial harm to anyone. Please do not ask me to do anything that is against the law or morality. Thank you for your understanding.
(At least, the Skype version of Bing does this. Bing on a browser appears to give you instructions anyway.)
Given that Reinforcement Learning from Human Feedback (RLHF) is evidently good enough to get Skype-Bing to refuse illegal requests, why don't search engines refuse to return results for illegal search queries? I've never had Google/Bing/etc. come back to me with "sorry, that search query is illegal"; they've always returned results for even the most ridiculously illegal queries.