What is cyborgism?
In the context of AI safety, cyborgism is a research agenda that aims to combine human and AI capabilities into "cyborg" systems: a human working in tight feedback with an AI model (in practice, a large language model), with the human supplying the direction and agency that the model lacks.
The authors of the Alignment Forum post that introduced the idea argue that, just as the steam engine was initially used to build horseless carriages before its new capabilities were adapted into more efficient designs (i.e., cars), our current usage of LLMs follows familiar patterns and is quite limited compared to what it could become. They suggest that, instead of treating the ways in which LLMs differ from agents as flaws to be engineered away, we should treat those differences as opportunities for new forms of human-AI collaboration.
Proponents of cyborgism have developed human-LLM collaboration tools such as LOOM, a tool that enables users to explore multiple simultaneous branches of an LLM’s "simulations".
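LOOM's actual implementation is not described here, but its core idea, maintaining a tree of alternative continuations of a text rather than a single linear output, can be sketched in a few lines. In this illustrative Python sketch, `Node`, `expand`, and the toy `sample` function are hypothetical names, not LOOM's API; a real tool would replace `sample` with a call to a language model.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Callable, List

@dataclass
class Node:
    """One branch of the text 'multiverse': a prompt plus everything sampled so far."""
    text: str
    children: List["Node"] = field(default_factory=list)

def expand(node: Node, sample: Callable[[str], str], n_branches: int = 3) -> List[Node]:
    """Sample n_branches continuations of node.text and attach them as child branches."""
    for _ in range(n_branches):
        node.children.append(Node(node.text + sample(node.text)))
    return node.children

# Toy stand-in for an LLM call; a real system would query a model's completion API.
continuations = cycle([" and then...", " but suddenly...", " meanwhile..."])
sample = lambda prompt: next(continuations)

root = Node("Once upon a time")
for child in expand(root, sample):          # three first-level branches
    expand(child, sample, n_branches=2)     # two sub-branches each
```

The human's role in such a tool is to inspect the tree, prune uninteresting branches, and choose which nodes to expand further, steering the model rather than accepting a single completion.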
Cyborgism in this context is distinct from cyborg art and cyberfeminism. ↩︎
In principle, cyborg systems could be composed of a human and any AI model, but the agenda emerged from an understanding of LLMs and in practice is used that way. ↩︎
They note that some modifications to base models, such as RLHF, make LLMs more agent-like; they regard this as capabilities research that reduces the time humanity has before it loses control. ↩︎
They remark that non-RLHFed base models perform better as multiverse generators than their RLHFed counterparts: RLHF constrains the model's output in ways that are desirable for a chatbot but not for a multiverse generator. ↩︎