Model card / system card
When new AI models are released, it is customary to include a "model card" or "system card" that describes the model's capabilities, how the model was trained, how the model was tested for safety, and any unique risks that the model may present. Publishing a robust and honest model card is an important act of transparency, even if it highlights alarming risks or limitations.
These documents can get technical, but most are accessible to a general reader. Journalists can find fascinating details in a model card, and when a new model drops, it's always worth a read.
Details about how model makers "red team" new models can be especially revealing. These sections often include examples of safety failures and the steps the company has taken to reduce potential harms.
Examples of model cards in news coverage:
'Our evaluations found that o1-preview and o1-mini can help experts with the operational planning of reproducing a known biological threat, which meets our medium risk threshold,' the system card reports. - Sherwood News
In the model's system card, OpenAI details how well the new 5.1 models compare to the earlier 5.0 models on internal benchmarks for disallowed content. The company has said it is prioritizing the addition of new checks to help users who may be suffering a mental health crisis, after a series of alarming incidents where ChatGPT encouraged self-harm and reinforced delusional behavior. - Sherwood News