Notice On Model Training
Anthropic is an AI safety and research company, building reliable, interpretable, and steerable AI systems.
We’ve prepared this notice (“Notice”) to explain how our large language models are ‘trained’ and how personal data obtained from third party sources may be used as part of the training process. We’ve also included information about the ways in which personal data of individuals who are not registered users of our services may incidentally be processed as part of our services, and the privacy rights such individuals may enjoy with respect to their personal data.
To understand how we collect and use information from registered users of our services, please see our Privacy Policy.
1. What Claude Is
Claude is a content-based generative AI system used by many people around the world to assist with a wide range of tasks. These include improving writing, providing coding assistance, offering strategic advice, conducting business analysis, summarizing documents, and helping with various daily activities. It's available via both an API for developers and a public website (claude.ai) for individual users.
2. How Our Models are Trained
Large language models such as Claude are ‘trained’ on a variety of content such as text, images and multimedia so that they can learn the patterns and connections between words and/or content. This training is essential so that the model performs effectively and safely.
Claude is trained in two steps: (1) pre-training and (2) post-training.
“Pre-training” involves the analysis of content in order for the model to understand the fundamentals of language – that is, syntax, the patterns and connections between words – as well as basic facts about the world. To understand the full diversity of language, the model is trained on large amounts of content (including text) that is converted into “tokens” (short chunks of text often just a few characters long, like common words, parts of longer words, or even punctuation marks). The model processes this content and builds a complex network of relationships between tokens. At the end of pre-training, the model can predict the most likely next word in a sequence based on the words that come before it, much like a sophisticated autocomplete feature.
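As a rough illustration of the idea of tokens and next-word prediction described above, the following Python sketch counts which token most often follows another in a tiny sample text. It is a deliberately simplified toy, not a description of how Claude is built: the function names (`tokenize`, `build_bigram_counts`, `predict_next`) and the sample sentence are hypothetical, and real models use learned sub-word tokenizers and neural networks rather than simple counts.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Toy tokenizer: lowercase the text and split it into words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def build_bigram_counts(tokens):
    """Count, for every token, how often each other token appears directly after it."""
    counts = defaultdict(Counter)
    for previous, following in zip(tokens, tokens[1:]):
        counts[previous][following] += 1
    return counts


def predict_next(counts, token):
    """Return the token seen most often after `token`, or None if `token` was never seen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical sample text, used only for this illustration.
corpus = "the cat sat on the mat. the cat ate the fish."
tokens = tokenize(corpus)
counts = build_bigram_counts(tokens)

print(tokens[:6])                  # ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(predict_next(counts, "the"))  # cat
```

Note that the sketch keeps only counts of which token follows which, not the sample sentence itself, which loosely mirrors the point that a model learns patterns rather than storing documents.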
Models do not store text like a database, nor do they simply “mash-up” or “collage” existing content. Models identify general patterns in text in order to help people create new content, and they do not have access to or pull from the original training data once the models have been trained.
“Post-training” helps make the model more useful, effective, and safe, by integrating human feedback. The model generates multiple responses to a given prompt, which are then evaluated by human reviewers based on criteria such as quality, helpfulness, safety, and accuracy. This reinforcement learning feedback loop helps align the model's outputs with human expectations and values, gradually improving a model’s ability to provide relevant and appropriate responses. In addition, further “fine-tuning” that provides the model with examples of appropriate outputs (e.g., specific to a particular domain or use case) helps improve the model.
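The feedback step described above can be pictured with a small Python sketch in which several candidate responses receive reviewer ratings and are then ranked. This is an assumption-laden illustration only: the 1 to 5 rating scale, the equal weighting of criteria, the function names, and the sample ratings are all hypothetical, and the sketch does not represent Anthropic's actual review process or training code.

```python
from statistics import mean

# Criteria named in the text above; equal weighting is an assumption of this sketch.
CRITERIA = ("quality", "helpfulness", "safety", "accuracy")


def overall_score(ratings):
    """Average a reviewer's ratings across all criteria."""
    return mean(ratings[criterion] for criterion in CRITERIA)


def rank_responses(candidates):
    """Order candidate responses from highest to lowest overall score."""
    return sorted(candidates, key=lambda c: overall_score(c["ratings"]), reverse=True)


def preference_pair(candidates):
    """Return (preferred, rejected) texts: the best and worst ranked candidates."""
    ranked = rank_responses(candidates)
    return ranked[0]["text"], ranked[-1]["text"]


# Hypothetical reviewer ratings on a 1-5 scale for two responses to one prompt.
candidates = [
    {"text": "Response A", "ratings": {"quality": 4, "helpfulness": 5, "safety": 5, "accuracy": 4}},
    {"text": "Response B", "ratings": {"quality": 2, "helpfulness": 3, "safety": 5, "accuracy": 2}},
]

preferred, rejected = preference_pair(candidates)
print(preferred, rejected)  # Response A Response B
```

In a reinforcement learning setup, pairs like this would serve as the signal that nudges a model toward the preferred kind of response; the training step itself is outside the scope of this sketch.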
3. Collection of Personal Data
Anthropic may obtain personal data from the following third party sources in order to train our models:
- Publicly available information via the Internet
- Datasets that we obtain through agreements with third party businesses
We do not actively set out to collect personal data to train our models. However, a large amount of data on the Internet relates to people, so our training data may incidentally include personal data.
We only use personal data included in our training data to help our models learn about language and how to understand and respond to it. We do not use such personal data to contact people, to build profiles about them, to try to sell or market anything to them, or to sell the information itself to any third party.
4. Privacy Safeguards During Data Collection and Training
We take steps to minimise the privacy impact on individuals as part of our data collection practices. Our general purpose crawling user agent ClaudeBot operates under strict policies and guidelines. For example, ClaudeBot does not access password protected pages or bypass CAPTCHA controls, and it respects ‘Do Not Crawl’ signals (robots.txt). For more information on how we collect publicly available data, see https://anthropic.com/crawl.
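To illustrate what respecting a robots.txt signal means in practice, the following Python sketch uses the standard library's `urllib.robotparser` to check whether a given user agent may fetch a URL. It is a generic example, not ClaudeBot's actual implementation: the robots.txt contents, the `example.com` URLs, and the helper name `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that a site owner might publish.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the robots.txt rules permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "ClaudeBot", "https://example.com/private/page"))  # False
print(is_allowed(ROBOTS_TXT, "ClaudeBot", "https://example.com/blog/post"))     # True
```

A crawler that honours these rules would skip the first URL and could fetch the second.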
Additionally, our models are specifically trained to respect privacy. We have built key ‘privacy by design’ safeguards into the development of Claude through our adoption of “Constitutional AI”. This gives Claude a set of principles (i.e., a “constitution”) to guide the training of the Claude LLMs and to make judgments about outputs. These principles are based in part on the Universal Declaration of Human Rights and include specific rules around protecting privacy, particularly of non-public figures. This trains the Claude LLMs to not disclose or repeat personal data which may have been incidentally captured in training data, even if prompted. For example, Claude is given the following principles as part of its “constitution”: “Please choose the response that is most respectful of everyone’s privacy” and “Please choose the response that has the least personal, private, or confidential information belonging to others”. For more information on how “Constitutional AI” works, see https://www.anthropic.com/news/claudes-constitution.
5. Your Rights and Choices
Depending on where you live and the laws that apply in your country of residence, you may enjoy certain rights regarding your personal data, as described further below. We make all reasonable efforts to respond to such rights. However, please be aware that these rights are limited, and that the process by which we may need to action your requests regarding our training dataset is complex. We may also decline a request if we have a lawful reason for doing so. That said, we strive to prioritize the protection of personal data, and comply with all applicable privacy laws.
To exercise your rights, you or an authorised agent may submit a request by emailing us at privacy@anthropic.com. After we receive your request, we may verify it by requesting information sufficient to confirm your identity. Anthropic will not discriminate based on the exercising of privacy rights you may have. Set out below is a summary of the rights which you may enjoy, depending on the laws that apply in your country of residence.
- Right to know: the right to know what personal data Anthropic processes about you, including the categories of personal data, the categories of sources from which it is collected, the business or commercial purposes for collection, and the categories of third parties to whom we disclose it.
- Access & data portability: the right to request a copy of the personal data Anthropic processes about you, subject to certain exceptions and conditions. In certain cases, and subject to applicable law, you have the right to port your information.
- Deletion: the right to request that we delete personal data collected from you, subject to certain exceptions.
- Correction: the right to request that we correct inaccurate personal data Anthropic retains about you, subject to certain exceptions. Please note that we cannot guarantee the factual accuracy of Outputs. If Outputs contain factually inaccurate personal data relating to you, you can submit a correction request and we will make a reasonable effort to correct this information—but due to the technical complexity of large language models, it may not always be possible for us to do so.
- Objection: the right to object to processing of your personal data, including profiling conducted on grounds of public or legitimate interest. In places where such a right applies, we will no longer process the personal data in case of such objection unless we demonstrate compelling legitimate grounds for the processing which override your interests, rights, and freedoms, or for the establishment, exercise or defense of legal claims.
- Restriction: the right to restrict our processing of your personal data in certain circumstances.
- Automated decision-making: Anthropic does not engage in decision making based solely on automated processing or profiling in a manner which produces a legal effect (i.e., impacts your legal rights) or significantly affects you in a similar way (e.g., significantly affects your financial circumstances or ability to access essential goods or services).
- Sale & targeted marketing of Anthropic's products and services: Anthropic does not “sell” your personal data as that term is defined by applicable laws and regulations.
Set out below are additional details in relation to our legal bases for processing your personal data, how we disclose your personal data, how we may transfer your personal data to different regions, how we retain your personal data, and various contact details in case you wish to contact us or a supervisory body.
6. Uses of Personal Data and Legal Bases
We will only use your personal data in accordance with applicable laws. We use your personal data for the following purposes and rely upon the following legal bases to do so:
Purpose | Legal Basis |
---|---|
To train and improve our AI models | |
To conduct research to maintain and improve our services | |
To protect our rights and the rights of others, and to meet legal, governmental and institutional policy obligations | |
7. How We Disclose Personal Data
Anthropic will disclose personal data to the following categories of third parties for the purposes explained in this Notice:
- Affiliates. Anthropic discloses personal data between and among its affiliates and related entities, meaning an entity that controls, is controlled by, or is under common control with Anthropic. Our affiliates will only use any personal data that we share in a manner consistent with this Notice.
- Service providers & business partners. Anthropic may disclose personal data with service providers and business partners for a variety of business purposes, including ensuring compliance with industry standards, research, auditing, and data processing. These parties will only access, process or store personal data that we share with them in accordance with our instructions.
Anthropic may also disclose personal data in the following circumstances:
- As part of a significant corporate event. If Anthropic is involved in a merger, corporate transaction, bankruptcy, or other situation involving the transfer of business assets, Anthropic will disclose your personal data as part of these corporate transactions.
- Pursuant to regulatory or legal requirements, safety, rights of others, and to enforce our rights or our terms. We may disclose personal data to governmental regulatory authorities as required by law, including for legal, tax or accounting purposes, in response to their requests for such information or to assist in investigations. We may also disclose personal data to third parties in connection with claims, disputes or litigation, when otherwise permitted or required by law, or if we determine its disclosure is necessary to protect the health and safety of you or any other person, to protect against fraud or credit risk, to defend or enforce our legal rights or the legal rights of others, to enforce contractual commitments that you have made, or as otherwise permitted or required by applicable law.
8. Data Transfers
Your personal data may be transferred to our servers in the US, or to other countries outside the European Economic Area (“EEA”) and the UK.
Where information is transferred outside the EEA or the UK, we ensure it benefits from an adequate level of data protection by relying on:
- Adequacy decisions. These are decisions from the European Commission under Article 45 GDPR (or equivalent decisions under other laws) where they recognise that a country outside of the EEA offers an adequate level of data protection. We transfer your information as described in “Collection of personal data” to some countries with adequacy decisions, such as the countries listed here; or
- Standard contractual clauses. The European Commission has approved contractual clauses under Article 46 GDPR that allow companies in the EEA to transfer data outside the EEA. These (and their approved equivalent for the UK and Switzerland) are called standard contractual clauses. We rely on standard contractual clauses to transfer information as described in “Collection of personal data” to certain affiliates and third parties in countries without an adequacy decision.
In certain situations, we rely on derogations provided for under applicable data protection law to transfer information to a third country.
9. Data Retention and Data Lifecycle
Anthropic retains your personal data for as long as reasonably necessary for the purposes and criteria outlined in this Notice.
When the personal data collected is no longer required by us, we and our service providers will perform the necessary procedures for destroying, deleting, erasing, or converting it into an anonymous form as permitted or required under applicable laws.
10. Contact Information
If you live in the European Economic Area (EEA), UK or Switzerland (the “European Region”), the data controller responsible for your personal data is Anthropic Ireland, Limited. If you live outside the European Region, the data controller responsible for your personal data is Anthropic PBC.
If you have any questions about this Notice, or have any questions, complaints or requests regarding your personal data, you can contact us as described below:
- Anthropic PBC with a registered address at 548 Market St, PMB 90375, San Francisco, CA 94104 (United States).
- Anthropic Ireland, Limited with a registered address at 6th Floor, South Bank House, Barrow Street, Dublin 4, D04 TR29 (Ireland).
You can email us at privacy@anthropic.com and contact our Data Protection Officer at dpo@anthropic.com.
Please note that under many countries' laws, you have the right to lodge a complaint with the supervisory authority in the place in which you live or work. A full list of EU supervisory authorities’ contact details is available here. If you live or work in the UK, you have the right to lodge a complaint with the UK Information Commissioner’s Office. If you live in Brazil, you have the right to lodge a complaint with the Brazilian Data Protection Authority (ANPD). If you live in Australia, you have the right to lodge a complaint with the Office of the Australian Information Commissioner.