IPR in Licensing AI-Generated Research Data
1. Introduction
AI-generated research data refers to datasets, outputs, predictions, simulations, models, or analytical results produced autonomously or semi-autonomously by artificial intelligence systems, often used in:
Biomedical research
Climate modeling
Drug discovery
Financial forecasting
Social science analytics
Licensing such data raises complex IPR questions, including:
Who owns AI-generated data?
Is AI-generated data protected by copyright?
Can databases generated by AI be licensed?
What role do trade secrets and contracts play?
How is infringement assessed when data is reused?
2. Applicable IPR Regimes
a. Copyright
Protects original expression, not raw facts or data.
AI-generated data usually lacks human authorship, so copyright protection is weak or absent.
b. Database Rights
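Protect substantial investment in obtaining, verifying, or presenting data; this sui generis right exists in the EU and UK but has no US equivalent.
Relevant to AI-generated research datasets, though investment in creating the data itself, as opposed to collecting existing data, generally does not count.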
c. Trade Secrets
Used when AI-generated research data is confidential and commercially valuable.
d. Contractual Licensing
The primary legal mechanism for controlling AI-generated data.
Licenses define use, redistribution, training rights, and derivative works.
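One way to make such terms operational is to record them in machine-readable form. The sketch below is a minimal Python illustration, assuming a hypothetical `DataLicense` record and `Use` categories invented for this example; it is not a standard schema and does not replace the license text itself.

```python
from dataclasses import dataclass
from enum import Enum


class Use(Enum):
    """Hypothetical use categories, taken from the terms listed above."""
    RESEARCH = "research"
    MODEL_TRAINING = "model_training"
    REDISTRIBUTION = "redistribution"
    DERIVATIVE_WORKS = "derivative_works"


@dataclass(frozen=True)
class DataLicense:
    """Illustrative record of what a license grants (assumed structure)."""
    licensor: str
    licensee: str
    granted_uses: frozenset


def is_permitted(data_license: DataLicense, use: Use) -> bool:
    """Return True only if the use was expressly granted."""
    return use in data_license.granted_uses


# Example parties are placeholders, not real organizations.
example = DataLicense(
    licensor="Example Research Institute",
    licensee="Example Analytics Ltd",
    granted_uses=frozenset({Use.RESEARCH, Use.MODEL_TRAINING}),
)

print(is_permitted(example, Use.MODEL_TRAINING))
print(is_permitted(example, Use.REDISTRIBUTION))
```

The sketch treats any use not expressly granted as refused; that default is a design assumption, and the agreement's wording governs in practice.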
3. Core Legal Issues in Licensing AI-Generated Research Data
| Issue | Explanation |
|---|---|
| Authorship | AI cannot legally be an author |
| Ownership | Usually vests in AI developer, deployer, or funder |
| Protectability | Raw data usually not protected |
| Licensing | Contracts substitute for weak IP protection |
| Reuse & training | Often restricted via license terms |
Case Law Governing AI-Generated Research Data Licensing
Case 1: Feist Publications v. Rural Telephone Service (USA)
Background
Rural Telephone compiled a directory of names and phone numbers.
Feist copied the data to create its own directory.
Legal Issue
Are factual datasets protected by copyright?
Decision
The US Supreme Court held that facts are not copyrightable.
Only original selection or arrangement can be protected.
Relevance to AI-Generated Research Data
AI-generated raw research data (measurements, results, outputs) is not protected by copyright.
Licensing becomes essential to control reuse.
Importance
Foundational case explaining why contracts and, in Europe, database rights dominate AI data licensing.
Case 2: British Horseracing Board v. William Hill (EU)
Background
British Horseracing Board invested heavily in compiling race data.
William Hill reused that data for betting services.
Legal Issue
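Does investment in creating data count toward the substantial investment needed for database protection?
Decision
The Court of Justice of the EU held that the database right protects substantial investment in obtaining, verifying, or presenting existing data, not investment in creating the data itself.
Unauthorized extraction or re-utilization of a substantial part of a protected database infringes the database maker’s rights.
Relevance to AI-Generated Research Data
AI-generated research datasets often involve significant computational and financial investment, but much of it goes into creating the data, which this ruling excludes.
Database rights support licensing and enforcement mainly where separate investment in verifying or presenting the data can be shown.
Importance
Core authority on the scope of database rights for AI-generated datasets in Europe.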
Case 3: Naruto v. Slater (USA – “Monkey Selfie Case”)
Background
A monkey took photographs using a photographer’s camera.
Claim was made that the monkey owned the copyright.
Legal Issue
Can non-human creators own copyright?
Decision
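The Ninth Circuit held that animals lack statutory standing to sue under the US Copyright Act, a ruling widely read as confirming that authorship is reserved for humans.
Relevance to AI-Generated Research Data
By extension, an AI system cannot be an author or rights holder.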
Ownership must vest in:
AI developer
Research institution
Employer
Contractually designated entity
Importance
Frequently cited in disputes over AI-generated research outputs.
Case 4: SAS Institute v. World Programming Ltd. (EU)
Background
SAS Institute sued World Programming for building software that emulated the functionality of the SAS System so that programs written in the SAS language could run on it.
Legal Issue
Are functionality, programming language, and data formats protected?
Decision
The Court of Justice of the EU held that a program’s functionality, programming language, and data file formats are not protected by copyright.
Only expression is protected.
Relevance to AI-Generated Research Data
AI models trained on research data may replicate functional outcomes without infringement.
Licensing terms become the main control mechanism.
Importance
Reinforces limits of copyright in data-driven technologies.
Case 5: hiQ Labs v. LinkedIn (USA)
Background
hiQ scraped publicly available LinkedIn data for analytics.
LinkedIn tried to block access.
Legal Issue
Can public data be restricted from reuse?
Decision
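The Ninth Circuit held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act, while leaving room for contract and other claims.
Relevance to AI-Generated Research Data
Publicly posted AI-generated research datasets are hard to shield from reuse through anti-hacking law alone.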
Licensing and access controls are critical for protection.
Importance
Emphasizes the importance of private licensing over public disclosure.
Case 6: Google v. Oracle (USA)
Background
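Google copied roughly 11,500 lines of Java API declaring code into Android.
Oracle claimed copyright infringement.
Legal Issue
Does copying the declaring code of a software interface infringe copyright?
Decision
The US Supreme Court held that Google’s copying was fair use, stressing the functional nature of the code and the transformative purpose of the reuse, and did not decide whether the code was copyrightable.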
Relevance to AI-Generated Research Data
AI systems often reuse research data structures.
Licensing terms must clarify:
Training permissions
Derivative use
Commercial exploitation
Importance
Demonstrates how fair use arguments interact with licensing disputes.
Case 7: Meta Platforms v. Bright Data (USA)
Background
Bright Data scraped large datasets from Meta platforms.
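Meta claimed that Bright Data breached its terms of service.
Legal Issue
Can platform owners restrict data reuse through terms?
Decision
The district court ruled for Bright Data on the contract claim, finding that Meta’s terms did not bar logged-out scraping of publicly available data.
Relevance to AI-Generated Research Data
Research institutions can license AI-generated data with strict contractual limits, but those limits bind only parties who have accepted the terms and only within the terms’ scope.
Contract law can fill gaps left by weak IP protection when data sits behind a login or a signed agreement.
Importance
Shows that contractual protection is only as strong as its drafting and the access controls that back it.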
4. Key Principles Emerging from Case Law
Raw data is not copyrightable
AI cannot be an author or owner
Database rights protect investment in obtaining, verifying, or presenting data, not creativity
Public data is hard to restrict without contracts
Licensing agreements are central to control and monetization
Trade secrets apply if confidentiality is maintained
5. Best Practices for Licensing AI-Generated Research Data
Clearly define ownership in contracts
Specify permitted uses (research, training, commercial)
Restrict redistribution and derivative datasets
Address AI training and model outputs
Use confidentiality clauses where applicable
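The checklist above can be turned into a simple drafting aid. The following Python sketch assumes a draft license is held as a dictionary of clause text keyed by hypothetical clause names chosen for this example; it only flags clauses that are absent or blank and says nothing about whether a clause is legally adequate.

```python
# Clause names are assumptions for this sketch, one per practice listed above.
REQUIRED_CLAUSES = {
    "ownership": "Who owns the AI-generated data",
    "permitted_uses": "Research, training, and commercial uses allowed",
    "redistribution": "Limits on redistribution and derivative datasets",
    "training_and_outputs": "Rules for AI training and model outputs",
    "confidentiality": "Confidentiality obligations, where applicable",
}


def missing_clauses(draft):
    """Return the required clause names that are absent or blank in the draft."""
    missing = []
    for name in REQUIRED_CLAUSES:
        text = draft.get(name) or ""
        if not text.strip():
            missing.append(name)
    return missing


# Placeholder draft: two clauses filled in, one blank, two absent.
draft_license = {
    "ownership": "The Institute owns all generated datasets.",
    "permitted_uses": "Non-commercial research only.",
    "redistribution": "   ",
}

for name in missing_clauses(draft_license):
    print(f"Missing clause: {name} ({REQUIRED_CLAUSES[name]})")
```

A check like this is useful only as a reminder during drafting; whether each clause actually achieves its purpose remains a matter for legal review.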
6. Conclusion
Licensing AI-generated research data sits at the intersection of copyright limitations, database rights, trade secret law, and contract law. Case law shows that:
Traditional IP law offers limited protection
Courts rely heavily on human authorship and originality principles
Contracts, backed by access controls and, in Europe, database rights, are the most effective tools for licensing and enforcement
As AI research expands, well-drafted licensing agreements, guided by these judicial principles, are essential to control ownership, use, and commercialization of AI-generated research data.
