Thoth AI Research

Structured Research Data for AI Companies, Researchers, and Knowledge Systems

Thoth AI Research provides source-linked extended research packets built through the THOTH workflow — organized around confirmed facts, timelines, uncertainty, source metadata, editorial context, and reusable schemas.

Limited early access. Usage-based pricing. No unlimited plans.

TheDailyGlobe as Proof of Work

TheDailyGlobe is powered by the same research workflow.

THOTH is the research engine behind TheDailyGlobe. TheDailyGlobe is a working example of how THOTH turns source-linked research packets into editor-reviewed public-interest journalism.

The public news site is one example of what structured research workflows can power. Thoth AI Research opens a separate path for AI companies, researchers, and organizations that need clean, organized research packets rather than finished articles.

THOTH supports research and organization. TheDailyGlobe articles remain editor-reviewed before publication.

Visit TheDailyGlobe →

What Thoth AI Research Provides

Thoth AI Research is not a feed of TheDailyGlobe articles. It provides access to structured extended research packets created through THOTH, the research and editorial workflow engine behind TheDailyGlobe.

THOTH creates proprietary structured research packets built from metadata, summaries, source notes, topic tags, uncertainty labels, editorial framing, and reusable schemas. For approved projects, THOTH can also support custom schemas designed around specific research questions, data-gathering needs, and output formats.

Each packet includes

Background context
Timelines and key events
Confirmed facts
What remains unclear
Source notes
Source URLs
Publisher and date metadata
Topic tags
Rights and usage notes
Export-ready structured formats
Custom schemas for approved projects

Who It Is For

01

AI and RAG Teams

For teams building retrieval, evaluation, knowledge, or context systems that need structured, source-linked research data.

02

Researchers and Labs

For academic, policy, civic, or institutional researchers who need organized public-interest research packets.

03

Media Intelligence Teams

For teams tracking policy, science, geopolitics, courts, public health, culture, business, or technology topics.

04

Dataset Builders

For organizations that need structured topic datasets without starting from raw scraping or unorganized search results.

Why It Is Different

Built for provenance, not volume alone.

Raw scraped records are cheap. Clean, structured, source-linked research packets are different. Thoth AI Research is designed around provenance, attribution, uncertainty, usability, and custom schema design.

Raw Data

  • Scattered pages
  • Inconsistent structure
  • Weak attribution
  • Unclear uncertainty
  • Difficult to reuse
  • Hard to adapt to specific research questions

Thoth AI Research

  • Structured research packets
  • Source URLs and dates
  • Confirmed facts separated from uncertainty
  • Timelines and context
  • Exportable formats
  • Custom schemas for specific data-gathering and output needs

Example Packet Structure

Each Thoth AI Research packet follows a consistent schema. Below is a representative structure — not actual data access.

topic, section, tags, schema, background, timeline, confirmedFacts, whatRemainsUnclear, sourceNotes, editorialContext, sources, rightsNote

Every packet separates confirmed facts from uncertainty, preserves source attribution, and is structured for downstream use in AI pipelines, research workflows, or knowledge systems.

{
  "topic": "Student Loan Repayment Changes",
  "section": "U.S.",
  "tags": [
    "education",
    "student loans",
    "federal policy"
  ],
  "schema": "standard_research_packet",
  "background": [],
  "timeline": [],
  "confirmedFacts": [],
  "whatRemainsUnclear": [],
  "sourceNotes": [],
  "editorialContext": [],
  "sources": [
    {
      "publisher": "",
      "date": "",
      "url": "",
      "type": ""
    }
  ],
  "rightsNote": "Structured research notes and source-linked summaries. Not copied source articles."
}

Schema sample only. Not actual data access.
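
For teams planning ingestion, the sample above can be checked with a short script before it enters a pipeline. The following is a minimal sketch in Python, not official Thoth AI Research tooling: the field names come from the schema sample, while the expected types and the names `REQUIRED_FIELDS`, `SOURCE_FIELDS`, and `validate_packet` are assumptions made for illustration.

```python
import json

# Field names are taken from the schema sample; the expected types are
# assumptions inferred from the sample values, not a published spec.
REQUIRED_FIELDS = {
    "topic": str,
    "section": str,
    "tags": list,
    "schema": str,
    "background": list,
    "timeline": list,
    "confirmedFacts": list,
    "whatRemainsUnclear": list,
    "sourceNotes": list,
    "editorialContext": list,
    "sources": list,
    "rightsNote": str,
}
SOURCE_FIELDS = ("publisher", "date", "url", "type")


def validate_packet(packet):
    """Return a list of problems; an empty list means the packet passed."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in packet:
            problems.append(f"missing field: {name}")
        elif not isinstance(packet[name], expected):
            problems.append(f"{name} should be of type {expected.__name__}")
    sources = packet.get("sources")
    if isinstance(sources, list):
        for i, source in enumerate(sources):
            if not isinstance(source, dict):
                problems.append(f"sources[{i}] should be an object")
                continue
            for key in SOURCE_FIELDS:
                if key not in source:
                    problems.append(f"sources[{i}] is missing {key}")
    return problems


# The schema sample from this page, with empty placeholder values.
sample = json.loads("""
{
  "topic": "Student Loan Repayment Changes",
  "section": "U.S.",
  "tags": ["education", "student loans", "federal policy"],
  "schema": "standard_research_packet",
  "background": [],
  "timeline": [],
  "confirmedFacts": [],
  "whatRemainsUnclear": [],
  "sourceNotes": [],
  "editorialContext": [],
  "sources": [{"publisher": "", "date": "", "url": "", "type": ""}],
  "rightsNote": "Structured research notes and source-linked summaries. Not copied source articles."
}
""")

print(validate_packet(sample))
```

The check only confirms that the expected fields are present and have the assumed types. It does not verify source URLs, dates, or usage rights, which remain subject to the written terms for each project.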

Pricing

Usage-based pricing that scales with the dataset you need.

Pricing is based on completed research packets, with each project scoped around volume, topic complexity, delivery format, custom schema needs, and approved usage rights.

One-time runs

Starter Data Run

$250 prepaid
  • 15 research packets
  • 1 topic area
  • Markdown export
  • Basic JSON export
  • Source URLs and publisher/date metadata
  • Internal evaluation or research use
  • Additional packets: $15 each
  • No API access
  • No resale or public redistribution

Research Builder

$750 prepaid
  • 50 research packets
  • Up to 2 topic areas
  • Markdown + JSON export
  • Basic CSV export
  • Timeline fields
  • Confirmed-fact fields
  • Uncertainty fields
  • Source metadata
  • Additional packets: $15 each
  • No model training rights unless separately approved

Custom Dataset Build

From $5,000
  • Defined topic area
  • Minimum 300 packets
  • Custom scope
  • Structured export
  • Source metadata
  • Delivery package
  • Usage terms

Monthly feeds

Professional Data Feed

$1,500/month
  • 125 research packets/month
  • Up to 5 topic areas
  • Markdown + JSON + CSV export
  • Topic tags
  • Structured source metadata
  • Monthly delivery
  • Internal RAG/evaluation use
  • Additional packets: $12 each
  • No resale or public redistribution

Growth Research Feed

$3,000/month
  • 300 research packets/month
  • Up to 10 topic areas
  • Structured JSON/CSV/Markdown
  • Batch delivery
  • Light schema customization
  • Priority processing
  • Commercial internal use
  • RAG/evaluation use
  • Additional packets: $10 each

Enterprise Research Licensing

From $10,000/month
  • Custom packet volume
  • Custom topic coverage
  • API or scheduled feed delivery if approved
  • Custom schema
  • Commercial RAG/evaluation rights
  • Usage reporting
  • Data provenance documentation
  • Legal/data-rights review
  • Optional model-training rights by contract

Note: Model training, fine-tuning, redistribution, sublicensing, synthetic data generation, API access, and commercial model-improvement rights require separate written approval and written terms.
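
As a rough budgeting aid, the listed base prices and additional-packet rates can be combined into a simple estimate. The sketch below is in Python and assumes that additional packets are billed linearly at the listed rate beyond the included volume; actual quotes are scoped per project and may differ. The names `PLANS` and `estimate_cost` are hypothetical. Custom Dataset Build and Enterprise Research Licensing are left out because only starting prices are listed for them.

```python
# (base price in USD, included packets, price per additional packet),
# copied from the tiers listed above.
PLANS = {
    "Starter Data Run": (250, 15, 15),
    "Research Builder": (750, 50, 15),
    "Professional Data Feed": (1500, 125, 12),  # per month
    "Growth Research Feed": (3000, 300, 10),    # per month
}


def estimate_cost(plan, packets):
    """Estimate the cost of a run or month, assuming linear overage billing."""
    base, included, overage_rate = PLANS[plan]
    extra_packets = max(0, packets - included)
    return base + extra_packets * overage_rate


# 25 packets on Starter: 250 + (25 - 15) * 15 = 400
print(estimate_cost("Starter Data Run", 25))
# 150 packets on Professional: 1500 + (150 - 125) * 12 = 1800
print(estimate_cost("Professional Data Feed", 150))
```

Under this assumption, 70 packets on Research Builder would come to $750 + 20 × $15 = $1,050, so the estimate is mainly useful for judging when the next tier becomes the better fit.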

Data Integrity & Rights

Designed to be cleaner than scraped data.

Thoth AI Research is built around structured research notes, source-linked summaries, metadata, uncertainty labels, and reusable schemas. It is not a dump of copied third-party articles.

  • Source links preserved when available
  • Confirmed facts separated from uncertainty
  • Usage terms required before access
  • Human/editorial review may apply depending on package
  • Custom schemas available for approved projects
  • No full copied third-party articles
  • No private user data
  • No unattributed scraped text

Early Access

Request-access only.

Thoth AI Research is currently available by request only. We are evaluating qualified research, AI, and data-use cases before opening broader platform access.

Future access may include a logged-in interface for approved users to define dataset topics, request research packet runs, choose output formats, use approved custom schemas, and export structured datasets.

Request Access

Request Thoth AI Research Access

Thoth AI Research is in limited early access. Submit your request and we will follow up to discuss scope, pricing, and terms.

No instant access — every project is scoped
Usage-based pricing only
Model training, fine-tuning, redistribution, sublicensing, synthetic data generation, API access, and commercial model-improvement rights require separate written approval
No unlimited plans
