Change Tracking

Change tracking allows you to monitor and detect changes in web content over time. This feature is available in both the JavaScript and Python SDKs.

Overview

Change tracking enables you to:
  • Detect if a webpage has changed since the last scrape
  • View the specific changes between scrapes
  • Get structured data about what has changed
  • Control the visibility of changes
Using the changeTracking format, you can monitor changes on a website and receive information about:
  • previousScrapeAt: The timestamp of the previous scrape that the current page is being compared against (null if no previous scrape)
  • changeStatus: The result of the comparison between the two page versions
    • new: This page did not exist or was not discovered before (usually has a null previousScrapeAt)
    • same: This page’s content has not changed since the last scrape
    • changed: This page’s content has changed since the last scrape
    • removed: This page was removed since the last scrape
  • visibility: The visibility of the current page/URL
    • visible: This page is visible, meaning that its URL was discovered through an organic route (through links on other visible pages or the sitemap)
    • hidden: This page is not visible, meaning it is still available on the web but is no longer discoverable via the sitemap or by crawling the site. Hidden pages can only be identified if they were visible, and captured, during a previous crawl or scrape.
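
As a rough illustration of how these fields combine, the sketch below turns a change tracking result into a one-line description. The field names and values come from the list above; the describeChange helper and its wording are hypothetical and not part of the SDK.

```typescript
// Field names and values follow the list above; the helper itself is hypothetical.
type ChangeStatus = "new" | "same" | "changed" | "removed";
type Visibility = "visible" | "hidden";

interface ChangeTrackingSummary {
  previousScrapeAt: string | null;
  changeStatus: ChangeStatus;
  visibility: Visibility;
}

// Turn a change tracking result into a one-line, human-readable description.
function describeChange(tracking: ChangeTrackingSummary): string {
  // previousScrapeAt is null when there is nothing to compare against
  const since = tracking.previousScrapeAt ?? "no previous scrape";
  const hiddenNote =
    tracking.visibility === "hidden"
      ? " (no longer discoverable via links or sitemap)"
      : "";

  switch (tracking.changeStatus) {
    case "new":
      return `New page${hiddenNote}`;
    case "same":
      return `Unchanged since ${since}${hiddenNote}`;
    case "changed":
      return `Changed since ${since}${hiddenNote}`;
    case "removed":
      return `Removed since ${since}${hiddenNote}`;
  }
}
```

A monitoring job could log this string for each page and alert only on the changed and removed cases.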

SDKs

Basic Usage

To use change tracking, include 'changeTracking' in the formats when scraping a URL:
import Firecrawl from '@mendable/firecrawl-js';

const firecrawl = new Firecrawl({ apiKey: 'your-api-key' });
const result = await firecrawl.scrape('https://example.com', {
  formats: ['markdown', 'changeTracking']
});

// Access change tracking data
console.log(result.changeTracking);

Example Response:
{
  "url": "https://firecrawl.dev",
  "markdown": "# AI Agents for great customer experiences\n\nChatbots that delight your users...",
  "changeTracking": {
    "previousScrapeAt": "2025-04-10T12:00:00Z",
    "changeStatus": "changed",
    "visibility": "visible"
  }
}

Advanced Options

You can configure change tracking by passing an object in the formats array:
const result = await firecrawl.scrape('https://example.com', {
  formats: [
    'markdown',
    {
      type: 'changeTracking',
      modes: ['git-diff', 'json'], // Enable specific change tracking modes
      schema: {
        type: 'object',
        properties: {
          title: { type: 'string' },
          content: { type: 'string' }
        }
      }, // Schema for structured JSON comparison
      prompt: 'Custom prompt for extraction', // Optional custom prompt
      tag: 'production' // Optional tag for separate change tracking histories
    }
  ]
});

// Access git-diff format changes
if (result.changeTracking?.diff) {
  console.log(result.changeTracking.diff.text); // Git-style diff text
  console.log(result.changeTracking.diff.json); // Structured diff data
}

// Access JSON comparison changes
if (result.changeTracking?.json) {
  console.log(result.changeTracking.json.title.previous); // Previous title
  console.log(result.changeTracking.json.title.current); // Current title
}

Git-Diff Results Example:

 **April, 13 2025**
 
-**05:55:05 PM**
+**05:58:57 PM**

...

JSON Comparison Results Example:

{
  "time": { 
    "previous": "2025-04-13T17:54:32Z", 
    "current": "2025-04-13T17:55:05Z" 
  }
}
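
When the schema has many fields, it can help to list only those whose values actually differ. The sketch below assumes the shape shown in the example above, where each field maps to a previous/current pair; changedFields is a hypothetical helper, not an SDK function.

```typescript
// Shape taken from the JSON comparison example: each field maps to a
// { previous, current } pair. The helper name is hypothetical.
interface FieldComparison {
  previous: unknown;
  current: unknown;
}

// Return the names of fields whose previous and current values differ.
// Uses strict equality, so it suits primitive values such as strings and
// numbers; nested objects would need a deep comparison instead.
function changedFields(comparison: Record<string, FieldComparison>): string[] {
  return Object.keys(comparison).filter(
    (field) => comparison[field].previous !== comparison[field].current
  );
}
```

For the example above, changedFields would return a list containing only "time".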

Data Models

The change tracking feature includes the following data models:
interface FirecrawlDocument {
  // ... other properties
  changeTracking?: {
    previousScrapeAt: string | null;
    changeStatus: "new" | "same" | "changed" | "removed";
    visibility: "visible" | "hidden";
    diff?: {
      text: string;
      json: {
        files: Array<{
          from: string | null;
          to: string | null;
          chunks: Array<{
            content: string;
            changes: Array<{
              type: string;
              normal?: boolean;
              ln?: number;
              ln1?: number;
              ln2?: number;
              content: string;
            }>;
          }>;
        }>;
      };
    };
    json?: any;
  };
}

interface ChangeTrackingFormat {
  type: 'changeTracking';
  prompt?: string;
  schema?: any;
  modes?: ("json" | "git-diff")[];
  tag?: string | null;
}

interface ScrapeParams {
  // ... other properties
  formats?: Array<'markdown' | 'html' | ChangeTrackingFormat>;
}

Change Tracking Modes

The change tracking feature supports two modes:

Git-Diff Mode

The git-diff mode provides a traditional diff format similar to Git’s output. It shows line-by-line changes with additions and deletions marked. Example output:
@@ -1,1 +1,1 @@
-old content
+new content
The structured JSON representation of the diff includes:
  • files: Array of changed files (in web context, typically just one)
  • chunks: Sections of changes within a file
  • changes: Individual line changes with type (add, delete, normal)
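
The nested files, chunks, and changes structure can be walked with three loops. The following sketch counts added and deleted lines; the types mirror the structure described here, while countLineChanges and the exact type strings it checks ("add", plus "del" or "delete") are assumptions to verify against real responses.

```typescript
// Types mirror the structured diff described in this document.
interface DiffChange {
  type: string;
  content: string;
}
interface DiffChunk {
  content: string;
  changes: DiffChange[];
}
interface DiffFile {
  from: string | null;
  to: string | null;
  chunks: DiffChunk[];
}
interface DiffJson {
  files: DiffFile[];
}

// Count added and deleted lines across all files and chunks.
// Assumption: additions use type "add" and deletions use "del" or "delete";
// any other type is treated as unchanged context and ignored.
function countLineChanges(diff: DiffJson): { added: number; deleted: number } {
  let added = 0;
  let deleted = 0;
  for (const file of diff.files) {
    for (const chunk of file.chunks) {
      for (const change of chunk.changes) {
        if (change.type === "add") {
          added += 1;
        } else if (change.type === "del" || change.type === "delete") {
          deleted += 1;
        }
      }
    }
  }
  return { added, deleted };
}
```

A count like this is a cheap way to ignore tiny edits, for example by alerting only when more than a few lines were added or deleted.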

JSON Mode

The json mode provides a structured comparison of specific fields extracted from the content. This is useful for tracking changes in specific data points rather than the entire content. Example output:
{
  "title": {
    "previous": "Old Title",
    "current": "New Title"
  },
  "price": {
    "previous": "$19.99",
    "current": "$24.99"
  }
}
To use JSON mode, you need to provide a schema that defines the fields to extract and compare.
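
Because a missing schema is easy to overlook, one option is to validate the options before sending the request. This is a minimal sketch: buildTrackingFormats is a hypothetical helper, and the option names follow the ChangeTrackingFormat interface shown under Data Models.

```typescript
type ChangeTrackingMode = "json" | "git-diff";

interface ChangeTrackingFormat {
  type: "changeTracking";
  modes?: ChangeTrackingMode[];
  schema?: unknown;
  prompt?: string;
  tag?: string | null;
}

// Build the formats array for a scrape request.
// buildTrackingFormats is a hypothetical helper, not part of the SDK.
function buildTrackingFormats(
  options: Omit<ChangeTrackingFormat, "type">
): Array<"markdown" | ChangeTrackingFormat> {
  const modes = options.modes ?? [];
  // json mode compares extracted fields, so it needs a schema naming them
  if (modes.indexOf("json") !== -1 && options.schema === undefined) {
    throw new Error("json mode needs a schema describing the fields to compare");
  }
  const tracking: ChangeTrackingFormat = { type: "changeTracking", ...options };
  // markdown is always included alongside the changeTracking format
  return ["markdown", tracking];
}
```

The returned array can be passed as the formats option of a scrape call, so a misconfigured request fails locally instead of after a network round trip.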

Important Facts

Here are some important details to know when using the change tracking feature:
  • Comparison Method: Scrapes are always compared via their markdown response.
    • The markdown format must also be specified when using the changeTracking format. Other formats may also be specified in addition.
    • The comparison algorithm is resistant to changes in whitespace and content order. iframe source URLs are currently ignored so that captchas and anti-bot widgets with randomized URLs do not register as changes.
  • Matching Previous Scrapes: Previous scrapes to compare against are currently matched on the source URL, the team ID, the markdown format, and the tag parameter.
    • For an effective comparison, the input URL should be exactly the same as the previous request for the same content.
    • Crawling the same URLs with different includePaths/excludePaths can produce inconsistent results when using changeTracking.
    • Scraping the same URLs with different includeTags/excludeTags/onlyMainContent can produce inconsistent results when using changeTracking.
    • Pages are also compared against previous scrapes that requested only the markdown format, without the changeTracking format.
    • Comparisons are scoped to your team. If you scrape a URL for the first time with your API key, its changeStatus will always be new, even if other Firecrawl users have scraped it before.
  • Beta Status: While in Beta, it is recommended to monitor the warning field of the resulting document, and to handle the changeTracking object potentially missing from the response.
    • This may occur when the database lookup to find the previous scrape to compare against times out.
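
The beta caveat above suggests treating changeTracking as optional in client code. The sketch below shows one way to do that; the ScrapedDocument shape is reduced to the two fields mentioned here, and readTracking with its TrackingOutcome type is a hypothetical helper.

```typescript
// Reduced document shape: only the fields discussed above.
interface ScrapedDocument {
  warning?: string;
  changeTracking?: { changeStatus: "new" | "same" | "changed" | "removed" };
}

type TrackingOutcome =
  | { kind: "tracked"; changeStatus: string; warning?: string }
  | { kind: "unavailable"; warning?: string };

// Classify a scrape result without assuming changeTracking is present.
function readTracking(doc: ScrapedDocument): TrackingOutcome {
  if (!doc.changeTracking) {
    // For example, the lookup for the previous scrape timed out.
    // The caller can surface the warning and retry later.
    return { kind: "unavailable", warning: doc.warning };
  }
  return {
    kind: "tracked",
    changeStatus: doc.changeTracking.changeStatus,
    warning: doc.warning
  };
}
```

Returning a tagged result rather than throwing keeps a long-running monitor alive when a single comparison is unavailable.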

Examples

Basic Scrape Example

// Request
{
    "url": "https://firecrawl.dev",
    "formats": ["markdown", "changeTracking"]
}

// Response
{
  "success": true,
  "data": {
    "markdown": "...",
    "metadata": {...},
    "changeTracking": {
      "previousScrapeAt": "2025-03-30T15:07:17.543071+00:00",
      "changeStatus": "same",
      "visibility": "visible"
    }
  }
}

Crawl Example

// Request
{
    "url": "https://firecrawl.dev",
    "scrapeOptions": {
        "formats": ["markdown", "changeTracking"]
    }
}
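
A crawl returns many pages, so it is often useful to bucket them by changeStatus. The sketch below assumes the crawl results are available as an array of page objects, each with a url field and an optional changeTracking object; that shape and the groupByStatus helper are illustrative assumptions rather than the documented response format.

```typescript
type CrawlStatus = "new" | "same" | "changed" | "removed";

interface CrawledPage {
  url: string; // assumed field, used here only for illustration
  changeTracking?: { changeStatus: CrawlStatus };
}

// Group page URLs by changeStatus. Pages without a changeTracking object
// are collected under "unknown" instead of being dropped.
function groupByStatus(
  pages: CrawledPage[]
): Record<CrawlStatus | "unknown", string[]> {
  const groups: Record<CrawlStatus | "unknown", string[]> = {
    new: [],
    same: [],
    changed: [],
    removed: [],
    unknown: []
  };
  for (const page of pages) {
    const key = page.changeTracking ? page.changeTracking.changeStatus : "unknown";
    groups[key].push(page.url);
  }
  return groups;
}
```

The changed and removed buckets are usually the ones worth reporting, while unknown highlights pages to re-check.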

Tracking Product Price Changes

const result = await firecrawl.scrape('https://example.com/product', {
  formats: [
    'markdown',
    {
      type: 'changeTracking',
      modes: ['json'],
      schema: {
        type: 'object',
        properties: {
          price: { type: 'string' },
          availability: { type: 'string' }
        }
      }
    }
  ]
});

// changeTracking may be missing while the feature is in beta, so guard the access
const price = result.changeTracking?.json?.price;
if (result.changeTracking?.changeStatus === 'changed' && price) {
  console.log(`Price changed from ${price.previous} to ${price.current}`);
}

Monitoring Content Changes with Git-Diff

const result = await firecrawl.scrape('https://example.com/blog', {
  formats: [
    'markdown',
    { type: 'changeTracking', modes: ['git-diff'] }
  ]
});

if (result.changeTracking?.changeStatus === 'changed' && result.changeTracking.diff) {
  console.log('Content changes:');
  console.log(result.changeTracking.diff.text);
}

Billing

The change tracking feature is currently in beta. Basic change tracking and git-diff mode have no additional cost. However, if you use json mode for structured data comparison, each page scrape costs 5 credits.
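
For budgeting, the arithmetic can be sketched as follows. The 5 credits per page for json mode comes from the paragraph above; the base cost of a scrape without json mode is not stated in this section, so it is passed in as a parameter, and estimateCredits is a hypothetical helper.

```typescript
// json mode price taken from the Billing paragraph above
const JSON_MODE_CREDITS_PER_PAGE = 5;

// Estimate the credits used by a batch of scrapes.
// baseCreditsPerPage is an assumption supplied by the caller: the normal
// per-page cost of a scrape on your plan when json mode is not used.
function estimateCredits(
  pages: number,
  usesJsonMode: boolean,
  baseCreditsPerPage: number
): number {
  const perPage = usesJsonMode ? JSON_MODE_CREDITS_PER_PAGE : baseCreditsPerPage;
  return pages * perPage;
}
```

For example, 200 pages tracked with json mode come to 200 × 5 = 1000 credits, regardless of the base per-page cost.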